Re: Cloud-init datasource ordering

2019-04-23 Thread Bastian Blank
On Wed, Apr 03, 2019 at 08:54:58AM +, Kenn Leth Hansen wrote:
> I've discovered the CloudStack datasource causes cloud-init to wait for 120 
> seconds before continuing to the next datasource if no service is responding 
> on port 80 on the DHCP server address. As I am using an Ec2-style metadata 
> server, I need to wait 120 seconds every time I boot an instance until it is 
> operational as this listens on "magic" IP 169.254.169.254 and I cannot easily 
> redirect traffic to this IP.
> Would it be possible to move the Ec2 datasource up the list like "[ NoCloud, 
> AltCloud, ConfigDrive, OpenStack, Ec2, CloudStack, ${DIGITAL_OCEAN_SOURCE} 
> MAAS, OVF, GCE, None ]"? This also seems to be in line with expectations on 
> how the datasources have been sorted before dc1abbe1.

The data-source override is mostly gone from new images.  The cloud-init
from Buster does even more auto-detection for supported stuff.  Please
test this version.

Regards,
Bastian

-- 
Phasers locked on target, Captain.



Re: Cloud-init datasource ordering

2019-04-04 Thread Thomas Goirand
On 4/3/19 6:42 PM, Paul Graydon wrote:
> On 4/3/19 08:29, Thomas Goirand wrote:
>> On 4/3/19 10:54 AM, Kenn Leth Hansen wrote:
>>> Hi list,
>>>
>>> I've discovered the CloudStack datasource causes cloud-init to wait
>>> for 120 seconds before continuing to the next datasource if no
>>> service is responding on port 80 on the DHCP server address. As I am
>>> using an Ec2-style metadata server, I need to wait 120 seconds every
>>> time I boot an instance until it is operational as this listens on
>>> "magic" IP 169.254.169.254 and I cannot easily redirect traffic to
>>> this IP.
>>>
>>> The CloudStack datasource seems to have been introduced in dc1abbe1
>>> but I don't see any reasoning for the ordering. In commit ce65da02
>>> CloudStack was moved down the list to fix similar symptoms with
>>> OpenStack.
>>>
>>> Would it be possible to move the Ec2 datasource up the list like "[
>>> NoCloud, AltCloud, ConfigDrive, OpenStack, Ec2, CloudStack,
>>> ${DIGITAL_OCEAN_SOURCE} MAAS, OVF, GCE, None ]"? This also seems to
>>> be in line with expectations on how the datasources have been sorted
>>> before dc1abbe1.
>>>
>>> Regards,
>>> Kenn
>> If we do that, then OpenStack people are going to wait 120 seconds. So,
>> bad idea...
>>
>> Cheers,
>>
>> Thomas Goirand (zigo)
> 
> I may be missing something, but why would that impact OpenStack
> provisioning?  They'd still be ahead of CloudStack in the proposed
> ordering.  I'd need to trawl through the code and verify, but wouldn't
> it be sensible to put CloudStack about as far down the line as possible,
> given the overall impact (unless other data sources are expensive?).
> 
> To Kenn's comment though, in general with cloud-init you want to be
> giving it as specific a data source as you can.  If it's affecting you
> on every boot, I'd suggest putting the specific data source that is
> right for your environment into /etc/cloud/cloud.cfg so that cloud-init jumps
> straight to that one instead of wasting time on anything else.
> 
> Cloud-init upstream has tried to introduce ways to shortcut the process
> of figuring out "Where on earth am I". A number of the clouds
> put something suitable in the DMI data coming from the BIOS, and I know
> cloud-init is starting to use that as a big pointer.
> 
> Paul

I'm probably missing something here. Could you please copy/paste what
the proposed datasource_list change is exactly, so that everyone
understands what we will do?

Cheers,

Thomas Goirand (zigo)



Re: Re: Cloud-init datasource ordering

2019-04-04 Thread Kenn Leth Hansen
> On 4/3/19 08:29, Thomas Goirand wrote:
>> On 4/3/19 10:54 AM, Kenn Leth Hansen wrote:
>>> Hi list,
>>>
>>> I've discovered the CloudStack datasource causes cloud-init to wait for 120 
>>> seconds before continuing to the next datasource if no service is 
>>> responding on port 80 on the DHCP server address. As I am using an 
>>> Ec2-style metadata server, I need to wait 120 seconds every time I boot an 
>>> instance until it is operational as this listens on "magic" IP 
>>> 169.254.169.254 and I cannot easily redirect traffic to this IP.
>>>
>>> The CloudStack datasource seems to have been introduced in dc1abbe1 but I 
>>> don't see any reasoning for the ordering. In commit ce65da02 CloudStack was 
>>> moved down the list to fix similar symptoms with OpenStack.
>>>
>>> Would it be possible to move the Ec2 datasource up the list like "[ 
>>> NoCloud, AltCloud, ConfigDrive, OpenStack, Ec2, CloudStack, 
>>> ${DIGITAL_OCEAN_SOURCE} MAAS, OVF, GCE, None ]"? This also seems to be in 
>>> line with expectations on how the datasources have been sorted before 
>>> dc1abbe1.
>>>
>>> Regards,
>>> Kenn
>> If we do that, then OpenStack people are going to wait 120 seconds. So,
>> bad idea...
>>
>> Cheers,
>>
>> Thomas Goirand (zigo)
>
> I may be missing something, but why would that impact OpenStack
> provisioning?  They'd still be ahead of CloudStack in the proposed
> ordering.  I'd need to trawl through the code and verify, but wouldn't
> it be sensible to put CloudStack about as far down the line as possible,
> given the overall impact (unless other data sources are expensive?).
>
> To Kenn's comment though, in general with cloud-init you want to be
> giving it as specific a data source as you can.  If it's affecting you
> on every boot, I'd suggest putting the specific data source that is
> right for your environment into /etc/cloud/cloud.cfg so that cloud-init
> jumps straight to that one instead of wasting time on anything else.
>
> Cloud-init upstream has tried to introduce ways to shortcut the process
> of figuring out "Where on earth am I". A number of the clouds
> put something suitable in the DMI data coming from the BIOS, and I know
> cloud-init is starting to use that as a big pointer.
>
> Paul

I'd really appreciate it if we could find an ordering that suits the
community's use-cases overall.

I think my proposal is sane and prioritizes the most common use-cases.

But again, this is why I am posting on the list to get a nuanced discussion.

In the meantime I'll investigate whether I can provide the appropriate DMI
data to cloud-init so that it selects the Ec2 datasource.

Thanks.

/Kenn
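
For readers unfamiliar with the DMI approach mentioned above, here is a
rough Python sketch of the general idea: read a few SMBIOS fields that
Linux exposes under /sys/class/dmi/id and match them against known
strings. The hint table and the function names are assumptions made up
for illustration. They are not cloud-init's actual detection rules, and
the strings a given platform really sets should be checked there.

```python
from pathlib import Path
from typing import Dict, Optional

# Linux exposes SMBIOS/DMI fields as small text files in this directory.
DMI_DIR = Path("/sys/class/dmi/id")

# Hypothetical hint table (illustrative values only, not cloud-init's rules):
# datasource name -> {DMI field: lowercase substring expected in that field}.
HINTS: Dict[str, Dict[str, str]] = {
    "Ec2": {"bios_vendor": "amazon"},
    "OpenStack": {"product_name": "openstack"},
}


def read_dmi(dmi_dir: Path = DMI_DIR) -> Dict[str, str]:
    """Collect the DMI fields that are present and readable."""
    values: Dict[str, str] = {}
    for field in ("sys_vendor", "product_name", "bios_vendor", "chassis_asset_tag"):
        try:
            values[field] = (dmi_dir / field).read_text().strip()
        except OSError:
            # Field missing or not readable (for example, not running on Linux).
            continue
    return values


def guess_datasource(dmi: Dict[str, str],
                     hints: Dict[str, Dict[str, str]] = HINTS) -> Optional[str]:
    """Return the first datasource whose hints all match, else None."""
    for name, wanted in hints.items():
        if all(needle in dmi.get(field, "").lower()
               for field, needle in wanted.items()):
            return name
    return None
```

On a guest whose hypervisor sets a matching field, a call such as
guess_datasource(read_dmi()) would name one datasource without any
network probe. When nothing matches it returns None and the usual
ordered search would still be needed.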



Re: Cloud-init datasource ordering

2019-04-04 Thread Kenn Leth Hansen
> > > > Would it be possible to move the Ec2 datasource up the list like "[
> > > > NoCloud, AltCloud, ConfigDrive, OpenStack, Ec2, CloudStack,
> > > > ${DIGITAL_OCEAN_SOURCE} MAAS, OVF, GCE, None ]"? This also seems to
> > > > be in line with expectations on how the datasources have been
> > > > sorted before dc1abbe1.
> > > >
> > > If we do that, then OpenStack people are going to wait 120 seconds.
> > > So,
> > > bad idea...
> >
> > Hmm, this situation is likely going to just get worse as more
> > datasources are added.
> >
> > Can we reduce the timeout?
> >
> > Try datasources in parallel and first one that responds wins?
> >
> > Is it worth having multiple images with the order set appropriately?
> 
> Yes, I think the expectation is that you should be overriding the
> default datasource list to specify only the source(s) relevant to your
> particular deployment platform. The list can be specified in the
> debconf cloud-init/datasources value.
> 
> For example, we specify an appropriate value for our Amazon EC2 images
> at 
> https://salsa.debian.org/cloud-team/debian-cloud-images/blob/master/config_space/debconf/EC2
> 
> noah

This could be a solution for us, but I'd rather use the official images instead
of having the infrastructure in place to build them internally. 

As a workaround, once the image has booted for the first time we change
the cloud-init datasource order for its subsequent startups.

/Kenn


Re: Cloud-init datasource ordering

2019-04-03 Thread Paul Graydon

On 4/3/19 08:29, Thomas Goirand wrote:
> On 4/3/19 10:54 AM, Kenn Leth Hansen wrote:
>> Hi list,
>>
>> I've discovered the CloudStack datasource causes cloud-init to wait
>> for 120 seconds before continuing to the next datasource if no
>> service is responding on port 80 on the DHCP server address. As I am
>> using an Ec2-style metadata server, I need to wait 120 seconds every
>> time I boot an instance until it is operational as this listens on
>> "magic" IP 169.254.169.254 and I cannot easily redirect traffic to
>> this IP.
>>
>> The CloudStack datasource seems to have been introduced in dc1abbe1
>> but I don't see any reasoning for the ordering. In commit ce65da02
>> CloudStack was moved down the list to fix similar symptoms with
>> OpenStack.
>>
>> Would it be possible to move the Ec2 datasource up the list like "[
>> NoCloud, AltCloud, ConfigDrive, OpenStack, Ec2, CloudStack,
>> ${DIGITAL_OCEAN_SOURCE} MAAS, OVF, GCE, None ]"? This also seems to
>> be in line with expectations on how the datasources have been sorted
>> before dc1abbe1.
>>
>> Regards,
>> Kenn
>
> If we do that, then OpenStack people are going to wait 120 seconds. So,
> bad idea...
>
> Cheers,
>
> Thomas Goirand (zigo)


I may be missing something, but why would that impact OpenStack 
provisioning?  They'd still be ahead of CloudStack in the proposed 
ordering.  I'd need to trawl through the code and verify, but wouldn't 
it be sensible to put CloudStack about as far down the line as possible, 
given the overall impact (unless other data sources are expensive?).


To Kenn's comment though, in general with cloud-init you want to be 
giving it as specific a data source as you can.  If it's affecting you 
on every boot, I'd suggest putting the specific data source that is 
right for your environment into /etc/cloud/cloud.cfg so that cloud-init jumps 
straight to that one instead of wasting time on anything else.


Cloud-init upstream has tried to introduce ways to shortcut the process 
of figuring out "Where on earth am I". A number of the clouds 
put something suitable in the DMI data coming from the BIOS, and I know 
cloud-init is starting to use that as a big pointer.


Paul
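
As an illustration of pinning the datasource as suggested above:
cloud-init takes the list from the datasource_list key in its
configuration, which can also be set from a drop-in file under
/etc/cloud/cloud.cfg.d/. The Python sketch below only renders that one
line and writes it out. The helper names and the file name
99_datasource.cfg are hypothetical, chosen for the example.

```python
from pathlib import Path
from typing import Iterable


def render_datasource_list(names: Iterable[str]) -> str:
    """Render a one-line datasource_list setting in YAML flow style."""
    names = list(names)
    if not names:
        raise ValueError("at least one datasource name is required")
    return "datasource_list: [ " + ", ".join(names) + " ]\n"


def write_dropin(directory: Path, names: Iterable[str],
                 filename: str = "99_datasource.cfg") -> Path:
    """Write the setting to a drop-in file (file name is hypothetical)."""
    directory.mkdir(parents=True, exist_ok=True)
    target = directory / filename
    target.write_text(render_datasource_list(names))
    return target
```

For an Ec2-style metadata server, calling
write_dropin(Path("/etc/cloud/cloud.cfg.d"), ["Ec2", "None"]) as root
would restrict the search to Ec2 with None as the fallback, so the
CloudStack probe would never run. Trying it against a scratch directory
first is the safer route.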



Re: Cloud-init datasource ordering

2019-04-03 Thread Noah Meyerhans
On Thu, Apr 04, 2019 at 09:27:11AM +1300, Andrew Ruthven wrote:
> > > Would it be possible to move the Ec2 datasource up the list like "[
> > > NoCloud, AltCloud, ConfigDrive, OpenStack, Ec2, CloudStack,
> > > ${DIGITAL_OCEAN_SOURCE} MAAS, OVF, GCE, None ]"? This also seems to
> > > be in line with expectations on how the datasources have been
> > > sorted before dc1abbe1.
> > > 
> > If we do that, then OpenStack people are going to wait 120 seconds.
> > So,
> > bad idea...
> 
> Hmm, this situation is likely going to just get worse as more
> datasources are added.
> 
> Can we reduce the timeout?
> 
> Try datasources in parallel and first one that responds wins?
> 
> Is it worth having multiple images with the order set appropriately? 

Yes, I think the expectation is that you should be overriding the
default datasource list to specify only the source(s) relevant to your
particular deployment platform. The list can be specified in the
debconf cloud-init/datasources value.

For example, we specify an appropriate value for our Amazon EC2 images
at 
https://salsa.debian.org/cloud-team/debian-cloud-images/blob/master/config_space/debconf/EC2

noah
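
For reference, the "try datasources in parallel and first one that
responds wins" idea quoted above could be sketched in Python roughly as
follows. This is not how cloud-init behaves. The probes here are plain
callables with simulated delays, and every name in the block is an
assumption made for the example.

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed
from concurrent.futures import TimeoutError as FuturesTimeout
from typing import Callable, Dict, Optional


def make_probe(delay: float, answer: bool) -> Callable[[], bool]:
    """Build a fake probe that sleeps for `delay` seconds, then answers."""
    def probe() -> bool:
        time.sleep(delay)
        return answer
    return probe


def first_responder(probes: Dict[str, Callable[[], bool]],
                    timeout: float) -> Optional[str]:
    """Run all probes concurrently and return the first that reports True."""
    if not probes:
        return None
    pool = ThreadPoolExecutor(max_workers=len(probes))
    futures = {pool.submit(fn): name for name, fn in probes.items()}
    try:
        for future in as_completed(futures, timeout=timeout):
            try:
                if future.result():
                    return futures[future]
            except Exception:
                # A probe that raises is treated as "not this platform".
                continue
    except FuturesTimeout:
        return None
    finally:
        # Do not block on slow probes; their threads finish in the background.
        pool.shutdown(wait=False)
    return None
```

One trade-off worth noting: with this approach the fastest positive
answer wins, so the priority expressed by the list order is lost
whenever more than one source answers.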



Re: Cloud-init datasource ordering

2019-04-03 Thread Thomas Goirand
On 4/3/19 10:54 AM, Kenn Leth Hansen wrote:
> Hi list,
> 
> I've discovered the CloudStack datasource causes cloud-init to wait for 120 
> seconds before continuing to the next datasource if no service is responding 
> on port 80 on the DHCP server address. As I am using an Ec2-style metadata 
> server, I need to wait 120 seconds every time I boot an instance until it is 
> operational as this listens on "magic" IP 169.254.169.254 and I cannot easily 
> redirect traffic to this IP.
> 
> The CloudStack datasource seems to have been introduced in dc1abbe1 but I 
> don't see any reasoning for the ordering. In commit ce65da02 CloudStack was 
> moved down the list to fix similar symptoms with OpenStack.
> 
> Would it be possible to move the Ec2 datasource up the list like "[ NoCloud, 
> AltCloud, ConfigDrive, OpenStack, Ec2, CloudStack, ${DIGITAL_OCEAN_SOURCE} 
> MAAS, OVF, GCE, None ]"? This also seems to be in line with expectations on 
> how the datasources have been sorted before dc1abbe1.
> 
> Regards,
> Kenn

If we do that, then OpenStack people are going to wait 120 seconds. So,
bad idea...

Cheers,

Thomas Goirand (zigo)



Cloud-init datasource ordering

2019-04-03 Thread Kenn Leth Hansen
Hi list,

I've discovered the CloudStack datasource causes cloud-init to wait for 120 
seconds before continuing to the next datasource if no service is responding on 
port 80 on the DHCP server address. As I am using an Ec2-style metadata server, 
I need to wait 120 seconds every time I boot an instance until it is 
operational as this listens on "magic" IP 169.254.169.254 and I cannot easily 
redirect traffic to this IP.

The CloudStack datasource seems to have been introduced in dc1abbe1 but I don't 
see any reasoning for the ordering. In commit ce65da02 CloudStack was moved 
down the list to fix similar symptoms with OpenStack.

Would it be possible to move the Ec2 datasource up the list like "[ NoCloud, 
AltCloud, ConfigDrive, OpenStack, Ec2, CloudStack, ${DIGITAL_OCEAN_SOURCE} 
MAAS, OVF, GCE, None ]"? This also seems to be in line with expectations on how 
the datasources have been sorted before dc1abbe1.


Regards,
Kenn
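
To make the effect of the ordering concrete, here is a small Python
model of the sequential search described above. Only the 120-second
CloudStack wait comes from this report. The zero cost for every other
datasource and the "before" ordering with CloudStack ahead of Ec2 are
simplifying assumptions, and ${DIGITAL_OCEAN_SOURCE} is left out.

```python
from typing import Dict, List, Optional, Tuple

# Seconds lost when a datasource is probed but is not the active one.
# Only the CloudStack figure is from the report; the rest default to 0.
PROBE_COST: Dict[str, int] = {"CloudStack": 120}

# Assumed ordering with CloudStack ahead of Ec2 (illustrative only).
BEFORE: List[str] = ["NoCloud", "AltCloud", "CloudStack", "ConfigDrive",
                     "OpenStack", "Ec2", "MAAS", "OVF", "GCE", "None"]

# The ordering proposed in this message.
PROPOSED: List[str] = ["NoCloud", "AltCloud", "ConfigDrive", "OpenStack",
                       "Ec2", "CloudStack", "MAAS", "OVF", "GCE", "None"]


def boot_delay(order: List[str], active: str,
               cost: Dict[str, int] = PROBE_COST) -> Tuple[int, Optional[str]]:
    """Walk the list in order; return (seconds wasted, datasource found)."""
    wasted = 0
    for name in order:
        if name == active:
            return wasted, name
        wasted += cost.get(name, 0)
    return wasted, None
```

Under these assumptions boot_delay(BEFORE, "Ec2") gives (120, "Ec2")
and boot_delay(PROPOSED, "Ec2") gives (0, "Ec2"). An OpenStack guest
sits ahead of CloudStack in the proposed list, so it sees no delay in
this model either.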