Re: [openstack-dev] [trove] datastore migration issues

2013-12-20 Thread Greg Hill
Thanks for the input.  I'll go ahead with this plan then.

Greg

On Dec 20, 2013, at 2:06 AM, Vipul Sabhaya <vip...@gmail.com> wrote:

I am fine with requiring the deployer to update default values if they don’t 
make sense for their given deployment.  However, not having any value for 
older/existing instances when the code requires one is not good.  So let’s 
create a default datastore of mysql, with a default version, and set that as 
the datastore for older instances.  A deployer can then run trove-manage to 
update the default record that was created.


On Thu, Dec 19, 2013 at 6:14 PM, Tim Simpson <tim.simp...@rackspace.com> wrote:
I second Rob and Greg: we need to not allow the instance table to have nulls 
for the datastore version ID. I can't imagine that as Trove grows and evolves, 
that edge case is something we'll always remember to code and test for, so 
let's cauterize things now by no longer allowing it at all.

The fact that the migration scripts can't, to my knowledge, accept parameters 
for what the dummy datastore name and version should be isn't great, but I 
think it would be acceptable enough to make the provided default values 
sensible and ask operators who don't like it to manually update the database.

- Tim




From: Robert Myers [myer0...@gmail.com]
Sent: Thursday, December 19, 2013 9:59 AM
To: OpenStack Development Mailing List (not for usage questions)
Subject: Re: [openstack-dev] [trove] datastore migration issues

I think that we need to be good citizens and at least add dummy data. Because 
it is impossible to know who all is using this, the list you have is probably 
incomplete. But Trove has been available for quite some time and not all of 
these users will be listening on this thread. Basically anytime you have a 
database migration that adds a required field you *have* to alter the existing 
rows. If we don't, we're basically telling everyone who upgrades that we, the 
'Database as a Service' team, don't care about data integrity in our own 
product :)

Robert


On Thu, Dec 19, 2013 at 9:25 AM, Greg Hill <greg.h...@rackspace.com> wrote:
We did consider doing that, but decided it wasn't really any different from the 
other options as it required the deployer to know to alter that data.  That 
would require the fewest code changes, though.  It was also my understanding 
that mysql variants were a possibility as well (percona and mariadb), which is 
what brought on the objection to just defaulting in code.  Also, we can't 
derive the version being used, so we *could* fill it with a dummy version and 
assume mysql, but I don't feel like that solves the problem or the objections 
to the earlier solutions.  And then we also have bogus data in the database.

Since there's no perfect solution, I'm really just hoping to gather consensus 
among people who are running existing trove installations and have yet to 
upgrade to the newer code about what would be easiest for them.  My 
understanding is that list is basically HP and Rackspace, and maybe Ebay?, but 
the hope was that bringing the issue up on the list might confirm or refute 
that assumption and drive the conversation to a suitable workaround for those 
affected, which hopefully isn't that many organizations at this point.

The options are basically:

1. Put the onus on the deployer to correct existing records in the database.
2. Have the migration script put dummy data in the database which you have to 
correct.
3. Put the onus on the deployer to fill out values in the config file.

Greg

On Dec 18, 2013, at 8:46 PM, Robert Myers <myer0...@gmail.com> wrote:


There is the database migration for datastores. We should add a function to 
backfill the existing data, either with dummy data or by setting it to 'mysql', 
as that was the only possibility before datastores.

On Dec 18, 2013 3:23 PM, "Greg Hill" <greg.h...@rackspace.com> wrote:
I've been working on fixing a bug related to migrating existing installations 
to the new datastore code:

https://bugs.launchpad.net/trove/+bug/1259642

The basic gist is that existing instances won't have any data in the 
datastore_version_id field in the database unless we somehow populate that data 
during migration, and not having that data populated breaks a lot of things 
(including the ability to list instances or delete or resize old instances).  
It's impossible to populate that data in an automatic, generic way, since it's 
highly vendor-dependent on what database and version they currently support, 
and there's not enough data in the older schema to populate the new tables 
automatically.

So far, we've come up with some non-optimal solutions:

1. The first iteration was to assume 'mysql' as the database manager on 
in

Re: [openstack-dev] [trove] datastore migration issues

2013-12-20 Thread Vipul Sabhaya
I am fine with requiring the deployer to update default values if they
don’t make sense for their given deployment.  However, not having any value
for older/existing instances when the code requires one is not good.  So
let’s create a default datastore of mysql, with a default version, and set
that as the datastore for older instances.  A deployer can then run
trove-manage to update the default record that was created.
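
For illustration only, the backfill described above might look roughly like
the sketch below. It uses an in-memory SQLite database rather than Trove's
real migration framework, and apart from datastore_version_id every table
name, column name, and the helper backfill_default_datastore are assumptions
made for the example, not Trove's actual schema or code.

```python
import sqlite3
import uuid


def backfill_default_datastore(conn, name="mysql", version="default"):
    """Create a placeholder datastore and version, then attach legacy rows to it.

    Hypothetical helper: the schema below is assumed for illustration.
    Returns the new version id and the number of instances updated.
    """
    datastore_id = str(uuid.uuid4())
    version_id = str(uuid.uuid4())
    conn.execute(
        "INSERT INTO datastores (id, name) VALUES (?, ?)",
        (datastore_id, name),
    )
    conn.execute(
        "INSERT INTO datastore_versions (id, datastore_id, name) VALUES (?, ?, ?)",
        (version_id, datastore_id, version),
    )
    # Only touch instances that predate the datastore tables.
    cur = conn.execute(
        "UPDATE instances SET datastore_version_id = ? "
        "WHERE datastore_version_id IS NULL",
        (version_id,),
    )
    conn.commit()
    return version_id, cur.rowcount


# Assumed minimal schema with two legacy instances that have no version set.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE datastores (id TEXT PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE datastore_versions (
        id TEXT PRIMARY KEY,
        datastore_id TEXT NOT NULL,
        name TEXT NOT NULL
    );
    CREATE TABLE instances (id TEXT PRIMARY KEY, datastore_version_id TEXT);
    INSERT INTO instances (id) VALUES ('legacy-1'), ('legacy-2');
    """
)
version_id, updated = backfill_default_datastore(conn)
print(updated, "legacy instances attached to version", version_id)
```

A deployer who does not run plain mysql would still need to correct the
placeholder record afterwards; the sketch only guarantees that no instance is
left without a version.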


On Thu, Dec 19, 2013 at 6:14 PM, Tim Simpson wrote:

>  I second Rob and Greg: we need to not allow the instance table to have
> nulls for the datastore version ID. I can't imagine that as Trove grows and
> evolves, that edge case is something we'll always remember to code and test
> for, so let's cauterize things now by no longer allowing it at all.
>
>  The fact that the migration scripts can't, to my knowledge, accept
> parameters for what the dummy datastore name and version should be isn't
> great, but I think it would be acceptable enough to make the provided
> default values sensible and ask operators who don't like it to manually
> update the database.
>
>  - Tim
>
>
>
>  --
> *From:* Robert Myers [myer0...@gmail.com]
> *Sent:* Thursday, December 19, 2013 9:59 AM
> *To:* OpenStack Development Mailing List (not for usage questions)
> *Subject:* Re: [openstack-dev] [trove] datastore migration issues
>
>   I think that we need to be good citizens and at least add dummy data.
> Because it is impossible to know who all is using this, the list you have
> is probably incomplete. But Trove has been available for quite some time and
> not all of these users will be listening on this thread. Basically anytime
> you have a database migration that adds a required field you *have* to alter
> the existing rows. If we don't, we're basically telling everyone who
> upgrades that we, the 'Database as a Service' team, don't care about data
> integrity in our own product :)
>
>  Robert
>
>
> On Thu, Dec 19, 2013 at 9:25 AM, Greg Hill wrote:
>
>>  We did consider doing that, but decided it wasn't really any different
>> from the other options as it required the deployer to know to alter that
>> data.  That would require the fewest code changes, though.  It was also my
>> understanding that mysql variants were a possibility as well (percona and
>> mariadb), which is what brought on the objection to just defaulting in
>> code.  Also, we can't derive the version being used, so we *could* fill it
>> with a dummy version and assume mysql, but I don't feel like that solves
>> the problem or the objections to the earlier solutions.  And then we also
>> have bogus data in the database.
>>
>>   Since there's no perfect solution, I'm really just hoping to gather
>> consensus among people who are running existing trove installations and
>> have yet to upgrade to the newer code about what would be easiest for them.
>>  My understanding is that list is basically HP and Rackspace, and maybe
>> Ebay?, but the hope was that bringing the issue up on the list might
>> confirm or refute that assumption and drive the conversation to a suitable
>> workaround for those affected, which hopefully isn't that many
>> organizations at this point.
>>
>>  The options are basically:
>>
>>  1. Put the onus on the deployer to correct existing records in the
>> database.
>> 2. Have the migration script put dummy data in the database which you
>> have to correct.
>> 3. Put the onus on the deployer to fill out values in the config file.
>>
>>  Greg
>>
>>  On Dec 18, 2013, at 8:46 PM, Robert Myers wrote:
>>
>>  There is the database migration for datastores. We should add a
>> function to backfill the existing data, either with dummy data or by
>> setting it to 'mysql', as that was the only possibility before datastores.
>> On Dec 18, 2013 3:23 PM, "Greg Hill" wrote:
>>
>>> I've been working on fixing a bug related to migrating existing
>>> installations to the new datastore code:
>>>
>>>  https://bugs.launchpad.net/trove/+bug/1259642
>>>
>>>  The basic gist is that existing instances won't have any data in the
>>> datastore_version_id field in the database unless we somehow populate that
>>> data during migration, and not having that data populated breaks a lot of
>>> things (including the ability to list instances or delete or resize old
>>> instances).  It's impossible to populate that data in an automatic, generic
>>> way, since it's highly vendor-dependent on what database and version 

Re: [openstack-dev] [trove] datastore migration issues

2013-12-19 Thread Tim Simpson
I second Rob and Greg: we need to not allow the instance table to have nulls 
for the datastore version ID. I can't imagine that as Trove grows and evolves, 
that edge case is something we'll always remember to code and test for, so 
let's cauterize things now by no longer allowing it at all.
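
As a rough sketch of what "no longer allowing it" could mean at the schema
level: once existing rows have been backfilled, a NOT NULL constraint makes
the database itself reject an instance without a version, so the code never
has to handle that edge case. The example uses SQLite and an assumed,
simplified instances table; the helper count_null_versions is hypothetical.

```python
import sqlite3


def count_null_versions(conn):
    """Return how many instances still lack a datastore version."""
    row = conn.execute(
        "SELECT COUNT(*) FROM instances WHERE datastore_version_id IS NULL"
    ).fetchone()
    return row[0]


conn = sqlite3.connect(":memory:")
# Assumed schema: the constraint is declared on the column itself.
conn.execute(
    "CREATE TABLE instances ("
    "id TEXT PRIMARY KEY, "
    "datastore_version_id TEXT NOT NULL)"
)
conn.execute("INSERT INTO instances VALUES ('i-1', 'v-1')")

try:
    # An instance without a version is refused by the database.
    conn.execute("INSERT INTO instances (id) VALUES ('i-2')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

print("insert without version rejected:", rejected)
print("instances missing a version:", count_null_versions(conn))
```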

The fact that the migration scripts can't, to my knowledge, accept parameters 
for what the dummy datastore name and version should be isn't great, but I 
think it would be acceptable enough to make the provided default values 
sensible and ask operators who don't like it to manually update the database.

- Tim




From: Robert Myers [myer0...@gmail.com]
Sent: Thursday, December 19, 2013 9:59 AM
To: OpenStack Development Mailing List (not for usage questions)
Subject: Re: [openstack-dev] [trove] datastore migration issues

I think that we need to be good citizens and at least add dummy data. Because 
it is impossible to know who all is using this, the list you have is probably 
incomplete. But Trove has been available for quite some time and not all of 
these users will be listening on this thread. Basically anytime you have a 
database migration that adds a required field you *have* to alter the existing 
rows. If we don't, we're basically telling everyone who upgrades that we, the 
'Database as a Service' team, don't care about data integrity in our own 
product :)

Robert


On Thu, Dec 19, 2013 at 9:25 AM, Greg Hill <greg.h...@rackspace.com> wrote:
We did consider doing that, but decided it wasn't really any different from the 
other options as it required the deployer to know to alter that data.  That 
would require the fewest code changes, though.  It was also my understanding 
that mysql variants were a possibility as well (percona and mariadb), which is 
what brought on the objection to just defaulting in code.  Also, we can't 
derive the version being used, so we *could* fill it with a dummy version and 
assume mysql, but I don't feel like that solves the problem or the objections 
to the earlier solutions.  And then we also have bogus data in the database.

Since there's no perfect solution, I'm really just hoping to gather consensus 
among people who are running existing trove installations and have yet to 
upgrade to the newer code about what would be easiest for them.  My 
understanding is that list is basically HP and Rackspace, and maybe Ebay?, but 
the hope was that bringing the issue up on the list might confirm or refute 
that assumption and drive the conversation to a suitable workaround for those 
affected, which hopefully isn't that many organizations at this point.

The options are basically:

1. Put the onus on the deployer to correct existing records in the database.
2. Have the migration script put dummy data in the database which you have to 
correct.
3. Put the onus on the deployer to fill out values in the config file.

Greg

On Dec 18, 2013, at 8:46 PM, Robert Myers <myer0...@gmail.com> wrote:


There is the database migration for datastores. We should add a function to 
backfill the existing data, either with dummy data or by setting it to 'mysql', 
as that was the only possibility before datastores.

On Dec 18, 2013 3:23 PM, "Greg Hill" <greg.h...@rackspace.com> wrote:
I've been working on fixing a bug related to migrating existing installations 
to the new datastore code:

https://bugs.launchpad.net/trove/+bug/1259642

The basic gist is that existing instances won't have any data in the 
datastore_version_id field in the database unless we somehow populate that data 
during migration, and not having that data populated breaks a lot of things 
(including the ability to list instances or delete or resize old instances).  
It's impossible to populate that data in an automatic, generic way, since it's 
highly vendor-dependent on what database and version they currently support, 
and there's not enough data in the older schema to populate the new tables 
automatically.

So far, we've come up with some non-optimal solutions:

1. The first iteration was to assume 'mysql' as the database manager on 
instances without a datastore set.
2. The next iteration was to make the default value be configurable in 
trove.conf, but default to 'mysql' if it wasn't set.
3. It was then proposed that we could just use the 'default_datastore' value 
from the config, which may or may not be set by the operator.

My problem with any of these approaches beyond the first is that requiring 
people to populate config values in order to successfully migrate to the newer 
code is really no different than requiring them to populate the new database 
tables with appropriate data and updating the existing instances with the 
appropriate values.  Either way, it's now highly dependent on pe

Re: [openstack-dev] [trove] datastore migration issues

2013-12-19 Thread Robert Myers
I think that we need to be good citizens and at least add dummy data.
Because it is impossible to know who all is using this, the list you have
is probably incomplete. But Trove has been available for quite some time and
not all of these users will be listening on this thread. Basically anytime
you have a database migration that adds a required field you *have* to alter
the existing rows. If we don't, we're basically telling everyone who
upgrades that we, the 'Database as a Service' team, don't care about data
integrity in our own product :)

Robert


On Thu, Dec 19, 2013 at 9:25 AM, Greg Hill wrote:

>  We did consider doing that, but decided it wasn't really any different
> from the other options as it required the deployer to know to alter that
> data.  That would require the fewest code changes, though.  It was also my
> understanding that mysql variants were a possibility as well (percona and
> mariadb), which is what brought on the objection to just defaulting in
> code.  Also, we can't derive the version being used, so we *could* fill it
> with a dummy version and assume mysql, but I don't feel like that solves
> the problem or the objections to the earlier solutions.  And then we also
> have bogus data in the database.
>
>   Since there's no perfect solution, I'm really just hoping to gather
> consensus among people who are running existing trove installations and
> have yet to upgrade to the newer code about what would be easiest for them.
>  My understanding is that list is basically HP and Rackspace, and maybe
> Ebay?, but the hope was that bringing the issue up on the list might
> confirm or refute that assumption and drive the conversation to a suitable
> workaround for those affected, which hopefully isn't that many
> organizations at this point.
>
>  The options are basically:
>
>  1. Put the onus on the deployer to correct existing records in the
> database.
> 2. Have the migration script put dummy data in the database which you have
> to correct.
> 3. Put the onus on the deployer to fill out values in the config file.
>
>  Greg
>
>  On Dec 18, 2013, at 8:46 PM, Robert Myers wrote:
>
>  There is the database migration for datastores. We should add a function
> to backfill the existing data, either with dummy data or by setting it to
> 'mysql', as that was the only possibility before datastores.
> On Dec 18, 2013 3:23 PM, "Greg Hill" wrote:
>
>> I've been working on fixing a bug related to migrating existing
>> installations to the new datastore code:
>>
>>  https://bugs.launchpad.net/trove/+bug/1259642
>>
>>  The basic gist is that existing instances won't have any data in the
>> datastore_version_id field in the database unless we somehow populate that
>> data during migration, and not having that data populated breaks a lot of
>> things (including the ability to list instances or delete or resize old
>> instances).  It's impossible to populate that data in an automatic, generic
>> way, since it's highly vendor-dependent on what database and version they
>> currently support, and there's not enough data in the older schema to
>> populate the new tables automatically.
>>
>>  So far, we've come up with some non-optimal solutions:
>>
>>  1. The first iteration was to assume 'mysql' as the database manager on
>> instances without a datastore set.
>> 2. The next iteration was to make the default value be configurable in
>> trove.conf, but default to 'mysql' if it wasn't set.
>> 3. It was then proposed that we could just use the 'default_datastore'
>> value from the config, which may or may not be set by the operator.
>>
>>  My problem with any of these approaches beyond the first is that
>> requiring people to populate config values in order to successfully migrate
>> to the newer code is really no different than requiring them to populate
>> the new database tables with appropriate data and updating the existing
>> instances with the appropriate values.  Either way, it's now highly
>> dependent on people deploying the upgrade to know about this change and
>> react accordingly.
>>
>>  Does anyone have a better solution that we aren't considering?  Is this
>> even worth the effort given that trove has so few current deployments that
>> we can just make sure everyone is populating the new tables as part of
>> their upgrade path and not bother fixing the code to deal with the legacy
>> data?
>>
>>  Greg
>>
>> ___
>> OpenStack-dev mailing list
>> OpenStack-dev@lists.openstack.org
>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>>

Re: [openstack-dev] [trove] datastore migration issues

2013-12-19 Thread Greg Hill
We did consider doing that, but decided it wasn't really any different from the 
other options as it required the deployer to know to alter that data.  That 
would require the fewest code changes, though.  It was also my understanding 
that mysql variants were a possibility as well (percona and mariadb), which is 
what brought on the objection to just defaulting in code.  Also, we can't 
derive the version being used, so we *could* fill it with a dummy version and 
assume mysql, but I don't feel like that solves the problem or the objections 
to the earlier solutions.  And then we also have bogus data in the database.

Since there's no perfect solution, I'm really just hoping to gather consensus 
among people who are running existing trove installations and have yet to 
upgrade to the newer code about what would be easiest for them.  My 
understanding is that list is basically HP and Rackspace, and maybe Ebay?, but 
the hope was that bringing the issue up on the list might confirm or refute 
that assumption and drive the conversation to a suitable workaround for those 
affected, which hopefully isn't that many organizations at this point.

The options are basically:

1. Put the onus on the deployer to correct existing records in the database.
2. Have the migration script put dummy data in the database which you have to 
correct.
3. Put the onus on the deployer to fill out values in the config file.

Greg

On Dec 18, 2013, at 8:46 PM, Robert Myers <myer0...@gmail.com> wrote:


There is the database migration for datastores. We should add a function to 
backfill the existing data, either with dummy data or by setting it to 'mysql', 
as that was the only possibility before datastores.

On Dec 18, 2013 3:23 PM, "Greg Hill" <greg.h...@rackspace.com> wrote:
I've been working on fixing a bug related to migrating existing installations 
to the new datastore code:

https://bugs.launchpad.net/trove/+bug/1259642

The basic gist is that existing instances won't have any data in the 
datastore_version_id field in the database unless we somehow populate that data 
during migration, and not having that data populated breaks a lot of things 
(including the ability to list instances or delete or resize old instances).  
It's impossible to populate that data in an automatic, generic way, since it's 
highly vendor-dependent on what database and version they currently support, 
and there's not enough data in the older schema to populate the new tables 
automatically.

So far, we've come up with some non-optimal solutions:

1. The first iteration was to assume 'mysql' as the database manager on 
instances without a datastore set.
2. The next iteration was to make the default value be configurable in 
trove.conf, but default to 'mysql' if it wasn't set.
3. It was then proposed that we could just use the 'default_datastore' value 
from the config, which may or may not be set by the operator.
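
To make the three iterations concrete, here is a minimal sketch of the
fallback order they imply when an instance has no datastore recorded. The
function resolve_manager and the plain dict standing in for trove.conf are
hypothetical; only the 'default_datastore' option name and the 'mysql'
fallback come from the list above.

```python
DEFAULT_MANAGER = "mysql"  # iteration 1: the hard-coded assumption


def resolve_manager(instance_datastore, config):
    """Pick the database manager for an instance.

    Hypothetical helper. Order of preference:
      1. the datastore recorded on the instance, if any;
      2. the operator's 'default_datastore' config value, if set;
      3. the hard-coded 'mysql' fallback.
    """
    if instance_datastore:
        return instance_datastore
    return config.get("default_datastore") or DEFAULT_MANAGER


# A legacy instance with nothing recorded and nothing configured.
print(resolve_manager(None, {}))
# A legacy instance on a deployment whose operator set a default.
print(resolve_manager(None, {"default_datastore": "percona"}))
```

Note that the second and third steps only help if the operator knows to set
the option, which is the objection raised in the next paragraph.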

My problem with any of these approaches beyond the first is that requiring 
people to populate config values in order to successfully migrate to the newer 
code is really no different than requiring them to populate the new database 
tables with appropriate data and updating the existing instances with the 
appropriate values.  Either way, it's now highly dependent on people deploying 
the upgrade to know about this change and react accordingly.

Does anyone have a better solution that we aren't considering?  Is this even 
worth the effort given that trove has so few current deployments that we can 
just make sure everyone is populating the new tables as part of their upgrade 
path and not bother fixing the code to deal with the legacy data?

Greg



Re: [openstack-dev] [trove] datastore migration issues

2013-12-18 Thread Robert Myers
There is the database migration for datastores. We should add a function
to backfill the existing data, either with dummy data or by setting it to
'mysql', as that was the only possibility before datastores.
On Dec 18, 2013 3:23 PM, "Greg Hill" wrote:

>  I've been working on fixing a bug related to migrating existing
> installations to the new datastore code:
>
>  https://bugs.launchpad.net/trove/+bug/1259642
>
>  The basic gist is that existing instances won't have any data in the
> datastore_version_id field in the database unless we somehow populate that
> data during migration, and not having that data populated breaks a lot of
> things (including the ability to list instances or delete or resize old
> instances).  It's impossible to populate that data in an automatic, generic
> way, since it's highly vendor-dependent on what database and version they
> currently support, and there's not enough data in the older schema to
> populate the new tables automatically.
>
>  So far, we've come up with some non-optimal solutions:
>
>  1. The first iteration was to assume 'mysql' as the database manager on
> instances without a datastore set.
> 2. The next iteration was to make the default value be configurable in
> trove.conf, but default to 'mysql' if it wasn't set.
> 3. It was then proposed that we could just use the 'default_datastore'
> value from the config, which may or may not be set by the operator.
>
>  My problem with any of these approaches beyond the first is that
> requiring people to populate config values in order to successfully migrate
> to the newer code is really no different than requiring them to populate
> the new database tables with appropriate data and updating the existing
> instances with the appropriate values.  Either way, it's now highly
> dependent on people deploying the upgrade to know about this change and
> react accordingly.
>
>  Does anyone have a better solution that we aren't considering?  Is this
> even worth the effort given that trove has so few current deployments that
> we can just make sure everyone is populating the new tables as part of
> their upgrade path and not bother fixing the code to deal with the legacy
> data?
>
>  Greg
>