Re: [openstack-dev] [Globalization] REST API sorting by status severity vs. alphabetical status key
On Wed, 2014-04-23 at 22:07 -0500, Steven Kaufer wrote: yeah, we're talking about thousands and thousands of rows that have to be updated before the API can be restarted… There's also a possibility of adding support for the status codes, but keeping the string columns in the database, and then using the nova object versioning to migrate the object schema over time to the point where the migration is a simple DROP COLUMN. I like that idea better, TBH, but we're probably talking about a long-time deprecation here, like on the order of a couple of releases; that would give plenty of time for the majority of the records to be revisited and make the final migration run for a lot shorter time. -- Thanks for the discussion. No prob, sorry for the delayed response... So how would this new flow work? In Juno would there be an additional status_int column that would be populated and (eventually) replace the existing status (as string) column? That would be the cleanest way, yes. How would the object versioning populate the new column for the existing records? Within the nova.objects.instance.Instance object itself, we can put a small check-and-transform function in the object to do the translation in-line. Any examples or details that would help explain how this could work would be appreciated. Probably worth putting a blueprint up about it. I can work with you on it, if you'd like, though it will likely be after the summit until I have time to work on it. Lastly, is there agreement that this is an issue that needs to be addressed? Note that this seems to be a pervasive problem, I've investigated the status column in cinder and nova but I suspect that the same issue exists in other components. Yes, the same issue unfortunately exists in lots of the other components, and they don't have the benefit of the nova objects work in them, which makes it a lot more of a nuisance to migrate the database schema. Though, personally I'm not entirely sure going through the effort of doing a long-time in-object translation is worth it. A change to the database schema, even on tens of millions of records wouldn't take more than a couple minutes. But it all depends on the operator's tolerance for downtime, since the instances table would certainly be locked for the duration of the migration. Best, -jay Thanks, Steven Kaufer Kevin L. Mitchell kevin.mitch...@rackspace.com Rackspace ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [Globalization] REST API sorting by status severity vs. alphabetical status key
Jay, Thanks again for the reply. If this migration is implemented using the object versioning, then the new status as int column cannot be utilized (ie, sorted on) until the existing status as string column is eventually dropped. Is this correct? If so, then this approach will not actually solve the globalization sort problem until more release cycles have completed -- this does not seem like a viable solution. Until we know that the new status as int column is populated then we cannot use it as a sortable column. In theory, a deployer could conditionally choose to migrate to the new column if they needed that function and were willing to take the hit during the migration. However, this just complicates the sorting logic since we would then need to know which column to use during the sort (the new int column if the migration has completed or the old string column if the migration has not completed). Thanks, Steven Kaufer Jay Pipes jaypi...@gmail.com wrote on 04/28/2014 09:05:51 AM: From: Jay Pipes jaypi...@gmail.com To: openstack-dev@lists.openstack.org, Date: 04/28/2014 09:07 AM Subject: Re: [openstack-dev] [Globalization] REST API sorting by status severity vs. alphabetical status key On Wed, 2014-04-23 at 22:07 -0500, Steven Kaufer wrote: yeah, we're talking about thousands and thousands of rows that have to be updated before the API can be restarted… There's also a possibility of adding support for the status codes, but keeping the string columns in the database, and then using the nova object versioning to migrate the object schema over time to the point where the migration is a simple DROP COLUMN. I like that idea better, TBH, but we're probably talking about a long-time deprecation here, like on the order of a couple of releases; that would give plenty of time for the majority of the records to be revisited and make the final migration run for a lot shorter time. -- Thanks for the discussion. No prob, sorry for the delayed response... So how would this new flow work? In Juno would there be an additional status_int column that would be populated and (eventually) replace the existing status (as string) column? That would be the cleanest way, yes. How would the object versioning populate the new column for the existing records? Within the nova.objects.instance.Instance object itself, we can put a small check-and-transform function in the object to do the translation in-line. Any examples or details that would help explain how this could work would be appreciated. Probably worth putting a blueprint up about it. I can work with you on it, if you'd like, though it will likely be after the summit until I have time to work on it. Lastly, is there agreement that this is an issue that needs to be addressed? Note that this seems to be a pervasive problem, I've investigated the status column in cinder and nova but I suspect that the same issue exists in other components. Yes, the same issue unfortunately exists in lots of the other components, and they don't have the benefit of the nova objects work in them, which makes it a lot more of a nuisance to migrate the database schema. Though, personally I'm not entirely sure going through the effort of doing a long-time in-object translation is worth it. A change to the database schema, even on tens of millions of records wouldn't take more than a couple minutes. But it all depends on the operator's tolerance for downtime, since the instances table would certainly be locked for the duration of the migration. Best, -jay Thanks, Steven Kaufer Kevin L. Mitchell kevin.mitch...@rackspace.com Rackspace ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
[openstack-dev] [Globalization] REST API sorting by status severity vs. alphabetical status key
I am trying to address the following use case: - Assume that the REST APIs support returning data based on a user-defined sort key (assuming that this get approved: https://review.openstack.org/#/c/84451/) - UI contains a table showing items (servers, volumes, etc.) and their status (as a sortable column) and uses pagination to get only a page of data - UI is translated into a non-English language - User wants to sort the table by status In this case, the sorting by status is done against the English key values in the database (active, error, etc.). The UI will then translate the status values into the user's locale and (from the user's perspective) the data will not be in sorted order -- thus confusing and frustrating the user. Note that UI cannot do the sort client-side since pagination is used so it the client only has a sub-set of the total data. I have prototyped a sort by case solution that would allow status to be sorted by severity. In SQL the case statement can be used to map a string to an int and then sort the rows based on the int value (ie, error=0, building=1, active=2, etc.). Using this approach, sorting by status would result in an enum-like sort (based on severity) instead of an alphabetical sort based on the English key values. This solution allows enum-like data to be sorted in a consistent way across all locales -- the solution is generic and can be applied to any column where the values are a known set. The case processing would need to be done in the common paginate_query function: https://github.com/openstack/oslo-incubator/blob/master/openstack/common/db/sqlalchemy/utils.py#L62 This type of sort would not be the default behavior for a status column (or any enum-like column) and the caller would need to specify a unique sort direction key for it (ie, 'asc_case' or 'desc_case'). In theory, this type of sorting support could also be globally enabled/disabled by a deployer (default would be disabled) to further reduce impact. Lastly, I have some performance data and sorting the status by case (vs. alphabetical) has a minimal impact on performance. Before I create a proposal for juno I wanted to get some early feedback on the high-level approach. Please reply with feedback on this solution. Thanks, Steven Kaufer___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [Globalization] REST API sorting by status severity vs. alphabetical status key
Hi Steven, thx for the detailed email. Some comments inline... On Wed, 2014-04-23 at 13:59 -0500, Steven Kaufer wrote: I am trying to address the following use case: - Assume that the REST APIs support returning data based on a user-defined sort key (assuming that this get approved: https://review.openstack.org/#/c/84451/) - UI contains a table showing items (servers, volumes, etc.) and their status (as a sortable column) and uses pagination to get only a page of data - UI is translated into a non-English language - User wants to sort the table by status In this case, the sorting by status is done against the English key values in the database (active, error, etc.). The UI will then translate the status values into the user's locale and (from the user's perspective) the data will not be in sorted order -- thus confusing and frustrating the user. Note that UI cannot do the sort client-side since pagination is used so it the client only has a sub-set of the total data. Indeed, this is a problem. I have prototyped a sort by case solution that would allow status to be sorted by severity. In SQL the case statement can be used to map a string to an int and then sort the rows based on the int value (ie, error=0, building=1, active=2, etc.). Using this approach, sorting by status would result in an enum-like sort (based on severity) instead of an alphabetical sort based on the English key values. This solution allows enum-like data to be sorted in a consistent way across all locales -- the solution is generic and can be applied to any column where the values are a known set. The case processing would need to be done in the common paginate_query function: https://github.com/openstack/oslo-incubator/blob/master/openstack/common/db/sqlalchemy/utils.py#L62 This type of sort would not be the default behavior for a status column (or any enum-like column) and the caller would need to specify a unique sort direction key for it (ie, 'asc_case' or 'desc_case'). In theory, this type of sorting support could also be globally enabled/disabled by a deployer (default would be disabled) to further reduce impact. Lastly, I have some performance data and sorting the status by case (vs. alphabetical) has a minimal impact on performance. Before I create a proposal for juno I wanted to get some early feedback on the high-level approach. Please reply with feedback on this solution. So, I feel that the above solution (while innovative certainly! :) ) is not actually addressing the underlying source of the problem here, and that is that statuses are stored in the database as English-language strings instead of integer code values in a lookup table. By addressing the underlying source of the problem -- by changing the instances.{vm,task,power}_state columns to an integer value and using a lookup table in the cases when translation from code to display is needed -- we both solve the problem of i18n sorting and increase the database performance, since queries on these state columns will use an index on a datatype with a much slimmer width. Best, -jay ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [Globalization] REST API sorting by status severity vs. alphabetical status key
Jay, Thanks for the reply. I agree that changing the datamodel would be the ideal solution. But, to be honest, the scope of that change frightens me. How would you recommend that a change like this would be handled (in addition to the DB migration work)? We obviously cannot break existing codepaths that assume that the existing English key values would be returned from the DB. Is there an existing layer that would perform the mapping between the enum values in the DB and the String keys? Thanks, Steven Kaufer Jay Pipes jaypi...@gmail.com wrote on 04/23/2014 02:56:14 PM: From: Jay Pipes jaypi...@gmail.com To: openstack-dev@lists.openstack.org, Date: 04/23/2014 02:56 PM Subject: Re: [openstack-dev] [Globalization] REST API sorting by status severity vs. alphabetical status key Hi Steven, thx for the detailed email. Some comments inline... On Wed, 2014-04-23 at 13:59 -0500, Steven Kaufer wrote: I am trying to address the following use case: - Assume that the REST APIs support returning data based on a user-defined sort key (assuming that this get approved: https://review.openstack.org/#/c/84451/) - UI contains a table showing items (servers, volumes, etc.) and their status (as a sortable column) and uses pagination to get only a page of data - UI is translated into a non-English language - User wants to sort the table by status In this case, the sorting by status is done against the English key values in the database (active, error, etc.). The UI will then translate the status values into the user's locale and (from the user's perspective) the data will not be in sorted order -- thus confusing and frustrating the user. Note that UI cannot do the sort client-side since pagination is used so it the client only has a sub-set of the total data. Indeed, this is a problem. I have prototyped a sort by case solution that would allow status to be sorted by severity. In SQL the case statement can be used to map a string to an int and then sort the rows based on the int value (ie, error=0, building=1, active=2, etc.). Using this approach, sorting by status would result in an enum-like sort (based on severity) instead of an alphabetical sort based on the English key values. This solution allows enum-like data to be sorted in a consistent way across all locales -- the solution is generic and can be applied to any column where the values are a known set. The case processing would need to be done in the common paginate_query function: https://github.com/openstack/oslo-incubator/blob/master/ openstack/common/db/sqlalchemy/utils.py#L62 This type of sort would not be the default behavior for a status column (or any enum-like column) and the caller would need to specify a unique sort direction key for it (ie, 'asc_case' or 'desc_case'). In theory, this type of sorting support could also be globally enabled/disabled by a deployer (default would be disabled) to further reduce impact. Lastly, I have some performance data and sorting the status by case (vs. alphabetical) has a minimal impact on performance. Before I create a proposal for juno I wanted to get some early feedback on the high-level approach. Please reply with feedback on this solution. So, I feel that the above solution (while innovative certainly! :) ) is not actually addressing the underlying source of the problem here, and that is that statuses are stored in the database as English-language strings instead of integer code values in a lookup table. By addressing the underlying source of the problem -- by changing the instances.{vm,task,power}_state columns to an integer value and using a lookup table in the cases when translation from code to display is needed -- we both solve the problem of i18n sorting and increase the database performance, since queries on these state columns will use an index on a datatype with a much slimmer width. Best, -jay ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [Globalization] REST API sorting by status severity vs. alphabetical status key
On Wed, 2014-04-23 at 15:29 -0500, Steven Kaufer wrote: Thanks for the reply. I agree that changing the datamodel would be the ideal solution. But, to be honest, the scope of that change frightens me. How would you recommend that a change like this would be handled (in addition to the DB migration work)? We obviously cannot break existing codepaths that assume that the existing English key values would be returned from the DB. Is there an existing layer that would perform the mapping between the enum values in the DB and the String keys? Theoretically, at least, all the status values should be using and comparing against constants; see, for instance, nova/compute/vm_states.py. Those could be converted to integers, with a translation table specified in the database migration. The scary thing, though, is just how long that migration will end up taking… -- Kevin L. Mitchell kevin.mitch...@rackspace.com Rackspace ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [Globalization] REST API sorting by status severity vs. alphabetical status key
On Wed, 2014-04-23 at 17:19 -0500, Kevin L. Mitchell wrote: On Wed, 2014-04-23 at 15:29 -0500, Steven Kaufer wrote: Thanks for the reply. I agree that changing the datamodel would be the ideal solution. But, to be honest, the scope of that change frightens me. How would you recommend that a change like this would be handled (in addition to the DB migration work)? We obviously cannot break existing codepaths that assume that the existing English key values would be returned from the DB. Is there an existing layer that would perform the mapping between the enum values in the DB and the String keys? Theoretically, at least, all the status values should be using and comparing against constants; see, for instance, nova/compute/vm_states.py. Those could be converted to integers, with a translation table specified in the database migration. The scary thing, though, is just how long that migration will end up taking Worth it, IMO :) There's also a possibility of adding support for the status codes, but keeping the string columns in the database, and then using the nova object versioning to migrate the object schema over time to the point where the migration is a simple DROP COLUMN. Best, -jay ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [Globalization] REST API sorting by status severity vs. alphabetical status key
On Wed, 2014-04-23 at 18:28 -0400, Jay Pipes wrote: Worth it, IMO :) yeah, we're talking about thousands and thousands of rows that have to be updated before the API can be restarted… There's also a possibility of adding support for the status codes, but keeping the string columns in the database, and then using the nova object versioning to migrate the object schema over time to the point where the migration is a simple DROP COLUMN. I like that idea better, TBH, but we're probably talking about a long-time deprecation here, like on the order of a couple of releases; that would give plenty of time for the majority of the records to be revisited and make the final migration run for a lot shorter time. -- Kevin L. Mitchell kevin.mitch...@rackspace.com Rackspace ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [Globalization] REST API sorting by status severity vs. alphabetical status key
yeah, we're talking about thousands and thousands of rows that have to be updated before the API can be restarted… There's also a possibility of adding support for the status codes, but keeping the string columns in the database, and then using the nova object versioning to migrate the object schema over time to the point where the migration is a simple DROP COLUMN. I like that idea better, TBH, but we're probably talking about a long-time deprecation here, like on the order of a couple of releases; that would give plenty of time for the majority of the records to be revisited and make the final migration run for a lot shorter time. -- Thanks for the discussion. So how would this new flow work? In Juno would there be an additional status_int column that would be populated and (eventually) replace the existing status (as string) column? How would the object versioning populate the new column for the existing records? Any examples or details that would help explain how this could work would be appreciated. Lastly, is there agreement that this is an issue that needs to be addressed? Note that this seems to be a pervasive problem, I've investigated the status column in cinder and nova but I suspect that the same issue exists in other components. Thanks, Steven Kaufer Kevin L. Mitchell kevin.mitch...@rackspace.com Rackspace ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev