Re: [openstack-dev] [Globalization] REST API sorting by status severity vs. alphabetical status key

2014-04-28 Thread Jay Pipes
On Wed, 2014-04-23 at 22:07 -0500, Steven Kaufer wrote:
  yeah, we're talking about thousands and thousands of rows that have
 to
  be updated before the API can be restarted…
  
   There's also a possibility of adding support for the status codes,
 but
   keeping the string columns in the database, and then using the
 nova
   object versioning to migrate the object schema over time to the
 point
   where the migration is a simple DROP COLUMN.
  
  I like that idea better, TBH, but we're probably talking about a
  long-time deprecation here, like on the order of a couple of
 releases;
  that would give plenty of time for the majority of the records to be
  revisited and make the final migration run for a lot shorter time.
  -- 
 
 Thanks for the discussion.

No prob, sorry for the delayed response...

 So how would this new flow work?
 In Juno would there be an additional status_int column that would be
 populated and (eventually) replace the existing status (as string)
 column?

That would be the cleanest way, yes.

 How would the object versioning populate the new column for the
 existing records?

Within the nova.objects.instance.Instance object itself, we can put a
small check-and-transform function in the object to do the translation
in-line.

 Any examples or details that would help explain how this could work
 would be appreciated.

Probably worth putting a blueprint up about it. I can work with you on
it, if you'd like, though it will likely be after the summit until I
have time to work on it.

 Lastly, is there agreement that this is an issue that needs to be
 addressed? Note that this seems to be a pervasive problem, I've
 investigated the status column in cinder and nova but I suspect that
 the same issue exists in other components.

Yes, the same issue unfortunately exists in lots of the other
components, and they don't have the benefit of the nova objects work in
them, which makes it a lot more of a nuisance to migrate the database
schema. Though, personally I'm not entirely sure going through the
effort of doing a long-time in-object translation is worth it. A change
to the database schema, even on tens of millions of records wouldn't
take more than a couple minutes. But it all depends on the operator's
tolerance for downtime, since the instances table would certainly be
locked for the duration of the migration.

Best,
-jay

 Thanks,
 Steven Kaufer
 
 
  Kevin L. Mitchell kevin.mitch...@rackspace.com
  Rackspace
  
  
  ___
  OpenStack-dev mailing list
  OpenStack-dev@lists.openstack.org
  http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
 
 
 ___
 OpenStack-dev mailing list
 OpenStack-dev@lists.openstack.org
 http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev



___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Globalization] REST API sorting by status severity vs. alphabetical status key

2014-04-28 Thread Steven Kaufer
Jay,

Thanks again for the reply.  If this migration is implemented using the
object versioning, then the new status as int column cannot be utilized
(ie, sorted on) until the existing status as string column is eventually
dropped.

Is this correct?  If so, then this approach will not actually solve the
globalization sort problem until more release cycles have completed -- this
does not seem like a viable solution.

Until we know that the new status as int column is populated then we
cannot use it as a sortable column.

In theory, a deployer could conditionally choose to migrate to the new
column if they needed that function and were willing to take the hit during
the migration.  However, this just complicates the sorting logic since we
would then need to know which column to use during the sort (the new int
column if the migration has completed or the old string column if the
migration has not completed).

Thanks,
Steven Kaufer

Jay Pipes jaypi...@gmail.com wrote on 04/28/2014 09:05:51 AM:

 From: Jay Pipes jaypi...@gmail.com
 To: openstack-dev@lists.openstack.org,
 Date: 04/28/2014 09:07 AM
 Subject: Re: [openstack-dev] [Globalization] REST API sorting by
 status severity vs. alphabetical status key

 On Wed, 2014-04-23 at 22:07 -0500, Steven Kaufer wrote:
   yeah, we're talking about thousands and thousands of rows that have
  to
   be updated before the API can be restarted…
  
There's also a possibility of adding support for the status codes,
  but
keeping the string columns in the database, and then using the
  nova
object versioning to migrate the object schema over time to the
  point
where the migration is a simple DROP COLUMN.
  
   I like that idea better, TBH, but we're probably talking about a
   long-time deprecation here, like on the order of a couple of
  releases;
   that would give plenty of time for the majority of the records to be
   revisited and make the final migration run for a lot shorter time.
   --
 
  Thanks for the discussion.

 No prob, sorry for the delayed response...

  So how would this new flow work?
  In Juno would there be an additional status_int column that would be
  populated and (eventually) replace the existing status (as string)
  column?

 That would be the cleanest way, yes.

  How would the object versioning populate the new column for the
  existing records?

 Within the nova.objects.instance.Instance object itself, we can put a
 small check-and-transform function in the object to do the translation
 in-line.

  Any examples or details that would help explain how this could work
  would be appreciated.

 Probably worth putting a blueprint up about it. I can work with you on
 it, if you'd like, though it will likely be after the summit until I
 have time to work on it.

  Lastly, is there agreement that this is an issue that needs to be
  addressed? Note that this seems to be a pervasive problem, I've
  investigated the status column in cinder and nova but I suspect that
  the same issue exists in other components.

 Yes, the same issue unfortunately exists in lots of the other
 components, and they don't have the benefit of the nova objects work in
 them, which makes it a lot more of a nuisance to migrate the database
 schema. Though, personally I'm not entirely sure going through the
 effort of doing a long-time in-object translation is worth it. A change
 to the database schema, even on tens of millions of records wouldn't
 take more than a couple minutes. But it all depends on the operator's
 tolerance for downtime, since the instances table would certainly be
 locked for the duration of the migration.

 Best,
 -jay

  Thanks,
  Steven Kaufer
 
 
   Kevin L. Mitchell kevin.mitch...@rackspace.com
   Rackspace
  
  
   ___
   OpenStack-dev mailing list
   OpenStack-dev@lists.openstack.org
   http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
 
 
  ___
  OpenStack-dev mailing list
  OpenStack-dev@lists.openstack.org
  http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev



 ___
 OpenStack-dev mailing list
 OpenStack-dev@lists.openstack.org
 http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [Globalization] REST API sorting by status severity vs. alphabetical status key

2014-04-23 Thread Steven Kaufer


I am trying to address the following use case:

- Assume that the REST APIs support returning data based on a user-defined
sort key (assuming that this get approved:
https://review.openstack.org/#/c/84451/)
- UI contains a table showing items (servers, volumes, etc.) and their
status (as a sortable column) and uses pagination to get only a page of
data
- UI is translated into a non-English language
- User wants to sort the table by status

In this case, the sorting by status is done against the English key values
in the database (active, error, etc.).  The UI will then translate the
status values into the user's locale and (from the user's perspective) the
data will not be in sorted order -- thus confusing and frustrating the
user.  Note that UI cannot do the sort client-side since pagination is used
so it the client only has a sub-set of the total data.

I have prototyped a sort by case solution that would allow status to be
sorted by severity.  In SQL the case statement can be used to map a
string to an int and then sort the rows based on the int value (ie,
error=0, building=1, active=2, etc.).  Using this approach, sorting by
status would result in an enum-like sort (based on severity) instead of an
alphabetical sort based on the English key values.  This solution allows
enum-like data to be sorted in a consistent way across all locales -- the
solution is generic and can be applied to any column where the values are a
known set.

The case processing would need to be done in the common paginate_query
function:
https://github.com/openstack/oslo-incubator/blob/master/openstack/common/db/sqlalchemy/utils.py#L62

This type of sort would not be the default behavior for a status column (or
any enum-like column) and the caller would need to specify a unique sort
direction key for it (ie, 'asc_case' or 'desc_case').  In theory, this type
of sorting support could also be globally enabled/disabled by a deployer
(default would be disabled) to further reduce impact.

Lastly, I have some performance data and sorting the status by case (vs.
alphabetical) has a minimal impact on performance.

Before I create a proposal for juno I wanted to get some early feedback on
the high-level approach.  Please reply with feedback on this solution.

Thanks,
Steven Kaufer___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Globalization] REST API sorting by status severity vs. alphabetical status key

2014-04-23 Thread Jay Pipes
Hi Steven, thx for the detailed email. Some comments inline...

On Wed, 2014-04-23 at 13:59 -0500, Steven Kaufer wrote:
 I am trying to address the following use case:
 
 - Assume that the REST APIs support returning data based on a
 user-defined sort key (assuming that this get approved:
  https://review.openstack.org/#/c/84451/)
 - UI contains a table showing items (servers, volumes, etc.) and their
 status (as a sortable column) and uses pagination to get only a page
 of data
 - UI is translated into a non-English language
 - User wants to sort the table by status
 
 In this case, the sorting by status is done against the English key
 values in the database (active, error, etc.).  The UI will then
 translate the status values into the user's locale and (from the
 user's perspective) the data will not be in sorted order -- thus
 confusing and frustrating the user.  Note that UI cannot do the sort
 client-side since pagination is used so it the client only has a
 sub-set of the total data.

Indeed, this is a problem.

 I have prototyped a sort by case solution that would allow status to
 be sorted by severity.  In SQL the case statement can be used to map
 a string to an int and then sort the rows based on the int value (ie,
 error=0, building=1, active=2, etc.).  Using this approach, sorting by
 status would result in an enum-like sort (based on severity) instead
 of an alphabetical sort based on the English key values.  This
 solution allows enum-like data to be sorted in a consistent way across
 all locales -- the solution is generic and can be applied to any
 column where the values are a known set.
 
 The case processing would need to be done in the common paginate_query
 function:
  
 https://github.com/openstack/oslo-incubator/blob/master/openstack/common/db/sqlalchemy/utils.py#L62
 
 This type of sort would not be the default behavior for a status
 column (or any enum-like column) and the caller would need to specify
 a unique sort direction key for it (ie, 'asc_case' or 'desc_case').
  In theory, this type of sorting support could also be globally
 enabled/disabled by a deployer (default would be disabled) to further
 reduce impact.
 
 Lastly, I have some performance data and sorting the status by case
 (vs. alphabetical) has a minimal impact on performance.
 
 Before I create a proposal for juno I wanted to get some early
 feedback on the high-level approach.  Please reply with feedback on
 this solution.

So, I feel that the above solution (while innovative certainly! :) ) is
not actually addressing the underlying source of the problem here, and
that is that statuses are stored in the database as English-language
strings instead of integer code values in a lookup table.

By addressing the underlying source of the problem -- by changing the
instances.{vm,task,power}_state columns to an integer value and using a
lookup table in the cases when translation from code to display is
needed -- we both solve the problem of i18n sorting and increase the
database performance, since queries on these state columns will use an
index on a datatype with a much slimmer width.

Best,
-jay


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Globalization] REST API sorting by status severity vs. alphabetical status key

2014-04-23 Thread Steven Kaufer

Jay,

Thanks for the reply.  I agree that changing the datamodel would be the
ideal solution.  But, to be honest, the scope of that change frightens me.

How would you recommend that a change like this would be handled (in
addition to the DB migration work)?  We obviously cannot break existing
codepaths that assume that the existing English key values would be
returned from the DB.

Is there an existing layer that would perform the mapping between the enum
values in the DB and the String keys?

Thanks,
Steven Kaufer

Jay Pipes jaypi...@gmail.com wrote on 04/23/2014 02:56:14 PM:

 From: Jay Pipes jaypi...@gmail.com
 To: openstack-dev@lists.openstack.org,
 Date: 04/23/2014 02:56 PM
 Subject: Re: [openstack-dev] [Globalization] REST API sorting by
 status severity vs. alphabetical status key

 Hi Steven, thx for the detailed email. Some comments inline...

 On Wed, 2014-04-23 at 13:59 -0500, Steven Kaufer wrote:
  I am trying to address the following use case:
 
  - Assume that the REST APIs support returning data based on a
  user-defined sort key (assuming that this get approved:
   https://review.openstack.org/#/c/84451/)
  - UI contains a table showing items (servers, volumes, etc.) and their
  status (as a sortable column) and uses pagination to get only a page
  of data
  - UI is translated into a non-English language
  - User wants to sort the table by status
 
  In this case, the sorting by status is done against the English key
  values in the database (active, error, etc.).  The UI will then
  translate the status values into the user's locale and (from the
  user's perspective) the data will not be in sorted order -- thus
  confusing and frustrating the user.  Note that UI cannot do the sort
  client-side since pagination is used so it the client only has a
  sub-set of the total data.

 Indeed, this is a problem.

  I have prototyped a sort by case solution that would allow status to
  be sorted by severity.  In SQL the case statement can be used to map
  a string to an int and then sort the rows based on the int value (ie,
  error=0, building=1, active=2, etc.).  Using this approach, sorting by
  status would result in an enum-like sort (based on severity) instead
  of an alphabetical sort based on the English key values.  This
  solution allows enum-like data to be sorted in a consistent way across
  all locales -- the solution is generic and can be applied to any
  column where the values are a known set.
 
  The case processing would need to be done in the common paginate_query
  function:
   https://github.com/openstack/oslo-incubator/blob/master/
 openstack/common/db/sqlalchemy/utils.py#L62
 
  This type of sort would not be the default behavior for a status
  column (or any enum-like column) and the caller would need to specify
  a unique sort direction key for it (ie, 'asc_case' or 'desc_case').
   In theory, this type of sorting support could also be globally
  enabled/disabled by a deployer (default would be disabled) to further
  reduce impact.
 
  Lastly, I have some performance data and sorting the status by case
  (vs. alphabetical) has a minimal impact on performance.
 
  Before I create a proposal for juno I wanted to get some early
  feedback on the high-level approach.  Please reply with feedback on
  this solution.

 So, I feel that the above solution (while innovative certainly! :) ) is
 not actually addressing the underlying source of the problem here, and
 that is that statuses are stored in the database as English-language
 strings instead of integer code values in a lookup table.

 By addressing the underlying source of the problem -- by changing the
 instances.{vm,task,power}_state columns to an integer value and using a
 lookup table in the cases when translation from code to display is
 needed -- we both solve the problem of i18n sorting and increase the
 database performance, since queries on these state columns will use an
 index on a datatype with a much slimmer width.

 Best,
 -jay


 ___
 OpenStack-dev mailing list
 OpenStack-dev@lists.openstack.org
 http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Globalization] REST API sorting by status severity vs. alphabetical status key

2014-04-23 Thread Kevin L. Mitchell
On Wed, 2014-04-23 at 15:29 -0500, Steven Kaufer wrote:
 Thanks for the reply.  I agree that changing the datamodel would be
 the ideal solution.  But, to be honest, the scope of that change
 frightens me.
 
 How would you recommend that a change like this would be handled (in
 addition to the DB migration work)?  We obviously cannot break
 existing codepaths that assume that the existing English key values
 would be returned from the DB.
 
 Is there an existing layer that would perform the mapping between the
 enum values in the DB and the String keys?

Theoretically, at least, all the status values should be using and
comparing against constants; see, for instance,
nova/compute/vm_states.py.  Those could be converted to integers, with a
translation table specified in the database migration.  The scary thing,
though, is just how long that migration will end up taking…
-- 
Kevin L. Mitchell kevin.mitch...@rackspace.com
Rackspace


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Globalization] REST API sorting by status severity vs. alphabetical status key

2014-04-23 Thread Jay Pipes
On Wed, 2014-04-23 at 17:19 -0500, Kevin L. Mitchell wrote:
 On Wed, 2014-04-23 at 15:29 -0500, Steven Kaufer wrote:
  Thanks for the reply.  I agree that changing the datamodel would be
  the ideal solution.  But, to be honest, the scope of that change
  frightens me.
  
  How would you recommend that a change like this would be handled (in
  addition to the DB migration work)?  We obviously cannot break
  existing codepaths that assume that the existing English key values
  would be returned from the DB.
  
  Is there an existing layer that would perform the mapping between the
  enum values in the DB and the String keys?
 
 Theoretically, at least, all the status values should be using and
 comparing against constants; see, for instance,
 nova/compute/vm_states.py.  Those could be converted to integers, with a
 translation table specified in the database migration.  The scary thing,
 though, is just how long that migration will end up taking

Worth it, IMO :)

There's also a possibility of adding support for the status codes, but
keeping the string columns in the database, and then using the nova
object versioning to migrate the object schema over time to the point
where the migration is a simple DROP COLUMN.

Best,
-jay



___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Globalization] REST API sorting by status severity vs. alphabetical status key

2014-04-23 Thread Kevin L. Mitchell
On Wed, 2014-04-23 at 18:28 -0400, Jay Pipes wrote:
 Worth it, IMO :)

yeah, we're talking about thousands and thousands of rows that have to
be updated before the API can be restarted…

 There's also a possibility of adding support for the status codes, but
 keeping the string columns in the database, and then using the nova
 object versioning to migrate the object schema over time to the point
 where the migration is a simple DROP COLUMN.

I like that idea better, TBH, but we're probably talking about a
long-time deprecation here, like on the order of a couple of releases;
that would give plenty of time for the majority of the records to be
revisited and make the final migration run for a lot shorter time.
-- 
Kevin L. Mitchell kevin.mitch...@rackspace.com
Rackspace


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Globalization] REST API sorting by status severity vs. alphabetical status key

2014-04-23 Thread Steven Kaufer
 yeah, we're talking about thousands and thousands of rows that have to
 be updated before the API can be restarted…

  There's also a possibility of adding support for the status codes, but
  keeping the string columns in the database, and then using the nova
  object versioning to migrate the object schema over time to the point
  where the migration is a simple DROP COLUMN.

 I like that idea better, TBH, but we're probably talking about a
 long-time deprecation here, like on the order of a couple of releases;
 that would give plenty of time for the majority of the records to be
 revisited and make the final migration run for a lot shorter time.
 --

Thanks for the discussion.

So how would this new flow work?
In Juno would there be an additional status_int column that would be
populated and (eventually) replace the existing status (as string) column?
How would the object versioning populate the new column for the existing
records?

Any examples or details that would help explain how this could work would
be appreciated.

Lastly, is there agreement that this is an issue that needs to be
addressed? Note that this seems to be a pervasive problem, I've
investigated the status column in cinder and nova but I suspect that the
same issue exists in other components.

Thanks,
Steven Kaufer


 Kevin L. Mitchell kevin.mitch...@rackspace.com
 Rackspace


 ___
 OpenStack-dev mailing list
 OpenStack-dev@lists.openstack.org
 http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev