Re: [openstack-dev] [nova] Expand resource name allowed characters

2014-09-19 Thread Sean Dague
On 09/18/2014 08:49 PM, Clint Byrum wrote:
 Excerpts from Christopher Yeoh's message of 2014-09-18 16:57:12 -0700:
 On Thu, 18 Sep 2014 12:12:28 -0400
 Sean Dague s...@dague.net wrote:
 When we can return the json-schema to user in the future, can we say
 that means API accepting utf8 or utf8mb4 is discoverable? If it is
 discoverable, then we needn't limit anything in our python code.

 Honestly, we should accept utf8 (no weird mysqlism not quite utf8). We
 should make the default scheme for our dbs support that on names (but
 only for the name columns). The failure of a backend to do utf8 for
 real should return an error to the user. Let's not make this more
 complicated than it needs to be.

 I agree that discoverability for this is not the way to go - I think it's
 too complicated for end users. I don't know enough about mysql to know
 if utf8mb4 is going to be a performance issue, but if it's not then we
 should just support utf-8 properly.

 We can catch the db errors. However, whilst converting db errors
 that cause 500s is fairly straightforward, when an error occurs that
 deep in Nova it also means a lot of potential unwinding work in the db
 and compute layers, which is complicated and error prone. So I'd prefer
 to avoid the situation with input validation in the first place.
 
 Just to add a reference into the discussion:
 
 http://dev.mysql.com/doc/refman/5.5/en/charset-unicode-utf8mb4.html
 
 It does have the same limitation of making fixed width keys and CHAR()
 columns. It goes from 3 bytes per CHAR position, to 4, so it should not
 be a database wide default, but something we use sparingly.
 
 Note that the right answer for things that are not utf-8 (like UUIDs)
 is not to set a charset of latin1, but to use BINARY/VARBINARY. Last
 time I tried I had a difficult time coercing SQLAlchemy to model the
 difference, but maybe I just didn't look in the right part of the manual.

Agreed, honestly if we could get the UUIDs to be actually BINARY in the
database I suspect it would have a pretty substantial performance
increase based on past projects that did the same transition. The join
time goes way down... and at least in Nova we do a ton of joins.
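
To make the size difference concrete, here is a minimal sketch using only Python's standard `uuid` module. The UUID value is a made-up example, and the 108-byte figure assumes the 3-bytes-per-character reservation for fixed-width utf8 keys mentioned earlier in this thread; it says nothing about measured join performance.

```python
import uuid

# Hypothetical example value; any UUID string in canonical form works.
text_form = "6f1c2a7e-3b4d-4c5e-8f90-123456789abc"

# Packed form, suitable for a BINARY(16) column.
packed = uuid.UUID(text_form).bytes

# Worst-case key width for the textual form when each character
# position reserves 3 bytes (MySQL's 3-byte utf8).
utf8_key_bytes = len(text_form) * 3

print(len(text_form), utf8_key_bytes, len(packed))  # prints: 36 108 16

# The packed form converts back to the canonical string for API output.
assert str(uuid.UUID(bytes=packed)) == text_form
```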

-Sean

-- 
Sean Dague
http://dague.net

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [nova] Expand resource name allowed characters

2014-09-18 Thread Christopher Yeoh
On Sat, 13 Sep 2014 06:48:19 -0400
Sean Dague s...@dague.net wrote:

 On 09/13/2014 02:28 AM, Kenichi Oomichi wrote:
  
  Hi Chris,
  
  Thanks for bringing it up here,
  
  -Original Message-
  From: Chris St. Pierre [mailto:stpie...@metacloud.com]
  Sent: Saturday, September 13, 2014 2:53 AM
  To: openstack-dev@lists.openstack.org
  Subject: [openstack-dev] [nova] Expand resource name allowed
  characters
 
  We have proposed that the allowed characters for all resource
  names in Nova (flavors, aggregates, etc.) be expanded to all
  printable unicode characters and horizontal spaces:
  https://review.openstack.org/#/c/119741
 
  Currently, the only allowed characters in most resource names are
  alphanumeric, space, and [.-_].
 
 
  We have proposed this change for two principal reasons:
 
  1. We have customers who have migrated data forward since Essex,
  when no restrictions were in place, and thus have characters in
  resource names that are disallowed in the current version of
  OpenStack. This is only likely to be useful to people migrating
  from Essex or earlier, since the current restrictions were added
  in Folsom.
 
  2. It's pretty much always a bad idea to add unnecessary
  restrictions without a good reason. While we don't have an
  immediate need to use, for example, the ever-useful
  http://codepoints.net/U+1F4A9 in a flavor name, it's hard to come
  up with a reason people *shouldn't* be allowed to use it.
 
  That said, apparently people have had a need to not be allowed to
  use some characters, but it's not clear why:
  https://bugs.launchpad.net/nova/+bug/977187 So I guess if anyone
  knows any reason why these printable characters should not be
  joined in holy resource naming, speak now or forever hold your
  peace.
  
  I also could not find the reason for the current restriction in the bug
  report, and I'd like to know the history.
  On the v2 API (not v2.1), each resource name has the following
  restrictions:
  
  Resource  | Length  | Pattern
  ----------+---------+----------------------------------
  aggregate | 1-255   | nothing
  backup    | nothing | nothing
  flavor    | 1-255   | '^[a-zA-Z0-9. _-]*[a-zA-Z0-9_-]+
            |         |   [a-zA-Z0-9. _-]*$'
  keypair   | 1-255   | '^[a-zA-Z0-9 _-]+$'
  server    | 1-255   | nothing
  cell      | 1-255   | don't contain . and !
  
  On the v2.1 API, we have applied the same restriction rule[1] to all
  resource names for API consistency, so maybe we need to consider
  this topic for all names.
  
  [1]:
  https://github.com/openstack/nova/blob/master/nova/api/validation/parameter_types.py#L44
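
As an illustration of what the v2 table above means in practice, the following sketch applies the flavor and keypair patterns with Python's `re` module. The patterns are copied from the table (the flavor pattern is its two lines joined); the helper name `is_valid` is hypothetical and is not taken from Nova's code.

```python
import re

# Patterns copied from the v2 table; the flavor pattern is the two
# table lines joined into a single string.
FLAVOR_NAME = re.compile(r'^[a-zA-Z0-9. _-]*[a-zA-Z0-9_-]+[a-zA-Z0-9. _-]*$')
KEYPAIR_NAME = re.compile(r'^[a-zA-Z0-9 _-]+$')


def is_valid(pattern, name):
    """Apply the 1-255 length rule and the character pattern."""
    return 1 <= len(name) <= 255 and pattern.match(name) is not None


print(is_valid(FLAVOR_NAME, "m1.small"))     # True
print(is_valid(FLAVOR_NAME, ". ."))          # False: needs a letter, digit, _ or -
print(is_valid(KEYPAIR_NAME, "my.key"))      # False: '.' is not allowed in keypairs
print(is_valid(FLAVOR_NAME, "caf\u00e9"))    # False: non-ASCII letter
```

Both patterns are ASCII-only character classes, so any name containing a non-ASCII character is rejected before it reaches the database.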
 
 Honestly, I bet this had to do with how the database used to be set
 up.
 

So it turns out that utf8 support in MySQL does not support UTF-8 4 byte
multibyte characters (only 1-3 bytes). For example if you do a create
image call with an image name to glance with a 4 byte multibyte
character in the name it will 500. I'd guess we do something
similar in places with the Nova API where we have inadequate input
validation. If you want 4 byte support you need to use utf8mb4 instead.
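
A minimal sketch of how input validation could spot the problematic characters: in UTF-8, exactly the code points above U+FFFF are encoded as 4 bytes, so a check on the code point is enough. The function name is hypothetical.

```python
def four_byte_chars(name):
    """Return the characters in `name` that need 4 bytes in UTF-8.

    Code points above U+FFFF (outside the Basic Multilingual Plane)
    are exactly the ones UTF-8 encodes as 4 bytes.
    """
    return [ch for ch in name if ord(ch) > 0xFFFF]


print(four_byte_chars("tiny-\U0001F4A9"))         # one match: U+1F4A9
print(four_byte_chars("\u65e5\u672c\u8a9e"))      # []: these are 3-byte characters
```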

I don't know if postgresql has any restrictions (I don't think it
does) or whether db2 does. But I don't think we can/should make it a
complete free for all. It should at most be what most databases support.

I think it's a big enough change that this late in the cycle we should
push it off to Kilo. It's always much easier to loosen input validation
than tighten it (or have to have an oops revert on an officially
released Nova). Perhaps we should also add some tests to verify that
everything we allow past the input validation checks is something we
can actually store.


 That being said, I'm pro letting names be 'utf8'. The id fields that
 are strings (like flavor_id) I think we should keep constrained, as
 we do actually do joins on them. (And as Jay said, the current utf8
 schema is actually highly inefficient and bloaty.)

Chris



Re: [openstack-dev] [nova] Expand resource name allowed characters

2014-09-18 Thread Sean Dague
On 09/18/2014 06:38 AM, Christopher Yeoh wrote:
 On Sat, 13 Sep 2014 06:48:19 -0400
 Sean Dague s...@dague.net wrote:
 
 On 09/13/2014 02:28 AM, Kenichi Oomichi wrote:

 Hi Chris,

 Thanks for bring it up here,

 -Original Message-
 From: Chris St. Pierre [mailto:stpie...@metacloud.com]
 Sent: Saturday, September 13, 2014 2:53 AM
 To: openstack-dev@lists.openstack.org
 Subject: [openstack-dev] [nova] Expand resource name allowed
 characters

 We have proposed that the allowed characters for all resource
 names in Nova (flavors, aggregates, etc.) be expanded to all
 printable unicode characters and horizontal spaces:
 https://review.openstack.org/#/c/119741

 Currently, the only allowed characters in most resource names are
 alphanumeric, space, and [.-_].


 We have proposed this change for two principal reasons:

 1. We have customers who have migrated data forward since Essex,
 when no restrictions were in place, and thus have characters in
 resource names that are disallowed in the current version of
 OpenStack. This is only likely to be useful to people migrating
 from Essex or earlier, since the current restrictions were added
 in Folsom.

 2. It's pretty much always a bad idea to add unnecessary
 restrictions without a good reason. While we don't have an
 immediate need to use, for example, the ever-useful
 http://codepoints.net/U+1F4A9 in a flavor name, it's hard to come
 up with a reason people *shouldn't* be allowed to use it.

 That said, apparently people have had a need to not be allowed to
 use some characters, but it's not clear why:
 https://bugs.launchpad.net/nova/+bug/977187 So I guess if anyone
 knows any reason why these printable characters should not be
 joined in holy resource naming, speak now or forever hold your
 peace.

 I also could not find the reason of current restriction on the bug
 report, and I'd like to know it as the history.
 On v2 API(not v2.1), each resource name contains the following
 restriction for its name:

   Resource  | Length  | Pattern
   ----------+---------+----------------------------------
   aggregate | 1-255   | nothing
   backup    | nothing | nothing
   flavor    | 1-255   | '^[a-zA-Z0-9. _-]*[a-zA-Z0-9_-]+
             |         |   [a-zA-Z0-9. _-]*$'
   keypair   | 1-255   | '^[a-zA-Z0-9 _-]+$'
   server    | 1-255   | nothing
   cell      | 1-255   | don't contain . and !

 On v2.1 API, we have applied the same restriction rule[1] for whole
 resource names for API consistency, so maybe we need to consider
 this topic for whole names.

 [1]:
 https://github.com/openstack/nova/blob/master/nova/api/validation/parameter_types.py#L44

 Honestly, I bet this had to do with how the database used to be set
 up.

 
 So it turns out that utf8 support in MySQL does not support UTF-8 4 byte
 multibyte characters (only 1-3 bytes). For example if you do a create
 image call with an image name to glance with a 4 byte multibyte
 character in the name it will 500. I'd guess we do something
 similar in places with the Nova API where we have inadequate input
 validation. If you want 4 byte support you need to use utf8mb4 instead.

Oh... fun. :(

 I don't know if postgresql has any restrictions (I don't think it
 does) or if db2 does too. But I don't think we can/should make it a
 complete free for all. It should at most be what most databases support.
 
 I think its a big enough change that this late in the cycle we should
 push it off to Kilo. It's always much easier to loosen input validation
 than tighten it (or have to have an oops revert on an officially
 released Nova). Perhaps some tests to verify that everything we allow
 past the input validation checks we can actually store.

So, honestly, that seems like a pendulum swing in an odd way.

Havana: use anything you want!
Icehouse: ?
Juno: strict ASCII!
Kilo: utf8

Can't we just catch the db exception correctly in glance and not have it
explode? And then allow it. Exploding with a 500 on a bad name seems the
wrong thing to do anyway.

That would also mean that if the user changed their database to support
utf8mb4 (which they might want to do if it was important to them) it
would all work.
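
A rough sketch of the idea of catching the storage error and answering with a client error rather than a 500. Everything here is hypothetical: `NameStorageError`, `store_name` and `create_image` are stand-ins invented for illustration, not Glance or Nova code, and the toy backend only mimics a 3-byte utf8 column.

```python
class NameStorageError(Exception):
    """Hypothetical stand-in for whatever the DB layer raises."""


def store_name(name):
    """Toy backend that mimics a column limited to 3-byte UTF-8."""
    if any(ord(ch) > 0xFFFF for ch in name):
        raise NameStorageError("backend cannot store name %r" % name)
    return name


def create_image(name):
    """Return an (HTTP status, body) pair instead of letting the error escape."""
    try:
        store_name(name)
    except NameStorageError as exc:
        # The request was well formed but this backend cannot hold the
        # name, so report a client error rather than a server error.
        return 400, {"error": str(exc)}
    return 201, {"name": name}


print(create_image("cirros")[0])         # 201
print(create_image("\U0001F4A9")[0])     # 400
```

With this shape, a deployment whose backend does accept 4-byte characters never raises the error, so the same API code accepts the name without further changes.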

I think some release notes would be fine to explain the current
situation and limitations.

Basically, let's skate towards the puck here, realizing some corner cases
exist, but that we're moving in the direction we want to be, not
backtracking.

-Sean

-- 
Sean Dague
http://dague.net



Re: [openstack-dev] [nova] Expand resource name allowed characters

2014-09-18 Thread Ken'ichi Ohmichi
2014-09-18 19:57 GMT+09:00 Sean Dague s...@dague.net:
 On 09/18/2014 06:38 AM, Christopher Yeoh wrote:
 On Sat, 13 Sep 2014 06:48:19 -0400
 Sean Dague s...@dague.net wrote:

 We have proposed that the allowed characters for all resource
 names in Nova (flavors, aggregates, etc.) be expanded to all
 printable unicode characters and horizontal spaces:
 https://review.openstack.org/#/c/119741

 Currently, the only allowed characters in most resource names are
 alphanumeric, space, and [.-_].


 We have proposed this change for two principal reasons:

 1. We have customers who have migrated data forward since Essex,
 when no restrictions were in place, and thus have characters in
 resource names that are disallowed in the current version of
 OpenStack. This is only likely to be useful to people migrating
 from Essex or earlier, since the current restrictions were added
 in Folsom.

 2. It's pretty much always a bad idea to add unnecessary
 restrictions without a good reason. While we don't have an
 immediate need to use, for example, the ever-useful
 http://codepoints.net/U+1F4A9 in a flavor name, it's hard to come
 up with a reason people *shouldn't* be allowed to use it.

 That said, apparently people have had a need to not be allowed to
 use some characters, but it's not clear why:
 https://bugs.launchpad.net/nova/+bug/977187 So I guess if anyone
 knows any reason why these printable characters should not be
 joined in holy resource naming, speak now or forever hold your
 peace.

 I also could not find the reason of current restriction on the bug
 report, and I'd like to know it as the history.
 On v2 API(not v2.1), each resource name contains the following
 restriction for its name:

   Resource  | Length  | Pattern
   ----------+---------+----------------------------------
   aggregate | 1-255   | nothing
   backup    | nothing | nothing
   flavor    | 1-255   | '^[a-zA-Z0-9. _-]*[a-zA-Z0-9_-]+
             |         |   [a-zA-Z0-9. _-]*$'
   keypair   | 1-255   | '^[a-zA-Z0-9 _-]+$'
   server    | 1-255   | nothing
   cell      | 1-255   | don't contain . and !

 On v2.1 API, we have applied the same restriction rule[1] for whole
 resource names for API consistency, so maybe we need to consider
 this topic for whole names.

 [1]:
 https://github.com/openstack/nova/blob/master/nova/api/validation/parameter_types.py#L44

 Honestly, I bet this had to do with how the database used to be set
 up.


 So it turns out that utf8 support in MySQL does not support UTF-8 4 byte
 multibyte characters (only 1-3 bytes). For example if you do a create
 image call with an image name to glance with a 4 byte multibyte
 character in the name it will 500. I'd guess we do something
 similar in places with the Nova API where we have inadequate input
 validation. If you want 4 byte support you need to use utf8mb4 instead.

 Oh... fun. :(

 I don't know if postgresql has any restrictions (I don't think it
 does) or if db2 does too. But I don't think we can/should make it a
 complete free for all. It should at most be what most databases support.

 I think its a big enough change that this late in the cycle we should
 push it off to Kilo. It's always much easier to loosen input validation
 than tighten it (or have to have an oops revert on an officially
 released Nova). Perhaps some tests to verify that everything we allow
 past the input validation checks we can actually store.

 So, honestly, that seems like a pendulum swing in an odd way.

 Havana use anything you want!
 Icehouse ?
 Juno strict asci!
 Kilo utf8

The correct validation history is:

Essex: use anything you want!
Folsom: strict ASCII!
[..]
Juno: strict ASCII!

So I don't think we should loosen the input validation right now; that
avoids a pendulum swing.


 Can't we just catch the db exception correctly in glance and not have it
 explode? And then allow it. Exploding with a 500 on a bad name seems the
 wrong thing to do anyway.

 That would also mean that if the user changed their database to support
 utf8mb4 (which they might want to do if it was important to them) it
 would all work.

 I think some release notes would be fine to explain the current
 situation and limitations.

 Basically, lets skate towards the puck here, realizing some corner cases
 exist, but that we're moving in the direction we want to be, not back
 tracking.

One idea: how about base64-encoding resource names that contain
non-ASCII characters? The REST API layer would encode a name when
necessary before passing it to the DB layer, so we would not need to
consider backend DB features. The disadvantage is that the available
name length becomes shorter for names with non-ASCII characters. But
maybe that would be acceptable.
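
To put a number on the length trade-off, here is a small sketch using Python's standard `base64` module. The function name and the 255-character column limit are assumptions for illustration; a real scheme would also need some marker to tell encoded names from plain ones, which this thread does not specify.

```python
import base64


def encode_if_needed(name):
    """Return `name` unchanged if it is ASCII, else its base64 form."""
    try:
        name.encode("ascii")
        return name
    except UnicodeEncodeError:
        return base64.b64encode(name.encode("utf-8")).decode("ascii")


# Assumed column width, matching the 1-255 length limit quoted earlier.
MAX_COLUMN = 255

# base64 emits 4 output characters for every 3 input bytes (with
# padding), so the largest UTF-8 payload that still fits is:
max_payload_bytes = (MAX_COLUMN // 4) * 3

print(encode_if_needed("m1.small"))                  # m1.small
print(len(encode_if_needed("\u65e5\u672c\u8a9e")))   # 12 (9 UTF-8 bytes in)
print(max_payload_bytes)                             # 189
```

Under these assumptions a name made of 3-byte characters is limited to 63 characters instead of 255.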

Thanks
Ken'ichi Ohmichi

---



Re: [openstack-dev] [nova] Expand resource name allowed characters

2014-09-18 Thread Sean Dague
On 09/18/2014 07:19 AM, Ken'ichi Ohmichi wrote:
 2014-09-18 19:57 GMT+09:00 Sean Dague s...@dague.net:
 On 09/18/2014 06:38 AM, Christopher Yeoh wrote:
 On Sat, 13 Sep 2014 06:48:19 -0400
 Sean Dague s...@dague.net wrote:

 We have proposed that the allowed characters for all resource
 names in Nova (flavors, aggregates, etc.) be expanded to all
 printable unicode characters and horizontal spaces:
 https://review.openstack.org/#/c/119741

 Currently, the only allowed characters in most resource names are
 alphanumeric, space, and [.-_].


 We have proposed this change for two principal reasons:

 1. We have customers who have migrated data forward since Essex,
 when no restrictions were in place, and thus have characters in
 resource names that are disallowed in the current version of
 OpenStack. This is only likely to be useful to people migrating
 from Essex or earlier, since the current restrictions were added
 in Folsom.

 2. It's pretty much always a bad idea to add unnecessary
 restrictions without a good reason. While we don't have an
 immediate need to use, for example, the ever-useful
 http://codepoints.net/U+1F4A9 in a flavor name, it's hard to come
 up with a reason people *shouldn't* be allowed to use it.

 That said, apparently people have had a need to not be allowed to
 use some characters, but it's not clear why:
 https://bugs.launchpad.net/nova/+bug/977187 So I guess if anyone
 knows any reason why these printable characters should not be
 joined in holy resource naming, speak now or forever hold your
 peace.

 I also could not find the reason of current restriction on the bug
 report, and I'd like to know it as the history.
 On v2 API(not v2.1), each resource name contains the following
 restriction for its name:

   Resource  | Length  | Pattern
   ----------+---------+----------------------------------
   aggregate | 1-255   | nothing
   backup    | nothing | nothing
   flavor    | 1-255   | '^[a-zA-Z0-9. _-]*[a-zA-Z0-9_-]+
             |         |   [a-zA-Z0-9. _-]*$'
   keypair   | 1-255   | '^[a-zA-Z0-9 _-]+$'
   server    | 1-255   | nothing
   cell      | 1-255   | don't contain . and !

 On v2.1 API, we have applied the same restriction rule[1] for whole
 resource names for API consistency, so maybe we need to consider
 this topic for whole names.

 [1]:
 https://github.com/openstack/nova/blob/master/nova/api/validation/parameter_types.py#L44

 Honestly, I bet this had to do with how the database used to be set
 up.


 So it turns out that utf8 support in MySQL does not support UTF-8 4 byte
 multibyte characters (only 1-3 bytes). For example if you do a create
 image call with an image name to glance with a 4 byte multibyte
 character in the name it will 500. I'd guess we do something
 similar in places with the Nova API where we have inadequate input
 validation. If you want 4 byte support you need to use utf8mb4 instead.

 Oh... fun. :(

 I don't know if postgresql has any restrictions (I don't think it
 does) or if db2 does too. But I don't think we can/should make it a
 complete free for all. It should at most be what most databases support.

 I think its a big enough change that this late in the cycle we should
 push it off to Kilo. It's always much easier to loosen input validation
 than tighten it (or have to have an oops revert on an officially
 released Nova). Perhaps some tests to verify that everything we allow
 past the input validation checks we can actually store.

 So, honestly, that seems like a pendulum swing in an odd way.

 Havana use anything you want!
 Icehouse ?
 Juno strict asci!
 Kilo utf8
 
 Correct validation history is
 
 Essex: use anything you want!
 Folsom: strict asci!
 [..]
 Juno: strict asci!

 So I don't think we should make the input validation loose right now
 to avoid a pendulum swing.

Ok, great. That history makes me ok with status quo. I didn't realize it
went back so far.

 Can't we just catch the db exception correctly in glance and not have it
 explode? And then allow it. Exploding with a 500 on a bad name seems the
 wrong thing to do anyway.

 That would also mean that if the user changed their database to support
 utf8mb4 (which they might want to do if it was important to them) it
 would all work.

 I think some release notes would be fine to explain the current
 situation and limitations.

 Basically, lets skate towards the puck here, realizing some corner cases
 exist, but that we're moving in the direction we want to be, not back
 tracking.
 
 One idea is that: How about using base64 encoding/decoding if non-ascii
 chars come? REST API layer would encode resource names if necessary
 before passing it to DB layer, and we don't need to consider backend DB
 features. The disadvantage is that available name length becomes short if
 non-ascii chars. But maybe that would be acceptable..

Honestly, we should pass utf8 through to the db. If the user's db doesn't
support it... that's their implementation issue.

Also... glance should not 

Re: [openstack-dev] [nova] Expand resource name allowed characters

2014-09-18 Thread Flavio Percoco
On 09/18/2014 01:28 PM, Sean Dague wrote:
 On 09/18/2014 07:19 AM, Ken'ichi Ohmichi wrote:
 2014-09-18 19:57 GMT+09:00 Sean Dague s...@dague.net:
 On 09/18/2014 06:38 AM, Christopher Yeoh wrote:
 On Sat, 13 Sep 2014 06:48:19 -0400
 Sean Dague s...@dague.net wrote:

 We have proposed that the allowed characters for all resource
 names in Nova (flavors, aggregates, etc.) be expanded to all
 printable unicode characters and horizontal spaces:
 https://review.openstack.org/#/c/119741

 Currently, the only allowed characters in most resource names are
 alphanumeric, space, and [.-_].


 We have proposed this change for two principal reasons:

 1. We have customers who have migrated data forward since Essex,
 when no restrictions were in place, and thus have characters in
 resource names that are disallowed in the current version of
 OpenStack. This is only likely to be useful to people migrating
 from Essex or earlier, since the current restrictions were added
 in Folsom.

 2. It's pretty much always a bad idea to add unnecessary
 restrictions without a good reason. While we don't have an
 immediate need to use, for example, the ever-useful
 http://codepoints.net/U+1F4A9 in a flavor name, it's hard to come
 up with a reason people *shouldn't* be allowed to use it.

 That said, apparently people have had a need to not be allowed to
 use some characters, but it's not clear why:
 https://bugs.launchpad.net/nova/+bug/977187 So I guess if anyone
 knows any reason why these printable characters should not be
 joined in holy resource naming, speak now or forever hold your
 peace.

 I also could not find the reason of current restriction on the bug
 report, and I'd like to know it as the history.
 On v2 API(not v2.1), each resource name contains the following
 restriction for its name:

   Resource  | Length  | Pattern
   ----------+---------+----------------------------------
   aggregate | 1-255   | nothing
   backup    | nothing | nothing
   flavor    | 1-255   | '^[a-zA-Z0-9. _-]*[a-zA-Z0-9_-]+
             |         |   [a-zA-Z0-9. _-]*$'
   keypair   | 1-255   | '^[a-zA-Z0-9 _-]+$'
   server    | 1-255   | nothing
   cell      | 1-255   | don't contain . and !

 On v2.1 API, we have applied the same restriction rule[1] for whole
 resource names for API consistency, so maybe we need to consider
 this topic for whole names.

 [1]:
 https://github.com/openstack/nova/blob/master/nova/api/validation/parameter_types.py#L44

 Honestly, I bet this had to do with how the database used to be set
 up.


 So it turns out that utf8 support in MySQL does not support UTF-8 4 byte
 multibyte characters (only 1-3 bytes). For example if you do a create
 image call with an image name to glance with a 4 byte multibyte
 character in the name it will 500. I'd guess we do something
 similar in places with the Nova API where we have inadequate input
 validation. If you want 4 byte support you need to use utf8mb4 instead.

 Oh... fun. :(

 I don't know if postgresql has any restrictions (I don't think it
 does) or if db2 does too. But I don't think we can/should make it a
 complete free for all. It should at most be what most databases support.

 I think its a big enough change that this late in the cycle we should
 push it off to Kilo. It's always much easier to loosen input validation
 than tighten it (or have to have an oops revert on an officially
 released Nova). Perhaps some tests to verify that everything we allow
 past the input validation checks we can actually store.

 So, honestly, that seems like a pendulum swing in an odd way.

 Havana use anything you want!
 Icehouse ?
 Juno strict asci!
 Kilo utf8

 Correct validation history is

 Essex: use anything you want!
 Folsom: strict asci!
 [..]
 Juno: strict asci!

 So I don't think we should make the input validation loose right now
 to avoid a pendulum swing.
 
 Ok, great. That history makes me ok with status quo. I didn't realize it
 went back so far.
 
 Can't we just catch the db exception correctly in glance and not have it
 explode? And then allow it. Exploding with a 500 on a bad name seems the
 wrong thing to do anyway.

 That would also mean that if the user changed their database to support
 utf8mb4 (which they might want to do if it was important to them) it
 would all work.

 I think some release notes would be fine to explain the current
 situation and limitations.

 Basically, lets skate towards the puck here, realizing some corner cases
 exist, but that we're moving in the direction we want to be, not back
 tracking.

 One idea is that: How about using base64 encoding/decoding if non-ascii
 chars come? REST API layer would encode resource names if necessary
 before passing it to DB layer, and we don't need to consider backend DB
 features. The disadvantage is that available name length becomes short if
 non-ascii chars. But maybe that would be acceptable..
 
 Honestly, we should utf8 through to the db. If the user's db doesn't
 support it... that's their 

Re: [openstack-dev] [nova] Expand resource name allowed characters

2014-09-18 Thread Chris St. Pierre
On Thu, Sep 18, 2014 at 4:19 AM, Ken'ichi Ohmichi ken1ohmi...@gmail.com
wrote:

 Correct validation history is

 Essex: use anything you want!
 Folsom: strict asci!
 [..]
 Juno: strict asci!


I'm not sure that's quite right. My patch doesn't actually add Unicode
support; that was already added in
825499fffc7a320466e976d2842e175c2d158c0e, which appears to have gone in for
Icehouse.  So:

Essex: Use anything you want
Folsom: Strict ASCII, inconsistent restrictions
Grizzly: Strict ASCII, inconsistent restrictions
Icehouse: Unicode, inconsistent restrictions
Juno: Unicode, consistent restrictions
Kilo (?): Use anything you want

At any rate, if accepting Unicode is an issue, then it's been an issue for
a while.

-- 
Chris St. Pierre
Senior Software Engineer
metacloud.com


Re: [openstack-dev] [nova] Expand resource name allowed characters

2014-09-18 Thread Alex Xu

On 2014-09-18 18:57, Sean Dague wrote:

On 09/18/2014 06:38 AM, Christopher Yeoh wrote:

On Sat, 13 Sep 2014 06:48:19 -0400
Sean Dague s...@dague.net wrote:


On 09/13/2014 02:28 AM, Kenichi Oomichi wrote:

Hi Chris,

Thanks for bring it up here,


-Original Message-
From: Chris St. Pierre [mailto:stpie...@metacloud.com]
Sent: Saturday, September 13, 2014 2:53 AM
To: openstack-dev@lists.openstack.org
Subject: [openstack-dev] [nova] Expand resource name allowed
characters

We have proposed that the allowed characters for all resource
names in Nova (flavors, aggregates, etc.) be expanded to all
printable unicode characters and horizontal spaces:
https://review.openstack.org/#/c/119741

Currently, the only allowed characters in most resource names are
alphanumeric, space, and [.-_].


We have proposed this change for two principal reasons:

1. We have customers who have migrated data forward since Essex,
when no restrictions were in place, and thus have characters in
resource names that are disallowed in the current version of
OpenStack. This is only likely to be useful to people migrating
from Essex or earlier, since the current restrictions were added
in Folsom.

2. It's pretty much always a bad idea to add unnecessary
restrictions without a good reason. While we don't have an
immediate need to use, for example, the ever-useful
http://codepoints.net/U+1F4A9 in a flavor name, it's hard to come
up with a reason people *shouldn't* be allowed to use it.

That said, apparently people have had a need to not be allowed to
use some characters, but it's not clear why:
https://bugs.launchpad.net/nova/+bug/977187 So I guess if anyone
knows any reason why these printable characters should not be
joined in holy resource naming, speak now or forever hold your
peace.

I also could not find the reason of current restriction on the bug
report, and I'd like to know it as the history.
On v2 API(not v2.1), each resource name contains the following
restriction for its name:

   Resource  | Length  | Pattern
   ----------+---------+----------------------------------
   aggregate | 1-255   | nothing
   backup    | nothing | nothing
   flavor    | 1-255   | '^[a-zA-Z0-9. _-]*[a-zA-Z0-9_-]+
             |         |   [a-zA-Z0-9. _-]*$'
   keypair   | 1-255   | '^[a-zA-Z0-9 _-]+$'
   server    | 1-255   | nothing
   cell      | 1-255   | don't contain . and !

On v2.1 API, we have applied the same restriction rule[1] for whole
resource names for API consistency, so maybe we need to consider
this topic for whole names.

[1]:
https://github.com/openstack/nova/blob/master/nova/api/validation/parameter_types.py#L44

Honestly, I bet this had to do with how the database used to be set
up.


So it turns out that utf8 support in MySQL does not support UTF-8 4 byte
multibyte characters (only 1-3 bytes). For example if you do a create
image call with an image name to glance with a 4 byte multibyte
character in the name it will 500. I'd guess we do something
similar in places with the Nova API where we have inadequate input
validation. If you want 4 byte support you need to use utf8mb4 instead.

Oh... fun. :(


I don't know if postgresql has any restrictions (I don't think it
does) or if db2 does too. But I don't think we can/should make it a
complete free for all. It should at most be what most databases support.

I think its a big enough change that this late in the cycle we should
push it off to Kilo. It's always much easier to loosen input validation
than tighten it (or have to have an oops revert on an officially
released Nova). Perhaps some tests to verify that everything we allow
past the input validation checks we can actually store.

So, honestly, that seems like a pendulum swing in an odd way.

Havana use anything you want!
Icehouse ?
Juno strict asci!
Kilo utf8

Can't we just catch the db exception correctly in glance and not have it
explode? And then allow it. Exploding with a 500 on a bad name seems the
wrong thing to do anyway.

That would also mean that if the user changed their database to support
utf8mb4 (which they might want to do if it was important to them) it
would all work.

I think some release notes would be fine to explain the current
situation and limitations.

Basically, lets skate towards the puck here, realizing some corner cases
exist, but that we're moving in the direction we want to be, not back
tracking.

When we can return the json-schema to the user in the future, can we say
that whether the API accepts utf8 or utf8mb4 is discoverable? If it is
discoverable, then we needn't limit anything in our python code.
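
A sketch of what such discoverability might look like from the client side, assuming the API published a JSON Schema fragment for the name parameter. The schema contents, `NAME_SCHEMA` and `client_check` are all hypothetical; only the JSON Schema keywords (`type`, `minLength`, `maxLength`, `pattern`) are standard.

```python
import json
import re

# Hypothetical schema fragment the API might publish for a name field.
NAME_SCHEMA = {
    "type": "string",
    "minLength": 1,
    "maxLength": 255,
    "pattern": "^[a-zA-Z0-9. _-]+$",
}

# Simulate the client fetching the schema over the wire as JSON.
published = json.loads(json.dumps(NAME_SCHEMA))


def client_check(schema, value):
    """Check a candidate name against the published constraints."""
    return (
        isinstance(value, str)
        and schema["minLength"] <= len(value) <= schema["maxLength"]
        # JSON Schema's `pattern` is an unanchored search, like re.search.
        and re.search(schema["pattern"], value) is not None
    )


print(client_check(published, "m1.small"))    # True
print(client_check(published, "caf\u00e9"))   # False under this pattern
```

A deployment backed by utf8mb4 could publish a wider pattern, and clients would pick up the difference without any change to their own code.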





Re: [openstack-dev] [nova] Expand resource name allowed characters

2014-09-18 Thread Sean Dague
On 09/18/2014 12:08 PM, Alex Xu wrote:
 On 2014-09-18 18:57, Sean Dague wrote:
 On 09/18/2014 06:38 AM, Christopher Yeoh wrote:
 On Sat, 13 Sep 2014 06:48:19 -0400
 Sean Dague s...@dague.net wrote:

 On 09/13/2014 02:28 AM, Kenichi Oomichi wrote:
 Hi Chris,

 Thanks for bringing it up here,

 -Original Message-
 From: Chris St. Pierre [mailto:stpie...@metacloud.com]
 Sent: Saturday, September 13, 2014 2:53 AM
 To: openstack-dev@lists.openstack.org
 Subject: [openstack-dev] [nova] Expand resource name allowed
 characters

 We have proposed that the allowed characters for all resource
 names in Nova (flavors, aggregates, etc.) be expanded to all
 printable unicode characters and horizontal spaces:
 https://review.openstack.org/#/c/119741

 Currently, the only allowed characters in most resource names are
 alphanumeric, space, and [.-_].


 We have proposed this change for two principal reasons:

 1. We have customers who have migrated data forward since Essex,
 when no restrictions were in place, and thus have characters in
 resource names that are disallowed in the current version of
 OpenStack. This is only likely to be useful to people migrating
 from Essex or earlier, since the current restrictions were added
 in Folsom.

 2. It's pretty much always a bad idea to add unnecessary
 restrictions without a good reason. While we don't have an
 immediate need to use, for example, the ever-useful
 http://codepoints.net/U+1F4A9 in a flavor name, it's hard to come
 up with a reason people *shouldn't* be allowed to use it.

 That said, apparently people have had a need to not be allowed to
 use some characters, but it's not clear why:
 https://bugs.launchpad.net/nova/+bug/977187 So I guess if anyone
 knows any reason why these printable characters should not be
 joined in holy resource naming, speak now or forever hold your
 peace.
 I also could not find the reason for the current restrictions in the bug
 report, and I'd like to know the history.
 On the v2 API (not v2.1), each resource name has the following
 restrictions:

 Resource  | Length  | Pattern
 ----------+---------+----------------------------------
 aggregate | 1-255   | nothing
 backup    | nothing | nothing
 flavor    | 1-255   | '^[a-zA-Z0-9. _-]*[a-zA-Z0-9_-]+
           |         |   [a-zA-Z0-9. _-]*$'
 keypair   | 1-255   | '^[a-zA-Z0-9 _-]+$'
 server    | 1-255   | nothing
 cell      | 1-255   | must not contain . or !

 On the v2.1 API, we have applied the same restriction rule [1] to all
 resource names for API consistency, so maybe we need to consider
 this topic for all names.

 [1]:
 https://github.com/openstack/nova/blob/master/nova/api/validation/parameter_types.py#L44

 Honestly, I bet this had to do with how the database used to be set
 up.

 So it turns out that utf8 support in MySQL does not support UTF-8 4 byte
 multibyte characters (only 1-3 bytes). For example if you do a create
 image call with an image name to glance with a 4 byte multibyte
 character in the name it will 500. I'd guess we do something
 similar in places with the Nova API where we have inadequate input
 validation. If you want 4 byte support you need to use utf8mb4 instead.
 Oh... fun. :(

 I don't know if postgresql has any restrictions (I don't think it
 does) or if db2 does too. But I don't think we can/should make it a
 complete free for all. It should at most be what most databases support.

 I think it's a big enough change that this late in the cycle we should
 push it off to Kilo. It's always much easier to loosen input validation
 than tighten it (or have to have an oops revert on an officially
 released Nova). Perhaps some tests to verify that everything we allow
 past the input validation checks we can actually store.
 So, honestly, that seems like a pendulum swing in an odd way.

 Havana: use anything you want!
 Icehouse: ?
 Juno: strict ASCII!
 Kilo: utf8

 Can't we just catch the db exception correctly in glance and not have it
 explode? And then allow it. Exploding with a 500 on a bad name seems the
 wrong thing to do anyway.

 That would also mean that if the user changed their database to support
 utf8mb4 (which they might want to do if it was important to them) it
 would all work.

 I think some release notes would be fine to explain the current
 situation and limitations.

 Basically, let's skate towards the puck here, realizing some corner cases
 exist, but that we're moving in the direction we want to be, not
 backtracking.

 When we can return the json-schema to user in the future, can we say
 that means API accepting utf8 or utf8mb4 is discoverable? If it is
 discoverable, then we needn't limit anything in our python code.

Honestly, we should accept utf8 (no weird mysqlism not quite utf8). We
should make the default scheme for our dbs support that on names (but
only for the name columns). The failure of a backend to do utf8 for real
should return an error to the user. Let's not make this more 

Re: [openstack-dev] [nova] Expand resource name allowed characters

2014-09-18 Thread Christopher Yeoh
On Thu, 18 Sep 2014 13:45:42 +0200
Flavio Percoco fla...@redhat.com wrote:

 On 09/18/2014 01:28 PM, Sean Dague wrote:
  On 09/18/2014 07:19 AM, Ken'ichi Ohmichi wrote:
  2014-09-18 19:57 GMT+09:00 Sean Dague s...@dague.net:
  On 09/18/2014 06:38 AM, Christopher Yeoh wrote:
  On Sat, 13 Sep 2014 06:48:19 -0400
  Sean Dague s...@dague.net wrote:
 
  We have proposed that the allowed characters for all resource
  names in Nova (flavors, aggregates, etc.) be expanded to all
  printable unicode characters and horizontal spaces:
  https://review.openstack.org/#/c/119741
 
  Currently, the only allowed characters in most resource names
  are alphanumeric, space, and [.-_].
 
 
  We have proposed this change for two principal reasons:
 
  1. We have customers who have migrated data forward since
  Essex, when no restrictions were in place, and thus have
  characters in resource names that are disallowed in the
  current version of OpenStack. This is only likely to be
  useful to people migrating from Essex or earlier, since the
  current restrictions were added in Folsom.
 
  2. It's pretty much always a bad idea to add unnecessary
  restrictions without a good reason. While we don't have an
  immediate need to use, for example, the ever-useful
  http://codepoints.net/U+1F4A9 in a flavor name, it's hard to
  come up with a reason people *shouldn't* be allowed to use it.
 
  That said, apparently people have had a need to not be
  allowed to use some characters, but it's not clear why:
  https://bugs.launchpad.net/nova/+bug/977187 So I guess if
  anyone knows any reason why these printable characters should
  not be joined in holy resource naming, speak now or forever
  hold your peace.
 
  I also could not find the reason for the current restrictions in the
  bug report, and I'd like to know the history.
  On the v2 API (not v2.1), each resource name has the following
  restrictions:

  Resource  | Length  | Pattern
  ----------+---------+----------------------------------
  aggregate | 1-255   | nothing
  backup    | nothing | nothing
  flavor    | 1-255   | '^[a-zA-Z0-9. _-]*[a-zA-Z0-9_-]+
            |         |   [a-zA-Z0-9. _-]*$'
  keypair   | 1-255   | '^[a-zA-Z0-9 _-]+$'
  server    | 1-255   | nothing
  cell      | 1-255   | must not contain . or !

  On the v2.1 API, we have applied the same restriction rule [1] to
  all resource names for API consistency, so maybe we need to
  consider this topic for all names.
 
  [1]:
  https://github.com/openstack/nova/blob/master/nova/api/validation/parameter_types.py#L44
 
  Honestly, I bet this had to do with how the database used to be
  set up.
 
 
  So it turns out that utf8 support in MySQL does not support
  UTF-8 4 byte multibyte characters (only 1-3 bytes). For example
  if you do a create image call with an image name to glance with
  a 4 byte multibyte character in the name it will 500. I'd guess
  we do something similar in places with the Nova API where we
  have inadequate input validation. If you want 4 byte support you
  need to use utf8mb4 instead.
 
  Oh... fun. :(
 
  I don't know if postgresql has any restrictions (I don't think it
  does) or if db2 does too. But I don't think we can/should make
  it a complete free for all. It should at most be what most
  databases support.
 
  I think it's a big enough change that this late in the cycle we
  should push it off to Kilo. It's always much easier to loosen
  input validation than tighten it (or have to have an oops
  revert on an officially released Nova). Perhaps some tests to
  verify that everything we allow past the input validation checks
  we can actually store.
 
  So, honestly, that seems like a pendulum swing in an odd way.
 
  Havana: use anything you want!
  Icehouse: ?
  Juno: strict ASCII!
  Kilo: utf8
 
  Correct validation history is
 
  Essex: use anything you want!
  Folsom: strict ASCII!
  [..]
  Juno: strict ASCII!
 
  So I don't think we should make the input validation loose right
  now to avoid a pendulum swing.
  
  Ok, great. That history makes me ok with status quo. I didn't
  realize it went back so far.
  
  Can't we just catch the db exception correctly in glance and not
  have it explode? And then allow it. Exploding with a 500 on a bad
  name seems the wrong thing to do anyway.
 
  That would also mean that if the user changed their database to
  support utf8mb4 (which they might want to do if it was important
  to them) it would all work.
 
  I think some release notes would be fine to explain the current
  situation and limitations.
 
  Basically, let's skate towards the puck here, realizing some
  corner cases exist, but that we're moving in the direction we
  want to be, not backtracking.
 
  One idea: how about using base64 encoding/decoding when non-ASCII
  characters come in? The REST API layer would encode resource names
  if necessary before passing them to the DB layer, and we wouldn't
  need to consider backend DB features. The disadvantage is that 

Re: [openstack-dev] [nova] Expand resource name allowed characters

2014-09-18 Thread Monty Taylor
On 09/17/2014 10:44 AM, Day, Phil wrote:
 -Original Message-
 From: Jay Pipes [mailto:jaypi...@gmail.com]
 Sent: 12 September 2014 19:37
 To: openstack-dev@lists.openstack.org
 Subject: Re: [openstack-dev] [nova] Expand resource name allowed
 characters

 Had to laugh about the PILE OF POO character :) Comments inline...

 Can we get support for that in gerrit ?

I think we should have +2, +1, -1 and UNICODE POO




Re: [openstack-dev] [nova] Expand resource name allowed characters

2014-09-18 Thread Clint Byrum
Excerpts from Christopher Yeoh's message of 2014-09-18 16:57:12 -0700:
 On Thu, 18 Sep 2014 12:12:28 -0400
 Sean Dague s...@dague.net wrote:
   When we can return the json-schema to user in the future, can we say
   that means API accepting utf8 or utf8mb4 is discoverable? If it is
   discoverable, then we needn't limit anything in our python code.
  
  Honestly, we should accept utf8 (no weird mysqlism not quite utf8). We
  should make the default scheme for our dbs support that on names (but
  only for the name columns). The failure of a backend to do utf8 for
  real should return an error to the user. Let's not make this more
  complicated than it needs to be.
 
 I agree that discoverability for this is not the way to go - I think it's
 too complicated for end users. I don't know enough about mysql to know
 if utf8mb4 is going to be a performance issue, but if it's not then we
 should just support utf-8 properly.
 
 We can catch the db errors. However, whilst converting db errors that
 cause 500s is fairly straightforward, when an error occurs that deep in
 Nova it also means a lot of potential unwinding work in the db and
 compute layers, which is complicated and error prone. So I'd prefer
 to avoid the situation with input validation in the first place.

Just to add a reference into the discussion:

http://dev.mysql.com/doc/refman/5.5/en/charset-unicode-utf8mb4.html

It has the same limitation as utf8 for fixed-width keys and CHAR()
columns. It goes from 3 bytes per CHAR position to 4, so it should not
be a database-wide default, but something we use sparingly.

Note that the right answer for things that are not utf-8 (like UUIDs)
is not to set a charset of latin1, but to use BINARY/VARBINARY. Last
time I tried I had a difficult time coercing SQLAlchemy to model the
difference.. but maybe I just didn't look in the right part of the manual.
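
As a rough illustration of the BINARY point, here is a small Python sketch comparing the storage a UUID needs as text against its raw 16-byte form. The per-character byte counts are the worst-case widths discussed above (3 for utf8, 4 for utf8mb4); the CHAR(36)/BINARY(16) column choices are assumptions for the example, not the actual Nova schema:

```python
import uuid

u = uuid.UUID("12345678-1234-5678-1234-567812345678")

text_form = str(u)      # hyphenated hex form, as stored in a CHAR(36) column
binary_form = u.bytes   # raw form, as would be stored in a BINARY(16) column

chars = len(text_form)
print("characters in text form:", chars)
print("worst-case bytes as utf8 CHAR:", chars * 3)
print("worst-case bytes as utf8mb4 CHAR:", chars * 4)
print("bytes as BINARY:", len(binary_form))

# The binary form round-trips back to the same UUID.
assert uuid.UUID(bytes=binary_form) == u
```

The text form only ever contains hex digits and hyphens, so a multibyte charset buys nothing for such a column.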



Re: [openstack-dev] [nova] Expand resource name allowed characters

2014-09-18 Thread Sean Dague
On 09/18/2014 07:57 PM, Christopher Yeoh wrote:
 On Thu, 18 Sep 2014 12:12:28 -0400
 Sean Dague s...@dague.net wrote:
 When we can return the json-schema to user in the future, can we say
 that means API accepting utf8 or utf8mb4 is discoverable? If it is
 discoverable, then we needn't limit anything in our python code.

 Honestly, we should accept utf8 (no weird mysqlism not quite utf8). We
 should make the default scheme for our dbs support that on names (but
 only for the name columns). The failure of a backend to do utf8 for
 real should return an error to the user. Let's not make this more
 complicated than it needs to be.
 
 I agree that discoverability for this is not the way to go - I think it's
 too complicated for end users. I don't know enough about mysql to know
 if utf8mb4 is going to be a performance issue, but if it's not then we
 should just support utf-8 properly.
 
 We can catch the db errors. However, whilst converting db errors that
 cause 500s is fairly straightforward, when an error occurs that deep in
 Nova it also means a lot of potential unwinding work in the db and
 compute layers, which is complicated and error prone. So I'd prefer
 to avoid the situation with input validation in the first place.

Honestly, it's not that bad. We've been catching and translating those
errors in the past.

Not supporting the whole utf8 charset is silly; it's there for a reason.

And the point being, we'll make the db name fields catch up over time.
Again, skate to where you think the puck is going to be.
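
For what "catching and translating" could look like, here is a minimal Python sketch. Every name in it (StorageEncodingError, BadRequest, save_name, three_byte_store) is hypothetical and only stands in for the real driver and API exceptions; it is not how Nova or Glance actually do it:

```python
class StorageEncodingError(Exception):
    """Hypothetical stand-in for the error a backend raises on an unstorable string."""


class BadRequest(Exception):
    """Hypothetical API-level error that maps to an HTTP 400 instead of a 500."""
    status = 400


def three_byte_store(name):
    """Fake backend that, like MySQL's 3-byte utf8, rejects 4-byte characters."""
    if any(ord(ch) > 0xFFFF for ch in name):
        raise StorageEncodingError(name)
    return name


def save_name(store, name):
    """Call the backend and translate its encoding failure into a client error."""
    try:
        return store(name)
    except StorageEncodingError as exc:
        raise BadRequest("name contains characters the backend cannot store") from exc


print(save_name(three_byte_store, "m1.small"))
try:
    save_name(three_byte_store, "\U0001F4A9")
except BadRequest as exc:
    print(exc.status, exc)
```

With this shape, a deployment whose database does handle 4-byte characters simply never raises the backend error, so the same name is accepted there.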

-Sean

-- 
Sean Dague
http://dague.net



Re: [openstack-dev] [nova] Expand resource name allowed characters

2014-09-17 Thread Day, Phil
 -Original Message-
 From: Jay Pipes [mailto:jaypi...@gmail.com]
 Sent: 12 September 2014 19:37
 To: openstack-dev@lists.openstack.org
 Subject: Re: [openstack-dev] [nova] Expand resource name allowed
 characters
 
 Had to laugh about the PILE OF POO character :) Comments inline...

Can we get support for that in gerrit ?
 



Re: [openstack-dev] [nova] Expand resource name allowed characters

2014-09-15 Thread Daniel P. Berrange
On Fri, Sep 12, 2014 at 01:52:35PM -0400, Chris St. Pierre wrote:
 We have proposed that the allowed characters for all resource names in Nova
 (flavors, aggregates, etc.) be expanded to all printable unicode characters
 and horizontal spaces: https://review.openstack.org/#/c/119741
 
 Currently, the only allowed characters in most resource names are
 alphanumeric, space, and [.-_].
 
 We have proposed this change for two principal reasons:
 
 1. We have customers who have migrated data forward since Essex, when no
 restrictions were in place, and thus have characters in resource names that
 are disallowed in the current version of OpenStack. This is only likely to
 be useful to people migrating from Essex or earlier, since the current
 restrictions were added in Folsom.
 
 2. It's pretty much always a bad idea to add unnecessary restrictions
 without a good reason. While we don't have an immediate need to use, for
 example, the ever-useful http://codepoints.net/U+1F4A9 in a flavor name,
 it's hard to come up with a reason people *shouldn't* be allowed to use it.
 
 That said, apparently people have had a need to not be allowed to use some
 characters, but it's not clear why:
 https://bugs.launchpad.net/nova/+bug/977187
 
 So I guess if anyone knows any reason why these printable characters should
 not be joined in holy resource naming, speak now or forever hold your peace.

I would argue that any place where there is a user-specified, free-form
string intended for end-user consumption should be totally unrestricted
in the characters it allows. To arbitrarily restrict the user is a bug.
If there are current technical reasons for the restriction we should look
at what we must do to resolve them.
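
One possible reading of "all printable unicode characters and horizontal spaces" from the proposal quoted above, sketched in Python. This is an assumption about what such a check might look like, not the code under review: it rejects the Unicode "C" categories (control, format, surrogate, private use, unassigned) and the line/paragraph separators, and accepts everything else, including ordinary spaces:

```python
import unicodedata


def is_printable_name(name):
    """True if name is non-empty and has no control-like or line-breaking characters."""
    if not name:
        return False
    for ch in name:
        category = unicodedata.category(ch)
        if category.startswith("C"):      # Cc, Cf, Cs, Co, Cn
            return False
        if category in ("Zl", "Zp"):      # line and paragraph separators
            return False
    return True


print(is_printable_name("m1.small"))
print(is_printable_name("caf\u00e9 \U0001F4A9"))
print(is_printable_name("bad\nname"))
```

Note that under this reading a tab character is rejected, since Unicode classifies it as a control character; whether tabs should count as "horizontal spaces" is a policy choice this sketch does not settle.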

Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|



Re: [openstack-dev] [nova] Expand resource name allowed characters

2014-09-15 Thread Chris St. Pierre
On Mon, Sep 15, 2014 at 4:34 AM, Daniel P. Berrange berra...@redhat.com
wrote:

 To arbitrarily restrict the user is a bug.


QFT.

This is why I don't feel like a blueprint should be necessary -- this is a
fairly simple change that fixes what's pretty undeniably a bug. I also
don't see much consensus on whether or not I need to go through the
interminable blueprint process to get this accepted.

So since everyone seems to think that this is at least not a bad idea, and
since no one seems to know why it was originally changed, what stands
between me and a +2?

Thanks.

-- 
Chris St. Pierre
Senior Software Engineer
metacloud.com


Re: [openstack-dev] [nova] Expand resource name allowed characters

2014-09-15 Thread Daniel P. Berrange
On Mon, Sep 15, 2014 at 09:21:45AM -0500, Chris St. Pierre wrote:
 On Mon, Sep 15, 2014 at 4:34 AM, Daniel P. Berrange berra...@redhat.com
 wrote:
 
  To arbitrarily restrict the user is a bug.
 
 
 QFT.
 
 This is why I don't feel like a blueprint should be necessary -- this is a
 fairly simple change that fixes what's pretty undeniably a bug. I also
 don't see much consensus on whether or not I need to go through the
 interminable blueprint process to get this accepted.
 
 So since everyone seems to think that this is at least not a bad idea, and
 since no one seems to know why it was originally changed, what stands
 between me and a +2?

Submit a fix for it, I'll happily +2 it without a blueprint. We're going
to be adopting a more lenient policy on what needs a blueprint in kilo
and so I don't think this would need one in that proposal anyway.

Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|



Re: [openstack-dev] [nova] Expand resource name allowed characters

2014-09-15 Thread Jay Pipes

On 09/15/2014 10:21 AM, Chris St. Pierre wrote:

On Mon, Sep 15, 2014 at 4:34 AM, Daniel P. Berrange berra...@redhat.com wrote:

To arbitrarily restrict the user is a bug.


QFT.

This is why I don't feel like a blueprint should be necessary -- this is
a fairly simple change that fixes what's pretty undeniably a bug. I
also don't see much consensus on whether or not I need to go through the
interminable blueprint process to get this accepted.

So since everyone seems to think that this is at least not a bad idea,
and since no one seems to know why it was originally changed,


I believe I did:

http://lists.openstack.org/pipermail/openstack-dev/2014-September/045924.html

what stands between me and a +2?


Bug fix priorities, feature freeze exceptions, and review load.

Best,
-jay




Re: [openstack-dev] [nova] Expand resource name allowed characters

2014-09-15 Thread Chris St. Pierre
On Mon, Sep 15, 2014 at 11:16 AM, Jay Pipes jaypi...@gmail.com wrote:

 I believe I did:

 http://lists.openstack.org/pipermail/openstack-dev/2014-September/045924.html


Sorry, missed your explanation. I think Sean's suggestion -- to keep ID
fields restricted, but de-restrict name fields -- walks a nice middle
ground between database bloat/performance concerns and user experience.


 what stands between me and a +2?


 Bug fix priorities, feature freeze exceptions, and review load.


Well, sure. I meant other than that. :)

My review is at https://review.openstack.org/#/c/119421/ if anyone does
find time to +N it. Thanks all!

-- 
Chris St. Pierre
Senior Software Engineer
metacloud.com


Re: [openstack-dev] [nova] Expand resource name allowed characters

2014-09-15 Thread Chris St. Pierre
Linking clearly isn't my strong suit:
https://review.openstack.org/#/c/119741/

On Mon, Sep 15, 2014 at 1:58 PM, Chris St. Pierre stpie...@metacloud.com
wrote:

 On Mon, Sep 15, 2014 at 11:16 AM, Jay Pipes jaypi...@gmail.com wrote:

 I believe I did:

 http://lists.openstack.org/pipermail/openstack-dev/2014-September/045924.html


 Sorry, missed your explanation. I think Sean's suggestion -- to keep ID
 fields restricted, but de-restrict name fields -- walks a nice middle
 ground between database bloat/performance concerns and user experience.


 what stands between me and a +2?


 Bug fix priorities, feature freeze exceptions, and review load.


 Well, sure. I meant other than that. :)

 My review is at https://review.openstack.org/#/c/119421/ if anyone does
 find time to +N it. Thanks all!

 --
 Chris St. Pierre
 Senior Software Engineer
 metacloud.com




-- 
Chris St. Pierre
Senior Software Engineer
metacloud.com


Re: [openstack-dev] [nova] Expand resource name allowed characters

2014-09-13 Thread Kenichi Oomichi

Hi Chris,

Thanks for bringing it up here,

 -Original Message-
 From: Chris St. Pierre [mailto:stpie...@metacloud.com]
 Sent: Saturday, September 13, 2014 2:53 AM
 To: openstack-dev@lists.openstack.org
 Subject: [openstack-dev] [nova] Expand resource name allowed characters
 
 We have proposed that the allowed characters for all resource names in Nova 
 (flavors, aggregates, etc.) be expanded to
 all printable unicode characters and horizontal spaces: 
 https://review.openstack.org/#/c/119741
 
 Currently, the only allowed characters in most resource names are 
 alphanumeric, space, and [.-_].
 
 
 We have proposed this change for two principal reasons:
 
 1. We have customers who have migrated data forward since Essex, when no 
 restrictions were in place, and thus have characters
 in resource names that are disallowed in the current version of OpenStack. 
 This is only likely to be useful to people
 migrating from Essex or earlier, since the current restrictions were added in 
 Folsom.
 
 2. It's pretty much always a bad idea to add unnecessary restrictions without 
 a good reason. While we don't have an immediate
 need to use, for example, the ever-useful http://codepoints.net/U+1F4A9 in a 
 flavor name, it's hard to come up with a
 reason people *shouldn't* be allowed to use it.
 
 That said, apparently people have had a need to not be allowed to use some 
 characters, but it's not clear why:
 https://bugs.launchpad.net/nova/+bug/977187
 So I guess if anyone knows any reason why these printable characters should 
 not be joined in holy resource naming, speak
 now or forever hold your peace.

I also could not find the reason for the current restrictions in the bug
report, and I'd like to know the history.
On the v2 API (not v2.1), each resource name has the following restrictions:

  Resource  | Length  | Pattern
  ----------+---------+----------------------------------
  aggregate | 1-255   | nothing
  backup    | nothing | nothing
  flavor    | 1-255   | '^[a-zA-Z0-9. _-]*[a-zA-Z0-9_-]+
            |         |   [a-zA-Z0-9. _-]*$'
  keypair   | 1-255   | '^[a-zA-Z0-9 _-]+$'
  server    | 1-255   | nothing
  cell      | 1-255   | must not contain . or !

On the v2.1 API, we have applied the same restriction rule [1] to all
resource names for API consistency, so maybe we need to consider this topic
for all names.

[1]: 
https://github.com/openstack/nova/blob/master/nova/api/validation/parameter_types.py#L44
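
The flavor and keypair patterns from the table above can be tried out directly. This is a minimal Python sketch, assuming the flavor pattern's second table line is just a wrapped continuation of one regular expression; the dictionary and function names are made up for the example:

```python
import re

# Patterns copied from the table; the flavor pattern is joined from its two lines.
NAME_PATTERNS = {
    "flavor": re.compile(r"^[a-zA-Z0-9. _-]*[a-zA-Z0-9_-]+[a-zA-Z0-9. _-]*$"),
    "keypair": re.compile(r"^[a-zA-Z0-9 _-]+$"),
}
MIN_LENGTH, MAX_LENGTH = 1, 255


def is_valid_name(resource, name):
    """Check the 1-255 length rule and the resource's pattern from the table."""
    if not MIN_LENGTH <= len(name) <= MAX_LENGTH:
        return False
    return NAME_PATTERNS[resource].match(name) is not None


for resource, name in [
    ("flavor", "m1.small"),
    ("flavor", " . "),          # no alphanumeric, underscore or hyphen at all
    ("flavor", "\U0001F4A9"),   # U+1F4A9 is outside the allowed set
    ("keypair", "my key_1"),
    ("keypair", "my.key"),      # keypair names do not allow '.'
]:
    print(resource, repr(name), is_valid_name(resource, name))
```

The middle group of the flavor pattern is what forces at least one character that is not a space or a dot.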

Thanks
Ken'ichi Ohmichi



Re: [openstack-dev] [nova] Expand resource name allowed characters

2014-09-13 Thread Sean Dague
On 09/13/2014 02:28 AM, Kenichi Oomichi wrote:
 
 Hi Chris,
 
 Thanks for bringing it up here,
 
 -Original Message-
 From: Chris St. Pierre [mailto:stpie...@metacloud.com]
 Sent: Saturday, September 13, 2014 2:53 AM
 To: openstack-dev@lists.openstack.org
 Subject: [openstack-dev] [nova] Expand resource name allowed characters

 We have proposed that the allowed characters for all resource names in Nova 
 (flavors, aggregates, etc.) be expanded to
 all printable unicode characters and horizontal spaces: 
 https://review.openstack.org/#/c/119741

 Currently, the only allowed characters in most resource names are 
 alphanumeric, space, and [.-_].


 We have proposed this change for two principal reasons:

 1. We have customers who have migrated data forward since Essex, when no 
 restrictions were in place, and thus have characters
 in resource names that are disallowed in the current version of OpenStack. 
 This is only likely to be useful to people
 migrating from Essex or earlier, since the current restrictions were added 
 in Folsom.

 2. It's pretty much always a bad idea to add unnecessary restrictions 
 without a good reason. While we don't have an immediate
 need to use, for example, the ever-useful http://codepoints.net/U+1F4A9 in a 
 flavor name, it's hard to come up with a
 reason people *shouldn't* be allowed to use it.

 That said, apparently people have had a need to not be allowed to use some 
 characters, but it's not clear why:
 https://bugs.launchpad.net/nova/+bug/977187
 So I guess if anyone knows any reason why these printable characters should 
 not be joined in holy resource naming, speak
 now or forever hold your peace.
 
 I also could not find the reason for the current restrictions in the bug
 report, and I'd like to know the history.
 On the v2 API (not v2.1), each resource name has the following restrictions:
 
   Resource  | Length  | Pattern
   ----------+---------+----------------------------------
   aggregate | 1-255   | nothing
   backup    | nothing | nothing
   flavor    | 1-255   | '^[a-zA-Z0-9. _-]*[a-zA-Z0-9_-]+
             |         |   [a-zA-Z0-9. _-]*$'
   keypair   | 1-255   | '^[a-zA-Z0-9 _-]+$'
   server    | 1-255   | nothing
   cell      | 1-255   | must not contain . or !
 
 On the v2.1 API, we have applied the same restriction rule [1] to all
 resource names for API consistency, so maybe we need to consider this topic
 for all names.
 
 [1]: 
 https://github.com/openstack/nova/blob/master/nova/api/validation/parameter_types.py#L44

Honestly, I bet this had to do with how the database used to be set up.

That being said, I'm pro letting names be 'utf8'. The id fields that are
strings (like flavor_id) I think we should keep constrained, as we do
actually do joins on them. (And as Jay said, the current utf8 schema is
actually highly inefficient and bloaty.)

-Sean

-- 
Sean Dague
http://dague.net



Re: [openstack-dev] [nova] Expand resource name allowed characters

2014-09-12 Thread Jay Pipes

Had to laugh about the PILE OF POO character :) Comments inline...

On 09/12/2014 01:52 PM, Chris St. Pierre wrote:

We have proposed that the allowed characters for all resource names in
Nova (flavors, aggregates, etc.) be expanded to all printable unicode
characters and horizontal spaces: https://review.openstack.org/#/c/119741

Currently, the only allowed characters in most resource names are
alphanumeric, space, and [.-_].

We have proposed this change for two principal reasons:

1. We have customers who have migrated data forward since Essex, when no
restrictions were in place, and thus have characters in resource names
that are disallowed in the current version of OpenStack. This is only
likely to be useful to people migrating from Essex or earlier, since the
current restrictions were added in Folsom.


As this will affect the public REST APIs, the change should have a 
blueprint spec at the very least written for it. I don't remember why 
precisely the restrictions were put in place to begin with, but I'd 
imagine it probably had to do with early database schemas that may not 
have supported UTF-8 on the name columns by default.


Unfortunately, to my dismay, we addressed this in a number of projects 
(like Nova) by just changing ALL string columns to UTF-8, which, as I've 
stated a few times on the ML and in meetings, blows up index and 
temporary table sizes dramatically in MySQL. We should really be using 
UTF-8 on demand for columns like names. Having UTF-8 character set and 
collations on string fields for, say, UUID fields, is a giant waste of 
space.
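
A back-of-the-envelope Python sketch of why the charset matters for index sizes. The bytes-per-character figures are the worst-case widths MySQL reserves for each charset; the 767-byte figure is InnoDB's index key prefix limit for its older row formats, included here as an assumption about the deployment rather than something stated in this thread:

```python
BYTES_PER_CHAR = {"latin1": 1, "utf8": 3, "utf8mb4": 4}
INDEX_PREFIX_LIMIT = 767  # assumed InnoDB limit (older row formats)


def worst_case_key_bytes(length, charset):
    """Bytes an index entry may need for a VARCHAR(length) column."""
    return length * BYTES_PER_CHAR[charset]


def max_indexable_chars(charset, limit=INDEX_PREFIX_LIMIT):
    """Longest column (in characters) whose full value fits under the limit."""
    return limit // BYTES_PER_CHAR[charset]


for charset in ("latin1", "utf8", "utf8mb4"):
    print(charset,
          worst_case_key_bytes(255, charset),
          max_indexable_chars(charset))
```

Under these assumptions a fully indexed VARCHAR(255) fits with utf8 but not with utf8mb4, which is one reason to apply the wider charset only to the columns that need it.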


Anyway, all this to say, yeah, there really shouldn't be restrictions 
like this on the name columns, but we need to be careful about changes 
to public APIs and it would be good to have the API microversioning in 
place before we accept such a change.



2. It's pretty much always a bad idea to add unnecessary restrictions
without a good reason. While we don't have an immediate need to use, for
example, the ever-useful http://codepoints.net/U+1F4A9 in a flavor name,
it's hard to come up with a reason people *shouldn't* be allowed to use it.


LOL. Love it. Somebody at a public cloud should trademark the first 
public PILE OF POO flavor name.


Best,
-jay


That said, apparently people have had a need to not be allowed to use
some characters, but it's not clear why:
https://bugs.launchpad.net/nova/+bug/977187

So I guess if anyone knows any reason why these printable characters
should not be joined in holy resource naming, speak now or forever hold
your peace.

Thanks!

--
Chris St. Pierre
Senior Software Engineer
metacloud.com




