On 09/18/2014 01:28 PM, Sean Dague wrote: > On 09/18/2014 07:19 AM, Ken'ichi Ohmichi wrote: >> 2014-09-18 19:57 GMT+09:00 Sean Dague <[email protected]>: >>> On 09/18/2014 06:38 AM, Christopher Yeoh wrote: >>>> On Sat, 13 Sep 2014 06:48:19 -0400 >>>> Sean Dague <[email protected]> wrote: >>>>>>> >>>>>>> We have proposed that the allowed characters for all resource >>>>>>> names in Nova (flavors, aggregates, etc.) be expanded to all >>>>>>> printable unicode characters and horizontal spaces: >>>>>>> https://review.openstack.org/#/c/119741 >>>>>>> >>>>>>> Currently, the only allowed characters in most resource names are >>>>>>> alphanumeric, space, and [.-_]. >>>>>>> >>>>>>> >>>>>>> We have proposed this change for two principal reasons: >>>>>>> >>>>>>> 1. We have customers who have migrated data forward since Essex, >>>>>>> when no restrictions were in place, and thus have characters in >>>>>>> resource names that are disallowed in the current version of >>>>>>> OpenStack. This is only likely to be useful to people migrating >>>>>>> from Essex or earlier, since the current restrictions were added >>>>>>> in Folsom. >>>>>>> >>>>>>> 2. It's pretty much always a bad idea to add unnecessary >>>>>>> restrictions without a good reason. While we don't have an >>>>>>> immediate need to use, for example, the ever-useful >>>>>>> http://codepoints.net/U+1F4A9 in a flavor name, it's hard to come >>>>>>> up with a reason people *shouldn't* be allowed to use it. >>>>>>> >>>>>>> That said, apparently people have had a need to not be allowed to >>>>>>> use some characters, but it's not clear why: >>>>>>> https://bugs.launchpad.net/nova/+bug/977187 So I guess if anyone >>>>>>> knows any reason why these printable characters should not be >>>>>>> joined in holy resource naming, speak now or forever hold your >>>>>>> peace. >>>>>> >>>>>> I also could not find the reason of current restriction on the bug >>>>>> report, and I'd like to know it as the history. >>>>>> On v2 API(not v2.1), each resource name contains the following >>>>>> restriction for its name: >>>>>> >>>>>> Resource | Length | Pattern >>>>>> -----------+---------+---------------------------------- >>>>>> aggregate | 1-255 | nothing >>>>>> backup | nothing | nothing >>>>>> flavor | 1-255 | '^[a-zA-Z0-9. _-]*[a-zA-Z0-9_-]+ >>>>>> | | [a-zA-Z0-9. _-]*$' >>>>>> keypair | 1-255 | '^[a-zA-Z0-9 _-]+$' >>>>>> server | 1-255 | nothing >>>>>> cell | 1-255 | don't contain "." and "!" >>>>>> >>>>>> On v2.1 API, we have applied the same restriction rule[1] for whole >>>>>> resource names for API consistency, so maybe we need to consider >>>>>> this topic for whole names. >>>>>> >>>>>> [1]: >>>>>> https://github.com/openstack/nova/blob/master/nova/api/validation/parameter_types.py#L44 >>>>> >>>>> Honestly, I bet this had to do with how the database used to be set >>>>> up. >>>>> >>>> >>>> So it turns out that utf8 support in MySQL does not support UTF-8 4 byte >>>> multibyte characters (only 1-3 bytes). For example if you do a create >>>> image call with an image name to glance with a 4 byte multibyte >>>> character in the name it will 500. I'd guess we do something >>>> similar in places with the Nova API where we have inadequate input >>>> validation. If you want 4 byte support you need to use utf8mb4 instead. >>> >>> Oh... fun. :( >>> >>>> I don't know if postgresql has any restrictions (I don't think it >>>> does) or if db2 does too. But I don't think we can/should make it a >>>> complete free for all. It should at most be what most databases support. >>>> >>>> I think its a big enough change that this late in the cycle we should >>>> push it off to Kilo. It's always much easier to loosen input validation >>>> than tighten it (or have to have an "oops" revert on an officially >>>> released Nova). Perhaps some tests to verify that everything we allow >>>> past the input validation checks we can actually store. >>> >>> So, honestly, that seems like a pendulum swing in an odd way. >>> >>> Havana "use anything you want!" >>> Icehouse ? >>> Juno "strict asci!" >>> Kilo "utf8" >> >> Correct validation history is >> >> Essex: "use anything you want!" >> Folsom: "strict asci!" >> [..] >> Juno: "strict asci!" >> >> So I don't think we should make the input validation loose right now >> to avoid a pendulum swing. > > Ok, great. That history makes me ok with status quo. I didn't realize it > went back so far. > >>> Can't we just catch the db exception correctly in glance and not have it >>> explode? And then allow it. Exploding with a 500 on a bad name seems the >>> wrong thing to do anyway. >>> >>> That would also mean that if the user changed their database to support >>> utf8mb4 (which they might want to do if it was important to them) it >>> would all work. >>> >>> I think some release notes would be fine to explain the current >>> situation and limitations. >>> >>> Basically, lets skate towards the puck here, realizing some corner cases >>> exist, but that we're moving in the direction we want to be, not back >>> tracking. >> >> One idea is that: How about using base64 encoding/decoding if non-ascii >> chars come? REST API layer would encode resource names if necessary >> before passing it to DB layer, and we don't need to consider backend DB >> features. The disadvantage is that available name length becomes short if >> non-ascii chars. But maybe that would be acceptable.. > > Honestly, we should utf8 through to the db. If the user's db doesn't > support it... that's their implementation issue.
+1 > > Also... glance should not explode. That seems like a critical bug. Has > anyone filed that? I agree it should not explode. That seems bad. I'll repeat Sean's question: Has anyone filed that? Otherwise, I think I'll go ahead and create a bug for it. Thanks, Flavio -- @flaper87 Flavio Percoco _______________________________________________ OpenStack-dev mailing list [email protected] http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
