On 09/18/2014 07:19 AM, Ken'ichi Ohmichi wrote:
> 2014-09-18 19:57 GMT+09:00 Sean Dague <s...@dague.net>:
>> On 09/18/2014 06:38 AM, Christopher Yeoh wrote:
>>> On Sat, 13 Sep 2014 06:48:19 -0400
>>> Sean Dague <s...@dague.net> wrote:
>>>>>>
>>>>>> We have proposed that the allowed characters for all resource
>>>>>> names in Nova (flavors, aggregates, etc.) be expanded to all
>>>>>> printable unicode characters and horizontal spaces:
>>>>>> https://review.openstack.org/#/c/119741
>>>>>>
>>>>>> Currently, the only allowed characters in most resource names are
>>>>>> alphanumeric, space, and [.-_].
>>>>>>
>>>>>>
>>>>>> We have proposed this change for two principal reasons:
>>>>>>
>>>>>> 1. We have customers who have migrated data forward since Essex,
>>>>>> when no restrictions were in place, and thus have characters in
>>>>>> resource names that are disallowed in the current version of
>>>>>> OpenStack. This is only likely to be useful to people migrating
>>>>>> from Essex or earlier, since the current restrictions were added
>>>>>> in Folsom.
>>>>>>
>>>>>> 2. It's pretty much always a bad idea to add unnecessary
>>>>>> restrictions without a good reason. While we don't have an
>>>>>> immediate need to use, for example, the ever-useful
>>>>>> http://codepoints.net/U+1F4A9 in a flavor name, it's hard to come
>>>>>> up with a reason people *shouldn't* be allowed to use it.
>>>>>>
>>>>>> That said, apparently people have had a need to not be allowed to
>>>>>> use some characters, but it's not clear why:
>>>>>> https://bugs.launchpad.net/nova/+bug/977187 So I guess if anyone
>>>>>> knows any reason why these printable characters should not be
>>>>>> joined in holy resource naming, speak now or forever hold your
>>>>>> peace.
>>>>>
>>>>> I also could not find the reason of current restriction on the bug
>>>>> report, and I'd like to know it as the history.
>>>>> On v2 API(not v2.1), each resource name contains the following
>>>>> restriction for its name:
>>>>>
>>>>>   Resource  | Length  | Pattern
>>>>>  -----------+---------+----------------------------------
>>>>>   aggregate | 1-255   | nothing
>>>>>   backup    | nothing | nothing
>>>>>   flavor    | 1-255   | '^[a-zA-Z0-9. _-]*[a-zA-Z0-9_-]+
>>>>>             |         |   [a-zA-Z0-9. _-]*$'
>>>>>   keypair   | 1-255   | '^[a-zA-Z0-9 _-]+$'
>>>>>   server    | 1-255   | nothing
>>>>>   cell      | 1-255   | don't contain "." and "!"
>>>>>
>>>>> On v2.1 API, we have applied the same restriction rule[1] for whole
>>>>> resource names for API consistency, so maybe we need to consider
>>>>> this topic for whole names.
>>>>>
>>>>> [1]:
>>>>> https://github.com/openstack/nova/blob/master/nova/api/validation/parameter_types.py#L44
>>>>
>>>> Honestly, I bet this had to do with how the database used to be set
>>>> up.
>>>>
>>>
>>> So it turns out that utf8 support in MySQL does not support UTF-8 4 byte
>>> multibyte characters (only 1-3 bytes). For example if you do a create
>>> image call with an image name to glance with a 4 byte multibyte
>>> character in the name it will 500. I'd guess we do something
>>> similar in places with the Nova API where we have inadequate input
>>> validation. If you want 4 byte support you need to use utf8mb4 instead.
>>
>> Oh... fun. :(
>>
>>> I don't know if postgresql has any restrictions (I don't think it
>>> does) or if db2 does too. But I don't think we can/should make it a
>>> complete free for all. It should at most be what most databases support.
>>>
>>> I think its a big enough change that this late in the cycle we should
>>> push it off to Kilo. It's always much easier to loosen input validation
>>> than tighten it (or have to have an "oops" revert on an officially
>>> released Nova). Perhaps some tests to verify that everything we allow
>>> past the input validation checks we can actually store.
>>
>> So, honestly, that seems like a pendulum swing in an odd way.
>>
>> Havana "use anything you want!"
>> Icehouse ?
>> Juno "strict asci!"
>> Kilo "utf8"
> 
> Correct validation history is
> 
> Essex: "use anything you want!"
> Folsom: "strict asci!"
> [..]
> Juno: "strict asci!"
>
> So I don't think we should make the input validation loose right now
> to avoid a pendulum swing.

Ok, great. That history makes me ok with status quo. I didn't realize it
went back so far.

>> Can't we just catch the db exception correctly in glance and not have it
>> explode? And then allow it. Exploding with a 500 on a bad name seems the
>> wrong thing to do anyway.
>>
>> That would also mean that if the user changed their database to support
>> utf8mb4 (which they might want to do if it was important to them) it
>> would all work.
>>
>> I think some release notes would be fine to explain the current
>> situation and limitations.
>>
>> Basically, lets skate towards the puck here, realizing some corner cases
>> exist, but that we're moving in the direction we want to be, not back
>> tracking.
> 
> One idea is that: How about using base64 encoding/decoding if non-ascii
> chars come? REST API layer would encode resource names if necessary
> before passing it to DB layer, and we don't need to consider backend DB
> features. The disadvantage is that available name length becomes short if
> non-ascii chars. But maybe that would be acceptable..

Honestly, we should utf8 through to the db. If the user's db doesn't
support it... that's their implementation issue.

Also... glance should not explode. That seems like a critical bug. Has
anyone filed that?

        -Sean

-- 
Sean Dague
http://dague.net

_______________________________________________
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev

Reply via email to