On Tue, Jun 21, 2016 at 3:16 PM, James E. Blair <[email protected]> wrote:
> Morgan Fainberg <[email protected]> writes: > > > As I have been converting Zuul and NodePool to python3, I have had to do > a > > bunch of changes around encode() and decode() of strings since gear is > > (properly) an implementation of a protocol that requires binary data > > (rather than text_strings). > > > > What this has highlighted is that gear should be made a bit more friendly > > to use in the python world. We already explicitly assume a utf-8 encoding > > for when things are turned into "binary" when crafting the job object in > > certain cases [1]. I have discussed this with Jim Blair, and we both > agree > > that the ability to still reference attributes such as "job.name" (in a > > simple manner that is straight forward), is important. > > Thanks for this email -- with this and a little browsing of the gear > code, I think I've absorbed most of the context and am able to process > this. :) > > > Here is the outline of the change I'm proposing: > > > > * The main consumable part of gear (public classes) will convert the > > "string" data we assign ( name[2], unique[3]) into utf-8-encoded bytes > via > > @property, @property.setter, and @property.getter for public consumption. > > This seems good. I think we can make the assertion that in the API > these things should always be utf-8 encoded strings, and do what's > needed behind the scenes to encode/decode to bytes. > > I think this is the bulk of what we need to do to make gear > user-friendly. > > ++ This is where I am going to focus. > > * Arguments are explicitly supposed to be a binary_blob [4]. I am unsure > if > > this should also be automatically converted *or* if it should be the > > inverse, .arguments and .arguments_string ? > > This is the thing we can't handle automatically. At least, we can on > the input side (if the user passes in a string, we could auto-encode but > leave bytes alone -- but on the output side since the protocol is > un-typed, we wouldn't know what to do). So maybe this is a place where > we should force the user to encode/decode. > > Looking at the gear code, I think that was the intent in a lot of > places, but I think we may have gotten ahead of ourselves and written > code assuming that isinstance('string', bytes) is False (which is the > case in python3, but not in python2): > > > https://github.com/openstack-infra/gear/blob/59d29104cb9c370e49b44f313e568bd43b172e1b/gear/__init__.py#L589 > > That line of code explains a lot to me. > > I think we should stick close to the idea that arguments are binary > blobs. > > From that, it seems to follow that reading job.arguments should get you > a bytes object. > > On the one hand, I don't think it's unreasonable to say that users must > encode/decode on that boundary. > > On the other hand, I think we could easily have a job.arguments_string > property that decodes that from utf-8 for convenience, so why not? I > think I'm in favor of this. > > If we do that, then I think the question is: do we have gear > automatically encode job arguments (in the Job constructor) if the user > passes in a string? I tend to favor this as well because it should > almost always do the right thing in both python2 and python3 without > inhibiting usage for either strings or binary data. Does that sound > reasonable? > > I've spent a lot of time thinking about this, I'm going to say that I agree, but I feel like the _string property will be mostly unused. I want to see how things look before we add it in. If it makes things a lot better we can always add it. > > * Internally gear will reference the encoded bits via the underlying > > <name>_binary form, which will allow direct access to a non-"string" form > > of the data itself in the case that there is interesting things that need > > to be handled via binary packing (for example) instead of "stringified" > > versions. > > Sounds good. Maybe we should call it "<name>_bytes" though for clarity? > /bikeshed > > HAH! Sure. But as long as the bikeshed is cherenkov radiation blue. > > * For compatibility the main @property.setter will handle either > > binary_type or string_type (so we don't break everyone). > > Most of these are set in the constructor -- having a setter do that > should make the constructor do the right thing, but in case there are > cases where we need to remove type validation, I think it's worth saying > that our intent is to have the constructor handle both forms as well for > the same reason. > > > * The "<name>_binary" will enforce that the data with be a binary_type > > only. > > > > > > I think this can be done in a single release of gear with minimal impact > to > > those using it. For what it is worth, it is unlikely that anyone has used > > gear extensively in python3 as of yet because of recent bug fixes that > > addressed py2->py3 compat issues around dict.values() and similar list() > -> > > iter() changes. > > Agreed. > > > See the one question in the above proposal for "arguments". > > > > [1] > > > https://github.com/openstack-infra/gear/blob/59d29104cb9c370e49b44f313e568bd43b172e1b/gear/__init__.py#L86 > > [2] > > > https://github.com/openstack-infra/gear/blob/59d29104cb9c370e49b44f313e568bd43b172e1b/gear/__init__.py#L2054 > > [3] > > > https://github.com/openstack-infra/gear/blob/59d29104cb9c370e49b44f313e568bd43b172e1b/gear/__init__.py#L2058 > > [4] > > > https://github.com/openstack-infra/gear/blob/59d29104cb9c370e49b44f313e568bd43b172e1b/gear/__init__.py#L2056 > > > > Thanks, > > --Morgan > > _______________________________________________ > > OpenStack-Infra mailing list > > [email protected] > > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-infra > > Thanks again! > > -Jim > Cheers, --Morgan
_______________________________________________ OpenStack-Infra mailing list [email protected] http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-infra
