Andy, The logic for what is a property and which part is the key and which part is the value does appear to require special handling of keys with spaces [1]. I agree this is problematic.
I also agree that your argument regarding the separation of 'name' and 'displayName' is correct and enforcing them to be handled explicitly is more correct. I also like your proposal for handling resource bundles though I think there will be some hurdles to cross in terms of where that larger bundle lives and how it is managed. We'll need to allow for local (to a given nar) definition of those bundles and a way for them to be overridden someplace more central. Furhter, I think we'll need to be careful about the namespace of processor properties because we might want to let their be other things that get internationalized like Strings used for logging and so on. So... To do this then we'd start enforcing that 'name' of PropertyDescriptors honors the property naming rules as found [1]. We'd also need to enforce that displayName always be specified and if not we throw an exception in the builder perhaps. We also need to document the heck out of this for the 1.x migration guide for developers and we need to go through all processors in nifi. Finally, we'd need to make sure that name+displayName is sufficient and the resource bundle for internationalization is an optional thing. I am ok with this. I think it is pretty disruptive but I acknowledge it is more correct and ultimately a wise step. Did I capture the spirit of your message well? [1] https://docs.oracle.com/javase/8/docs/api/java/util/Properties.html#load-java.io.Reader- On Fri, May 6, 2016 at 1:07 PM, Andy LoPresto <[email protected]> wrote: > Joe, > > My concern with the ordered scenarios you listed is that I believe the level > of awareness you describe will still cause difficulties based on the name > value. > > Consider the scenario where I added the ability to encrypt or decrypt using > a raw key with the EncryptContent processor [1]. I had to add a new > property, and I’ve made inline comments below: > > 1) How can I provide a new property to this processor? > Answer: Add a property descriptor and set the name. This name will be > used to refer to the property descriptor whenever serialized/saving > the config and it will be rendered through the REST API and thus made > available as the property name in the UI. > > > In this case, I chose “raw-key-hex” as the name, following the convention in > this processor. I used “Raw Key (Hex)” as the displayName. If I had been > adding this to a new processor or one without existing properties to > demonstrate convention, I would have only known about “name”, and would have > provided “Raw Key (Hex)” as the name, as that is what I want to be displayed > in the UI. > > > 2) Oh snap. I wish I had used a different name because I've found a > better way to communicate intent to the user. How do I do this? > Answer: Go ahead and set displayName. NiFi will continue to use the > 'name' for serialization/config saving but will use the displayName > for what is shown to the user in the UI. > > > Let’s say that I realize non-crypto people don’t understand “Raw Key (Hex)” > and I want to change it to “32 or 64 character key in hexadecimal format”. I > learn about displayName after trying to change the name directly results in > backward compatibility issues. > > > 3) I would like to support locale sensitive representations of my > property name. How can I do this? > Answer: Add a resource bundle with entries for your property 'name' > value. This means the resource bundle needs to exist and your > property 'name' must adhere to resource bundle key naming requirements > [1]. If this is supplied and can be looked up then this will be used > and otherwise will fallback to using displayName value if present and > otherwise will fallback to using the value of 'name'. > > > I decide I would like to internationalize my processor and support Spanish > as an additional language. I configure a resource bundle with the phrase > “Clave de cifrado (formato hexadecimal)”. What is the “key” I need to assign > this value to? It’s the original name value, or “Raw Key (Hex)”. At this > point, the name does not adhere to the key naming constraints [2] > (technically spaces can be escaped but it requires additional effort), and I > have no way to modify this without breaking the object resolution in the > flow.xml.gz. > > I maintain that in an ideal world, providing the name and displayName values > from the beginning establishes a clear and consistent convention that > separates object resolution and UI display and presents a cohesive example > for any follow-on properties declared in the processor. To me, this drives > “community” because it provides a clear example for new developers, so they > don’t get comments on their first PR saying “Oh, there is an issue when > using a spaced- or quoted- name value that will cause problems down the road > but unless you read a specific page of the documentation you wouldn’t know > that.” The “cost” of providing both values up front is negligible to me, but > the benefit is high. > > Because we are talking about internationalization, I also think it might be > beneficial to discuss the model for resource bundles. While it would be nice > if every contributor of a processor provided their own resource bundles, I > think this really fragments the operation and will make large-scale > translation efforts very difficult. What if, instead, a global resource > bundle per language/locale was provided, and individual messages were > collected in a single location, and namespaced appropriately? Thus the Kafka > processor batch size key translation and ExecuteSQL processor batch size key > translation could both be declared in one resource file. > > Something like: > > messages_es_ES.properties: > > org.apache.nifi.processors.standard.ExecuteSQL.batch_size=tamaño del lote > org.apache.nifi.processors.kafka.PutKafka=tamaño del lote de mensajes > > This would make it much easier for someone contributing translation services > to focus on a specific language in one place, rather than searching the > entire application for 100+ resource bundles for their language alone. I > don’t think we can expect someone who contributes specific processor > functionality to be able to translate those messages into multiple, much > less every, language, but I think it is more reasonable that someone capable > of translating Kafka messages to another language can do the same for SQL > messages. > > > > [1] > https://github.com/apache/nifi/commit/498b5023ce4f99e67184f399de740b142fca510d#diff-16f04588b3426d04bbafe4ff1c7dda69R139 > [2] > https://stackoverflow.com/questions/5491517/spaces-in-message-bundle-keys > > > Andy LoPresto > [email protected] > [email protected] > PGP Fingerprint: 70EC B3E5 98A6 5A3F D3C4 BACE 3C6E F65B 2F7D EF69 > > On May 6, 2016, at 8:04 AM, Brandon DeVries <[email protected]> wrote: > > Joe, > > Alright, I buy that. As a community, we try not to do #2 (stop > laughing...). But we still allow someone writing / maintaining custom > components to do so without figuring out how to set up a resource bundle. > +1. > > Brandon > > On Fri, May 6, 2016 at 10:51 AM Joe Witt <[email protected]> wrote: > > Brandon > > Understood. I'd say I lean more towards providing 1, 2, and 3 because > allows us to be flexible in what we accept and conservative in what we > do. By that I mean we as a community can go ahead and enforce that we > basically just adopt 3 and use RTC to help enforce that. But that > from an API and doing examples perspective we allow 1 and 2 as well. > I don't think it adds much weight but does add flexibility. > > Like you, I think I'm ok either way. This was a good discussion and > ending up with supporting internationalization and flexibility in > displayed name while retaining configuration sanity - is a win. > > Thanks > Joe > > On Fri, May 6, 2016 at 10:45 AM, Brandon DeVries <[email protected]> wrote: > > I guess my only objection is, do we really need Option 2? If all you > > want > > to do is provide a name, fine. But, if you want to change the name (for > better communicating the purpose to the user, be that in English, French, > or Esperanto) option 3 provides the mechanism for that. It's not as easy > for the developer as option 2, but it has the benefit of starting the > process of creating resource bundles for different languages (and > potentially deprecating a semi-confusing option). This means that > > someone > > wanting to add support for a different language (e.g. french) has a lower > barrier to entry, in that they can just find the > > "displayNames.properties" > > file, and put a "displayNames_fr_FR.properties next to it. Ultimately > > its > > a trade off of conveniences... the developer wanting to change a property > name vs. a potential contributor wanting to improve language support. I > guess I would argue that this shouldn't be that hard for a developer to > figure out, and lean in favor of encouraging language contributions. > > Ultimately, it isn't' that strong an objection though, and it does sound > like this is heading in the right direction... > > Brandon > > On Fri, May 6, 2016 at 10:25 AM Joe Witt <[email protected]> wrote: > > Definitely on board with the idea that the 'name' will be the key to a > resource bundle. It does imply such names will need to follow > necessary conventions to be valid resource bundle keys. > > However, in the spirit of always thinking about the developer path to > productivity I am hopeful we can come up with a nice way to not > require them to setup a resource bundle. > > The idea being that the following order of operations/thought would > > exist: > > > 1) How can I provide a new property to this processor? > Answer: Add a property descriptor and set the name. This name will be > used to refer to the property descriptor whenever serialized/saving > the config and it will be rendered through the REST API and thus made > available as the property name in the UI. > > 2) Oh snap. I wish I had used a different name because I've found a > better way to communicate intent to the user. How do I do this? > Answer: Go ahead and set displayName. NiFi will continue to use the > 'name' for serialization/config saving but will use the displayName > for what is shown to the user in the UI. > > 3) I would like to support locale sensitive representations of my > property name. How can I do this? > Answer: Add a resource bundle with entries for your property 'name' > value. This means the resource bundle needs to exist and your > property 'name' must adhere to resource bundle key naming requirements > [1]. If this is supplied and can be looked up then this will be used > and otherwise will fallback to using displayName value if present and > otherwise will fallback to using the value of 'name'. > > And in any event we still need to better document/articulate this > model as the root of this thread was that we hadn't effectively > communicated the existence of displayName. I agree this discussion > ended up getting us to a great place though as we should all strive to > support internationalization. > > With an approach like this I am onboard. I think this balances our > goals of having a simple to use API but also allows those who want to > support multiple locales to do so cleanly. > > Thanks > Joe > > [1] > > https://docs.oracle.com/javase/tutorial/i18n/resbundle/propfile.html > > > On Fri, May 6, 2016 at 9:33 AM, Brandon DeVries <[email protected]> wrote: > > +1. I like that better. Deprecate displayName(), and set it > "automatically" based on the locale from properties. The name of the > property (which should never change) is the key into the > > ResourceBundle. > > > Brandon > > > On Fri, May 6, 2016 at 9:24 AM Matt Burgess <[email protected]> > > wrote: > > > Same here. Internationalization is often implemented as properties > files/resources, where you possibly load in a file based on the > > system > > setting for Locale (like processor_names_en_US.properties). If we > > were > > to do internationalization this way (i.e. a non-code based solution, > which is more flexible), then ironically displayName() might/should > > be > > deprecated in favor of using the value of name() as the key in a > properties/lookup file; the corresponding value would be the > appropriate locale-specific "display name". > > Brandon's links show this approach, I have seen this i18n approach on > other projects/products and it seems to work pretty well. > > Regards, > Matt > > On Fri, May 6, 2016 at 9:11 AM, Joe Witt <[email protected]> wrote: > > I share Bryan's perspective. > > On Fri, May 6, 2016 at 9:05 AM, Bryan Bende <[email protected]> > > wrote: > > I might just be resistant to change, but I am still on the fence a > > little > > bit... > > In the past the idea has always been you start out with name, and > > if > > you > > later need to change what is displayed in the UI, then you add > > displayName > > after the fact. > > It sounds like the issue is that a lot of people don't know about > displayName, so I am totally in favor of increasing awareness > > through > > documentation, > but I'm struggling with telling people that they should set > > displayName > > as > > the default behavior. > > For code that is contributed to NiFi, I would expect the > > PMC/committer > > doing the review/merging to notice if an existing property name > > was > > being > > changed and point that out in the review. > If it was someone else's custom NAR, or even made it through the > > review, I > > think what happens is that the flow starts up and the > > processor/component > > becomes invalid because it now has an unknown property. > This is the same behavior when we remove a property from an > > existing > > processor and someone upgrades, and we deemed this acceptable > > behavior > > for > > that scenario. > > The internationalization aspect could possibly sell me more, but I > > think I > > would need someone to explain how internationalization would be > implemented, and how setting the displayName helps. > What Brandon described makes sense to me because it is something > > outside > > the Java code, which is different then saying all > > PropertyDescriptor > > instances need name and displayName. > > -Bryan > > > On Fri, May 6, 2016 at 8:44 AM, Brandon DeVries <[email protected]> > > wrote: > > > +1. I think being able to move the displayName out of code an > > into > > config > > / ResourceBundle will make it much easier to support i18n[1]. > > If no > > config > > is provided, the name is the default... otherwise, the name > > displayed > > is > > determined by the locale. Updating the mock framework to > > complain > > about > > the absence of a ResourceBundle will encourage adoption, and > > we'll > > gradually work our way to not being English only. > > Brandon > > [1] > > https://docs.oracle.com/javase/tutorial/i18n/intro/quick.html > > > On Thu, May 5, 2016 at 11:46 PM Adam Lamar <[email protected] > > > wrote: > > > On Tue, May 3, 2016 at 12:09 PM, Andy LoPresto < > > [email protected] > > > wrote: > > As a result of some conversations and some varying feedback > > on > > PRs, I’d > > like > > to discuss with the community an issue I see with > > PropertyDescriptor > > name > > and displayName attributes. I’ll describe the scenarios that > > cause > > issues > > and my proposed solution, and then solicit responses and > > other > > perspectives. > > Andy, > > I'd be +1 on this as well. I think the power of this approach > > will > > become more clear over time, and some of the automation will > > make > > it > > possible to more widely enforce. > > What do you think about a mixed mode where config reading code > > can > > fetch the property value with either the name or display name > > as > > the > > key, defaulting to the name if it is present? A sample read of > flow.xml.gz might look like this: > > * Processor asks for value of MY_CONFIG_PROPERTY > * Configuration code looks for key "my-config-property", > > returns > > if > > present > > * Configuration code looks for key "My Config Property", > > returns > > if > > present > > * On finding no valid key, configuration returns > > blank/default/null > > value (whatever is done today) > > On configuration write, the new attributes could be written as > > the > > normalized name (instead of display name), to allow processors > > that > > have made the switch to start using the normalized name field > > and > > start taking advantage of the new features around it (e.g. > internationalization). Perhaps a disadvantage to this approach > > is > > that > > it auto-upgrades an existing flow.xml.gz, making it difficult > > to > > downgrade from (for example) 0.7 back to 0.6. > > A strategy like this (or something similar) might help speed > > adoption > > by making the process a bit smoother for existing flow.xml.gz > > files > > and easier to upgrade processors incrementally. At 1.0, display > > names > > as configuration keys could be dropped, and the upgrade path > > for > > users > > on old 0.x releases would be to upgrade to the latest 0.x > > before > > making the jump to 1.0. > > Cheers, > Adam > > > > > >
