[GitHub] nifi issue #2777: NIFI-5287 Made LookupRecord able to take in flowfile attri...
Github user ijokarumawak commented on the issue: https://github.com/apache/nifi/pull/2777 Thanks @MikeThomsen LGTM +1, merging. ---
[GitHub] nifi issue #2777: NIFI-5287 Made LookupRecord able to take in flowfile attri...
Github user ijokarumawak commented on the issue: https://github.com/apache/nifi/pull/2777 Thanks @markap14 , that is true. My previous comment on the difference between doc and actual behavior was a false alarm. I expected that but got different result, so I got confused. But now I understand what happened in my test. A LookupService was in a child group, using a RecordReader defined in a parent group. And I set a variable in the child group. But the RecordReader in the parent group was not able to use it. Configuring variable at the parent group worked as expected. - Parent PG - RecordReader: `Schema Name=${schema.name}` - Variable registry: `schema.name=parent` (this value can be used) - Child PG - LookupService using the RecordReader in the parent PG - Variable registry: `schema.name=child` (the RecordReader doesn't have access to variables defined in child PG) Also, I found that how SchemaAccessStrategy classes caches `PropertyValue` to evaluate EL at runtime. @MikeThomsen RestLookupService PR #2723 needs to be updated to follow this pattern when it evaluates EL. I will comment on that PR for detail. ---
[GitHub] nifi issue #2777: NIFI-5287 Made LookupRecord able to take in flowfile attri...
Github user MikeThomsen commented on the issue: https://github.com/apache/nifi/pull/2777 @markap14 @ijokarumawak cleaned it up by removing that attribute and it's rebased against master (as of my last pull this morning). Can one of you review? ---
[GitHub] nifi issue #2777: NIFI-5287 Made LookupRecord able to take in flowfile attri...
Github user markap14 commented on the issue: https://github.com/apache/nifi/pull/2777 @ijokarumawak whenever Expression Language is evaluated from a PropertyDescriptor, it always has access to the Variable Registry. The framework provides the variable registry to the compiled expression when it is evaluated. ---
[GitHub] nifi issue #2777: NIFI-5287 Made LookupRecord able to take in flowfile attri...
Github user ijokarumawak commented on the issue: https://github.com/apache/nifi/pull/2777 As for Variable Registry, I think passing only FlowFile attribute is enough, same as the way AbstractRecordProcessor is doing right now. https://github.com/apache/nifi/blob/master/nifi-nar-bundles/nifi-standard-bundle/nifi-standard-processors/src/main/java/org/apache/nifi/processors/standard/AbstractRecordProcessor.java#L113 In that sense, I found a contradiction between current documentation and behavior. Schema Name, Schema Version, Schema Branch and Schema Text are documented that those supports Variable Registry and FlowFile Attributes, but it can only access to FlowFile attributes. I assume this pattern is not considered when ExpressionLanguageScope is added. Probably for another JIRA (not filed yet). ![image](https://user-images.githubusercontent.com/1107620/41296857-d025934e-6e98-11e8-9521-29c135f57da6.png) ---
[GitHub] nifi issue #2777: NIFI-5287 Made LookupRecord able to take in flowfile attri...
Github user ottobackwards commented on the issue: https://github.com/apache/nifi/pull/2777 @markap14 that certainly makes sense, thanks for taking the time to respond ---
[GitHub] nifi issue #2777: NIFI-5287 Made LookupRecord able to take in flowfile attri...
Github user ijokarumawak commented on the issue: https://github.com/apache/nifi/pull/2777 @markap14 I agree with that. We shouldn't break existing stuffs. I've been thinking about this after I posted my last comment and now I believe having separate map is a better idea. It's possible to mix those to evaluate an EL with both values if new implementation really needs so, while keeping existing ones work as they are. Sorry for making a confusion. @MikeThomsen now we can remove `Attributes Regular Expression` from LookupRecord, and just pass all FlowFile attribute as context as you did with LookupAttribute. In addition to that, there is one more things that I want to clarify on this improvement. Do we also want to access Variable Registry values through the `context` map? If so, how can we do that? ---
[GitHub] nifi issue #2777: NIFI-5287 Made LookupRecord able to take in flowfile attri...
Github user markap14 commented on the issue: https://github.com/apache/nifi/pull/2777 In hindsight, a "Context" type of object would have been a good idea in this situation, had it been done that way in the first place. However, while it's a helpful design pattern, we have to be cognizant of the fact that we always have performed a release that contains the Service API the way that it is. We need to avoid changing that, as any custom Controller Services would break if we change the API in a non-backward-compatible way. It would also break any custom processors that try to use a Lookup Service. While we do have some flexibility in changing Service API's after they have been released, it is kind of a nuclear option that should be avoided if we can. Especially for something that is as widely used as the lookup stuff. This is largely why I want to avoid also including attributes in the same map - it would potentially break existing services if new things suddenly are fed into that map that are not part of the lookup's coordinates. By introducing the `Map context` and providing a default implementation in the interface, all existing services will work as they previously did, and all existing processors will work as the previously did. However, new or updated implementations will still be able to take advantage of FlowFile Attributes. ---
[GitHub] nifi issue #2777: NIFI-5287 Made LookupRecord able to take in flowfile attri...
Github user ottobackwards commented on the issue: https://github.com/apache/nifi/pull/2777 It may be overkill, that is true. But if you have to keep adding new functions to account for different scenarios that isn't great either and may suggest something like that would be good to have. Having a context or resolver for this type of thing isn't that radical. Having a context sets the interface, that is true, but the implementation can be any kind of policy/strategy/composition you may require. That being said, just a thought anyways. ---
[GitHub] nifi issue #2777: NIFI-5287 Made LookupRecord able to take in flowfile attri...
Github user MikeThomsen commented on the issue: https://github.com/apache/nifi/pull/2777 @ottobackwards I think that's overkill because a Map is fine for the main variables and a Map also works for passing EL output. A major advantage of using two maps is that the "context" one is not set in stone as @markap14 pointed out and can change as our use changes without breaking much. ---
[GitHub] nifi issue #2777: NIFI-5287 Made LookupRecord able to take in flowfile attri...
Github user ottobackwards commented on the issue: https://github.com/apache/nifi/pull/2777 Maybe there should be a `Context` interface, and there can be support for implementations that support more than one map or type of backing. I think the limitation here is using a literal Map instead of a logical construct. This is similar to variable resolution in a custom language. You may need more than a map. `LookupContext context = new AttributeAndCoordinateLookup(attrs, coords);` ---
[GitHub] nifi issue #2777: NIFI-5287 Made LookupRecord able to take in flowfile attri...
Github user MikeThomsen commented on the issue: https://github.com/apache/nifi/pull/2777 @ijokarumawak > Adding those values into lookup coordinate may not sound that wrong, if we keep the consistent overlaying order. It is, however, dangerous to do with LookupService implementations that are based around query builders like the Mongo one. The Mongo one does a straight conversion of the coordinate map into a Mongo query, so if you add any value other than a key the user specifies you will break the query in most cases. ---
[GitHub] nifi issue #2777: NIFI-5287 Made LookupRecord able to take in flowfile attri...
Github user ijokarumawak commented on the issue: https://github.com/apache/nifi/pull/2777 Thanks @markap14 for pointing that. Separating the coordinate and context make sense and implementations will be cleaner by doing so. Although I agree with the idea, let me try to explain why I wanted to put everything into the same map. I wanted to muddle all variables into the same map because some lookup target, such as URL for the RestLookupService have both lookup coordinate and environment depending part that can be passed as FlowFile attributes or variable registry. For example, let's say we want to make following URL to be populated by a Expression Language to be used by RestLookupService. `http://test.example.com:8080/service/john.smith/friend/12345` I think the URL has both lookup and contextual part in it. - `john.smith and `12345` are lookup coordinates that vary from input, in LookupRecord context, it varies per recocrd - But hostname and port is more of contextual values, that varies per FlowFile or environment If the URL property is configured as `http://${apiHost}:${apiPort}/service/${userName}/friend/${friendId}`, and if it can only refer lookup coordinate, then `apiHost` and `apiPort` need to be set for each record lookup. And to do so, user need to configure dynamic properties at LookupRecord processor using record paths, which can be awkward since RecordPathValidator doesn't allow literal String value. A work-around is to use `concat('test.example.com')` to return the constant value for every record lookup. To make this scenario more flexible while meeting with the idea of separating lookup coordinates and context, I wrote MikeThomsen/nifi#1 so that target URL can be configured by 2 properties (if necessary). `BASE_URL` can use Variable Registry, and `URL` can use coordinate. If we pursue the path to separate lookup and coordinate more implementations like that will be needed in different places. That has both pros / cons I think. On the other hand, the variables those can be referred by an EL is already muddled a lot IMHO. It contains System properties, Variable registry, and FlowFile attribute values and EL can utilize those without knowing where it's configured. Adding those values into lookup coordinate may not sound that wrong, if we keep the consistent overlaying order. ---
[GitHub] nifi issue #2777: NIFI-5287 Made LookupRecord able to take in flowfile attri...
Github user MikeThomsen commented on the issue: https://github.com/apache/nifi/pull/2777 @markap14 @ijokarumawak updated based on the last comment. ---
[GitHub] nifi issue #2777: NIFI-5287 Made LookupRecord able to take in flowfile attri...
Github user markap14 commented on the issue: https://github.com/apache/nifi/pull/2777 @MikeThomsen @ijokarumawak Hey guys, I'm sorry to chime in a bit late here. I don't think this is really the approach that we want to take here, though. What our interface provides currently is a mechanism to pass in the 'coordinates', i.e., the location for finding the data that we want. The approach here takes this concepts and muddies it by providing arbitrary key/value pairs that don't make sense for the lookup. I would propose an alternative approach, which would be to add a new method to the interface that has a default implementation: ``` default Optional lookup(Map coordinates, Map context) throws LookupFailureException { return lookup(coordinates); } ``` Where `context` is used for the FlowFile attributes (I'm referring to it as `context` instead of `attributes` because there may well be a case where we want to provide some other value that is not specifically a FlowFile attribute). Here is why I am suggesting this: - It provides a clean interface that properly separates the data's coordinates from FlowFile attributes. - It prevents any collisions between FlowFile attribute names and coordinates. - It maintains backward compatibility, and we know that it won't change the behavior of existing services or processors/components using those services - even those that may have been implemented by others outside of the Apache realm. - If attributes are passed in by a Processor, those attributes will be ignored anyway unless the Controller Service is specifically updated to make use of those attributes, such as via Expression Language. In such a case, the Controller Service can simply be updated at that time to make use of the new method instead of the existing method. ---
[GitHub] nifi issue #2777: NIFI-5287 Made LookupRecord able to take in flowfile attri...
Github user MikeThomsen commented on the issue: https://github.com/apache/nifi/pull/2777 @ijokarumawak Review? ---