[GitHub] nifi issue #2777: NIFI-5287 Made LookupRecord able to take in flowfile attri...

2018-06-13 Thread ijokarumawak
Github user ijokarumawak commented on the issue:

https://github.com/apache/nifi/pull/2777
  
Thanks @MikeThomsen LGTM +1, merging.


---


[GitHub] nifi issue #2777: NIFI-5287 Made LookupRecord able to take in flowfile attri...

2018-06-12 Thread ijokarumawak
Github user ijokarumawak commented on the issue:

https://github.com/apache/nifi/pull/2777
  
Thanks @markap14 , that is true. My previous comment on the difference 
between doc and actual behavior was a false alarm.

I expected that but got different result, so I got confused. But now I 
understand what happened in my test. A LookupService was in a child group, 
using a RecordReader defined in a parent group. And I set a variable in the 
child group. But the RecordReader in the parent group was not able to use it. 
Configuring variable at the parent group worked as expected.

- Parent PG
  - RecordReader: `Schema Name=${schema.name}`
  - Variable registry: `schema.name=parent` (this value can be used)
  - Child PG
- LookupService using the RecordReader in the parent PG
- Variable registry: `schema.name=child` (the RecordReader doesn't have 
access to variables defined in child PG)

Also, I found that how SchemaAccessStrategy classes caches `PropertyValue` 
to evaluate EL at runtime. @MikeThomsen RestLookupService PR #2723 needs to be 
updated to follow this pattern when it evaluates EL. I will comment on that PR 
for detail.


---


[GitHub] nifi issue #2777: NIFI-5287 Made LookupRecord able to take in flowfile attri...

2018-06-12 Thread MikeThomsen
Github user MikeThomsen commented on the issue:

https://github.com/apache/nifi/pull/2777
  
@markap14 @ijokarumawak cleaned it up by removing that attribute and it's 
rebased against master (as of my last pull this morning). Can one of you review?


---


[GitHub] nifi issue #2777: NIFI-5287 Made LookupRecord able to take in flowfile attri...

2018-06-12 Thread markap14
Github user markap14 commented on the issue:

https://github.com/apache/nifi/pull/2777
  
@ijokarumawak whenever Expression Language is evaluated from a 
PropertyDescriptor, it always has access to the Variable Registry. The 
framework provides the variable registry to the compiled expression when it is 
evaluated.


---


[GitHub] nifi issue #2777: NIFI-5287 Made LookupRecord able to take in flowfile attri...

2018-06-12 Thread ijokarumawak
Github user ijokarumawak commented on the issue:

https://github.com/apache/nifi/pull/2777
  
As for Variable Registry, I think passing only FlowFile attribute is 
enough, same as the way AbstractRecordProcessor is doing right now. 
https://github.com/apache/nifi/blob/master/nifi-nar-bundles/nifi-standard-bundle/nifi-standard-processors/src/main/java/org/apache/nifi/processors/standard/AbstractRecordProcessor.java#L113

In that sense, I found a contradiction between current documentation and 
behavior. Schema Name, Schema Version, Schema Branch and Schema Text are 
documented that those supports Variable Registry and FlowFile Attributes, but 
it can only access to FlowFile attributes. I assume this pattern is not 
considered when ExpressionLanguageScope is added. Probably for another JIRA 
(not filed yet).

![image](https://user-images.githubusercontent.com/1107620/41296857-d025934e-6e98-11e8-9521-29c135f57da6.png)




---


[GitHub] nifi issue #2777: NIFI-5287 Made LookupRecord able to take in flowfile attri...

2018-06-12 Thread ottobackwards
Github user ottobackwards commented on the issue:

https://github.com/apache/nifi/pull/2777
  
@markap14 that certainly makes sense, thanks for taking the time to respond


---


[GitHub] nifi issue #2777: NIFI-5287 Made LookupRecord able to take in flowfile attri...

2018-06-12 Thread ijokarumawak
Github user ijokarumawak commented on the issue:

https://github.com/apache/nifi/pull/2777
  
@markap14 I agree with that. We shouldn't break existing stuffs. I've been 
thinking about this after I posted my last comment and now I believe having 
separate map is a better idea. It's possible to mix those to evaluate an EL 
with both values if new implementation really needs so, while keeping existing 
ones work as they are.

Sorry for making a confusion.

@MikeThomsen now we can remove `Attributes Regular Expression` from 
LookupRecord, and just pass all FlowFile attribute as context as you did with 
LookupAttribute.

In addition to that, there is one more things that I want to clarify on 
this improvement.
Do we also want to access Variable Registry values through the `context` 
map? If so, how can we do that?


---


[GitHub] nifi issue #2777: NIFI-5287 Made LookupRecord able to take in flowfile attri...

2018-06-12 Thread markap14
Github user markap14 commented on the issue:

https://github.com/apache/nifi/pull/2777
  
In hindsight, a "Context" type of object would have been a good idea in 
this situation, had it been done that way in the first place. However, while 
it's a helpful design pattern, we have to be cognizant of the fact that we 
always have performed a release that contains the Service API the way that it 
is. We need to avoid changing that, as any custom Controller Services would 
break if we change the API in a non-backward-compatible way. It would also 
break any custom processors that try to use a Lookup Service.
While we do have some flexibility in changing Service API's after they have 
been released, it is kind of a nuclear option that should be avoided if we can. 
Especially for something that is as widely used as the lookup stuff.
This is largely why I want to avoid also including attributes in the same 
map - it would potentially break existing services if new things suddenly are 
fed into that map that are not part of the lookup's coordinates. By introducing 
the `Map context` and providing a default implementation in the 
interface, all existing services will work as they previously did, and all 
existing processors will work as the previously did. However, new or updated 
implementations will still be able to take advantage of FlowFile Attributes.


---


[GitHub] nifi issue #2777: NIFI-5287 Made LookupRecord able to take in flowfile attri...

2018-06-12 Thread ottobackwards
Github user ottobackwards commented on the issue:

https://github.com/apache/nifi/pull/2777
  
It may be overkill, that is true.  But if you have to keep adding new 
functions to account for different scenarios that isn't great either and may 
suggest something like that would be good to have. Having a context or resolver 
for this type of thing isn't that radical. Having a context sets the interface, 
that is true, but the implementation can be any kind of 
policy/strategy/composition you may require.

That being said, just a thought anyways.



---


[GitHub] nifi issue #2777: NIFI-5287 Made LookupRecord able to take in flowfile attri...

2018-06-12 Thread MikeThomsen
Github user MikeThomsen commented on the issue:

https://github.com/apache/nifi/pull/2777
  
@ottobackwards I think that's overkill because a Map is fine for the main 
variables and a Map also works for passing EL output. A major advantage of 
using two maps is that the "context" one is not set in stone as @markap14 
pointed out and can change as our use changes without breaking much.


---


[GitHub] nifi issue #2777: NIFI-5287 Made LookupRecord able to take in flowfile attri...

2018-06-12 Thread ottobackwards
Github user ottobackwards commented on the issue:

https://github.com/apache/nifi/pull/2777
  
Maybe there should be a `Context` interface, and there can be support for 
implementations that support more than one map or type of backing.  I think the 
limitation here is using a literal Map instead of a logical construct.

This is similar to variable resolution in a custom language.  You may  need 
more than a map.

`LookupContext context = new AttributeAndCoordinateLookup(attrs, coords);`





---


[GitHub] nifi issue #2777: NIFI-5287 Made LookupRecord able to take in flowfile attri...

2018-06-12 Thread MikeThomsen
Github user MikeThomsen commented on the issue:

https://github.com/apache/nifi/pull/2777
  
@ijokarumawak 

> Adding those values into lookup coordinate may not sound that wrong, if 
we keep the consistent overlaying order.

It is, however, dangerous to do with LookupService implementations that are 
based around query builders like the Mongo one. The Mongo one does a straight 
conversion of the coordinate map into a Mongo query, so if you add any value 
other than a key the user specifies you will break the query in most cases.


---


[GitHub] nifi issue #2777: NIFI-5287 Made LookupRecord able to take in flowfile attri...

2018-06-12 Thread ijokarumawak
Github user ijokarumawak commented on the issue:

https://github.com/apache/nifi/pull/2777
  
Thanks @markap14 for pointing that. Separating the coordinate and context 
make sense and implementations will be cleaner by doing so. Although I agree 
with the idea, let me try to explain why I wanted to put everything into the 
same map.

I wanted to muddle all variables into the same map because some lookup 
target, such as URL for the RestLookupService have both lookup coordinate and 
environment depending part that can be passed as FlowFile attributes or 
variable registry.

For example, let's say we want to make following URL to be populated by a 
Expression Language to be used by RestLookupService. 
`http://test.example.com:8080/service/john.smith/friend/12345`
I think the URL has both lookup and contextual part in it.

- `john.smith and `12345` are lookup coordinates that vary from input, in 
LookupRecord context, it varies per recocrd
- But hostname and port is more of contextual values, that varies per 
FlowFile or environment

If the URL property is configured as 
`http://${apiHost}:${apiPort}/service/${userName}/friend/${friendId}`, and if 
it can only refer lookup coordinate, then `apiHost` and `apiPort` need to be 
set for each record lookup. And to do so, user need to configure dynamic 
properties at LookupRecord processor using record paths, which can be awkward 
since RecordPathValidator doesn't allow literal String value. A work-around is 
to use `concat('test.example.com')` to return the constant value for every 
record lookup.

To make this scenario more flexible while meeting with the idea of 
separating lookup coordinates and context, I wrote MikeThomsen/nifi#1  so that 
target URL can be configured by 2 properties (if necessary). `BASE_URL` can use 
Variable Registry, and `URL` can use coordinate.

If we pursue the path to separate lookup and coordinate more 
implementations like that will be needed in different places. That has both 
pros / cons I think. 

On the other hand, the variables those can be referred by an EL is already 
muddled a lot IMHO. It contains System properties, Variable registry, and 
FlowFile attribute values and EL can utilize those without knowing where it's 
configured.

Adding those values into lookup coordinate may not sound that wrong, if we 
keep the consistent overlaying order.


---


[GitHub] nifi issue #2777: NIFI-5287 Made LookupRecord able to take in flowfile attri...

2018-06-11 Thread MikeThomsen
Github user MikeThomsen commented on the issue:

https://github.com/apache/nifi/pull/2777
  
@markap14 @ijokarumawak updated based on the last comment.


---


[GitHub] nifi issue #2777: NIFI-5287 Made LookupRecord able to take in flowfile attri...

2018-06-11 Thread markap14
Github user markap14 commented on the issue:

https://github.com/apache/nifi/pull/2777
  
@MikeThomsen @ijokarumawak Hey guys, I'm sorry to chime in a bit late here. 
I don't think this is really the approach that we want to take here, though. 
What our interface provides currently is a mechanism to pass in the 
'coordinates', i.e., the location for finding the data that we want. The 
approach here takes this concepts and muddies it by providing arbitrary 
key/value pairs that don't make sense for the lookup.

I would propose an alternative approach, which would be to add a new method 
to the interface that has a default implementation:
```
default Optional lookup(Map coordinates, Map context) throws LookupFailureException {
return lookup(coordinates);
}
```
Where `context` is used for the FlowFile attributes (I'm referring to it as 
`context` instead of `attributes` because there may well be a case where we 
want to provide some other value that is not specifically a FlowFile 
attribute). Here is why I am suggesting this:
- It provides a clean interface that properly separates the data's 
coordinates from FlowFile attributes.
- It prevents any collisions between FlowFile attribute names and 
coordinates.
- It maintains backward compatibility, and we know that it won't change the 
behavior of existing services or processors/components using those services - 
even those that may have been implemented by others outside of the Apache realm.
- If attributes are passed in by a Processor, those attributes will be 
ignored anyway unless the Controller Service is specifically updated to make 
use of those attributes, such as via Expression Language. In such a case, the 
Controller Service can simply be updated at that time to make use of the new 
method instead of the existing method.



---


[GitHub] nifi issue #2777: NIFI-5287 Made LookupRecord able to take in flowfile attri...

2018-06-09 Thread MikeThomsen
Github user MikeThomsen commented on the issue:

https://github.com/apache/nifi/pull/2777
  
@ijokarumawak Review?


---