[ 
https://issues.apache.org/jira/browse/NIFI-5287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16509341#comment-16509341
 ] 

ASF GitHub Bot commented on NIFI-5287:
--------------------------------------

Github user ijokarumawak commented on the issue:

    https://github.com/apache/nifi/pull/2777
  
    Thanks @markap14 for pointing that out. Separating the coordinates and 
context makes sense, and implementations will be cleaner by doing so. Although 
I agree with the idea, let me try to explain why I wanted to put everything 
into the same map.
    
    I wanted to muddle all variables into the same map because some lookup 
targets, such as the URL for the RestLookupService, have both a lookup 
coordinate part and an environment-dependent part that can be passed as 
FlowFile attributes or through the variable registry.
    
    For example, let's say we want the following URL to be populated by 
Expression Language for use by RestLookupService: 
`http://test.example.com:8080/service/john.smith/friend/12345`
    I think the URL has both lookup and contextual parts in it.
    
    - `john.smith` and `12345` are lookup coordinates that vary with the input; 
in the LookupRecord context, they vary per record
    - But the hostname and port are more contextual values that vary per 
FlowFile or environment
    
    If the URL property is configured as 
`http://${apiHost}:${apiPort}/service/${userName}/friend/${friendId}`, and if 
it can only refer to lookup coordinates, then `apiHost` and `apiPort` need to 
be set for each record lookup. To do so, the user needs to configure dynamic 
properties on the LookupRecord processor using record paths, which can be 
awkward since RecordPathValidator doesn't allow a literal String value. A 
workaround is to use `concat('test.example.com')` to return the constant value 
for every record lookup.
    
    To make this scenario more flexible while still following the idea of 
separating lookup coordinates and context, I wrote MikeThomsen/nifi#1 so that 
the target URL can be configured by 2 properties (if necessary). `BASE_URL` 
can use the Variable Registry, and `URL` can use the coordinates.
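    
    A minimal sketch of that two-property idea follows. It is not the code 
from MikeThomsen/nifi#1 and it does not use NiFi's Expression Language engine; 
it assumes a plain `${name}` substitution, and the class and method names 
(`TwoStageUrlSketch`, `substitute`, `resolve`) are hypothetical. The base URL 
is resolved from context values only and the path from coordinates only, so 
the two maps never have to be merged.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration only; not part of NiFi or of the referenced PR.
final class TwoStageUrlSketch {
    // Matches ${name}; a simplified stand-in for Expression Language.
    private static final Pattern PLACEHOLDER = Pattern.compile("\\$\\{([^}]+)\\}");

    // Replaces each ${name} with values.get(name); unknown names are left as-is.
    static String substitute(String template, Map<String, ?> values) {
        Matcher m = PLACEHOLDER.matcher(template);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            Object value = values.get(m.group(1));
            String replacement = (value == null) ? m.group() : value.toString();
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    // baseUrl sees only context (e.g. variable registry values);
    // path sees only the per-record lookup coordinates.
    static String resolve(String baseUrl, String path,
                          Map<String, String> context,
                          Map<String, Object> coordinates) {
        return substitute(baseUrl, context) + substitute(path, coordinates);
    }
}
```

    With `apiHost` and `apiPort` in the context map and `userName` and 
`friendId` in the coordinates map, calling `resolve` with 
`http://${apiHost}:${apiPort}` and `/service/${userName}/friend/${friendId}` 
would produce the example URL above.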
    
    If we pursue the path of separating lookup coordinates and context, more 
implementations like that will be needed in different places. That has both 
pros and cons, I think. 
    
    On the other hand, the variables that can be referenced by EL are already 
muddled a lot IMHO. They include System properties, the Variable Registry, and 
FlowFile attribute values, and EL can use them without knowing where they are 
configured.
    
    Adding those values to the lookup coordinates may not sound that wrong, if 
we keep a consistent overlaying order.
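    
    As a rough illustration of what a consistent overlaying order could mean, 
the sketch below merges several maps in a fixed sequence so that a later layer 
wins on a key collision. The class name `OverlaySketch` is hypothetical, and 
the particular order used in the note after it (variable registry, then 
FlowFile attributes, then coordinates) is an assumption for the example, not 
necessarily the order NiFi applies.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; not part of NiFi.
final class OverlaySketch {
    // Merges layers in list order; a key in a later layer replaces
    // the same key from an earlier layer.
    static Map<String, Object> overlay(List<Map<String, ?>> layers) {
        Map<String, Object> merged = new LinkedHashMap<>();
        for (Map<String, ?> layer : layers) {
            merged.putAll(layer);
        }
        return merged;
    }
}
```

    Passing the variable registry map first, the FlowFile attributes second, 
and the coordinates last means a coordinate always overrides an attribute of 
the same name, which in turn overrides a registry variable.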


> LookupRecord should supply flowfile attributes to the lookup service
> --------------------------------------------------------------------
>
>                 Key: NIFI-5287
>                 URL: https://issues.apache.org/jira/browse/NIFI-5287
>             Project: Apache NiFi
>          Issue Type: Improvement
>            Reporter: Mike Thomsen
>            Assignee: Mike Thomsen
>            Priority: Major
>
> -LookupRecord should supply the flowfile attributes to the lookup service. It 
> should be done as follows:-
>  # -Provide a regular expression to choose which attributes are used.-
>  # -The chosen attributes should be foundation of the coordinates map used 
> for the lookup.-
>  # -If a configured key collides with a flowfile attribute, it should 
> override the flowfile attribute in the coordinate map.-
> Mark had the right idea:
>  
> I would propose an alternative approach, which would be to add a new method 
> to the interface that has a default implementation:
> {{default Optional<T> lookup(Map<String, Object> coordinates, Map<String, 
> String> context) throws LookupFailureException { return lookup(coordinates); 
> } }}
> Where {{context}} is used for the FlowFile attributes (I'm referring to it as 
> {{context}} instead of {{attributes}} because there may well be a case where 
> we want to provide some other value that is not specifically a FlowFile 
> attribute). Here is why I am suggesting this:
>  * It provides a clean interface that properly separates the data's 
> coordinates from FlowFile attributes.
>  * It prevents any collisions between FlowFile attribute names and 
> coordinates.
>  * It maintains backward compatibility, and we know that it won't change the 
> behavior of existing services or processors/components using those services - 
> even those that may have been implemented by others outside of the Apache 
> realm.
>  * If attributes are passed in by a Processor, those attributes will be 
> ignored anyway unless the Controller Service is specifically updated to make 
> use of those attributes, such as via Expression Language. In such a case, the 
> Controller Service can simply be updated at that time to make use of the new 
> method instead of the existing method.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
