Louis-Etienne,

NIFI-190 isn't scheduled on anything as of yet.  We had some design
questions/ideas and your example informs it even further.

I think the custom proc method you mention will work out well.
Ultimately there will need to be one anyway to deal with the logic of
merging this particular format+schema.
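
As a rough illustration of the pairing logic such a custom processor might
carry, here is a minimal, framework-neutral sketch. It does not use the NiFi
API; the class name `PairingGate`, the `offer` method, and the idea of a
shared correlation key present in both messages are assumptions made for
illustration only.

```python
import json


class PairingGate:
    """Holds one JSON message per correlation key until its partner arrives.

    Hypothetical helper, not part of NiFi: it only illustrates the
    "wait for both, then merge" logic discussed in this thread.
    """

    def __init__(self):
        # correlation key -> (source name, parsed message)
        self._pending = {}

    def offer(self, source, key, payload):
        """Register a message; return the merged pair once both sides exist."""
        message = json.loads(payload)
        held = self._pending.get(key)
        if held is None or held[0] == source:
            # Nothing from the other source yet: hold (or refresh) this one.
            self._pending[key] = (source, message)
            return None
        # The partner is already waiting: release both, keyed by source,
        # so that nothing from either message is lost.
        del self._pending[key]
        held_source, held_message = held
        return {held_source: held_message, source: message}
```

A caller would feed every message from both topics through `offer` and act
only when it returns a non-None result; choosing which fields to keep from
each side would then be a separate, schema-specific step.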

Thanks
Joe

On Sun, Dec 6, 2015 at 11:28 PM,  <[email protected]> wrote:
> Joe,
>
> Thanks for the prompt reply.
> About the merge, both messages will be JSON and I need some specific parts
> from both.
> I'll recheck the doc to see what my options are, but I think that using
> FlowFile Streams and a custom processor that does the logic might be good.
>
> About the HoldProcessor, you must be talking about NIFI-190. The way you
> describe it, it seems to be what I'm looking for.
> But from the JIRA, and from looking quickly at the PR, it seems like I would
> lose the message from Topic2.
>
> I'll dig in the code of the PR and the MergeContent processor in order to 
> have a better understanding.
>
> Was that JIRA scheduled for a specific milestone? It would probably be a
> great addition, but maybe it requires a lot of changes that I don't see yet.
>
> Regards,
> Louis-Etienne
>
>> On Dec 6, 2015, at 9:42 PM, Joe Witt <[email protected]> wrote:
>>
>> Louis-Etienne,
>>
>> My initial thought is your idea with MergeContent is the right one.
>> However, the issue there is not just the combining of the data but the
>> 'what does merging truly mean in that case'.  So it is a bit undefined
>> what the next step will be.  Merge the content?  If so, how?  What is
>> the format and schema of the objects before the merge and after?
>>
>> Another member of the community had an idea for a concept of a
>> HoldProcessor.  It would allow these sorts of multi-object gates to
>> occur.  The same issue exists of what to do once the gate criteria is
>> hit but at that point you'd have more control over it.  MergeContent
>> is an already prescribed set of behaviors whereas HoldContent would
>> let you choose the next gate.  We really should get on with helping
>> get that contribution in.
>>
>> Thanks
>> Joe
>>
>>> On Sun, Dec 6, 2015 at 9:35 PM, Louis-Étienne Dorval <[email protected]> 
>>> wrote:
>>> Hi everyone!
>>>
>>> I'm very excited to start using NiFi and I think that it will be very
>>> useful for some projects.
>>>
>>> I've been playing with it for some time and built a few basic flows, but
>>> I'm having a hard time figuring out how to achieve one part of my flow, or
>>> whether NiFi will be able to do it.
>>> I'm building a flow around existing systems, so NiFi would run in parallel
>>> with them and gather the output of these systems (everything is
>>> asynchronous) to take actions.
>>>
>>> Everything starts with a GetJMSTopic on Topic1, followed by 2-3 processors
>>> that do attribute extraction.
>>> During that time the existing system will process the same message, enrich
>>> the message (but also remove some useful information) and publish it on
>>> Topic2.
>>> I need the message from Topic2, so I've added another GetJMSTopic on Topic2.
>>> Then I need to somehow take the FlowFile from Topic1 and from Topic2,
>>> "merge" them together in order to have the attributes from both FlowFiles.
>>> After that I will probably need to use the GetMongo to access some
>>> information. This will probably create a new FlowFile that I need to "merge"
>>> with the others.
>>> Then I'll put that in HBase or something else, not sure yet.
>>>
>>> The part that I'm not sure how to solve is the "merge" of multiple
>>> FlowFiles. I'm hesitating between using the MergeContent processor and
>>> DetectDuplicate:
>>>
>>> MergeContent seems like what I need, but the existing systems might add
>>> some latency (and it will increase when there are a lot of messages
>>> published on Topic1), so I would need to increase the 'Maximum number of
>>> Bins'.
>>> It will probably affect the performance of the system, but how badly?
>>>
>>> DetectDuplicate would feel awkward to use since it's not really a
>>> duplicate, but it would be more lightweight (it only keeps a hash). But
>>> will I be able to find the previous FlowFile with
>>> "original.flowfile.description"?
>>>
>>>
>>> Let me know if there's another option that I didn't look into.
>>> Or maybe my problem is really trivial but I need to change my perspective on
>>> it...
>>>
>>> Best regards,
>>> Louis-Etienne
