Louis-Etienne,

NIFI-190 isn't scheduled on anything as of yet. We had some design questions/ideas and your example informs it even further.

I think the custom proc method you mention will work out well. Ultimately there will need to be one anyway to deal with the logic of merging this particular format+schema.

Thanks
Joe

On Sun, Dec 6, 2015 at 11:28 PM, <[email protected]> wrote:
> Joe,
>
> Thanks for the prompt reply.
> About the merge, both messages will be JSON and I need some specific parts
> from both.
> I'll recheck the doc to see what my options are, but I think that using
> FlowFile Streams and a custom processor that would do the logic might be good.
>
> About the HoldProcessor, you must be talking about NIFI-190. The way you
> describe it seems to be what I'm looking for.
> But in the JIRA and looking quickly at the PR it seems like I would lose the
> message from Topic2.
>
> I'll dig in the code of the PR and the MergeContent processor in order to
> have a better understanding.
>
> Was that JIRA scheduled for a specific milestone? It would probably be a
> great addition but maybe it requires a lot of changes that I don't see yet.
>
> Regards,
> Louis-Etienne
>
>> On 6, 2015, at 9:42 PM, Joe Witt <[email protected]> wrote:
>>
>> Louis-Etienne,
>>
>> My initial thought is your idea with MergeContent is the right one.
>> However, the issue there is not just the combining of the data but the
>> 'what does merging truly mean in that case'. So it is a bit undefined
>> what the next step will be. Merge the content? If so, how? What is
>> the format and schema of the objects before the merge and after?
>>
>> Another member of the community had an idea for a concept of a
>> HoldProcessor. It would allow these sorts of multi-object gates to
>> occur. The same issue exists of what to do once the gate criteria is
>> hit but at that point you'd have more control over it. MergeContent
>> is an already prescribed set of behaviors whereas HoldContent would
>> let you choose the next gate. We really should get on with helping
>> get that contribution in.
>>
>> Thanks
>> Joe
>>
>>> On Sun, Dec 6, 2015 at 9:35 PM, Louis-Étienne Dorval <[email protected]>
>>> wrote:
>>> Hi everyone!
>>>
>>> I'm very excited to start using NiFi and I think that it will be very
>>> useful for some projects.
>>>
>>> I've been playing with it for some time and did a few basic flows, but I'm
>>> having a hard time figuring out how to achieve a part of my flow or if NiFi
>>> will be able to do it.
>>> I'm building a flow around existing systems, so NiFi would run in parallel
>>> with them and gather the output of these systems (everything is
>>> asynchronous) to take actions.
>>>
>>> Everything starts with a GetJMSTopic on Topic1, followed by 2-3 processors
>>> that do Attribute Extractions.
>>> During that time the existing system will process the same message, enrich
>>> the message (but also remove some useful information) and will publish on
>>> Topic2.
>>> I need the message from Topic2, so I've added another GetJMSTopic on Topic2.
>>> Then I need to somehow take the FlowFile from Topic1 and from Topic2,
>>> "merge" them together in order to have the attributes from both FlowFiles.
>>> After that I will probably need to use the GetMongo to access some
>>> information. This will probably create a new FlowFile that I need to "merge"
>>> with the others.
>>> Then I'll put that in HBase or something else, not sure yet.
>>>
>>> The part that I'm not sure how to solve is the "merge" of multiple
>>> FlowFiles. I hesitate between using the MergeContent processor and the
>>> DetectDuplicate:
>>>
>>> MergeContent seems like what I need but the existing systems might add some
>>> latency (and it will increase when there's a lot of publishing on Topic1) so
>>> I would need to increase the 'Maximum number of Bins'.
>>> It will probably affect the performance of the system but how bad?
>>> DetectDuplicate, it would feel awkward to use that since it's not really a
>>> duplicate, but it would be more lightweight (only keeps a hash). But will I
>>> be able to find the previous FlowFile with "original.flowfile.description"?
>>>
>>>
>>> Let me know if there's another option that I didn't look into.
>>> Or maybe my problem is really trivial but I need to change my perspective on
>>> it...
>>>
>>> Best regards,
>>> Louis-Etienne
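
For anyone following the thread, the "hold one FlowFile until its counterpart arrives, then merge the attributes from both" idea discussed above could be sketched roughly as below. This is only an illustration in plain Python, not NiFi code and not the NIFI-190 implementation. The class name `HoldAndMergeGate`, the `correlation_id` argument, and the `topic1.` / `topic2.` key prefixes are all assumptions made for the example; the thread does not say which attribute links the two messages. Prefixing the keys is one possible way to avoid losing the Topic2 values when both sides carry the same attribute name, and `expire` stands in for whatever timeout would cope with the latency of the existing systems.

```python
import time
from typing import Dict, List, Optional, Tuple

Attributes = Dict[str, str]


class HoldAndMergeGate:
    """Hypothetical gate: holds the first message of a pair, merges on the second.

    Illustration only; none of these names come from NiFi or NIFI-190.
    """

    def __init__(self, max_age_seconds: float = 60.0) -> None:
        self.max_age_seconds = max_age_seconds
        # correlation id -> (source name, attributes, arrival time)
        self._held: Dict[str, Tuple[str, Attributes, float]] = {}

    def offer(
        self,
        source: str,
        correlation_id: str,
        attributes: Attributes,
        now: Optional[float] = None,
    ) -> Optional[Attributes]:
        """Return merged attributes once both sources have arrived, else None."""
        now = time.monotonic() if now is None else now
        held = self._held.get(correlation_id)
        if held is None or held[0] == source:
            # First arrival for this id (or a repeat from the same source): hold it.
            self._held[correlation_id] = (source, dict(attributes), now)
            return None
        # The counterpart is already held: release the gate and merge.
        del self._held[correlation_id]
        other_source, other_attributes, _ = held
        # Prefix every key with its source so neither side overwrites the other.
        merged = {f"{other_source}.{k}": v for k, v in other_attributes.items()}
        merged.update({f"{source}.{k}": v for k, v in attributes.items()})
        return merged

    def expire(self, now: Optional[float] = None) -> List[str]:
        """Drop entries held longer than max_age_seconds and return their ids."""
        now = time.monotonic() if now is None else now
        expired = [
            cid
            for cid, (_, _, arrived) in self._held.items()
            if now - arrived > self.max_age_seconds
        ]
        for cid in expired:
            del self._held[cid]
        return expired
```

Offering a Topic1 message and then the matching Topic2 message returns a single dictionary with keys such as `topic1.customer` and `topic2.customer`, while an unmatched entry is dropped by `expire` once it is older than `max_age_seconds`. What to do with expired entries (route to failure, pass through unmerged) is the same open "what happens once the gate criteria is hit" question raised earlier in the thread.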
