Hi everyone! I'm very excited to start using NiFi, and I think it will be very useful for some of my projects.
I've been playing with it for a while and have built a few basic flows, but I'm having a hard time figuring out how to achieve one part of my flow, or whether NiFi can do it at all.

I'm building a flow around existing systems, so NiFi would run in parallel with them and gather their output (everything is asynchronous) in order to take actions.

Everything starts with a GetJMSTopic on Topic1, followed by 2-3 processors that do attribute extraction. Meanwhile, the existing system processes the same message, enriches it (but also removes some useful information), and publishes the result on Topic2. I need the message from Topic2 as well, so I've added another GetJMSTopic on Topic2. Then I need to somehow take the FlowFile from Topic1 and the one from Topic2 and "merge" them together so that I have the attributes from both FlowFiles.

After that I will probably need to use GetMongo to look up some information. This will probably create a new FlowFile that I need to "merge" with the others. Then I'll put the result in HBase or something else; I'm not sure yet.

The part I'm not sure how to solve is the "merge" of multiple FlowFiles. I'm hesitating between the MergeContent processor and DetectDuplicate:

- MergeContent seems like what I need, but the existing systems might add some latency (and it will increase when there are a lot of messages published on Topic1), so I would need to increase the 'Maximum number of Bins'. That will probably affect the performance of the system, but how badly?

- DetectDuplicate feels awkward to use, since this is not really a duplicate, but it would be more lightweight (it only keeps a hash). But will I be able to find the previous FlowFile with "original.flowfile.description"?

Let me know if there's another option that I haven't looked into. Or maybe my problem is really trivial and I just need to change my perspective on it...

Best regards,
Louis-Etienne
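
P.S. To make the "merge" I'm describing more concrete, here is a rough sketch in plain Python. It is not NiFi code and not how any NiFi processor is implemented. It assumes that both topics carry some shared identifier (called correlation_id here, a hypothetical name) and that a maximum wait time is acceptable. AttributeJoiner, offer, and expire are made-up names for illustration only.

```python
import time


class AttributeJoiner:
    """Hold attributes from the first arrival until its partner shows up."""

    def __init__(self, max_wait_seconds=60.0, clock=time.monotonic):
        self.max_wait_seconds = max_wait_seconds
        self.clock = clock
        # correlation id -> (arrival time, source topic, attributes)
        self.pending = {}

    def offer(self, correlation_id, source, attributes):
        """Return the merged attributes once both sides have arrived, else None."""
        self.expire()
        waiting = self.pending.get(correlation_id)
        if waiting is None or waiting[1] == source:
            # First arrival for this id (or a repeat from the same topic): park it.
            self.pending[correlation_id] = (self.clock(), source, dict(attributes))
            return None
        del self.pending[correlation_id]
        merged = dict(waiting[2])
        merged.update(attributes)  # the later arrival wins on key clashes
        return merged

    def expire(self):
        """Drop entries whose partner did not arrive within max_wait_seconds."""
        cutoff = self.clock() - self.max_wait_seconds
        stale = [k for k, (t, _, _) in self.pending.items() if t < cutoff]
        for k in stale:
            del self.pending[k]
        return stale


joiner = AttributeJoiner(max_wait_seconds=30)
joiner.offer("msg-1", "Topic1", {"customer": "acme"})          # parked, returns None
merged = joiner.offer("msg-1", "Topic2", {"status": "enriched"})
print(merged)
```

The number of entries in pending at any moment is the cost I'm worried about: it grows with the latency of the existing system and with the publish rate on Topic1.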
