Hello

Collecting the data:
- Use ConsumeKafka
Routing each event based on its content:
- There is no out-of-the-box processor to route Avro just yet. We will
  hopefully tackle that soon. In the meantime you could convert the Avro to
  JSON and use RouteJson. If you want the data back in Avro you can convert
  back. So it would be something like:
  - ConvertAvroToJSON
  - EvaluateJsonPath
  - ConvertJSONToAvro

  At this point you'll have a flow file attribute which tells you about the
  binning you want. This approach has the performance hit of the conversions,
  so if you can avoid going back to Avro that will help; or, if you use
  ExecuteScript or a related processor, you can whip up some Groovy to deal
  with the Avro directly. You have lots of options there.

Merging events together to sympathize with HDFS block sizes:
- Use MergeContent. It supports binning data together using a flow file
  attribute, such as the one you would have pulled in the previous section
  above.

Delivery to HDFS:
- Use PutHDFS

Thanks
Joe

On Tue, Nov 22, 2016 at 7:59 AM, E Worthy <[email protected]> wrote:

> Is this not possible with this product?
>
> On Sunday, November 20, 2016 8:50 PM, E Worthy <[email protected]>
> wrote:
>
> Hello,
>
> I have Avro data coming in from Kafka and I need to place this data into
> the proper HDFS partition based on values in each row of data. Is this
> possible in NiFi? I'm looking at PutHDFS, ConvertAvroToORC, and
> expressions. Do I somehow set a variable using an expression for each row
> and use that in PutHDFS?
>
> Thanks for any help,
> Eric
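[Editor's note] To answer Eric's question about setting a "variable" per row:
once EvaluateJsonPath has copied the value into a flow file attribute,
PutHDFS can reference that attribute in its Directory property using NiFi
Expression Language. A sketch, assuming the attribute was named "event_date"
(the attribute name and path layout are illustrative, not from the thread):

```
# PutHDFS processor, Directory property (NiFi Expression Language):
/data/events/event_date=${event_date}
```

Each merged flow file then lands in the HDFS partition directory matching
the value extracted from its records.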
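[Editor's note] The ConvertAvroToJSON -> EvaluateJsonPath step in Joe's reply
boils down to pulling one field out of each JSON event and then binning
records that share the same value, which is what MergeContent's
attribute-based binning does. A minimal sketch in plain Python; the field
name "event_date" and the sample events are assumptions for illustration,
not from the thread:

```python
import json

def extract_bin(flowfile_content: str, field: str) -> str:
    """Mimic EvaluateJsonPath: pull one field out of a JSON event and
    return it as a string, like a flow file attribute used for binning."""
    event = json.loads(flowfile_content)
    # In NiFi this would be a JsonPath such as $.event_date configured
    # on EvaluateJsonPath; here we simply index the parsed dict.
    return str(event[field])

# Hypothetical events; "event_date" stands in for the partition column.
events = [
    '{"event_date": "2016-11-20", "user": "a"}',
    '{"event_date": "2016-11-21", "user": "b"}',
    '{"event_date": "2016-11-20", "user": "c"}',
]

# MergeContent-style binning: group events sharing the attribute value.
bins = {}
for e in events:
    bins.setdefault(extract_bin(e, "event_date"), []).append(e)

print(sorted(bins))             # distinct bin values
print(len(bins["2016-11-20"]))  # events that landed in the same bin
```

Each bin would then be merged into one flow file sized to suit HDFS blocks
before PutHDFS writes it out.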
