Re: Help on creating that flow that requires processing attributes in a flow content but need to preserve the original flow content

2016-03-21 Thread Conrad Crampton
My 2p. If the kafka.key value is very simple JSON, you could use UpdateAttribute with some expression language - specifically the string manipulation functions - to extract the part you want. I like the power of ExecuteScript, by the way. And I agree, this community is phenomenally responsive
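The expression-language approach can be sketched outside NiFi. Assuming a key format like `id=abc123;ts=...` (hypothetical - the real key shape isn't shown in the thread), these are rough Python analogs of NiFi EL's `substringAfter`/`substringBefore`:

```python
# Illustrative Python analogs of NiFi Expression Language string functions;
# the kafka.key format used here is a hypothetical example.

def substring_after(value, delimiter):
    """Mimic NiFi EL substringAfter: text after the first delimiter."""
    _, sep, rest = value.partition(delimiter)
    return rest if sep else value

def substring_before(value, delimiter):
    """Mimic NiFi EL substringBefore: text before the first delimiter."""
    head, sep, _ = value.partition(delimiter)
    return head if sep else value

# In NiFi this would look like ${kafka.key:substringAfter('id='):substringBefore(';')}
kafka_key = "id=abc123;ts=2016-03-21"
extracted = substring_before(substring_after(kafka_key, "id="), ";")
print(extracted)  # abc123
```

In UpdateAttribute the equivalent chained expression would be set as the value of a new dynamic property, leaving the flow file content untouched.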

Re: Help on creating that flow that requires processing attributes in a flow content but need to preserve the original flow content

2016-03-21 Thread McDermott, Chris Kevin (MSDU - STaTS/StorefrontRemote)
Thanks, Lee. I considered that, but some of our files are rather large. It may be a short-term solution, though. From: Lee Laim <lee.l...@gmail.com> Reply-To: "users@nifi.apache.org" <users@nifi.apache.org> Date: Monday, March 21, 2016 at 3:17 PM T

Re: CSV/delimited to Parquet conversion via Nifi

2016-03-21 Thread Dmitry Goldenberg
Since NiFi has ConvertJsonToAvro and ConvertCsvToAvro processors, would it make sense to add a feature request for a ConvertJsonToParquet processor and a ConvertCsvToParquet processor? - Dmitry On Mon, Mar 21, 2016 at 9:23 PM, Matt Burgess wrote: > Edmon, > > NIFI-1663 [1] was created to add OR

Re: CSV/delimited to Parquet conversion via Nifi

2016-03-21 Thread Matt Burgess
Edmon, NIFI-1663 [1] was created to add ORC support to NiFi. If you have a target dataset that has been created with Parquet format, I think you can use ConvertCSVToAvro then StoreInKiteDataset to get flow files in Parquet format into Hive, HDFS, etc. Others in the community know a lot more about
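Outside NiFi, the first half of that flow - turning delimited text into structured records, which is what ConvertCSVToAvro does internally - can be sketched with the Python standard library. The column names and pipe delimiter below are hypothetical; writing actual Parquet would additionally need a library such as pyarrow:

```python
import csv
import io

# Sketch of the delimited-text-to-records step; the schema (id, name,
# score) and the pipe delimiter are hypothetical examples.
psv_data = "id|name|score\n1|alice|9.5\n2|bob|7.25\n"

reader = csv.DictReader(io.StringIO(psv_data), delimiter="|")
records = list(reader)
print(records[0]["name"])  # alice
# A Parquet writer (e.g. from the pyarrow library) would take it from here;
# note csv.DictReader leaves every value as a string, so numeric columns
# would still need casting to match an Avro/Parquet schema.
```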

Re: Create row keys for HBase from Json messages

2016-03-21 Thread Bryan Bende
Hong, Glad to hear you are getting started with NiFi! What do your property names look like on EvaluateJsonPath? Typically if you wanted to extract the effective timestamp, event id, and applicant id from your example json, then you would add properties to EvaluateJsonPath like the following: eff
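The effect of those EvaluateJsonPath properties can be sketched as follows; the JSON shape, field names, and attribute names are hypothetical, since the original example message isn't shown in the archive:

```python
import json

# Rough analog of EvaluateJsonPath pulling three values out of a message
# into flow file attributes; field and attribute names are hypothetical.
message = json.dumps({
    "effectiveTimestamp": "2016-03-21T15:17:00Z",
    "eventId": "evt-42",
    "applicantId": "app-7",
})

# Each dynamic property on EvaluateJsonPath maps an attribute name to a
# JsonPath, e.g. effective.timestamp -> $.effectiveTimestamp; here the
# paths are modeled as simple top-level field lookups.
paths = {
    "effective.timestamp": "effectiveTimestamp",
    "event.id": "eventId",
    "applicant.id": "applicantId",
}
doc = json.loads(message)
attributes = {attr: doc[field] for attr, field in paths.items()}
print(attributes["event.id"])  # evt-42
```

With Destination set to "flowfile-attribute", each extracted value lands in the named attribute and the content passes through unchanged.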

CSV/delimited to Parquet conversion via Nifi

2016-03-21 Thread Edmon Begoli
Is there a way to do a straight CSV (PSV) to Parquet or ORC conversion via NiFi, or do I always need to push the data through one of the "data engines" - Drill, Spark, Hive, etc.?

Create row keys for HBase from Json messages

2016-03-21 Thread Hong Li
I'm a new user of NiFi, and just started my first NiFi project, where we need to move JSON messages into HBase. After reading the templates and user guide, I see I still need help learning how to concatenate the values pulled out of the JSON messages to form a unique row key for the HBase tables. G
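The concatenation step can be sketched like this; the field names and separator are hypothetical stand-ins for whatever the real messages contain:

```python
import json

# Sketch of building a composite HBase row key by concatenating values
# pulled from a JSON message; field names and separator are hypothetical.
message = json.loads('{"eventId": "evt-42", "applicantId": "app-7", '
                     '"effectiveTimestamp": "2016-03-21T15:17:00Z"}')

# HBase stores rows in lexicographic key order, so lead with the field
# you most often scan by; here the timestamp comes first.
row_key = "|".join([
    message["effectiveTimestamp"],
    message["applicantId"],
    message["eventId"],
])
print(row_key)  # 2016-03-21T15:17:00Z|app-7|evt-42
```

Inside NiFi, the same concatenation could be done after EvaluateJsonPath with UpdateAttribute and an expression such as `${effective.timestamp}|${applicant.id}|${event.id}` (attribute names here are examples, not from the thread).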

Re: Help on creating that flow that requires processing attributes in a flow content but need to preserve the original flow content

2016-03-21 Thread Lee Laim
Chris, Depending on the size* of the flowfile content, a combination of ExtractText and ReplaceText might also work. This is what I am picturing: ExtractText the entire contents of the flowfile into a new attribute flowfile.original. ReplaceText with the ${kafka.key}. This will place the ${kaf
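The stash-and-restore round trip described above can be simulated outside NiFi; the dict below is a stand-in for a real FlowFile, and the payload and key values are hypothetical:

```python
# Simulation of the ExtractText/ReplaceText approach: stash the content
# in an attribute, swap in kafka.key, process, then restore the content.
flowfile = {
    "content": "original payload read from Kafka",
    "attributes": {"kafka.key": '{"data": {"myKey": "myValue"}}'},
}

# ExtractText: copy the entire content into a new attribute
flowfile["attributes"]["flowfile.original"] = flowfile["content"]

# ReplaceText: overwrite the content with ${kafka.key} so content-based
# processors (e.g. EvaluateJsonPath) can operate on the key
flowfile["content"] = flowfile["attributes"]["kafka.key"]

# ...content-based processing would happen here...

# ReplaceText again: restore the original content from the attribute
flowfile["content"] = flowfile["attributes"].pop("flowfile.original")
print(flowfile["content"])  # original payload read from Kafka
```

The size caveat from the thread applies: NiFi holds attributes in memory, so copying large content into an attribute is only workable for small flow files.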

Re: Help on creating that flow that requires processing attributes in a flow content but need to preserve the original flow content

2016-03-21 Thread McDermott, Chris Kevin (MSDU - STaTS/StorefrontRemote)
Thanks everyone. While I’m naturally disappointed that this doesn’t exist, I am hyper charged about the responsiveness and enthusiasm of the NiFi community! From: Matt Burgess <mattyb...@gmail.com> Reply-To: "users@nifi.apache.org" <users@nifi.apache.o

Re: Help on creating that flow that requires processing attributes in a flow content but need to preserve the original flow content

2016-03-21 Thread Matt Burgess
One way (in NiFi 0.5.0+) is to use the ExecuteScript processor, which gives you full control over the session and flowfile(s). For example if you had JSON in your "kafka.key" attribute such as "{"data": {"myKey": "myValue"}}" , you could use the following Groovy script to parse out the value of th
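The Groovy script itself is truncated in the archive, but the idea - parse the JSON held in the kafka.key attribute and promote its values to attributes without touching the content - can be sketched in Python; the dict is a stand-in for a FlowFile and the attribute name `data.myKey` is a hypothetical choice:

```python
import json

# Sketch of the ExecuteScript approach: read JSON out of the kafka.key
# attribute and add its values as new attributes, leaving the flow file
# content untouched. The dict models a NiFi FlowFile.
flowfile = {
    "content": "original payload read from Kafka",
    "attributes": {"kafka.key": '{"data": {"myKey": "myValue"}}'},
}

parsed = json.loads(flowfile["attributes"]["kafka.key"])
flowfile["attributes"]["data.myKey"] = parsed["data"]["myKey"]

print(flowfile["attributes"]["data.myKey"])  # myValue
print(flowfile["content"])  # original payload read from Kafka
```

Unlike the ExtractText/ReplaceText approach, the content is never rewritten here, which avoids the large-file concern raised earlier in the thread.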

Re: Help on creating that flow that requires processing attributes in a flow content but need to preserve the original flow content

2016-03-21 Thread Joe Witt
Chris - also, you were clear. I was just too quick to reply and didn't read carefully. On Mon, Mar 21, 2016 at 1:53 PM, Mark Payne wrote: > Chris, > > Unfortunately, at this time, EvaluateJsonPath requires that the JSON to > evaluate be the content of the FlowFile. > There already does exist

Re: Help on creating that flow that requires processing attributes in a flow content but need to preserve the original flow content

2016-03-21 Thread Mark Payne
Chris, Unfortunately, at this time, EvaluateJsonPath requires that the JSON to evaluate be the content of the FlowFile. There already exists a ticket [1] that would allow you to specify an attribute to use as the JSON instead of requiring that it be the content only. Unfortunately, this

Re: Help on creating that flow that requires processing attributes in a flow content but need to preserve the original flow content

2016-03-21 Thread McDermott, Chris Kevin (MSDU - STaTS/StorefrontRemote)
Joe, Thanks for the reply. I think I was not clear. The JSON I need to evaluate is in a FlowFile attribute (kafka.key) which I need to be able to evaluate without modifying the original FlowFile content (which was read from the Kafka topic). What I can’t figure out is how to squirrel away th

Re: Help on creating that flow that requires processing attributes in a flow content but need to preserve the original flow content

2016-03-21 Thread Joe Percivall
Hello Chris, The EvaluateJsonPath processor has the property "Destination" which gives you the option to send it either to the FlowFile content or a FlowFile attribute. Selecting "flowfile-attribute" will place the value in the "kafka.key" attribute of the FlowFile. You can find documentation f

Re: Help on creating that flow that requires processing attributes in a flow content but need to preserve the original flow content

2016-03-21 Thread Joe Witt
Chris, Sounds like you have the right flow in mind already. EvaluateJsonPath does not write content. It merely evaluates the given JsonPath expression against the content of the flowfile and, if appropriate, creates a flowfile attribute from what it finds. For example if you have JSON from Twitter

Help on creating that flow that requires processing attributes in a flow content but need to preserve the original flow content

2016-03-21 Thread McDermott, Chris Kevin (MSDU - STaTS/StorefrontRemote)
What I need to do is read a file from Kafka. The Kafka key contains a JSON string which I need to turn into FlowFile attributes while preserving the original FlowFile content. Obviously I can use EvaluateJsonPath, but that necessitates replacing the FlowFile content with the kafka.key attribute, th