Re: Json Split

2016-05-17 Thread Madhukar Thota
I simply went with the ExecuteScript processor to do the job.

Here is the Groovy code I am using:

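import org.apache.commons.io.IOUtils
import org.apache.nifi.processor.io.StreamCallback
import java.nio.charset.StandardCharsets

def flowFile = session.get()
if (flowFile == null) {
    return
}

// Replace the flow file content with its second line (the second JSON document).
flowFile = session.write(flowFile, { inputStream, outputStream ->
    def jsonInput = IOUtils.toString(inputStream, StandardCharsets.UTF_8)
    // Split on LF or CRLF line endings.
    def values = jsonInput.split('\\r?\\n')
    // values[1] is the second line; this assumes the content has at least two lines.
    outputStream.write(values[1].getBytes(StandardCharsets.UTF_8))
} as StreamCallback)
session.transfer(flowFile, ExecuteScript.REL_SUCCESS)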

Thanks,
Madhu
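
A note on the script above: values[1] assumes the content always has at least two lines, and a single-line message would fail with an index error. The following is a minimal plain-Groovy sketch of a guarded version of the same split. The helper name secondLine is hypothetical and not part of NiFi, and what to do with a single-line message (here, return null) is an assumption left to the flow designer.

```groovy
// Hypothetical helper, not part of NiFi: returns the second line of the
// content, or null when the content has fewer than two lines.
String secondLine(String content) {
    // Same regex as the script above: split on LF or CRLF.
    def lines = content.split('\\r?\\n')
    return lines.length > 1 ? lines[1] : null
}

def payload = '{"index":{"_index":"mylogger-2014.06.05","_type":"mytype-host.domain.com"}}\n' +
              '{"json":"data","extracted":"from","message":"payload"}'

assert secondLine(payload) == '{"json":"data","extracted":"from","message":"payload"}'
assert secondLine('only one line') == null
```

Inside ExecuteScript, a null result could be used to route the flow file to REL_FAILURE instead of writing new content.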

On Tue, May 17, 2016 at 3:12 PM, Bryan Bende <bbe...@gmail.com> wrote:

> I think another alternative could be to use RouteText...
>
> If you set the Matching Strategy to "starts with" and add a dynamic
> property called "matched" with a value of {"json, it will send any lines
> that start with {"json to the matched relationship.
>
> On Tue, May 17, 2016 at 3:08 PM, Bryan Bende <bbe...@gmail.com> wrote:
>
>> If you only want the second JSON document, can you send the output of
>> SplitText to EvaluateJsonPath and configure it to extract $.json?
>>
>> In your original example only the second document had a field called
>> "json", and the matched relationship coming out of EvaluateJsonPath will
>> only receive the JSON documents that had the path being extracted.
>>
>> -Bryan
>>
>>
>> On Tue, May 17, 2016 at 1:52 PM, Madhukar Thota <madhukar.th...@gmail.com>
>> wrote:
>>
>>> How do I get only entry-3:
>>> {"json":"data","extracted":"from","message":"payload"}?
>>>
>>> On Tue, May 17, 2016 at 1:52 PM, Madhukar Thota <
>>> madhukar.th...@gmail.com> wrote:
>>>
>>>> Hi Andrew,
>>>>
>>>> I configured it as you suggested, but in the queue I see three entries:
>>>>
>>>> entry-1: {"index":{"_index":"mylogger-2014.06.05","_type":"mytype-host.domain.com"}}
>>>> {"json":"data","extracted":"from","message":"payload"}
>>>>
>>>> entry-2: {"index":{"_index":"mylogger-2014.06.05","_type":"mytype-host.domain.com"}}
>>>>
>>>> entry-3: {"json":"data","extracted":"from","message":"payload"}
>>>>
>>>> On Tue, May 17, 2016 at 1:29 PM, Andrew Grande <agra...@hortonworks.com>
>>>> wrote:
>>>>
>>>>> Try SplitText with a header line count of 1. It should skip it and
>>>>> give the 2nd line as a result.
>>>>>
>>>>> Andrew
>>>>>
>>>>> From: Madhukar Thota <madhukar.th...@gmail.com>
>>>>> Reply-To: "users@nifi.apache.org" <users@nifi.apache.org>
>>>>> Date: Tuesday, May 17, 2016 at 12:31 PM
>>>>> To: "users@nifi.apache.org" <users@nifi.apache.org>
>>>>> Subject: Re: Json Split
>>>>>
>>>>> Hi Bryan,
>>>>>
>>>>> I tried with a line count of 1, and I see it splitting into two
>>>>> documents. But I need only one document:
>>>>>
>>>>> {"json":"data","extracted":"from","message":"payload"}
>>>>>
>>>>> How can I get that?
>>>>>
>>>>> On Tue, May 17, 2016 at 12:21 PM, Bryan Bende <bbe...@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> Hello,
>>>>>>
>>>>>> I think this would probably be better handled by SplitText with a
>>>>>> line count of 1.
>>>>>>
>>>>>> SplitJson would be more for splitting an array of JSON documents, or
>>>>>> a field that is an array.
>>>>>>
>>>>>> -Bryan
>>>>>>
>>>>>> On Tue, May 17, 2016 at 12:15 PM, Madhukar Thota <
>>>>>> madhukar.th...@gmail.com> wrote:
>>>>>>
>>>>>>> I have incoming JSON from Kafka with two documents separated by a
>>>>>>> newline:
>>>>>>>
>>>>>>> {"index":{"_index":"mylogger-2014.06.05","_type":"mytype-host.domain.com"}}
>>>>>>> {"json":"data","extracted":"from","message":"payload"}
>>>>>>>
>>>>>>> I want to get the second document, after the newline. How can I split
>>>>>>> the JSON by newline using the SplitJson processor?


Re: Json Split

2016-05-17 Thread Bryan Bende
I think another alternative could be to use RouteText...

If you set the Matching Strategy to "starts with" and add a dynamic
property called "matched" with a value of {"json, it will send any lines
that start with {"json to the matched relationship.
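
To make the matching rule concrete, here is a small plain-Groovy sketch of what a "starts with" test does to the two lines from the original message. It only illustrates the rule. It is not RouteText's implementation, and the variable names are chosen for this example.

```groovy
// Illustration only: plain Groovy, no NiFi classes involved.
def prefix = '{"json'   // the value given to the "matched" dynamic property

def content = '{"index":{"_index":"mylogger-2014.06.05","_type":"mytype-host.domain.com"}}\n' +
              '{"json":"data","extracted":"from","message":"payload"}'

// Lines that start with the prefix are the ones that would be routed to "matched".
def matched = content.readLines().findAll { it.startsWith(prefix) }
def unmatched = content.readLines().findAll { !it.startsWith(prefix) }

assert matched == ['{"json":"data","extracted":"from","message":"payload"}']
assert unmatched.size() == 1
```

Unlike picking the second line by position, this selects by content, so it would still work if the order of the two documents changed.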



Re: Json Split

2016-05-17 Thread Bryan Bende
If you only want the second JSON document, can you send the output of
SplitText to EvaluateJsonPath and configure it to extract $.json?

In your original example only the second document had a field called
"json", and the matched relationship coming out of EvaluateJsonPath will
only receive the JSON documents that had the path being extracted.

-Bryan
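
As a rough illustration of the selection idea, the sketch below parses each split document and keeps only those with a top-level "json" field, which is the field that the path $.json addresses. It uses Groovy's JsonSlurper rather than EvaluateJsonPath, so it shows the idea only and not the processor's actual behavior or configuration.

```groovy
import groovy.json.JsonSlurper

def slurper = new JsonSlurper()

// The two documents as they would arrive after splitting by line.
def documents = [
    '{"index":{"_index":"mylogger-2014.06.05","_type":"mytype-host.domain.com"}}',
    '{"json":"data","extracted":"from","message":"payload"}'
]

// Keep only the documents that have a top-level "json" field.
def withJsonField = documents.findAll { text ->
    def doc = slurper.parseText(text)
    doc instanceof Map && doc.containsKey('json')
}

assert withJsonField.size() == 1
assert slurper.parseText(withJsonField[0]).json == 'data'
```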




Re: Json Split

2016-05-17 Thread Madhukar Thota
Hi Andrew,

I configured it as you suggested, but in the queue I see three entries:

entry-1: {"index":{"_index":"mylogger-2014.06.05","_type":"mytype-host.domain.com"}}
{"json":"data","extracted":"from","message":"payload"}

entry-2: {"index":{"_index":"mylogger-2014.06.05","_type":"mytype-host.domain.com"}}

entry-3: {"json":"data","extracted":"from","message":"payload"}



Re: Json Split

2016-05-17 Thread Andrew Grande
Try SplitText with a header line count of 1. It should skip it and give the 2nd 
line as a result.

Andrew




Re: Json Split

2016-05-17 Thread Madhukar Thota
Hi Bryan,

I tried with a line count of 1, and I see it splitting into two documents.
But I need only one document:

{"json":"data","extracted":"from","message":"payload"}

How can I get that?



Re: Json Split

2016-05-17 Thread Bryan Bende
Hello,

I think this would probably be better handled by SplitText with a line
count of 1.

SplitJson would be more for splitting an array of JSON documents, or a
field that is an array.

-Bryan
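
The distinction between the two cases can be sketched in plain Groovy. This is an illustration of the two input shapes, not of how SplitText or SplitJson are implemented, and the sample documents are made up for the example.

```groovy
import groovy.json.JsonOutput
import groovy.json.JsonSlurper

// Case 1: newline-delimited documents. The input as a whole is not one JSON
// value, so it is split as text, one document per line.
def newlineDelimited = '{"a":1}\n{"b":2}'
def byLine = newlineDelimited.readLines()
assert byLine == ['{"a":1}', '{"b":2}']

// Case 2: a single JSON array. It is parsed as JSON and then split into
// one document per array element.
def jsonArray = '[{"a":1},{"b":2}]'
def byElement = new JsonSlurper().parseText(jsonArray).collect { JsonOutput.toJson(it) }
assert byElement == ['{"a":1}', '{"b":2}']
```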



Json Split

2016-05-17 Thread Madhukar Thota
I have incoming JSON from Kafka with two documents separated by a newline:

{"index":{"_index":"mylogger-2014.06.05","_type":"mytype-host.domain.com"}}
{"json":"data","extracted":"from","message":"payload"}

I want to get the second document, after the newline. How can I split the JSON
by newline using the SplitJson processor?