Re: Json Split

Madhukar Thota Tue, 17 May 2016 12:15:03 -0700

I simply went with ExecuteScript Processor to do the job:

Here is the code I am using:


import org.apache.commons.io.IOUtils
import java.nio.charset.*

def flowFile = session.get();
if (flowFile == null) {
    return;
}
// Overwrite the flow file content with the second line of the input
flowFile = session.write(flowFile,
    { inputStream, outputStream ->
        def jsonInput = IOUtils.toString(inputStream, StandardCharsets.UTF_8)
        // Split on LF or CRLF: values[0] is the index line, values[1] the document
        def values = jsonInput.split('\\r?\\n')
        outputStream.write(values[1].getBytes(StandardCharsets.UTF_8))
    } as StreamCallback)
session.transfer(flowFile, ExecuteScript.REL_SUCCESS)
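
A minimal standalone sketch of the same split logic, runnable outside NiFi, in case it helps to try the regex on sample data first. The helper name secondLine is hypothetical and not part of any NiFi API; it also returns null instead of throwing when the payload has no second line, which the script above does not guard against (values[1] fails on single-line content).

```groovy
// Hypothetical helper, not part of NiFi: returns the second line of a
// newline-delimited payload, or null when there is no second line.
def secondLine(String payload) {
    def lines = payload.split('\\r?\\n')
    return lines.length > 1 ? lines[1] : null
}

// Sample payload taken from this thread: an index line followed by the document
def sample = '{"index":{"_index":"mylogger-2014.06.05","_type":"mytype-host.domain.com"}}\n' +
             '{"json":"data","extracted":"from","message":"payload"}'

assert secondLine(sample) == '{"json":"data","extracted":"from","message":"payload"}'
assert secondLine('only one line') == null
```

Inside ExecuteScript the null case could be routed to a failure relationship rather than written out, but that choice depends on the flow.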

Thanks,
Madhu

On Tue, May 17, 2016 at 3:12 PM, Bryan Bende <bbe...@gmail.com> wrote:

> I think another alternative could be to use RouteText...
>
> If you set the Matching Strategy to "starts with" and add a dynamic
> property called "matched" with a value of {"json, any lines that start
> with {"json will be sent to the matched relationship.
>
> On Tue, May 17, 2016 at 3:08 PM, Bryan Bende <bbe...@gmail.com> wrote:
>
>> If you only want the second JSON document, can you send the output of
>> SplitText to EvaluateJsonPath and configure it to extract $.json ?
>>
>> In your original example only the second document had a field called
>> "json", and the matched relationship coming out of EvaluateJsonPath will
>> only receive the json documents that had the path being extracted.
>>
>> -Bryan
>>
>>
>> On Tue, May 17, 2016 at 1:52 PM, Madhukar Thota <madhukar.th...@gmail.com
>> > wrote:
>>
>>> How do I get entry-3: {"json":"data","extracted":"from","message":"payload"}
>>> only?
>>>
>>> On Tue, May 17, 2016 at 1:52 PM, Madhukar Thota <
>>> madhukar.th...@gmail.com> wrote:
>>>
>>>> Hi Andrew,
>>>>
>>>> I configured it as you suggested, but in the queue I see three entries:
>>>>
>>>>
>>>> entry-1: {"index":{"_index":"mylogger-2014.06.05","_type":"mytype-host.domain.com"}}
>>>> {"json":"data","extracted":"from","message":"payload"}
>>>>
>>>> entry-2: {"index":{"_index":"mylogger-2014.06.05","_type":"mytype-host.domain.com"}}
>>>>
>>>> entry-3: {"json":"data","extracted":"from","message":"payload"}
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> On Tue, May 17, 2016 at 1:29 PM, Andrew Grande <agra...@hortonworks.com
>>>> > wrote:
>>>>
>>>>> Try SplitText with a header line count of 1. It should skip it and
>>>>> give the 2nd line as a result.
>>>>>
>>>>> Andrew
>>>>>
>>>>> From: Madhukar Thota <madhukar.th...@gmail.com>
>>>>> Reply-To: "users@nifi.apache.org" <users@nifi.apache.org>
>>>>> Date: Tuesday, May 17, 2016 at 12:31 PM
>>>>> To: "users@nifi.apache.org" <users@nifi.apache.org>
>>>>> Subject: Re: Json Split
>>>>>
>>>>> Hi Bryan,
>>>>>
>>>>> I tried with a line count of 1 and I see it splitting into two
>>>>> documents, but I need only one document:
>>>>>
>>>>> {"json":"data","extracted":"from","message":"payload"}
>>>>>
>>>>> How can I get that?
>>>>>
>>>>> On Tue, May 17, 2016 at 12:21 PM, Bryan Bende <bbe...@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> Hello,
>>>>>>
>>>>>> I think this would probably be better handled by SplitText with a
>>>>>> line count of 1.
>>>>>>
>>>>>> SplitJson would be more for splitting an array of JSON documents, or
>>>>>> a field that is an array.
>>>>>>
>>>>>> -Bryan
>>>>>>
>>>>>> On Tue, May 17, 2016 at 12:15 PM, Madhukar Thota <
>>>>>> madhukar.th...@gmail.com> wrote:
>>>>>>
>>>>>>> I have an incoming JSON message from Kafka with two documents
>>>>>>> separated by a newline:
>>>>>>>
>>>>>>> {"index":{"_index":"mylogger-2014.06.05","_type":"mytype-host.domain.com"}}
>>>>>>> {"json":"data","extracted":"from","message":"payload"}
>>>>>>>
>>>>>>>
>>>>>>> I want to get the second document, the one after the newline. How can
>>>>>>> I split the JSON by newline using the SplitJson processor?
>>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>
