Hello Etienne. Yes, Matt may have mentioned that approach and I started to
look into it.

My initial thought was this: is it much of a savings? My rudimentary
flow works in three processor steps, each simple to configure.
JoltTransformJSON would eliminate only one processor, and it looks
fairly complex to configure. It appears to require a Custom Transformation
Class Name, a Custom Module Directory, and a Jolt Specification. For folks
who have done it before those may be an afterthought. But as is often the
case with NiFi, if you've never used a processor it can be hard to find
concrete examples of how to configure it (the same goes for services,
schemas, and so on). Not being familiar with the Jolt transformation
processor, I opted to take the more familiar path.
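
To make the suggestion concrete, here is how I understand the reshaping
the Jolt step would perform before the split: one JSON object becomes one
object per list value, with the common keys duplicated in each. This is a
plain-Python sketch of that reshaping only, not a Jolt specification; the
function name explode_on_key is my own and is not part of NiFi or Jolt.

```python
import json


def explode_on_key(record: dict, key: str) -> list[dict]:
    """Return one copy of `record` per element of record[key].

    In each copy the list is replaced by a single element, and every
    other key/value pair is duplicated unchanged.
    """
    return [{**record, key: value} for value in record[key]]


# Sample input taken from the flowfile shown later in this thread.
incoming = json.loads(
    '{"KEY1":"value1","KEY2":"value2","FNAMES":["A","B","C","D"],"KEY4":2}'
)

for element in explode_on_key(incoming, "FNAMES"):
    print(json.dumps(element))
```

Each printed object would then be what a later split hands on as its own
flowfile content.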

Am happy to learn and will see if there's much out there in the way of
examples for configuring JoltTransformJSON. For now I'll use my less
elegant solution, which works and gets me where I need to be: pumping data
through my production system.
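
For anyone reading along, the logic of the three-step flow quoted below
(EvaluateJsonPath -> SplitJson -> ExtractText) can be sketched outside NiFi
roughly like this. It is an illustration in plain Python, not NiFi code:
the function unwind and the dict layout for a "flowfile" are my own
inventions, and the zero-based fragment.index is an assumption I have not
checked against the processor.

```python
import json
import re
import uuid


def as_text(value) -> str:
    """Attributes are strings; render non-string JSON values as JSON text."""
    return value if isinstance(value, str) else json.dumps(value)


def unwind(content: str, list_key: str, new_attr: str) -> list[dict]:
    record = json.loads(content)

    # EvaluateJsonPath step: every top-level key becomes an attribute.
    attributes = {key: as_text(value) for key, value in record.items()}

    # SplitJson step: one output per list element, sharing an identifier.
    values = record[list_key]
    fragment_id = str(uuid.uuid4())

    flowfiles = []
    for index, value in enumerate(values):
        split_content = as_text(value)
        split_attrs = dict(attributes)
        split_attrs["fragment.identifier"] = fragment_id
        split_attrs["fragment.index"] = str(index)  # assumed zero-based
        split_attrs["fragment.count"] = str(len(values))
        # ExtractText step: the regex (.*) captures the (short) content.
        split_attrs[new_attr] = re.match(r"(.*)", split_content).group(1)
        flowfiles.append({"attributes": split_attrs, "content": split_content})
    return flowfiles


sample = '{"KEY1":"value1","KEY2":"value2","FNAMES":["A","B","C","D"],"KEY4":2}'
flowfiles = unwind(sample, "FNAMES", "THIS_NAME")
for flowfile in flowfiles:
    print(flowfile["attributes"]["THIS_NAME"], flowfile["attributes"]["fragment.count"])
```

The shared fragment.identifier and fragment.count are what would let a
later merge step reunite the pieces.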

Good suggestion. Thanks again.

On Thu, Dec 5, 2019 at 8:20 AM Etienne Jouvin <[email protected]>
wrote:

> Hello.
>
> Why don't you use a JoltTransformJSON processor first to produce multiple
> elements in the JSON, one per value in the array, duplicating the common
> attributes in each?
> Then you do the split.
>
> Etienne
>
>
> On Thu, Dec 5, 2019 at 2:11 PM, James McMahon <[email protected]>
> wrote:
>
>> Daeho and Matt, thank you for all your suggestions. You helped me get to
>> a solution. Here is how I unwound my incoming JSON with a simple flow.
>>
>> My incoming JSON flowfile looks like this:
>> {
>>   "KEY1":"value1",
>>   "KEY2":"value2",
>>   "FNAMES":["A","B","C","D"],
>>   "KEY4":2
>> }
>> My goal is to have a flowfile for each of A, B, C, and D, with attribute
>> THIS_NAME set to each singular FNAMES value, and also preserving KEY1,
>> KEY2, KEY4 as flowfile attributes with their values pulled from the JSON.
>>
>> Final flow: ListFile->FetchFile->EvaluateJsonPath->SplitJson->ExtractText
>>
>> EvaluateJsonPath grabs all JSON key/values to attributes. At this point
>> though, FNAMES attribute is ["A","B","C","D"] -- not quite what we require.
>>
>> SplitJson, with its JsonPath Expression property set to $.FNAMES,
>> creates four flowfiles from one. We're almost home.
>>
>> The flowfile content is now just one of the singular values from FNAMES.
>> ExtractText creates attribute THIS_NAME, configured like this:
>> Include Capture Group 0         false
>> THIS_NAME (dynamic property)    (.*)
>> (Capturing the whole content with (.*) is a bad idea in general, in any
>> situation where content may grow large, but it is fine in our case because
>> we know the values in the original JSON list are no larger than half a KB.)
>>
>> After this ExtractText step we have all our attributes, including
>> fragment.count of 4 and a common fragment.identifier we can later use to
>> reunite all after individual processing, with a MergeContent or similar.
>>
>> Thank you once again.
>>
>> On Thu, Dec 5, 2019 at 6:36 AM 노대호Daeho Ro <[email protected]>
>> wrote:
>>
>>> Hm... I might be wrong.
>>>
>>> It wouldn't preserve the other keys, so you have to evaluate the other
>>> keys first, then split FNAMES and evaluate again. Sorry for the confusion.
>>>
>>> On Thu, Dec 5, 2019 at 8:29 PM, James McMahon <[email protected]> wrote:
>>>
>>>> Typo in my initial reply. I did use $.FNAMES. It drops all the other
>>>> key/value pairs in the output split result flowfiles.
>>>> I configured my SplitJson like so:
>>>> JsonPathExpression    $.FNAMES
>>>> Null Value Representation     empty string
>>>>
>>>> If there are two values in the JSON array for that key FNAMES, I do
>>>> get two output flowfiles. But the only value present in the output is the
>>>> value from the split of the value list of FNAMES. All my other JSON keys
>>>> and values are not present. How do I tell SplitJson to also retain all the
>>>> key/values I did not split on?
>>>>
>>>> On Thu, Dec 5, 2019 at 6:15 AM 노대호Daeho Ro <[email protected]>
>>>> wrote:
>>>>
>>>>> The path should be $.FNAMES; that will work, I guess.
>>>>>
>>>>> On Thu, Dec 5, 2019 at 8:10 PM, James McMahon <[email protected]> wrote:
>>>>>
>>>>>> I should add that I also tried this for JsonPathExpression: $.*
>>>>>> That result also wasn't what I require, because it gave me 14
>>>>>> different flowfiles, each with only one value: the two that resulted
>>>>>> from the FNAMES key, and one for each of the other 12 keys that had
>>>>>> only one value.
>>>>>> My incoming JSON flowfile looks like this:
>>>>>> {
>>>>>>   "KEY1":"value1",
>>>>>>   "KEY2":"value2",
>>>>>>    .
>>>>>>    .
>>>>>>   "FNAMES":["A","B"],
>>>>>>   "KEY13":2
>>>>>> }
>>>>>>
>>>>>> This is what I need as output:
>>>>>> {
>>>>>>   "KEY1":"value1",
>>>>>>   "KEY2":"value2",
>>>>>>    .
>>>>>>    .
>>>>>>   "FNAMES":"A",
>>>>>>   "KEY13":2
>>>>>> }
>>>>>>
>>>>>> and
>>>>>>
>>>>>> {
>>>>>>   "KEY1":"value1",
>>>>>>   "KEY2":"value2",
>>>>>>    .
>>>>>>    .
>>>>>>   "FNAMES":"B",
>>>>>>   "KEY13":2
>>>>>> }
>>>>>>
>>>>>> How does one configure SplitJson to accomplish that?
>>>>>>
>>>>>> On Thu, Dec 5, 2019 at 5:59 AM James McMahon <[email protected]>
>>>>>> wrote:
>>>>>>
>>>>>>> Daeho, I configured my SplitJson like so:
>>>>>>> JsonPathExpression    $.FNAME
>>>>>>> Null Value Representation     empty string
>>>>>>>
>>>>>>> If there are two values in the JSON array for that key FNAME, I do
>>>>>>> get two output flowfiles. But the only value present in the output is
>>>>>>> the value from the split of the list. All my other JSON keys and values
>>>>>>> are not present. How do I tell SplitJson to also retain all the
>>>>>>> key/values I did not split on?
>>>>>>>
>>>>>>>
>>>>>>> On Wed, Dec 4, 2019 at 9:26 PM 노대호Daeho Ro <
>>>>>>> [email protected]> wrote:
>>>>>>>
>>>>>>>> Of course.
>>>>>>>>
>>>>>>>> There is a processor named SplitJson. It can split the JSON text
>>>>>>>> on a given key. For example, suppose there is a key named 'fname'
>>>>>>>> with the value [a, b, c]. Once you split the JSON with that
>>>>>>>> processor, each resulting JSON will have the same keys and values
>>>>>>>> for everything else, but 'fname' will be a for the first JSON, b for
>>>>>>>> the second, and so on.
>>>>>>>>
>>>>>>>> After that, run EvaluateJsonPath for FNAME; it will then yield a, b,
>>>>>>>> and c for the respective split flowfiles. Thus, I recommend placing
>>>>>>>> the SplitJson processor in front of the EvaluateJsonPath processor.
>>>>>>>>
>>>>>>>> On Thu, Dec 5, 2019 at 10:58 AM, James McMahon <[email protected]>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> I don't quite follow, Daeho. FNAME is an attribute that results
>>>>>>>>> *from* EvaluateJsonPath. Can you explain what you mean by
>>>>>>>>> splitting the JSON key before EvaluateJsonPath?
>>>>>>>>> Jim
>>>>>>>>>
>>>>>>>>> On Wed, Dec 4, 2019 at 7:45 PM 노대호Daeho Ro <
>>>>>>>>> [email protected]> wrote:
>>>>>>>>>
>>>>>>>>>> I think you can split the json key for FNAME just before the
>>>>>>>>>> EvaluateJsonPath processor. Then, the fragment.* attributes will be
>>>>>>>>>> automatically created.
>>>>>>>>>>
>>>>>>>>>> On Thu, Dec 5, 2019 at 8:24 AM, Matt Burgess <[email protected]>
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>> Jim,
>>>>>>>>>>>
>>>>>>>>>>> As of NiFi 1.8.0 [1], you should be able to do this with an
>>>>>>>>>>> UpdateAttribute -> DuplicateFlowFile -> UpdateAttribute pattern:
>>>>>>>>>>> the first gets the number of values in the list via the count() EL
>>>>>>>>>>> function, and the second uses that (minus 1) to generate
>>>>>>>>>>> duplicates, each with a copy.index attribute set. That attribute
>>>>>>>>>>> can be used in another UpdateAttribute with the getDelimitedField()
>>>>>>>>>>> EL function for each flow file to get its own value from FNAME. You
>>>>>>>>>>> may need to rename some of the attributes to fragment.* in order to
>>>>>>>>>>> use a merge processor, but I think all the necessary values are
>>>>>>>>>>> covered. Please let me know if this works for you or not. I added
>>>>>>>>>>> various improvements in order to support use cases like this, but
>>>>>>>>>>> if I missed something I can certainly add it.
>>>>>>>>>>>
>>>>>>>>>>> Regards,
>>>>>>>>>>> Matt
>>>>>>>>>>>
>>>>>>>>>>> [1] https://issues.apache.org/jira/browse/NIFI-5454
>>>>>>>>>>>
>>>>>>>>>>> On Wed, Dec 4, 2019 at 4:54 PM James McMahon <
>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>> >
>>>>>>>>>>> > I have a series of attributes that result from an
>>>>>>>>>>> > EvaluateJsonPath. One of those attributes, FNAME, appears to be a
>>>>>>>>>>> > list of values like so: ["A","B","C"]. I want to split my flow
>>>>>>>>>>> > file into one for each list element. I need my results to have the
>>>>>>>>>>> > original content, all the original attributes, and its value for
>>>>>>>>>>> > the split result out of the list as a new attribute. I also need
>>>>>>>>>>> > to know the split count, and be able to later merge my flow files
>>>>>>>>>>> > after evaluating the results of the split.
>>>>>>>>>>> > How can I accomplish this?
>>>>>>>>>>> > Thanks very much in advance.
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> --
>>>>>>>>>> 노대호  *Daeho Ro */ Service Dev.
>>>>>>>>>> [email protected] / *M* +82 10-6366-2636
>>>>>>>>>> *KR* 06167 서울시 서초구 강남대로 327, 13층
>>>>>>>>>> <https://www.google.com/maps/search/%EC%84%9C%EC%9A%B8%EC%8B%9C+%EC%84%9C%EC%B4%88%EA%B5%AC+%EA%B0%95%EB%82%A8%EB%8C%80%EB%A1%9C+327,+13%EC%B8%B5?entry=gmail&source=g>
>>>>>>>>>> [image: Bespin Global] <https://bespinglobal.com/>
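>>>>>>>>>> The MSP holding the most cloud certifications in Korea • The only
>>>>>>>>>> MSP in Korea with ISO certification • The only MSP in Korea, China,
>>>>>>>>>> and Japan recognized by Gartner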
>>>>>>>>>> www.bespinglobal.com • ISO 27001:2013 • ISO 9001:2015 Certified
>>>>>>>>>> ------------------------------
>>>>>>>>>> *Confidentiality Note:* This email may contain confidential
>>>>>>>>>> and/or private information.
>>>>>>>>>> If you received this email in error please delete and notify the
>>>>>>>>>> sender.
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>
>>>>>
>>>>
>>>
>>>
>>
