So I’m pretty lost now. None of Matt’s suggestions solve my problem: I need all the contents of a flow file as attribute key/value pairs…
A good place for it would be ConvertAvroToJSON, with an option that says whether the output goes to attributes or to the FlowFile content, defaulting to FlowFile. Would the change be accepted? I would create a PR for it.

Jorge Machado

> On 20 Mar 2018, at 22:35, Otto Fowler <[email protected]> wrote:
>
> We could start with routeOnJsonPath and do the record path as the need
> arises?
>
> On March 20, 2018 at 16:06:34, Matt Burgess ([email protected]) wrote:
>
> Rather than restricting it to JSONPath, perhaps we should have a
> RouteOnRecordPath or RouteRecord using the RecordPath API? Even better
> would be the ability to use RecordPath functions in QueryRecord, but
> that involves digging into Calcite as well. I realize JSONPath might
> have more capabilities than RecordPath at the moment, but it seems a
> shame to force the user to convert to JSON to use a "RouteOnJSONPath"
> processor, the record-aware processors are meant to replace that kind
> of format-specific functionality.
>
> Regards,
> Matt
>
> On Tue, Mar 20, 2018 at 12:19 PM, Sivaprasanna <[email protected]> wrote:
>> Like the idea that Otto suggested. RouteOnJSONPath makes more sense since
>> making the flattened JSON write to attributes is restricted to that
>> processor alone.
>>
>> On Tue, Mar 20, 2018 at 8:37 PM, Otto Fowler <[email protected]> wrote:
>>
>>> Why not create a new processor that does routeOnJSONPath and works on
>>> the flow file?
>>>
>>> On March 20, 2018 at 10:39:37, Jorge Machado ([email protected]) wrote:
>>>
>>> So that is what we are actually doing with EvaluateJsonPath. The problem
>>> with that is that it is hard to build something generic if we need to
>>> specify each property by its name; that’s why this idea.
>>>
>>> Should I make a PR for this or is this too business specific?
>>>
>>> Jorge Machado
>>>
>>>> On 20 Mar 2018, at 15:30, Bryan Bende <[email protected]> wrote:
>>>>
>>>> Ok so I guess it depends whether you end up needing all 30 fields as
>>>> attributes to achieve the logic in your flow, or if you only need a
>>>> couple.
>>>>
>>>> If you only need a couple you could probably use EvaluateJsonPath
>>>> after FlattenJson to extract just the couple of fields you need into
>>>> attributes.
>>>>
>>>> If you need them all then I guess it makes sense to want the option to
>>>> flatten into attributes.
>>>>
>>>> On Tue, Mar 20, 2018 at 10:14 AM, Jorge Machado <[email protected]> wrote:
>>>>> From there on we use a lot of RouteOnAttribute and use those values in
>>>>> SQL queries to other tables, like
>>>>> select * from someTable where id=${myExtractedAttribute}
>>>>> To be honest I tried JoltTransformJSON but I could not get it working :)
>>>>>
>>>>> Jorge Machado
>>>>>
>>>>>> On 20 Mar 2018, at 15:12, Matt Burgess <[email protected]> wrote:
>>>>>>
>>>>>> I think Bryan is asking about what happens AFTER this part of the
>>>>>> flow. For example, if you are doing routing you can use QueryRecord
>>>>>> (and you won't need the SplitJson), if you are doing transformations
>>>>>> you can use JoltTransformJSON (often without SplitJson as well), etc.
>>>>>>
>>>>>> Regards,
>>>>>> Matt
>>>>>>
>>>>>> On Tue, Mar 20, 2018 at 10:08 AM, Jorge Machado <[email protected]> wrote:
>>>>>>> Hi Bryan,
>>>>>>>
>>>>>>> thanks for the help.
>>>>>>> Our Flow: ExecuteSql -> convertToJSON -> SplitJson -> ExecuteScript
>>>>>>> with attached code 1.
>>>>>>>
>>>>>>> We are now writing a custom processor that does this, which is a copy
>>>>>>> of FlattenJson, but instead of putting the result into the flow file
>>>>>>> content we put it into the attributes.
>>>>>>> That’s why I asked if it makes sense to contribute this back
>>>>>>>
>>>>>>> Attached code 1:
>>>>>>>
>>>>>>> import org.apache.commons.io.IOUtils
>>>>>>> import org.apache.nifi.processor.io.InputStreamCallback
>>>>>>> import java.nio.charset.*
>>>>>>>
>>>>>>> def flowFile = session.get();
>>>>>>> if (flowFile == null) {
>>>>>>>     return;
>>>>>>> }
>>>>>>> def slurper = new groovy.json.JsonSlurper()
>>>>>>> def attrs = [:] as Map<String,String>
>>>>>>> // Read the content and copy every non-empty top-level field into attrs
>>>>>>> session.read(flowFile,
>>>>>>>     { inputStream ->
>>>>>>>         def text = IOUtils.toString(inputStream, StandardCharsets.UTF_8)
>>>>>>>         def obj = slurper.parseText(text)
>>>>>>>         obj.each { k, v ->
>>>>>>>             if (v != null && v.toString() != "") {
>>>>>>>                 attrs[k] = v.toString()
>>>>>>>             }
>>>>>>>         }
>>>>>>>     } as InputStreamCallback)
>>>>>>> flowFile = session.putAllAttributes(flowFile, attrs)
>>>>>>> session.transfer(flowFile, REL_SUCCESS)
>>>>>>>
>>>>>>> some code removed
>>>>>>>
>>>>>>> Jorge Machado
>>>>>>>
>>>>>>>> On 20 Mar 2018, at 15:03, Bryan Bende <[email protected]> wrote:
>>>>>>>>
>>>>>>>> Ok it is still not clear what the reason for needing it in attributes
>>>>>>>> is though... Is there another processor you are using after this that
>>>>>>>> only works off attributes?
>>>>>>>>
>>>>>>>> Just trying to understand if there is another way to accomplish what
>>>>>>>> you want to do.
>>>>>>>>
>>>>>>>> On Tue, Mar 20, 2018 at 9:50 AM, Jorge Machado <[email protected]> wrote:
>>>>>>>>> We are using NiFi for workflow, and we get columns such as job_status
>>>>>>>>> and job_name plus some nested JSON columns from a database (30 columns).
>>>>>>>>> We need them as attributes of the flow file, not as content. The first
>>>>>>>>> part (columns without JSON) is done by a Groovy script, but it would be
>>>>>>>>> nice to use this standard processor and write the result to attributes
>>>>>>>>> instead of to the flow file content.
>>>>>>>>>
>>>>>>>>> Jorge Machado
>>>>>>>>>
>>>>>>>>>> On 20 Mar 2018, at 14:47, Bryan Bende <[email protected]> wrote:
>>>>>>>>>>
>>>>>>>>>> What would be the main use case for wanting all the flattened values
>>>>>>>>>> in attributes?
>>>>>>>>>>
>>>>>>>>>> If the reason was to keep the original content, we could probably just
>>>>>>>>>> add an original relationship.
>>>>>>>>>>
>>>>>>>>>> Also, I think FlattenJson supports flattening a flow file where the
>>>>>>>>>> root is an array of JSON documents (although I'm not totally sure), so
>>>>>>>>>> you'd have to consider what to do in that case.
>>>>>>>>>>
>>>>>>>>>> On Tue, Mar 20, 2018 at 5:26 AM, Pierre Villard
>>>>>>>>>> <[email protected]> wrote:
>>>>>>>>>>> No I do see how this could be convenient in some cases. My comment was
>>>>>>>>>>> more: you can certainly submit a PR for that feature, but it'll need to be
>>>>>>>>>>> clearly documented using the appropriate annotations, documentation, and
>>>>>>>>>>> property descriptions.
>>>>>>>>>>>
>>>>>>>>>>> 2018-03-20 10:20 GMT+01:00 Jorge Machado <[email protected]>:
>>>>>>>>>>>
>>>>>>>>>>>> Hi Pierre, I’m aware of that. So this means the change would not be
>>>>>>>>>>>> accepted, correct?
>>>>>>>>>>>>
>>>>>>>>>>>> Regards
>>>>>>>>>>>>
>>>>>>>>>>>> Jorge Machado
>>>>>>>>>>>>
>>>>>>>>>>>>> On 20 Mar 2018, at 09:54, Pierre Villard <[email protected]> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>> Hi Jorge,
>>>>>>>>>>>>>
>>>>>>>>>>>>> I think this should be carefully documented to remind users that the
>>>>>>>>>>>>> attributes are in memory. Doing what you propose would mean having in
>>>>>>>>>>>>> memory the full content of the flow file as long as the flow file is
>>>>>>>>>>>>> processed in the workflow (unless you remove attributes using
>>>>>>>>>>>>> UpdateAttribute).
>>>>>>>>>>>>>
>>>>>>>>>>>>> Pierre
>>>>>>>>>>>>>
>>>>>>>>>>>>> 2018-03-20 7:55 GMT+01:00 Jorge Machado <[email protected]>:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hey guys,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I would like to change the FlattenJson processor so that it can
>>>>>>>>>>>>>> flatten to attributes instead of only to content. Is this a good
>>>>>>>>>>>>>> idea? Would the PR be accepted?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Cheers
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Jorge Machado
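
For illustration only, here is a minimal Java sketch of what "flatten to attributes" could look like once the JSON has already been parsed into nested Maps and Lists. The class name `AttributeFlattener`, its `flatten` method, and the dot-separated key scheme are assumptions made for this sketch; they are not NiFi API and not how FlattenJson is implemented. It keeps the rule from the Groovy script above of skipping null and empty values.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper (not part of NiFi): turns an already-parsed JSON
// document, i.e. nested Maps and Lists, into flat String key/value pairs.
final class AttributeFlattener {

    private AttributeFlattener() { }

    // Returns one entry per leaf value; nested keys are joined with '.'
    // and list elements use their index (an assumed naming scheme).
    static Map<String, String> flatten(Map<String, ?> document) {
        Map<String, String> attributes = new LinkedHashMap<>();
        for (Map.Entry<String, ?> entry : document.entrySet()) {
            collect(entry.getKey(), entry.getValue(), attributes);
        }
        return attributes;
    }

    private static void collect(String key, Object value, Map<String, String> out) {
        if (value instanceof Map) {
            // Nested object: recurse with "parent.child" keys.
            for (Map.Entry<?, ?> entry : ((Map<?, ?>) value).entrySet()) {
                collect(key + "." + entry.getKey(), entry.getValue(), out);
            }
        } else if (value instanceof List) {
            // Array: recurse with "parent.index" keys.
            List<?> items = (List<?>) value;
            for (int i = 0; i < items.size(); i++) {
                collect(key + "." + i, items.get(i), out);
            }
        } else if (value != null && !value.toString().isEmpty()) {
            // Leaf: skip nulls and empty strings, as the Groovy script does.
            out.put(key, value.toString());
        }
    }
}
```

In a script or processor, the resulting map could then be handed to `session.putAllAttributes(flowFile, attrs)` as in the Groovy script. Pierre's caveat still applies: everything placed in attributes is held in memory for as long as the flow file is being processed.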
