MikeThomsen commented on issue #3414: NIFI-5900 Add a SplitLargeJson processor
URL: https://github.com/apache/nifi/pull/3414#issuecomment-496493612
 
 
   @arenger 
   
   > but it does not show up in the UI
   
   You need to add an entry for your processor class to the 
`org.apache.nifi.processor.Processor` file under `resources/META-INF/services` 
in the standard processors bundle.
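
For illustration, the registration is one line in that service file, which as far as I know lives under `META-INF/services` in the bundle's resources. The fully qualified class name below is an assumption based on the PR title and the usual standard-processors package, so substitute whatever the class is actually called:

```text
# src/main/resources/META-INF/services/org.apache.nifi.processor.Processor
# Hypothetical entry; the package and class name are assumed, not taken from the PR.
org.apache.nifi.processors.standard.SplitLargeJson
```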
   
   FWIW, I think going to JsonSurfer is the right call, in part because it 
wraps known quantities like Gson and Jackson. When I did that streaming JSON 
parser service, it was pretty easy once I got the hang of it.
   
   The only thing in your implementation I am worried about is the 1:1 
flowfile/record thing. I think you should have batching built in as an option. 
The use case that caused my initial use of JsonSurfer was a 16GB JSON file that 
would have put about 80M flowfiles into the queue. Not typical, but even 
putting a few hundred thousand tiny flowfiles into the queue at once is not 
necessarily a good thing either.
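
To make the batching idea concrete, here is a minimal sketch in plain Java, not the PR's code and not tied to the NiFi or JsonSurfer APIs. `JsonBatcher`, `batchSize`, and `emit` are hypothetical names; `emit` stands in for "write one flowfile", and the fragments are assumed to be already-serialized JSON values coming from a streaming parser.

```java
import java.util.Iterator;
import java.util.function.Consumer;

public class JsonBatcher {

    /**
     * Groups streamed JSON fragments into JSON arrays of at most batchSize
     * elements and hands each array to emit as soon as it is full.
     * Only one batch is held in memory at a time.
     */
    public static void batch(Iterator<String> fragments, int batchSize, Consumer<String> emit) {
        if (batchSize < 1) {
            throw new IllegalArgumentException("batchSize must be >= 1");
        }
        StringBuilder current = new StringBuilder();
        int count = 0;
        while (fragments.hasNext()) {
            // Open the array on the first element, otherwise separate with a comma.
            current.append(count == 0 ? "[" : ",").append(fragments.next());
            count++;
            if (count == batchSize) {
                emit.accept(current.append("]").toString());
                current.setLength(0);
                count = 0;
            }
        }
        // Flush the final, possibly partial, batch.
        if (count > 0) {
            emit.accept(current.append("]").toString());
        }
    }
}
```

With a batch size of 1 this degenerates to the current 1:1 behavior (each record wrapped in a single-element array), so it could be exposed as an optional property without changing the default.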

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services
