ChrisSamo632 commented on PR #7890:
URL: https://github.com/apache/nifi/pull/7890#issuecomment-1775663229

   > @ChrisSamo632 @exceptionfactory I understand the history from #7782 that 
there had to be two processors `JoltTransformJSON` and `JoltTransformRecord`. 
But my question is why is there duplicate code between the two processors? This 
MR and NIFI-11959 (#7678) are consequences of duplicate code (i.e. the need to 
make the same fix in both places). Is there something we can do here to 
refactor and extract the duplicate code to be shared with these two classes?
   
   @dan-s1 absolutely, I was thinking similar when I made the comment about the 
`*Record` processor in this PR, although I didn't think it was worth extending 
this PR to cover such a refactor. Could you raise a separate Jira ticket to 
refactor these processors to merge the logic where possible, e.g. the 
`PropertyDescriptor`s and such, at the very least, could be merged together.
   
   It would be good to explore whether things such as the custom UI available 
in the `JoltTransformJSON` could be merged together with `JoltTransformRecord` 
in the future, plus the handling of JOLT logic where possible. I'm not overly 
familiar with these processors, but without spending much time looking at the 
code, I'd hope that the only real difference *should* be whether the processor 
is applying the transform to the entire FlowFile content (i.e. in the `*JSON` 
processor) or every Record within (e.g. a JSON Object within an Array, or every 
line of an ndjson/jsonl/ldjson file) - once the appropriate content has been 
read into memory, presumably the same JOLT transformation logic should be 
executed, then the result serialised either directly as the FlowFile's content 
(i.e. for `*JSON`) or via the configured `Record Set Writer` (i.e. for 
`*Record`).
   
   I've tried to do a similar thing for the `PutElasticsearch*` processors 
previously (i.e. `*JSON` and `*Record`), which both extend an 
`AbstractPutElasticsearch` base in an attempt to minimise repetition and avoid 
maintenance problems/inconsistent behaviour between the processors.
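
   As a rough sketch of the shape such a refactor might take (every class name 
below is hypothetical, and a plain `UnaryOperator<String>` stands in for the 
JOLT transform; none of this is the actual NiFi or Jolt API), the shared step 
could live in an abstract base, with each subclass only deciding how content 
is read into units and written back out:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.UnaryOperator;

// Hypothetical base class: owns the logic both processors would share.
abstract class AbstractTransformSketch {
    // Stand-in for the configured JOLT transform (an assumption, not the Jolt API).
    private final UnaryOperator<String> transform;

    AbstractTransformSketch(UnaryOperator<String> transform) {
        this.transform = transform;
    }

    // Subclasses decide how the content is split into units to transform...
    protected abstract List<String> readUnits(String content);

    // ...and how the transformed units are serialised again.
    protected abstract String writeUnits(List<String> units);

    // Shared path: read, apply the same transform to every unit, write.
    final String process(String content) {
        List<String> transformed = new ArrayList<>();
        for (String unit : readUnits(content)) {
            transformed.add(transform.apply(unit));
        }
        return writeUnits(transformed);
    }
}

// Analogue of the `*JSON` case: the entire content is a single unit.
final class WholeContentSketch extends AbstractTransformSketch {
    WholeContentSketch(UnaryOperator<String> transform) {
        super(transform);
    }

    @Override
    protected List<String> readUnits(String content) {
        return List.of(content);
    }

    @Override
    protected String writeUnits(List<String> units) {
        return units.get(0);
    }
}

// Analogue of the `*Record` case: each line is treated as one record.
final class PerLineSketch extends AbstractTransformSketch {
    PerLineSketch(UnaryOperator<String> transform) {
        super(transform);
    }

    @Override
    protected List<String> readUnits(String content) {
        return Arrays.asList(content.split("\n"));
    }

    @Override
    protected String writeUnits(List<String> units) {
        return String.join("\n", units);
    }
}

class TransformSketchDemo {
    public static void main(String[] args) {
        // Toy transform that wraps each unit in an outer object.
        UnaryOperator<String> wrap = s -> "{\"data\":" + s + "}";
        System.out.println(new WholeContentSketch(wrap).process("{\"a\":1}"));
        System.out.println(new PerLineSketch(wrap).process("{\"a\":1}\n{\"a\":2}"));
    }
}
```

   In the real processors the per-record reading and writing would go through 
the configured Record Reader and `Record Set Writer` rather than splitting on 
newlines; the point of the sketch is only that the transform step sits in one 
place.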


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
