[ 
https://issues.apache.org/jira/browse/NIFI-4456?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16446337#comment-16446337
 ] 

ASF GitHub Bot commented on NIFI-4456:
--------------------------------------

Github user markap14 commented on a diff in the pull request:

    https://github.com/apache/nifi/pull/2640#discussion_r183161097
  
    --- Diff: nifi-nar-bundles/nifi-standard-services/nifi-record-serialization-services-bundle/nifi-record-serialization-services/src/main/java/org/apache/nifi/json/JsonPathReader.java ---
    @@ -48,15 +48,17 @@
     import com.jayway.jsonpath.JsonPath;
     
     @Tags({"json", "jsonpath", "record", "reader", "parser"})
    -@CapabilityDescription("Parses JSON records and evaluates user-defined JSON Path's against each JSON object. The root element may be either "
    -    + "a single JSON object or a JSON array. If a JSON array is found, each JSON object within that array is treated as a separate record. "
    -    + "User-defined properties define the fields that should be extracted from the JSON in order to form the fields of a Record. Any JSON field "
    -    + "that is not extracted via a JSONPath will not be returned in the JSON Records.")
    +@CapabilityDescription("Parses JSON records and evaluates user-defined JSON Path's against each JSON object. The reader does not require the "
    --- End diff --
    
    I would be hesitant to say "The reader does not require the flow file
    content to be well-formed JSON." This gives me the impression that malformed
    JSON will still be handled correctly, perhaps by skipping over the invalid
    parts. Perhaps we should word it as "While the reader expects each record to be
    well-formed JSON, the content of a FlowFile may consist of many records, either
    as a well-formed JSON array, or a series of JSON records with optional
    whitespace between them, such as the common 'JSON-per-line' format." or
    something of that nature.


> Update JSON Record Reader / Writer to allow for 'json per line' format
> ----------------------------------------------------------------------
>
>                 Key: NIFI-4456
>                 URL: https://issues.apache.org/jira/browse/NIFI-4456
>             Project: Apache NiFi
>          Issue Type: Improvement
>          Components: Extensions
>            Reporter: Mark Payne
>            Assignee: Matt Burgess
>            Priority: Major
>
> It is common, especially for archiving purposes, to have many JSON objects
> combined with newlines in between to delimit the records. It would
> be useful to allow record readers and writers to support this, instead of
> requiring that JSON records be elements in a JSON array.
> For example, the following JSON is considered two records:
> {code}
> [
>   { "greeting" : "hello", "id" : 1 },
>   { "greeting" : "good-bye", "id" : 2 }
> ]
> {code}
> It would be beneficial to also support the format:
> {code}
> { "greeting" : "hello", "id" : 1 }
> { "greeting" : "good-bye", "id" : 2 }
> {code}
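
The two layouts above can be treated uniformly by a reader that looks for top-level JSON objects, whether or not they sit inside an enclosing array. The following is only a minimal sketch of that idea, not the NiFi implementation: `JsonRecordSplitter` and `split` are hypothetical names, the code uses only the JDK, and it assumes every record is a JSON object. It returns the raw text of each record and rejects input with unbalanced braces or an unterminated string; it does not otherwise validate the JSON.

```java
import java.util.ArrayList;
import java.util.List;

public class JsonRecordSplitter {

    // Hypothetical helper (not part of NiFi): returns the raw text of each
    // top-level JSON object, whether the objects are elements of a JSON array
    // or are simply separated by whitespace such as newlines.
    public static List<String> split(String content) {
        List<String> records = new ArrayList<>();
        int depth = 0;             // nesting depth of '{' ... '}' outside strings
        int start = -1;            // index where the current record began
        boolean inString = false;  // true while inside a JSON string literal
        boolean escaped = false;   // true right after a backslash in a string

        for (int i = 0; i < content.length(); i++) {
            char c = content.charAt(i);
            if (inString) {
                // Braces inside string values must not affect the depth.
                if (escaped) {
                    escaped = false;
                } else if (c == '\\') {
                    escaped = true;
                } else if (c == '"') {
                    inString = false;
                }
                continue;
            }
            if (c == '"') {
                inString = true;
            } else if (c == '{') {
                if (depth == 0) {
                    start = i;
                }
                depth++;
            } else if (c == '}') {
                depth--;
                if (depth < 0) {
                    throw new IllegalArgumentException("Unexpected '}' at index " + i);
                }
                if (depth == 0) {
                    records.add(content.substring(start, i + 1));
                }
            }
            // Array brackets, commas and whitespace between records are ignored.
        }
        if (depth != 0 || inString) {
            throw new IllegalArgumentException("Unterminated JSON record");
        }
        return records;
    }

    public static void main(String[] args) {
        String asArray = "[\n"
                + "  { \"greeting\" : \"hello\", \"id\" : 1 },\n"
                + "  { \"greeting\" : \"good-bye\", \"id\" : 2 }\n"
                + "]";
        String perLine = "{ \"greeting\" : \"hello\", \"id\" : 1 }\n"
                + "{ \"greeting\" : \"good-bye\", \"id\" : 2 }";

        // Both layouts should yield the same two records.
        System.out.println(split(asArray).equals(split(perLine)));
        split(perLine).forEach(System.out::println);
    }
}
```

With this sketch, the array form and the JSON-per-line form produce the same list of two records, while a truncated record raises an exception instead of being skipped silently.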



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
