[ 
https://issues.apache.org/jira/browse/NIFI-360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14342647#comment-14342647
 ] 

ASF GitHub Bot commented on NIFI-360:
-------------------------------------

GitHub user apiri opened a pull request:

    https://github.com/apache/incubator-nifi/pull/31

    NIFI-360: Create Processors to work against JSON data - Refactoring

    NIFI-360: Create Processors to work against JSON data
    
    This pull request closes the original PR #28 and incorporates the 
feedback and suggestions provided there.
    
    Changes include:
    * Caching of JsonPath elements upon validation and clearing these elements 
upon modifications so that they are recalculated on the following trigger
    
    * Switched to SupportsBatching for EvaluateJsonPath and removed the nested 
loops
    
    * Created Usage pages for each of the processors which include a note about 
the full buffering of JSON documents in memory.
    
    * Created a stripped-down PathCompiler, JsonPathExpressionValidator, which 
performs an exception-free analysis of JsonPath to check for validity.  There 
is still room for optimization, but it would need a fair amount of refactoring 
of the JsonPath library.  I have a good feel for how the library is structured 
and where it could be improved, but the current project does not provide for 
the filing of issues, so it has been added on my fork, apiri/JsonPath#1.
     
    * Updated code to provide provenance content-modified for EvaluateJsonPath 
and fork for SplitJson.
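
    The caching change in the first bullet can be sketched roughly as follows. 
This is only an illustration of the cache-on-validate, clear-on-modify idea, 
not the code in this pull request: ExpressionCache, its methods, and the 
pluggable compiler function are hypothetical names, and a real processor 
would pass in the JsonPath compile step.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical helper (not the PR's actual class): remembers the compiled form
// of each expression at validation time and drops it when the property changes.
final class ExpressionCache<C> {
    private final Map<String, C> compiled = new ConcurrentHashMap<>();
    private final Function<String, C> compiler; // assumed to throw on invalid input

    ExpressionCache(Function<String, C> compiler) {
        this.compiler = compiler;
    }

    // Called during validation: compile once and keep the result for later use.
    boolean validate(String expression) {
        try {
            compiled.computeIfAbsent(expression, compiler);
            return true;
        } catch (RuntimeException e) {
            return false; // invalid expression; nothing is cached
        }
    }

    // Called on trigger: reuse the cached value, recompiling if it was cleared.
    C get(String expression) {
        return compiled.computeIfAbsent(expression, compiler);
    }

    // Called when a property is modified: the stale entry is discarded.
    void invalidate(String oldExpression) {
        compiled.remove(oldExpression);
    }

    int size() {
        return compiled.size();
    }
}
```

    With this shape, the expensive compile happens once per distinct 
expression, and a modified property only pays for recompilation on the next 
trigger.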


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/apiri/incubator-nifi NIFI-360

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/incubator-nifi/pull/31.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #31
    
----
commit 81234f3a6dd1effd4c66bcbdbe8d119843b7fdf2
Author: Aldrin Piri <[email protected]>
Date:   2015-02-20T21:05:16Z

    Removing the batched get of flowfiles to utilize the framework provided 
batching support

commit 46bf048b2407ddbd61024d2d0e69beccb2f72014
Author: Aldrin Piri <[email protected]>
Date:   2015-02-22T01:26:58Z

    Adding an abstract class to serve as a base class for JsonPath related 
processors and preferring this for much of the functionality present in 
JsonUtils.

commit d6948601cd26d5a7c6cfccaaef297cebe91a9d2e
Author: Aldrin Piri <[email protected]>
Date:   2015-02-24T03:51:37Z

    Adjusting scope of methods as they are specific to the JsonPath related 
processors

commit caed7f8468bfee4db47c892bd17d2e36908fc8d2
Author: Aldrin Piri <[email protected]>
Date:   2015-02-28T17:36:30Z

    Adding missing subject property on the JsonPath validator.

commit 6a89745ec8646acfdacbf27511cb992c821c51b9
Author: Aldrin Piri <[email protected]>
Date:   2015-02-28T20:54:52Z

    Adding a provenance event for EvaluateJsonPath when content is overwritten 
with selected expression.

commit 57aa5dd63f25c281ba95869bfbfab465465b29f8
Author: Aldrin Piri <[email protected]>
Date:   2015-02-28T21:04:43Z

    Providing provenance fork event for the created segments generated by 
SplitJson.

commit 162f02b12fd7b6d1a78efab09942ceffea49e07d
Author: Aldrin Piri <[email protected]>
Date:   2015-03-01T02:24:25Z

    Removing the separate reads for validation preferring to do the read once 
and handle any exceptions.

commit b1f971335a464f8de34d4f360327f1c9e5e02bbe
Author: Aldrin Piri <[email protected]>
Date:   2015-03-01T05:13:22Z

    Adding processor documentation for EvaluateJsonPath and SplitJson

commit 4d3cff3592d16d1ce5608b20a025edf34a7c69d7
Author: Aldrin Piri <[email protected]>
Date:   2015-03-01T18:26:03Z

    Removing JsonUtils as all functionality was migrated into 
AbstractJsonPathProcessor given its limited utility outside of those classes.  
Adjusting validation approach for JsonPath processors to accommodate caching of 
expressions.

commit 84602ca3e9935da74788aa893998efd2685dc873
Author: Aldrin Piri <[email protected]>
Date:   2015-03-01T18:27:40Z

    Removing extraneous logging statement.

commit 5a2a8fc6befb8305407a3f9b90443b66237e14f5
Author: Aldrin Piri <[email protected]>
Date:   2015-03-01T18:37:29Z

    Adding notes about JsonPath loading contents into memory for both JsonPath 
processors.

commit 484687a67b12fb5fc13dabd7851f83a6e4a898be
Author: Aldrin Piri <[email protected]>
Date:   2015-03-01T19:16:22Z

    Adjusting onRemoved methods for both JsonPath processors to clean up 
entries on exit.

commit 973b493386c71017f9baa233d4ec178251e64f53
Author: Aldrin Piri <[email protected]>
Date:   2015-03-01T21:31:32Z

    Adjusting handling of map to cache data items on an instance basis.

commit 4618f46f278c50238ec98bf7021337564e641de0
Author: Aldrin Piri <[email protected]>
Date:   2015-03-02T02:23:01Z

    Adding JsonPathExpressionValidator to perform an exception free validation 
of JsonPath expressions.  This is used as a screen before attempting a compile.

commit e6ebaa4ced457785f33df39c8635056d8a683ac4
Author: Aldrin Piri <[email protected]>
Date:   2015-03-02T02:42:56Z

    Adding licensing and notice information for JsonPath

----


> Create Processors to work against JSON data
> -------------------------------------------
>
>                 Key: NIFI-360
>                 URL: https://issues.apache.org/jira/browse/NIFI-360
>             Project: Apache NiFi
>          Issue Type: New Feature
>          Components: Extensions
>            Reporter: Aldrin Piri
>            Assignee: Aldrin Piri
>            Priority: Minor
>              Labels: processor
>
> I have created two Processors, EvaluateJsonPath and SplitJson, which are 
> analogs of the functionality provided through EvaluateXPath and SplitXml.
> Both are built primarily on [JsonPath by 
> Jayway|https://github.com/jayway/JsonPath].
> Their capability descriptions are provided below:
> {panel:title=EvaluateJsonPath}
> Evaluates one or more JsonPath expressions against the content of a FlowFile. 
>  The results of those expressions are assigned to FlowFile Attributes or are 
> written to the content of the FlowFile itself, depending on configuration of 
> the Processor. JsonPaths are entered by adding user-defined properties; the 
> name of the property maps to the Attribute Name into which the result will be 
> placed (if the Destination is flowfile-attribute; otherwise, the property 
> name is ignored). 
> The value of the property must be a valid JsonPath expression. If the 
> JsonPath evaluates to a JSON array or JSON object and the Return Type is set 
> to 'scalar' the FlowFile will be unmodified and will be routed to failure. A 
> Return Type of JSON can return scalar values if the provided JsonPath 
> evaluates to the specified value and will be routed as a match. If 
> Destination is 'flowfile-content' and the JsonPath does not evaluate to a 
> defined path, the FlowFile will be routed to 'unmatched' without having its 
> contents modified. If Destination is flowfile-attribute and the expression 
> matches nothing, attributes will be created with empty strings as the value, 
> and the FlowFile will always be routed to 'matched.'
> {panel}
> {panel:title=SplitJson}
> Splits a JSON File into multiple, separate FlowFiles for an array element 
> specified by a JsonPath expression. Each generated FlowFile consists of 
> an element of the specified array and is transferred to relationship 'split', 
> with the original file transferred to the 'original' relationship. If the 
> specified JsonPath is not found or does not evaluate to an array element, 
> the original file is routed to 'failure' and no files are generated.
> {panel}
> One item of note is the transitive dependency on ASM through Json-Smart 
> through JsonPath.
> I have included what I believe is needed to appropriately make use of this 
> item in the LICENSE.  Review of its correctness is requested.
> Any feedback is appreciated.  Thanks!
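
The routing rules in the EvaluateJsonPath description quoted above can be 
summarized as a small decision function. This is a sketch based only on that 
description, not NiFi's implementation: the class name, the enum names, and 
the order in which the checks run are all assumptions.

```java
// Hypothetical names throughout; this mirrors the capability description only.
final class EvaluateJsonPathRouting {
    enum Destination { FLOWFILE_CONTENT, FLOWFILE_ATTRIBUTE }
    enum ReturnType { SCALAR, JSON }
    enum ResultKind { NOT_FOUND, SCALAR, ARRAY_OR_OBJECT }
    enum Relationship { MATCHED, UNMATCHED, FAILURE }

    static Relationship route(Destination dest, ReturnType type, ResultKind result) {
        // An array or object with Return Type 'scalar': FlowFile unmodified, 'failure'.
        if (type == ReturnType.SCALAR && result == ResultKind.ARRAY_OR_OBJECT) {
            return Relationship.FAILURE;
        }
        if (result == ResultKind.NOT_FOUND) {
            // Content destination: contents left untouched, routed to 'unmatched'.
            // Attribute destination: empty-string attribute, still 'matched'.
            return dest == Destination.FLOWFILE_CONTENT
                    ? Relationship.UNMATCHED
                    : Relationship.MATCHED;
        }
        // Any other result is written to the destination and routed to 'matched'.
        return Relationship.MATCHED;
    }
}
```

For example, route(FLOWFILE_ATTRIBUTE, JSON, NOT_FOUND) yields MATCHED, which 
reflects the statement that attribute destinations always route to 'matched' 
when the expression matches nothing.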



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
