[ 
https://issues.apache.org/jira/browse/NIFI-1135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15821858#comment-15821858
 ] 

ASF GitHub Bot commented on NIFI-1135:
--------------------------------------

GitHub user mcgilman opened a pull request:

    https://github.com/apache/nifi/pull/1413

    NIFI-1135: Returning event summaries instead of full events

    NIFI-1135:
    - Adding additional parameters to be able to limit the size of the
      provenance response. Specifically, whether the events should be
      summarized and whether events should be returned incrementally before
      the query has completed.
    - Ensuring the cluster node address is included in provenance events
      returned.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/mcgilman/nifi NIFI-1135

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/nifi/pull/1413.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #1413
    
----
commit a89638d2f3928bc7c547996d282c85e89698f8bf
Author: Matt Gilman <matt.c.gil...@gmail.com>
Date:   2017-01-13T14:25:10Z

    NIFI-1135:
    - Adding additional parameters to be able to limit the size of the
      provenance response. Specifically, whether the events should be
      summarized and whether events should be returned incrementally before
      the query has completed.
    - Ensuring the cluster node address is included in provenance events
      returned.

----


> For Provenance Query, bring back Event Summaries instead of the Events 
> themselves
> ---------------------------------------------------------------------------------
>
>                 Key: NIFI-1135
>                 URL: https://issues.apache.org/jira/browse/NIFI-1135
>             Project: Apache NiFi
>          Issue Type: Improvement
>          Components: Core Framework, Core UI
>    Affects Versions: 1.0.0
>            Reporter: Mark Payne
>            Assignee: Mark Payne
>
> Currently, when we query Provenance, we pull back up to 1000 events. These 
> are full Provenance Events with attributes, etc. If the query takes a long 
> time, we end up re-requesting the objects that have already matched the 
> query many times. This uses a great deal of heap and sends back very large 
> JSON objects (10+ MB is not uncommon and it could potentially be far 
> worse).
> We should instead use a ProvenanceEventSummary object. This object should 
> contain just the info shown in the results table and the pointer to the 
> actual event in the Provenance Store. This allows us to return query 
> results much faster, store less data in the heap, and send less data back 
> to the end user with virtually the same experience.
> The one place where the UX would differ is when the user clicks the "info" 
> button to view the entire provenance event: we would have to pull the event 
> back from the server, rather than already having it in memory.
> We should consider storing all of the fields in the results table in Lucene 
> to provide faster results. Otherwise, we could still get potentially better 
> results with the current approach if we just ensure that the first fields 
> that we store are those in the results table. This allows us to read just a 
> small portion of the event from file and deserialize just a small amount of 
> data before moving on to the next event.
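
As a rough illustration of the idea described above (not the actual NiFi
implementation), the sketch below writes the result-table fields at the front
of each serialized event so that a summary can be built by reading only that
prefix, while the attributes are deserialized only when the full event is
requested. The class ProvenanceSummarySketch, the EventSummary type, the
field selection, and the record layout are all hypothetical and chosen only
for this example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.LinkedHashMap;
import java.util.Map;

public class ProvenanceSummarySketch {

    /** Hypothetical summary: result-table columns plus the id that points back to the full event. */
    static final class EventSummary {
        final long eventId;
        final String eventType;
        final String componentName;

        EventSummary(long eventId, String eventType, String componentName) {
            this.eventId = eventId;
            this.eventType = eventType;
            this.componentName = componentName;
        }
    }

    /** Serializes an event with the summary fields first and the attribute map last (assumed layout). */
    static byte[] serialize(long eventId, String eventType, String componentName,
                            Map<String, String> attributes) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeLong(eventId);
            out.writeUTF(eventType);
            out.writeUTF(componentName);
            out.writeInt(attributes.size());
            for (Map.Entry<String, String> entry : attributes.entrySet()) {
                out.writeUTF(entry.getKey());
                out.writeUTF(entry.getValue());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    /** Reads only the leading summary fields; the attribute bytes are never deserialized. */
    static EventSummary readSummary(byte[] record) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(record))) {
            long eventId = in.readLong();
            String eventType = in.readUTF();
            String componentName = in.readUTF();
            return new EventSummary(eventId, eventType, componentName);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Reads the attribute map, as would be needed when the user asks for the entire event. */
    static Map<String, String> readAttributes(byte[] record) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(record))) {
            in.readLong();   // skip eventId
            in.readUTF();    // skip eventType
            in.readUTF();    // skip componentName
            int count = in.readInt();
            Map<String, String> attributes = new LinkedHashMap<>();
            for (int i = 0; i < count; i++) {
                String key = in.readUTF();
                String value = in.readUTF();
                attributes.put(key, value);
            }
            return attributes;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Map<String, String> attributes = new LinkedHashMap<>();
        attributes.put("filename", "data.csv");
        attributes.put("mime.type", "text/csv");
        byte[] record = serialize(42L, "CONTENT_MODIFIED", "UpdateRecord", attributes);

        // Query path: only the summary prefix is read.
        EventSummary summary = readSummary(record);
        System.out.println(summary.eventId + " " + summary.eventType + " " + summary.componentName);

        // "info" path: the full attribute map is read on demand.
        System.out.println(readAttributes(record));
    }
}
```

In this sketch the event id stands in for the pointer into the Provenance
Store; a real store would more likely use a file offset or an index entry,
which the issue description does not specify.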



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
