[
https://issues.apache.org/jira/browse/NIFI-1438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15115880#comment-15115880
]
Aldrin Piri commented on NIFI-1438:
-----------------------------------
[~hijakk] Thanks for bridging the gap from SO to here. It gets hard to work
through things in those comments.
For additional context, please see the corresponding question on SO:
http://stackoverflow.com/questions/34958347/mergecontent-with-nifi-inconsistent-length
> Unexpected results using MergeProcessor
> ----------------------------------------
>
> Key: NIFI-1438
> URL: https://issues.apache.org/jira/browse/NIFI-1438
> Project: Apache NiFi
> Issue Type: Bug
> Affects Versions: 0.4.1
> Environment: OSX 10.10.5, Java 8u45
> Reporter: Josh Harrison
> Attachments: nifi-merge-problem.xml, nifi-problem.tgz
>
>
> Hello, I'm opening a ticket in reference to the Stack Overflow question I
> posted at
> http://stackoverflow.com/questions/34958347/mergecontent-with-nifi-inconsistent-length
>
> To summarize, despite Aldrin's help, I have been unable to get the expected
> merge behavior out of a template like the one attached, ingesting data like
> the attached sample.
> The goal is to ingest all of the zips in /tmp/nifidemo/source and extract the
> files contained therein, each line of which is a JSON object. With JSON
> routing, I extract and route for further processing ONLY items where the
> "tags" item contains the tag "xyz".
> These routed files should be aggregated by "mergeContent" into a bucket with,
> at minimum, 1000 lines, or flushed after being starved for 30 seconds,
> whichever occurs first.
> The behavior observed in my real template is replicated in this example –
> merge content appears to be routing to buckets based on the original file
> name, and not aggregating 1000 lines at a time as expected. Within a few
> seconds of the template being run, many files are written with unexpected
> line counts.
> More confusingly, this isn't a consistent pattern: the same files can be run
> repeatedly without generating the same number of lines in the result each
> time.
> The content of the input files was randomly generated so that approximately
> 10% of the objects would contain the tag "xyz" (with 5000 lines in each input
> file, there should be approximately 500 matching lines per file). There are
> result files that contain over 400 lines, but many contain 15-30 lines. There
> are also a number of files with a "uuid.json" style name, all containing one
> line.
> The attachment contains a generic template that replicates the problem – it
> seems to throw some errors but they don't appear to be related to the problem
> I'm working on (and my real template doesn't throw the failures, but still
> exhibits the same behavior).
> I am running NiFi 0.4.1 on a Mac with OS X 10.10.5 and JRE 8u45.
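
For reference, the merge behavior the reporter expects (keep only lines tagged "xyz", then emit a bucket once it holds at least 1000 lines, or flush it after 30 seconds without new input) can be sketched in plain Python. This is only an illustration of the intended semantics, not NiFi code or configuration: the names `has_tag` and `LineBucket` are hypothetical, and reading "starved" as "no new line has arrived for 30 seconds" is an assumption.

```python
import json
from typing import List, Optional


def has_tag(line: str, tag: str = "xyz") -> bool:
    """Return True if the JSON object on this line lists `tag` under "tags"."""
    try:
        obj = json.loads(line)
    except json.JSONDecodeError:
        return False
    if not isinstance(obj, dict):
        return False
    tags = obj.get("tags")
    return isinstance(tags, list) and tag in tags


class LineBucket:
    """Hypothetical model of the expected bucket: size OR idle-timeout flush."""

    def __init__(self, min_lines: int = 1000, max_idle_seconds: float = 30.0):
        self.min_lines = min_lines
        self.max_idle_seconds = max_idle_seconds
        self._lines: List[str] = []
        self._last_arrival: Optional[float] = None

    def add(self, line: str, now: float) -> Optional[List[str]]:
        """Add a line; return the batch once min_lines is reached, else None."""
        self._lines.append(line)
        self._last_arrival = now
        if len(self._lines) >= self.min_lines:
            return self._flush()
        return None

    def poll(self, now: float) -> Optional[List[str]]:
        """Return a partial batch if nothing arrived for max_idle_seconds."""
        if self._lines and now - self._last_arrival >= self.max_idle_seconds:
            return self._flush()
        return None

    def _flush(self) -> List[str]:
        batch, self._lines = self._lines, []
        self._last_arrival = None
        return batch
```

In this model the bucket is keyed on nothing but arrival order, so lines from different source files land in the same batch; that is the aggregation the report describes as expected, in contrast to the per-file grouping that was observed.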