[
https://issues.apache.org/jira/browse/NIFI-1438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15183282#comment-15183282
]
Pierre Villard commented on NIFI-1438:
--------------------------------------
[~hijakk] I tried to have a look at your problem, but it seems that the template
you gave is incomplete (it only contains a GetFile processor). Could you update
the attached template? Meanwhile, I'll try to reproduce your observation with
the data you gave. Thanks!
> Unexpected results using MergeProcessor
> ----------------------------------------
>
> Key: NIFI-1438
> URL: https://issues.apache.org/jira/browse/NIFI-1438
> Project: Apache NiFi
> Issue Type: Bug
> Affects Versions: 0.4.1
> Environment: OSX 10.10.5, Java 8u45
> Reporter: Josh Harrison
> Attachments: nifi-merge-problem.xml, nifi-problem.tgz
>
>
> Hello, I'm opening a ticket in reference to the Stack Overflow question I had
> at
> http://stackoverflow.com/questions/34958347/mergecontent-with-nifi-inconsistent-length
>
> To summarize, despite Aldrin's help, I have been unable to get the expected
> merge behavior out of a template like the one attached, ingesting data like
> the attached sample.
> The goal is to ingest all of the zips in /tmp/nifidemo/source and extract the
> files contained therein, each line of which is a JSON object. With JSON
> routing, I extract and route for further processing ONLY items where the
> "tags" item contains the tag "xyz".
> These routed files should be aggregated by MergeContent into a bucket with,
> at minimum, 1000 lines, or after being starved for 30 seconds, whichever
> occurs first.
> The behavior observed in my real template is replicated in this example –
> MergeContent appears to be routing to buckets based on the original file
> name, and not aggregating 1000 lines at a time as expected. Within a few
> seconds of the template being run, many files are written with unexpected
> line counts.
> More confusingly, this isn't a consistent pattern - files may be run
> repeatedly and do not generate the same number of lines in the result each
> time.
> The content of the input files was randomly generated so that approximately
> 10% of the objects would contain the tag "xyz" (with 5000 lines in each input
> file, there should be approximately 500 matching lines). There are result
> files that contain over 400 lines, but many contain 15-30 lines. There are
> also a number of files with a "uuid.json" style name, all containing one line.
> The attachment contains a generic template that replicates the problem – it
> throws some errors, but they don't appear to be related to the problem
> I'm working on (my real template doesn't throw the failures, but still
> exhibits the same behavior).
> I am running NiFi 0.4.1 on a Mac OSX 10.10.5 system with JRE 8u45.
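
For reference, the merge behavior the description expects (close a bucket at
1000 lines, or 30 seconds after it was opened, whichever comes first) can be
sketched in a few lines of Python. This is only an illustration of the stated
expectation, not NiFi's MergeContent implementation: the function name
merge_lines, its parameters, and the reading of "starved" as "30 seconds since
the bucket's first line" are all assumptions made for the sketch.

```python
def merge_lines(events, min_entries=1000, max_age_s=30.0):
    """Group (timestamp_seconds, line) pairs, given in time order, into bundles.

    Hypothetical model of the expected behavior only:
    a bundle is closed once it holds `min_entries` lines, or when a new
    line arrives `max_age_s` or more seconds after the bundle was opened.
    """
    bundles = []
    current = []        # lines in the open bundle
    opened_at = None    # timestamp of the first line in the open bundle

    for ts, line in events:
        # Close an aged-out bundle before accepting the new line.
        if current and ts - opened_at >= max_age_s:
            bundles.append(current)
            current = []
        if not current:
            opened_at = ts
        current.append(line)
        # Close the bundle as soon as it reaches the minimum size.
        if len(current) >= min_entries:
            bundles.append(current)
            current = []

    # Whatever is left at the end of input forms a final, partial bundle.
    if current:
        bundles.append(current)
    return bundles
```

Under this model, 2500 matching lines arriving within a few seconds would
yield bundles of 1000, 1000 and 500 lines, regardless of which input file each
line came from.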