I’m not an expert with MergeRecord, but looking at your screenshots, I’d guess
that your setup is taking that long to reach one of the defined “maximum”
settings, e.g. 2GB, 5,000,000 records, or 3600 seconds (1 hour).
How large (number of records and content size in bytes) are the typical
FlowF
I have the back pressure object threshold set to 10 on that queue and my
swap threshold is 20. I don't think the number of FlowFiles in the queue in
question was very high when I had the issue, though, since the issue was
now at UpdateRecord, after I did a MergeContent that greatly reduced the
n
Hey Robert,
How big are the FlowFile queues that you have in front of your
MergeContent/MergeRecord processors? Or, more specifically, what do you have
configured for the back pressure threshold? I ask because there was a fix in
1.11.0 [1] that had to do with ordering when swapping and ensuring
Sorry, one other thing I thought of that may help. I noticed on 1.11.4 that when
I would stop the updaterecord processor it would take a long period of time
for the processor to stop (threads were hanging), but when I went back to
1.9.2 the processor would stop in a very timely manner. Not sure if that
I had more updates on this.
Yesterday I again attempted to upgrade one of our 1.9.2 clusters, which is
now using MergeContent instead of MergeRecord. The flow had been running on 1.9.2
for about a week with no issue. I did the upgrade to 1.11.4, and saw about
3 of 10 nodes not being able to keep up. The
Sorry for the delayed answer, but was doing some testing this week and
found a few more things out.
First to answer some of your questions.
I would say, though I have no actual raw numbers, that it was worse than a 10%
degradation. I say this since the flow was badly backing up, and a 10%
decrease in performan
Robert,
What kind of performance degradation were you seeing here? I put together some
simple flows to see if I could reproduce using 1.9.2 and current master.
My flow consisted of GenerateFlowFile (generating 2 CSV rows per FlowFile) ->
ConvertRecord (to Avro) -> MergeRecord (read Avro, write A
Joe,
In that part of the flow, we are using Avro readers and writers. We are
using Snappy compression (which could be part of the problem). Since we
are using Avro at that point, the embedded schema is being used by the
reader, and the writer is using the schema name property along with an
interna
Robert,
Can you please detail the record readers and writers involved and how
schemas are accessed? There can be very important performance-related
changes in the parsers/serializers of the given formats. And we've added a
lot to make schema caching really capable, but you have to opt into it. I
be balanced to the same node.
The order in which ProcessorB receives FlowFiles will probably not be the
same as the order in which ProcessorA emitted them; the ordering is nondeterministic.
Thanks,
Lei
wangl...@geekplus.com.cn
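To illustrate the reordering described above, here is a minimal sketch (plain Python, illustrative only, not NiFi internals) of why distributing FlowFiles across cluster nodes can scramble the original emission order once downstream nodes drain at different speeds; the two-node setup and drain order are assumptions:

```python
from collections import deque

# FlowFiles emitted in order by ProcessorA
flowfiles = [1, 2, 3, 4, 5, 6]

# Round-robin distribution across two nodes (hypothetical cluster of 2)
nodes = [deque(), deque()]
for i, ff in enumerate(flowfiles):
    nodes[i % 2].append(ff)

# Node 1 drains faster than node 0, so ProcessorB sees an interleaved,
# reordered sequence relative to the original emission order.
received = []
while any(nodes):
    for node in (nodes[1], nodes[0]):  # node 1 is polled first (faster)
        if node:
            received.append(node.popleft())

print(received)  # [2, 1, 4, 3, 6, 5] -- not the original order
```

This is why a FIFO prioritizer on each connection (as Koji suggests below) constrains ordering per node, but load balancing across nodes can still interleave the global sequence.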
From: Koji Kawamura
Date: 2019-10-20 18:02
To: users
Subject: Re: Re: MergeRecord can not guarantee the
ngl...@geekplus.com.cn
>
>
> From: wangl...@geekplus.com.cn
> Date: 2019-10-16 10:21
> To: dev; users
> CC: dev
> Subject: Re: Re: MergeRecord can not guarantee the ordering of the input
> sequence?
> Hi Koji,
> Actually I have set all connections to FIFO and concurrency t
It seems this is because of the load-balance strategy that is used.
The balancing will not guarantee the order.
Thanks,
Lei
wangl...@geekplus.com.cn
From: wangl...@geekplus.com.cn
Date: 2019-10-16 10:21
To: dev; users
CC: dev
Subject: Re: Re: MergeRecord can not guarantee the ordering of the input
This is nondeterministic.
I think I should look up the MergeRecord code and do further debug.
Thanks,
Lei
wangl...@geekplus.com.cn
From: Koji Kawamura
Date: 2019-10-16 09:46
To: users
CC: dev
Subject: Re: MergeRecord can not guarantee the ordering of the input sequence?
Hi Lei,
How about setting FIFO prioritizer at all the preceding connections
before the MergeRecord?
Without setting any prioritizer, FlowFile ordering is nondeterministic.
Thanks,
Koji
On Tue, Oct 15, 2019 at 8:56 PM wangl...@geekplus.com.cn
wrote:
>
>
> If FlowFile A, B, C enter the MergeReco
Good afternoon,
Another thing to help you out maybe ...
You can also tweak the nifi.properties setting:
nifi.queue.swap.threshold=2
This setting will control the value of the max flowfile count on a
connection if exceeded it will flush those flowfiles to disk.
I am not sure however there is
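For reference, a stock nifi.properties ships with a much higher default for this setting (20000, to the best of my knowledge; verify against your own install before changing it):

```properties
# nifi.properties (illustrative fragment)
# Shipped default is 20000; FlowFiles beyond this count on a single
# connection are swapped out to disk to reduce heap pressure.
nifi.queue.swap.threshold=20000
```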
Aurélien,
In that case you're looking to merge about 500,000 FlowFiles into a single
FlowFile, so you'll
definitely want to use a cascading approach. I'd shoot for about 1 MB for the
first MergeRecord
and then merge 128 of those together for the second MergeRecord.
The provenance backpressure i
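The fan-in arithmetic behind the cascading suggestion above can be checked quickly; the 128 figure comes from the advice above, while the per-bin FlowFile count is an illustrative assumption:

```python
# Two-stage merge fan-in: if stage 1 packs roughly 3,900 small FlowFiles
# into one ~1 MB bin, and stage 2 merges 128 stage-1 outputs, the cascade
# covers 3,900 * 128 = 499,200 -- roughly the 500,000 input FlowFiles.
stage1_fan_in = 3_900   # assumption: FlowFiles per ~1 MB first-stage bin
stage2_fan_in = 128     # from the suggestion above

total = stage1_fan_in * stage2_fan_in
print(total)  # 499200
```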
-Original Message-
From: Koji Kawamura [mailto:ijokaruma...@gmail.com]
Sent: vendredi 13 avril 2018 09:20
To: users@nifi.apache.org
Subject: Re: MergeRecord
Hi,
Just FYI,
If I replace the schema doc comment using UpdateAttribute, I was able to
merge records.
${inferred.avro.schema:replaceAll('"Type inferred from [^"]+"', '""')}
I looked at InferAvroSchema and underlying Kite source code, but
there's no option to suppress the doc comment when inferring
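To see why stripping the doc comment lets the records merge, here is a small sketch (plain Python regex mirroring the expression-language `replaceAll` above; the two schema strings are made-up examples of InferAvroSchema output):

```python
import re

# Two inferred schemas that differ only in the generated doc comment
schema_a = '{"type": "record", "name": "example", "doc": "Type inferred from file1.csv", "fields": []}'
schema_b = '{"type": "record", "name": "example", "doc": "Type inferred from file2.csv", "fields": []}'

# Same substitution as the UpdateAttribute expression: blank out the doc text
pattern = r'"Type inferred from [^"]+"'
stripped_a = re.sub(pattern, '""', schema_a)
stripped_b = re.sub(pattern, '""', schema_b)

# After stripping, the schema texts match, so MergeRecord can bin the
# records together instead of treating them as different schemas.
print(stripped_a == stripped_b)  # True
```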
Hi,
I've tested the InferAvroSchema and MergeRecord scenario.
As you described, records are not merged as expected.
The reason, in my case, is that InferAvroSchema generates schema text like this:
inferred.avro.schema
{ "type" : "record", "name" : "example", "doc" : "Schema generated by
Kite", "fields" : [
Hello.
Thanks for the answer.
The 20k is just the last test; I've tested with 100 and 1,000, with an input queue
of 10k, and it doesn't change anything.
I will try to simplify the test case and to not use the inferred schema.
Regards
> Le 13 avr. 2018 à 04:50, Koji Kawamura a écrit :
>
> He
Hello,
I checked your template. Haven't run the flow since I don't have
sample input XML files.
However, when I looked at the MergeRecord processor configuration, I found that:
Minimum Number of Records = 2
Max Bin Age = 10 sec
Briefly looking at the MergeRecord source code, it expires a bin th
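The bin-expiry behavior being described can be sketched roughly like this (illustrative Python only, not the actual MergeRecord implementation; property names are simplified):

```python
import time

# Illustrative bin: it merges once it reaches min_records, but a bin older
# than max_bin_age is flushed regardless, even while under the minimum --
# which is why Min Records = 2 with Max Bin Age = 10 sec can still emit
# single-record FlowFiles.
class Bin:
    def __init__(self, min_records=2, max_bin_age_sec=10.0):
        self.records = []
        self.created = time.monotonic()
        self.min_records = min_records
        self.max_bin_age_sec = max_bin_age_sec

    def is_full(self):
        return len(self.records) >= self.min_records

    def is_expired(self):
        return (time.monotonic() - self.created) >= self.max_bin_age_sec

    def should_flush(self):
        # Expiry wins: an aged bin is merged as-is with whatever it holds
        return self.is_full() or self.is_expired()

b = Bin()
b.records.append("record-1")
print(b.should_flush())  # False: only 1 record, and not yet 10 seconds old
```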