Hi,

I've tested the InferAvroSchema and MergeRecord scenario.
As you described, the records are not merged as expected.

In my case, the reason is that InferAvroSchema generates schema text like this:
inferred.avro.schema
{ "type" : "record", "name" : "example", "doc" : "Schema generated by
Kite", "fields" : [ { "name" : "Key", "type" : "long", "doc" : "Type
inferred from '4'" }, { "name" : "Value", "type" : "string", "doc" :
"Type inferred from 'four'" } ] }

MergeRecord then uses that schema text as the groupId, even if
'Correlation Attribute' is specified.
https://github.com/apache/nifi/blob/master/nifi-nar-bundles/nifi-standard-bundle/nifi-standard-processors/src/main/java/org/apache/nifi/processors/standard/MergeRecord.java#L348

So even if the schema structure is the same, the inferred "doc" text
changes whenever the actual values vary, and the merge group ID will
be different.
If you can use a SchemaRegistry instead, it should work as expected.
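
To illustrate the point, here is a minimal Python sketch. It is not NiFi code. It only compares two schema texts shaped like the one above, where the second one is what I assume would be inferred from a record with different values ('5' and 'five' are made up for the example). The helper names strip_doc and canonical_text are hypothetical. Used verbatim as a grouping key, the two texts differ; once the "doc" entries are dropped, they match.

```python
import json


def strip_doc(node):
    """Recursively drop every "doc" key from a parsed schema.

    Simplification: a "doc" key nested inside a field default value
    would also be dropped, which a real tool should avoid.
    """
    if isinstance(node, dict):
        return {k: strip_doc(v) for k, v in node.items() if k != "doc"}
    if isinstance(node, list):
        return [strip_doc(item) for item in node]
    return node


def canonical_text(schema_text):
    """Return the schema as JSON text with "doc" removed and keys sorted."""
    return json.dumps(strip_doc(json.loads(schema_text)), sort_keys=True)


# Schema text as quoted in this mail.
schema_a = (
    '{ "type" : "record", "name" : "example", '
    '"doc" : "Schema generated by Kite", "fields" : [ '
    '{ "name" : "Key", "type" : "long", "doc" : "Type inferred from \'4\'" }, '
    '{ "name" : "Value", "type" : "string", '
    '"doc" : "Type inferred from \'four\'" } ] }'
)

# Assumed result for another record: same structure, different sample values.
schema_b = schema_a.replace("'4'", "'5'").replace("'four'", "'five'")

print(schema_a == schema_b)                                   # raw text as key
print(canonical_text(schema_a) == canonical_text(schema_b))   # "doc" ignored
```

This is only meant to show why the raw schema text is a fragile grouping key. A fixed schema from a SchemaRegistry avoids the problem without any such normalization.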

Thanks,
Koji

On Fri, Apr 13, 2018 at 2:45 PM, DEHAY Aurelien
<aurelien.de...@faurecia.com> wrote:
>
> Hello.
>
> Thanks for the answer.
>
> The 20k is just the last test. I’ve also tested with 100 and 1000, with an input queue
> of 10k, and it doesn’t change anything.
>
> I will try to simplify the test case and to not use the inferred schema.
>
> Regards
>
>> On Apr 13, 2018, at 04:50, Koji Kawamura <ijokaruma...@gmail.com> wrote:
>>
>> Hello,
>>
>> I checked your template. Haven't run the flow since I don't have
>> sample input XML files.
>> However, when I looked at the MergeRecord processor configuration, I found 
>> that:
>> Minimum Number of Records = 20000
>> Max Bin Age = 10 sec
>>
>> From a brief look at the MergeRecord source code, it expires a bin that is
>> not complete after Max Bin Age.
>> Do you always have 20,000 records to merge within a 10 sec window?
>> If not, I recommend lowering the minimum number of records.
>>
>> I haven't checked the actual MergeRecord behavior so I may be wrong, but
>> it is worth changing the configuration.
>>
>> Hope this helps,
>> Koji
>>
>>
>> On Fri, Apr 13, 2018 at 12:26 AM, DEHAY Aurelien
>> <aurelien.de...@faurecia.com> wrote:
>>> Hello.
>>>
>>> Please see the template attached. The problem we have is that, whatever
>>> configuration we set in MergeRecord, we can't get it to actually
>>> merge records.
>>>
>>> All the records have the same format; we added an InferAvroSchema so that
>>> we don't have to write the schema ourselves. The only difference between
>>> the schemas is then that the doc="" fields are different. Could that
>>> prevent the merging?
>>>
>>> Thanks for any pointer or info.
>>>
>>>
>>> Aurélien DEHAY
>>>
>>>
>>>
>>> This electronic transmission (and any attachments thereto) is intended 
>>> solely for the use of the addressee(s). It may contain confidential or 
>>> legally privileged information. If you are not the intended recipient of 
>>> this message, you must delete it immediately and notify the sender. Any 
>>> unauthorized use or disclosure of this message is strictly prohibited.  
>>> Faurecia does not guarantee the integrity of this transmission and shall 
>>> therefore never be liable if the message is altered or falsified nor for 
>>> any virus, interception or damage to your system.
>
