Hello.

We first looked at InferAvroSchema to see whether there was an option for 
that.

Anyway, thank you very much, it works fine with UpdateAttribute. 


Aurélien DEHAY
Big Data Architect
+33 616 815 441
aurelien.de...@faurecia.com 

2 rue Hennape - 92735 Nanterre Cedex – France



-----Original Message-----
From: Koji Kawamura [mailto:ijokaruma...@gmail.com] 
Sent: Friday, April 13, 2018 09:20
To: users@nifi.apache.org
Subject: Re: MergeRecord

Hi,

Just FYI,
After blanking out the schema doc comments with UpdateAttribute, I was able to 
merge records:
${inferred.avro.schema:replaceAll('"Type inferred from [^"]+"', '""')}
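
To illustrate why this helps, here is a minimal Python sketch (not NiFi code). It assumes that records are grouped by the exact schema text, as described in the quoted message below, and uses Python's re module as a stand-in for the Java regex engine behind replaceAll. The second schema text and the helper name strip_inferred_docs are hypothetical, made up for the illustration.

```python
import re

# Pattern and replacement copied from the Expression Language call above.
DOC_PATTERN = r'"Type inferred from [^"]+"'


def strip_inferred_docs(schema_text: str) -> str:
    """Blank out the per-field doc strings that embed a sample value."""
    return re.sub(DOC_PATTERN, '""', schema_text)


# Schema text in the shape InferAvroSchema produced in this thread.
schema_a = (
    '{ "type" : "record", "name" : "example", "doc" : "Schema generated by Kite", '
    '"fields" : [ { "name" : "Key", "type" : "long", "doc" : "Type inferred from \'4\'" }, '
    '{ "name" : "Value", "type" : "string", "doc" : "Type inferred from \'four\'" } ] }'
)
# Hypothetical second flow file: same layout, different sample values.
schema_b = schema_a.replace("'4'", "'5'").replace("'four'", "'five'")

# Grouping on the raw text yields one group per distinct doc string.
raw_groups = {schema_a, schema_b}
# Grouping on the cleaned text collapses both into a single group.
clean_groups = {strip_inferred_docs(s) for s in (schema_a, schema_b)}

print(len(raw_groups), len(clean_groups))
```

The record-level doc ("Schema generated by Kite") is left untouched, since it is identical for every inferred schema.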

I looked at InferAvroSchema and the underlying Kite source code, but 
unfortunately there is no option to suppress the doc comments when inferring a schema.

Thanks,
Koji

On Fri, Apr 13, 2018 at 4:11 PM, Koji Kawamura <ijokaruma...@gmail.com> wrote:
> Hi,
>
> I've tested the InferAvroSchema and MergeRecord scenario.
> As you described, records are not merged as expected.
>
> The reason in my case is that InferAvroSchema generates schema text like this
> in the inferred.avro.schema attribute:
> { "type" : "record", "name" : "example", "doc" : "Schema generated by 
> Kite", "fields" : [ { "name" : "Key", "type" : "long", "doc" : "Type 
> inferred from '4'" }, { "name" : "Value", "type" : "string", "doc" :
> "Type inferred from 'four'" } ] }
>
> And MergeRecord uses that schema text as the groupId even if 
> 'Correlation Attribute' is specified:
> https://github.com/apache/nifi/blob/master/nifi-nar-bundles/nifi-standard-bundle/nifi-standard-processors/src/main/java/org/apache/nifi/processors/standard/MergeRecord.java#L348
>
> So even if the schema structure is the same, the doc text varies with the 
> actual values, and the merge group id ends up different.
> If you can use a SchemaRegistry, it should work as expected.
>
> Thanks,
> Koji
>
> On Fri, Apr 13, 2018 at 2:45 PM, DEHAY Aurelien 
> <aurelien.de...@faurecia.com> wrote:
>>
>> Hello.
>>
>> Thanks for the answer.
>>
>> The 20k is just the last test; I’ve also tested with 100 and 1,000, with an input 
>> queue of 10k, and it doesn’t change anything.
>>
>> I will try to simplify the test case and avoid using the inferred schema.
>>
>> Regards
>>
>>> On Apr 13, 2018, at 04:50, Koji Kawamura <ijokaruma...@gmail.com> wrote:
>>>
>>> Hello,
>>>
>>> I checked your template. Haven't run the flow since I don't have 
>>> sample input XML files.
>>> However, when I looked at the MergeRecord processor configuration, I found 
>>> that:
>>> Minimum Number of Records = 20000
>>> Max Bin Age = 10 sec
>>>
>>> From a brief look at the MergeRecord source code, it expires a bin that 
>>> is not complete after Max Bin Age.
>>> Do you always have 20,000 records to merge within a 10 sec window?
>>> If not, I recommend lowering the minimum number of records.
>>>
>>> I haven't checked the actual MergeRecord behavior, so I may be wrong, but 
>>> it is worth changing the configuration.
>>>
>>> Hope this helps,
>>> Koji
>>>
>>>
>>> On Fri, Apr 13, 2018 at 12:26 AM, DEHAY Aurelien 
>>> <aurelien.de...@faurecia.com> wrote:
>>>> Hello.
>>>>
>>>> Please see the attached template. The problem is that, whatever 
>>>> configuration we set on MergeRecord, we can't get it to 
>>>> actually merge records.
>>>>
>>>> All the records have the same format; we added an InferAvroSchema so we 
>>>> don't have to write the schema ourselves. The only difference between the 
>>>> schemas is then that the doc="" fields differ. Could that prevent the 
>>>> merging?
>>>>
>>>> Thanks for any pointer or info.
>>>>
>>>>
>>>> Aurélien DEHAY
>>>>
>>>>
>>>>
>>>> This electronic transmission (and any attachments thereto) is intended 
>>>> solely for the use of the addressee(s). It may contain confidential or 
>>>> legally privileged information. If you are not the intended recipient of 
>>>> this message, you must delete it immediately and notify the sender. Any 
>>>> unauthorized use or disclosure of this message is strictly prohibited.  
>>>> Faurecia does not guarantee the integrity of this transmission and shall 
>>>> therefore never be liable if the message is altered or falsified nor for 
>>>> any virus, interception or damage to your system.
>>
