[
https://issues.apache.org/jira/browse/NIFI-5805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16680705#comment-16680705
]
ASF GitHub Bot commented on NIFI-5805:
--------------------------------------
Github user ijokarumawak commented on a diff in the pull request:
https://github.com/apache/nifi/pull/3160#discussion_r232117307
--- Diff:
nifi-nar-bundles/nifi-standard-services/nifi-record-serialization-services-bundle/nifi-record-serialization-services/src/main/java/org/apache/nifi/avro/AvroRecordSetWriter.java
---
@@ -68,16 +76,40 @@
.required(true)
.build();
+ static final PropertyDescriptor ENCODER_POOL_SIZE = new Builder()
+ .name("encoder-pool-size")
+ .displayName("Encoder Pool Size")
+ .description("Avro Writers require the use of an Encoder. Creation of Encoders is expensive, but once created, they can be reused. This property controls the maximum number of Encoders that" +
+ " can be pooled and reused. Setting this value too small can result in degraded performance, but setting it higher can result in more heap being used.")
--- End diff ---
Just for clarification, I'd suggest adding a note mentioning that this
property doesn't have any effect with the 'Embed Avro Schema' strategy.
> Avro Record Writer service creates byte buffer for every Writer created
> -----------------------------------------------------------------------
>
> Key: NIFI-5805
> URL: https://issues.apache.org/jira/browse/NIFI-5805
> Project: Apache NiFi
> Issue Type: Bug
> Reporter: Mark Payne
> Assignee: Mark Payne
> Priority: Major
>
> When we use the Avro RecordSet Writer, and do not embed the schema, the
> Writer uses the Avro BinaryEncoder object to serialize the data. This object
> can be reused, but instead we create a new one for each writer. This
> results in creating a new 64 KB byte[] each time. When we are writing many
> records to a given FlowFile, this is not a big deal. However, when used in
> PublishKafkaRecord or similar processors, where a new writer must be created
> for every Record, this can have a very significant performance impact.
> An improvement would be to have the user configure the maximum number of
> BinaryEncoder objects to pool and then use a simple pooling mechanism to
> reuse these objects.
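The pooling mechanism proposed above could be sketched roughly as follows. This is a hypothetical illustration, not NiFi's actual implementation: the `ObjectPool` class and its method names are invented here, and a plain `StringBuilder` stands in for Avro's `BinaryEncoder`. A bounded `LinkedBlockingQueue` keeps the pool from ever holding more than the user-configured maximum:

```java
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical bounded, thread-safe pool for expensive-to-create objects.
class ObjectPool<T> {
    private final LinkedBlockingQueue<T> pool;

    ObjectPool(int maxSize) {
        // Bounded queue: the pool can never hold more than maxSize objects.
        this.pool = new LinkedBlockingQueue<>(maxSize);
    }

    // Take a pooled instance if one is available; returns null when the pool
    // is empty, signalling the caller to create a fresh instance instead.
    T borrow() {
        return pool.poll();
    }

    // Return an instance to the pool. When the pool is already full, offer()
    // fails and the instance is simply dropped, bounding heap usage.
    void release(T obj) {
        pool.offer(obj);
    }
}
```

A writer would borrow an encoder on creation (falling back to constructing a new one when `borrow()` returns null) and release it on close; returns to a full pool are silently discarded, which is what keeps the memory cost capped at the configured pool size.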
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)