[ 
https://issues.apache.org/jira/browse/BEAM-3572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16345912#comment-16345912
 ] 

Kenneth Knowles commented on BEAM-3572:
---------------------------------------

I think I see what you mean in terms of excess allocation. The buffering was 
added as an optimization :-)

While the {{Coder}} itself should be observably immutable, there is no problem 
with mutation under the hood to manage a pool of buffers. The real issue, which 
you alluded to, is that coders are required to be thread safe. 
{{BufferedElementCountingOutputStream}} can be used despite its lack of thread 
safety because each instance is local to a single {{encode()}} call and is 
never shared between threads.
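
To illustrate the locality point, here is a minimal sketch. It assumes a 
made-up {{LocalBufferEncoder}} class and a toy length-prefixed format; neither 
is Beam code. The encoder has no fields, and the only mutable object is created 
inside the call, so no other thread can reach it.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.List;

// Hypothetical stand-in for a coder; this is not Beam's Coder API.
final class LocalBufferEncoder {
  // No fields: a single instance can be shared freely between threads.

  void encode(List<String> values, OutputStream out) {
    // Created per call, so only the calling thread can ever reach it.
    ByteArrayOutputStream local = new ByteArrayOutputStream();
    for (String value : values) {
      byte[] bytes = value.getBytes(StandardCharsets.UTF_8);
      local.write(bytes.length);           // one-byte length prefix (assumes < 256 bytes)
      local.write(bytes, 0, bytes.length); // payload
    }
    try {
      local.writeTo(out); // hand the whole buffer to the caller's stream at once
    } catch (IOException e) {
      throw new UncheckedIOException(e); // unchecked only to keep the sketch short
    }
  }
}
```

The cost of this pattern is the one the issue describes: every call allocates 
a fresh buffer.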

Having either {{IterableLikeCoder}} or {{BufferedElementCountingOutputStream}} 
do its own suballocation makes sense, with the usual caveats about the bugs and 
leaks that this sort of code invites. It is definitely better encapsulation for 
{{BufferedElementCountingOutputStream}} to own the suballocation, unless it 
doesn't have enough info to do it well. I'm willing to trust that you came to 
this because you actually hit this in practice, or are at least driven by a 
benchmark.
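
For concreteness, one possible shape for such a pool is sketched below. The 
{{BufferPool}} name, the fixed buffer size, and the cap on retained buffers are 
all assumptions made for illustration, not the proposed patch. It relies only 
on {{java.util.concurrent.ArrayBlockingQueue}}, whose {{poll()}} and 
{{offer()}} are non-blocking and thread safe.

```java
import java.util.concurrent.ArrayBlockingQueue;

// Hypothetical buffer pool; the names and sizing policy are illustrative only.
final class BufferPool {
  private final ArrayBlockingQueue<byte[]> free;
  private final int bufferSize;

  BufferPool(int bufferSize, int maxPooled) {
    this.bufferSize = bufferSize;
    this.free = new ArrayBlockingQueue<>(maxPooled); // maxPooled must be >= 1
  }

  // Returns a pooled buffer if one is available, otherwise allocates a new one.
  byte[] acquire() {
    byte[] buf = free.poll(); // non-blocking; null when the pool is empty
    return buf != null ? buf : new byte[bufferSize];
  }

  // Returns a buffer to the pool. Buffers of the wrong size, or buffers that
  // arrive when the pool is already full, are dropped for the GC to reclaim.
  void release(byte[] buf) {
    if (buf.length == bufferSize) {
      free.offer(buf); // non-blocking; returns false (ignored) when full
    }
  }
}
```

A caller would pair {{acquire()}} with {{release()}} in a {{finally}} block. 
The caveats above still apply: a missed release only costs an allocation here, 
but releasing a buffer twice, or using it after release, would let two threads 
write into the same array.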

> Reduce inefficient allocations in coders
> ----------------------------------------
>
>                 Key: BEAM-3572
>                 URL: https://issues.apache.org/jira/browse/BEAM-3572
>             Project: Beam
>          Issue Type: Improvement
>          Components: sdk-java-core
>            Reporter: Bill Neubauer
>            Assignee: Bill Neubauer
>            Priority: Minor
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> BufferedElementCountingOutputStream's constructor allocates a new buffer to 
> wrap the input OutputStream. This gets called on each invocation of encode() 
> from IterableLikeCoder. Since Coder is designed to be stateless, but this 
> buffer holds state and isn't threadsafe, we can't just have the caller manage 
> the buffer. Modifying the constructor to use a pool of buffers to reduce the 
> number of allocations will help performance.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
