[ 
https://issues.apache.org/jira/browse/FLINK-10003?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16564974#comment-16564974
 ] 

Stephan Ewen commented on FLINK-10003:
--------------------------------------

This is a trade-off between being broadly applicable and helping some cases 
specifically. This interface was specifically meant as a stateless encoder 
that is used across streams.

In contrast, the {{BulkEncoder}}, as used for Parquet, binds to a single stream.

One good way to look at this would be to write a JSONEncoder and an AvroEncoder 
and see how well that works. If it works well, we could leave the interface as 
it is. Otherwise, we adjust it.
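
To make "stateless encoder, used across streams" concrete, here is a minimal sketch. It is not Flink code: {{Encoder}} is a local copy of the interface quoted in the issue so that the example compiles on its own, and {{LineEncoder}} and {{StatelessEncoderDemo}} are hypothetical names. The point is only that one encoder instance holds no stream and can therefore write to any number of streams.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.Serializable;
import java.nio.charset.StandardCharsets;

// Local copy of the interface quoted in the issue, so the sketch compiles without Flink.
interface Encoder<IN> extends Serializable {
    void encode(IN element, OutputStream stream) throws IOException;
}

// Hypothetical stateless encoder: one record per line, no fields, no stream kept around.
class LineEncoder<IN> implements Encoder<IN> {
    @Override
    public void encode(IN element, OutputStream stream) throws IOException {
        stream.write(String.valueOf(element).getBytes(StandardCharsets.UTF_8));
        stream.write('\n');
    }
}

class StatelessEncoderDemo {
    // Writes to two in-memory "part files" with the same encoder instance.
    static String[] run() throws IOException {
        Encoder<String> encoder = new LineEncoder<>();
        ByteArrayOutputStream partA = new ByteArrayOutputStream();
        ByteArrayOutputStream partB = new ByteArrayOutputStream();

        encoder.encode("a1", partA);
        encoder.encode("b1", partB);
        encoder.encode("a2", partA);

        return new String[] {partA.toString("UTF-8"), partB.toString("UTF-8")};
    }

    public static void main(String[] args) throws IOException {
        for (String part : run()) {
            System.out.print(part);
        }
    }
}
```

An encoder that needs a header, a schema, or a buffering wrapper per file does not fit this shape as easily, which is what the JSON and Avro experiment would show.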

> Encoder interface inefficient when wanting to use more sophisticated 
> outputstreams
> ----------------------------------------------------------------------------------
>
>                 Key: FLINK-10003
>                 URL: https://issues.apache.org/jira/browse/FLINK-10003
>             Project: Flink
>          Issue Type: Improvement
>          Components: Streaming Connectors
>    Affects Versions: 1.6.0
>            Reporter: Chesnay Schepler
>            Priority: Major
>
> The {{StreamingFileSink}} uses the {{Encoder}} interface to serialize data.
> {code}
> public interface Encoder<IN> extends Serializable {
>       void encode(IN element, OutputStream stream) throws IOException;
> }
> {code}
> The implementation (with the exception of strings) must be provided by the 
> user.
> Using any {{OutputStream}} implementation that is a little more convenient 
> than the base {{OutputStream}} (like {{DataOutputStream}}) requires creating 
> a new stream for every single record. If an implementation is used that 
> potentially buffers data, users additionally have to call {{flush()}}.
> Instead, we could allow specifying an optional factory for the streams that 
> would be called once for each part file, and modify the {{Encoder}} interface 
> to have a generic type for the output stream.
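
One possible shape for the factory idea described above is sketched below. Every name here ({{StreamFactory}}, {{TypedEncoder}}, {{FactoryEncoderDemo}}) is hypothetical and not part of Flink; the sketch only assumes that the sink would call the factory once per part file and flush once when the part file is finished.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.Serializable;

// Hypothetical: wraps the raw part-file stream, called once per part file.
interface StreamFactory<S extends OutputStream> extends Serializable {
    S create(OutputStream partFileStream) throws IOException;
}

// Hypothetical variant of Encoder with a generic type for the output stream.
interface TypedEncoder<IN, S extends OutputStream> extends Serializable {
    void encode(IN element, S stream) throws IOException;
}

class FactoryEncoderDemo {
    // Simulates what a sink might do for a single part file.
    static byte[] writePart(int[] records) throws IOException {
        StreamFactory<DataOutputStream> factory = DataOutputStream::new;
        TypedEncoder<Integer, DataOutputStream> encoder =
                (element, out) -> out.writeInt(element);

        ByteArrayOutputStream partFile = new ByteArrayOutputStream();
        DataOutputStream out = factory.create(partFile); // once per part file
        for (int record : records) {
            encoder.encode(record, out); // no per-record wrapper stream
        }
        out.flush(); // once, when the part file is finished
        return partFile.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        System.out.println(writePart(new int[] {1, 2}).length + " bytes written");
    }
}
```

With the current interface, the same encoder would have to wrap the raw stream in a new {{DataOutputStream}} and call {{flush()}} inside every {{encode}} call.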



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
