Chesnay Schepler created FLINK-10003:
----------------------------------------

             Summary: Encoder interface inefficient when wanting to use more sophisticated outputstreams
                 Key: FLINK-10003
                 URL: https://issues.apache.org/jira/browse/FLINK-10003
             Project: Flink
          Issue Type: Improvement
          Components: Streaming Connectors
    Affects Versions: 1.6.0
            Reporter: Chesnay Schepler


The {{StreamingFileSink}} uses the {{Encoder}} interface to serialize data.
{code}
public interface Encoder<IN> extends Serializable {
        void encode(IN element, OutputStream stream) throws IOException;
}
{code}

The implementation (with the exception of strings) must be provided by the 
user.
Using any {{OutputStream}} implementation that is more convenient than the 
base {{OutputStream}} (such as {{DataOutputStream}}) requires creating a new 
stream for every single record. If the implementation potentially buffers 
data, users additionally have to call {{flush()}}.
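
To illustrate the problem, here is a minimal sketch of what a user-provided encoder might look like today. The {{Encoder}} interface is copied locally so the snippet compiles on its own; {{IntEncoder}} and {{PerRecordWrapperDemo}} are hypothetical names used only for illustration, and a {{ByteArrayOutputStream}} stands in for the part file.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

// Local copy of the interface quoted above, so the sketch is self-contained.
interface Encoder<IN> extends Serializable {
    void encode(IN element, OutputStream stream) throws IOException;
}

// Hypothetical encoder that writes each Integer as 4 big-endian bytes.
class IntEncoder implements Encoder<Integer> {
    @Override
    public void encode(Integer element, OutputStream stream) throws IOException {
        // A new wrapper object is allocated for every single record ...
        DataOutputStream out = new DataOutputStream(stream);
        out.writeInt(element);
        // ... and flushed explicitly, in case the wrapper buffers data.
        out.flush();
    }
}

class PerRecordWrapperDemo {
    // Encodes all values into an in-memory stand-in for a part file.
    static byte[] encodeAll(int... values) {
        ByteArrayOutputStream partFile = new ByteArrayOutputStream();
        Encoder<Integer> encoder = new IntEncoder();
        try {
            for (int value : values) {
                encoder.encode(value, partFile);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return partFile.toByteArray();
    }
}
```

Every call to {{encode}} pays for one wrapper allocation plus one flush, even though all records go to the same underlying stream.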

Instead, we could allow specifying an optional factory for the streams, which 
would be called once for each part file, and modify the {{Encoder}} interface 
to have a generic type parameter for the output stream.
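
One possible shape for this proposal is sketched below. All names here ({{StreamFactory}}, {{TypedEncoder}}, {{FactoryDemo}}, {{writePartFile}}) are assumptions made for illustration and are not existing Flink API; {{writePartFile}} only mimics what the sink might do for a single part file.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.Arrays;

// Hypothetical factory: wraps the raw part-file stream once per part file.
interface StreamFactory<S extends OutputStream> extends Serializable {
    S create(OutputStream rawStream) throws IOException;
}

// Hypothetical Encoder variant with a generic type for the output stream.
interface TypedEncoder<IN, S extends OutputStream> extends Serializable {
    void encode(IN element, S stream) throws IOException;
}

class FactoryDemo {
    // Stand-in for the sink: one stream per part file, reused for all records.
    static <IN, S extends OutputStream> byte[] writePartFile(
            StreamFactory<S> factory,
            TypedEncoder<IN, S> encoder,
            Iterable<IN> records) {
        ByteArrayOutputStream partFile = new ByteArrayOutputStream();
        // The factory is called exactly once; closing the stream flushes it.
        try (S stream = factory.create(partFile)) {
            for (IN record : records) {
                encoder.encode(record, stream);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return partFile.toByteArray();
    }

    static byte[] demo() {
        StreamFactory<DataOutputStream> factory = DataOutputStream::new;
        // The encoder receives the convenient stream type directly.
        TypedEncoder<Integer, DataOutputStream> encoder =
                (value, out) -> out.writeInt(value);
        return writePartFile(factory, encoder, Arrays.asList(1, 2, 3));
    }
}
```

With this shape the encoder no longer allocates or flushes anything per record; whoever owns the stream (here {{writePartFile}}) decides when to flush and close it.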



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
