[ https://issues.apache.org/jira/browse/FLINK-3637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15228074#comment-15228074 ]
ASF GitHub Bot commented on FLINK-3637:
---------------------------------------
Github user aljoscha commented on the pull request:
https://github.com/apache/flink/pull/1826#issuecomment-206292450
Great to hear! Could you please close this PR? GitHub didn't close it
automatically.
> Change RollingSink Writer interface to allow wider range of outputs
> -------------------------------------------------------------------
>
> Key: FLINK-3637
> URL: https://issues.apache.org/jira/browse/FLINK-3637
> Project: Flink
> Issue Type: Improvement
> Components: Streaming Connectors
> Reporter: Lasse Dalegaard
> Assignee: Lasse Dalegaard
> Labels: features
> Fix For: 1.1.0
>
>
> Currently the RollingSink Writer interface only works with an
> FSDataOutputStream, which precludes it from being used with some existing
> libraries such as Apache ORC and Parquet.
> To fix this, a new Writer interface can be created that receives FileSystem
> and Path objects instead of an FSDataOutputStream.
> To ensure exactly-once semantics, the Writer interface must also be extended
> so that the current write offset can be retrieved at checkpointing time. For
> formats like ORC, this requires a footer to be written before the offset is
> returned. Checkpointing already calls flush on the writer, so either flush
> needs to return the current length of the output file, or a new method has
> to be added for this.
> The existing Writer interface can be recreated with a wrapper on top of the
> new Writer interface. The existing code that manages the FSDataOutputStream
> can then be moved into this new wrapper.
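
For illustration, the proposal quoted above could look roughly like the sketch below. This is not Flink's actual API: java.nio.file.Path stands in for Hadoop's FileSystem and Path so the example compiles on its own, and the names PathWriter, StreamWriter, StreamWriterAdapter and demo are hypothetical. It assumes the variant in which flush returns the current length of the output file.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Illustrative only: java.nio.file.Path stands in for Hadoop's FileSystem and
// Path, and every type name below is hypothetical, not Flink's actual API.
final class WriterSketch {

    /** New-style writer: it is handed a location and opens its own output. */
    interface PathWriter<T> {
        void open(Path path) throws IOException;
        void write(T element) throws IOException;
        /** Persists pending data (ORC would write its footer here) and returns the valid length. */
        long flush() throws IOException;
        void close() throws IOException;
    }

    /** Old-style writer: it only ever sees an already opened stream. */
    interface StreamWriter<T> {
        void write(OutputStream out, T element) throws IOException;
    }

    /** Wrapper that recreates the old interface on top of the new one. */
    static final class StreamWriterAdapter<T> implements PathWriter<T> {
        private final StreamWriter<T> delegate;
        private Path path;
        private OutputStream out;

        StreamWriterAdapter(StreamWriter<T> delegate) {
            this.delegate = delegate;
        }

        @Override
        public void open(Path path) throws IOException {
            this.path = path;
            // The stream management that used to live in the sink moves here.
            this.out = Files.newOutputStream(path);
        }

        @Override
        public void write(T element) throws IOException {
            delegate.write(out, element);
        }

        @Override
        public long flush() throws IOException {
            out.flush();
            // The offset a checkpoint would record for this file.
            return Files.size(path);
        }

        @Override
        public void close() throws IOException {
            out.close();
        }
    }

    /** Writes two one-character lines to a temp file and returns the reported offset. */
    static long demo() {
        StreamWriter<String> lines =
                (out, s) -> out.write((s + "\n").getBytes(StandardCharsets.UTF_8));
        PathWriter<String> writer = new StreamWriterAdapter<>(lines);
        try {
            Path file = Files.createTempFile("rolling-sink", ".txt");
            writer.open(file);
            writer.write("a");
            writer.write("b");
            long offset = writer.flush();
            writer.close();
            Files.delete(file);
            return offset;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

A format that manages its own file layout, such as ORC or Parquet, would implement the path-based interface directly, while existing stream-based writers keep working through the adapter.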