[jira] [Commented] (FLINK-14356) Introduce "single-field" format to (de)serialize message to a single field
[ https://issues.apache.org/jira/browse/FLINK-14356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17118310#comment-17118310 ]

Jark Wu commented on FLINK-14356:
---------------------------------

Responded in the PR.

> Introduce "single-field" format to (de)serialize message to a single field
> --------------------------------------------------------------------------
>
>                 Key: FLINK-14356
>                 URL: https://issues.apache.org/jira/browse/FLINK-14356
>             Project: Flink
>          Issue Type: Improvement
>          Components: Formats (JSON, Avro, Parquet, ORC, SequenceFile), Table SQL / API
>            Reporter: jinfeng
>            Assignee: jinfeng
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 1.12.0
>
> I want to use Flink SQL to write Kafka messages directly to HDFS, with no
> serialization or deserialization of the messages in between. The bytes of
> the message should be converted directly into the first field of the Row.
> However, the current RowSerializationSchema does not support the conversion
> of bytes to VARBINARY. Can we add a special RowSerializationSchema and
> RowDeserializationSchema?

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
[jira] [Commented] (FLINK-14356) Introduce "single-field" format to (de)serialize message to a single field
[ https://issues.apache.org/jira/browse/FLINK-14356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17117752#comment-17117752 ]

Robert Metzger commented on FLINK-14356:
----------------------------------------

[~jark], [~twalthr] There's a pull request available for this ticket: https://github.com/apache/flink/pull/11896

> Introduce "single-field" format to (de)serialize message to a single field
> --------------------------------------------------------------------------
>
>                 Key: FLINK-14356
>                 URL: https://issues.apache.org/jira/browse/FLINK-14356
>             Project: Flink
>          Issue Type: Improvement
>          Components: Formats (JSON, Avro, Parquet, ORC, SequenceFile), Table SQL / API
>            Reporter: jinfeng
>            Assignee: jinfeng
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 1.11.0
[jira] [Commented] (FLINK-14356) Introduce "single-field" format to (de)serialize message to a single field
[ https://issues.apache.org/jira/browse/FLINK-14356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17097991#comment-17097991 ]

Jark Wu commented on FLINK-14356:
---------------------------------

FYI: KSQL also has a similar feature: https://github.com/confluentinc/ksql/blob/master/design-proposals/klip-3-serialization-of-single-fields.md
[jira] [Commented] (FLINK-14356) Introduce "single-field" format to (de)serialize message to a single field
[ https://issues.apache.org/jira/browse/FLINK-14356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16949244#comment-16949244 ]

Timo Walther commented on FLINK-14356:
--------------------------------------

If we go for STRING as well, I suggest implementing at least: CHAR/VARCHAR/BINARY/VARBINARY/TINYINT/INT/SMALLINT/BIGINT. The implementation effort is not very big.
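The type list in Timo Walther's comment can be made concrete with a small sketch. The following is a minimal, self-contained Java illustration and not Flink code: `SingleFieldCodec` and its `FieldType` enum are hypothetical names, and the choices of UTF-8 for character types and big-endian byte order for integer types are assumptions that the thread does not specify.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper (not part of Flink): maps one value to message bytes and back. */
public final class SingleFieldCodec {

    /** The SQL types named in the comment; the enum itself is only an illustration. */
    public enum FieldType { CHAR, VARCHAR, BINARY, VARBINARY, TINYINT, SMALLINT, INT, BIGINT }

    private SingleFieldCodec() {}

    /** Turns the single field value into the bytes of the outgoing message. */
    public static byte[] serialize(FieldType type, Object value) {
        switch (type) {
            case CHAR:
            case VARCHAR:
                // Assumption: strings are encoded as UTF-8.
                return ((String) value).getBytes(StandardCharsets.UTF_8);
            case BINARY:
            case VARBINARY:
                // Binary types pass through; the copy protects the caller's array.
                return ((byte[]) value).clone();
            case TINYINT:
                return new byte[] {(Byte) value};
            case SMALLINT:
                // Assumption: ByteBuffer's default big-endian order.
                return ByteBuffer.allocate(Short.BYTES).putShort((Short) value).array();
            case INT:
                return ByteBuffer.allocate(Integer.BYTES).putInt((Integer) value).array();
            case BIGINT:
                return ByteBuffer.allocate(Long.BYTES).putLong((Long) value).array();
            default:
                throw new IllegalArgumentException("Unsupported type: " + type);
        }
    }

    /** Turns the bytes of an incoming message into the single field value. */
    public static Object deserialize(FieldType type, byte[] message) {
        switch (type) {
            case CHAR:
            case VARCHAR:
                return new String(message, StandardCharsets.UTF_8);
            case BINARY:
            case VARBINARY:
                return message.clone();
            case TINYINT:
                return message[0];
            case SMALLINT:
                return ByteBuffer.wrap(message).getShort();
            case INT:
                return ByteBuffer.wrap(message).getInt();
            case BIGINT:
                return ByteBuffer.wrap(message).getLong();
            default:
                throw new IllegalArgumentException("Unsupported type: " + type);
        }
    }
}
```

Every branch is a one-line conversion, which is consistent with the remark that the implementation effort is small. A real format would additionally have to settle on charset and endianness, and validate message length for the fixed-width types.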
[jira] [Commented] (FLINK-14356) Introduce "single-field" format to (de)serialize message to a single field
[ https://issues.apache.org/jira/browse/FLINK-14356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16949097#comment-16949097 ]

Jark Wu commented on FLINK-14356:
---------------------------------

Hi [~twalthr], I think STRING is also a common case if the file format is text-like. I think we can have a {{single-value}} or {{single-field}} format with only VARBINARY supported in the first version and evolve it step by step. What do you think?
[jira] [Commented] (FLINK-14356) Introduce "single-field" format to (de)serialize message to a single field
[ https://issues.apache.org/jira/browse/FLINK-14356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16948788#comment-16948788 ]

jinfeng commented on FLINK-14356:
---------------------------------

I am very happy to contribute this. It would be simple to implement a raw format that only supports VARBINARY.
[jira] [Commented] (FLINK-14356) Introduce "single-field" format to (de)serialize message to a single field
[ https://issues.apache.org/jira/browse/FLINK-14356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16948608#comment-16948608 ]

Timo Walther commented on FLINK-14356:
--------------------------------------

Alternatively, we could call it {{raw}} and really only support VARBINARY. I think this would simplify the design. What do you think?
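As a rough illustration of the {{raw}} idea from Timo Walther's comment, the sketch below passes the message bytes through unchanged as the only field of a row. `RawBytesFormat` is a hypothetical name, and a plain `Object[]` stands in for Flink's Row so that the example runs without Flink on the classpath; it is not the API proposed in the pull request.

```java
import java.util.Objects;

/** Hypothetical stand-in for a "raw" format; Object[] plays the role of Flink's Row. */
public final class RawBytesFormat {

    private RawBytesFormat() {}

    /** Wraps the untouched message bytes as the only field of a one-field row. */
    public static Object[] deserialize(byte[] message) {
        return new Object[] {Objects.requireNonNull(message, "message")};
    }

    /** Returns the bytes of the single VARBINARY field; rejects any other row shape. */
    public static byte[] serialize(Object[] row) {
        if (row.length != 1 || !(row[0] instanceof byte[])) {
            throw new IllegalArgumentException(
                    "raw format expects exactly one VARBINARY field");
        }
        return (byte[]) row[0];
    }
}
```

Restricting the format to a single VARBINARY field means there is no type dispatch at all, which is the simplification the comment argues for. The only logic left is the check that the row has the expected shape.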