[jira] [Commented] (FLINK-14356) Introduce "single-field" format to (de)serialize message to a single field

2020-05-27 Thread Jark Wu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-14356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17118310#comment-17118310
 ] 

Jark Wu commented on FLINK-14356:
-

Responded in the PR.

> Introduce "single-field" format to (de)serialize message to a single field
> --
>
> Key: FLINK-14356
> URL: https://issues.apache.org/jira/browse/FLINK-14356
> Project: Flink
>  Issue Type: Improvement
>  Components: Formats (JSON, Avro, Parquet, ORC, SequenceFile), Table 
> SQL / API
>Reporter: jinfeng
>Assignee: jinfeng
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.12.0
>
>
> I want to use Flink SQL to write Kafka messages directly to HDFS, with no 
> serialization or deserialization of the messages in between. The bytes of 
> each message should be mapped directly to the first field of the Row. However, 
> the current RowSerializationSchema does not support the conversion of bytes 
> to VARBINARY. Can we add a special RowSerializationSchema and 
> RowDeserializationSchema?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-14356) Introduce "single-field" format to (de)serialize message to a single field

2020-05-27 Thread Robert Metzger (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-14356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17117752#comment-17117752
 ] 

Robert Metzger commented on FLINK-14356:


[~jark], [~twalthr] There's a pull request available for this ticket: 
https://github.com/apache/flink/pull/11896

> Introduce "single-field" format to (de)serialize message to a single field
> --
>
> Key: FLINK-14356
> URL: https://issues.apache.org/jira/browse/FLINK-14356
> Project: Flink
>  Issue Type: Improvement
>  Components: Formats (JSON, Avro, Parquet, ORC, SequenceFile), Table 
> SQL / API
>Reporter: jinfeng
>Assignee: jinfeng
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.11.0





[jira] [Commented] (FLINK-14356) Introduce "single-field" format to (de)serialize message to a single field

2020-05-02 Thread Jark Wu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-14356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17097991#comment-17097991
 ] 

Jark Wu commented on FLINK-14356:
-

FYI: KSQL also has a similar feature: 
https://github.com/confluentinc/ksql/blob/master/design-proposals/klip-3-serialization-of-single-fields.md






[jira] [Commented] (FLINK-14356) Introduce "single-field" format to (de)serialize message to a single field

2019-10-11 Thread Timo Walther (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-14356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16949244#comment-16949244
 ] 

Timo Walther commented on FLINK-14356:
--

If we go for STRING as well, I suggest implementing at least: 
CHAR/VARCHAR/BINARY/VARBINARY/TINYINT/INT/SMALLINT/BIGINT. The implementation 
effort is not very big. 
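
To illustrate why the effort for these types would be small, here is a minimal 
sketch of a single-value codec. It is not Flink's API: the class name 
SingleValueCodec, the SqlType enum, UTF-8 for character types, and big-endian 
byte order for integer types are all assumptions made for this example.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

/** Hypothetical sketch of a single-value codec; none of these names are Flink APIs. */
final class SingleValueCodec {

    /** The SQL types listed above; this enum exists only for illustration. */
    enum SqlType { CHAR, VARCHAR, BINARY, VARBINARY, TINYINT, SMALLINT, INT, BIGINT }

    /** Turns one field value into the raw message bytes. */
    static byte[] serialize(SqlType type, Object value) {
        switch (type) {
            case CHAR:
            case VARCHAR:
                // Assumption: character data is encoded as UTF-8.
                return ((String) value).getBytes(StandardCharsets.UTF_8);
            case BINARY:
            case VARBINARY:
                // Bytes pass through unchanged.
                return (byte[]) value;
            case TINYINT:
                return new byte[] {(Byte) value};
            case SMALLINT:
                // ByteBuffer writes big-endian by default (assumed byte order).
                return ByteBuffer.allocate(Short.BYTES).putShort((Short) value).array();
            case INT:
                return ByteBuffer.allocate(Integer.BYTES).putInt((Integer) value).array();
            case BIGINT:
                return ByteBuffer.allocate(Long.BYTES).putLong((Long) value).array();
            default:
                throw new IllegalArgumentException("Unsupported type: " + type);
        }
    }

    /** Turns the raw message bytes back into one field value. */
    static Object deserialize(SqlType type, byte[] message) {
        switch (type) {
            case CHAR:
            case VARCHAR:
                return new String(message, StandardCharsets.UTF_8);
            case BINARY:
            case VARBINARY:
                return message;
            case TINYINT:
                requireLength(message, Byte.BYTES, type);
                return message[0];
            case SMALLINT:
                requireLength(message, Short.BYTES, type);
                return ByteBuffer.wrap(message).getShort();
            case INT:
                requireLength(message, Integer.BYTES, type);
                return ByteBuffer.wrap(message).getInt();
            case BIGINT:
                requireLength(message, Long.BYTES, type);
                return ByteBuffer.wrap(message).getLong();
            default:
                throw new IllegalArgumentException("Unsupported type: " + type);
        }
    }

    /** Fixed-width types must match their width exactly. */
    private static void requireLength(byte[] message, int expected, SqlType type) {
        if (message.length != expected) {
            throw new IllegalArgumentException(
                    type + " expects " + expected + " bytes but got " + message.length);
        }
    }
}
```

In this sketch the VARBINARY case is a plain pass-through, which corresponds 
to the original request. A real format would also need to settle the charset 
and byte order questions that are fixed by assumption here.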

> Introduce "single-field" format to (de)serialize message to a single field
> --
>
> Key: FLINK-14356
> URL: https://issues.apache.org/jira/browse/FLINK-14356
> Project: Flink
>  Issue Type: Improvement
>  Components: Formats (JSON, Avro, Parquet, ORC, SequenceFile), Table 
> SQL / API
>Reporter: jinfeng
>Assignee: jinfeng
>Priority: Major





[jira] [Commented] (FLINK-14356) Introduce "single-field" format to (de)serialize message to a single field

2019-10-10 Thread Jark Wu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-14356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16949097#comment-16949097
 ] 

Jark Wu commented on FLINK-14356:
-

Hi [~twalthr], I think STRING is also a common case when the file format is 
text-like. I think we can have a {{single-value}} or {{single-field}} format 
with only VARBINARY supported in the first version and evolve it step by step. 
What do you think? 

> Introduce "single-field" format to (de)serialize message to a single field
> --
>
> Key: FLINK-14356
> URL: https://issues.apache.org/jira/browse/FLINK-14356
> Project: Flink
>  Issue Type: Improvement
>  Components: Formats (JSON, Avro, Parquet, ORC, SequenceFile), Table 
> SQL / API
>Reporter: jinfeng
>Priority: Major





[jira] [Commented] (FLINK-14356) Introduce "single-field" format to (de)serialize message to a single field

2019-10-10 Thread jinfeng (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-14356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16948788#comment-16948788
 ] 

jinfeng commented on FLINK-14356:
-

I am very happy to contribute this. It would be simple to implement the raw 
format that only supports VARBINARY.






[jira] [Commented] (FLINK-14356) Introduce "single-field" format to (de)serialize message to a single field

2019-10-10 Thread Timo Walther (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-14356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16948608#comment-16948608
 ] 

Timo Walther commented on FLINK-14356:
--

Alternatively, we could call it {{raw}} and really only support VARBINARY. I 
think this would simplify the design. What do you think?



