[https://issues.apache.org/jira/browse/FLINK-18202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17235531#comment-17235531]
Suhan Mao edited comment on FLINK-18202 at 11/19/20, 3:12 PM:
--------------------------------------------------------------
[~twalthr] [~libenchao]
I have finished the first version of flink-pb. The code is located at
[https://github.com/maosuhan/flink-pb] for temporary use. You can go through
the code with the help of the README. Please review the design and the code; I
will be very happy to hear your advice.
Features:
# Code generation in deserialization. The performance is close to that of the native protobuf Java API.
# Fast serialization. The performance is 3 times faster than the proto builder API (setter -> build -> toByteArray()); see the first sketch after this list.
# Both proto2 and proto3 are supported. The syntax is recognized from the proto file read by the table source/sink function.
# All protobuf data types are supported, including simple types, message types, map types, and array types.
# Implements the DynamicTableFactory and RowData interfaces.
# Supports the pb.message-class-name, pb.ignore-parse-errors and pb.ignore-default-values connector params; see the second sketch after this list and the GitHub README for details.
# Supports flexible configuration of java_multiple_files and java_outer_classname in the proto file. Not handling these options could cause code generation errors.
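For context on item 2, here is a minimal sketch of the "proto builder API" path it benchmarks against (setter -> build -> toByteArray()). MyMessage and its fields are hypothetical stand-ins for a protoc-generated class, not code from the flink-pb repository:
{code:java}
// Baseline serialization path via the generated protobuf builder API.
// "MyMessage" is a placeholder for any protoc-generated message class;
// the fields are made up for illustration.
MyMessage.Builder builder = MyMessage.newBuilder(); // setter phase
builder.setId(42L);
builder.setName("flink");
MyMessage message = builder.build();                // build phase
byte[] payload = message.toByteArray();             // serialize to bytes
{code}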
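And here is a hedged sketch of how the connector params from item 6 might be used from the Table API. Only the three pb.* option names come from the list above; the format identifier "pb", the Kafka connector, the schema, and the message class name are assumptions for illustration:
{code:java}
import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableEnvironment;

public class ProtobufFormatExample {
    public static void main(String[] args) {
        TableEnvironment tEnv =
                TableEnvironment.create(EnvironmentSettings.newInstance().inStreamingMode().build());

        // Hypothetical DDL: only the pb.* option names are taken from the feature list above.
        tEnv.executeSql(
                "CREATE TABLE pb_source (\n"
                        + "  id BIGINT,\n"
                        + "  name STRING\n"
                        + ") WITH (\n"
                        + "  'connector' = 'kafka',\n"
                        + "  'topic' = 'my-topic',\n"
                        + "  'properties.bootstrap.servers' = 'localhost:9092',\n"
                        + "  'format' = 'pb',\n"
                        + "  'pb.message-class-name' = 'com.example.MyMessage',\n"
                        + "  'pb.ignore-parse-errors' = 'true',\n"
                        + "  'pb.ignore-default-values' = 'false'\n"
                        + ")");
    }
}
{code}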
Here are some of my questions, as I'm new to contributing Flink code:
# Where should the code be placed in the Flink project? In flink-formats, with the module named "flink-protobuf"?
# Should I provide a more detailed design doc?
# Is the way I use code generation in the deserialization part the right approach for the Flink project?
# Are the test cases sufficient? Any advice is welcome.
# Are there any further actions we can take to move forward?
> Introduce Protobuf format
> -------------------------
>
> Key: FLINK-18202
> URL: https://issues.apache.org/jira/browse/FLINK-18202
> Project: Flink
> Issue Type: New Feature
> Components: Formats (JSON, Avro, Parquet, ORC, SequenceFile), Table SQL / API
> Reporter: Benchao Li
> Priority: Major
> Attachments: image-2020-06-15-17-18-03-182.png
>
>
> PB[1] is a very famous and widely used (de)serialization framework. The ML[2]
> also has some discussions about this. It's a useful feature.
> This issue may need some design work, or a FLIP.
> [1] [https://developers.google.com/protocol-buffers]
> [2] [http://apache-flink.147419.n8.nabble.com/Flink-SQL-UDF-td3725.html]