[jira] [Commented] (FLINK-33220) PyFlink support for Datagen connector

2023-10-09 Thread Liu Chong (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-33220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17773534#comment-17773534
 ] 

Liu Chong commented on FLINK-33220:
---

We have the code ready for adding Datagen to PyFlink. Could you comment on 
whether we should move forward and submit the PR? [~dianfu] 

> PyFlink support for Datagen connector
> -
>
> Key: FLINK-33220
> URL: https://issues.apache.org/jira/browse/FLINK-33220
> Project: Flink
>  Issue Type: New Feature
>  Components: API / Python
>Reporter: Liu Chong
>Priority: Minor
>
> This is a simple Jira to propose the support of Datagen in PyFlink datastream 
> API as a built-in source connector



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-33197) PyFlink support for ByteArraySchema

2023-10-09 Thread Liu Chong (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-33197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17773536#comment-17773536
 ] 

Liu Chong commented on FLINK-33197:
---

We have the code ready for adding ByteArraySchema to PyFlink. Could you comment 
on whether we should move forward and submit the PR? [~dianfu] 

> PyFlink support for ByteArraySchema
> ---
>
> Key: FLINK-33197
> URL: https://issues.apache.org/jira/browse/FLINK-33197
> Project: Flink
>  Issue Type: New Feature
>  Components: API / Python
>Affects Versions: 1.17.0
>Reporter: Liu Chong
>Priority: Minor
>
> Currently, in the Python Flink API, only SimpleStringSchema is available when 
> reading messages from a Kafka source.
> If the data is in an arbitrary binary format (e.g. a marshalled Protocol 
> Buffer message), it may not be decodable with the default 'utf-8' encoding.
> The current workaround is to manually set the encoding to 'ISO-8859-1', 
> which can decode any byte sequence.
> However, this is not an elegant solution.
> We should support ByteArraySchema, which outputs a raw byte array for 
> subsequent unmarshalling.





[jira] [Updated] (FLINK-33220) PyFlink support for Datagen connector

2023-10-09 Thread Liu Chong (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33220?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Liu Chong updated FLINK-33220:
--
Description: This is a simple Jira to propose the support of Datagen in 
PyFlink datastream API as a built-in source connector  (was: This is a simple 
Jira to propose the support of datagen in PyFlink datastream API as a built-in 
source connector)

> PyFlink support for Datagen connector
> -
>
> Key: FLINK-33220
> URL: https://issues.apache.org/jira/browse/FLINK-33220
> Project: Flink
>  Issue Type: New Feature
>  Components: API / Python
>Reporter: Liu Chong
>Priority: Minor
>
> This is a simple Jira to propose the support of Datagen in PyFlink datastream 
> API as a built-in source connector





[jira] [Created] (FLINK-33220) PyFlink support for datagen connector

2023-10-09 Thread Liu Chong (Jira)
Liu Chong created FLINK-33220:
-

 Summary: PyFlink support for datagen connector
 Key: FLINK-33220
 URL: https://issues.apache.org/jira/browse/FLINK-33220
 Project: Flink
  Issue Type: New Feature
  Components: API / Python
Reporter: Liu Chong


This is a simple Jira to propose the support of datagen in PyFlink datastream 
API as a built-in source connector





[jira] [Updated] (FLINK-33220) PyFlink support for Datagen connector

2023-10-09 Thread Liu Chong (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33220?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Liu Chong updated FLINK-33220:
--
Summary: PyFlink support for Datagen connector  (was: PyFlink support for 
datagen connector)

> PyFlink support for Datagen connector
> -
>
> Key: FLINK-33220
> URL: https://issues.apache.org/jira/browse/FLINK-33220
> Project: Flink
>  Issue Type: New Feature
>  Components: API / Python
>Reporter: Liu Chong
>Priority: Minor
>
> This is a simple Jira to propose the support of datagen in PyFlink datastream 
> API as a built-in source connector





[jira] [Created] (FLINK-33197) PyFlink support for ByteArraySchema

2023-10-06 Thread Liu Chong (Jira)
Liu Chong created FLINK-33197:
-

 Summary: PyFlink support for ByteArraySchema
 Key: FLINK-33197
 URL: https://issues.apache.org/jira/browse/FLINK-33197
 Project: Flink
  Issue Type: New Feature
  Components: API / Python
Affects Versions: 1.17.0
Reporter: Liu Chong


Currently, in the Python Flink API, only SimpleStringSchema is available when 
reading messages from a Kafka source.
If the data is in an arbitrary binary format (e.g. a marshalled Protocol Buffer 
message), it may not be decodable with the default 'utf-8' encoding.
The current workaround is to manually set the encoding to 'ISO-8859-1', which 
can decode any byte sequence.
However, this is not an elegant solution.
We should support ByteArraySchema, which outputs a raw byte array for subsequent 
unmarshalling.
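
To illustrate why the 'ISO-8859-1' workaround works at all, here is a minimal 
sketch in plain Python. It does not use any PyFlink API; the payload bytes are 
made up for illustration, and the variable names are hypothetical. The 
assumption is that a job using the workaround receives a str decoded with 
'ISO-8859-1' and has to re-encode it to get the original bytes back.

```python
# Hypothetical binary payload standing in for a marshalled Protocol Buffer
# message; the byte values are made up for illustration.
payload = bytes([0x08, 0x96, 0x01, 0xFF, 0xFE, 0x80])

# Decoding as UTF-8 fails: 0x96, 0xFF, 0xFE and 0x80 do not form valid
# UTF-8 sequences here.
try:
    payload.decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False

# ISO-8859-1 maps each byte value 0..255 to exactly one code point, so any
# byte sequence decodes, and re-encoding restores the original bytes.
as_text = payload.decode("iso-8859-1")
recovered = as_text.encode("iso-8859-1")

print(utf8_ok)
print(recovered == payload)
```

The round trip is lossless, but every record is turned into a string and back 
only to recover the bytes it started as, which is the extra step a 
ByteArraySchema would avoid.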


