[jira] [Updated] (SPARK-48330) Fix the python streaming data source timeout issue for large trigger interval

2024-05-18 Thread Chaoqin Li (Jira)
[ https://issues.apache.org/jira/browse/SPARK-48330?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chaoqin Li updated SPARK-48330: --- Summary: Fix the python streaming data source timeout issue for large trigger interval (was: Fix

[jira] [Created] (SPARK-48330) Fix the python data source timeout issue for large trigger interval

2024-05-18 Thread Chaoqin Li (Jira)
Chaoqin Li created SPARK-48330: -- Summary: Fix the python data source timeout issue for large trigger interval Key: SPARK-48330 URL: https://issues.apache.org/jira/browse/SPARK-48330 Project: Spark

[jira] [Created] (SPARK-48062) Add pyspark test for SimpleDataSourceStreamingReader

2024-04-30 Thread Chaoqin Li (Jira)
Chaoqin Li created SPARK-48062: -- Summary: Add pyspark test for SimpleDataSourceStreamingReader Key: SPARK-48062 URL: https://issues.apache.org/jira/browse/SPARK-48062 Project: Spark Issue Type:

[jira] [Updated] (SPARK-47793) Implement SimpleDataSourceStreamReader for python streaming data source

2024-04-21 Thread Chaoqin Li (Jira)
[ https://issues.apache.org/jira/browse/SPARK-47793?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chaoqin Li updated SPARK-47793: --- Epic Link: SPARK-46866 > Implement SimpleDataSourceStreamReader for python streaming data source >

[jira] [Updated] (SPARK-47920) Add documentation for python streaming data source

2024-04-21 Thread Chaoqin Li (Jira)
[ https://issues.apache.org/jira/browse/SPARK-47920?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chaoqin Li updated SPARK-47920: --- Epic Link: SPARK-46866 > Add documentation for python streaming data source >

[jira] [Updated] (SPARK-47777) Add spark connect test for python streaming data source

2024-04-21 Thread Chaoqin Li (Jira)
[ https://issues.apache.org/jira/browse/SPARK-47777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chaoqin Li updated SPARK-47777: --- Epic Link: SPARK-46866 > Add spark connect test for python streaming data source >

[jira] [Updated] (SPARK-47273) Implement python stream writer interface

2024-04-21 Thread Chaoqin Li (Jira)
[ https://issues.apache.org/jira/browse/SPARK-47273?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chaoqin Li updated SPARK-47273: --- Epic Link: SPARK-46866 > Implement python stream writer interface >

[jira] [Updated] (SPARK-47107) Implement partition reader for python streaming data source

2024-04-21 Thread Chaoqin Li (Jira)
[ https://issues.apache.org/jira/browse/SPARK-47107?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chaoqin Li updated SPARK-47107: --- Epic Link: SPARK-46866 > Implement partition reader for python streaming data source >

[jira] [Created] (SPARK-47920) Add documentation for python streaming data source

2024-04-19 Thread Chaoqin Li (Jira)
Chaoqin Li created SPARK-47920: -- Summary: Add documentation for python streaming data source Key: SPARK-47920 URL: https://issues.apache.org/jira/browse/SPARK-47920 Project: Spark Issue Type:

[jira] [Updated] (SPARK-47793) Implement SimpleDataSourceStreamReader for python streaming data source

2024-04-10 Thread Chaoqin Li (Jira)
[ https://issues.apache.org/jira/browse/SPARK-47793?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chaoqin Li updated SPARK-47793: --- Description: SimpleDataSourceStreamReader is a simplified version of the DataStreamReader

[jira] [Created] (SPARK-47793) Implement SimpleDataSourceStreamReader for python streaming data source

2024-04-10 Thread Chaoqin Li (Jira)
Chaoqin Li created SPARK-47793: -- Summary: Implement SimpleDataSourceStreamReader for python streaming data source Key: SPARK-47793 URL: https://issues.apache.org/jira/browse/SPARK-47793 Project: Spark

[jira] [Created] (SPARK-47777) Add spark connect test for python streaming data source

2024-04-09 Thread Chaoqin Li (Jira)
Chaoqin Li created SPARK-47777: -- Summary: Add spark connect test for python streaming data source Key: SPARK-47777 URL: https://issues.apache.org/jira/browse/SPARK-47777 Project: Spark Issue

[jira] [Updated] (SPARK-47273) Implement python stream writer interface

2024-03-04 Thread Chaoqin Li (Jira)
[ https://issues.apache.org/jira/browse/SPARK-47273?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chaoqin Li updated SPARK-47273: --- Description: In order to support developing spark streaming sink in python, we need to implement

[jira] [Updated] (SPARK-47273) Implement python stream writer interface

2024-03-04 Thread Chaoqin Li (Jira)
[ https://issues.apache.org/jira/browse/SPARK-47273?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chaoqin Li updated SPARK-47273: --- Description: In order to support developing spark streaming sink in python, we need to implement 

[jira] [Created] (SPARK-47273) Implement python stream writer interface

2024-03-04 Thread Chaoqin Li (Jira)
Chaoqin Li created SPARK-47273: -- Summary: Implement python stream writer interface Key: SPARK-47273 URL: https://issues.apache.org/jira/browse/SPARK-47273 Project: Spark Issue Type: Improvement

[jira] [Created] (SPARK-47155) Fix incorrect error class in create_data_source.py

2024-02-24 Thread Chaoqin Li (Jira)
Chaoqin Li created SPARK-47155: -- Summary: Fix incorrect error class in create_data_source.py Key: SPARK-47155 URL: https://issues.apache.org/jira/browse/SPARK-47155 Project: Spark Issue Type:

[jira] [Created] (SPARK-47107) Implement partition reader for python streaming data source

2024-02-20 Thread Chaoqin Li (Jira)
Chaoqin Li created SPARK-47107: -- Summary: Implement partition reader for python streaming data source Key: SPARK-47107 URL: https://issues.apache.org/jira/browse/SPARK-47107 Project: Spark

[jira] [Created] (SPARK-46994) Refactor PythonWrite to prepare for supporting python data source streaming write

2024-02-06 Thread Chaoqin Li (Jira)
Chaoqin Li created SPARK-46994: -- Summary: Refactor PythonWrite to prepare for supporting python data source streaming write Key: SPARK-46994 URL: https://issues.apache.org/jira/browse/SPARK-46994

[jira] [Created] (SPARK-46962) Implement python worker to run python streaming data source

2024-02-02 Thread Chaoqin Li (Jira)
Chaoqin Li created SPARK-46962: -- Summary: Implement python worker to run python streaming data source Key: SPARK-46962 URL: https://issues.apache.org/jira/browse/SPARK-46962 Project: Spark

[jira] [Updated] (SPARK-46866) Streaming python data source API

2024-02-02 Thread Chaoqin Li (Jira)
[ https://issues.apache.org/jira/browse/SPARK-46866?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chaoqin Li updated SPARK-46866: --- Issue Type: Epic (was: Improvement) > Streaming python data source API >

[jira] [Created] (SPARK-46866) Streaming python data source API

2024-01-25 Thread Chaoqin Li (Jira)
Chaoqin Li created SPARK-46866: -- Summary: Streaming python data source API Key: SPARK-46866 URL: https://issues.apache.org/jira/browse/SPARK-46866 Project: Spark Issue Type: Improvement

[jira] [Created] (SPARK-46736) Retain empty protobuf message in schema for protobuf connector

2024-01-16 Thread Chaoqin Li (Jira)
Chaoqin Li created SPARK-46736: -- Summary: Retain empty protobuf message in schema for protobuf connector Key: SPARK-46736 URL: https://issues.apache.org/jira/browse/SPARK-46736 Project: Spark

[jira] [Created] (SPARK-46709) Expose partition_id column in state datasource by default

2024-01-12 Thread Chaoqin Li (Jira)
Chaoqin Li created SPARK-46709: -- Summary: Expose partition_id column in state datasource by default Key: SPARK-46709 URL: https://issues.apache.org/jira/browse/SPARK-46709 Project: Spark Issue

[jira] [Created] (SPARK-45794) Introduce state metadata source to query the streaming state metadata information

2023-11-04 Thread Chaoqin Li (Jira)
Chaoqin Li created SPARK-45794: -- Summary: Introduce state metadata source to query the streaming state metadata information Key: SPARK-45794 URL: https://issues.apache.org/jira/browse/SPARK-45794

[jira] [Updated] (SPARK-45747) Support session window aggregation in state reader

2023-10-31 Thread Chaoqin Li (Jira)
[ https://issues.apache.org/jira/browse/SPARK-45747?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chaoqin Li updated SPARK-45747: --- Summary: Support session window aggregation in state reader (was: Support session window operator

[jira] [Created] (SPARK-45747) Support session window operator in state reader

2023-10-31 Thread Chaoqin Li (Jira)
Chaoqin Li created SPARK-45747: -- Summary: Support session window operator in state reader Key: SPARK-45747 URL: https://issues.apache.org/jira/browse/SPARK-45747 Project: Spark Issue Type:

[jira] [Commented] (SPARK-45511) SPIP: State Data Source - Reader

2023-10-16 Thread Chaoqin Li (Jira)
[ https://issues.apache.org/jira/browse/SPARK-45511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17776009#comment-17776009 ] Chaoqin Li commented on SPARK-45511: +1. This will be very useful for debugging stateful streaming 

[jira] [Updated] (SPARK-45558) Introduce a metadata file for streaming stateful operator

2023-10-16 Thread Chaoqin Li (Jira)
[ https://issues.apache.org/jira/browse/SPARK-45558?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chaoqin Li updated SPARK-45558: --- Description: The information to store in the metadata file: * operator name (no need to be unique

[jira] [Created] (SPARK-45558) Introduce a metadata file for streaming stateful operator

2023-10-16 Thread Chaoqin Li (Jira)
Chaoqin Li created SPARK-45558: -- Summary: Introduce a metadata file for streaming stateful operator Key: SPARK-45558 URL: https://issues.apache.org/jira/browse/SPARK-45558 Project: Spark Issue

[jira] [Comment Edited] (SPARK-43421) Implement changelog checkpointing for RocksDB state store

2023-05-09 Thread Chaoqin Li (Jira)
[ https://issues.apache.org/jira/browse/SPARK-43421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17721037#comment-17721037 ] Chaoqin Li edited comment on SPARK-43421 at 5/9/23 6:19 PM: Design doc:

[jira] [Created] (SPARK-42353) Cleanup orphan sst and log files in RocksDB checkpoint directory

2023-02-05 Thread Chaoqin Li (Jira)
Chaoqin Li created SPARK-42353: -- Summary: Cleanup orphan sst and log files in RocksDB checkpoint directory Key: SPARK-42353 URL: https://issues.apache.org/jira/browse/SPARK-42353 Project: Spark

[jira] [Commented] (SPARK-42075) Deprecate DStream API

2023-01-15 Thread Chaoqin Li (Jira)
[ https://issues.apache.org/jira/browse/SPARK-42075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17677114#comment-17677114 ] Chaoqin Li commented on SPARK-42075: [~kabhwan] , sure, I can take this. Thanks! > Deprecate

[jira] [Created] (SPARK-40492) Perform maintenance of StateStore instances when they become inactive

2022-09-19 Thread Chaoqin Li (Jira)
Chaoqin Li created SPARK-40492: -- Summary: Perform maintenance of StateStore instances when they become inactive Key: SPARK-40492 URL: https://issues.apache.org/jira/browse/SPARK-40492 Project: Spark