[jira] [Updated] (HUDI-1205) Serialization fail when log file is larger than 2GB

2020-08-19 Thread Yanjia Gary Li (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1205?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yanjia Gary Li updated HUDI-1205: - Status: Open (was: New) > Serialization fail when log file is larger than 2GB >

[jira] [Updated] (HUDI-1205) Serialization fail when log file is larger than 2GB

2020-08-19 Thread Yanjia Gary Li (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1205?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yanjia Gary Li updated HUDI-1205: - Description: When scanning the log file, if the log file(or log file group) is larger than 2GB,

[jira] [Created] (HUDI-1205) Serialization fail when log file is larger than 2GB

2020-08-19 Thread Yanjia Gary Li (Jira)
Yanjia Gary Li created HUDI-1205: Summary: Serialization fail when log file is larger than 2GB Key: HUDI-1205 URL: https://issues.apache.org/jira/browse/HUDI-1205 Project: Apache Hudi Issue

[jira] [Commented] (HUDI-920) Incremental view on MOR table using Spark Datasource

2020-08-08 Thread Yanjia Gary Li (Jira)
[ https://issues.apache.org/jira/browse/HUDI-920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17173594#comment-17173594 ] Yanjia Gary Li commented on HUDI-920: - The most challenging thing of the incremental query for MOR was

[jira] [Resolved] (HUDI-69) Support realtime view in Spark datasource #136

2020-08-07 Thread Yanjia Gary Li (Jira)
[ https://issues.apache.org/jira/browse/HUDI-69?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yanjia Gary Li resolved HUDI-69. Resolution: Fixed > Support realtime view in Spark datasource #136 >

[jira] [Resolved] (HUDI-1052) Support vectorized reader for MOR datasource reader

2020-08-07 Thread Yanjia Gary Li (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yanjia Gary Li resolved HUDI-1052. -- Resolution: Fixed > Support vectorized reader for MOR datasource reader >

[jira] [Resolved] (HUDI-1050) Support filter pushdown and column pruning for MOR table on Spark Datasource

2020-08-07 Thread Yanjia Gary Li (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1050?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yanjia Gary Li resolved HUDI-1050. -- Resolution: Fixed > Support filter pushdown and column pruning for MOR table on Spark

[jira] [Updated] (HUDI-1052) Support vectorized reader for MOR datasource reader

2020-08-07 Thread Yanjia Gary Li (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yanjia Gary Li updated HUDI-1052: - Status: In Progress (was: Open) > Support vectorized reader for MOR datasource reader >

[jira] [Updated] (HUDI-1141) Serialization fail when loading two log files

2020-07-31 Thread Yanjia Gary Li (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1141?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yanjia Gary Li updated HUDI-1141: - Summary: Serialization fail when loading two log files (was: Serialization fail when loading

[jira] [Created] (HUDI-1141) Serialization fail when loading large log files

2020-07-31 Thread Yanjia Gary Li (Jira)
Yanjia Gary Li created HUDI-1141: Summary: Serialization fail when loading large log files Key: HUDI-1141 URL: https://issues.apache.org/jira/browse/HUDI-1141 Project: Apache Hudi Issue

[jira] [Updated] (HUDI-1050) Support filter pushdown and column pruning for MOR table on Spark Datasource

2020-07-26 Thread Yanjia Gary Li (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1050?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yanjia Gary Li updated HUDI-1050: - Status: In Progress (was: Open) > Support filter pushdown and column pruning for MOR table on

[jira] [Updated] (HUDI-1120) Support spotless for scala

2020-07-22 Thread Yanjia Gary Li (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1120?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yanjia Gary Li updated HUDI-1120: - Component/s: Code Cleanup > Support spotless for scala > -- > >

[jira] [Updated] (HUDI-1120) Support spotless for scala

2020-07-22 Thread Yanjia Gary Li (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1120?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yanjia Gary Li updated HUDI-1120: - Fix Version/s: 0.6.0 > Support spotless for scala > -- > >

[jira] [Updated] (HUDI-1120) Support spotless for scala

2020-07-22 Thread Yanjia Gary Li (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1120?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yanjia Gary Li updated HUDI-1120: - Status: In Progress (was: Open) > Support spotless for scala > -- > >

[jira] [Updated] (HUDI-1120) Support spotless for scala

2020-07-22 Thread Yanjia Gary Li (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1120?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yanjia Gary Li updated HUDI-1120: - Status: Open (was: New) > Support spotless for scala > -- > >

[jira] [Created] (HUDI-1120) Support spotless for scala

2020-07-22 Thread Yanjia Gary Li (Jira)
Yanjia Gary Li created HUDI-1120: Summary: Support spotless for scala Key: HUDI-1120 URL: https://issues.apache.org/jira/browse/HUDI-1120 Project: Apache Hudi Issue Type: Sub-task

[jira] [Updated] (HUDI-1050) Support filter pushdown and column pruning for MOR table on Spark Datasource

2020-07-21 Thread Yanjia Gary Li (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1050?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yanjia Gary Li updated HUDI-1050: - Fix Version/s: (was: 0.6.1) 0.6.0 > Support filter pushdown and column

[jira] [Updated] (HUDI-1114) Explore Spark Structure Streaming for Hudi Dataset

2020-07-20 Thread Yanjia Gary Li (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yanjia Gary Li updated HUDI-1114: - Status: Open (was: New) > Explore Spark Structure Streaming for Hudi Dataset >

[jira] [Created] (HUDI-1114) Explore Spark Structure Streaming for Hudi Dataset

2020-07-20 Thread Yanjia Gary Li (Jira)
Yanjia Gary Li created HUDI-1114: Summary: Explore Spark Structure Streaming for Hudi Dataset Key: HUDI-1114 URL: https://issues.apache.org/jira/browse/HUDI-1114 Project: Apache Hudi Issue

[jira] [Created] (HUDI-1101) Decouple Hive dependencies from hudi-spark and hudi-utilities

2020-07-16 Thread Yanjia Gary Li (Jira)
Yanjia Gary Li created HUDI-1101: Summary: Decouple Hive dependencies from hudi-spark and hudi-utilities Key: HUDI-1101 URL: https://issues.apache.org/jira/browse/HUDI-1101 Project: Apache Hudi

[jira] [Updated] (HUDI-1051) Improve MOR datasource reader file listing and path handling

2020-06-24 Thread Yanjia Gary Li (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1051?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yanjia Gary Li updated HUDI-1051: - Status: Open (was: New) > Improve MOR datasource reader file listing and path handling >

[jira] [Updated] (HUDI-1052) Support vectorized reader for MOR datasource reader

2020-06-24 Thread Yanjia Gary Li (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yanjia Gary Li updated HUDI-1052: - Status: Open (was: New) > Support vectorized reader for MOR datasource reader >

[jira] [Updated] (HUDI-1050) Support filter pushdown and column pruning for MOR table on Spark Datasource

2020-06-24 Thread Yanjia Gary Li (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1050?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yanjia Gary Li updated HUDI-1050: - Status: Open (was: New) > Support filter pushdown and column pruning for MOR table on Spark

[jira] [Created] (HUDI-1052) Support vectorized reader for MOR datasource reader

2020-06-24 Thread Yanjia Gary Li (Jira)
Yanjia Gary Li created HUDI-1052: Summary: Support vectorized reader for MOR datasource reader Key: HUDI-1052 URL: https://issues.apache.org/jira/browse/HUDI-1052 Project: Apache Hudi Issue

[jira] [Created] (HUDI-1051) Improve MOR datasource reader file listing and path handling

2020-06-24 Thread Yanjia Gary Li (Jira)
Yanjia Gary Li created HUDI-1051: Summary: Improve MOR datasource reader file listing and path handling Key: HUDI-1051 URL: https://issues.apache.org/jira/browse/HUDI-1051 Project: Apache Hudi

[jira] [Created] (HUDI-1050) Support filter pushdown and column pruning for MOR table on Spark Datasource

2020-06-24 Thread Yanjia Gary Li (Jira)
Yanjia Gary Li created HUDI-1050: Summary: Support filter pushdown and column pruning for MOR table on Spark Datasource Key: HUDI-1050 URL: https://issues.apache.org/jira/browse/HUDI-1050 Project:

[jira] [Updated] (HUDI-1028) Hudi write job stuck when start EmbeddedTimelineService failed

2020-06-17 Thread Yanjia Gary Li (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1028?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yanjia Gary Li updated HUDI-1028: - Description: With "hoodie.embed.timeline.server" set to "true" as default in 0.5.3, I deployed a

[jira] [Updated] (HUDI-1028) Hudi write job stuck when start EmbeddedTimelineService failed

2020-06-17 Thread Yanjia Gary Li (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1028?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yanjia Gary Li updated HUDI-1028: - Summary: Hudi write job stuck when start EmbeddedTimelineService failed (was: Hudi write job

[jira] [Commented] (HUDI-1028) Hudi write job stuck when start EmbeddedTimelineService failed

2020-06-17 Thread Yanjia Gary Li (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17138910#comment-17138910 ] Yanjia Gary Li commented on HUDI-1028: -- Hi [~xleesf], have you seen similar things happened in your

[jira] [Updated] (HUDI-1028) Hudi write job stuck when create EmbeddedTimelineService failed

2020-06-17 Thread Yanjia Gary Li (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1028?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yanjia Gary Li updated HUDI-1028: - Description: With "hoodie.embed.timeline.server" set to "true" as default in 0.5.3, I deployed a

[jira] [Updated] (HUDI-1028) Hudi write job stuck when create EmbeddedTimelineService failed

2020-06-17 Thread Yanjia Gary Li (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1028?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yanjia Gary Li updated HUDI-1028: - Attachment: stack_trace.txt > Hudi write job stuck when create EmbeddedTimelineService failed >

[jira] [Updated] (HUDI-1028) Hudi write job stuck when create EmbeddedTimelineService failed

2020-06-17 Thread Yanjia Gary Li (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1028?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yanjia Gary Li updated HUDI-1028: - Status: Open (was: New) > Hudi write job stuck when create EmbeddedTimelineService failed >

[jira] [Created] (HUDI-1028) Hudi write job stuck when create EmbeddedTimelineService failed

2020-06-17 Thread Yanjia Gary Li (Jira)
Yanjia Gary Li created HUDI-1028: Summary: Hudi write job stuck when create EmbeddedTimelineService failed Key: HUDI-1028 URL: https://issues.apache.org/jira/browse/HUDI-1028 Project: Apache Hudi

[jira] [Commented] (HUDI-1018) Handle empty checkpoint better in delta streamer

2020-06-12 Thread Yanjia Gary Li (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17134533#comment-17134533 ] Yanjia Gary Li commented on HUDI-1018: -- [~Litianye] since we solve this ticket together with 

[jira] [Assigned] (HUDI-1018) Handle empty checkpoint better in delta streamer

2020-06-12 Thread Yanjia Gary Li (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yanjia Gary Li reassigned HUDI-1018: Assignee: Tianye Li > Handle empty checkpoint better in delta streamer >

[jira] [Updated] (HUDI-1018) Handle empty checkpoint better in delta streamer

2020-06-09 Thread Yanjia Gary Li (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yanjia Gary Li updated HUDI-1018: - Component/s: DeltaStreamer > Handle empty checkpoint better in delta streamer >

[jira] [Updated] (HUDI-1018) Handle empty checkpoint better in delta streamer

2020-06-09 Thread Yanjia Gary Li (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yanjia Gary Li updated HUDI-1018: - Status: Open (was: New) > Handle empty checkpoint better in delta streamer >

[jira] [Created] (HUDI-1018) Handle empty checkpoint better in delta streamer

2020-06-09 Thread Yanjia Gary Li (Jira)
Yanjia Gary Li created HUDI-1018: Summary: Handle empty checkpoint better in delta streamer Key: HUDI-1018 URL: https://issues.apache.org/jira/browse/HUDI-1018 Project: Apache Hudi Issue

[jira] [Closed] (HUDI-905) Support PrunedFilteredScan for Spark Datasource

2020-06-09 Thread Yanjia Gary Li (Jira)
[ https://issues.apache.org/jira/browse/HUDI-905?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yanjia Gary Li closed HUDI-905. --- Resolution: Not A Problem TableScan already supported filter and projection pushdown. > Support

[jira] [Updated] (HUDI-610) MOR table Impala read support

2020-06-09 Thread Yanjia Gary Li (Jira)
[ https://issues.apache.org/jira/browse/HUDI-610?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yanjia Gary Li updated HUDI-610: Summary: MOR table Impala read support (was: Impala nea real time table support) > MOR table

[jira] [Assigned] (HUDI-610) Impala nea real time table support

2020-06-09 Thread Yanjia Gary Li (Jira)
[ https://issues.apache.org/jira/browse/HUDI-610?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yanjia Gary Li reassigned HUDI-610: --- Assignee: (was: Yanjia Gary Li) > Impala nea real time table support >

[jira] [Resolved] (HUDI-494) [DEBUGGING] Huge amount of tasks when writing files into HDFS

2020-06-09 Thread Yanjia Gary Li (Jira)
[ https://issues.apache.org/jira/browse/HUDI-494?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yanjia Gary Li resolved HUDI-494. - Resolution: Fixed > [DEBUGGING] Huge amount of tasks when writing files into HDFS >

[jira] [Closed] (HUDI-494) [DEBUGGING] Huge amount of tasks when writing files into HDFS

2020-06-09 Thread Yanjia Gary Li (Jira)
[ https://issues.apache.org/jira/browse/HUDI-494?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yanjia Gary Li closed HUDI-494. --- > [DEBUGGING] Huge amount of tasks when writing files into HDFS >

[jira] [Resolved] (HUDI-822) Decouple hoodie related methods with Hoodie Input Formats

2020-06-09 Thread Yanjia Gary Li (Jira)
[ https://issues.apache.org/jira/browse/HUDI-822?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yanjia Gary Li resolved HUDI-822. - Resolution: Fixed > Decouple hoodie related methods with Hoodie Input Formats >

[jira] [Closed] (HUDI-822) Decouple hoodie related methods with Hoodie Input Formats

2020-06-09 Thread Yanjia Gary Li (Jira)
[ https://issues.apache.org/jira/browse/HUDI-822?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yanjia Gary Li closed HUDI-822. --- > Decouple hoodie related methods with Hoodie Input Formats >

[jira] [Updated] (HUDI-1011) Refactor hudi-client unit tests structure

2020-06-08 Thread Yanjia Gary Li (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1011?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yanjia Gary Li updated HUDI-1011: - Status: Open (was: New) > Refactor hudi-client unit tests structure >

[jira] [Updated] (HUDI-1011) Refactor hudi-client unit tests structure

2020-06-08 Thread Yanjia Gary Li (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1011?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yanjia Gary Li updated HUDI-1011: - Component/s: Testing > Refactor hudi-client unit tests structure >

[jira] [Updated] (HUDI-1010) Fix the memory leak for hudi-client unit tests

2020-06-08 Thread Yanjia Gary Li (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1010?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yanjia Gary Li updated HUDI-1010: - Status: Open (was: New) > Fix the memory leak for hudi-client unit tests >

[jira] [Updated] (HUDI-1010) Fix the memory leak for hudi-client unit tests

2020-06-08 Thread Yanjia Gary Li (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1010?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yanjia Gary Li updated HUDI-1010: - Component/s: Testing > Fix the memory leak for hudi-client unit tests >

[jira] [Updated] (HUDI-1011) Refactor hudi-client unit tests structure

2020-06-08 Thread Yanjia Gary Li (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1011?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yanjia Gary Li updated HUDI-1011: - Labels: help-wanted (was: ) > Refactor hudi-client unit tests structure >

[jira] [Created] (HUDI-1011) Refactor hudi-client unit tests structure

2020-06-08 Thread Yanjia Gary Li (Jira)
Yanjia Gary Li created HUDI-1011: Summary: Refactor hudi-client unit tests structure Key: HUDI-1011 URL: https://issues.apache.org/jira/browse/HUDI-1011 Project: Apache Hudi Issue Type:

[jira] [Updated] (HUDI-1010) Fix the memory leak for hudi-client unit tests

2020-06-08 Thread Yanjia Gary Li (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1010?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yanjia Gary Li updated HUDI-1010: - Description: hudi-client unit test has a memory leak, which could be some resources are not

[jira] [Updated] (HUDI-1010) Fix the memory leak for hudi-client unit tests

2020-06-08 Thread Yanjia Gary Li (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1010?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yanjia Gary Li updated HUDI-1010: - Description: hudi-client unit test has a memory leak, which could be some resources are not

[jira] [Updated] (HUDI-1010) Fix the memory leak for hudi-client unit tests

2020-06-08 Thread Yanjia Gary Li (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1010?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yanjia Gary Li updated HUDI-1010: - Labels: help-wanted (was: ) > Fix the memory leak for hudi-client unit tests >

[jira] [Created] (HUDI-1010) Fix the memory leak for hudi-client unit tests

2020-06-08 Thread Yanjia Gary Li (Jira)
Yanjia Gary Li created HUDI-1010: Summary: Fix the memory leak for hudi-client unit tests Key: HUDI-1010 URL: https://issues.apache.org/jira/browse/HUDI-1010 Project: Apache Hudi Issue Type:

[jira] [Resolved] (HUDI-773) Hudi On Azure Data Lake Storage V2

2020-05-27 Thread Yanjia Gary Li (Jira)
[ https://issues.apache.org/jira/browse/HUDI-773?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yanjia Gary Li resolved HUDI-773. - Resolution: Fixed Azure info was added to the docs. > Hudi On Azure Data Lake Storage V2 >

[jira] [Closed] (HUDI-773) Hudi On Azure Data Lake Storage V2

2020-05-27 Thread Yanjia Gary Li (Jira)
[ https://issues.apache.org/jira/browse/HUDI-773?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yanjia Gary Li closed HUDI-773. --- > Hudi On Azure Data Lake Storage V2 > -- > > Key:

[jira] [Resolved] (HUDI-804) Add Azure Support to Hudi Doc

2020-05-27 Thread Yanjia Gary Li (Jira)
[ https://issues.apache.org/jira/browse/HUDI-804?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yanjia Gary Li resolved HUDI-804. - Resolution: Fixed > Add Azure Support to Hudi Doc > - > >

[jira] [Closed] (HUDI-804) Add Azure Support to Hudi Doc

2020-05-27 Thread Yanjia Gary Li (Jira)
[ https://issues.apache.org/jira/browse/HUDI-804?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yanjia Gary Li closed HUDI-804. --- > Add Azure Support to Hudi Doc > - > > Key: HUDI-804 >

[jira] [Resolved] (HUDI-805) Verify which types of Azure storage support Hudi

2020-05-27 Thread Yanjia Gary Li (Jira)
[ https://issues.apache.org/jira/browse/HUDI-805?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yanjia Gary Li resolved HUDI-805. - Resolution: Fixed Azure Data Lake Storage Gen 2 and Azure Blob Storage support Hudi. > Verify

[jira] [Closed] (HUDI-805) Verify which types of Azure storage support Hudi

2020-05-27 Thread Yanjia Gary Li (Jira)
[ https://issues.apache.org/jira/browse/HUDI-805?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yanjia Gary Li closed HUDI-805. --- > Verify which types of Azure storage support Hudi > > >

[jira] [Updated] (HUDI-805) Verify which types of Azure storage support Hudi

2020-05-27 Thread Yanjia Gary Li (Jira)
[ https://issues.apache.org/jira/browse/HUDI-805?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yanjia Gary Li updated HUDI-805: Status: Open (was: New) > Verify which types of Azure storage support Hudi >

[jira] [Updated] (HUDI-805) Verify which types of Azure storage support Hudi

2020-05-27 Thread Yanjia Gary Li (Jira)
[ https://issues.apache.org/jira/browse/HUDI-805?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yanjia Gary Li updated HUDI-805: Status: In Progress (was: Open) > Verify which types of Azure storage support Hudi >

[jira] [Updated] (HUDI-804) Add Azure Support to Hudi Doc

2020-05-25 Thread Yanjia Gary Li (Jira)
[ https://issues.apache.org/jira/browse/HUDI-804?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yanjia Gary Li updated HUDI-804: Status: In Progress (was: Open) > Add Azure Support to Hudi Doc > - > >

[jira] [Updated] (HUDI-804) Add Azure Support to Hudi Doc

2020-05-25 Thread Yanjia Gary Li (Jira)
[ https://issues.apache.org/jira/browse/HUDI-804?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yanjia Gary Li updated HUDI-804: Status: Open (was: New) > Add Azure Support to Hudi Doc > - > >

[jira] [Commented] (HUDI-110) Better defaults for Partition extractor for Spark DataSOurce and DeltaStreamer

2020-05-23 Thread Yanjia Gary Li (Jira)
[ https://issues.apache.org/jira/browse/HUDI-110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17114980#comment-17114980 ] Yanjia Gary Li commented on HUDI-110: - [~shivnarayan] no, the PR is not related to this ticket. This

[jira] [Updated] (HUDI-110) Better defaults for Partition extractor for Spark DataSOurce and DeltaStreamer

2020-05-23 Thread Yanjia Gary Li (Jira)
[ https://issues.apache.org/jira/browse/HUDI-110?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yanjia Gary Li updated HUDI-110: Labels: (was: bug-bash-0.6.0 pull-request-available) > Better defaults for Partition extractor for

[jira] [Assigned] (HUDI-494) [DEBUGGING] Huge amount of tasks when writing files into HDFS

2020-05-23 Thread Yanjia Gary Li (Jira)
[ https://issues.apache.org/jira/browse/HUDI-494?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yanjia Gary Li reassigned HUDI-494: --- Assignee: Yanjia Gary Li (was: lamber-ken) > [DEBUGGING] Huge amount of tasks when writing

[jira] [Commented] (HUDI-494) [DEBUGGING] Huge amount of tasks when writing files into HDFS

2020-05-23 Thread Yanjia Gary Li (Jira)
[ https://issues.apache.org/jira/browse/HUDI-494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17114967#comment-17114967 ] Yanjia Gary Li commented on HUDI-494: - [~shivnarayan] this is still under review.

[jira] [Updated] (HUDI-920) Incremental view on MOR table using Spark Datasource

2020-05-22 Thread Yanjia Gary Li (Jira)
[ https://issues.apache.org/jira/browse/HUDI-920?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yanjia Gary Li updated HUDI-920: Fix Version/s: 0.6.0 > Incremental view on MOR table using Spark Datasource >

[jira] [Updated] (HUDI-920) Incremental view on MOR table using Spark Datasource

2020-05-22 Thread Yanjia Gary Li (Jira)
[ https://issues.apache.org/jira/browse/HUDI-920?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yanjia Gary Li updated HUDI-920: Status: Open (was: New) > Incremental view on MOR table using Spark Datasource >

[jira] [Created] (HUDI-920) Incremental view on MOR table using Spark Datasource

2020-05-22 Thread Yanjia Gary Li (Jira)
Yanjia Gary Li created HUDI-920: --- Summary: Incremental view on MOR table using Spark Datasource Key: HUDI-920 URL: https://issues.apache.org/jira/browse/HUDI-920 Project: Apache Hudi (incubating)

[jira] [Assigned] (HUDI-905) Support PrunedFilteredScan for Spark Datasource

2020-05-21 Thread Yanjia Gary Li (Jira)
[ https://issues.apache.org/jira/browse/HUDI-905?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yanjia Gary Li reassigned HUDI-905: --- Assignee: Yanjia Gary Li > Support PrunedFilteredScan for Spark Datasource >

[jira] [Updated] (HUDI-905) Support PrunedFilteredScan for Spark Datasource

2020-05-20 Thread Yanjia Gary Li (Jira)
[ https://issues.apache.org/jira/browse/HUDI-905?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yanjia Gary Li updated HUDI-905: Status: Open (was: New) > Support PrunedFilteredScan for Spark Datasource >

[jira] [Updated] (HUDI-905) Support PrunedFilteredScan for Spark Datasource

2020-05-20 Thread Yanjia Gary Li (Jira)
[ https://issues.apache.org/jira/browse/HUDI-905?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yanjia Gary Li updated HUDI-905: Component/s: Spark Integration > Support PrunedFilteredScan for Spark Datasource >

[jira] [Updated] (HUDI-905) Support PrunedFilteredScan for Spark Datasource

2020-05-20 Thread Yanjia Gary Li (Jira)
[ https://issues.apache.org/jira/browse/HUDI-905?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yanjia Gary Li updated HUDI-905: Priority: Minor (was: Major) > Support PrunedFilteredScan for Spark Datasource >

[jira] [Updated] (HUDI-905) Support PrunedFilteredScan for Spark Datasource

2020-05-20 Thread Yanjia Gary Li (Jira)
[ https://issues.apache.org/jira/browse/HUDI-905?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yanjia Gary Li updated HUDI-905: Description: Hudi Spark Datasource incremental view currently is using 

[jira] [Updated] (HUDI-905) Support PrunedFilteredScan for Spark Datasource

2020-05-20 Thread Yanjia Gary Li (Jira)
[ https://issues.apache.org/jira/browse/HUDI-905?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yanjia Gary Li updated HUDI-905: Summary: Support PrunedFilteredScan for Spark Datasource (was: Support native filter pushdown for

[jira] [Assigned] (HUDI-905) Support PrunedFilteredScan for Spark Datasource

2020-05-20 Thread Yanjia Gary Li (Jira)
[ https://issues.apache.org/jira/browse/HUDI-905?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yanjia Gary Li reassigned HUDI-905: --- Assignee: (was: Yanjia Gary Li) > Support PrunedFilteredScan for Spark Datasource >

[jira] [Assigned] (HUDI-30) Explore support for Spark Datasource V2

2020-05-20 Thread Yanjia Gary Li (Jira)
[ https://issues.apache.org/jira/browse/HUDI-30?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yanjia Gary Li reassigned HUDI-30: -- Assignee: (was: Yanjia Gary Li) > Explore support for Spark Datasource V2 >

[jira] [Commented] (HUDI-110) Better defaults for Partition extractor for Spark DataSOurce and DeltaStreamer

2020-05-19 Thread Yanjia Gary Li (Jira)
[ https://issues.apache.org/jira/browse/HUDI-110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17111795#comment-17111795 ] Yanjia Gary Li commented on HUDI-110: - IIUC, this ticket is trying to extract the partition info from

[jira] [Created] (HUDI-905) Support native filter pushdown for Spark Datasource

2020-05-17 Thread Yanjia Gary Li (Jira)
Yanjia Gary Li created HUDI-905: --- Summary: Support native filter pushdown for Spark Datasource Key: HUDI-905 URL: https://issues.apache.org/jira/browse/HUDI-905 Project: Apache Hudi (incubating)

[jira] [Commented] (HUDI-890) Prepare for 0.5.3 patch release

2020-05-17 Thread Yanjia Gary Li (Jira)
[ https://issues.apache.org/jira/browse/HUDI-890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17109805#comment-17109805 ] Yanjia Gary Li commented on HUDI-890: - Hi [~bhavanisudha] , #1602 HUDI-494 fix incorrect record size

[jira] [Updated] (HUDI-494) [DEBUGGING] Huge amount of tasks when writing files into HDFS

2020-05-17 Thread Yanjia Gary Li (Jira)
[ https://issues.apache.org/jira/browse/HUDI-494?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yanjia Gary Li updated HUDI-494: Fix Version/s: (was: 0.5.3) > [DEBUGGING] Huge amount of tasks when writing files into HDFS >

[jira] [Updated] (HUDI-110) Better defaults for Partition extractor for Spark DataSOurce and DeltaStreamer

2020-05-17 Thread Yanjia Gary Li (Jira)
[ https://issues.apache.org/jira/browse/HUDI-110?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yanjia Gary Li updated HUDI-110: Status: In Progress (was: Open) > Better defaults for Partition extractor for Spark DataSOurce and

[jira] [Resolved] (HUDI-528) Incremental Pull fails when latest commit is empty

2020-05-15 Thread Yanjia Gary Li (Jira)
[ https://issues.apache.org/jira/browse/HUDI-528?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yanjia Gary Li resolved HUDI-528. - Resolution: Fixed > Incremental Pull fails when latest commit is empty >

[jira] [Assigned] (HUDI-318) Update Migration Guide to Include Delta Streamer

2020-05-12 Thread Yanjia Gary Li (Jira)
[ https://issues.apache.org/jira/browse/HUDI-318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yanjia Gary Li reassigned HUDI-318: --- Assignee: (was: Yanjia Gary Li) > Update Migration Guide to Include Delta Streamer >

[jira] [Updated] (HUDI-494) [DEBUGGING] Huge amount of tasks when writing files into HDFS

2020-05-12 Thread Yanjia Gary Li (Jira)
[ https://issues.apache.org/jira/browse/HUDI-494?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yanjia Gary Li updated HUDI-494: Fix Version/s: 0.5.3 > [DEBUGGING] Huge amount of tasks when writing files into HDFS >

[jira] [Updated] (HUDI-528) Incremental Pull fails when latest commit is empty

2020-05-12 Thread Yanjia Gary Li (Jira)
[ https://issues.apache.org/jira/browse/HUDI-528?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yanjia Gary Li updated HUDI-528: Fix Version/s: 0.5.3 > Incremental Pull fails when latest commit is empty >

[jira] [Updated] (HUDI-528) Incremental Pull fails when latest commit is empty

2020-05-10 Thread Yanjia Gary Li (Jira)
[ https://issues.apache.org/jira/browse/HUDI-528?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yanjia Gary Li updated HUDI-528: Status: In Progress (was: Open) > Incremental Pull fails when latest commit is empty >

[jira] [Assigned] (HUDI-528) Incremental Pull fails when latest commit is empty

2020-05-10 Thread Yanjia Gary Li (Jira)
[ https://issues.apache.org/jira/browse/HUDI-528?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yanjia Gary Li reassigned HUDI-528: --- Assignee: Yanjia Gary Li > Incremental Pull fails when latest commit is empty >

[jira] [Updated] (HUDI-494) [DEBUGGING] Huge amount of tasks when writing files into HDFS

2020-05-10 Thread Yanjia Gary Li (Jira)
[ https://issues.apache.org/jira/browse/HUDI-494?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yanjia Gary Li updated HUDI-494: Status: In Progress (was: Open) > [DEBUGGING] Huge amount of tasks when writing files into HDFS >

[jira] [Comment Edited] (HUDI-494) [DEBUGGING] Huge amount of tasks when writing files into HDFS

2020-05-07 Thread Yanjia Gary Li (Jira)
[ https://issues.apache.org/jira/browse/HUDI-494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17101055#comment-17101055 ] Yanjia Gary Li edited comment on HUDI-494 at 5/8/20, 1:38 AM: -- -Ok, I see what

[jira] [Commented] (HUDI-494) [DEBUGGING] Huge amount of tasks when writing files into HDFS

2020-05-06 Thread Yanjia Gary Li (Jira)
[ https://issues.apache.org/jira/browse/HUDI-494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17101207#comment-17101207 ] Yanjia Gary Li commented on HUDI-494: -   Commit 1: {code:java} "partitionToWriteStats" : {

[jira] [Commented] (HUDI-494) [DEBUGGING] Huge amount of tasks when writing files into HDFS

2020-05-06 Thread Yanjia Gary Li (Jira)
[ https://issues.apache.org/jira/browse/HUDI-494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17101055#comment-17101055 ] Yanjia Gary Li commented on HUDI-494: - Ok, I see what happened here. Root cause is 

[jira] [Updated] (HUDI-494) [DEBUGGING] Huge amount of tasks when writing files into HDFS

2020-05-06 Thread Yanjia Gary Li (Jira)
[ https://issues.apache.org/jira/browse/HUDI-494?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yanjia Gary Li updated HUDI-494: Status: Open (was: New) > [DEBUGGING] Huge amount of tasks when writing files into HDFS >

[jira] [Assigned] (HUDI-494) [DEBUGGING] Huge amount of tasks when writing files into HDFS

2020-05-06 Thread Yanjia Gary Li (Jira)
[ https://issues.apache.org/jira/browse/HUDI-494?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yanjia Gary Li reassigned HUDI-494: --- Assignee: Yanjia Gary Li (was: Vinoth Chandar) > [DEBUGGING] Huge amount of tasks when

[jira] [Commented] (HUDI-494) [DEBUGGING] Huge amount of tasks when writing files into HDFS

2020-05-06 Thread Yanjia Gary Li (Jira)
[ https://issues.apache.org/jira/browse/HUDI-494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17100967#comment-17100967 ] Yanjia Gary Li commented on HUDI-494: - Hi folks, this issue seems coming back again...

[jira] [Updated] (HUDI-494) [DEBUGGING] Huge amount of tasks when writing files into HDFS

2020-05-06 Thread Yanjia Gary Li (Jira)
[ https://issues.apache.org/jira/browse/HUDI-494?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yanjia Gary Li updated HUDI-494: Attachment: example2_hdfs.png > [DEBUGGING] Huge amount of tasks when writing files into HDFS >

[jira] [Updated] (HUDI-494) [DEBUGGING] Huge amount of tasks when writing files into HDFS

2020-05-06 Thread Yanjia Gary Li (Jira)
[ https://issues.apache.org/jira/browse/HUDI-494?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yanjia Gary Li updated HUDI-494: Attachment: example2_sparkui.png > [DEBUGGING] Huge amount of tasks when writing files into HDFS >
