[jira] [Updated] (HUDI-426) Implement Spark DataSource Support for querying bootstrapped tables

2020-08-04 Thread Vinoth Chandar (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-426?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinoth Chandar updated HUDI-426:

Status: Closed  (was: Patch Available)

> Implement Spark DataSource Support for querying bootstrapped tables
> ---
>
> Key: HUDI-426
> URL: https://issues.apache.org/jira/browse/HUDI-426
> Project: Apache Hudi
> Issue Type: Sub-task
> Components: Spark Integration
> Reporter: Balaji Varadarajan
> Assignee: Udit Mehrotra
> Priority: Blocker
> Labels: pull-request-available
> Fix For: 0.6.0
>
> Time Spent: 10m
> Remaining Estimate: 0h
>
> We need the ability in SparkDataSource to query a COW table that is
> bootstrapped as per
> [https://cwiki.apache.org/confluence/display/HUDI/RFC+-+12+:+Efficient+Migration+of+Large+Parquet+Tables+to+Apache+Hudi#RFC-12:EfficientMigrationofLargeParquetTablestoApacheHudi-BootstrapIndex:]
>
> The current implementation delegates to the Parquet DataSource, but this
> won't work because we need the ability to stitch the columns externally.
>  
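
To make "stitching the columns externally" concrete, here is a minimal sketch in plain Python. It assumes, following the RFC linked above, that a bootstrapped file is read as two row streams in the same order: a skeleton stream carrying only the Hudi metadata columns and a source stream carrying the original Parquet data columns. The function name `stitch_rows` and the dict-based row representation are hypothetical and chosen for illustration; this is not Hudi's or Spark's API.

```python
from typing import Dict, Iterable, Iterator

# Hudi's per-record metadata column names.
HOODIE_META_COLUMNS = [
    "_hoodie_commit_time",
    "_hoodie_commit_seqno",
    "_hoodie_record_key",
    "_hoodie_partition_path",
    "_hoodie_file_name",
]

Row = Dict[str, object]


def stitch_rows(skeleton_rows: Iterable[Row], source_rows: Iterable[Row]) -> Iterator[Row]:
    """Hypothetical helper: merge the i-th skeleton row with the i-th source row.

    Assumes both streams list the same records in the same order, so the
    merge is purely positional and needs no join key.
    """
    skeleton_iter = iter(skeleton_rows)
    source_iter = iter(source_rows)
    missing = object()  # sentinel marking an exhausted stream
    while True:
        meta = next(skeleton_iter, missing)
        data = next(source_iter, missing)
        if meta is missing and data is missing:
            return  # both streams ended together: stitching is complete
        if meta is missing or data is missing:
            raise ValueError("skeleton and source files have different row counts")
        # Metadata columns first, then the original data columns.
        stitched: Row = {name: meta[name] for name in HOODIE_META_COLUMNS}
        stitched.update(data)
        yield stitched
```

A plain Parquet read sees only one of the two files, which is why delegating to the Parquet DataSource cannot produce the combined row on its own.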



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HUDI-426) Implement Spark DataSource Support for querying bootstrapped tables

2020-07-06 Thread Balaji Varadarajan (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-426?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Balaji Varadarajan updated HUDI-426:

Priority: Blocker  (was: Major)

> Implement Spark DataSource Support for querying bootstrapped tables
> ---
>
> Key: HUDI-426
> URL: https://issues.apache.org/jira/browse/HUDI-426
> Project: Apache Hudi
> Issue Type: Sub-task
> Components: Spark Integration
> Reporter: Balaji Varadarajan
> Assignee: Udit Mehrotra
> Priority: Blocker
> Labels: pull-request-available
> Fix For: 0.6.0
>
> Time Spent: 10m
> Remaining Estimate: 0h
>
> We need the ability in SparkDataSource to query a COW table that is
> bootstrapped as per
> [https://cwiki.apache.org/confluence/display/HUDI/RFC+-+12+:+Efficient+Migration+of+Large+Parquet+Tables+to+Apache+Hudi#RFC-12:EfficientMigrationofLargeParquetTablestoApacheHudi-BootstrapIndex:]
>
> The current implementation delegates to the Parquet DataSource, but this
> won't work because we need the ability to stitch the columns externally.
>  





[jira] [Updated] (HUDI-426) Implement Spark DataSource Support for querying bootstrapped tables

2020-05-25 Thread Balaji Varadarajan (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-426?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Balaji Varadarajan updated HUDI-426:

Status: Patch Available  (was: In Progress)

> Implement Spark DataSource Support for querying bootstrapped tables
> ---
>
> Key: HUDI-426
> URL: https://issues.apache.org/jira/browse/HUDI-426
> Project: Apache Hudi
> Issue Type: Sub-task
> Components: Spark Integration
> Reporter: Balaji Varadarajan
> Assignee: Udit Mehrotra
> Priority: Major
> Labels: pull-request-available
> Fix For: 0.6.0
>
> Time Spent: 10m
> Remaining Estimate: 0h
>
> We need the ability in SparkDataSource to query a COW table that is
> bootstrapped as per
> [https://cwiki.apache.org/confluence/display/HUDI/RFC+-+12+:+Efficient+Migration+of+Large+Parquet+Tables+to+Apache+Hudi#RFC-12:EfficientMigrationofLargeParquetTablestoApacheHudi-BootstrapIndex:]
>
> The current implementation delegates to the Parquet DataSource, but this
> won't work because we need the ability to stitch the columns externally.
>  





[jira] [Updated] (HUDI-426) Implement Spark DataSource Support for querying bootstrapped tables

2020-05-25 Thread Balaji Varadarajan (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-426?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Balaji Varadarajan updated HUDI-426:

Status: In Progress  (was: Open)

> Implement Spark DataSource Support for querying bootstrapped tables
> ---
>
> Key: HUDI-426
> URL: https://issues.apache.org/jira/browse/HUDI-426
> Project: Apache Hudi
> Issue Type: Sub-task
> Components: Spark Integration
> Reporter: Balaji Varadarajan
> Assignee: Udit Mehrotra
> Priority: Major
> Labels: pull-request-available
> Fix For: 0.6.0
>
> Time Spent: 10m
> Remaining Estimate: 0h
>
> We need the ability in SparkDataSource to query a COW table that is
> bootstrapped as per
> [https://cwiki.apache.org/confluence/display/HUDI/RFC+-+12+:+Efficient+Migration+of+Large+Parquet+Tables+to+Apache+Hudi#RFC-12:EfficientMigrationofLargeParquetTablestoApacheHudi-BootstrapIndex:]
>
> The current implementation delegates to the Parquet DataSource, but this
> won't work because we need the ability to stitch the columns externally.
>  





[jira] [Updated] (HUDI-426) Implement Spark DataSource Support for querying bootstrapped tables

2020-03-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-426?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HUDI-426:

Labels: pull-request-available  (was: )

> Implement Spark DataSource Support for querying bootstrapped tables
> ---
>
> Key: HUDI-426
> URL: https://issues.apache.org/jira/browse/HUDI-426
> Project: Apache Hudi (incubating)
> Issue Type: Sub-task
> Components: Spark Integration
> Reporter: Balaji Varadarajan
> Assignee: Nicholas Jiang
> Priority: Major
> Labels: pull-request-available
> Fix For: 0.6.0
>
>
> We need the ability in SparkDataSource to query a COW table that is
> bootstrapped as per
> [https://cwiki.apache.org/confluence/display/HUDI/RFC+-+12+:+Efficient+Migration+of+Large+Parquet+Tables+to+Apache+Hudi#RFC-12:EfficientMigrationofLargeParquetTablestoApacheHudi-BootstrapIndex:]
>
> The current implementation delegates to the Parquet DataSource, but this
> won't work because we need the ability to stitch the columns externally.
>  





[jira] [Updated] (HUDI-426) Implement Spark DataSource Support for querying bootstrapped tables

2020-01-08 Thread leesf (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-426?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

leesf updated HUDI-426:

Fix Version/s: 0.6.0  (was: 0.5.1)

> Implement Spark DataSource Support for querying bootstrapped tables
> ---
>
> Key: HUDI-426
> URL: https://issues.apache.org/jira/browse/HUDI-426
> Project: Apache Hudi (incubating)
> Issue Type: Sub-task
> Components: Spark Integration
> Reporter: Balaji Varadarajan
> Assignee: Nicholas Jiang
> Priority: Major
> Fix For: 0.6.0
>
>
> We need the ability in SparkDataSource to query a COW table that is
> bootstrapped as per
> [https://cwiki.apache.org/confluence/display/HUDI/RFC+-+12+:+Efficient+Migration+of+Large+Parquet+Tables+to+Apache+Hudi#RFC-12:EfficientMigrationofLargeParquetTablestoApacheHudi-BootstrapIndex:]
>
> The current implementation delegates to the Parquet DataSource, but this
> won't work because we need the ability to stitch the columns externally.
>  


