[jira] [Updated] (HUDI-5196) For spark with version greater than 3.2+, the query of hudi table using spark sql supports reading parameter configuration.

scx (Jira) Thu, 10 Nov 2022 23:01:06 -0800


     [ 
https://issues.apache.org/jira/browse/HUDI-5196?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


scx updated HUDI-5196:
----------------------
    Description: 
{code:java}
Previously, when reading a Hudi table with Spark SQL, we could only query the 
rt table or the ro table, or use TIMESTAMP AS OF for time travel. However, 
custom read options, such as those needed for incremental reads, could not 
be set.

With this PR, read options can be passed directly in the query:

// query snapshot table
select id, name, price, ts from $tableName1 
['hoodie.datasource.query.type'=>'snapshot','as.of.instant'=>'$instant1'] 
// query incremental table
select id, name, price, ts from $tableName1
['hoodie.datasource.query.type'=>'incremental',
 'hoodie.datasource.read.begin.instanttime'=>'$instant1',
 'hoodie.datasource.read.end.instanttime'=>'$instant2']
// query read_optimized table
select id, name, price, ts from $tableName1 
['hoodie.datasource.query.type'=>'read_optimized']


{code}

  was:
{code:java}
// query snapshot table
select id, name, price, ts from $tableName1 
['hoodie.datasource.query.type'=>'snapshot','as.of.instant'=>'$instant1'] 
// query incremental table
select id, name, price, ts from $tableName1
   |[
   |'hoodie.datasource.query.type'=>'incremental',
   |'hoodie.datasource.read.begin.instanttime'=>'$instant1',
   |'hoodie.datasource.read.end.instanttime'=>'$instant2'
   |]
// query read_optimized table
select id, name, price, ts from $tableName1 
['hoodie.datasource.query.type'=>'read_optimized']{code}


> For spark with version greater than 3.2+, the query of hudi table using spark 
> sql supports reading parameter configuration.
> ---------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HUDI-5196
>                 URL: https://issues.apache.org/jira/browse/HUDI-5196
>             Project: Apache Hudi
>          Issue Type: New Feature
>            Reporter: scx
>            Priority: Major
>              Labels: pull-request-available
>
> {code:java}
> Previously, when reading a Hudi table with Spark SQL, we could only query the 
> rt table or the ro table, or use TIMESTAMP AS OF for time travel. 
> However, custom read options, such as those needed for incremental reads, 
> could not be set.
> With this PR, read options can be passed directly in the query:
> // query snapshot table
> select id, name, price, ts from $tableName1 
> ['hoodie.datasource.query.type'=>'snapshot','as.of.instant'=>'$instant1'] 
> // query incremental table
> select id, name, price, ts from $tableName1
> ['hoodie.datasource.query.type'=>'incremental',
>  'hoodie.datasource.read.begin.instanttime'=>'$instant1',
>  'hoodie.datasource.read.end.instanttime'=>'$instant2']
> // query read_optimized table
> select id, name, price, ts from $tableName1 
> ['hoodie.datasource.query.type'=>'read_optimized']
> {code}
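
The bracketed option list in the queries above is a set of 'key'=>'value' pairs placed after the table reference. As a rough illustration only, the sketch below assembles such a query string from a dictionary of read options. The helper name with_read_options, the table name hudi_tbl, and the instant values are hypothetical and are not part of Hudi or Spark; the option keys are the ones quoted in the description.

```python
def with_read_options(select_sql: str, options: dict) -> str:
    """Append a ['key'=>'value',...] read-option list to a SELECT statement.

    Hypothetical helper for illustration; it does no escaping, so keys and
    values must not contain single quotes.
    """
    if not options:
        # No options: return the query unchanged.
        return select_sql
    pairs = ",".join(f"'{key}'=>'{value}'" for key, value in options.items())
    return f"{select_sql} [{pairs}]"


# Hypothetical table name and instant times, for illustration only.
incremental_query = with_read_options(
    "select id, name, price, ts from hudi_tbl",
    {
        "hoodie.datasource.query.type": "incremental",
        "hoodie.datasource.read.begin.instanttime": "20221110000000",
        "hoodie.datasource.read.end.instanttime": "20221110230000",
    },
)
print(incremental_query)
```

Building the clause from a dictionary keeps the option keys in one place, which may help when the same query is issued in snapshot, incremental, and read_optimized modes.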



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
