[jira] [Commented] (IMPALA-8778) Support read/write Apache Hudi tables

2020-02-11 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-8778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17034536#comment-17034536
 ] 

ASF subversion and git services commented on IMPALA-8778:
-

Commit ea0e1def6160d596082b01365fcbbb6e24afb21d in impala's branch 
refs/heads/master from Yanjia Li
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=ea0e1de ]

IMPALA-8778: Support Apache Hudi Read Optimized Table

A Hudi Read Optimized table contains multiple versions of its Parquet files.
To load the table correctly, Impala needs to recognize a Hudi Read Optimized
table as an HdfsTable and load only the latest version of each file, using
HoodieROTablePathFilter.

Tests
 - Unit test for Hudi in FileMetadataLoader
 - Create table tests in functional_schema_template.sql
 - Query tests in hudi-parquet.test

Change-Id: I65e146b347714df32fe968409ef2dde1f6a25cdf
Reviewed-on: http://gerrit.cloudera.org:8080/14711
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 


> Support read/write Apache Hudi tables
> -
>
> Key: IMPALA-8778
> URL: https://issues.apache.org/jira/browse/IMPALA-8778
> Project: IMPALA
>  Issue Type: New Feature
> Reporter: Yuanbin Cheng
> Assignee: Yanjia Gary Li
> Priority: Major
>
> Apache Impala currently does not support Apache Hudi and cannot even pull 
> metadata from Hive.
> Related issue: 
> [https://github.com/apache/incubator-hudi/issues/179] 
> [https://issues.apache.org/jira/projects/HUDI/issues/HUDI-146|https://issues.apache.org/jira/projects/HUDI/issues/HUDI-146?filter=allopenissues]
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Commented] (IMPALA-8778) Support read/write Apache Hudi tables

2020-01-17 Thread Vinoth Chandar (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-8778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17017842#comment-17017842
 ] 

Vinoth Chandar commented on IMPALA-8778:


Great to see this making progress!! 




[jira] [Commented] (IMPALA-8778) Support read/write Apache Hudi tables

2020-01-14 Thread Yanjia Gary Li (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-8778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17015657#comment-17015657
 ] 

Yanjia Gary Li commented on IMPALA-8778:


Hello, this PR is ready to review!




[jira] [Commented] (IMPALA-8778) Support read/write Apache Hudi tables

2020-01-02 Thread Zoltán Borók-Nagy (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-8778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17006969#comment-17006969
 ] 

Zoltán Borók-Nagy commented on IMPALA-8778:
---

[~garyli1019] you might want to take a look at 
testdata/bin/create-load-data.sh, probably you'll need a function similar to 
'load-custom-schemas()'. This will upload your data files to the test-warehouse 
directory.

You also need to create the tables in the Hive Metastore. You probably want to 
do that as part of the data loading, in that case you'll need to invoke those 
CREATE TABLE statements from create-load-data.sh. Alternatively you can also 
create the tables during test execution.

I looked at the output of the Jenkins job. It failed during the RAT check, 
which means some of the new files are missing license headers. You can either 
add license headers to those files or, more likely, include them in the RAT 
exclude list:

[https://github.com/apache/impala/blob/master/bin/rat_exclude_files.txt]




[jira] [Commented] (IMPALA-8778) Support read/write Apache Hudi tables

2019-12-19 Thread Yanjia Gary Li (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-8778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17000559#comment-17000559
 ] 

Yanjia Gary Li commented on IMPALA-8778:


[~boroknagyz] I just included the unit test in the PR. 

I have had some issues on my VM when creating all the mini-clusters and 
loading test data, so I manually copied the folder /testdata/data/hudicow to 
HDFS /test-warehouse/hudicow. I'm not sure if this is the right path when 
running the automated script.

Is there a script that only handles copying files into test-warehouse? Not all 
of the mini-clusters are working, but my HDFS is. 

It looks like Jenkins doesn't like the test data. Should I add it in a 
different way? 




[jira] [Commented] (IMPALA-8778) Support read/write Apache Hudi tables

2019-12-18 Thread Yanjia Gary Li (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-8778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16999469#comment-16999469
 ] 

Yanjia Gary Li commented on IMPALA-8778:


[~boroknagyz] that's very helpful, thanks!




[jira] [Commented] (IMPALA-8778) Support read/write Apache Hudi tables

2019-12-11 Thread Zoltán Borók-Nagy (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-8778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16993598#comment-16993598
 ] 

Zoltán Borók-Nagy commented on IMPALA-8778:
---

Hi Yanjia,

Sorry for the late answer, I wasn't watching this Jira.

We usually keep these kinds of things under the testdata/ directory.

E.g. under testdata/data there is a bunch of files written in different file 
formats. During data load or tests we copy these files to HDFS under 
/test-warehouse/ so the tests can see them.

If you need more complex things than copying files, e.g. if you need to write 
some utility program in Java, then you probably want to create your Java 
application under testdata/. An example of that is testdata/TableFlattener.

I hope this answer helped.




[jira] [Commented] (IMPALA-8778) Support read/write Apache Hudi tables

2019-11-21 Thread Yanjia Gary Li (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-8778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16979693#comment-16979693
 ] 

Yanjia Gary Li commented on IMPALA-8778:


Thanks for all the feedback! It will definitely be very interesting to add 
real-time support in the future. I will focus on setting up the testing 
environment for now. 

My idea for the testing environment is to add an independent folder to the 
HDFS test-warehouse during the test data preparation stage. Then I can either 
test FileMetadataLoader directly or send a complete Impala query. 

For writing the data, I can use the test-jar provided by Hudi. That way we can 
create a real-time data source later, but I would have to create a new Java 
module to write the test data into the HDFS mini-cluster. 

So where is the proper location for this module? Or is there a better way you 
would recommend?

I will be on vacation for the next few weeks, so apologies for the delay. 




[jira] [Commented] (IMPALA-8778) Support read/write Apache Hudi tables

2019-11-18 Thread Vinoth Chandar (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-8778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16977099#comment-16977099
 ] 

Vinoth Chandar commented on IMPALA-8778:


[~garyli1019] there is merge logic involved that is pretty custom to Hudi, so 
we probably still want to work with the Hudi input formats per se. We can 
tackle this down the line, once we get this really working :)




[jira] [Commented] (IMPALA-8778) Support read/write Apache Hudi tables

2019-11-18 Thread Tim Armstrong (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-8778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16977018#comment-16977018
 ] 

Tim Armstrong commented on IMPALA-8778:
---

Yeah, an HDFS table can have a mix of input formats. The HdfsScanNode 
handles multiple file formats just fine.




[jira] [Commented] (IMPALA-8778) Support read/write Apache Hudi tables

2019-11-18 Thread Yanjia Gary Li (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-8778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16976968#comment-16976968
 ] 

Yanjia Gary Li commented on IMPALA-8778:


[~vinoth] Makes sense to me. I think Real Time table support could also be 
added without changing anything in the backend, if the frontend could combine 
Avro + Parquet into the HdfsTable (based on the code I read, but I'm not 100% 
sure).




[jira] [Commented] (IMPALA-8778) Support read/write Apache Hudi tables

2019-11-18 Thread Vinoth Chandar (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-8778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16976786#comment-16976786
 ] 

Vinoth Chandar commented on IMPALA-8778:


Took a pass at the patch. One suggestion: maybe name the format 
`HUDI_PARQUET` so it's clearer? We could eventually do RT tables, and even ORC 
once we have it. Otherwise, the way you used pathFilters looks good to me.




[jira] [Commented] (IMPALA-8778) Support read/write Apache Hudi tables

2019-11-15 Thread Yanjia Gary Li (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-8778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16975336#comment-16975336
 ] 

Yanjia Gary Li commented on IMPALA-8778:


Done. Same URL: [https://gerrit.cloudera.org/#/c/14711/]




[jira] [Commented] (IMPALA-8778) Support read/write Apache Hudi tables

2019-11-15 Thread Tim Armstrong (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-8778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16975266#comment-16975266
 ] 

Tim Armstrong commented on IMPALA-8778:
---

[~garyli1019] drafts in Gerrit are only visible to the reviewers list. Can you 
publish it? Include "WIP" in the commit message so we know it's a work in 
progress.




[jira] [Commented] (IMPALA-8778) Support read/write Apache Hudi tables

2019-11-14 Thread Yanjia Gary Li (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-8778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16974711#comment-16974711
 ] 

Yanjia Gary Li commented on IMPALA-8778:


Hi guys, I made a draft: [https://gerrit.cloudera.org/#/c/14711/]. Would you 
take a look to see if my approach makes sense? I will implement the tests 
once we agree on this approach. 

From my understanding, HdfsTable handles partitions itself, so we cannot 
directly use the HoodieParquetInputFormat class. However, we can use 
HoodieROTablePathFilter to filter the FileStatus entries when Impala loads 
each partition. 
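
A rough sketch of that design, written in Python for brevity: the table loader keeps walking the partitions itself and only delegates the keep-or-skip decision for each file to a filter. `load_partitions` and `make_filter` are hypothetical names, not Impala or Hudi APIs, and the stand-in filter is simply handed the set of current files, whereas a real filter would have to derive it from Hudi's metadata.

```python
from typing import Callable, Dict, List, Set

# A path filter is any callable that says whether a file path should be kept.
PathFilter = Callable[[str], bool]


def load_partitions(listing: Dict[str, List[str]],
                    accept: PathFilter) -> Dict[str, List[str]]:
    """Walk every partition and keep only the files the filter accepts."""
    return {
        partition: [name for name in files if accept(f"{partition}/{name}")]
        for partition, files in listing.items()
    }


def make_filter(current_files: Set[str]) -> PathFilter:
    """Stand-in filter (assumption): accept a path only if it is in a known set."""
    return lambda path: path in current_files


# Directory listing of one partition, including an outdated file version.
listing = {
    "year=2019/month=10/day=1": [
        "file1_v1.parquet", "file1_v2.parquet", "file2_v1.parquet",
    ],
}
current = {
    "year=2019/month=10/day=1/file1_v2.parquet",
    "year=2019/month=10/day=1/file2_v1.parquet",
}

loaded = load_partitions(listing, make_filter(current))
print(loaded)
```

Keeping the traversal in the loader and the decision in the filter means the partition handling does not need to change when the filtering rule does.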




[jira] [Commented] (IMPALA-8778) Support read/write Apache Hudi tables

2019-11-07 Thread Yanjia Gary Li (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-8778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16969633#comment-16969633
 ] 

Yanjia Gary Li commented on IMPALA-8778:


Thanks Tim and Vinoth. I will follow the first path then. 




[jira] [Commented] (IMPALA-8778) Support read/write Apache Hudi tables

2019-11-06 Thread Vinoth Chandar (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-8778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16968698#comment-16968698
 ] 

Vinoth Chandar commented on IMPALA-8778:


> I think you need logic in Impala that understands slices and only uses the 
> latest slice when querying a partition.

+1. In Hive/Spark/Presto, we make the query call HoodieInputFormat to do this. 




[jira] [Commented] (IMPALA-8778) Support read/write Apache Hudi tables

2019-11-06 Thread Tim Armstrong (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-8778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16968581#comment-16968581
 ] 

Tim Armstrong commented on IMPALA-8778:
---

I don't see how you could implement reading from a Hudi table without changing 
Impala (or Hive for that matter). With the original Hive table layout, the 
contents of a partition are determined by listing a directory, and it looks 
like if you list the directory of a Hudi partition, you will get back 
duplicated data from multiple slices. I.e. I think you need logic in Impala 
that understands slices and only uses the latest slice when querying a 
partition.

The only way to add or remove an individual file to a classic Hive table 
(Impala/Hive tables are the same thing) is to add or remove it from the 
partition directory. 
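
To make the "only use the latest slice" idea concrete, here is a minimal Python sketch. It assumes the simplified, hypothetical naming scheme `<file_id>_v<version>.parquet` used in examples elsewhere in this thread; real Hudi file names are different, and `latest_slice` is an illustrative helper, not an Impala or Hudi function.

```python
import re

# Hypothetical naming scheme "<file_id>_v<version>.parquet" (an assumption for
# illustration only; it is not Hudi's real file naming).
_NAME = re.compile(r"^(?P<file_id>.+)_v(?P<version>\d+)\.parquet$")


def latest_slice(file_names):
    """Return, for each file ID, only the file name with the highest version."""
    newest = {}
    for name in file_names:
        match = _NAME.match(name)
        if match is None:
            continue  # skip anything that does not follow the assumed scheme
        file_id = match.group("file_id")
        version = int(match.group("version"))
        if file_id not in newest or version > newest[file_id][0]:
            newest[file_id] = (version, name)
    return sorted(name for _, name in newest.values())


# A plain directory listing returns every version, old and new.
listing = ["file1_v1.parquet", "file1_v2.parquet", "file2_v1.parquet"]
print(latest_slice(listing))
```

A plain directory listing would hand all three files to the scanner, so rows from `file1` would be read twice; the selection step is what removes the outdated version.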




[jira] [Commented] (IMPALA-8778) Support read/write Apache Hudi tables

2019-11-05 Thread Yanjia Gary Li (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-8778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16968038#comment-16968038
 ] 

Yanjia Gary Li commented on IMPALA-8778:


Hello [~tarmstrong] , I'd like to resume the discussion on this topic. Yuanbin 
finished his internship a few months ago so please assign this ticket to me. 

After reading some code on both the Impala and Hudi sides, these are the 
approaches I can think of:
 * As discussed above, create a new class similar to HdfsTable, with a Hudi 
dependency, to filter paths. 
 * Implement everything on the Hudi side and send a sequence of queries to the 
Impala server to ALTER the table. The Hive sync tool in the Hudi repo uses 
this method. I think this approach could be easier than the one above because 
we could follow a similar strategy to the Hive sync tool, and we wouldn't need 
to wait until the next release to use this feature.

To make sure the second approach is possible, I'd like to know what query 
could handle this situation:
 * First stage: in HDFS partition year=2019/month=10/day=1, we have 
file1_v1.parquet, file2_v1.parquet
 * Second stage: we run a Hudi job to update the partition 
year=2019/month=10/day=1, and now we have file1_v1.parquet, file1_v2.parquet, 
file2_v1.parquet

If we want to *drop* file1_v1.parquet and *load* file1_v2.parquet to the table, 
what query should I run? What will happen if another user submits a query while 
the metadata is updating?

Thanks

> Support read/write Apache Hudi tables
> -
>
> Key: IMPALA-8778
> URL: https://issues.apache.org/jira/browse/IMPALA-8778
> Project: IMPALA
>  Issue Type: New Feature
> Reporter: Yuanbin Cheng
> Assignee: Yuanbin Cheng
> Priority: Major
>
> Apache Impala currently does not support Apache Hudi and cannot even pull 
> metadata from Hive.
> Related issue: 
> [https://github.com/apache/incubator-hudi/issues/179] 
> [https://issues.apache.org/jira/projects/HUDI/issues/HUDI-146|https://issues.apache.org/jira/projects/HUDI/issues/HUDI-146?filter=allopenissues]
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Commented] (IMPALA-8778) Support read/write Apache Hudi tables

2019-08-06 Thread Vinoth Chandar (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-8778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16901249#comment-16901249
 ] 

Vinoth Chandar commented on IMPALA-8778:


>Another small question about how to detect a Hoodie-specific path: it seems 
>that I can use HoodiePartitionMetadata to check whether it is a valid 
>dataset. If it is invalid or the dataset is not found, I can treat it as a 
>non-Hoodie path. Am I correct?

The HoodieTableMetaClient already does those things for you. We can follow up 
on the HUDI ticket more (to keep this about Impala/Hudi integration alone). 

Also, I'd suggest that we land this once we have renamed the packages in Hudi 
to org.apache.hudi and made the first release. Rough ETA: end of month. So you 
can keep working on the patch as is and test it, and finally we can just pick 
up the new artifact. 




--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Commented] (IMPALA-8778) Support read/write Apache Hudi tables

2019-08-05 Thread Tim Armstrong (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-8778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16900393#comment-16900393
 ] 

Tim Armstrong commented on IMPALA-8778:
---

[~Yuanbin] I don't think we can avoid adding the Hudi dependency; that seems OK 
to add.




[jira] [Commented] (IMPALA-8778) Support read/write Apache Hudi tables

2019-08-05 Thread Yuanbin Cheng (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-8778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16900334#comment-16900334
 ] 

Yuanbin Cheng commented on IMPALA-8778:
---

[~tarmstrong]

As discussed before, I am trying to treat a Hudi dataset as a kind of Hive 
table in Impala. Currently, in order to get the latest version of the files in 
a Hudi partition, it seems that I need to use the Hudi classes directly, which 
means that Impala needs to take a Hudi dependency.

Can I add the Hudi dependency to Impala? Or is there some other way to call 
the Hudi classes from Impala?




[jira] [Commented] (IMPALA-8778) Support read/write Apache Hudi tables

2019-08-05 Thread Yuanbin Cheng (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-8778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16900331#comment-16900331
 ] 

Yuanbin Cheng commented on IMPALA-8778:
---

[~vinoth] 

1. I have read HoodieInputFormat. I see that I can use the Hudi classes and the 
timeline to get and filter the latest version of the files in each partition of 
the Hudi dataset.

We need to ask Tim whether we can add the Hudi dependency to the Impala project.

2. Correct, the table has to have only one version of each file; multiple file 
versions will give wrong results. I am thinking of adding support in Impala so 
that it can recognize a Hudi-specific path and then get the latest version of 
the files.

3. Another small question, about how to determine whether a path is a Hudi 
path: it seems that I can use HoodiePartitionMetadata to check whether it is a 
valid dataset, and if it is invalid or the dataset is not found, I can treat it 
as a non-Hudi path. Am I correct?

 




[jira] [Commented] (IMPALA-8778) Support read/write Apache Hudi tables

2019-08-03 Thread Vinoth Chandar (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-8778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16899514#comment-16899514
 ] 

Vinoth Chandar commented on IMPALA-8778:


>>Do you have any idea about how to load the latest version of the Hudi dataset 
>>without using the InputFormat as Hive, or any related code about the Hive 
>>metadata in Hudi may help a lot? 

There are a few options to do this using the Hudi classes directly, but that 
would mean Impala will now take a Hudi dependency. Is that okay? 
In short, if you have a `List` then you can use either 
[HoodieROTablePathFilter|https://github.com/apache/incubator-hudi/blob/479908fd20a97c5f7007f06ba7ee3904967e1050/hoodie-spark/src/main/scala/com/uber/hoodie/DefaultSource.scala#L66]
 (like the Spark datasource) or instantiate the Timeline/FileSystemView classes 
(like the 
[HoodieInputFormat|https://github.com/apache/incubator-hudi/blob/129e4336413fd2290e137804cf16c515c502c2f7/hoodie-hadoop-mr/src/main/java/com/uber/hoodie/hadoop/HoodieInputFormat.java#L89]
 does)

>>Current I just add the `HoodieInputFormat` as a VALID_INPUT_FORMAT which will 
>>make the Impala read the Hudi as the regular Parquet table.

But the table will have to be purely inserts, right? With upserts (and multiple 
file versions), you will have incorrect results? 
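
To make the "multiple file versions" problem concrete, here is a minimal sketch of what filtering for the latest version means. It is not Hudi's HoodieROTablePathFilter and uses no Hudi classes. The class `LatestFileVersionSketch`, its methods, and the simplified file name layout `<fileId>_<writeToken>_<commitTime>.parquet` with fixed-width commit timestamps are all assumptions made for illustration. It also ignores the commit timeline, which the Timeline/FileSystemView classes mentioned above take into account.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper for illustration only; not part of Hudi or Impala.
final class LatestFileVersionSketch {

  // Keeps one file per file id: the one with the greatest commit time.
  // Commit times are assumed to be fixed-width timestamps, so plain string
  // comparison orders them chronologically.
  static List<String> latestVersions(List<String> fileNames) {
    Map<String, String> latestByFileId = new TreeMap<>();
    for (String name : fileNames) {
      String[] parts = parse(name);
      if (parts == null) continue; // not a data file in the assumed layout
      String current = latestByFileId.get(parts[0]);
      if (current == null || parts[2].compareTo(parse(current)[2]) > 0) {
        latestByFileId.put(parts[0], name);
      }
    }
    return new ArrayList<>(latestByFileId.values());
  }

  // Splits "<fileId>_<writeToken>_<commitTime>.parquet" into its three parts,
  // or returns null when the name does not match the assumed layout.
  static String[] parse(String name) {
    String suffix = ".parquet";
    if (!name.endsWith(suffix)) return null;
    String[] parts = name.substring(0, name.length() - suffix.length()).split("_");
    return parts.length == 3 ? parts : null;
  }

  public static void main(String[] args) {
    List<String> listing = List.of(
        "f1_0_20190801000000.parquet",
        "f1_0_20190803000000.parquet", // newer version of file group f1
        "f2_0_20190801000000.parquet",
        ".hoodie_partition_metadata");
    System.out.println(latestVersions(listing));
  }
}
```

Reading every Parquet file in the listing, as a regular Parquet table would, returns the rows of both versions of f1. Keeping only the newest file per file id is what avoids those duplicated or stale rows.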




[jira] [Commented] (IMPALA-8778) Support read/write Apache Hudi tables

2019-08-01 Thread Yuanbin Cheng (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-8778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16898457#comment-16898457
 ] 

Yuanbin Cheng commented on IMPALA-8778:
---

[~vinoth]

I have read the code in Apache Impala that relates to HdfsTable. For now, I am 
relying on Hudi partitioning being compatible with Hive partitioning.

So currently, my thought is to change the partition loading part of the code in 
Apache Impala, namely the loadFileMetadataForPartitions method in the HdfsTable 
class.

This method groups the partitions by path, creates a `FileMetadataLoader` for 
every path, and then calls the load methods in parallel.

Here is the load method in FileMetadataLoader:

[https://github.com/apache/impala/blob/9ee4a5e1940afa47227a92e0f6fba6d4c9909f63/fe/src/main/java/org/apache/impala/catalog/FileMetadataLoader.java#L129]
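
The group-then-load-in-parallel pattern described above can be sketched as follows. This is not Impala's code: the class `ParallelPartitionLoadSketch`, the `loadAll` method, the `lister` callback, and the pool size of 4 are assumptions for illustration. The callback stands in for the per-path loader, and it is the place where a Hudi-aware file filter could be plugged in.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

// Hypothetical sketch for illustration only; not Impala's HdfsTable code.
final class ParallelPartitionLoadSketch {

  // Lists the files of each distinct partition path in parallel and returns
  // a map from partition path to the files found under it.
  static Map<String, List<String>> loadAll(
      List<String> partitionPaths, Function<String, List<String>> lister) {
    // Several partitions may share a path, so load each path only once.
    List<String> distinct = new ArrayList<>(new LinkedHashSet<>(partitionPaths));
    ExecutorService pool = Executors.newFixedThreadPool(4);
    try {
      List<Callable<List<String>>> tasks = new ArrayList<>();
      for (String path : distinct) {
        tasks.add(() -> lister.apply(path));
      }
      // invokeAll returns the futures in the same order as the tasks.
      List<Future<List<String>>> futures = pool.invokeAll(tasks);
      Map<String, List<String>> result = new TreeMap<>();
      for (int i = 0; i < distinct.size(); i++) {
        result.put(distinct.get(i), futures.get(i).get());
      }
      return result;
    } catch (InterruptedException | ExecutionException e) {
      throw new IllegalStateException("partition load failed", e);
    } finally {
      pool.shutdown();
    }
  }

  public static void main(String[] args) {
    // The lister here fabricates one file per path; a real one would list
    // the file system and, for Hudi, keep only the latest file versions.
    System.out.println(loadAll(
        List.of("/t/p=1", "/t/p=2", "/t/p=1"),
        path -> List.of(path + "/a.parquet")));
  }
}
```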

Since Impala doesn't use the InputFormat classes the way Hive does, I think I 
need to modify this partition loading method to teach Impala how to load the 
Hudi table.

Do you have any idea how to load the latest version of the Hudi dataset without 
using the InputFormat as Hive does? Any related code about the Hive metadata in 
Hudi would help a lot.

Another thing is that I have created a draft change in Impala's Gerrit:

[https://gerrit.cloudera.org/#/c/13948/]

Currently I just add `HoodieInputFormat` as a VALID_INPUT_FORMAT, which makes 
Impala read the Hudi table as a regular Parquet table.

I am struggling to add tests in Impala to verify that this change actually lets 
Impala read the Hudi data successfully. It seems that I need to add Hudi 
dependencies to the test setup and set up some data for testing. 




[jira] [Commented] (IMPALA-8778) Support read/write Apache Hudi tables

2019-07-31 Thread Vinoth Chandar (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-8778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16897458#comment-16897458
 ] 

Vinoth Chandar commented on IMPALA-8778:


[~Yuanbin] any early thoughts on reading the Impala code?




[jira] [Commented] (IMPALA-8778) Support read/write Apache Hudi tables

2019-07-24 Thread Tim Armstrong (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-8778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16892286#comment-16892286
 ] 

Tim Armstrong commented on IMPALA-8778:
---

I am not all that familiar with this code myself, but I know enough to get you 
started.

Impala doesn't use the same InputFormat classes as Hive. Rather, it recognises 
the Java class names and handles the formats on its own. E.g. Impala is aware 
of the "MapredParquetInputFormat" class and refers to it internally as the 
PARQUET file format - see 
https://github.com/apache/impala/blob/94652d74521e95e8606ea2d22aabcaddde6fc471/fe/src/main/java/org/apache/impala/catalog/HdfsFileFormat.java#L62
 

I think the first step would be to add Hudi to the list of known file formats 
in HdfsFileFormat.java. Then you could teach Impala to load the table by 
making isHdfsInputFormatClass() return true here
 
https://github.com/apache/impala/blob/fc974f944a9266e68e6f1694eecdc2160fd52582/fe/src/main/java/org/apache/impala/catalog/Table.java#L327

Then you would need to teach Impala how to load the files and partitions for 
HdfsTable. If the partitioning is compatible, then maybe we just need to get 
the file metadata loading working. The file metadata is loaded here: 
https://github.com/apache/impala/blob/fc974f944a9266e68e6f1694eecdc2160fd52582/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java#L554
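
As a rough illustration of the first two steps, the sketch below maps input format class names to an internal format and answers whether a class name is known. It is a simplified stand-in, not the code in HdfsFileFormat.java or Table.java. The class `FileFormatLookupSketch`, the `Format` enum, and the `HUDI_PARQUET` constant are hypothetical. The Hudi class name is taken from the package path of the HoodieInputFormat source linked elsewhere in this thread.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch for illustration only; not Impala's HdfsFileFormat.
final class FileFormatLookupSketch {

  // Internal file formats; HUDI_PARQUET is an assumed name for the new entry.
  enum Format { PARQUET, TEXT, HUDI_PARQUET }

  private static final Map<String, Format> BY_INPUT_FORMAT = new HashMap<>();
  static {
    BY_INPUT_FORMAT.put(
        "org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat", Format.PARQUET);
    BY_INPUT_FORMAT.put("org.apache.hadoop.mapred.TextInputFormat", Format.TEXT);
    // Step 1: register the Hudi input format as a known format.
    BY_INPUT_FORMAT.put("com.uber.hoodie.hadoop.HoodieInputFormat", Format.HUDI_PARQUET);
  }

  // Step 2: a table whose input format is known can be loaded as an HDFS table.
  static boolean isKnownInputFormat(String inputFormatClassName) {
    return BY_INPUT_FORMAT.containsKey(inputFormatClassName);
  }

  // Resolves the internal format; rejects class names that are not registered.
  static Format fromClassName(String inputFormatClassName) {
    Format format = BY_INPUT_FORMAT.get(inputFormatClassName);
    if (format == null) {
      throw new IllegalArgumentException("Unknown input format: " + inputFormatClassName);
    }
    return format;
  }

  public static void main(String[] args) {
    System.out.println(fromClassName("com.uber.hoodie.hadoop.HoodieInputFormat"));
  }
}
```

The point of the lookup is that the input format class is never instantiated or called. Only its name is matched, which is why recognising the Hudi class name alone does not yet filter file versions.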




[jira] [Commented] (IMPALA-8778) Support read/write Apache Hudi tables

2019-07-24 Thread Yuanbin Cheng (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-8778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16892276#comment-16892276
 ] 

Yuanbin Cheng commented on IMPALA-8778:
---

[~tarmstr...@cloudera.com]

Hi Tim,

I created this ticket for the task of adding Hudi support to Impala.

From the patch that added Apache Hudi support to Presto, I found that they 
used the following approach:

"In presto, there was a point in the code that lists the DFS folders with the 
`inputFormat` object (HoodieInputFormat for Hudi tables) actually constructed 
already. All we did was check if the `inputFormat` object was an instance of 
HoodieInputFormat and call inputFormat.getSplits() to obtain the latest Hudi 
file slices for the presto query."

And I got a suggestion about this task:

"The `HoodieInputFormat` is annotated with a special annotation. All we need to 
do in Impala is find the place where it lists the file system for files and 
check for this condition and filter for the latest file versions by calling 
`HoodieInputFormat.getSplits()`. This will unblock your use-case and let you 
query RO view on Impala."

Is there any point in Impala that "lists the DFS folders with the 
`inputFormat` object"?

It would be very helpful if you could help me determine whether I can use the 
same method for this task.

Currently, I am searching the Impala code to find this point. However, I am not 
familiar with the Impala source code, so it is taking me a lot of effort.

Thanks so much!
