Quanlong Huang has posted comments on this change. ( http://gerrit.cloudera.org:8080/17121 )

Change subject: IMPALA-7712: Support Google Cloud Storage
......................................................................


Patch Set 7:

(5 comments)

> I had a couple small nits, but this makes sense to me. The only concern I 
> would have is if IMPALA-10563 is more than just a slow down.

Sure. I'm testing concurrent inserts on a real cluster on GCP to see if the 
issue occurs.

http://gerrit.cloudera.org:8080/#/c/17121/6/fe/src/main/java/org/apache/impala/common/FileSystemUtil.java
File fe/src/main/java/org/apache/impala/common/FileSystemUtil.java:

http://gerrit.cloudera.org:8080/#/c/17121/6/fe/src/main/java/org/apache/impala/common/FileSystemUtil.java@804
PS6, Line 804: hasNex
> Nit: hasNext()
Done


http://gerrit.cloudera.org:8080/#/c/17121/6/fe/src/main/java/org/apache/impala/common/FileSystemUtil.java@805
PS6, Line 805: hasNext(
> Nit: hasNext()
Done


http://gerrit.cloudera.org:8080/#/c/17121/6/fe/src/main/java/org/apache/impala/common/FileSystemUtil.java@806
PS6, Line 806: hasNext(
> Nit: hasNext()
Done


http://gerrit.cloudera.org:8080/#/c/17121/6/tests/custom_cluster/test_hdfs_fd_caching.py
File tests/custom_cluster/test_hdfs_fd_caching.py:

http://gerrit.cloudera.org:8080/#/c/17121/6/tests/custom_cluster/test_hdfs_fd_caching.py@132
PS6, Line 132:     # Caching applies to HDFS, S3, and ABFS files. If this is HDFS, S3, or ABFS, then
             :     # verify that caching works. Otherwise, verify that file handles are not cached.
> Nit: Now we don't cache GCS file handles, so this needs to be updated.
Done


http://gerrit.cloudera.org:8080/#/c/17121/5/tests/stress/test_insert_stress.py
File tests/stress/test_insert_stress.py:

http://gerrit.cloudera.org:8080/#/c/17121/5/tests/stress/test_insert_stress.py@81
PS5, Line 81:   @SkipIfGCS.jira(reason="IMPALA-10563")
> Ok, to be clear, the statement runs slower, but it does eventually complete
Yeah, within the timeout period (600s), only half of the inserts finish. I'm 
trying to see whether extending the timeout lets it pass.
BTW, this test takes 36s on HDFS and 90s on S3.

I'm also testing concurrent inserts on a real cluster on GCP.



--
To view, visit http://gerrit.cloudera.org:8080/17121
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia91ec956de3b620cccf6a1244b56b7da7a45b32b
Gerrit-Change-Number: 17121
Gerrit-PatchSet: 7
Gerrit-Owner: Quanlong Huang <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Joe McDonnell <[email protected]>
Gerrit-Reviewer: Quanlong Huang <[email protected]>
Gerrit-Comment-Date: Wed, 10 Mar 2021 01:48:15 +0000
Gerrit-HasComments: Yes
