Philip Zeyliger has posted comments on this change. ( http://gerrit.cloudera.org:8080/8894 )

Change subject: IMPALA-6372: Go parallel for Hive dataload
......................................................................


Patch Set 9:

(2 comments)

I've now read through this. I think this is a very welcome change; nice work. I
don't think there's anything major in my comments; this all seems like a nice
incremental step.

http://gerrit.cloudera.org:8080/#/c/8894/9/testdata/bin/load_nested.py
File testdata/bin/load_nested.py:

http://gerrit.cloudera.org:8080/#/c/8894/9/testdata/bin/load_nested.py@254
PS9, Line 254:   with cluster.hive.cursor(db_name=target_db) as hive:
Should these be run in parallel? (Feel free to do separately.)
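
For illustration only, here is a minimal sketch of what running a batch of
independent statements in parallel could look like. It is not taken from the
patch: run_in_parallel and the execute callable are hypothetical names, and it
assumes Python 3's concurrent.futures (or the futures backport). In the real
script each worker would presumably need its own cursor, since sharing one
cursor across threads is generally not safe.

```python
from concurrent.futures import ThreadPoolExecutor


def run_in_parallel(statements, execute, max_workers=4):
    """Call execute(stmt) for each statement on a thread pool.

    `execute` is a hypothetical stand-in for whatever runs a single
    statement (for example, opening a cursor and calling its execute
    method). Results are returned in the same order as `statements`.
    An exception raised by any call is re-raised when its result is
    collected.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order and propagates exceptions.
        return list(pool.map(execute, statements))
```

This only makes sense if the statements do not depend on each other; any
statement that reads a table created by another would still have to run after
it.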


http://gerrit.cloudera.org:8080/#/c/8894/9/testdata/bin/load_nested.py@303
PS9, Line 303:     impala.compute_stats()
I think compute_stats runs serially rather than in parallel.



--
To view, visit http://gerrit.cloudera.org:8080/8894
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I34b71e6df3c8f23a5a31451280e35f4dc015a2fd
Gerrit-Change-Number: 8894
Gerrit-PatchSet: 9
Gerrit-Owner: Joe McDonnell <[email protected]>
Gerrit-Reviewer: David Knupp <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Joe McDonnell <[email protected]>
Gerrit-Reviewer: Philip Zeyliger <[email protected]>
Gerrit-Comment-Date: Tue, 03 Apr 2018 21:42:59 +0000
Gerrit-HasComments: Yes
