Philip Zeyliger has posted comments on this change. ( http://gerrit.cloudera.org:8080/8894 )
Change subject: IMPALA-6372: Go parallel for Hive dataload
......................................................................

Patch Set 9:

(2 comments)

I've now read through this. I think this is a very welcome change, nice work. I don't think there's anything major in my comments; this all seems like a nice incremental step.

http://gerrit.cloudera.org:8080/#/c/8894/9/testdata/bin/load_nested.py
File testdata/bin/load_nested.py:

http://gerrit.cloudera.org:8080/#/c/8894/9/testdata/bin/load_nested.py@254
PS9, Line 254: with cluster.hive.cursor(db_name=target_db) as hive:
Should these be run in parallel? (Feel free to do separately.)

http://gerrit.cloudera.org:8080/#/c/8894/9/testdata/bin/load_nested.py@303
PS9, Line 303: impala.compute_stats()
I think compute_stats runs serially rather than in parallel.

--
To view, visit http://gerrit.cloudera.org:8080/8894
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I34b71e6df3c8f23a5a31451280e35f4dc015a2fd
Gerrit-Change-Number: 8894
Gerrit-PatchSet: 9
Gerrit-Owner: Joe McDonnell <[email protected]>
Gerrit-Reviewer: David Knupp <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Joe McDonnell <[email protected]>
Gerrit-Reviewer: Philip Zeyliger <[email protected]>
Gerrit-Comment-Date: Tue, 03 Apr 2018 21:42:59 +0000
Gerrit-HasComments: Yes
