[Impala-ASF-CR] IMPALA-5429: Multi threaded block metadata loading

2017-10-26 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/8235 )

Change subject: IMPALA-5429: Multi threaded block metadata loading
..

IMPALA-5429: Multi threaded block metadata loading

Implements multi threaded block metadata loading on the Catalog
server where we fetch block metadata for multiple partitions of a
single table in parallel. Number of threads to load the metadata is
controlled by the following two parameters (set on the Catalog server
startup and applies for each table load)

-max_hdfs_partitions_parallel_load(default=5)
-max_nonhdfs_partitions_parallel_load(default=20)

We use different thread pool sizes for HDFS and non-HDFS tables since
non-HDFS supports much higher throughput of RPC calls for listStatus
/listFiles. Based on our experiments, S3 showed a linear speed up
(up to ~113x) with increasing number of loading threads where as the
HDFS throughput was limited to ~5x in un-secure clusters and up to
~3.7x in secure clusters. We narrowed it down to scalability
bottlenecks in HDFS RPC implementation (HADOOP-14558) on both the
server and the client side.

One thing to note here is that the thread pool based metadata fetching
is implemented only for loading HDFS block metadata and not for loading
HMS partition information. Our experiments showed that while loading
large partitioned tables, ~90% of the time is spent in connecting to NN
and loading the HDFS block information and optimizing the rest ~10% makes
the code unnecessarily complex without much gain.

Additional notes:

- The multithreading approach is implemented for
  * INVALIDATE (loading from scratch),
  * REFRESH (reusing existing md) code paths,
  * ALTER TABLE ADD/RECOVER PARTITIONS.

- This patch makes the implementation of ListMap thread-safe since
we use that data structure as a shared state between multiple partition
metadata loding threads.

Testing and Results:

- This patch doesn't add any new tests since there is enough test
coverage already. Passed core/exhaustive runs with HDFS/S3.

- We noticed up to ~113x speedup on S3 tables(thread_pool_size=160)
and up to ~5x speed up in un-secure HDFS clusters and ~3.7x in secure
HDFS clusters.

- Synthesized the following two large tables on HDFS and S3 and noticed
significant reduction in my test DDL queries.

  (1) 100K partitions + 1 million files
  (2) 80 partitions + 250K files

 100K-PARTITIONS-1M-FILES-CUSTOM-11-REFRESH-PARTITION I -16.4%
 100K-PARTITIONS-1M-FILES-CUSTOM-08-ADD-PARTITION I -17.25%
 80-PARTITIONS-250K-FILES-11-REFRESH-PARTITIONI -23.57%
 80-PARTITIONS-250K-FILES-S3-08-ADD-PARTITION I -23.87%
 80-PARTITIONS-250K-FILES-09-INVALIDATE   I -24.88%
 80-PARTITIONS-250K-FILES-03-RECOVER  I -35.90%
 80-PARTITIONS-250K-FILES-07-REFRESH  I -43.03%
 100K-PARTITIONS-1M-FILES-CUSTOM-12-QUERY-PARTITIONS  I -43.93%
 100K-PARTITIONS-1M-FILES-CUSTOM-05-QUERY-AFTER-INV   I -46.59%
 80-PARTITIONS-250K-FILES-10-REFRESH-AFTER-ADD-PARTITION  I -48.71%
 100K-PARTITIONS-1M-FILES-CUSTOM-07-REFRESH   I -49.02%
 80-PARTITIONS-250K-FILES-05-QUERY-AFTER-INV  I -49.05%
 100K-PARTITIONS-1M-FILES-CUSTOM-10-REFRESH-AFTER-ADD-PARTI -51.87%
 80-PARTITIONS-250K-FILES-S3-03-RECOVER   I -67.17%
 80-PARTITIONS-250K-FILES-S3-05-QUERY-AFTER-INV   I -76.45%
 80-PARTITIONS-250K-FILES-S3-07-REFRESH   I -87.04%
 80-PARTITIONS-250K-FILES-S3-10-REFRESH-AFTER-ADD-PARTI -88.57%

Change-Id: I07eaa7151dfc4d56da8db8c2654bd65d8f808481
Reviewed-on: http://gerrit.cloudera.org:8080/8235
Reviewed-by: Bharath Vissapragada 
Tested-by: Impala Public Jenkins
---
M be/src/catalog/catalog.cc
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
M 
fe/src/main/java/org/apache/impala/catalog/HdfsPartitionLocationCompressor.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/service/JniCatalog.java
M fe/src/main/java/org/apache/impala/util/ListMap.java
9 files changed, 462 insertions(+), 242 deletions(-)

Approvals:
  Bharath Vissapragada: Looks good to me, approved
  Impala Public Jenkins: Verified

--
To view, visit http://gerrit.cloudera.org:8080/8235
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I07eaa7151dfc4d56da8db8c2654bd65d8f808481
Gerrit-Change-Number: 8235
Gerrit-PatchSet: 12
Gerrit-Owner: Bharath Vissapragada 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Dimitris Tsirogiannis 

[Impala-ASF-CR] IMPALA-5429: Multi threaded block metadata loading

2017-10-26 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8235 )

Change subject: IMPALA-5429: Multi threaded block metadata loading
..


Patch Set 11: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/8235
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I07eaa7151dfc4d56da8db8c2654bd65d8f808481
Gerrit-Change-Number: 8235
Gerrit-PatchSet: 11
Gerrit-Owner: Bharath Vissapragada 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Thu, 26 Oct 2017 21:56:24 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5429: Multi threaded block metadata loading

2017-10-26 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8235 )

Change subject: IMPALA-5429: Multi threaded block metadata loading
..


Patch Set 11:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/1397/


--
To view, visit http://gerrit.cloudera.org:8080/8235
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I07eaa7151dfc4d56da8db8c2654bd65d8f808481
Gerrit-Change-Number: 8235
Gerrit-PatchSet: 11
Gerrit-Owner: Bharath Vissapragada 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Thu, 26 Oct 2017 18:08:28 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5429: Multi threaded block metadata loading

2017-10-26 Thread Bharath Vissapragada (Code Review)
Bharath Vissapragada has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8235 )

Change subject: IMPALA-5429: Multi threaded block metadata loading
..


Patch Set 11: Code-Review+2

Rebased, carrying +2. Thanks Dimitris.


--
To view, visit http://gerrit.cloudera.org:8080/8235
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I07eaa7151dfc4d56da8db8c2654bd65d8f808481
Gerrit-Change-Number: 8235
Gerrit-PatchSet: 11
Gerrit-Owner: Bharath Vissapragada 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Thu, 26 Oct 2017 18:07:21 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5429: Multi threaded block metadata loading

2017-10-25 Thread Dimitris Tsirogiannis (Code Review)
Dimitris Tsirogiannis has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8235 )

Change subject: IMPALA-5429: Multi threaded block metadata loading
..


Patch Set 10: Code-Review+2

Nice work!


--
To view, visit http://gerrit.cloudera.org:8080/8235
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I07eaa7151dfc4d56da8db8c2654bd65d8f808481
Gerrit-Change-Number: 8235
Gerrit-PatchSet: 10
Gerrit-Owner: Bharath Vissapragada 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Wed, 25 Oct 2017 22:22:59 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5429: Multi threaded block metadata loading

2017-10-25 Thread Bharath Vissapragada (Code Review)
Hello Jim Apple, Dimitris Tsirogiannis, Mostafa Mokhtar, Alex Behm, Vuk 
Ercegovac,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8235

to look at the new patch set (#10).

Change subject: IMPALA-5429: Multi threaded block metadata loading
..

IMPALA-5429: Multi threaded block metadata loading

Implements multi threaded block metadata loading on the Catalog
server where we fetch block metadata for multiple partitions of a
single table in parallel. Number of threads to load the metadata is
controlled by the following two parameters (set on the Catalog server
startup and applies for each table load)

-max_hdfs_partitions_parallel_load(default=5)
-max_nonhdfs_partitions_parallel_load(default=20)

We use different thread pool sizes for HDFS and non-HDFS tables since
non-HDFS supports much higher throughput of RPC calls for listStatus
/listFiles. Based on our experiments, S3 showed a linear speed up
(up to ~113x) with increasing number of loading threads where as the
HDFS throughput was limited to ~5x in un-secure clusters and up to
~3.7x in secure clusters. We narrowed it down to scalability
bottlenecks in HDFS RPC implementation (HADOOP-14558) on both the
server and the client side.

One thing to note here is that the thread pool based metadata fetching
is implemented only for loading HDFS block metadata and not for loading
HMS partition information. Our experiments showed that while loading
large partitioned tables, ~90% of the time is spent in connecting to NN
and loading the HDFS block information and optimizing the rest ~10% makes
the code unnecessarily complex without much gain.

Additional notes:

- The multithreading approach is implemented for
  * INVALIDATE (loading from scratch),
  * REFRESH (reusing existing md) code paths,
  * ALTER TABLE ADD/RECOVER PARTITIONS.

- This patch makes the implementation of ListMap thread-safe since
we use that data structure as a shared state between multiple partition
metadata loding threads.

Testing and Results:

- This patch doesn't add any new tests since there is enough test
coverage already. Passed core/exhaustive runs with HDFS/S3.

- We noticed up to ~113x speedup on S3 tables(thread_pool_size=160)
and up to ~5x speed up in un-secure HDFS clusters and ~3.7x in secure
HDFS clusters.

- Synthesized the following two large tables on HDFS and S3 and noticed
significant reduction in my test DDL queries.

  (1) 100K partitions + 1 million files
  (2) 80 partitions + 250K files

 100K-PARTITIONS-1M-FILES-CUSTOM-11-REFRESH-PARTITION I -16.4%
 100K-PARTITIONS-1M-FILES-CUSTOM-08-ADD-PARTITION I -17.25%
 80-PARTITIONS-250K-FILES-11-REFRESH-PARTITIONI -23.57%
 80-PARTITIONS-250K-FILES-S3-08-ADD-PARTITION I -23.87%
 80-PARTITIONS-250K-FILES-09-INVALIDATE   I -24.88%
 80-PARTITIONS-250K-FILES-03-RECOVER  I -35.90%
 80-PARTITIONS-250K-FILES-07-REFRESH  I -43.03%
 100K-PARTITIONS-1M-FILES-CUSTOM-12-QUERY-PARTITIONS  I -43.93%
 100K-PARTITIONS-1M-FILES-CUSTOM-05-QUERY-AFTER-INV   I -46.59%
 80-PARTITIONS-250K-FILES-10-REFRESH-AFTER-ADD-PARTITION  I -48.71%
 100K-PARTITIONS-1M-FILES-CUSTOM-07-REFRESH   I -49.02%
 80-PARTITIONS-250K-FILES-05-QUERY-AFTER-INV  I -49.05%
 100K-PARTITIONS-1M-FILES-CUSTOM-10-REFRESH-AFTER-ADD-PARTI -51.87%
 80-PARTITIONS-250K-FILES-S3-03-RECOVER   I -67.17%
 80-PARTITIONS-250K-FILES-S3-05-QUERY-AFTER-INV   I -76.45%
 80-PARTITIONS-250K-FILES-S3-07-REFRESH   I -87.04%
 80-PARTITIONS-250K-FILES-S3-10-REFRESH-AFTER-ADD-PARTI -88.57%

Change-Id: I07eaa7151dfc4d56da8db8c2654bd65d8f808481
---
M be/src/catalog/catalog.cc
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
M 
fe/src/main/java/org/apache/impala/catalog/HdfsPartitionLocationCompressor.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/service/JniCatalog.java
M fe/src/main/java/org/apache/impala/util/ListMap.java
9 files changed, 462 insertions(+), 242 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/35/8235/10
--
To view, visit http://gerrit.cloudera.org:8080/8235
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I07eaa7151dfc4d56da8db8c2654bd65d8f808481
Gerrit-Change-Number: 8235
Gerrit-PatchSet: 10
Gerrit-Owner: Bharath Vissapragada 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Jim Apple 

[Impala-ASF-CR] IMPALA-5429: Multi threaded block metadata loading

2017-10-25 Thread Bharath Vissapragada (Code Review)
Bharath Vissapragada has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8235 )

Change subject: IMPALA-5429: Multi threaded block metadata loading
..


Patch Set 9:

(16 comments)

http://gerrit.cloudera.org:8080/#/c/8235/9//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/8235/9//COMMIT_MSG@42
PS9, Line 42: datastructure
> nit: data structure
Done


http://gerrit.cloudera.org:8080/#/c/8235/9//COMMIT_MSG@50
PS9, Line 50: speed up
> speedup
I see conflicting results. As per webster it is "speedup", but as per Oxford it 
is "speed-up". Now I don't know who to trust, but I'll go with your suggestion 
:)

https://en.oxforddictionaries.com/definition/speed-up
https://www.merriam-webster.com/dictionary/speedup


http://gerrit.cloudera.org:8080/#/c/8235/9/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java:

http://gerrit.cloudera.org:8080/#/c/8235/9/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@122
PS9, Line 122: Limits the
> "Maximum"
Done


http://gerrit.cloudera.org:8080/#/c/8235/9/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@223
PS9, Line 223: public FileMetadataLoadStats(Path path) { hdfsPath = path; }
> nit: move the ctor below the class variable members.
Done


http://gerrit.cloudera.org:8080/#/c/8235/9/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@228
PS9, Line 228:
> nit: extra space
Done


http://gerrit.cloudera.org:8080/#/c/8235/9/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@232
PS9, Line 232: the
> remove
Done


http://gerrit.cloudera.org:8080/#/c/8235/9/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@248
PS9, Line 248: runnable
> callable
Done


http://gerrit.cloudera.org:8080/#/c/8235/9/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@266
PS9, Line 266: // Computes the file metadata from scratch (by calling 
resetAndLoadFileMetadata())
 : // when reuseFileMd_ is false, else refreshes the existing 
file metadata for
 : // modified files using refreshFileMetadata().
> nit: you can actually remove the comment as it simply describes the code be
Done


http://gerrit.cloudera.org:8080/#/c/8235/9/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@465
PS9, Line 465: hasFileChanged
> I think you need to put back the caching check. Alex made a good point that
Alex to the rescue, good point. Made the change.


http://gerrit.cloudera.org:8080/#/c/8235/9/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@773
PS9, Line 773: loadParitionFileMetadataFromScratch
> "load from scratch" and "reset and load" essentially mean the same thing. M
Done


http://gerrit.cloudera.org:8080/#/c/8235/9/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@792
PS9, Line 792: S3
> non-HDFS (S3/ADLS)
Done


http://gerrit.cloudera.org:8080/#/c/8235/9/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@792
PS9, Line 792: S3
> "the latter"
Done


http://gerrit.cloudera.org:8080/#/c/8235/9/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@844
PS9, Line 844: if (LOG.isTraceEnabled()) {
 : LOG.trace(loadStats.debugString());
 :   }
> nit: single line
Done


http://gerrit.cloudera.org:8080/#/c/8235/9/fe/src/main/java/org/apache/impala/service/BackendConfig.java
File fe/src/main/java/org/apache/impala/service/BackendConfig.java:

http://gerrit.cloudera.org:8080/#/c/8235/9/fe/src/main/java/org/apache/impala/service/BackendConfig.java@71
PS9, Line 71: maxS3PartsParallelLoad
> Why "maxS3..."? Everywhere else we use "maxNonHdfs..".
Yep, forgot to change here.


http://gerrit.cloudera.org:8080/#/c/8235/9/fe/src/main/java/org/apache/impala/util/ListMap.java
File fe/src/main/java/org/apache/impala/util/ListMap.java:

http://gerrit.cloudera.org:8080/#/c/8235/9/fe/src/main/java/org/apache/impala/util/ListMap.java@43
PS9, Line 43: getList() { return list_; }
> Hm, why is list_ exposed to the world? You may want to check who wants this
- I checked the callers and all of them just readers and either iterate / sets 
it inside another thrift object (which is when I think a copy is made).

- Looks like ImmutableList.copyOf() doesn't always make a copy (although 
subject to change), using it here.


http://gerrit.cloudera.org:8080/#/c/8235/9/fe/src/main/java/org/apache/impala/util/ListMap.java@76
PS9, Line 76: synchronized (this) {
> this is equivalent to public synchronized void populate...
oops, yea, updated.



--
To view, visit http://gerrit.cloudera.org:8080/8235
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I07eaa7151dfc4d56da8db8c2654bd65d8f808481
Gerrit-Change-Number: 8235
Gerrit-PatchSet: 9
Gerrit-Owner: Bharath Vissapragada 
Gerrit-Reviewer: Alex Behm 

[Impala-ASF-CR] IMPALA-5429: Multi threaded block metadata loading

2017-10-25 Thread Dimitris Tsirogiannis (Code Review)
Dimitris Tsirogiannis has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8235 )

Change subject: IMPALA-5429: Multi threaded block metadata loading
..


Patch Set 9:

(16 comments)

http://gerrit.cloudera.org:8080/#/c/8235/9//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/8235/9//COMMIT_MSG@42
PS9, Line 42: datastructure
nit: data structure


http://gerrit.cloudera.org:8080/#/c/8235/9//COMMIT_MSG@50
PS9, Line 50: speed up
speedup


http://gerrit.cloudera.org:8080/#/c/8235/9/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java:

http://gerrit.cloudera.org:8080/#/c/8235/9/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@122
PS9, Line 122: Limits the
"Maximum"


http://gerrit.cloudera.org:8080/#/c/8235/9/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@223
PS9, Line 223: public FileMetadataLoadStats(Path path) { hdfsPath = path; }
nit: move the ctor below the class variable members.


http://gerrit.cloudera.org:8080/#/c/8235/9/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@228
PS9, Line 228:
nit: extra space


http://gerrit.cloudera.org:8080/#/c/8235/9/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@232
PS9, Line 232: the
remove


http://gerrit.cloudera.org:8080/#/c/8235/9/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@248
PS9, Line 248: runnable
callable


http://gerrit.cloudera.org:8080/#/c/8235/9/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@266
PS9, Line 266: // Computes the file metadata from scratch (by calling 
resetAndLoadFileMetadata())
 : // when reuseFileMd_ is false, else refreshes the existing 
file metadata for
 : // modified files using refreshFileMetadata().
nit: you can actually remove the comment as it simply describes the code below.


http://gerrit.cloudera.org:8080/#/c/8235/9/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@465
PS9, Line 465: hasFileChanged
I think you need to put back the caching check. Alex made a good point that it 
is used for updating the cached bytes size.


http://gerrit.cloudera.org:8080/#/c/8235/9/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@773
PS9, Line 773: loadParitionFileMetadataFromScratch
"load from scratch" and "reset and load" essentially mean the same thing. Maybe 
rename this to resetAndLoadPartitionFileMetadata()?


http://gerrit.cloudera.org:8080/#/c/8235/9/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@792
PS9, Line 792: S3
non-HDFS (S3/ADLS)


http://gerrit.cloudera.org:8080/#/c/8235/9/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@792
PS9, Line 792: S3
"the latter"


http://gerrit.cloudera.org:8080/#/c/8235/9/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@844
PS9, Line 844: if (LOG.isTraceEnabled()) {
 : LOG.trace(loadStats.debugString());
 :   }
nit: single line


http://gerrit.cloudera.org:8080/#/c/8235/9/fe/src/main/java/org/apache/impala/service/BackendConfig.java
File fe/src/main/java/org/apache/impala/service/BackendConfig.java:

http://gerrit.cloudera.org:8080/#/c/8235/9/fe/src/main/java/org/apache/impala/service/BackendConfig.java@71
PS9, Line 71: maxS3PartsParallelLoad
Why "maxS3..."? Everywhere else we use "maxNonHdfs..".


http://gerrit.cloudera.org:8080/#/c/8235/9/fe/src/main/java/org/apache/impala/util/ListMap.java
File fe/src/main/java/org/apache/impala/util/ListMap.java:

http://gerrit.cloudera.org:8080/#/c/8235/9/fe/src/main/java/org/apache/impala/util/ListMap.java@43
PS9, Line 43: getList() { return list_; }
Hm, why is list_ exposed to the world? You may want to check who wants this and 
what do they do with it (i.e modify or simply iterate). If it's the latter you 
can pass an ImmutableList(). If it is the former, see if we need to expand the 
public api of this class instead.


http://gerrit.cloudera.org:8080/#/c/8235/9/fe/src/main/java/org/apache/impala/util/ListMap.java@76
PS9, Line 76: synchronized (this) {
this is equivalent to public synchronized void populate...



-- 
To view, visit http://gerrit.cloudera.org:8080/8235
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I07eaa7151dfc4d56da8db8c2654bd65d8f808481
Gerrit-Change-Number: 8235
Gerrit-PatchSet: 9
Gerrit-Owner: Bharath Vissapragada 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Wed, 25 Oct 2017 19:24:35 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5429: Multi threaded block metadata loading

2017-10-25 Thread Bharath Vissapragada (Code Review)
Bharath Vissapragada has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8235 )

Change subject: IMPALA-5429: Multi threaded block metadata loading
..


Patch Set 9: Code-Review+1

(1 comment)

Thanks Alex, carrying +1. Rebased for the final review.

http://gerrit.cloudera.org:8080/#/c/8235/8/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java:

http://gerrit.cloudera.org:8080/#/c/8235/8/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@122
PS8, Line 122:   // Limits the number of errors logged when loading partitioned 
tables.
> remove "huge"
Done



--
To view, visit http://gerrit.cloudera.org:8080/8235
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I07eaa7151dfc4d56da8db8c2654bd65d8f808481
Gerrit-Change-Number: 8235
Gerrit-PatchSet: 9
Gerrit-Owner: Bharath Vissapragada 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Wed, 25 Oct 2017 17:46:04 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5429: Multi threaded block metadata loading

2017-10-25 Thread Bharath Vissapragada (Code Review)
Hello Jim Apple, Dimitris Tsirogiannis, Mostafa Mokhtar, Alex Behm, Vuk 
Ercegovac,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8235

to look at the new patch set (#9).

Change subject: IMPALA-5429: Multi threaded block metadata loading
..

IMPALA-5429: Multi threaded block metadata loading

Implements multi threaded block metadata loading on the Catalog
server where we fetch block metadata for multiple partitions of a
single table in parallel. Number of threads to load the metadata is
controlled by the following two parameters (set on the Catalog server
startup and applies for each table load)

-max_hdfs_partitions_parallel_load(default=5)
-max_nonhdfs_partitions_parallel_load(default=20)

We use different thread pool sizes for HDFS and non-HDFS tables since
non-HDFS supports much higher throughput of RPC calls for listStatus
/listFiles. Based on our experiments, S3 showed a linear speed up
(up to ~113x) with increasing number of loading threads where as the
HDFS throughput was limited to ~5x in un-secure clusters and up to
~3.7x in secure clusters. We narrowed it down to scalability
bottlenecks in HDFS RPC implementation (HADOOP-14558) on both the
server and the client side.

One thing to note here is that the thread pool based metadata fetching
is implemented only for loading HDFS block metadata and not for loading
HMS partition information. Our experiments showed that while loading
large partitioned tables, ~90% of the time is spent in connecting to NN
and loading the HDFS block information and optimizing the rest ~10% makes
the code unnecessarily complex without much gain.

Additional notes:

- The multithreading approach is implemented for
  * INVALIDATE (loading from scratch),
  * REFRESH (reusing existing md) code paths,
  * ALTER TABLE ADD/RECOVER PARTITIONS.

- This patch makes the implementation of ListMap thread-safe since
we use that datastructure as a shared state between multiple partition
metadata loding threads.

Testing and Results:

- This patch doesn't add any new tests since there is enough test
coverage already. Passed core/exhaustive runs with HDFS/S3.

- We noticed up to ~113x speed up on S3 tables(thread_pool_size=160)
and up to ~5x speed up in un-secure HDFS clusters and ~3.7x in secure
HDFS clusters.

- Synthesized the following two large tables on HDFS and S3 and noticed
significant reduction in my test DDL queries.

  (1) 100K partitions + 1 million files
  (2) 80 partitions + 250K files

 100K-PARTITIONS-1M-FILES-CUSTOM-11-REFRESH-PARTITION I -16.4%
 100K-PARTITIONS-1M-FILES-CUSTOM-08-ADD-PARTITION I -17.25%
 80-PARTITIONS-250K-FILES-11-REFRESH-PARTITIONI -23.57%
 80-PARTITIONS-250K-FILES-S3-08-ADD-PARTITION I -23.87%
 80-PARTITIONS-250K-FILES-09-INVALIDATE   I -24.88%
 80-PARTITIONS-250K-FILES-03-RECOVER  I -35.90%
 80-PARTITIONS-250K-FILES-07-REFRESH  I -43.03%
 100K-PARTITIONS-1M-FILES-CUSTOM-12-QUERY-PARTITIONS  I -43.93%
 100K-PARTITIONS-1M-FILES-CUSTOM-05-QUERY-AFTER-INV   I -46.59%
 80-PARTITIONS-250K-FILES-10-REFRESH-AFTER-ADD-PARTITION  I -48.71%
 100K-PARTITIONS-1M-FILES-CUSTOM-07-REFRESH   I -49.02%
 80-PARTITIONS-250K-FILES-05-QUERY-AFTER-INV  I -49.05%
 100K-PARTITIONS-1M-FILES-CUSTOM-10-REFRESH-AFTER-ADD-PARTI -51.87%
 80-PARTITIONS-250K-FILES-S3-03-RECOVER   I -67.17%
 80-PARTITIONS-250K-FILES-S3-05-QUERY-AFTER-INV   I -76.45%
 80-PARTITIONS-250K-FILES-S3-07-REFRESH   I -87.04%
 80-PARTITIONS-250K-FILES-S3-10-REFRESH-AFTER-ADD-PARTI -88.57%

Change-Id: I07eaa7151dfc4d56da8db8c2654bd65d8f808481
---
M be/src/catalog/catalog.cc
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
M 
fe/src/main/java/org/apache/impala/catalog/HdfsPartitionLocationCompressor.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/service/JniCatalog.java
M fe/src/main/java/org/apache/impala/util/ListMap.java
9 files changed, 461 insertions(+), 245 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/35/8235/9
--
To view, visit http://gerrit.cloudera.org:8080/8235
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I07eaa7151dfc4d56da8db8c2654bd65d8f808481
Gerrit-Change-Number: 8235
Gerrit-PatchSet: 9
Gerrit-Owner: Bharath Vissapragada 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Jim Apple 

[Impala-ASF-CR] IMPALA-5429: Multi threaded block metadata loading

2017-10-23 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8235 )

Change subject: IMPALA-5429: Multi threaded block metadata loading
..


Patch Set 8: Code-Review+1

(1 comment)

Dimitris should +2

http://gerrit.cloudera.org:8080/#/c/8235/8/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java:

http://gerrit.cloudera.org:8080/#/c/8235/8/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@122
PS8, Line 122:   // Limits the number of errors logged when loading huge 
partitioned tables.
remove "huge"



--
To view, visit http://gerrit.cloudera.org:8080/8235
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I07eaa7151dfc4d56da8db8c2654bd65d8f808481
Gerrit-Change-Number: 8235
Gerrit-PatchSet: 8
Gerrit-Owner: Bharath Vissapragada 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Mon, 23 Oct 2017 16:47:50 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5429: Multi threaded block metadata loading

2017-10-22 Thread Bharath Vissapragada (Code Review)
Hello Jim Apple, Dimitris Tsirogiannis, Mostafa Mokhtar, Alex Behm, Vuk 
Ercegovac,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8235

to look at the new patch set (#8).

Change subject: IMPALA-5429: Multi threaded block metadata loading
..

IMPALA-5429: Multi threaded block metadata loading

Implements multi threaded block metadata loading on the Catalog
server where we fetch block metadata for multiple partitions of a
single table in parallel. Number of threads to load the metadata is
controlled by the following two parameters (set on the Catalog server
startup and applies for each table load)

-max_hdfs_partitions_parallel_load(default=5)
-max_nonhdfs_partitions_parallel_load(default=20)

We use different thread pool sizes for HDFS and non-HDFS tables since
non-HDFS supports much higher throughput of RPC calls for listStatus
/listFiles. Based on our experiments, S3 showed a linear speed up
(up to ~113x) with increasing number of loading threads where as the
HDFS throughput was limited to ~5x in un-secure clusters and up to
~3.7x in secure clusters. We narrowed it down to scalability
bottlenecks in HDFS RPC implementation (HADOOP-14558) on both the
server and the client side.

One thing to note here is that the thread pool based metadata fetching
is implemented only for loading HDFS block metadata and not for loading
HMS partition information. Our experiments showed that while loading
large partitioned tables, ~90% of the time is spent in connecting to NN
and loading the HDFS block information and optimizing the rest ~10% makes
the code unnecessarily complex without much gain.

Additional notes:

- The multithreading approach is implemented for
  * INVALIDATE (loading from scratch),
  * REFRESH (reusing existing md) code paths,
  * ALTER TABLE ADD/RECOVER PARTITIONS.

- This patch makes the implementation of ListMap thread-safe since
we use that datastructure as a shared state between multiple partition
metadata loding threads.

Testing and Results:

- This patch doesn't add any new tests since there is enough test
coverage already. Passed core/exhaustive runs with HDFS/S3.

- We noticed up to ~113x speed up on S3 tables(thread_pool_size=160)
and up to ~5x speed up in un-secure HDFS clusters and ~3.7x in secure
HDFS clusters.

- Synthesized the following two large tables on HDFS and S3 and noticed
significant reduction in my test DDL queries.

  (1) 100K partitions + 1 million files
  (2) 80 partitions + 250K files

 100K-PARTITIONS-1M-FILES-CUSTOM-11-REFRESH-PARTITION I -16.4%
 100K-PARTITIONS-1M-FILES-CUSTOM-08-ADD-PARTITION I -17.25%
 80-PARTITIONS-250K-FILES-11-REFRESH-PARTITIONI -23.57%
 80-PARTITIONS-250K-FILES-S3-08-ADD-PARTITION I -23.87%
 80-PARTITIONS-250K-FILES-09-INVALIDATE   I -24.88%
 80-PARTITIONS-250K-FILES-03-RECOVER  I -35.90%
 80-PARTITIONS-250K-FILES-07-REFRESH  I -43.03%
 100K-PARTITIONS-1M-FILES-CUSTOM-12-QUERY-PARTITIONS  I -43.93%
 100K-PARTITIONS-1M-FILES-CUSTOM-05-QUERY-AFTER-INV   I -46.59%
 80-PARTITIONS-250K-FILES-10-REFRESH-AFTER-ADD-PARTITION  I -48.71%
 100K-PARTITIONS-1M-FILES-CUSTOM-07-REFRESH   I -49.02%
 80-PARTITIONS-250K-FILES-05-QUERY-AFTER-INV  I -49.05%
 100K-PARTITIONS-1M-FILES-CUSTOM-10-REFRESH-AFTER-ADD-PARTI -51.87%
 80-PARTITIONS-250K-FILES-S3-03-RECOVER   I -67.17%
 80-PARTITIONS-250K-FILES-S3-05-QUERY-AFTER-INV   I -76.45%
 80-PARTITIONS-250K-FILES-S3-07-REFRESH   I -87.04%
 80-PARTITIONS-250K-FILES-S3-10-REFRESH-AFTER-ADD-PARTI -88.57%

Change-Id: I07eaa7151dfc4d56da8db8c2654bd65d8f808481
---
M be/src/catalog/catalog.cc
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
M 
fe/src/main/java/org/apache/impala/catalog/HdfsPartitionLocationCompressor.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/service/JniCatalog.java
M fe/src/main/java/org/apache/impala/util/ListMap.java
9 files changed, 461 insertions(+), 245 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/35/8235/8
--
To view, visit http://gerrit.cloudera.org:8080/8235
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I07eaa7151dfc4d56da8db8c2654bd65d8f808481
Gerrit-Change-Number: 8235
Gerrit-PatchSet: 8
Gerrit-Owner: Bharath Vissapragada 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Jim Apple 

[Impala-ASF-CR] IMPALA-5429: Multi threaded block metadata loading

2017-10-22 Thread Bharath Vissapragada (Code Review)
Bharath Vissapragada has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8235 )

Change subject: IMPALA-5429: Multi threaded block metadata loading
..


Patch Set 7:

(9 comments)

http://gerrit.cloudera.org:8080/#/c/8235/7/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java:

http://gerrit.cloudera.org:8080/#/c/8235/7/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@221
PS7, Line 221: // are excluded because they are considered hidden from 
Impala's perspecitve
> Shrink:
Done


http://gerrit.cloudera.org:8080/#/c/8235/7/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@226
PS7, Line 226: // Number of files skipped while computeing the block 
metadata information.
> Shrink:
Done


http://gerrit.cloudera.org:8080/#/c/8235/7/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@248
PS7, Line 248: // If set to true, reloads the block metadata only when the 
files in this path
> reloads the file metadata (let's try to use the new "file metadata" termino
Done. Changed at other places too.


http://gerrit.cloudera.org:8080/#/c/8235/7/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@249
PS7, Line 249: // has changed since last load (more details in 
hasFileChanged()).
> have changed
Done


http://gerrit.cloudera.org:8080/#/c/8235/7/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@785
PS7, Line 785:* much higher throughput of RPC calls for 
listStatus/listFiles. Based on our
> ... RPC calls for listStatus/listFiles. For simplicity, the filesystem type
Done


http://gerrit.cloudera.org:8080/#/c/8235/7/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@791
PS7, Line 791:* This method is not optimized for tables with mixed 
partition types (partitions mapped
> I don'd think we need this much detail, how about folding a simple sentence
Done


http://gerrit.cloudera.org:8080/#/c/8235/7/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@794
PS7, Line 794:* This may not work well with the mixed partition types and 
needs to fixed.
> Don't think this is needed.
Done


http://gerrit.cloudera.org:8080/#/c/8235/7/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@819
PS7, Line 819: // For tables without partitions, we have no block metadata 
to load.
> remove ","
Done


http://gerrit.cloudera.org:8080/#/c/8235/7/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@841
PS7, Line 841:   LOG.error("Encountered an error loading block metadata 
for table: " +
> Could this flood the log when HDFS is under pressure or there is an intermi
Very good point. This can likely flood the log. IMO we should have some stack 
traces of errors for supportability. Added some code to limit the number of 
errors to log (100) and logged the total failed tasks. Please let me know if 
you prefer some other approach.



--
To view, visit http://gerrit.cloudera.org:8080/8235
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I07eaa7151dfc4d56da8db8c2654bd65d8f808481
Gerrit-Change-Number: 8235
Gerrit-PatchSet: 7
Gerrit-Owner: Bharath Vissapragada 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Sun, 22 Oct 2017 20:05:45 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5429: Multi threaded block metadata loading

2017-10-20 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8235 )

Change subject: IMPALA-5429: Multi threaded block metadata loading
..


Patch Set 7:

(9 comments)

http://gerrit.cloudera.org:8080/#/c/8235/7/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java:

http://gerrit.cloudera.org:8080/#/c/8235/7/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@221
PS7, Line 221: // are excluded because they are considered hidden from 
Impala's perspecitve
Shrink:

// Number of hidden files excluded from file metadata loading . More details at 
isValidDataFile().


http://gerrit.cloudera.org:8080/#/c/8235/7/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@226
PS7, Line 226: // Number of files skipped while computeing the block 
metadata information.
Shrink:

// Number of files skipped from file metadata loading because the the files 
have not changed since the last load. More details at hasFileChanged().


http://gerrit.cloudera.org:8080/#/c/8235/7/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@248
PS7, Line 248: // If set to true, reloads the block metadata only when the 
files in this path
reloads the file metadata (let's try to use the new "file metadata" terminology 
consistently)


http://gerrit.cloudera.org:8080/#/c/8235/7/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@249
PS7, Line 249: // has changed since last load (more details in 
hasFileChanged()).
have changed


http://gerrit.cloudera.org:8080/#/c/8235/7/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@785
PS7, Line 785:* much higher throughput of RPC calls for 
listStatus/listFiles. Based on our
... RPC calls for listStatus/listFiles. For simplicity, the filesystem type is 
determined based on the table's root path and not for each partition 
individually. Based on our experiments, S3 showed ...


http://gerrit.cloudera.org:8080/#/c/8235/7/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@791
PS7, Line 791:* This method is not optimized for tables with mixed 
partition types (partitions mapped
I don'd think we need this much detail, how about folding a simple sentence 
into the existing paragraph above, see suggestion above.


http://gerrit.cloudera.org:8080/#/c/8235/7/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@794
PS7, Line 794:* This may not work well with the mixed partition types and 
needs to fixed.
Don't think this is needed.


http://gerrit.cloudera.org:8080/#/c/8235/7/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@819
PS7, Line 819: // For tables without partitions, we have no block metadata 
to load.
remove ","


http://gerrit.cloudera.org:8080/#/c/8235/7/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@841
PS7, Line 841:   LOG.error("Encountered an error loading block metadata 
for table: " +
Could this flood the log when HDFS is under pressure or there is an 
intermittent Kerberos issue? I'm thinking we should aggregate these into a 
single log message. Do you think having every single path helps with 
supportability? My feeling is it's more important to note how many tasks 
failed, but I'll defer to you if you think having the paths is useful.



--
To view, visit http://gerrit.cloudera.org:8080/8235
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I07eaa7151dfc4d56da8db8c2654bd65d8f808481
Gerrit-Change-Number: 8235
Gerrit-PatchSet: 7
Gerrit-Owner: Bharath Vissapragada 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Sat, 21 Oct 2017 00:12:39 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5429: Multi threaded block metadata loading

2017-10-20 Thread Bharath Vissapragada (Code Review)
Bharath Vissapragada has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8235 )

Change subject: IMPALA-5429: Multi threaded block metadata loading
..


Patch Set 6:

(15 comments)

http://gerrit.cloudera.org:8080/#/c/8235/6/be/src/catalog/catalog.cc
File be/src/catalog/catalog.cc:

http://gerrit.cloudera.org:8080/#/c/8235/6/be/src/catalog/catalog.cc@38
PS6, Line 38: DEFINE_int32(max_hdfs_parts_parallel_load, 5,
> Let's spell out "parts" into "partitions" to avoid confusion.
Done


http://gerrit.cloudera.org:8080/#/c/8235/6/be/src/catalog/catalog.cc@40
PS6, Line 40: "tables. Due to HDFS architectural limitations, it is 
unlikely to get a linear "
> In the sentence "Due to HDFS architectural..." what does the "it" after the
Done


http://gerrit.cloudera.org:8080/#/c/8235/6/be/src/catalog/catalog.cc@42
PS6, Line 42: DEFINE_int32(max_s3_parts_parallel_load, 20,
> Does ADLS fall into this or the HDFS bucket? What about other filesystems?
Yes, although I haven't tested it specifically with ADLS. I mentioned that in 
the commit message but it looks like a good thing to reflect that in the config 
name too. I changed it too, let me know if you think we should use a better 
name here.


http://gerrit.cloudera.org:8080/#/c/8235/6/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java:

http://gerrit.cloudera.org:8080/#/c/8235/6/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@216
PS6, Line 216:   private class PathStorageMdLoadingStats {
> Naming seems a little weird. I think "Storage" is too generic and "Block" i
Wfm. Changed it.


http://gerrit.cloudera.org:8080/#/c/8235/6/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@240
PS6, Line 240: // If set to true, reloads the block metadata only when the 
underlying file
> Please clarify "the underlying file"
Done


http://gerrit.cloudera.org:8080/#/c/8235/6/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@413
PS6, Line 413: ++loadStats.ignoredFiles;
> I think we should distinguish the hidden file and already-up-to-date file c
Done


http://gerrit.cloudera.org:8080/#/c/8235/6/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@773
PS6, Line 773:* Returns the thread pool size to load the block metadata of 
this table.
> Command on how we distinguish between HDFS, S3 and other filesystems (using
Done


http://gerrit.cloudera.org:8080/#/c/8235/6/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@793
PS6, Line 793: return Math.max(Math.min(numPaths, threadPoolSize), 1);
> For my understanding, why is the max() needed here? Is it not be a precondi
For empty tables, numPaths is 0. But yes this should be a preconditions check. 
So I moved it to the caller and added a preconditions check in this method.


http://gerrit.cloudera.org:8080/#/c/8235/6/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@800
PS6, Line 800:* metadata is reloaded, else it is loaded from scratch.
> Difference between "reloaded" and "loaded from scratch" is not very clear.
Sounds better. Done.


http://gerrit.cloudera.org:8080/#/c/8235/6/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@804
PS6, Line 804: int numPathsToLoad = partsByPath.keySet().size();
> partsByPath.size()?
Done


http://gerrit.cloudera.org:8080/#/c/8235/6/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@812
PS6, Line 812:   for (Path p: partsByPath.keySet()) {
> The tasks are ordered randomly. I wonder if submitting the tasks in a clust
Will do. I have two other jiras (based on others' comments) to create, so will 
add this to the list.


http://gerrit.cloudera.org:8080/#/c/8235/6/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@821
PS6, Line 821: } catch (ExecutionException | InterruptedException e) {
> We still consider the load successful if one of these fails. What do you th
Oops, I forgot to wrap it as a TLE and throw. Thanks for pointing it out. My 
opinion is that a partially loaded table is ok, but we should let the user know 
about it. What do you think?


http://gerrit.cloudera.org:8080/#/c/8235/6/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@822
PS6, Line 822:   LOG.error("Encountered an error loading block metadata 
for table: " +
> Let's also dump the path
ExecutionException (ex, IOException in this case) includes the path by default 
and LOG.error() logs the full trace?


http://gerrit.cloudera.org:8080/#/c/8235/6/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@898
PS6, Line 898:* the newly created HdfsPartitions in parallel.
> remove "the"
Done


http://gerrit.cloudera.org:8080/#/c/8235/6/fe/src/main/java/org/apache/impala/util/ListMap.java
File fe/src/main/java/org/apache/impala/util/ListMap.java:

http://gerrit.cloudera.org:8080/#/c/8235/6/fe/src/main/java/org/apache/impala/util/ListMap.java@38
PS6, Line 38:   private 

[Impala-ASF-CR] IMPALA-5429: Multi threaded block metadata loading

2017-10-16 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8235 )

Change subject: IMPALA-5429: Multi threaded block metadata loading
..


Patch Set 6:

(15 comments)

http://gerrit.cloudera.org:8080/#/c/8235/6/be/src/catalog/catalog.cc
File be/src/catalog/catalog.cc:

http://gerrit.cloudera.org:8080/#/c/8235/6/be/src/catalog/catalog.cc@38
PS6, Line 38: DEFINE_int32(max_hdfs_parts_parallel_load, 5,
Let's spell out "parts" into "partitions" to avoid confusion.


http://gerrit.cloudera.org:8080/#/c/8235/6/be/src/catalog/catalog.cc@40
PS6, Line 40: "tables. Due to HDFS architectural limitations, it is 
unlikely to get a linear "
In the sentence "Due to HDFS architectural..." what does the "it" after the 
comma refer to? Better to spell it out, e.g.,

"the response time of block metadata loading is unlikely to improve beyond 5 
threads."


http://gerrit.cloudera.org:8080/#/c/8235/6/be/src/catalog/catalog.cc@42
PS6, Line 42: DEFINE_int32(max_s3_parts_parallel_load, 20,
Does ADLS fall into this or the HDFS bucket? What about other filesystems?


http://gerrit.cloudera.org:8080/#/c/8235/6/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java:

http://gerrit.cloudera.org:8080/#/c/8235/6/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@216
PS6, Line 216:   private class PathStorageMdLoadingStats {
Naming seems a little weird. I think "Storage" is too generic and "Block" is 
too specific. How about we settle on "FileMetadata", so we'd have:

FileMetadataLoadStats
FileMetadataLoadRequest


http://gerrit.cloudera.org:8080/#/c/8235/6/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@240
PS6, Line 240: // If set to true, reloads the block metadata only when the 
underlying file
Please clarify "the underlying file"


http://gerrit.cloudera.org:8080/#/c/8235/6/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@413
PS6, Line 413: ++loadStats.ignoredFiles;
I think we should distinguish the hidden file and already-up-to-date file 
cases, e.g. split ignoredFiles into 'hiddenFiles' and 'skippedFreshFiles' or 
similar


http://gerrit.cloudera.org:8080/#/c/8235/6/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@773
PS6, Line 773:* Returns the thread pool size to load the block metadata of 
this table.
Command on how we distinguish between HDFS, S3 and other filesystems (using the 
table root path?), and what the behavior is for mixed-filesystem tables.


http://gerrit.cloudera.org:8080/#/c/8235/6/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@793
PS6, Line 793: return Math.max(Math.min(numPaths, threadPoolSize), 1);
For my understanding, why is the max() needed here? Is it not be a precondition 
that numPaths > 0 and threadPoolSize > 0?


http://gerrit.cloudera.org:8080/#/c/8235/6/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@800
PS6, Line 800:* metadata is reloaded, else it is loaded from scratch.
Difference between "reloaded" and "loaded from scratch" is not very clear. 
Maybe say "incrementally refreshed" vs. "reloaded from scratch"?


http://gerrit.cloudera.org:8080/#/c/8235/6/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@804
PS6, Line 804: int numPathsToLoad = partsByPath.keySet().size();
partsByPath.size()?


http://gerrit.cloudera.org:8080/#/c/8235/6/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@812
PS6, Line 812:   for (Path p: partsByPath.keySet()) {
The tasks are ordered randomly. I wonder if submitting the tasks in a clustered 
fashion would be better or worse. Not relevant to this patch, but might be 
interesting as a follow-on investigation.


http://gerrit.cloudera.org:8080/#/c/8235/6/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@821
PS6, Line 821: } catch (ExecutionException | InterruptedException e) {
We still consider the load successful if one of these fails. What do you think 
the right behavior should be? Is a partially loaded table an acceptable outcome?


http://gerrit.cloudera.org:8080/#/c/8235/6/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@822
PS6, Line 822:   LOG.error("Encountered an error loading block metadata 
for table: " +
Let's also dump the path


http://gerrit.cloudera.org:8080/#/c/8235/6/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@898
PS6, Line 898:* the newly created HdfsPartitions in parallel.
remove "the"


http://gerrit.cloudera.org:8080/#/c/8235/6/fe/src/main/java/org/apache/impala/util/ListMap.java
File fe/src/main/java/org/apache/impala/util/ListMap.java:

http://gerrit.cloudera.org:8080/#/c/8235/6/fe/src/main/java/org/apache/impala/util/ListMap.java@38
PS6, Line 38:   private List list_ = Collections.synchronizedList(new 
ArrayList());
The CC requirements and behavior are not clear to me. Why is it not sufficient 
to make all methods synchronized?



--
To 

[Impala-ASF-CR] IMPALA-5429: Multi threaded block metadata loading

2017-10-16 Thread Bharath Vissapragada (Code Review)
Bharath Vissapragada has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8235 )

Change subject: IMPALA-5429: Multi threaded block metadata loading
..


Patch Set 6:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/8235/6/be/src/catalog/catalog.cc
File be/src/catalog/catalog.cc:

http://gerrit.cloudera.org:8080/#/c/8235/6/be/src/catalog/catalog.cc@39
PS6, Line 39: (Advanced) Number of threads used to load block metadata for HDFS 
based partitioned "
: "tables. Due to HDFS architectural limitations, it is 
unlikely to get a linear "
: "speed up beyond 5 threads.
> When multiple tables are loaded, should I think about the total number of t
Unfortunately not. With the current design, this queue is actually unbounded. 
num_metadata_loading_threads only applies to the loads happening via 
TableLoadingMgr class and there would be other loads, that happen via DDLs 
(CatalogOpExecutor) like REFRESHES/ADD PARTITIONS etc. The original plan was to 
use the TableLoadingMgr to queue all these loads but we didn't end up doing it 
since it shows up as a regression to the end users (since the DDLs can 
potentially wait in the queue much longer than before). Ideally we could do it 
and increase num_metadata_loading_threads to a large value to mimic the present 
behavior.


http://gerrit.cloudera.org:8080/#/c/8235/5/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java:

http://gerrit.cloudera.org:8080/#/c/8235/5/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@783
PS5, Line 783: numPaths) throws Ca
> CONF is the default configuration and its loaded once upfront for the lifet
getFileSystem() by default uses a cache underneath unless we disable it via

CONF.setBoolean("fs.hdfs.impl.disable.cache", true);

But I'm not totally sure if there is a way to optimize beyond that point for 
each partition.


http://gerrit.cloudera.org:8080/#/c/8235/5/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@801
PS5, Line 801:
> yes, that answers it. might be useful to try a workload that has the same n
That'd be a good experiment. We are definitely as fast as the slowest partition 
load. Given we are using a thread pool, smaller partitions give up the 
executing thread much quicker and that would be used by the queued 
partitions_to_load.


http://gerrit.cloudera.org:8080/#/c/8235/6/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java:

http://gerrit.cloudera.org:8080/#/c/8235/6/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@818
PS6, Line 818: for (Future task: pendingMdLoadTasks)
> just for my own info-- since this work is triggered by an end-user, how is
Currently we don't support query cancellation for planning queries 
(IMPALA-915). That is a bigger query life-cycle change and needs to done 
separately.



--
To view, visit http://gerrit.cloudera.org:8080/8235
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I07eaa7151dfc4d56da8db8c2654bd65d8f808481
Gerrit-Change-Number: 8235
Gerrit-PatchSet: 6
Gerrit-Owner: Bharath Vissapragada 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Mon, 16 Oct 2017 22:28:54 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5429: Multi threaded block metadata loading

2017-10-16 Thread Vuk Ercegovac (Code Review)
Vuk Ercegovac has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8235 )

Change subject: IMPALA-5429: Multi threaded block metadata loading
..


Patch Set 6:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/8235/6/be/src/catalog/catalog.cc
File be/src/catalog/catalog.cc:

http://gerrit.cloudera.org:8080/#/c/8235/6/be/src/catalog/catalog.cc@39
PS6, Line 39: (Advanced) Number of threads used to load block metadata for HDFS 
based partitioned "
: "tables. Due to HDFS architectural limitations, it is 
unlikely to get a linear "
: "speed up beyond 5 threads.
When multiple tables are loaded, should I think about the total number of 
threads as num_metadata_loading_threads * max_hdfs_parts_parallel_load? If so, 
is the scaling limitation of 5 with regards to total threads hitting the 
namenode or 5 * 16 (per default settings)?


http://gerrit.cloudera.org:8080/#/c/8235/5/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java:

http://gerrit.cloudera.org:8080/#/c/8235/5/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@783
PS5, Line 783: numPaths) throws Ca
> Correct. This is one of the overheads as noticed in the perf runs and unfor
CONF is the default configuration and its loaded once upfront for the lifetime 
of this class (L201).
I suspect few filesystems are specified-- perhaps we may get lucky and there is 
only one. Potentially, there's a way to make this method cheaper for such cases?


http://gerrit.cloudera.org:8080/#/c/8235/5/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@801
PS5, Line 801:
> Yes, each partition can have its own no. of files, so the work definitely v
yes, that answers it. might be useful to try a workload that has the same 
number of blocks as your current workload, but distributed non-uniformly across 
partitions and files.


http://gerrit.cloudera.org:8080/#/c/8235/6/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java:

http://gerrit.cloudera.org:8080/#/c/8235/6/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@818
PS6, Line 818: for (Future task: pendingMdLoadTasks)
just for my own info-- since this work is triggered by an end-user, how is 
cancellation dealt with?



--
To view, visit http://gerrit.cloudera.org:8080/8235
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I07eaa7151dfc4d56da8db8c2654bd65d8f808481
Gerrit-Change-Number: 8235
Gerrit-PatchSet: 6
Gerrit-Owner: Bharath Vissapragada 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Mon, 16 Oct 2017 20:35:10 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5429: Multi threaded block metadata loading

2017-10-16 Thread Bharath Vissapragada (Code Review)
Hello Jim Apple, Dimitris Tsirogiannis, Mostafa Mokhtar, Vuk Ercegovac,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8235

to look at the new patch set (#6).

Change subject: IMPALA-5429: Multi threaded block metadata loading
..

IMPALA-5429: Multi threaded block metadata loading

Implements multi threaded block metadata loading on the Catalog
server where we fetch block metadata for multiple partitions of a
single table in parallel. Number of threads to load the metadata is
controlled by the following two parameters (set on the Catalog server
startup and applies for each table load)

-max_hdfs_parts_parallel_load(default=5)
-max_s3_parts_parallel_load(default=10)

We use different thread pool sizes for HDFS and S3 based tables
since S3 supports much higher throughput of RPC calls for listStatus
/listFiles. Based on our experiments, S3 showed a linear speed up
(up to ~113x) with increasing number of loading threads where as the
HDFS throughput was limited to ~5x in un-secure clusters and up to
~3.7x in secure clusters. We narrowed it down to scalability
bottlenecks in HDFS RPC implementation (HADOOP-14558) on both the
server and the client side.

One thing to note here is that the thread pool based metadata fetching
is implemented only for loading HDFS block metadata and not for loading
HMS partition information. Our experiments showed that while loading
large partitioned tables, ~90% of the time is spent in connecting to NN
and loading the HDFS block information and optimizing the rest ~10% makes
the code unnecessarily complex without much gain.

Additional notes:

- The multithreading approach is implemented for
  * INVALIDATE (loading from scratch),
  * REFRESH (reusing existing md) code paths,
  * ALTER TABLE ADD/RECOVER PARTITIONS.

- This patch makes the implementation of ListMap thread-safe since
we use that datastructure as a shared state between multiple partition
metadata loding threads.

- While the configuration param max_s3_parts_parallel_load says s3, it
applies for any FileSystem implementation that doesn't support storage
IDs (like ADLS).

Testing and Results:

- This patch doesn't add any new tests since there is enough test
coverage already. Passed core/exhaustive runs with HDFS/S3.

- We noticed up to ~113x speed up on S3 tables(thread_pool_size=160)
and up to ~5x speed up in un-secure HDFS clusters and ~3.7x in secure
HDFS clusters.

- Synthesized the following two large tables on HDFS and S3 and noticed
significant reduction in my test DDL queries.

  (1) 100K partitions + 1 million files
  (2) 80 partitions + 250K files

 100K-PARTITIONS-1M-FILES-CUSTOM-11-REFRESH-PARTITION I -16.4%
 100K-PARTITIONS-1M-FILES-CUSTOM-08-ADD-PARTITION I -17.25%
 80-PARTITIONS-250K-FILES-11-REFRESH-PARTITIONI -23.57%
 80-PARTITIONS-250K-FILES-S3-08-ADD-PARTITION I -23.87%
 80-PARTITIONS-250K-FILES-09-INVALIDATE   I -24.88%
 80-PARTITIONS-250K-FILES-03-RECOVER  I -35.90%
 80-PARTITIONS-250K-FILES-07-REFRESH  I -43.03%
 100K-PARTITIONS-1M-FILES-CUSTOM-12-QUERY-PARTITIONS  I -43.93%
 100K-PARTITIONS-1M-FILES-CUSTOM-05-QUERY-AFTER-INV   I -46.59%
 80-PARTITIONS-250K-FILES-10-REFRESH-AFTER-ADD-PARTITION  I -48.71%
 100K-PARTITIONS-1M-FILES-CUSTOM-07-REFRESH   I -49.02%
 80-PARTITIONS-250K-FILES-05-QUERY-AFTER-INV  I -49.05%
 100K-PARTITIONS-1M-FILES-CUSTOM-10-REFRESH-AFTER-ADD-PARTI -51.87%
 80-PARTITIONS-250K-FILES-S3-03-RECOVER   I -67.17%
 80-PARTITIONS-250K-FILES-S3-05-QUERY-AFTER-INV   I -76.45%
 80-PARTITIONS-250K-FILES-S3-07-REFRESH   I -87.04%
 80-PARTITIONS-250K-FILES-S3-10-REFRESH-AFTER-ADD-PARTI -88.57%

Change-Id: I07eaa7151dfc4d56da8db8c2654bd65d8f808481
---
M be/src/catalog/catalog.cc
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
M 
fe/src/main/java/org/apache/impala/catalog/HdfsPartitionLocationCompressor.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/service/JniCatalog.java
M fe/src/main/java/org/apache/impala/util/ListMap.java
9 files changed, 411 insertions(+), 238 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/35/8235/6
--
To view, visit http://gerrit.cloudera.org:8080/8235
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I07eaa7151dfc4d56da8db8c2654bd65d8f808481
Gerrit-Change-Number: 8235
Gerrit-PatchSet: 6
Gerrit-Owner: Bharath Vissapragada 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Dimitris 

[Impala-ASF-CR] IMPALA-5429: Multi threaded block metadata loading

2017-10-16 Thread Bharath Vissapragada (Code Review)
Bharath Vissapragada has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8235 )

Change subject: IMPALA-5429: Multi threaded block metadata loading
..


Patch Set 5:

(36 comments)

http://gerrit.cloudera.org:8080/#/c/8235/4//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/8235/4//COMMIT_MSG@59
PS4, Line 59:
> I see. So are these literally as simple as, to pick the first one, a single
That is correct.


http://gerrit.cloudera.org:8080/#/c/8235/5/be/src/catalog/catalog.cc
File be/src/catalog/catalog.cc:

http://gerrit.cloudera.org:8080/#/c/8235/5/be/src/catalog/catalog.cc@35
PS5, Line 35: DEFINE_int32(num_metadata_loading_threads, 16,
: "(Advanced) The number of metadata loading threads (degree of 
parallelism) to use "
: "when loading catalog metadata.");
> I'm confused by the commit message which talks about not loading from hms u
Right, this flag controls the number parallel *table* loads. And each table 
loading involves

- Loading HMS metadata (including schema details, partition information etc.)
- Loading/synthesizing block metadata

So this flag is about inter-table parallelism and the commit message talks 
about intra-table parallel loading. Does that make sense? (I clarified the 
content in the commit message a little to reflect this, incase that helps).


http://gerrit.cloudera.org:8080/#/c/8235/5/be/src/catalog/catalog.cc@42
PS5, Line 42: DEFINE_int32(max_s3_parts_parallel_load, 10,
> I would be more aggressive with this parameter and put it at 20.
Done


http://gerrit.cloudera.org:8080/#/c/8235/5/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java:

http://gerrit.cloudera.org:8080/#/c/8235/5/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@215
PS5, Line 215:   // Block metadata loading stats for a single HDFS path.
> nit: File/block (since we're also loading/refreshing files). Also, you may
Done


http://gerrit.cloudera.org:8080/#/c/8235/5/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@217
PS5, Line 217: loadedFiles_
> You may want to add a comment here. What is loaded vs refreshed? Is the one
Done


http://gerrit.cloudera.org:8080/#/c/8235/5/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@218
PS5, Line 218: _
> I believe the convention is that we don't use '_' for public members.
Done. Was not aware of this.


http://gerrit.cloudera.org:8080/#/c/8235/5/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@217
PS5, Line 217: public int loadedFiles_ = 0;
 : public int refreshedFiles_ = 0;
 : public int ignoredFiles_ = 0;
> add comments for these-- see the question regarding refreshedFiles below, f
Yep, the current variables are confusing for sure, redid them and added 
comments.


http://gerrit.cloudera.org:8080/#/c/8235/5/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@231
PS5, Line 231: Runnable
> I don't know how this is used later on, but alternatively you can make Path
I settled for Runnable because it looked more cleaner to me with this class 
exposing the debugString() method and only those callers who need the return 
value can access that method. I'm fine either way. I switched to Callable.


http://gerrit.cloudera.org:8080/#/c/8235/5/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@247
PS5, Line 247: Blocks on the loadBlockMetadata() call.
> Not following this comment. run() either calls refreshBlockMetadata() or lo
The comment was stale. Updated it.


http://gerrit.cloudera.org:8080/#/c/8235/5/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@333
PS5, Line 333: loadBlockMetadata
> I know I am guilty for some of these names but maybe rename this to "resetA
Done


http://gerrit.cloudera.org:8080/#/c/8235/5/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@363
PS5, Line 363: numUnknownDiskIds
> Are you overriding or incrementing the value of numUnknownDiskIds in the cr
Incrementing and will use the final value in L369. Tracking the create call 
which eventually calls HdfsPartition.createDiskIds(), it only increments and 
doesn't overwrite.


http://gerrit.cloudera.org:8080/#/c/8235/5/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@368
PS5, Line 368: for (HdfsPartition partition: partitions) 
partition.setFileDescriptors(
> Am I misreading this or does each partition get set to the same list of new
That is right. All the HdfsPartitions in 'partitions' map to the same partDir. 
So we update the fileDescs for each of them.


http://gerrit.cloudera.org:8080/#/c/8235/5/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@368
PS5, Line 368: newFileDescs
> Hm, that doesn't seem particularly safe (i.e. using the same list for every
Like we discussed, there is a subtle bug here (even without the patch). We can 
see the inconsistency with the following steps

- create table test(a 

[Impala-ASF-CR] IMPALA-5429: Multi threaded block metadata loading

2017-10-13 Thread Vuk Ercegovac (Code Review)
Vuk Ercegovac has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8235 )

Change subject: IMPALA-5429: Multi threaded block metadata loading
..


Patch Set 5:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8235/5/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java:

http://gerrit.cloudera.org:8080/#/c/8235/5/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@368
PS5, Line 368: for (HdfsPartition partition: partitions) 
partition.setFileDescriptors(
> Am I misreading this or does each partition get set to the same list of new
Just clarified this so I'll post my misunderstanding here. The comment states 
that "partitions in 'partitions' that correspond to the path 'partDir'". 
"correspond" is open to interpretation so I understood this as: 
/.../partDir/partition1_dir, partition2_dir, etc.
The intended interpretation is that multiple partitions map (or refer) to the 
same fs path ('pathDir') (usecase: "latest" is an alias for the most recently 
populated partition).



--
To view, visit http://gerrit.cloudera.org:8080/8235
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I07eaa7151dfc4d56da8db8c2654bd65d8f808481
Gerrit-Change-Number: 8235
Gerrit-PatchSet: 5
Gerrit-Owner: Bharath Vissapragada 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Fri, 13 Oct 2017 18:36:38 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5429: Multi threaded block metadata loading

2017-10-11 Thread Mostafa Mokhtar (Code Review)
Mostafa Mokhtar has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8235 )

Change subject: IMPALA-5429: Multi threaded block metadata loading
..


Patch Set 5:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/8235/5/be/src/catalog/catalog.cc
File be/src/catalog/catalog.cc:

http://gerrit.cloudera.org:8080/#/c/8235/5/be/src/catalog/catalog.cc@42
PS5, Line 42: DEFINE_int32(max_s3_parts_parallel_load, 10,
I would be more aggressive with this parameter and put it at 20.


http://gerrit.cloudera.org:8080/#/c/8235/5/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java:

http://gerrit.cloudera.org:8080/#/c/8235/5/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@787
PS5, Line 787: int threadPoolSize = 
FileSystemUtil.supportsStorageIds(tableFs) ?
What is the expected behavior for tables with mixed FSs?
As a mix of S3 and HDFS partitions.


http://gerrit.cloudera.org:8080/#/c/8235/5/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@801
PS5, Line 801: getLoadingThreadPoolSize
> can different partitions have different number of files? if so, work across
Parallelization of metadata loading is done on per partition granularity.



--
To view, visit http://gerrit.cloudera.org:8080/8235
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I07eaa7151dfc4d56da8db8c2654bd65d8f808481
Gerrit-Change-Number: 8235
Gerrit-PatchSet: 5
Gerrit-Owner: Bharath Vissapragada 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Thu, 12 Oct 2017 04:51:55 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5429: Multi threaded block metadata loading

2017-10-11 Thread Dimitris Tsirogiannis (Code Review)
Dimitris Tsirogiannis has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8235 )

Change subject: IMPALA-5429: Multi threaded block metadata loading
..


Patch Set 5:

(22 comments)

http://gerrit.cloudera.org:8080/#/c/8235/5/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java:

http://gerrit.cloudera.org:8080/#/c/8235/5/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@215
PS5, Line 215:   // Block metadata loading stats for a single HDFS path.
nit: File/block (since we're also loading/refreshing files). Also, you may want 
to change the name of the private class to reflect that, e.g. 
PathMetadataLoadingStats or something like that.


http://gerrit.cloudera.org:8080/#/c/8235/5/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@217
PS5, Line 217: loadedFiles_
You may want to add a comment here. What is loaded vs refreshed? Is the one 
superset of the other. Good to clarify to avoid confusion.


http://gerrit.cloudera.org:8080/#/c/8235/5/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@218
PS5, Line 218: _
I believe the convention is that we don't use '_' for public members.


http://gerrit.cloudera.org:8080/#/c/8235/5/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@231
PS5, Line 231: Runnable
I don't know how this is used later on, but alternatively you can make 
PathBlockMetadataLoadRequest implement Callable, hence 
returning the stats when calling call(). Now you seem to access the stats only 
through the debugString() function.


http://gerrit.cloudera.org:8080/#/c/8235/5/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@247
PS5, Line 247: Blocks on the loadBlockMetadata() call.
Not following this comment. run() either calls refreshBlockMetadata() or 
loadBlockMetadata(), so I can't really interpret what this comment is saying.


http://gerrit.cloudera.org:8080/#/c/8235/5/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@333
PS5, Line 333: loadBlockMetadata
I know I am guilty for some of these names but maybe rename this to 
"resetAndLoadBlockMetadata"? Then it is more clear what the differences are 
between this function and rerfreshBlockMetadata().


http://gerrit.cloudera.org:8080/#/c/8235/5/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@363
PS5, Line 363: numUnknownDiskIds
Are you overriding or incrementing the value of numUnknownDiskIds in the 
create() function?


http://gerrit.cloudera.org:8080/#/c/8235/5/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@368
PS5, Line 368: newFileDescs
Hm, that doesn't seem particularly safe (i.e. using the same list for every 
partition). Are we certain that any other partition modification operation 
(e.g. alter partition set location) will not try to override the file 
descriptors, thereby affecting all the other partitions?


http://gerrit.cloudera.org:8080/#/c/8235/5/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@380
PS5, Line 380: changed
 :* mtime
It's not just the changed mtime that we're looking for in order to determiner 
modification, so you may want to either remove this or mention all the criteria 
we use. I prefer the former.


http://gerrit.cloudera.org:8080/#/c/8235/5/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@383
PS5, Line 383: The initial table load still uses the listFiles()
 :* on the data directory that fetches both the FileStatus as 
well as BlockLocations in
 :* a single call.
remove


http://gerrit.cloudera.org:8080/#/c/8235/5/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@398
PS5, Line 398: get(0)
Comment why we take the file descriptors of one partition and that it doesn't 
really matter which one we choose.


http://gerrit.cloudera.org:8080/#/c/8235/5/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@399
PS5, Line 399: public String apply(FileDescriptor desc) {
 :   return desc.getFileName();
 : }
Add @Override and make it a single line (if it fits)


http://gerrit.cloudera.org:8080/#/c/8235/5/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@448
PS5, Line 448: (fd == null) || (fd.getFileLength() != status.getLen()) ||
 :   (fd.getModificationTime() != status.getModificationTime());
I think we were also checking if the partition was cached. Isn't this check 
needed anymore?


http://gerrit.cloudera.org:8080/#/c/8235/5/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@455
PS5, Line 455: refreshFileMetadata
Maybe call it refreshPartitionStorageMetadata? Overall, it may make sense to 
replace "File/Block" with "Storage" whenever it makes sense. Thoughts?


http://gerrit.cloudera.org:8080/#/c/8235/5/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@694
PS5, Line 694: Exception
Why this change?



[Impala-ASF-CR] IMPALA-5429: Multi threaded block metadata loading

2017-10-11 Thread Vuk Ercegovac (Code Review)
Vuk Ercegovac has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8235 )

Change subject: IMPALA-5429: Multi threaded block metadata loading
..


Patch Set 5:

(9 comments)

http://gerrit.cloudera.org:8080/#/c/8235/5/be/src/catalog/catalog.cc
File be/src/catalog/catalog.cc:

http://gerrit.cloudera.org:8080/#/c/8235/5/be/src/catalog/catalog.cc@35
PS5, Line 35: DEFINE_int32(num_metadata_loading_threads, 16,
: "(Advanced) The number of metadata loading threads (degree of 
parallelism) to use "
: "when loading catalog metadata.");
I'm confused by the commit message which talks about not loading from hms using 
multiple threads and this flag which indicates that hms is loaded using 
multiple threads.


http://gerrit.cloudera.org:8080/#/c/8235/5/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java:

http://gerrit.cloudera.org:8080/#/c/8235/5/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@217
PS5, Line 217: public int loadedFiles_ = 0;
 : public int refreshedFiles_ = 0;
 : public int ignoredFiles_ = 0;
add comments for these-- see the question regarding refreshedFiles below, for 
example.


http://gerrit.cloudera.org:8080/#/c/8235/5/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@368
PS5, Line 368: for (HdfsPartition partition: partitions) 
partition.setFileDescriptors(
Am I misreading this or does each partition get set to the same list of newly 
found descriptors?


http://gerrit.cloudera.org:8080/#/c/8235/5/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@426
PS5, Line 426: new Reference(Long.valueOf(0)
why not use numUnknownDiskIds here?


http://gerrit.cloudera.org:8080/#/c/8235/5/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@431
PS5, Line 431: ++loadStats.refreshedFiles_;
does refreshedFiles mean "file blocks reloaded" or "file checked for reload and 
possibly reloaded"?

would be good to track how many times the if-block on L418 was entered since 
this method is intended to be used when few changes are present.


http://gerrit.cloudera.org:8080/#/c/8235/5/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@433
PS5, Line 433: for (HdfsPartition partition: partitions) 
partition.setFileDescriptors(n
same question as in the load method.


http://gerrit.cloudera.org:8080/#/c/8235/5/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@773
PS5, Line 773: HDFS and S3
just to clarify, HdfsTable covers both hdfs table metadata as well as metadata 
needed for s3?


http://gerrit.cloudera.org:8080/#/c/8235/5/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@783
PS5, Line 783: getFileSystem(CONF)
I noticed that this is called in many places in this class-- is it bc a given 
table can be stored on multiple filesystems?


http://gerrit.cloudera.org:8080/#/c/8235/5/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@801
PS5, Line 801: getLoadingThreadPoolSize
can different partitions have different number of files? if so, work across 
threads may vary. what's costly here: per file call, per partition call, or 
number of blocks per file?



--
To view, visit http://gerrit.cloudera.org:8080/8235
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I07eaa7151dfc4d56da8db8c2654bd65d8f808481
Gerrit-Change-Number: 8235
Gerrit-PatchSet: 5
Gerrit-Owner: Bharath Vissapragada 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Wed, 11 Oct 2017 18:51:23 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5429: Multi threaded block metadata loading

2017-10-10 Thread Jim Apple (Code Review)
Jim Apple has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8235 )

Change subject: IMPALA-5429: Multi threaded block metadata loading
..


Patch Set 4:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8235/4//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/8235/4//COMMIT_MSG@59
PS4, Line 59:  100K-PARTITIONS-1M-FILES-CUSTOM-11-REFRESH-PARTITION I -16.4%
> Thanks for pointing this out. I should've added more context here. Mostafa
I see. So are these literally as simple as, to pick the first one, a single 
refresh call?



--
To view, visit http://gerrit.cloudera.org:8080/8235
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I07eaa7151dfc4d56da8db8c2654bd65d8f808481
Gerrit-Change-Number: 8235
Gerrit-PatchSet: 4
Gerrit-Owner: Bharath Vissapragada 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Jim Apple 
Gerrit-Comment-Date: Wed, 11 Oct 2017 01:14:34 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5429: Multi threaded block metadata loading

2017-10-10 Thread Bharath Vissapragada (Code Review)
Hello Jim Apple,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8235

to look at the new patch set (#5).

Change subject: IMPALA-5429: Multi threaded block metadata loading
..

IMPALA-5429: Multi threaded block metadata loading

Implements multi threaded block metadata loading on the Catalog
server where we fetch block metadata for multiple partitions
in parallel. Number of threads to load the metadata is controlled
by the following two parameters (set on the Catalog server startup)

-max_hdfs_parts_parallel_load(default=5)
-max_s3_parts_parallel_load(default=10)

We use different thread pool sizes for HDFS and S3 based tables
since S3 supports much higher throughput of RPC calls for listStatus
/listFiles. Based on our experiments, S3 showed a linear speed up
(up to ~113x) with increasing number of loading threads where as the
HDFS throughput was limited to ~5x in un-secure clusters and up to
~3.7x in secure clusters. We narrowed it down to scalability
bottlenecks in HDFS RPC implementation (HADOOP-14558) on both the
server and the client side.

One thing to note here is that the thread pool based metadata fetching
is implemented only for loading HDFS block metadata and not for loading
HMS partition information. Our experiments showed that while loading large
partitioned tables, ~90% of the time is spent in connecting to NN and
loading the HDFS block information and optimizing the rest ~10% makes
the code unnecessarily complex without much gain.

Additional notes:

- The multithreading approach is implemented for
  * INVALIDATE (loading from scratch),
  * REFRESH (reusing existing md) code paths,
  * ALTER TABLE ADD/RECOVER PARTITIONS.

- This patch makes the implementation of ListMap thread-safe since
we use that datastructure as a shared state between multiple partition
metadata loding threads.

- While the configuration param max_s3_parts_parallel_load says s3, it
applies for any FileSystem implementation that doesn't support storage
IDs (like ADLS).

Testing and Results:

- This patch doesn't add any new tests since there is enough test
coverage already. Passed core/exhaustive runs with HDFS/S3.

- We noticed up to ~113x speed up on S3 tables(thread_pool_size=160)
and up to ~5x speed up in un-secure HDFS clusters and ~3.7x in secure
HDFS clusters.

- Synthesized the following two large tables on HDFS and S3 and noticed
significant reduction in my test DDL queries.

  (1) 100K partitions + 1 million files
  (2) 80 partitions + 250K files

 100K-PARTITIONS-1M-FILES-CUSTOM-11-REFRESH-PARTITION I -16.4%
 100K-PARTITIONS-1M-FILES-CUSTOM-08-ADD-PARTITION I -17.25%
 80-PARTITIONS-250K-FILES-11-REFRESH-PARTITIONI -23.57%
 80-PARTITIONS-250K-FILES-S3-08-ADD-PARTITION I -23.87%
 80-PARTITIONS-250K-FILES-09-INVALIDATE   I -24.88%
 80-PARTITIONS-250K-FILES-03-RECOVER  I -35.90%
 80-PARTITIONS-250K-FILES-07-REFRESH  I -43.03%
 100K-PARTITIONS-1M-FILES-CUSTOM-12-QUERY-PARTITIONS  I -43.93%
 100K-PARTITIONS-1M-FILES-CUSTOM-05-QUERY-AFTER-INV   I -46.59%
 80-PARTITIONS-250K-FILES-10-REFRESH-AFTER-ADD-PARTITION  I -48.71%
 100K-PARTITIONS-1M-FILES-CUSTOM-07-REFRESH   I -49.02%
 80-PARTITIONS-250K-FILES-05-QUERY-AFTER-INV  I -49.05%
 100K-PARTITIONS-1M-FILES-CUSTOM-10-REFRESH-AFTER-ADD-PARTI -51.87%
 80-PARTITIONS-250K-FILES-S3-03-RECOVER   I -67.17%
 80-PARTITIONS-250K-FILES-S3-05-QUERY-AFTER-INV   I -76.45%
 80-PARTITIONS-250K-FILES-S3-07-REFRESH   I -87.04%
 80-PARTITIONS-250K-FILES-S3-10-REFRESH-AFTER-ADD-PARTI -88.57%

Change-Id: I07eaa7151dfc4d56da8db8c2654bd65d8f808481
---
M be/src/catalog/catalog.cc
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
M 
fe/src/main/java/org/apache/impala/catalog/HdfsPartitionLocationCompressor.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/service/JniCatalog.java
M fe/src/main/java/org/apache/impala/util/ListMap.java
9 files changed, 386 insertions(+), 212 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/35/8235/5
--
To view, visit http://gerrit.cloudera.org:8080/8235
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I07eaa7151dfc4d56da8db8c2654bd65d8f808481
Gerrit-Change-Number: 8235
Gerrit-PatchSet: 5
Gerrit-Owner: Bharath Vissapragada 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Jim Apple 


[Impala-ASF-CR] IMPALA-5429: Multi threaded block metadata loading

2017-10-10 Thread Bharath Vissapragada (Code Review)
Bharath Vissapragada has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8235 )

Change subject: IMPALA-5429: Multi threaded block metadata loading
..


Patch Set 4:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8235/4//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/8235/4//COMMIT_MSG@59
PS4, Line 59:  100K-PARTITIONS-1M-FILES-CUSTOM-11-REFRESH-PARTITION I -16.4%
> What are these benchmarks?
Thanks for pointing this out. I should've added more context here. Mostafa 
worked on creating a metadata benchmark by synthesizing large partitioned 
tables and then running some operations that stress the Catalog. Looks like the 
code/scripts are not yet ready for consumption by everyone. I borrowed those 
from him to run these tests. I'll add that information to the commit message.



--
To view, visit http://gerrit.cloudera.org:8080/8235
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I07eaa7151dfc4d56da8db8c2654bd65d8f808481
Gerrit-Change-Number: 8235
Gerrit-PatchSet: 4
Gerrit-Owner: Bharath Vissapragada 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Jim Apple 
Gerrit-Comment-Date: Wed, 11 Oct 2017 00:24:11 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5429: Multi threaded block metadata loading

2017-10-09 Thread Jim Apple (Code Review)
Jim Apple has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8235 )

Change subject: IMPALA-5429: Multi threaded block metadata loading
..


Patch Set 4:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8235/4//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/8235/4//COMMIT_MSG@59
PS4, Line 59:  100K-PARTITIONS-1M-FILES-CUSTOM-11-REFRESH-PARTITION I -16.4%
What are these benchmarks?



--
To view, visit http://gerrit.cloudera.org:8080/8235
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I07eaa7151dfc4d56da8db8c2654bd65d8f808481
Gerrit-Change-Number: 8235
Gerrit-PatchSet: 4
Gerrit-Owner: Bharath Vissapragada 
Gerrit-Reviewer: Jim Apple 
Gerrit-Comment-Date: Tue, 10 Oct 2017 03:48:45 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5429: Multi threaded block metadata loading

2017-10-09 Thread Bharath Vissapragada (Code Review)
Bharath Vissapragada has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/8235


Change subject: IMPALA-5429: Multi threaded block metadata loading
..

IMPALA-5429: Multi threaded block metadata loading

Implements multi threaded block metadata loading on the Catalog
server where we fetch block metadata for multiple partitions
in parallel. Number of threads to load the metadata is controlled
by the following two parameters (set on the Catalog server startup)

-max_hdfs_parts_parallel_load(default=5)
-max_s3_parts_parallel_load(default=10)

We use different thread pool sizes for HDFS and S3 based tables
since S3 supports much higher throughput of RPC calls for listStatus
/listFiles. Based on our experiments, S3 showed a linear speed up
(up to ~113x) with increasing number of loading threads where as the
HDFS throughput was limited to ~5x in un-secure clusters and up to
~3.7x in secure clusters. We narrowed it down to scalability
bottlenecks in HDFS RPC implementation (HADOOP-14558) on both the
server and the client side.

One thing to note here is that the thread pool based metadata fetching
is implemented only for loading HDFS block metadata and not for loading
HMS partition information. Our experiments showed that while loading large
partitioned tables, ~90% of the time is spent in connecting to NN and
loading the HDFS block information and optimizing the rest ~10% makes
the code unnecessarily complex without much gain.

Additional notes:

- The multithreading approach is implemented for
  * INVALIDATE (loading from scratch),
  * REFRESH (reusing existing md) code paths,
  * ALTER TABLE ADD/RECOVER PARTITIONS.

- This patch makes the implementation of ListMap thread-safe since
we use that datastructure as a shared state between multiple partition
metadata loding threads.

- While the configuration param max_s3_parts_parallel_load says s3, it
applies for any FileSystem implementation that doesn't support storage
IDs (like ADLS).

Testing and Results:

- This patch doesn't add any new tests since there is enough test
coverage already. Passed core/exhaustive runs with HDFS/S3.

- We noticed up to ~113x speed up on S3 tables(thread_pool_size=160)
and up to ~5x speed up in un-secure HDFS clusters and ~3.7x in secure
HDFS clusters.

- Benchmark improvements on a 16 node cluster (I = Improvement)

 100K-PARTITIONS-1M-FILES-CUSTOM-11-REFRESH-PARTITION I -16.4%
 100K-PARTITIONS-1M-FILES-CUSTOM-08-ADD-PARTITION I -17.25%
 100K-PARTITIONS-1M-FILES-CUSTOM-11-DROP-PARTITIONI -18.53%
 80-PARTITIONS-250K-FILES-11-REFRESH-PARTITIONI -23.57%
 80-PARTITIONS-250K-FILES-S3-08-ADD-PARTITION I -23.87%
 80-PARTITIONS-250K-FILES-09-INVALIDATE   I -24.88%
 80-PARTITIONS-250K-FILES-01-DROP I -34.82%
 80-PARTITIONS-250K-FILES-03-RECOVER  I -35.90%
 80-PARTITIONS-250K-FILES-07-REFRESH  I -43.03%
 100K-PARTITIONS-1M-FILES-CUSTOM-12-QUERY-PARTITIONS  I -43.93%
 100K-PARTITIONS-1M-FILES-CUSTOM-05-QUERY-AFTER-INV   I -46.59%
 80-PARTITIONS-250K-FILES-10-REFRESH-AFTER-ADD-PARTITION  I -48.71%
 100K-PARTITIONS-1M-FILES-CUSTOM-07-REFRESH   I -49.02%
 80-PARTITIONS-250K-FILES-05-QUERY-AFTER-INV  I -49.05%
 100K-PARTITIONS-1M-FILES-CUSTOM-10-REFRESH-AFTER-ADD-PARTI -51.87%
 80-PARTITIONS-250K-FILES-S3-03-RECOVER   I -67.17%
 80-PARTITIONS-250K-FILES-S3-01-DROP  I -70.38%
 80-PARTITIONS-250K-FILES-S3-05-QUERY-AFTER-INV   I -76.45%
 80-PARTITIONS-250K-FILES-S3-07-REFRESH   I -87.04%
 80-PARTITIONS-250K-FILES-S3-10-REFRESH-AFTER-ADD-PARTI -88.57%

Change-Id: I07eaa7151dfc4d56da8db8c2654bd65d8f808481
---
M be/src/catalog/catalog.cc
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
M 
fe/src/main/java/org/apache/impala/catalog/HdfsPartitionLocationCompressor.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/service/JniCatalog.java
M fe/src/main/java/org/apache/impala/util/ListMap.java
9 files changed, 386 insertions(+), 212 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/35/8235/4
--
To view, visit http://gerrit.cloudera.org:8080/8235
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I07eaa7151dfc4d56da8db8c2654bd65d8f808481
Gerrit-Change-Number: 8235
Gerrit-PatchSet: 4
Gerrit-Owner: Bharath Vissapragada