Lars Volker has posted comments on this change. ( http://gerrit.cloudera.org:8080/11575 )
Change subject: IMPALA-6964: Track stats about column and page sizes in Parquet reader ...................................................................... Patch Set 10: (7 comments) http://gerrit.cloudera.org:8080/#/c/11575/10/be/src/exec/hdfs-scan-node-base.h File be/src/exec/hdfs-scan-node-base.h: http://gerrit.cloudera.org:8080/#/c/11575/10/be/src/exec/hdfs-scan-node-base.h@563 PS10, Line 563: multiple nit: double word http://gerrit.cloudera.org:8080/#/c/11575/10/be/src/exec/hdfs-scan-node-base.h@564 PS10, Line 564: updated nit: typo http://gerrit.cloudera.org:8080/#/c/11575/10/be/src/exec/hdfs-scan-node-base.h@565 PS10, Line 565: boost::shared_mutex bytes_read_per_col_lock_; Is it an option to collect the stats in the column readers and then pull them from there in StopAndFinalizeCounters instead? That would prevent us from having a lock. http://gerrit.cloudera.org:8080/#/c/11575/10/be/src/exec/hdfs-scan-node-base.cc File be/src/exec/hdfs-scan-node-base.cc: http://gerrit.cloudera.org:8080/#/c/11575/10/be/src/exec/hdfs-scan-node-base.cc@901 PS10, Line 901: uncompressed_bytes_read_per_column_counter_->UpdateCounter(bytes_read.second.first); Do we need to acquire the lock here, too? It we don't, can you add a comment to explain why? http://gerrit.cloudera.org:8080/#/c/11575/10/be/src/exec/hdfs-scan-node-base.cc@945 PS10, Line 945: auto bytes_read = bytes_read_per_col_.find(slot_id); nit: in this case I'd mention the type in the name to prevent confusion, i.e. bytes_read_it = ... http://gerrit.cloudera.org:8080/#/c/11575/10/be/src/exec/hdfs-scan-node-base.cc@947 PS10, Line 947: lock_guard<boost::shared_mutex> bytes_read_per_col_guard(bytes_read_per_col_lock_); It seems that unique_lock is the corresponding unique version of shared_lock. Is there a benefit in using shared_lock? Can you add the namespace to the lock templates, too? Otherwise it's not clear whether we're using std:: or boost::. http://gerrit.cloudera.org:8080/#/c/11575/5/tests/util/parse_util.py File tests/util/parse_util.py: http://gerrit.cloudera.org:8080/#/c/11575/5/tests/util/parse_util.py@137 PS5, Line 137: example of how this method should be used: > Done I think it'd be nice to actually execute the example code. You could look for a suitable test where to add it, or just add the assert to this method. -- To view, visit http://gerrit.cloudera.org:8080/11575 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I322f9b324b6828df28e5caf79529085c43d7c817 Gerrit-Change-Number: 11575 Gerrit-PatchSet: 10 Gerrit-Owner: Sahil Takiar <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Lars Volker <[email protected]> Gerrit-Reviewer: Sahil Takiar <[email protected]> Gerrit-Comment-Date: Tue, 13 Nov 2018 20:28:11 +0000 Gerrit-HasComments: Yes
