Lars Volker has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11575 )

Change subject: IMPALA-6964: Track stats about column and page sizes in Parquet 
reader
......................................................................


Patch Set 5:

(9 comments)

http://gerrit.cloudera.org:8080/#/c/11575/5/be/src/exec/hdfs-parquet-scanner.h
File be/src/exec/hdfs-parquet-scanner.h:

http://gerrit.cloudera.org:8080/#/c/11575/5/be/src/exec/hdfs-parquet-scanner.h@662
PS5, Line 662: ParquetColumnReade
nit: typo


http://gerrit.cloudera.org:8080/#/c/11575/5/be/src/exec/hdfs-scan-node-base.cc
File be/src/exec/hdfs-scan-node-base.cc:

http://gerrit.cloudera.org:8080/#/c/11575/5/be/src/exec/hdfs-scan-node-base.cc@635
PS5, Line 635:   DCHECK(scanner->get() == nullptr);
DCHECK_EQ


http://gerrit.cloudera.org:8080/#/c/11575/5/tests/util/parse_util.py
File tests/util/parse_util.py:

http://gerrit.cloudera.org:8080/#/c/11575/5/tests/util/parse_util.py@130
PS5, Line 130: _
By convention we (and Python) often uses a leading _ for private methods. Now 
that it is used in other modules you should remove it.


http://gerrit.cloudera.org:8080/#/c/11575/5/tests/util/parse_util.py@137
PS5, Line 137:
Please put an example of a string here that this regex matches. You might even 
put an assert regex_...matches(example) there.


http://gerrit.cloudera.org:8080/#/c/11575/5/tests/util/parse_util.py@138
PS5, Line 138:   regex_summary_stat = r"\(Avg:.*?\((?P<avg>[0-9]*[.]*[0-9]*)\) 
; " \
Use re.compile, then wrap the RE in """ to not have to use backslashes. This 
page has more details: https://docs.python.org/2/library/re.html#re.VERBOSE


http://gerrit.cloudera.org:8080/#/c/11575/5/tests/util/parse_util.py@147
PS5, Line 147:   for counter in re.findall(counter_name + ".*", 
runtime_profile):
Will that match newlines, too?


http://gerrit.cloudera.org:8080/#/c/11575/5/tests/util/parse_util.py@148
PS5, Line 148:     summary_stat = list(re.finditer(regex_summary_stat, counter))
If you do this line-wise, why would there be more than one match?


http://gerrit.cloudera.org:8080/#/c/11575/5/tests/util/parse_util.py@149
PS5, Line 149: all
nit: at all?


http://gerrit.cloudera.org:8080/#/c/11575/5/tests/util/parse_util.py@152
PS5, Line 152:
> flake8: E203 whitespace before ':'
You can also create a dict using dict(avg=0, min=0, max=0, samples=0) (might be 
a bit more concise).

You can also add an assert to make sure the counter is actually empty.



--
To view, visit http://gerrit.cloudera.org:8080/11575
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I322f9b324b6828df28e5caf79529085c43d7c817
Gerrit-Change-Number: 11575
Gerrit-PatchSet: 5
Gerrit-Owner: Sahil Takiar <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Lars Volker <[email protected]>
Gerrit-Reviewer: Sahil Takiar <[email protected]>
Gerrit-Comment-Date: Thu, 08 Nov 2018 03:10:44 +0000
Gerrit-HasComments: Yes

Reply via email to