Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8085 )

Change subject: IMPALA-5307: part 1: don't transfer disk I/O buffers out of 
parquet
......................................................................


Patch Set 6:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8085/6/be/src/exec/parquet-column-readers.cc
File be/src/exec/parquet-column-readers.cc:

http://gerrit.cloudera.org:8080/#/c/8085/6/be/src/exec/parquet-column-readers.cc@1071
PS6, Line 1071:       if 
(PageContainsTupleData(current_page_header_.data_page_header.encoding)) {
> Playing devil's advocate and really trying to understand the crux of why th
The basic problem is that I want to be able to reason about how many I/O 
buffers are needed to make progress in reading through a Parquet file. If each 
column can make progress independently, then it's simple, since you can reason 
about each column in isolation. With the copying-out approach, you only need one 
buffer per column to make progress.

If you try to hold onto I/O buffers you end up with dependencies between column 
readers. E.g. reader A may have materialised 512 values and hit the end of a 
buffer. Reader A can't free the previous buffer until those rows have been 
returned. Those rows can't be returned until all of the column values have been 
materialised for those rows. But maybe column B can only materialise 511 values 
before hitting the end of its buffer. So then you'd need to output 511 rows, 
then advance column B, then output 1 row before you advance column A.

You can get out of that in two ways:
* Advance the different column readers by different amounts and make progress 
on materialising as many final rows as you can. I think you would need to try 
to advance all column readers, then return however many rows you can at that 
point, then try to inch them forward again.
* Let each column reader advance to the next buffer before freeing the previous 
one and hope that all column readers can materialise the requested number of 
values without running out of buffers. We'd need to make some assumptions to 
ensure that we can always materialise batch_size values given M I/O buffers. I 
don't think we can make any universally valid assumptions so we'd probably have 
to add code to detect when those assumptions are invalidated and add query 
options to get out of any problems.

It doesn't seem worth spending complexity on either solution, especially since 
we'd need extra test coverage.

Data pages are much smaller (around 64KB) than the 8MB I/O buffers and aren't 
currently part of the reservation, so they're less of a problem for now - the 
memory overhead per column will be around max(2 * 64KB, 1024 * <avg value 
size>), which is tolerable.



--
To view, visit http://gerrit.cloudera.org:8080/8085
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I767c1e2dabde7d5bd7a4d5c1ec6d14801b8260d2
Gerrit-Change-Number: 8085
Gerrit-PatchSet: 6
Gerrit-Owner: Tim Armstrong <[email protected]>
Gerrit-Reviewer: Alex Behm <[email protected]>
Gerrit-Reviewer: Dan Hecht <[email protected]>
Gerrit-Reviewer: Lars Volker <[email protected]>
Gerrit-Reviewer: Tim Armstrong <[email protected]>
Gerrit-Comment-Date: Fri, 29 Sep 2017 01:41:25 +0000
Gerrit-HasComments: Yes
