[Impala-ASF-CR] IMPALA-5307: part 1: don't transfer disk I/O buffers out of parquet

2017-10-21 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8085 )

Change subject: IMPALA-5307: part 1: don't transfer disk I/O buffers out of 
parquet
..


Patch Set 10: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/8085
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I767c1e2dabde7d5bd7a4d5c1ec6d14801b8260d2
Gerrit-Change-Number: 8085
Gerrit-PatchSet: 10
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Sat, 21 Oct 2017 10:24:49 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5307: part 1: don't transfer disk I/O buffers out of parquet

2017-10-21 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/8085 )

Change subject: IMPALA-5307: part 1: don't transfer disk I/O buffers out of 
parquet
..

IMPALA-5307: part 1: don't transfer disk I/O buffers out of parquet

This change only affects uncompressed plain-encoded Parquet where
RowBatches may directly reference strings stored in the I/O
buffers. The proposed fix is to simply copy the data pages if
needed then use the same logic that we use for decompressed data
pages.

This copy inevitably adds some CPU overhead, but I believe this is
acceptable because:
* We generally recommend using compression, and optimize for that
  case.
* Copying memory is cheaper than decompressing data.
* Scans of uncompressed data are very likely to be I/O bound.

This allows several major simplifications:
* The resource management for compressed and uncompressed
  scans is much more similar.
* We don't need to attach Disk I/O buffers to RowBatches.
* We don't need to deal with attaching I/O buffers in
  ScannerContext.
* Column readers can release each I/O buffer *before* advancing to
  the next one, making it easier to reason about resource
  consumption. E.g. each Parquet column only needs one I/O buffer at
  a time to make progress.

Future changes will apply to Avro, Sequence Files and Text. Once
all scanners are converted, ScannerContext::contains_tuple_data_
will always be false and we can remove some dead code.

Testing
===
Ran core ASAN and exhaustive debug builds.

Perf

No difference in most cases when scanning uncompressed parquet.
There is a significant regression (50% increase in runtime) in
targeted perf tests scanning non-dictionary-encoded strings (see
benchmark output below).  After the regression performance is
comparable to Snappy compression.

I also did a TPC-H run but ran into some issues with the report
generator. I manually compared times and there were no regressions.

++---+-++++
| Workload   | File Format   | Avg (s) | Delta(Avg) | 
GeoMean(s) | Delta(GeoMean) |
++---+-++++
| TARGETED-PERF(_61) | parquet / none / none | 23.02   | +0.60% | 4.23  
 | +5.97% |
++---+-++++

+++---++-++++-+---+
| Workload   | Query  | File Format   | Avg(s) | 
Base Avg(s) | Delta(Avg) | StdDev(%)  | Base StdDev(%) | Num Clients | Iters |
+++---++-++++-+---+
| TARGETED-PERF(_61) | PERF_STRING-Q2 | parquet / none / none | 3.00   | 
1.98| R +52.10%  |   0.97%|   1.25%| 1   | 5 |
| TARGETED-PERF(_61) | PERF_STRING-Q1 | parquet / none / none | 2.86   | 
1.92| R +49.11%  |   0.34%|   2.34%| 1   | 5 |
| TARGETED-PERF(_61) | PERF_STRING-Q3 | parquet / none / none | 3.16   | 
2.15| R +47.04%  |   1.03%|   0.72%| 1   | 5 |
| TARGETED-PERF(_61) | PERF_STRING-Q4 | parquet / none / none | 3.16   | 
2.17| R +45.60%  |   0.14%|   1.11%| 1   | 5 |
| TARGETED-PERF(_61) | PERF_STRING-Q5 | parquet / none / none | 3.51   | 
2.55| R +37.88%  |   0.83%|   0.49%| 1   | 5 |
| TARGETED-PERF(_61) | PERF_AGG-Q5| parquet / none / none | 0.79   | 
0.61| R +30.86%  |   1.54%|   4.10%| 1   | 5 |
| TARGETED-PERF(_61) | primitive_top-n_al | parquet / none / none | 39.45  | 
35.07   |   +12.51%  |   0.29%|   0.29%| 1   | 5 |
| TARGETED-PERF(_61) | PERF_STRING-Q7 | parquet / none / none | 6.78   | 
6.10|   +11.13%  |   0.99%|   0.74%| 1   | 5 |
| TARGETED-PERF(_61) | PERF_STRING-Q6 | parquet / none / none | 8.83   | 
8.14|   +8.52%   |   0.15%|   0.32%| 1   | 5 |
...

Change-Id: I767c1e2dabde7d5bd7a4d5c1ec6d14801b8260d2
Reviewed-on: http://gerrit.cloudera.org:8080/8085
Reviewed-by: Tim Armstrong 
Tested-by: Impala Public Jenkins
---
M be/src/exec/hdfs-parquet-scanner.cc
M be/src/exec/parquet-column-readers.cc
M be/src/exec/parquet-column-readers.h
M be/src/exec/scanner-context.h
4 files changed, 66 insertions(+), 53 deletions(-)

Approvals:
  Tim Armstrong: Looks good to me, approved
  Impala Public Jenkins: Verified

--
To view, visit http://gerrit.cloudera.org:8080/8085
To unsubscribe, visit 

[Impala-ASF-CR] IMPALA-5307: part 1: don't transfer disk I/O buffers out of parquet

2017-10-21 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8085 )

Change subject: IMPALA-5307: part 1: don't transfer disk I/O buffers out of 
parquet
..


Patch Set 10:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/1363/


--
To view, visit http://gerrit.cloudera.org:8080/8085
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I767c1e2dabde7d5bd7a4d5c1ec6d14801b8260d2
Gerrit-Change-Number: 8085
Gerrit-PatchSet: 10
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Sat, 21 Oct 2017 06:28:36 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5307: part 1: don't transfer disk I/O buffers out of parquet

2017-10-20 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8085 )

Change subject: IMPALA-5307: part 1: don't transfer disk I/O buffers out of 
parquet
..


Patch Set 10: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/1361/


--
To view, visit http://gerrit.cloudera.org:8080/8085
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I767c1e2dabde7d5bd7a4d5c1ec6d14801b8260d2
Gerrit-Change-Number: 8085
Gerrit-PatchSet: 10
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Sat, 21 Oct 2017 03:08:22 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5307: part 1: don't transfer disk I/O buffers out of parquet

2017-10-20 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8085 )

Change subject: IMPALA-5307: part 1: don't transfer disk I/O buffers out of 
parquet
..


Patch Set 10:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/1361/


--
To view, visit http://gerrit.cloudera.org:8080/8085
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I767c1e2dabde7d5bd7a4d5c1ec6d14801b8260d2
Gerrit-Change-Number: 8085
Gerrit-PatchSet: 10
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 20 Oct 2017 23:16:11 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5307: part 1: don't transfer disk I/O buffers out of parquet

2017-10-18 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8085 )

Change subject: IMPALA-5307: part 1: don't transfer disk I/O buffers out of 
parquet
..


Patch Set 10: Code-Review+2

carry Dan's +2


--
To view, visit http://gerrit.cloudera.org:8080/8085
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I767c1e2dabde7d5bd7a4d5c1ec6d14801b8260d2
Gerrit-Change-Number: 8085
Gerrit-PatchSet: 10
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 18 Oct 2017 19:12:28 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5307: part 1: don't transfer disk I/O buffers out of parquet

2017-10-18 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8085 )

Change subject: IMPALA-5307: part 1: don't transfer disk I/O buffers out of 
parquet
..


Patch Set 10:

Mostafa, we spoke offline about this but just wanted to confirm it looked ok to 
you. The summary is that the CPU cost of value materialisation for 
non-dict-encoded strings in uncompressed parquet scans regresses to roughly the 
same CPU cost as snappy-compressed parquet. I didn't see any regressions on 
TPC-H, only narrowly targeted benchmarks.


--
To view, visit http://gerrit.cloudera.org:8080/8085
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I767c1e2dabde7d5bd7a4d5c1ec6d14801b8260d2
Gerrit-Change-Number: 8085
Gerrit-PatchSet: 10
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 18 Oct 2017 19:12:22 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5307: part 1: don't transfer disk I/O buffers out of parquet

2017-10-18 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8085 )

Change subject: IMPALA-5307: part 1: don't transfer disk I/O buffers out of 
parquet
..


Patch Set 8:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8085/8/be/src/exec/parquet-column-readers.cc
File be/src/exec/parquet-column-readers.cc:

http://gerrit.cloudera.org:8080/#/c/8085/8/be/src/exec/parquet-column-readers.cc@983
PS8, Line 983: data
> buffers?
Done



--
To view, visit http://gerrit.cloudera.org:8080/8085
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I767c1e2dabde7d5bd7a4d5c1ec6d14801b8260d2
Gerrit-Change-Number: 8085
Gerrit-PatchSet: 8
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 18 Oct 2017 19:09:25 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5307: part 1: don't transfer disk I/O buffers out of parquet

2017-10-18 Thread Tim Armstrong (Code Review)
Hello Lars Volker, Alex Behm, Dan Hecht,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8085

to look at the new patch set (#9).

Change subject: IMPALA-5307: part 1: don't transfer disk I/O buffers out of 
parquet
..

IMPALA-5307: part 1: don't transfer disk I/O buffers out of parquet

This change only affects uncompressed plain-encoded Parquet where
RowBatches may directly reference strings stored in the I/O
buffers. The proposed fix is to simply copy the data pages if
needed then use the same logic that we use for decompressed data
pages.

This copy inevitably adds some CPU overhead, but I believe this is
acceptable because:
* We generally recommend using compression, and optimize for that
  case.
* Copying memory is cheaper than decompressing data.
* Scans of uncompressed data are very likely to be I/O bound.

This allows several major simplifications:
* The resource management for compressed and uncompressed
  scans is much more similar.
* We don't need to attach Disk I/O buffers to RowBatches.
* We don't need to deal with attaching I/O buffers in
  ScannerContext.
* Column readers can release each I/O buffer *before* advancing to
  the next one, making it easier to reason about resource
  consumption. E.g. each Parquet column only needs one I/O buffer at
  a time to make progress.

Future changes will apply to Avro, Sequence Files and Text. Once
all scanners are converted, ScannerContext::contains_tuple_data_
will always be false and we can remove some dead code.

Testing
===
Ran core ASAN and exhaustive debug builds.

Perf

No difference in most cases when scanning uncompressed parquet.
There is a significant regression (50% increase in runtime) in
targeted perf tests scanning non-dictionary-encoded strings (see
benchmark output below).  After the regression performance is
comparable to Snappy compression.

I also did a TPC-H run but ran into some issues with the report
generator. I manually compared times and there were no regressions.

++---+-++++
| Workload   | File Format   | Avg (s) | Delta(Avg) | 
GeoMean(s) | Delta(GeoMean) |
++---+-++++
| TARGETED-PERF(_61) | parquet / none / none | 23.02   | +0.60% | 4.23  
 | +5.97% |
++---+-++++

+++---++-++++-+---+
| Workload   | Query  | File Format   | Avg(s) | 
Base Avg(s) | Delta(Avg) | StdDev(%)  | Base StdDev(%) | Num Clients | Iters |
+++---++-++++-+---+
| TARGETED-PERF(_61) | PERF_STRING-Q2 | parquet / none / none | 3.00   | 
1.98| R +52.10%  |   0.97%|   1.25%| 1   | 5 |
| TARGETED-PERF(_61) | PERF_STRING-Q1 | parquet / none / none | 2.86   | 
1.92| R +49.11%  |   0.34%|   2.34%| 1   | 5 |
| TARGETED-PERF(_61) | PERF_STRING-Q3 | parquet / none / none | 3.16   | 
2.15| R +47.04%  |   1.03%|   0.72%| 1   | 5 |
| TARGETED-PERF(_61) | PERF_STRING-Q4 | parquet / none / none | 3.16   | 
2.17| R +45.60%  |   0.14%|   1.11%| 1   | 5 |
| TARGETED-PERF(_61) | PERF_STRING-Q5 | parquet / none / none | 3.51   | 
2.55| R +37.88%  |   0.83%|   0.49%| 1   | 5 |
| TARGETED-PERF(_61) | PERF_AGG-Q5| parquet / none / none | 0.79   | 
0.61| R +30.86%  |   1.54%|   4.10%| 1   | 5 |
| TARGETED-PERF(_61) | primitive_top-n_al | parquet / none / none | 39.45  | 
35.07   |   +12.51%  |   0.29%|   0.29%| 1   | 5 |
| TARGETED-PERF(_61) | PERF_STRING-Q7 | parquet / none / none | 6.78   | 
6.10|   +11.13%  |   0.99%|   0.74%| 1   | 5 |
| TARGETED-PERF(_61) | PERF_STRING-Q6 | parquet / none / none | 8.83   | 
8.14|   +8.52%   |   0.15%|   0.32%| 1   | 5 |
...

Change-Id: I767c1e2dabde7d5bd7a4d5c1ec6d14801b8260d2
---
M be/src/exec/hdfs-parquet-scanner.cc
M be/src/exec/parquet-column-readers.cc
M be/src/exec/parquet-column-readers.h
M be/src/exec/scanner-context.h
4 files changed, 66 insertions(+), 53 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/85/8085/9
--
To view, visit http://gerrit.cloudera.org:8080/8085
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master

[Impala-ASF-CR] IMPALA-5307: part 1: don't transfer disk I/O buffers out of parquet

2017-10-18 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8085 )

Change subject: IMPALA-5307: part 1: don't transfer disk I/O buffers out of 
parquet
..


Patch Set 8: Code-Review+2

(1 comment)

Yes, this is good with me. Could you get Mostafa to sign off on perf 
implications, perhaps recording that in the JIRA if not already 
(issues.apache.org not loading for me)?

http://gerrit.cloudera.org:8080/#/c/8085/8/be/src/exec/parquet-column-readers.cc
File be/src/exec/parquet-column-readers.cc:

http://gerrit.cloudera.org:8080/#/c/8085/8/be/src/exec/parquet-column-readers.cc@983
PS8, Line 983: data
buffers?



--
To view, visit http://gerrit.cloudera.org:8080/8085
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I767c1e2dabde7d5bd7a4d5c1ec6d14801b8260d2
Gerrit-Change-Number: 8085
Gerrit-PatchSet: 8
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 18 Oct 2017 18:16:05 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5307: part 1: don't transfer disk I/O buffers out of parquet

2017-10-18 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8085 )

Change subject: IMPALA-5307: part 1: don't transfer disk I/O buffers out of 
parquet
..


Patch Set 8:

Dan, does this look ok to you at a high level? I think the code change itself 
is relatively straightforward but it would be good to confirm the direction and 
perf trade-off.


--
To view, visit http://gerrit.cloudera.org:8080/8085
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I767c1e2dabde7d5bd7a4d5c1ec6d14801b8260d2
Gerrit-Change-Number: 8085
Gerrit-PatchSet: 8
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 18 Oct 2017 17:54:04 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5307: part 1: don't transfer disk I/O buffers out of parquet

2017-10-13 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8085 )

Change subject: IMPALA-5307: part 1: don't transfer disk I/O buffers out of 
parquet
..


Patch Set 8: Code-Review+1

rebase, carry +1 from alex


--
To view, visit http://gerrit.cloudera.org:8080/8085
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I767c1e2dabde7d5bd7a4d5c1ec6d14801b8260d2
Gerrit-Change-Number: 8085
Gerrit-PatchSet: 8
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 13 Oct 2017 23:10:16 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5307: part 1: don't transfer disk I/O buffers out of parquet

2017-10-04 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8085 )

Change subject: IMPALA-5307: part 1: don't transfer disk I/O buffers out of 
parquet
..


Patch Set 7: Code-Review+1

Carry +1


--
To view, visit http://gerrit.cloudera.org:8080/8085
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I767c1e2dabde7d5bd7a4d5c1ec6d14801b8260d2
Gerrit-Change-Number: 8085
Gerrit-PatchSet: 7
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 04 Oct 2017 22:15:00 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5307: part 1: don't transfer disk I/O buffers out of parquet

2017-10-04 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8085 )

Change subject: IMPALA-5307: part 1: don't transfer disk I/O buffers out of 
parquet
..


Patch Set 6:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8085/6/be/src/exec/parquet-column-readers.cc
File be/src/exec/parquet-column-readers.cc:

http://gerrit.cloudera.org:8080/#/c/8085/6/be/src/exec/parquet-column-readers.cc@1071
PS6, Line 1071:   if 
(PageContainsTupleData(current_page_header_.data_page_header.encoding)) {
> Does seem like the simplest solution. Might be worth adding a short comment
Expanded the comment just below to explain motivation.



--
To view, visit http://gerrit.cloudera.org:8080/8085
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I767c1e2dabde7d5bd7a4d5c1ec6d14801b8260d2
Gerrit-Change-Number: 8085
Gerrit-PatchSet: 6
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 04 Oct 2017 22:07:54 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5307: part 1: don't transfer disk I/O buffers out of parquet

2017-10-04 Thread Tim Armstrong (Code Review)
Hello Lars Volker, Alex Behm, Dan Hecht,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8085

to look at the new patch set (#7).

Change subject: IMPALA-5307: part 1: don't transfer disk I/O buffers out of 
parquet
..

IMPALA-5307: part 1: don't transfer disk I/O buffers out of parquet

This change only affects uncompressed plain-encoded Parquet where
RowBatches may directly reference strings stored in the I/O
buffers. The proposed fix is to simply copy the data pages if
needed then use the same logic that we use for decompressed data
pages.

This copy inevitably adds some CPU overhead, but I believe this is
acceptable because:
* We generally recommend using compression, and optimize for that
  case.
* Copying memory is cheaper than decompressing data.
* Scans of uncompressed data are very likely to be I/O bound.

This allows several major simplifications:
* The resource management for compressed and uncompressed
  scans is much more similar.
* We don't need to attach Disk I/O buffers to RowBatches.
* We don't need to deal with attaching I/O buffers in
  ScannerContext.
* Column readers can release each I/O buffer *before* advancing to
  the next one, making it easier to reason about resource
  consumption. E.g. each Parquet column only needs one I/O buffer at
  a time to make progress.

Future changes will apply to Avro, Sequence Files and Text. Once
all scanners are converted, ScannerContext::contains_tuple_data_
will always be false and we can remove some dead code.

Testing
===
Ran core ASAN and exhaustive debug builds.

Perf

No difference in most cases when scanning uncompressed parquet.
There is a significant regression (50% increase in runtime) in
targeted perf tests scanning non-dictionary-encoded strings (see
benchmark output below).  After the regression performance is
comparable to Snappy compression.

I also did a TPC-H run but ran into some issues with the report
generator. I manually compared times and there were no regressions.

++---+-++++
| Workload   | File Format   | Avg (s) | Delta(Avg) | 
GeoMean(s) | Delta(GeoMean) |
++---+-++++
| TARGETED-PERF(_61) | parquet / none / none | 23.02   | +0.60% | 4.23  
 | +5.97% |
++---+-++++

+++---++-++++-+---+
| Workload   | Query  | File Format   | Avg(s) | 
Base Avg(s) | Delta(Avg) | StdDev(%)  | Base StdDev(%) | Num Clients | Iters |
+++---++-++++-+---+
| TARGETED-PERF(_61) | PERF_STRING-Q2 | parquet / none / none | 3.00   | 
1.98| R +52.10%  |   0.97%|   1.25%| 1   | 5 |
| TARGETED-PERF(_61) | PERF_STRING-Q1 | parquet / none / none | 2.86   | 
1.92| R +49.11%  |   0.34%|   2.34%| 1   | 5 |
| TARGETED-PERF(_61) | PERF_STRING-Q3 | parquet / none / none | 3.16   | 
2.15| R +47.04%  |   1.03%|   0.72%| 1   | 5 |
| TARGETED-PERF(_61) | PERF_STRING-Q4 | parquet / none / none | 3.16   | 
2.17| R +45.60%  |   0.14%|   1.11%| 1   | 5 |
| TARGETED-PERF(_61) | PERF_STRING-Q5 | parquet / none / none | 3.51   | 
2.55| R +37.88%  |   0.83%|   0.49%| 1   | 5 |
| TARGETED-PERF(_61) | PERF_AGG-Q5| parquet / none / none | 0.79   | 
0.61| R +30.86%  |   1.54%|   4.10%| 1   | 5 |
| TARGETED-PERF(_61) | primitive_top-n_al | parquet / none / none | 39.45  | 
35.07   |   +12.51%  |   0.29%|   0.29%| 1   | 5 |
| TARGETED-PERF(_61) | PERF_STRING-Q7 | parquet / none / none | 6.78   | 
6.10|   +11.13%  |   0.99%|   0.74%| 1   | 5 |
| TARGETED-PERF(_61) | PERF_STRING-Q6 | parquet / none / none | 8.83   | 
8.14|   +8.52%   |   0.15%|   0.32%| 1   | 5 |
...

Change-Id: I767c1e2dabde7d5bd7a4d5c1ec6d14801b8260d2
---
M be/src/exec/hdfs-parquet-scanner.cc
M be/src/exec/parquet-column-readers.cc
M be/src/exec/parquet-column-readers.h
M be/src/exec/scanner-context.h
4 files changed, 66 insertions(+), 53 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/85/8085/7
--
To view, visit http://gerrit.cloudera.org:8080/8085
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master

[Impala-ASF-CR] IMPALA-5307: part 1: don't transfer disk I/O buffers out of parquet

2017-09-28 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8085 )

Change subject: IMPALA-5307: part 1: don't transfer disk I/O buffers out of 
parquet
..


Patch Set 6: Code-Review+1

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8085/6/be/src/exec/parquet-column-readers.cc
File be/src/exec/parquet-column-readers.cc:

http://gerrit.cloudera.org:8080/#/c/8085/6/be/src/exec/parquet-column-readers.cc@1071
PS6, Line 1071:   if 
(PageContainsTupleData(current_page_header_.data_page_header.encoding)) {
> The basic problem is that I want to be able to reason about how many I/O bu
Does seem like the simplest solution. Might be worth adding a short comment 
here about the intention, since it's not necessarily obvious (at least to me).

Data pages are indeed smaller, but due to the mempool I think you may end up 
allocating MAX_CHUNK_SIZE per data page in the worst case. Probably not very 
common, so seems ok.



--
To view, visit http://gerrit.cloudera.org:8080/8085
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I767c1e2dabde7d5bd7a4d5c1ec6d14801b8260d2
Gerrit-Change-Number: 8085
Gerrit-PatchSet: 6
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 29 Sep 2017 04:34:27 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5307: part 1: don't transfer disk I/O buffers out of parquet

2017-09-28 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8085 )

Change subject: IMPALA-5307: part 1: don't transfer disk I/O buffers out of 
parquet
..


Patch Set 6:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8085/6/be/src/exec/parquet-column-readers.cc
File be/src/exec/parquet-column-readers.cc:

http://gerrit.cloudera.org:8080/#/c/8085/6/be/src/exec/parquet-column-readers.cc@1071
PS6, Line 1071:   if 
(PageContainsTupleData(current_page_header_.data_page_header.encoding)) {
> Playing devil's advocate and really trying to understand the crux of why th
The basic problem is that I want to be able to reason about how many I/O 
buffers you need to make progress in reading through a Parquet file. If each 
column can make progress independently then it's simple since you can reason 
about each column in isolation. With the copying out you only need one buffer 
per column to make progress.

If you try to hold onto I/O buffers you end up with dependencies between column 
readers. E.g. reader A may have materialised 512 values and hit the end of a 
buffer. Reader A can't free the previous buffer until those rows have been 
returned. Those rows can't be returned until all of the column values have been 
materialised for those rows. But maybe column B can only materialise 511 values 
before hitting the end of its buffer. So then you'd need to output 511 rows, 
then advance column B, then output 1 row before you advance column A.

You can get out of that two ways:
* Advance the different column readers different amounts and make progress on 
materialising as many final rows as you can. I think you would need to try to 
advance all column readers, then return however many rows you can at that 
point, then try to inch them forward again.
* Let each column reader advance to the next buffer before freeing the previous 
one and hope that all column readers can materialise the requested number of 
values without running out of buffers. We'd need to make some assumptions to 
ensure that we can always materialise batch_size values given M I/O buffers. I 
don't think we can make any universally valid assumptions so we'd probably have 
to add code to detect when those assumptions are invalidated and add query 
options to get out of any problems.

It doesn't seem worth spending complexity on either solution, especially since 
we'd need extra test coverage.

Data pages are much smaller (around 64kb) than the 8MB I/O buffers and aren't 
currently part of the reservation so they're less of a problem for now - the 
memory overhead per column will be around max(2 * 64kb, 1024 * ) which is tolerable.



--
To view, visit http://gerrit.cloudera.org:8080/8085
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I767c1e2dabde7d5bd7a4d5c1ec6d14801b8260d2
Gerrit-Change-Number: 8085
Gerrit-PatchSet: 6
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 29 Sep 2017 01:41:25 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5307: part 1: don't transfer disk I/O buffers out of parquet

2017-09-28 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8085 )

Change subject: IMPALA-5307: part 1: don't transfer disk I/O buffers out of 
parquet
..


Patch Set 6:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8085/6/be/src/exec/parquet-column-readers.cc
File be/src/exec/parquet-column-readers.cc:

http://gerrit.cloudera.org:8080/#/c/8085/6/be/src/exec/parquet-column-readers.cc@1071
PS6, Line 1071:   if 
(PageContainsTupleData(current_page_header_.data_page_header.encoding)) {
> (if we pushed predicate evaluation down into the column readers, that might
Playing devil's advocate and really trying to understand the crux of why this 
new approach is simpler.

You are right that the stream can free the previous I/O buffer immediately, but 
you are still accumulating the same amount of memory in the data_pool_, so the 
peak memory consumption seems no different than before.

Overall you might even need one more page than before. We copy and accumulate 
data pages here, but only free the data page from the stream in the next call 
to ReadDataPage(), so for some period of time one page exists both in the 
stream and in the data_pool_.

I'm not sure I follow why the the "synchronized release of I/O buffers" is 
difficult in the copy-in-transfer approach. The scratch batch already has all 
interesting I/O buffers attached, so we can release those after copying the 
var-len data of survivors. Returned batches will have no I/O buffers attached.

Completely agree about the perf tradeoffs you mentioned.



--
To view, visit http://gerrit.cloudera.org:8080/8085
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I767c1e2dabde7d5bd7a4d5c1ec6d14801b8260d2
Gerrit-Change-Number: 8085
Gerrit-PatchSet: 6
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 29 Sep 2017 00:35:38 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5307: part 1: don't transfer disk I/O buffers out of parquet

2017-09-28 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8085 )

Change subject: IMPALA-5307: part 1: don't transfer disk I/O buffers out of 
parquet
..


Patch Set 6:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8085/6/be/src/exec/parquet-column-readers.cc
File be/src/exec/parquet-column-readers.cc:

http://gerrit.cloudera.org:8080/#/c/8085/6/be/src/exec/parquet-column-readers.cc@1071
PS6, Line 1071:   if 
(PageContainsTupleData(current_page_header_.data_page_header.encoding)) {
> That would reintroduce one of the problems I was trying to solve - the need
(if we pushed predicate evaluation down into the column readers, that might be 
viable, although there's a trade-off between optimising the number of bytes 
copied and optimising the number of individual copies, since each copy has some 
fixed overhead.).



--
To view, visit http://gerrit.cloudera.org:8080/8085
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I767c1e2dabde7d5bd7a4d5c1ec6d14801b8260d2
Gerrit-Change-Number: 8085
Gerrit-PatchSet: 6
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 29 Sep 2017 00:01:00 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5307: part 1: don't transfer disk I/O buffers out of parquet

2017-09-28 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8085 )

Change subject: IMPALA-5307: part 1: don't transfer disk I/O buffers out of 
parquet
..


Patch Set 6:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8085/6/be/src/exec/parquet-column-readers.cc
File be/src/exec/parquet-column-readers.cc:

http://gerrit.cloudera.org:8080/#/c/8085/6/be/src/exec/parquet-column-readers.cc@1071
PS6, Line 1071:   if 
(PageContainsTupleData(current_page_header_.data_page_header.encoding)) {
> Instead of eagerly copying the entire page, have you considered only coping
That would reintroduce one of the problems I was trying to solve - the need to 
synchronise release of I/O buffers between different column readers. The big 
simplification in the current approach is that each column reader can 
immediately free the previous I/O buffer as soon as it needs to advance to the 
net data page.



--
To view, visit http://gerrit.cloudera.org:8080/8085
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I767c1e2dabde7d5bd7a4d5c1ec6d14801b8260d2
Gerrit-Change-Number: 8085
Gerrit-PatchSet: 6
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 28 Sep 2017 23:59:45 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5307: part 1: don't transfer disk I/O buffers out of parquet

2017-09-28 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8085 )

Change subject: IMPALA-5307: part 1: don't transfer disk I/O buffers out of 
parquet
..


Patch Set 6:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8085/6//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/8085/6//COMMIT_MSG@44
PS6, Line 44: There is a significant regression (50% increase in runtime) in
> Thanks for the experiment. I'm in favor of moving forward, but still a litt
I think the upside is that simplifying the Parquet code incrementally makes it 
easier to optimise in future. We have plenty of scope to optimise CPU 
efficiency in future in ways that will benefit most queries. I.e. I wouldn't 
spend time optimising the uncompressed strings case, rather just try to make it 
overall more CPU efficient.



--
To view, visit http://gerrit.cloudera.org:8080/8085
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I767c1e2dabde7d5bd7a4d5c1ec6d14801b8260d2
Gerrit-Change-Number: 8085
Gerrit-PatchSet: 6
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 28 Sep 2017 23:57:32 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5307: part 1: don't transfer disk I/O buffers out of parquet

2017-09-28 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8085 )

Change subject: IMPALA-5307: part 1: don't transfer disk I/O buffers out of 
parquet
..


Patch Set 6:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/8085/6//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/8085/6//COMMIT_MSG@44
PS6, Line 44: There is a significant regression (50% increase in runtime) in
> I reran with mt_dop=8. Time spent in the scan on STRING-Q1 went from 3.35s
Thanks for the experiment. I'm in favor of moving forward, but still a little 
worried. Maybe we can investigate later whether we can recoup some of the perf 
losses. The main issue is that affected users have no way to get the 
performance back, except downgrading.

I'm curious to hear what other think of the perf regression.


http://gerrit.cloudera.org:8080/#/c/8085/6/be/src/exec/parquet-column-readers.cc
File be/src/exec/parquet-column-readers.cc:

http://gerrit.cloudera.org:8080/#/c/8085/6/be/src/exec/parquet-column-readers.cc@1071
PS6, Line 1071:   if 
(PageContainsTupleData(current_page_header_.data_page_header.encoding)) {
Instead of eagerly copying the entire page, have you considered only coping the 
var-len data of surviving tuples in TransferScratchTuples() or 
FinalizeTupleTransfer()?

That approach might be more efficient for selective scans, but worse for full 
scans.



--
To view, visit http://gerrit.cloudera.org:8080/8085
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I767c1e2dabde7d5bd7a4d5c1ec6d14801b8260d2
Gerrit-Change-Number: 8085
Gerrit-PatchSet: 6
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 28 Sep 2017 23:44:50 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5307: part 1: don't transfer disk I/O buffers out of parquet

2017-09-28 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8085 )

Change subject: IMPALA-5307: part 1: don't transfer disk I/O buffers out of 
parquet
..


Patch Set 6:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8085/6//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/8085/6//COMMIT_MSG@44
PS6, Line 44: There is a significant regression (50% increase in runtime) in
> Makes sense. Might still be worth doing a quick experiment with MT_DOP to s
I reran with mt_dop=8. Time spent in the scan on STRING-Q1 went from 3.35s -> 
5.15s. I.e. about +50%, so no real difference.



--
To view, visit http://gerrit.cloudera.org:8080/8085
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I767c1e2dabde7d5bd7a4d5c1ec6d14801b8260d2
Gerrit-Change-Number: 8085
Gerrit-PatchSet: 6
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 28 Sep 2017 22:08:20 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5307: part 1: don't transfer disk I/O buffers out of parquet

2017-09-27 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8085 )

Change subject: IMPALA-5307: part 1: don't transfer disk I/O buffers out of 
parquet
..


Patch Set 6:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8085/6//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/8085/6//COMMIT_MSG@44
PS6, Line 44: There is a significant regression (50% increase in runtime) in
> I'm pretty sure it's the memory copying - making a memory allocation should
Makes sense. Might still be worth doing a quick experiment with MT_DOP to see 
if the regression remains the same (MT_DOP is easier to reason about).

I'm definitely in favor of this change, but uncompressed Parquet with a lot of 
string columns is unfortunately very common, despite our recommendations.



--
To view, visit http://gerrit.cloudera.org:8080/8085
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I767c1e2dabde7d5bd7a4d5c1ec6d14801b8260d2
Gerrit-Change-Number: 8085
Gerrit-PatchSet: 6
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 27 Sep 2017 16:25:26 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5307: part 1: don't transfer disk I/O buffers out of parquet

2017-09-27 Thread Lars Volker (Code Review)
Lars Volker has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8085 )

Change subject: IMPALA-5307: part 1: don't transfer disk I/O buffers out of 
parquet
..


Patch Set 6: Code-Review+1

(2 comments)

http://gerrit.cloudera.org:8080/#/c/8085/5/be/src/exec/hdfs-parquet-scanner.cc
File be/src/exec/hdfs-parquet-scanner.cc:

http://gerrit.cloudera.org:8080/#/c/8085/5/be/src/exec/hdfs-parquet-scanner.cc@245
PS5, Line 245:   context_->ReleaseCompletedResources(nullptr, true);
> I can obviously fix it if you feel strongly but it just doesn't seem that i
Ah, I understand.


http://gerrit.cloudera.org:8080/#/c/8085/5/be/src/exec/parquet-column-readers.h
File be/src/exec/parquet-column-readers.h:

http://gerrit.cloudera.org:8080/#/c/8085/5/be/src/exec/parquet-column-readers.h@477
PS5, Line 477:   int64_t size, const char* err_ctx, uint8_t** buffer);
> Done. Also switch to const char* so we don't need to create a std::string f
Thx. ctx somewhat reminds me of the various context objects that we use 
elsewhere, but I don't feel strongly about it.



--
To view, visit http://gerrit.cloudera.org:8080/8085
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I767c1e2dabde7d5bd7a4d5c1ec6d14801b8260d2
Gerrit-Change-Number: 8085
Gerrit-PatchSet: 6
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 27 Sep 2017 16:20:52 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5307: part 1: don't transfer disk I/O buffers out of parquet

2017-09-27 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8085 )

Change subject: IMPALA-5307: part 1: don't transfer disk I/O buffers out of 
parquet
..


Patch Set 6:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8085/6//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/8085/6//COMMIT_MSG@44
PS6, Line 44: There is a significant regression (50% increase in runtime) in
> Just to clarify my point. My hope is that most of the overhead may ultimate
I'm pretty sure it's the memory copying - making a memory allocation should at 
least an order of magnitude cheaper than doing a pass over a data page. Unsure 
if the difference is due to the extra instructions executed or the increase in 
cache pressure from having two copies of the data.

I haven't measured but I'm also not convinced that the Disk IO Mgr's buffer 
caching is necessarily more efficient or scalable than TCMalloc. The IO mgr 
just has a global lock whereas TCMalloc has locks per size class plus batching 
via the thread cache. The buffer pool should be more scalable for large 
allocations than either in any case.

The queries that regressed are close to the worst possible case since they 
don't do any work aside from materialising the strings and evaluating a 
conjunct. Plus the data is already present in the buffer cache.



--
To view, visit http://gerrit.cloudera.org:8080/8085
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I767c1e2dabde7d5bd7a4d5c1ec6d14801b8260d2
Gerrit-Change-Number: 8085
Gerrit-PatchSet: 6
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 27 Sep 2017 16:09:00 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5307: part 1: don't transfer disk I/O buffers out of parquet

2017-09-27 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8085 )

Change subject: IMPALA-5307: part 1: don't transfer disk I/O buffers out of 
parquet
..


Patch Set 6:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8085/6//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/8085/6//COMMIT_MSG@44
PS6, Line 44: There is a significant regression (50% increase in runtime) in
> Maybe the main difference is that copying into a mempool will ultimately fr
Just to clarify my point. My hope is that most of the overhead may ultimately 
go away when backing the mem pool by the buffer pool (which I presume can 
recycle buffers?).



--
To view, visit http://gerrit.cloudera.org:8080/8085
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I767c1e2dabde7d5bd7a4d5c1ec6d14801b8260d2
Gerrit-Change-Number: 8085
Gerrit-PatchSet: 6
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 27 Sep 2017 15:42:49 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5307: part 1: don't transfer disk I/O buffers out of parquet

2017-09-27 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8085 )

Change subject: IMPALA-5307: part 1: don't transfer disk I/O buffers out of 
parquet
..


Patch Set 6:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8085/6//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/8085/6//COMMIT_MSG@44
PS6, Line 44: There is a significant regression (50% increase in runtime) in
> Do we understand where the regression comes from exactly? Somehow I find it
Maybe the main difference is that copying into a mempool will ultimately free 
the copied memory, but I/O buffers may be recycled and would not take the 
tcmalloc hit from freeing from another thread?



--
To view, visit http://gerrit.cloudera.org:8080/8085
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I767c1e2dabde7d5bd7a4d5c1ec6d14801b8260d2
Gerrit-Change-Number: 8085
Gerrit-PatchSet: 6
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 27 Sep 2017 15:38:50 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5307: part 1: don't transfer disk I/O buffers out of parquet

2017-09-27 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8085 )

Change subject: IMPALA-5307: part 1: don't transfer disk I/O buffers out of 
parquet
..


Patch Set 6:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8085/6//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/8085/6//COMMIT_MSG@44
PS6, Line 44: There is a significant regression (50% increase in runtime) in
Do we understand where the regression comes from exactly? Somehow I find it 
hard to believe a 50% overhead just from copying. Is it maybe another tcmalloc 
issue where different threads allocate/free the memory? Have you tried a 
similar experiment with MT_DOP?



--
To view, visit http://gerrit.cloudera.org:8080/8085
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I767c1e2dabde7d5bd7a4d5c1ec6d14801b8260d2
Gerrit-Change-Number: 8085
Gerrit-PatchSet: 6
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 27 Sep 2017 15:34:38 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5307: part 1: don't transfer disk I/O buffers out of parquet

2017-09-26 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8085 )

Change subject: IMPALA-5307: part 1: don't transfer disk I/O buffers out of 
parquet
..


Patch Set 5:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8085/5/be/src/exec/hdfs-parquet-scanner.cc
File be/src/exec/hdfs-parquet-scanner.cc:

http://gerrit.cloudera.org:8080/#/c/8085/5/be/src/exec/hdfs-parquet-scanner.cc@245
PS5, Line 245:   context_->ReleaseCompletedResources(nullptr, true);
> I fixed it while I was checking the callsites of ReleaseCompletedResources(
I can obviously fix it if you feel strongly but it just doesn't seem that 
important to me.



--
To view, visit http://gerrit.cloudera.org:8080/8085
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I767c1e2dabde7d5bd7a4d5c1ec6d14801b8260d2
Gerrit-Change-Number: 8085
Gerrit-PatchSet: 5
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 26 Sep 2017 23:50:48 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5307: part 1: don't transfer disk I/O buffers out of parquet

2017-09-26 Thread Tim Armstrong (Code Review)
Hello Lars Volker, Dan Hecht,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8085

to look at the new patch set (#6).

Change subject: IMPALA-5307: part 1: don't transfer disk I/O buffers out of 
parquet
..

IMPALA-5307: part 1: don't transfer disk I/O buffers out of parquet

This change only affects uncompressed plain-encoded Parquet where
RowBatches may directly reference strings stored in the I/O
buffers. The proposed fix is to simply copy the data pages if
needed then use the same logic that we use for decompressed data
pages.

This copy inevitably adds some CPU overhead, but I believe this is
acceptable because:
* We generally recommend using compression, and optimize for that
  case.
* Copying memory is cheaper than decompressing data.
* Scans of uncompressed data are very likely to be I/O bound.

This allows several major simplifications:
* The resource management for compressed and uncompressed
  scans is much more similar.
* We don't need to attach Disk I/O buffers to RowBatches.
* We don't need to deal with attaching I/O buffers in
  ScannerContext.
* Column readers can release each I/O buffer *before* advancing to
  the next one, making it easier to reason about resource
  consumption. E.g. each Parquet column only needs one I/O buffer at
  a time to make progress.

Future changes will apply to Avro, Sequence Files and Text. Once
all scanners are converted, ScannerContext::contains_tuple_data_
will always be false and we can remove some dead code.

Testing
===
Ran core ASAN and exhaustive debug builds.

Perf

No difference in most cases when scanning uncompressed parquet.
There is a significant regression (50% increase in runtime) in
targeted perf tests scanning non-dictionary-encoded strings (see
benchmark output below).  After the regression performance is
comparable to Snappy compression.

I also did a TPC-H run but ran into some issues with the report
generator. I manually compared times and there were no regressions.

++---+-++++
| Workload   | File Format   | Avg (s) | Delta(Avg) | 
GeoMean(s) | Delta(GeoMean) |
++---+-++++
| TARGETED-PERF(_61) | parquet / none / none | 23.02   | +0.60% | 4.23  
 | +5.97% |
++---+-++++

+++---++-++++-+---+
| Workload   | Query  | File Format   | Avg(s) | 
Base Avg(s) | Delta(Avg) | StdDev(%)  | Base StdDev(%) | Num Clients | Iters |
+++---++-++++-+---+
| TARGETED-PERF(_61) | PERF_STRING-Q2 | parquet / none / none | 3.00   | 
1.98| R +52.10%  |   0.97%|   1.25%| 1   | 5 |
| TARGETED-PERF(_61) | PERF_STRING-Q1 | parquet / none / none | 2.86   | 
1.92| R +49.11%  |   0.34%|   2.34%| 1   | 5 |
| TARGETED-PERF(_61) | PERF_STRING-Q3 | parquet / none / none | 3.16   | 
2.15| R +47.04%  |   1.03%|   0.72%| 1   | 5 |
| TARGETED-PERF(_61) | PERF_STRING-Q4 | parquet / none / none | 3.16   | 
2.17| R +45.60%  |   0.14%|   1.11%| 1   | 5 |
| TARGETED-PERF(_61) | PERF_STRING-Q5 | parquet / none / none | 3.51   | 
2.55| R +37.88%  |   0.83%|   0.49%| 1   | 5 |
| TARGETED-PERF(_61) | PERF_AGG-Q5| parquet / none / none | 0.79   | 
0.61| R +30.86%  |   1.54%|   4.10%| 1   | 5 |
| TARGETED-PERF(_61) | primitive_top-n_al | parquet / none / none | 39.45  | 
35.07   |   +12.51%  |   0.29%|   0.29%| 1   | 5 |
| TARGETED-PERF(_61) | PERF_STRING-Q7 | parquet / none / none | 6.78   | 
6.10|   +11.13%  |   0.99%|   0.74%| 1   | 5 |
| TARGETED-PERF(_61) | PERF_STRING-Q6 | parquet / none / none | 8.83   | 
8.14|   +8.52%   |   0.15%|   0.32%| 1   | 5 |
...

Change-Id: I767c1e2dabde7d5bd7a4d5c1ec6d14801b8260d2
---
M be/src/exec/hdfs-parquet-scanner.cc
M be/src/exec/parquet-column-readers.cc
M be/src/exec/parquet-column-readers.h
M be/src/exec/scanner-context.h
4 files changed, 64 insertions(+), 53 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/85/8085/6
--
To view, visit http://gerrit.cloudera.org:8080/8085
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: 

[Impala-ASF-CR] IMPALA-5307: part 1: don't transfer disk I/O buffers out of parquet

2017-09-26 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8085 )

Change subject: IMPALA-5307: part 1: don't transfer disk I/O buffers out of 
parquet
..


Patch Set 5:

(8 comments)

http://gerrit.cloudera.org:8080/#/c/8085/5//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/8085/5//COMMIT_MSG@7
PS5, Line 7: IMPALA-5307
> Can you add a line at the bottom what the other part(s) would look like?
Done


http://gerrit.cloudera.org:8080/#/c/8085/5//COMMIT_MSG@56
PS5, Line 56: 
+++---++-++++-+---+
> Nit: You could make the second column smaller to make this more readable, a
Done


http://gerrit.cloudera.org:8080/#/c/8085/5/be/src/exec/hdfs-parquet-scanner.cc
File be/src/exec/hdfs-parquet-scanner.cc:

http://gerrit.cloudera.org:8080/#/c/8085/5/be/src/exec/hdfs-parquet-scanner.cc@245
PS5, Line 245:   context_->ReleaseCompletedResources(nullptr, true);
> I think it's best to change the whole file at once, or only change occurren
I fixed it while I was checking the callsites of ReleaseCompletedResources().


http://gerrit.cloudera.org:8080/#/c/8085/5/be/src/exec/parquet-column-readers.h
File be/src/exec/parquet-column-readers.h:

http://gerrit.cloudera.org:8080/#/c/8085/5/be/src/exec/parquet-column-readers.h@476
PS5, Line 476:   Status AllocateUncompressedDataPage(
> Should we call this "AllocateUncompressedDataBuffer"? Otherwise it sounds t
That seems vaguer. It would also be confusing since the pool is 
data_page_pool_. I updated the comment to mention that the contents are 
uncompressed rather than the page.


http://gerrit.cloudera.org:8080/#/c/8085/5/be/src/exec/parquet-column-readers.h@477
PS5, Line 477:   int64_t size, const std::string& desc, uint8_t** buffer);
> Maybe err_desc, err_detail, or detail? "desc" reminds me of descriptors.
Done. Also switch to const char* so we don't need to create a std::string for 
every call.


http://gerrit.cloudera.org:8080/#/c/8085/5/be/src/exec/parquet-column-readers.h@485
PS5, Line 485: IsStringType
> This does not say "VarLenStringType" but above in a comment you refer to va
Fixed it. We don't need to copy if it's a CHAR.


http://gerrit.cloudera.org:8080/#/c/8085/5/be/src/exec/parquet-column-readers.cc
File be/src/exec/parquet-column-readers.cc:

http://gerrit.cloudera.org:8080/#/c/8085/5/be/src/exec/parquet-column-readers.cc@1075
PS5, Line 1075:   uncompressed_size, "uncompressed variable-length 
data", _buffer));
> DCHECK(copy_buffer != nullptr); And maybe initialize it to nullptr, so that
It doesn't seem that necessary to me since it's part of the contract of the 
function and it will crash immediately anyway.


http://gerrit.cloudera.org:8080/#/c/8085/5/be/src/exec/parquet-column-readers.cc@1105
PS5, Line 1105: *buffer = data_page_pool_->TryAllocate(size);
Bad indentation.



--
To view, visit http://gerrit.cloudera.org:8080/8085
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I767c1e2dabde7d5bd7a4d5c1ec6d14801b8260d2
Gerrit-Change-Number: 8085
Gerrit-PatchSet: 5
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 26 Sep 2017 23:49:37 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5307: part 1: don't transfer disk I/O buffers out of parquet

2017-09-25 Thread Lars Volker (Code Review)
Lars Volker has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8085 )

Change subject: IMPALA-5307: part 1: don't transfer disk I/O buffers out of 
parquet
..


Patch Set 5:

(7 comments)

http://gerrit.cloudera.org:8080/#/c/8085/5//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/8085/5//COMMIT_MSG@7
PS5, Line 7: IMPALA-5307
Can you add a line at the bottom what the other part(s) would look like?


http://gerrit.cloudera.org:8080/#/c/8085/5//COMMIT_MSG@56
PS5, Line 56: 
+++---++-++++-+---+
Nit: You could make the second column smaller to make this more readable, and 
add a bottom delimiter line to indicate it was truncated on purposed and not by 
mistake.


http://gerrit.cloudera.org:8080/#/c/8085/5/be/src/exec/hdfs-parquet-scanner.cc
File be/src/exec/hdfs-parquet-scanner.cc:

http://gerrit.cloudera.org:8080/#/c/8085/5/be/src/exec/hdfs-parquet-scanner.cc@245
PS5, Line 245:   context_->ReleaseCompletedResources(nullptr, true);
I think it's best to change the whole file at once, or only change occurrences 
where necessary. This looks like it may be left from a previous patchset.


http://gerrit.cloudera.org:8080/#/c/8085/5/be/src/exec/parquet-column-readers.h
File be/src/exec/parquet-column-readers.h:

http://gerrit.cloudera.org:8080/#/c/8085/5/be/src/exec/parquet-column-readers.h@476
PS5, Line 476:   Status AllocateUncompressedDataPage(
Should we call this "AllocateUncompressedDataBuffer"? Otherwise it sounds to me 
like it'll only be needed for uncompressed pages.


http://gerrit.cloudera.org:8080/#/c/8085/5/be/src/exec/parquet-column-readers.h@477
PS5, Line 477:   int64_t size, const std::string& desc, uint8_t** buffer);
Maybe err_desc, err_detail, or detail? "desc" reminds me of descriptors.


http://gerrit.cloudera.org:8080/#/c/8085/5/be/src/exec/parquet-column-readers.h@485
PS5, Line 485: IsStringType
This does not say "VarLenStringType" but above in a comment you refer to 
var-len data. Can you clarify one of them?


http://gerrit.cloudera.org:8080/#/c/8085/5/be/src/exec/parquet-column-readers.cc
File be/src/exec/parquet-column-readers.cc:

http://gerrit.cloudera.org:8080/#/c/8085/5/be/src/exec/parquet-column-readers.cc@1075
PS5, Line 1075:   uncompressed_size, "uncompressed variable-length 
data", _buffer));
DCHECK(copy_buffer != nullptr); And maybe initialize it to nullptr, so that 
it's explicit what the allocation will do.



--
To view, visit http://gerrit.cloudera.org:8080/8085
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I767c1e2dabde7d5bd7a4d5c1ec6d14801b8260d2
Gerrit-Change-Number: 8085
Gerrit-PatchSet: 5
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-Comment-Date: Mon, 25 Sep 2017 23:30:35 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5307: part 1: don't transfer disk I/O buffers out of parquet

2017-09-22 Thread Tim Armstrong (Code Review)
Tim Armstrong has uploaded a new patch set (#5). ( 
http://gerrit.cloudera.org:8080/8085 )

Change subject: IMPALA-5307: part 1: don't transfer disk I/O buffers out of 
parquet
..

IMPALA-5307: part 1: don't transfer disk I/O buffers out of parquet

This change only affects uncompressed plain-encoded Parquet where
RowBatches may directly reference strings stored in the I/O
buffers. The proposed fix is to simply copy the data pages if
needed then use the same logic that we use for decompressed data
pages.

This copy inevitably adds some CPU overhead, but I believe this is
acceptable because:
* We generally recommend using compression, and optimize for that
  case.
* Copying memory is cheaper than decompressing data.
* Scans of uncompressed data are very likely to be I/O bound.

This allows several major simplifications:
* The resource management for compressed and uncompressed
  scans is much more similar.
* We don't need to attach Disk I/O buffers to RowBatches.
* We don't need to deal with attaching I/O buffers in
  ScannerContext.
* Column readers can release each I/O buffer *before* advancing to
  the next one, making it easier to reason about resource
  consumption. E.g. each Parquet column only needs one I/O buffer at
  a time to make progress.

Testing
===
Ran core ASAN and exhaustive debug builds.

Perf

No difference in most cases when scanning uncompressed parquet.
There is a significant regression (50% increase in runtime) in
targeted perf tests scanning non-dictionary-encoded strings (see
benchmark output below).  After the regression performance is
comparable to Snappy compression.

I also did a TPC-H run but ran into some issues with the report
generator. I manually compared times and there were no regressions.

++---+-++++
| Workload   | File Format   | Avg (s) | Delta(Avg) | 
GeoMean(s) | Delta(GeoMean) |
++---+-++++
| TARGETED-PERF(_61) | parquet / none / none | 23.02   | +0.60% | 4.23  
 | +5.97% |
++---+-++++

+++---++-++++-+---+
| Workload   | Query  | 
File Format   | Avg(s) | Base Avg(s) | Delta(Avg) | StdDev(%)  | Base 
StdDev(%) | Num Clients | Iters |
+++---++-++++-+---+
| TARGETED-PERF(_61) | PERF_STRING-Q2 | 
parquet / none / none | 3.00   | 1.98| R +52.10%  |   0.97%|   
1.25%| 1   | 5 |
| TARGETED-PERF(_61) | PERF_STRING-Q1 | 
parquet / none / none | 2.86   | 1.92| R +49.11%  |   0.34%|   
2.34%| 1   | 5 |
| TARGETED-PERF(_61) | PERF_STRING-Q3 | 
parquet / none / none | 3.16   | 2.15| R +47.04%  |   1.03%|   
0.72%| 1   | 5 |
| TARGETED-PERF(_61) | PERF_STRING-Q4 | 
parquet / none / none | 3.16   | 2.17| R +45.60%  |   0.14%|   
1.11%| 1   | 5 |
| TARGETED-PERF(_61) | PERF_STRING-Q5 | 
parquet / none / none | 3.51   | 2.55| R +37.88%  |   0.83%|   
0.49%| 1   | 5 |
| TARGETED-PERF(_61) | PERF_AGG-Q5| 
parquet / none / none | 0.79   | 0.61| R +30.86%  |   1.54%|   
4.10%| 1   | 5 |
| TARGETED-PERF(_61) | primitive_top-n_all| 
parquet / none / none | 39.45  | 35.07   |   +12.51%  |   0.29%|   
0.29%| 1   | 5 |
| TARGETED-PERF(_61) | PERF_STRING-Q7 | 
parquet / none / none | 6.78   | 6.10|   +11.13%  |   0.99%|   
0.74%| 1   | 5 |
| TARGETED-PERF(_61) | PERF_STRING-Q6 | 
parquet / none / none | 8.83   | 8.14|   +8.52%   |   0.15%|   
0.32%| 1   | 5 |

Change-Id: I767c1e2dabde7d5bd7a4d5c1ec6d14801b8260d2
---
M be/src/exec/hdfs-parquet-scanner.cc
M be/src/exec/parquet-column-readers.cc
M be/src/exec/parquet-column-readers.h
M be/src/exec/scanner-context.h
4 files changed, 63 insertions(+), 52 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/85/8085/5
--