Bikramjeet Vig has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9757 )

Change subject: IMPALA-6679,IMPALA-6678: reduce scan reservation
......................................................................


Patch Set 6:

(20 comments)

> This isn't ready for review yet, but since you were interested.
 > I've been able to confirm that the mem_usage_scaling regressions
 > were purely because of reserving more memory than needed. I haven't
 > investigated if these changes are sufficient to solve the stress
 > test regressions.

did you get time to investigate how this affects the stress test regression?

http://gerrit.cloudera.org:8080/#/c/9757/6//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/9757/6//COMMIT_MSG@13
PS6, Line 13: When starting each scan range, check to see how big the initial 
scan
            : range is (the full thing for row-based formats, the footer for
            : Parquet) and determine whether more reservation would be useful.
            :
            : For Parquet, check the actual column layout of the file before
            : increasing reservation to avoid getting reservation that we
            : can't actually use for the files that we're scanning. This also
            : means we can avoid estimating an "ideal" reservation in the 
planner.
do we have tests that verify that the reservation is increased to expected 
levels after a low planner estimate? both for parquet and other formats


http://gerrit.cloudera.org:8080/#/c/9757/6//COMMIT_MSG@43
PS6, Line 43: Added counters to track when threads were not spawned due to 
reservation
            : and to track when reservation increases are requested and denied.
add tests to verify that reservations were denied if enough was not available


http://gerrit.cloudera.org:8080/#/c/9757/6/be/src/exec/hdfs-parquet-scanner-test.cc
File be/src/exec/hdfs-parquet-scanner-test.cc:

http://gerrit.cloudera.org:8080/#/c/9757/6/be/src/exec/hdfs-parquet-scanner-test.cc@45
PS6, Line 45: // Should round up to nearest power-of-two buffer size if < max 
scan range buffer
maybe add a case that is somewhere in between min and max buffer


http://gerrit.cloudera.org:8080/#/c/9757/6/be/src/exec/hdfs-parquet-scanner.cc
File be/src/exec/hdfs-parquet-scanner.cc:

http://gerrit.cloudera.org:8080/#/c/9757/6/be/src/exec/hdfs-parquet-scanner.cc@1699
PS6, Line 1699: ideal_reservation += min(BitUtil::RoundUpToPowerOf2(len, 
max_buffer_size),
              :           DiskIoMgr::IDEAL_MAX_SIZED_BUFFERS_PER_SCAN_RANGE * 
max_buffer_size);
maybe mention here briefly why we increment in steps of max_buffer_size if 
len>max_buffer_size. Similar to description in line 1750 -1753


http://gerrit.cloudera.org:8080/#/c/9757/6/be/src/exec/hdfs-scan-node-base.cc
File be/src/exec/hdfs-scan-node-base.cc:

http://gerrit.cloudera.org:8080/#/c/9757/6/be/src/exec/hdfs-scan-node-base.cc@527
PS6, Line 527: int64_t target = BitUtil::RoundUpToPowerOf2(*reservation, 
max_buffer_size);
             :       if (*reservation == target) {
             :         // If it's already an I/O buffer multiple, increase by 
an I/O buffer or to ideal.
             :         target = min(*reservation + max_buffer_size, 
ideal_scan_range_reservation);
             :       }
how about
int64_t target = BitUtil::RoundUpToPowerOf2( *reservation + 1, max_buffer_size);

this will always result an increase to an I/O buffer multiple upto the 
ideal_scan_range_reservation (since *reservation < ideal_scan_range_reservation)


http://gerrit.cloudera.org:8080/#/c/9757/6/be/src/exec/hdfs-scan-node.h
File be/src/exec/hdfs-scan-node.h:

http://gerrit.cloudera.org:8080/#/c/9757/6/be/src/exec/hdfs-scan-node.h@178
PS6, Line 178:   /// this thread with DeductReservationForScannerThread().
maybe mention that it releases Reservation and only holds on to minimum 
reservation.


http://gerrit.cloudera.org:8080/#/c/9757/6/be/src/exec/hdfs-scan-node.h@204
PS6, Line 204: or
nit: return all

maybe mention that it holds on to minimum reservation


http://gerrit.cloudera.org:8080/#/c/9757/6/be/src/exec/hdfs-scan-node.cc
File be/src/exec/hdfs-scan-node.cc:

http://gerrit.cloudera.org:8080/#/c/9757/6/be/src/exec/hdfs-scan-node.cc@226
PS6, Line 226: spare_reservation_
what if between line 225 and 226, some scanner thread releases its reservation, 
which results in spare reservation being more than 
resource_profile_.min_reservation and 'increase' being negative. does that mean 
that buffer_pool_client_.IncreaseReservation would actually let go of 
reservation? If yes, is that the intended behaviour?


http://gerrit.cloudera.org:8080/#/c/9757/6/be/src/exec/hdfs-scan-node.cc@392
PS6, Line 392: // Take a snapshot of num_unqueued_files_ before calling 
GetNextUnstartedRange().
             :     // We don't want num_unqueued_files_ to go to zero between 
the return from
             :     // GetNextUnstartedRange() and the check for when all ranges 
are complete.
update comment: replace GetNextUnstartedRange with StartNextScanRange.

Here and at line 434


http://gerrit.cloudera.org:8080/#/c/9757/6/be/src/exec/hdfs-scan-node.cc@481
PS6, Line 481: *scanner_thread_reservation, partition, filter_ctxs, 
expr_results_pool);
             :   context.AddStream(scan_range, *scanner_thread_reservation);
reservation passed twice. can we remove redundancy by using 
context->total_reservation_ inside  AddStream()?


http://gerrit.cloudera.org:8080/#/c/9757/6/be/src/exec/hdfs-scan-node.cc@518
PS6, Line 518: // Reservation may have been increased by the scanner.
maybe mention, that this happens only in case of column formats like parquet 
scanner


http://gerrit.cloudera.org:8080/#/c/9757/6/be/src/exec/scanner-context.h
File be/src/exec/scanner-context.h:

http://gerrit.cloudera.org:8080/#/c/9757/6/be/src/exec/scanner-context.h@379
PS6, Line 379: o
nit: to allow


http://gerrit.cloudera.org:8080/#/c/9757/6/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
File fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java:

http://gerrit.cloudera.org:8080/#/c/9757/6/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@150
PS6, Line 150:   // Default reservation for a backend scan range for a column 
in columnar formats like
should we mention the units just to ward off any ambiguity?


http://gerrit.cloudera.org:8080/#/c/9757/6/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@154
PS6, Line 154: // TODO: is it worth making this a tunable query option?
not a huge fan of adding more knobs that the user doesnt know how to tune. If 
the purpose of providing this as a query option is to have a workaround, what 
kindof scenario do you think this would be useful in? a lot of small files 
without stats? would a better solution be to calculate stats instead?


http://gerrit.cloudera.org:8080/#/c/9757/6/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@158
PS6, Line 158: FOOTER_READ_SIZE
nit: how about PARQUET_FOOTER_READ_SIZE, just to make it explicit


http://gerrit.cloudera.org:8080/#/c/9757/6/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@752
PS6, Line 752: maxScanRangeBytes_
confusing at first when you look at this and maxScanRangeLength,
maybe rename maxScanRangeLength to maxScanRangeLengthLimit
OR
rename maxScanRangeLength to maxScanRangeLengthUsed
?
one is enforced limit and other is the actual max length used.


http://gerrit.cloudera.org:8080/#/c/9757/6/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@1264
PS6, Line 1264: fileFormats_.contains(HdfsFileFormat.PARQUET)
will estimates be biased towards parquet if there are multiple formats?
does it even matter if it does?


http://gerrit.cloudera.org:8080/#/c/9757/6/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@1443
PS6, Line 1443: estimates
nit: repeated word


http://gerrit.cloudera.org:8080/#/c/9757/6/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@1444
PS6, Line 1444: In that case the reservation
              :    * may still be usable for a different column.
you mean in the case where we underestimate it for another column but try to 
reserve more during the scanner reservation increase phase in the backend?


http://gerrit.cloudera.org:8080/#/c/9757/6/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@1465
PS6, Line 1465: encodedDataSize
nit: encodedDataByteSize



--
To view, visit http://gerrit.cloudera.org:8080/9757
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ifc80e05118a9eef72cac8e2308418122e3ee0842
Gerrit-Change-Number: 9757
Gerrit-PatchSet: 6
Gerrit-Owner: Tim Armstrong <tarmstr...@cloudera.com>
Gerrit-Reviewer: Bikramjeet Vig <bikramjeet....@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dhe...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <tarmstr...@cloudera.com>
Gerrit-Comment-Date: Mon, 16 Apr 2018 22:23:47 +0000
Gerrit-HasComments: Yes

Reply via email to