Repository: incubator-impala
Updated Branches:
refs/heads/master 00e3a55cb -> 9f4d9ff68
IMPALA-5778: clarify --read_size option.
Remove BTS_BLOCK_OVERFLOW error code, which is no longer used and
referenced --read_size.
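
A minimal sketch of why a retired code might be kept as a placeholder
rather than deleted. It assumes that numeric error IDs are expected to stay
unique and contiguous; the commit does not state this reason, and the
check_codes helper and the trimmed table below are hypothetical, not part of
generate_error_codes.py.

```python
# Hypothetical, trimmed-down table in the style of generate_error_codes.py.
# The message for code 66 is elided here on purpose.
error_codes = (
    ("UDF_MEM_LIMIT_EXCEEDED", 64, "$0's allocations exceeded memory limits."),
    ("UNUSED_65", 65, "No longer in use."),
    ("COMPRESSED_FILE_MULTIPLE_BLOCKS", 66, "(message elided)"),
)


def check_codes(codes):
    """Assumed invariant: IDs increase by exactly one from the first entry."""
    ids = [entry[1] for entry in codes]
    return ids == list(range(ids[0], ids[0] + len(ids)))


# With the placeholder in place, the numbering has no gap.
with_placeholder_ok = check_codes(error_codes)

# Dropping entry 65 outright would leave a hole between 64 and 66.
without_65 = tuple(entry for entry in error_codes if entry[1] != 65)
without_placeholder_ok = check_codes(without_65)
```

Under that assumption, the placeholder keeps every later code at its
existing number.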
Improve the flag description. The output is now:
-read_size ((Advanced) The preferred I/O request size in bytes to issue to
HDFS or the local filesystem. Increasing the read size will increase
memory requirements. Decreasing the read size may decrease I/O
throughput.) type: int32 default: 8388608
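
As a rough illustration of the trade-off described in the flag text, the
sketch below counts how many read requests a sequential scan would need at a
given read size. The num_read_requests helper is hypothetical and ignores
everything else the I/O manager does; only the default value comes from the
flag definition.

```python
DEFAULT_READ_SIZE = 8 * 1024 * 1024  # the flag default, in bytes


def num_read_requests(file_bytes, read_size=DEFAULT_READ_SIZE):
    """Hypothetical helper: requests needed to scan file_bytes sequentially."""
    if read_size <= 0:
        raise ValueError("read_size must be positive")
    # Integer ceiling division, so a partial final read still counts.
    return (file_bytes + read_size - 1) // read_size


one_gib = 1024 ** 3
requests_at_default = num_read_requests(one_gib)           # 8 MiB reads
requests_at_1mib = num_read_requests(one_gib, 1024 * 1024)  # 1 MiB reads
```

Smaller reads mean more requests for the same data, while larger reads mean
each in-flight request holds a larger buffer.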
Testing:
Tested that Impala built and basic queries could run.
Change-Id: I3c20a9d55f89170b11f569c90b7f2949ddbe4211
Reviewed-on: http://gerrit.cloudera.org:8080/7623
Reviewed-by: Tim Armstrong <[email protected]>
Tested-by: Impala Public Jenkins
Project: http://git-wip-us.apache.org/repos/asf/incubator-impala/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-impala/commit/9f4d9ff6
Tree: http://git-wip-us.apache.org/repos/asf/incubator-impala/tree/9f4d9ff6
Diff: http://git-wip-us.apache.org/repos/asf/incubator-impala/diff/9f4d9ff6
Branch: refs/heads/master
Commit: 9f4d9ff68fae72d2c7cce44587b82e977ea668e9
Parents: 00e3a55
Author: Tim Armstrong <[email protected]>
Authored: Tue Aug 8 17:10:06 2017 -0700
Committer: Impala Public Jenkins <[email protected]>
Committed: Fri Aug 11 01:18:03 2017 +0000
----------------------------------------------------------------------
be/src/runtime/disk-io-mgr.cc | 7 +++++--
common/thrift/generate_error_codes.py | 4 +---
2 files changed, 6 insertions(+), 5 deletions(-)
----------------------------------------------------------------------
http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/9f4d9ff6/be/src/runtime/disk-io-mgr.cc
----------------------------------------------------------------------
diff --git a/be/src/runtime/disk-io-mgr.cc b/be/src/runtime/disk-io-mgr.cc
index b168cdb..21edbb2 100644
--- a/be/src/runtime/disk-io-mgr.cc
+++ b/be/src/runtime/disk-io-mgr.cc
@@ -78,11 +78,14 @@ DEFINE_int32(num_s3_io_threads, 16, "Number of S3 I/O threads");
// enforced by ADLS for a cluster, which spans between 500-700. For smaller clusters
// (~10 nodes), 64 threads would be more ideal.
DEFINE_int32(num_adls_io_threads, 16, "Number of ADLS I/O threads");
-// The read size is the size of the reads sent to hdfs/os.
+// The read size is the preferred size of the reads issued to HDFS or the local FS.
// There is a trade off of latency and throughout, trying to keep disks busy but
// not introduce seeks. The literature seems to agree that with 8 MB reads, random
// io and sequential io perform similarly.
-DEFINE_int32(read_size, 8 * 1024 * 1024, "Read Size (in bytes)");
+DEFINE_int32(read_size, 8 * 1024 * 1024, "(Advanced) The preferred I/O request size in "
+ "bytes to issue to HDFS or the local filesystem. Increasing the read size will "
+ "increase memory requirements. Decreasing the read size may decrease I/O "
+ "throughput.");
DECLARE_int64(min_buffer_size);
// With 1024B through 8MB buffers, this is up to ~2GB of buffers.
http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/9f4d9ff6/common/thrift/generate_error_codes.py
----------------------------------------------------------------------
diff --git a/common/thrift/generate_error_codes.py b/common/thrift/generate_error_codes.py
index ccd713c..ad0e342 100755
--- a/common/thrift/generate_error_codes.py
+++ b/common/thrift/generate_error_codes.py
@@ -206,9 +206,7 @@ error_codes = (
("UDF_MEM_LIMIT_EXCEEDED", 64, "$0's allocations exceeded memory limits."),
- ("BTS_BLOCK_OVERFLOW", 65, "Cannot process row that is bigger than the IO size "
- "(row_size=$0, null_indicators_size=$1). To run this query, increase the IO size "
- "(--read_size option)."),
+ ("UNUSED_65", 65, "No longer in use."),
("COMPRESSED_FILE_MULTIPLE_BLOCKS", 66,
"For better performance, snappy-, gzip-, and bzip-compressed files "