Repository: incubator-impala
Updated Branches:
  refs/heads/master 00e3a55cb -> 9f4d9ff68

IMPALA-5778: clarify --read_size option.

Remove BTS_BLOCK_OVERFLOW error code, which is no longer used and
referenced --read_size.

Improve the flag description. The output is now:

    -read_size ((Advanced) The preferred I/O request size in bytes to issue to
      HDFS or the local filesystem. Increasing the read size will increase
      memory requirements. Decreasing the read size may decrease I/O
      throughput.) type: int32 default: 8388608

Testing:
Tested that Impala built and basic queries could run.

Change-Id: I3c20a9d55f89170b11f569c90b7f2949ddbe4211
Reviewed-on: http://gerrit.cloudera.org:8080/7623
Reviewed-by: Tim Armstrong <tarmstr...@cloudera.com>
Tested-by: Impala Public Jenkins

Project: http://git-wip-us.apache.org/repos/asf/incubator-impala/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-impala/commit/9f4d9ff6
Tree: http://git-wip-us.apache.org/repos/asf/incubator-impala/tree/9f4d9ff6
Diff: http://git-wip-us.apache.org/repos/asf/incubator-impala/diff/9f4d9ff6

Branch: refs/heads/master
Commit: 9f4d9ff68fae72d2c7cce44587b82e977ea668e9
Parents: 00e3a55
Author: Tim Armstrong <tarmstr...@cloudera.com>
Authored: Tue Aug 8 17:10:06 2017 -0700
Committer: Impala Public Jenkins <impala-public-jenk...@gerrit.cloudera.org>
Committed: Fri Aug 11 01:18:03 2017 +0000

----------------------------------------------------------------------
 be/src/runtime/disk-io-mgr.cc         | 7 +++++--
 common/thrift/generate_error_codes.py | 4 +---
 2 files changed, 6 insertions(+), 5 deletions(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/9f4d9ff6/be/src/runtime/disk-io-mgr.cc
----------------------------------------------------------------------
diff --git a/be/src/runtime/disk-io-mgr.cc b/be/src/runtime/disk-io-mgr.cc
index b168cdb..21edbb2 100644
--- a/be/src/runtime/disk-io-mgr.cc
+++ b/be/src/runtime/disk-io-mgr.cc
@@ -78,11 +78,14 @@ DEFINE_int32(num_s3_io_threads, 16, "Number of S3 I/O threads");
 // enforced by ADLS for a cluster, which spans between 500-700. For smaller clusters
 // (~10 nodes), 64 threads would be more ideal.
 DEFINE_int32(num_adls_io_threads, 16, "Number of ADLS I/O threads");
-// The read size is the size of the reads sent to hdfs/os.
+// The read size is the preferred size of the reads issued to HDFS or the local FS.
 // There is a trade off of latency and throughout, trying to keep disks busy but
 // not introduce seeks. The literature seems to agree that with 8 MB reads, random
 // io and sequential io perform similarly.
-DEFINE_int32(read_size, 8 * 1024 * 1024, "Read Size (in bytes)");
+DEFINE_int32(read_size, 8 * 1024 * 1024, "(Advanced) The preferred I/O request size in "
+    "bytes to issue to HDFS or the local filesystem. Increasing the read size will "
+    "increase memory requirements. Decreasing the read size may decrease I/O "
+    "throughput.");
 DECLARE_int64(min_buffer_size);

 // With 1024B through 8MB buffers, this is up to ~2GB of buffers.

http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/9f4d9ff6/common/thrift/generate_error_codes.py
----------------------------------------------------------------------
diff --git a/common/thrift/generate_error_codes.py b/common/thrift/generate_error_codes.py
index ccd713c..ad0e342 100755
--- a/common/thrift/generate_error_codes.py
+++ b/common/thrift/generate_error_codes.py
@@ -206,9 +206,7 @@ error_codes = (

   ("UDF_MEM_LIMIT_EXCEEDED", 64, "$0's allocations exceeded memory limits."),

-  ("BTS_BLOCK_OVERFLOW", 65, "Cannot process row that is bigger than the IO size "
-   "(row_size=$0, null_indicators_size=$1). To run this query, increase the IO size "
-   "(--read_size option)."),
+  ("UNUSED_65", 65, "No longer in use."),

   ("COMPRESSED_FILE_MULTIPLE_BLOCKS", 66,
    "For better performance, snappy-, gzip-, and bzip-compressed files "