[
https://issues.apache.org/jira/browse/IMPALA-13564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17900246#comment-17900246
]
LiuYuan commented on IMPALA-13564:
----------------------------------
Query options (defaults shown in []):
ABORT_ON_ERROR: [0]
COMPRESSION_CODEC: []
DEFAULT_FILE_FORMAT: [TEXT]
DEFAULT_HINTS_INSERT_STATEMENT: []
DISABLE_CODEGEN: [0]
DISABLE_HDFS_NUM_ROWS_ESTIMATE: [0]
DISABLE_ROW_RUNTIME_FILTERING: [0]
DISABLE_STREAMING_PREAGGREGATIONS: [0]
DISABLE_UNSAFE_SPILLS: [0]
ENABLE_TRIVIAL_QUERY_FOR_ADMISSION: [1]
EXEC_TIME_LIMIT_S: [0]
EXPAND_COMPLEX_TYPES: [0]
EXPLAIN_LEVEL: [STANDARD]
IDLE_SESSION_TIMEOUT: [3600]
LOCK_MAX_WAIT_TIME_S: [300]
MAX_ROW_SIZE: [524288]
MAX_STATEMENT_LENGTH_BYTES: [16777216]
MEM_LIMIT: [0]
MT_DOP: []
NUM_SCANNER_THREADS: [0]
OPTIMIZE_PARTITION_KEY_SCANS: [0]
OPTIMIZE_SIMPLE_LIMIT: [0]
ORC_SCHEMA_RESOLUTION: [POSITION]
PARQUET_ARRAY_RESOLUTION: [THREE_LEVEL]
PARQUET_FALLBACK_SCHEMA_RESOLUTION: NAME
QUERY_TIMEOUT_S: [0]
REQUEST_POOL: []
RETRY_FAILED_QUERIES: [0]
RUNTIME_FILTER_MODE: [GLOBAL]
RUNTIME_FILTER_WAIT_TIME_MS: [0]
S3_SKIP_INSERT_STAGING: [1]
SCRATCH_LIMIT: [-1]
STATEMENT_EXPRESSION_LIMIT: [250000]
SYNC_DDL: [0]
THREAD_RESERVATION_AGGREGATE_LIMIT: [0]
THREAD_RESERVATION_LIMIT: [3000]
TIMEZONE: [Asia/Shanghai]
VALUES_STMT_AVOID_LOSSY_CHAR_PADDING: [0]
Advanced Query Options:
ABORT_JAVA_UDF_ON_EXCEPTION: [0]
AGG_MEM_CORRELATION_FACTOR: [0.500000]
ANALYTIC_RANK_PUSHDOWN_THRESHOLD: [1000]
APPX_COUNT_DISTINCT: [0]
BROADCAST_BYTES_LIMIT: [34359738368]
BROADCAST_TO_PARTITION_FACTOR: [1.000000]
BUFFER_POOL_LIMIT: []
CLIENT_IDENTIFIER: Impala Shell v4.3.0-RELEASE (535a8af) built on Thu Sep 26 02:10:35 CST 2024
COMPUTE_COLUMN_MINMAX_STATS: [0]
COMPUTE_PROCESSING_COST: [0]
COMPUTE_STATS_MIN_SAMPLE_SIZE: [1073741824]
CONVERT_LEGACY_HIVE_PARQUET_UTC_TIMESTAMPS: [0]
DEFAULT_JOIN_DISTRIBUTION_MODE: [BROADCAST]
DEFAULT_NDV_SCALE: [2]
DEFAULT_SPILLABLE_BUFFER_SIZE: [2097152]
DEFAULT_TRANSACTIONAL_TYPE: [NONE]
DELETE_STATS_IN_TRUNCATE: [1]
DISABLE_CODEGEN_CACHE: [0]
DISABLE_CODEGEN_ROWS_THRESHOLD: [50000]
DISABLE_DATA_CACHE: [0]
DISABLE_HBASE_NUM_ROWS_ESTIMATE: [0]
DISABLE_OPTIMIZED_ICEBERG_V2_READ: [0]
ENABLED_RUNTIME_FILTER_TYPES: [BLOOM,MIN_MAX]
ENABLE_ASYNC_DDL_EXECUTION: [1]
ENABLE_ASYNC_LOAD_DATA_EXECUTION: [1]
ENABLE_CNF_REWRITES: [1]
ENABLE_DISTINCT_SEMI_JOIN_OPTIMIZATION: [1]
ENABLE_EXPR_REWRITES: [1]
ENABLE_OUTER_JOIN_TO_INNER_TRANSFORMATION: [0]
ENABLE_REPLAN: [1]
EXEC_SINGLE_NODE_ROWS_THRESHOLD: [100]
FALLBACK_DB_FOR_FUNCTIONS: []
FETCH_ROWS_TIMEOUT_MS: [10000]
HBASE_CACHE_BLOCKS: [0]
HBASE_CACHING: [0]
JOIN_ROWS_PRODUCED_LIMIT: [0]
JOIN_SELECTIVITY_CORRELATION_FACTOR: [0.000000]
KUDU_READ_MODE: [DEFAULT]
KUDU_REPLICA_SELECTION: [CLOSEST_REPLICA]
KUDU_SNAPSHOT_READ_TIMESTAMP_MICROS: [0]
LARGE_AGG_MEM_THRESHOLD: [536870912]
MAX_CNF_EXPRS: [200]
MAX_ERRORS: [100]
MAX_FRAGMENT_INSTANCES_PER_NODE: [128]
MAX_FS_WRITERS: [0]
MAX_MEM_ESTIMATE_FOR_ADMISSION: [0]
MAX_NUM_RUNTIME_FILTERS: [10]
MEM_LIMIT_COORDINATORS: [0]
MEM_LIMIT_EXECUTORS: [0]
MINMAX_FILTERING_LEVEL: [ROW_GROUP]
MINMAX_FILTER_FAST_CODE_PATH: [ON]
MINMAX_FILTER_PARTITION_COLUMNS: [1]
MINMAX_FILTER_SORTED_COLUMNS: [1]
MINMAX_FILTER_THRESHOLD: [0.000000]
MIN_SPILLABLE_BUFFER_SIZE: [65536]
NUM_REMOTE_EXECUTOR_CANDIDATES: [3]
NUM_ROWS_PRODUCED_LIMIT: [0]
NUM_THREADS_FOR_TABLE_MIGRATION: [1]
ORC_ASYNC_READ: [1]
ORC_READ_STATISTICS: [1]
PARQUET_ANNOTATE_STRINGS_UTF8: [0]
PARQUET_BLOOM_FILTERING: 0
PARQUET_BLOOM_FILTER_WRITE: [IF_NO_DICT]
PARQUET_DICTIONARY_FILTERING: 0
PARQUET_DICTIONARY_RUNTIME_FILTER_ENTRY_LIMIT: [1024]
PARQUET_FILE_SIZE: [0]
PARQUET_LATE_MATERIALIZATION_THRESHOLD: [20]
PARQUET_OBJECT_STORE_SPLIT_SIZE: [268435456]
PARQUET_PAGE_ROW_COUNT_LIMIT: []
PARQUET_READ_PAGE_INDEX: [1]
PARQUET_READ_STATISTICS: [1]
PARQUET_WRITE_PAGE_INDEX: [1]
PREAGG_BYTES_LIMIT: [-1]
PREFETCH_MODE: [HT_BUCKET]
PROCESSING_COST_MIN_THREADS: [1]
REFRESH_UPDATED_HMS_PARTITIONS: [0]
REPLICA_PREFERENCE: [CACHE_LOCAL]
REPORT_SKEW_LIMIT: [1.000000]
RESOURCE_TRACE_RATIO: [0.000000]
RUNTIME_BLOOM_FILTER_SIZE: [1048576]
RUNTIME_FILTER_ERROR_RATE: []
RUNTIME_FILTER_MAX_SIZE: [16777216]
RUNTIME_FILTER_MIN_SIZE: [1048576]
RUNTIME_IN_LIST_FILTER_ENTRY_LIMIT: [1024]
SCAN_BYTES_LIMIT: [0]
SCHEDULE_RANDOM_REPLICA: [0]
SHOW_COLUMN_MINMAX_STATS: [0]
SHUFFLE_DISTINCT_EXPRS: [1]
SORT_RUN_BYTES_LIMIT: [-1]
SPOOL_ALL_RESULTS_FOR_RETRIES: [1]
STRINGIFY_MAP_KEYS: [0]
TARGETED_KUDU_SCAN_RANGE_LENGTH: [-1]
TEST_REPLAN: [0]
TOPN_BYTES_LIMIT: [536870912]
USE_DOP_FOR_COSTING: [1]
USE_LOCAL_TZ_FOR_UNIX_TIMESTAMP_CONVERSIONS: [0]
Shell Options
LIVE_PROGRESS: True
LIVE_SUMMARY: False
WRITE_DELIMITED: False
VERBOSE: True
DELIMITER: \t
OUTPUT_FILE: None
VERTICAL: False
Variables:
No variables defined.
> Executor crash in impala::DecodeValue<impala::StringValue> when running select * from
> a table that has hundreds of string columns
> -------------------------------------------------------------------------------------------------------------------
>
> Key: IMPALA-13564
> URL: https://issues.apache.org/jira/browse/IMPALA-13564
> Project: IMPALA
> Issue Type: Bug
> Components: Backend
> Affects Versions: Impala 4.3.0
> Reporter: LiuYuan
> Priority: Major
>
> When I run select * on a table that has hundreds of string columns, I get a
> SIGSEGV.
> Here is the gdb backtrace:
> {code:java}
> (gdb) bt
> #0 0x0000000001e9b90c in impala::DecodeValue<impala::StringValue>
> (decode_error=0x7f3a31d60b04, out_val=0x427ec435, idx=<optimized out>,
> dict_len=17, dict=0x17292780)
> at
> /data/fuxi_ci_workspace/6713819dec1a06263634b15a/be/src/util/bit-packing.inline.h:295
> #1 impala::BitPacking::UnpackAndDecode32Values<impala::StringValue, 5>
> (in=in@entry=0x1cc2b622 "\002\204\001\300", dict=0x17292780,
> dict_len=dict_len@entry=17, out=<optimized out>, stride=stride@entry=7049,
> decode_error=0x7f3a31d60b04,
> in_bytes=<optimized out>) at
> /data/fuxi_ci_workspace/6713819dec1a06263634b15a/be/src/util/bit-packing.inline.h:356
> #2 0x0000000001f5e7c1 in
> impala::BitPacking::UnpackAndDecodeValues<impala::StringValue, 5>
> (decode_error=0x7f3a31d60b04, stride=7049, out=0x42563774,
> num_values=<optimized out>, dict_len=17, dict=0x17292780, in_bytes=<optimized
> out>,
> in=<optimized out>) at
> /data/fuxi_ci_workspace/6713819dec1a06263634b15a/be/src/util/bit-packing.inline.h:145
> #3 impala::BitPacking::UnpackAndDecodeValues<impala::StringValue>
> (bit_width=<optimized out>, in=<optimized out>, in_bytes=12210,
> dict=dict@entry=0x17292780, dict_len=dict_len@entry=17,
> num_values=num_values@entry=480, out=0x42563774,
> stride=7049, decode_error=0x7f3a31d60b04) at
> /data/fuxi_ci_workspace/6713819dec1a06263634b15a/be/src/util/bit-packing.inline.h:124
> #4 0x0000000001f649fd in
> impala::BatchedBitReader::UnpackAndDecodeBatch<impala::StringValue>
> (stride=7049, v=0x42563774, num_values=<optimized out>, dict_len=<optimized
> out>, dict=0x17292780, bit_width=<optimized out>, this=<optimized out>)
> at
> /data/fuxi_ci_workspace/6713819dec1a06263634b15a/be/src/util/bit-stream-utils.inline.h:199
> #5 impala::RleBatchDecoder<unsigned
> int>::DecodeLiteralValues<impala::StringValue> (out=<synthetic pointer>,
> dict_len=<optimized out>, dict=0x17292780, num_literals_to_consume=504,
> this=<optimized out>)
> at
> /data/fuxi_ci_workspace/6713819dec1a06263634b15a/be/src/util/rle-encoding.h:628
> #6 impala::DictDecoder<impala::StringValue>::GetNextValues (count=520,
> stride=7049, first_value=0x42563774, this=0x20391450) at
> /data/fuxi_ci_workspace/6713819dec1a06263634b15a/be/src/util/dict-encoding.h:549
> #7 impala::ScalarColumnReader<impala::StringValue, (parquet::Type::type)6,
> true>::DecodeValues<(parquet::Encoding::type)2> (out_vals=0x42563774,
> count=<optimized out>, stride=7049, this=0x20391000)
> at
> /data/fuxi_ci_workspace/6713819dec1a06263634b15a/be/src/exec/parquet/parquet-column-readers.cc:858
> #8 impala::ScalarColumnReader<impala::StringValue, (parquet::Type::type)6,
> true>::DecodeValues (out_vals=0x42563774, count=<optimized out>, stride=7049,
> this=0x20391000)
> at
> /data/fuxi_ci_workspace/6713819dec1a06263634b15a/be/src/exec/parquet/parquet-column-readers.cc:846
> #9 impala::ScalarColumnReader<impala::StringValue, (parquet::Type::type)6,
> true>::ReadSlotsNoConversion (this=this@entry=0x20391000,
> num_to_read=<optimized out>, tuple_size=tuple_size@entry=7049,
> tuple_mem=<optimized out>)
> at
> /data/fuxi_ci_workspace/6713819dec1a06263634b15a/be/src/exec/parquet/parquet-column-readers.cc:770
> #10 0x0000000001f653c1 in impala::ScalarColumnReader<impala::StringValue,
> (parquet::Type::type)6, true>::ReadSlots (tuple_mem=<optimized out>,
> tuple_size=7049, num_to_read=1024, this=0x20391000)
> at
> /data/fuxi_ci_workspace/6713819dec1a06263634b15a/be/src/exec/parquet/parquet-column-readers.cc:735
> #11 impala::ScalarColumnReader<impala::StringValue, (parquet::Type::type)6,
> true>::MaterializeValueBatchRepeatedDefLevel (this=this@entry=0x20391000,
> max_values=max_values@entry=1024, tuple_size=tuple_size@entry=7049,
> tuple_mem=tuple_mem@entry=0x42200000 "@\247k\005",
> num_values=num_values@entry=0x7f3a31d60c20) at
> /data/fuxi_ci_workspace/6713819dec1a06263634b15a/be/src/exec/parquet/parquet-column-readers.cc:663
> #12 0x0000000001f65dcc in impala::ScalarColumnReader<impala::StringValue,
> (parquet::Type::type)6, true>::ReadValueBatch<false> (this=0x20391000,
> max_values=1024, tuple_size=7049, tuple_mem=0x42200000 "@\247k\005",
> num_values=0x1f5858d0)
> at
> /data/fuxi_ci_workspace/6713819dec1a06263634b15a/be/src/exec/parquet/parquet-column-readers.cc:496
> #13 0x0000000001db8f5b in impala::HdfsParquetScanner::FillScratchMicroBatches
> (this=0x1f5ad800, column_readers=..., row_batch=0x20890500,
> skip_row_group=0x1f5ada58, micro_batches=0x1f5adf04, num_micro_batches=1,
> max_num_tuples=595,
> num_tuples=<optimized out>) at
> /data/fuxi_ci_workspace/6713819dec1a06263634b15a/be/src/exec/parquet/hdfs-parquet-scanner.cc:2508
> #14 0x0000000001dcdf34 in impala::HdfsParquetScanner::AssembleRows<false>
> (this=this@entry=0x1f5ad800, row_batch=row_batch@entry=0x20890500,
> skip_row_group=<optimized out>)
> at
> /data/fuxi_ci_workspace/6713819dec1a06263634b15a/toolchain/toolchain-packages-gcc10.4.0/boost-1.74.0-p1/include/boost/smart_ptr/scoped_ptr.hpp:103
> #15 0x0000000001dcb9d0 in impala::HdfsParquetScanner::GetNextInternal
> (this=0x1f5ad800, row_batch=0x20890500) at
> /data/fuxi_ci_workspace/6713819dec1a06263634b15a/be/src/exec/parquet/hdfs-parquet-scanner.cc:532
> #16 0x0000000001dbc2ae in impala::HdfsParquetScanner::ProcessSplit
> (this=0x1f5ad800) at
> /data/fuxi_ci_workspace/6713819dec1a06263634b15a/be/src/exec/parquet/hdfs-parquet-scanner.cc:416
> #17 0x0000000001d3cdd6 in impala::HdfsScanNode::ProcessSplit
> (this=0x18c3e000, filter_ctxs=..., expr_results_pool=<optimized out>,
> scan_range=0x1f04c180, scanner_thread_reservation=0x7f3a31d61378)
> at
> /data/fuxi_ci_workspace/6713819dec1a06263634b15a/be/src/exec/hdfs-scan-node.cc:495
> #18 0x0000000001d3f41d in impala::HdfsScanNode::ScannerThread
> (this=0x18c3e000, first_thread=false, scanner_thread_reservation=<optimized
> out>) at
> /data/fuxi_ci_workspace/6713819dec1a06263634b15a/be/src/exec/hdfs-scan-node.cc:413
> #19 0x0000000001b6bd59 in boost::function0<void>::operator()
> (this=0x7f3a31d619d0) at
> /data/fuxi_ci_workspace/6713819dec1a06263634b15a/toolchain/toolchain-packages-gcc10.4.0/boost-1.74.0-p1/include/boost/function/function_template.hpp:763
> #20 impala::Thread::SuperviseThread(std::__cxx11::basic_string<char,
> std::char_traits<char>, std::allocator<char> > const&,
> std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>
> > const&, boost::function<void ()>, impala::ThreadDebugInfo const*,
> impala::Promise<long, (impala::PromiseMode)0>*) (name=..., category=...,
> functor=..., parent_thread_info=0x7f3a3e27e750, thread_started=0x7f3a3e27dd30)
> at
> /data/fuxi_ci_workspace/6713819dec1a06263634b15a/be/src/util/thread.cc:360
> #21 0x0000000001b6cff1 in
> boost::_bi::list5<boost::_bi::value<std::__cxx11::basic_string<char,
> std::char_traits<char>, std::allocator<char> > >,
> boost::_bi::value<std::__cxx11::basic_string<char, std::char_traits<char>,
> std::allocator<char> > >, boost::_bi::value<boost::function<void ()> >,
> boost::_bi::value<impala::ThreadDebugInfo*>,
> boost::_bi::value<impala::Promise<long, (impala::PromiseMode)0>*>
> >::operator()<void (*)(std::__cxx11::basic_string<char,
> std::char_traits<char>, std::allocator<char> > const&,
> std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>
> > const&, boost::function<void ()>, impala::ThreadDebugInfo const*,
> impala::Promise<long, (impala::PromiseMode)0>*),
> boost::_bi::list0>(boost::_bi::type<void>, void
> (*&)(std::__cxx11::basic_string<char, std::char_traits<char>,
> std::allocator<char> > const&, std::__cxx11::basic_string<char,
> std::char_traits<char>, std::allocator<char> > const&, boost::function<void
> ()>, impala::ThreadDebugInfo const*, impala::Promise<long,
> (impala::PromiseMode)0>*), boost::_bi::list0&, int) (a=<synthetic
> pointer>...,
> f=@0x1e5657f8: 0x1b6ba20
> <impala::Thread::SuperviseThread(std::__cxx11::basic_string<char,
> std::char_traits<char>, std::allocator<char> > const&,
> std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>
> > const&, boost::function<void ()>, impala::ThreadDebugInfo const*,
> impala::Promise<long, (impala::PromiseMode)0>*)>, this=0x1e565800)
> at
> /data/fuxi_ci_workspace/6713819dec1a06263634b15a/toolchain/toolchain-packages-gcc10.4.0/boost-1.74.0-p1/include/boost/bind/bind.hpp:531
> #22 boost::_bi::bind_t<void, void (*)(std::__cxx11::basic_string<char,
> std::char_traits<char>, std::allocator<char> > const&,
> std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>
> > const&, boost::function<void ()>, impala::ThreadDebugInfo const*,
> impala::Promise<long, (impala::PromiseMode)0>*),
> boost::_bi::list5<boost::_bi::value<std::__cxx11::basic_string<char,
> std::char_traits<char>, std::allocator<char> > >,
> boost::_bi::value<std::__cxx11::basic_string<char, std::char_traits<char>,
> std::allocator<char> > >, boost::_bi::value<boost::function<void ()> >,
> boost::_bi::value<impala::ThreadDebugInfo*>,
> boost::_bi::value<impala::Promise<long, (impala::PromiseMode)0>*> >
> >::operator()() (this=0x1e5657f8)
> at
> /data/fuxi_ci_workspace/6713819dec1a06263634b15a/toolchain/toolchain-packages-gcc10.4.0/boost-1.74.0-p1/include/boost/bind/bind.hpp:1294
> #23 boost::detail::thread_data<boost::_bi::bind_t<void, void
> (*)(std::__cxx11::basic_string<char, std::char_traits<char>,
> std::allocator<char> > const&, std::__cxx11::basic_string<char,
> std::char_traits<char>, std::allocator<char> > const&, boost::function<void
> ()>, impala::ThreadDebugInfo const*, impala::Promise<long,
> (impala::PromiseMode)0>*),
> boost::_bi::list5<boost::_bi::value<std::__cxx11::basic_string<char,
> std::char_traits<char>, std::allocator<char> > >,
> boost::_bi::value<std::__cxx11::basic_string<char, std::char_traits<char>,
> std::allocator<char> > >, boost::_bi::value<boost::function<void ()> >,
> boost::_bi::value<impala::ThreadDebugInfo*>,
> boost::_bi::value<impala::Promise<long, (impala::PromiseMode)0>*> > >
> >::run() (
> this=0x1e5656c0) at
> /data/fuxi_ci_workspace/6713819dec1a06263634b15a/toolchain/toolchain-packages-gcc10.4.0/boost-1.74.0-p1/include/boost/thread/detail/thread.hpp:120
> #24 0x000000000242e2c7 in thread_proxy ()
> #25 0x00007f3b8717c67a in ?? () from /usr/lib64/libc.so.6
> #26 0x00007f3b871ff160 in ?? () from /usr/lib64/libc.so.6
> #27 0x0000000000000000 in ?? ()
> {code}
> I think min(micro_batches[r].length, scratch_batch_->capacity) should be
> passed instead of micro_batches[r].length: micro_batches[r].length is 1024,
> but scratch_batch_->capacity is less than 1024 when row_size is larger than
> 4096.
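
The proposed clamp can be illustrated with a minimal standalone sketch. It is not the Impala code: the function names and the 4 MiB byte cap are assumptions made for illustration. The cap is inferred from the statement above that capacity drops below 1024 once the row size exceeds 4096 (4096 * 1024 = 4 MiB), and it is consistent with frame #13 of the backtrace, where tuple_size=7049 and max_num_tuples=595.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical constants for illustration; not taken from the Impala source.
constexpr int64_t kMaxBatchRows = 1024;               // micro batch length from the report
constexpr int64_t kMaxBatchBytes = 4096LL * 1024LL;   // assumed scratch buffer byte cap

// Number of tuples that fit in the scratch buffer under the assumed byte cap.
inline int64_t ScratchCapacity(int64_t tuple_size) {
  assert(tuple_size > 0);
  return std::min(kMaxBatchRows, kMaxBatchBytes / tuple_size);
}

// Number of values a column reader may materialize for one micro batch
// without writing past the end of the scratch buffer.
inline int64_t SafeReadCount(int64_t micro_batch_length, int64_t scratch_capacity) {
  return std::min(micro_batch_length, scratch_capacity);
}
```

Under these assumptions, ScratchCapacity(7049) is 595, so an unclamped read of 1024 tuples would touch 1024 * 7049 = 7218176 bytes of a buffer that holds at most 4194304, while SafeReadCount(1024, 595) limits the read to 595 tuples.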