[
https://issues.apache.org/jira/browse/IMPALA-10102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17204901#comment-17204901
]
Abhishek Rawat commented on IMPALA-10102:
-----------------------------------------
Do you always get the same stack for the crash? Also, what is the value of
'max_compressed_size'? The code is just trying to allocate memory for a
compressed page. If for some reason 'max_compressed_size' is huge (possibly
because of an invalid uncompressed page size in the page, caused by data
corruption or some other bug), the memory allocation will fail.
I think the check you are suggesting makes sense, but as a sanity check, could
you retry the repro and confirm that max_compressed_size is sane? I think you
could use TryAllocate instead and return an error if it returns NULL. It would
also be good to log the size of the failed allocation in the error message.
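To make the suggestion concrete, here is a minimal, self-contained sketch of
the "try to allocate, return an error with the size on failure" pattern. It is
not the actual Impala code: SketchPool, SketchStatus and
AllocateCompressedBuffer are hypothetical names invented for illustration, and
the only thing assumed about TryAllocate is that it returns NULL on failure.

```cpp
#include <cstddef>
#include <cstdint>
#include <memory>
#include <new>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a memory pool with a fallible allocation call.
// Only the "returns nullptr on failure" contract matters for this sketch.
class SketchPool {
 public:
  explicit SketchPool(int64_t limit_bytes) : limit_bytes_(limit_bytes) {}

  // Returns nullptr when the request is invalid, exceeds the remaining
  // budget, or the underlying allocation fails.
  uint8_t* TryAllocate(int64_t size) {
    if (size <= 0 || size > limit_bytes_ - used_bytes_) return nullptr;
    std::unique_ptr<uint8_t[]> block(
        new (std::nothrow) uint8_t[static_cast<std::size_t>(size)]);
    if (!block) return nullptr;
    used_bytes_ += size;
    blocks_.push_back(std::move(block));
    return blocks_.back().get();
  }

 private:
  int64_t limit_bytes_;
  int64_t used_bytes_ = 0;
  std::vector<std::unique_ptr<uint8_t[]>> blocks_;
};

// Hypothetical minimal status type: ok flag plus an error message.
struct SketchStatus {
  bool ok;
  std::string msg;
};

// Allocates the output buffer for a compressed page. On failure the caller
// gets an error that includes the requested size, and *out stays nullptr so
// nothing downstream can write through a null pointer.
SketchStatus AllocateCompressedBuffer(SketchPool* pool,
                                      int64_t max_compressed_size,
                                      uint8_t** out) {
  *out = nullptr;
  uint8_t* buf = pool->TryAllocate(max_compressed_size);
  if (buf == nullptr) {
    return {false, "Failed to allocate " + std::to_string(max_compressed_size) +
                       " bytes for the compressed page buffer"};
  }
  *out = buf;
  return {true, ""};
}
```

With the size in the message, a failed repro would show directly whether
max_compressed_size was unreasonably large.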
> Impalad crashes when writing a parquet file with large rows
> -----------------------------------------------------------
>
> Key: IMPALA-10102
> URL: https://issues.apache.org/jira/browse/IMPALA-10102
> Project: IMPALA
> Issue Type: Bug
> Reporter: Quanlong Huang
> Assignee: Yida Wu
> Priority: Critical
> Labels: crash
>
> Encountered a crash when testing the following queries on my local branch:
> {code:sql}
> create table bigstrs3 stored as parquet as
> select *, repeat(uuid(), cast(random() * 200000 as int)) as bigstr
> from functional.alltypes
> limit 1000;
> -- Length of uuid() is 36, so the max row size is about 36 * 200,000 = 7,200,000.
> set MAX_ROW_SIZE=8m;
> create table my_str_group stored as parquet as
> select group_concat(string_col) as ss, bigstr
> from bigstrs3 group by bigstr;
> create table my_cnt stored as parquet as
> select count(*) as cnt, bigstr
> from bigstrs3 group by bigstr;
> {code}
> The crash stacktrace:
> {code}
> Crash reason: SIGSEGV
> Crash address: 0x0
> Process uptime: not available
> Thread 336 (crashed)
> 0 libc-2.23.so + 0x14e10b
> 1 impalad!snappy::UncheckedByteArraySink::Append(char const*, unsigned
> long) [clone .localalias.0] + 0x1a
> 2 impalad!snappy::Compress(snappy::Source*, snappy::Sink*) + 0xb1
> 3 impalad!snappy::RawCompress(char const*, unsigned long, char*, unsigned
> long*) + 0x51
> 4 impalad!impala::SnappyCompressor::ProcessBlock(bool, long, unsigned char
> const*, long*, unsigned char**) [compress.cc : 295 + 0x24]
> 5 impalad!impala::Codec::ProcessBlock32(bool, int, unsigned char const*,
> int*, unsigned char**) [codec.cc : 211 + 0x41]
> 6 impalad!impala::HdfsParquetTableWriter::BaseColumnWriter::Flush(long*,
> long*, long*) [hdfs-parquet-table-writer.cc : 775 + 0x56]
> 7 impalad!impala::HdfsParquetTableWriter::FlushCurrentRowGroup()
> [hdfs-parquet-table-writer.cc : 1330 + 0x60]
> 8 impalad!impala::HdfsParquetTableWriter::Finalize()
> [hdfs-parquet-table-writer.cc : 1297 + 0x19]
> 9
> impalad!impala::HdfsTableSink::FinalizePartitionFile(impala::RuntimeState*,
> impala::OutputPartition*) [hdfs-table-sink.cc : 652 + 0x2e]
> 10
> impalad!impala::HdfsTableSink::WriteRowsToPartition(impala::RuntimeState*,
> impala::RowBatch*, std::pair<std::unique_ptr<impala::OutputPartition,
> std::default_delete<impala::OutputPartition> >, std::vector<int,
> std::allocator<int> > >*) [hdfs-table-sink.cc : 282 + 0x21]
> 11 impalad!impala::HdfsTableSink::Send(impala::RuntimeState*,
> impala::RowBatch*) [hdfs-table-sink.cc : 621 + 0x2e]
> 12 impalad!impala::FragmentInstanceState::ExecInternal()
> [fragment-instance-state.cc : 422 + 0x58]
> 13 impalad!impala::FragmentInstanceState::Exec() [fragment-instance-state.cc
> : 106 + 0x16]
> 14 impalad!impala::QueryState::ExecFInstance(impala::FragmentInstanceState*)
> [query-state.cc : 836 + 0x19]
> 15 impalad!impala::QueryState::StartFInstances()::{lambda()#1}::operator()()
> const + 0x26
> 16
> impalad!boost::detail::function::void_function_obj_invoker0<impala::QueryState::StartFInstances()::<lambda()>,
> void>::invoke [function_template.hpp : 159 + 0xc]
> 17 impalad!boost::function0<void>::operator()() const [function_template.hpp
> : 770 + 0x1d]
> 18 impalad!impala::Thread::SuperviseThread(std::__cxx11::basic_string<char,
> std::char_traits<char>, std::allocator<char> > const&,
> std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>
> > const&, boost::function<void ()>, impala::ThreadDebugInfo const*,
> impala::Promise<long, (impala::PromiseMode)0>*) [thread.cc : 360 + 0xf]
> 19 impalad!void
> boost::_bi::list5<boost::_bi::value<std::__cxx11::basic_string<char,
> std::char_traits<char>, std::allocator<char> > >,
> boost::_bi::value<std::__cxx11::basic_string<char, std::char_traits<char>,
> std::allocator<char> > >, boost::_bi::value<boost::function<void ()> >,
> boost::_bi::value<impala::ThreadDebugInfo*>,
> boost::_bi::value<impala::Promise<long, (impala::PromiseMode)0>*>
> >::operator()<void (*)(std::__cxx11::basic_string<char,
> std::char_traits<char>, std::allocator<char> > const&,
> std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>
> > const&, boost::function<void ()>, impala::ThreadDebugInfo const*,
> impala::Promise<long, (impala::PromiseMode)0>*),
> boost::_bi::list0>(boost::_bi::type<void>, void
> (*&)(std::__cxx11::basic_string<char, std::char_traits<char>,
> std::allocator<char> > const&, std::__cxx11::basic_string<char,
> std::char_traits<char>, std::allocator<char> > const&, boost::function<void
> ()>, impala::ThreadDebugInfo const*, impala::Promise<long,
> (impala::PromiseMode)0>*), boost::_bi::list0&, int) [bind.hpp : 531 + 0x15]
> 20 impalad!boost::_bi::bind_t<void, void
> (*)(std::__cxx11::basic_string<char, std::char_traits<char>,
> std::allocator<char> > const&, std::__cxx11::basic_string<char,
> std::char_traits<char>, std::allocator<char> > const&, boost::function<void
> ()>, impala::ThreadDebugInfo const*, impala::Promise<long,
> (impala::PromiseMode)0>*),
> boost::_bi::list5<boost::_bi::value<std::__cxx11::basic_string<char,
> std::char_traits<char>, std::allocator<char> > >,
> boost::_bi::value<std::__cxx11::basic_string<char, std::char_traits<char>,
> std::allocator<char> > >, boost::_bi::value<boost::function<void ()> >,
> boost::_bi::value<impala::ThreadDebugInfo*>,
> boost::_bi::value<impala::Promise<long, (impala::PromiseMode)0>*> >
> >::operator()() [bind.hpp : 1222 + 0x22]
> 21 impalad!boost::detail::thread_data<boost::_bi::bind_t<void, void
> (*)(std::__cxx11::basic_string<char, std::char_traits<char>,
> std::allocator<char> > const&, std::__cxx11::basic_string<char,
> std::char_traits<char>, std::allocator<char> > const&, boost::function<void
> ()>, impala::ThreadDebugInfo const*, impala::Promise<long,
> (impala::PromiseMode)0>*),
> boost::_bi::list5<boost::_bi::value<std::__cxx11::basic_string<char,
> std::char_traits<char>, std::allocator<char> > >,
> boost::_bi::value<std::__cxx11::basic_string<char, std::char_traits<char>,
> std::allocator<char> > >, boost::_bi::value<boost::function<void ()> >,
> boost::_bi::value<impala::ThreadDebugInfo*>,
> boost::_bi::value<impala::Promise<long, (impala::PromiseMode)0>*> > >
> >::run() [thread.hpp : 116 + 0x12]
> 22 impalad!thread_proxy + 0x72
> 23 libpthread-2.23.so + 0x76ba
> 24 libc-2.23.so + 0x1074dd
> {code}