[
https://issues.apache.org/jira/browse/IMPALA-10102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17223031#comment-17223031
]
ASF subversion and git services commented on IMPALA-10102:
----------------------------------------------------------
Commit d62a04078d1e8455716720374dd07dfab1d7dfd1 in impala's branch
refs/heads/master from Yida Wu
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=d62a040 ]
IMPALA-10102 Fix Impalad crashes when writing a parquet file with
large rows.
The crash happens when dereferencing a null pointer returned by
a failed allocation from the memory pool. TryAllocate is now
used instead of Allocate, and a null check is added, for large
allocations such as the buffers for the dictionary page and the
compressed dictionary page. These large allocations are the ones
most likely to fail when memory is scarce.
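As a rough illustration of the pattern described above (try to
allocate, check for null, return an error instead of crashing),
here is a minimal C++ sketch. SketchPool, SketchStatus and
AllocateDictPageBuffer are hypothetical names made up for this
example; they are not Impala's MemPool or Status classes, and the
actual change in the commit may look quite different.

```cpp
#include <cstddef>
#include <cstdint>
#include <memory>
#include <new>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a memory-limited pool (NOT Impala's MemPool).
class SketchPool {
 public:
  explicit SketchPool(std::size_t limit_bytes) : limit_(limit_bytes) {}

  // Returns nullptr when the request would exceed the limit or the
  // underlying allocation fails, so the caller must check the result.
  std::uint8_t* TryAllocate(std::size_t size) {
    if (size > limit_ - used_) return nullptr;
    std::unique_ptr<std::uint8_t[]> chunk(new (std::nothrow) std::uint8_t[size]);
    if (!chunk) return nullptr;
    std::uint8_t* raw = chunk.get();
    chunks_.push_back(std::move(chunk));
    used_ += size;
    return raw;
  }

 private:
  std::size_t limit_;
  std::size_t used_ = 0;
  std::vector<std::unique_ptr<std::uint8_t[]>> chunks_;
};

// Hypothetical minimal status type (NOT Impala's Status).
struct SketchStatus {
  bool ok;
  std::string msg;
};

// Allocates a buffer for a large page and reports failure to the caller
// instead of handing a null pointer to later code (e.g. a compressor).
SketchStatus AllocateDictPageBuffer(SketchPool* pool, std::size_t size,
                                    std::uint8_t** out) {
  std::uint8_t* buf = pool->TryAllocate(size);
  if (buf == nullptr) {
    *out = nullptr;
    return {false, "failed to allocate " + std::to_string(size) +
                       " bytes for dictionary page"};
  }
  *out = buf;
  return {true, ""};
}
```

In this sketch a caller would check the returned status and fail
the write with an error when ok is false, rather than passing the
buffer on to compression.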
This change fixes the crash in this particular code path. In
practice, however, an OOM can still occur and lead to the
process being killed by the OS. The change doesn't fix that OOM
issue; users need to configure the mem_limit start-up option
properly to avoid the OOM crash.
Test:
Ran a script that repeats the test mentioned in the Jira thirty
times; no crash occurred.
Change-Id: I0dee474cceb0c370278d290eb900c05769b23dec
Reviewed-on: http://gerrit.cloudera.org:8080/16638
Reviewed-by: Impala Public Jenkins <[email protected]>
Tested-by: Impala Public Jenkins <[email protected]>
> Impalad crashes when writing a parquet file with large rows
> -----------------------------------------------------------
>
> Key: IMPALA-10102
> URL: https://issues.apache.org/jira/browse/IMPALA-10102
> Project: IMPALA
> Issue Type: Bug
> Reporter: Quanlong Huang
> Assignee: Yida Wu
> Priority: Critical
> Labels: crash
>
> Encountered a crash when testing the following queries on my local branch:
> {code:sql}
> create table bigstrs3 stored as parquet as
> select *, repeat(uuid(), cast(random() * 200000 as int)) as bigstr
> from functional.alltypes
> limit 1000;
> -- Length of uuid() is 36, so the max row size is 36 * 200,000 = 7,200,000.
> set MAX_ROW_SIZE=8m;
> create table my_str_group stored as parquet as
> select group_concat(string_col) as ss, bigstr
> from bigstrs3 group by bigstr;
> create table my_cnt stored as parquet as
> select count(*) as cnt, bigstr
> from bigstrs3 group by bigstr;
> {code}
> The crash stacktrace:
> {code}
> Crash reason: SIGSEGV
> Crash address: 0x0
> Process uptime: not available
> Thread 336 (crashed)
> 0 libc-2.23.so + 0x14e10b
> 1 impalad!snappy::UncheckedByteArraySink::Append(char const*, unsigned
> long) [clone .localalias.0] + 0x1a
> 2 impalad!snappy::Compress(snappy::Source*, snappy::Sink*) + 0xb1
> 3 impalad!snappy::RawCompress(char const*, unsigned long, char*, unsigned
> long*) + 0x51
> 4 impalad!impala::SnappyCompressor::ProcessBlock(bool, long, unsigned char
> const*, long*, unsigned char**) [compress.cc : 295 + 0x24]
> 5 impalad!impala::Codec::ProcessBlock32(bool, int, unsigned char const*,
> int*, unsigned char**) [codec.cc : 211 + 0x41]
> 6 impalad!impala::HdfsParquetTableWriter::BaseColumnWriter::Flush(long*,
> long*, long*) [hdfs-parquet-table-writer.cc : 775 + 0x56]
> 7 impalad!impala::HdfsParquetTableWriter::FlushCurrentRowGroup()
> [hdfs-parquet-table-writer.cc : 1330 + 0x60]
> 8 impalad!impala::HdfsParquetTableWriter::Finalize()
> [hdfs-parquet-table-writer.cc : 1297 + 0x19]
> 9
> impalad!impala::HdfsTableSink::FinalizePartitionFile(impala::RuntimeState*,
> impala::OutputPartition*) [hdfs-table-sink.cc : 652 + 0x2e]
> 10
> impalad!impala::HdfsTableSink::WriteRowsToPartition(impala::RuntimeState*,
> impala::RowBatch*, std::pair<std::unique_ptr<impala::OutputPartition,
> std::default_delete<impala::OutputPartition> >, std::vector<int,
> std::allocator<int> > >*) [hdfs-table-sink.cc : 282 + 0x21]
> 11 impalad!impala::HdfsTableSink::Send(impala::RuntimeState*,
> impala::RowBatch*) [hdfs-table-sink.cc : 621 + 0x2e]
> 12 impalad!impala::FragmentInstanceState::ExecInternal()
> [fragment-instance-state.cc : 422 + 0x58]
> 13 impalad!impala::FragmentInstanceState::Exec() [fragment-instance-state.cc
> : 106 + 0x16]
> 14 impalad!impala::QueryState::ExecFInstance(impala::FragmentInstanceState*)
> [query-state.cc : 836 + 0x19]
> 15 impalad!impala::QueryState::StartFInstances()::{lambda()#1}::operator()()
> const + 0x26
> 16
> impalad!boost::detail::function::void_function_obj_invoker0<impala::QueryState::StartFInstances()::<lambda()>,
> void>::invoke [function_template.hpp : 159 + 0xc]
> 17 impalad!boost::function0<void>::operator()() const [function_template.hpp
> : 770 + 0x1d]
> 18 impalad!impala::Thread::SuperviseThread(std::__cxx11::basic_string<char,
> std::char_traits<char>, std::allocator<char> > const&,
> std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>
> > const&, boost::function<void ()>, impala::ThreadDebugInfo const*,
> impala::Promise<long, (impala::PromiseMode)0>*) [thread.cc : 360 + 0xf]
> 19 impalad!void
> boost::_bi::list5<boost::_bi::value<std::__cxx11::basic_string<char,
> std::char_traits<char>, std::allocator<char> > >,
> boost::_bi::value<std::__cxx11::basic_string<char, std::char_traits<char>,
> std::allocator<char> > >, boost::_bi::value<boost::function<void ()> >,
> boost::_bi::value<impala::ThreadDebugInfo*>,
> boost::_bi::value<impala::Promise<long, (impala::PromiseMode)0>*>
> >::operator()<void (*)(std::__cxx11::basic_string<char,
> std::char_traits<char>, std::allocator<char> > const&,
> std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>
> > const&, boost::function<void ()>, impala::ThreadDebugInfo const*,
> impala::Promise<long, (impala::PromiseMode)0>*),
> boost::_bi::list0>(boost::_bi::type<void>, void
> (*&)(std::__cxx11::basic_string<char, std::char_traits<char>,
> std::allocator<char> > const&, std::__cxx11::basic_string<char,
> std::char_traits<char>, std::allocator<char> > const&, boost::function<void
> ()>, impala::ThreadDebugInfo const*, impala::Promise<long,
> (impala::PromiseMode)0>*), boost::_bi::list0&, int) [bind.hpp : 531 + 0x15]
> 20 impalad!boost::_bi::bind_t<void, void
> (*)(std::__cxx11::basic_string<char, std::char_traits<char>,
> std::allocator<char> > const&, std::__cxx11::basic_string<char,
> std::char_traits<char>, std::allocator<char> > const&, boost::function<void
> ()>, impala::ThreadDebugInfo const*, impala::Promise<long,
> (impala::PromiseMode)0>*),
> boost::_bi::list5<boost::_bi::value<std::__cxx11::basic_string<char,
> std::char_traits<char>, std::allocator<char> > >,
> boost::_bi::value<std::__cxx11::basic_string<char, std::char_traits<char>,
> std::allocator<char> > >, boost::_bi::value<boost::function<void ()> >,
> boost::_bi::value<impala::ThreadDebugInfo*>,
> boost::_bi::value<impala::Promise<long, (impala::PromiseMode)0>*> >
> >::operator()() [bind.hpp : 1222 + 0x22]
> 21 impalad!boost::detail::thread_data<boost::_bi::bind_t<void, void
> (*)(std::__cxx11::basic_string<char, std::char_traits<char>,
> std::allocator<char> > const&, std::__cxx11::basic_string<char,
> std::char_traits<char>, std::allocator<char> > const&, boost::function<void
> ()>, impala::ThreadDebugInfo const*, impala::Promise<long,
> (impala::PromiseMode)0>*),
> boost::_bi::list5<boost::_bi::value<std::__cxx11::basic_string<char,
> std::char_traits<char>, std::allocator<char> > >,
> boost::_bi::value<std::__cxx11::basic_string<char, std::char_traits<char>,
> std::allocator<char> > >, boost::_bi::value<boost::function<void ()> >,
> boost::_bi::value<impala::ThreadDebugInfo*>,
> boost::_bi::value<impala::Promise<long, (impala::PromiseMode)0>*> > >
> >::run() [thread.hpp : 116 + 0x12]
> 22 impalad!thread_proxy + 0x72
> 23 libpthread-2.23.so + 0x76ba
> 24 libc-2.23.so + 0x1074dd
> {code}