[
https://issues.apache.org/jira/browse/IMPALA-10102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17203998#comment-17203998
]
Yida Wu edited comment on IMPALA-10102 at 10/1/20, 12:13 AM:
-------------------------------------------------------------
The crash happens occasionally with the following settings.
1. modify test script in
testdata/workloads/functional-query/queries/QueryTest/spilling-large-rows.test
{quote}set mem_limit="{color:#ff0000}8gb{color}";
create table bigstrs3 stored as parquet as
select *, repeat(uuid(), cast(random() * 200000 as int)) as bigstr
from functional.alltypes limit {color:#ff0000}1000{color};
{quote}
2. start impala-cluster:
$IMPALA_HOME/bin/start-impala-cluster.py
--impalad_args="--mt_dop_auto_fallback=true"
3. run test:
impala-py.test tests/query_test/test_spilling.py -k large
The reason for the crash:
hdfs-parquet-table-writer.cc:
{quote}uint8_t* {color:#ff0000}compressed_data{color} =
parent_->per_file_mem_pool_->Allocate(max_compressed_size);
{quote}
The allocation of compressed_data fails because there is not enough space left
in the memory pool. However, the code does not check the result, so the null
pointer is dereferenced and the process crashes. The simple way to avoid the
crash is to verify the returned pointer after Allocate is called and return an
error status if it is null.
There is also another issue I hit while testing this case: one of the impalad
processes was killed by the Linux OOM killer. I assume this happens because my
local box lacks the memory for the high memory limit (8gb) set for each
process; it looks like a configuration issue more than a bug in the system. If
the memory limit is set extremely high compared to the machine's memory
capacity, an OOM kill is inevitable once all the memory is used up. So I am
wondering whether adding a verification on the Allocate result is enough for
this Jira, if it solves the crash. [~arawat] [~stigahuang]
> Impalad crashes when writing a parquet file with large rows
> -------------------------------------------------------------
>
> Key: IMPALA-10102
> URL: https://issues.apache.org/jira/browse/IMPALA-10102
> Project: IMPALA
> Issue Type: Bug
> Reporter: Quanlong Huang
> Assignee: Yida Wu
> Priority: Critical
> Labels: crash
>
> Encountered a crash when testing following queries on my local branch:
> {code:sql}
> create table bigstrs3 stored as parquet as
> select *, repeat(uuid(), cast(random() * 200000 as int)) as bigstr
> from functional.alltypes
> limit 1000;
> # Length of uuid() is 36. So the max row size is 7,200,000.
> set MAX_ROW_SIZE=8m;
> create table my_str_group stored as parquet as
> select group_concat(string_col) as ss, bigstr
> from bigstrs3 group by bigstr;
> create table my_cnt stored as parquet as
> select count(*) as cnt, bigstr
> from bigstrs3 group by bigstr;
> {code}
> The crash stacktrace:
> {code}
> Crash reason: SIGSEGV
> Crash address: 0x0
> Process uptime: not available
> Thread 336 (crashed)
> 0 libc-2.23.so + 0x14e10b
> 1 impalad!snappy::UncheckedByteArraySink::Append(char const*, unsigned
> long) [clone .localalias.0] + 0x1a
> 2 impalad!snappy::Compress(snappy::Source*, snappy::Sink*) + 0xb1
> 3 impalad!snappy::RawCompress(char const*, unsigned long, char*, unsigned
> long*) + 0x51
> 4 impalad!impala::SnappyCompressor::ProcessBlock(bool, long, unsigned char
> const*, long*, unsigned char**) [compress.cc : 295 + 0x24]
> 5 impalad!impala::Codec::ProcessBlock32(bool, int, unsigned char const*,
> int*, unsigned char**) [codec.cc : 211 + 0x41]
> 6 impalad!impala::HdfsParquetTableWriter::BaseColumnWriter::Flush(long*,
> long*, long*) [hdfs-parquet-table-writer.cc : 775 + 0x56]
> 7 impalad!impala::HdfsParquetTableWriter::FlushCurrentRowGroup()
> [hdfs-parquet-table-writer.cc : 1330 + 0x60]
> 8 impalad!impala::HdfsParquetTableWriter::Finalize()
> [hdfs-parquet-table-writer.cc : 1297 + 0x19]
> 9
> impalad!impala::HdfsTableSink::FinalizePartitionFile(impala::RuntimeState*,
> impala::OutputPartition*) [hdfs-table-sink.cc : 652 + 0x2e]
> 10
> impalad!impala::HdfsTableSink::WriteRowsToPartition(impala::RuntimeState*,
> impala::RowBatch*, std::pair<std::unique_ptr<impala::OutputPartition,
> std::default_delete<impala::OutputPartition> >, std::vector<int,
> std::allocator<int> > >*) [hdfs-table-sink.cc : 282 + 0x21]
> 11 impalad!impala::HdfsTableSink::Send(impala::RuntimeState*,
> impala::RowBatch*) [hdfs-table-sink.cc : 621 + 0x2e]
> 12 impalad!impala::FragmentInstanceState::ExecInternal()
> [fragment-instance-state.cc : 422 + 0x58]
> 13 impalad!impala::FragmentInstanceState::Exec() [fragment-instance-state.cc
> : 106 + 0x16]
> 14 impalad!impala::QueryState::ExecFInstance(impala::FragmentInstanceState*)
> [query-state.cc : 836 + 0x19]
> 15 impalad!impala::QueryState::StartFInstances()::{lambda()#1}::operator()()
> const + 0x26
> 16
> impalad!boost::detail::function::void_function_obj_invoker0<impala::QueryState::StartFInstances()::<lambda()>,
> void>::invoke [function_template.hpp : 159 + 0xc]
> 17 impalad!boost::function0<void>::operator()() const [function_template.hpp
> : 770 + 0x1d]
> 18 impalad!impala::Thread::SuperviseThread(std::__cxx11::basic_string<char,
> std::char_traits<char>, std::allocator<char> > const&,
> std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>
> > const&, boost::function<void ()>, impala::ThreadDebugInfo const*,
> impala::Promise<long, (impala::PromiseMode)0>*) [thread.cc : 360 + 0xf]
> 19 impalad!void
> boost::_bi::list5<boost::_bi::value<std::__cxx11::basic_string<char,
> std::char_traits<char>, std::allocator<char> > >,
> boost::_bi::value<std::__cxx11::basic_string<char, std::char_traits<char>,
> std::allocator<char> > >, boost::_bi::value<boost::function<void ()> >,
> boost::_bi::value<impala::ThreadDebugInfo*>,
> boost::_bi::value<impala::Promise<long, (impala::PromiseMode)0>*>
> >::operator()<void (*)(std::__cxx11::basic_string<char,
> std::char_traits<char>, std::allocator<char> > const&,
> std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>
> > const&, boost::function<void ()>, impala::ThreadDebugInfo const*,
> impala::Promise<long, (impala::PromiseMode)0>*),
> boost::_bi::list0>(boost::_bi::type<void>, void
> (*&)(std::__cxx11::basic_string<char, std::char_traits<char>,
> std::allocator<char> > const&, std::__cxx11::basic_string<char,
> std::char_traits<char>, std::allocator<char> > const&, boost::function<void
> ()>, impala::ThreadDebugInfo const*, impala::Promise<long,
> (impala::PromiseMode)0>*), boost::_bi::list0&, int) [bind.hpp : 531 + 0x15]
> 20 impalad!boost::_bi::bind_t<void, void
> (*)(std::__cxx11::basic_string<char, std::char_traits<char>,
> std::allocator<char> > const&, std::__cxx11::basic_string<char,
> std::char_traits<char>, std::allocator<char> > const&, boost::function<void
> ()>, impala::ThreadDebugInfo const*, impala::Promise<long,
> (impala::PromiseMode)0>*),
> boost::_bi::list5<boost::_bi::value<std::__cxx11::basic_string<char,
> std::char_traits<char>, std::allocator<char> > >,
> boost::_bi::value<std::__cxx11::basic_string<char, std::char_traits<char>,
> std::allocator<char> > >, boost::_bi::value<boost::function<void ()> >,
> boost::_bi::value<impala::ThreadDebugInfo*>,
> boost::_bi::value<impala::Promise<long, (impala::PromiseMode)0>*> >
> >::operator()() [bind.hpp : 1222 + 0x22]
> 21 impalad!boost::detail::thread_data<boost::_bi::bind_t<void, void
> (*)(std::__cxx11::basic_string<char, std::char_traits<char>,
> std::allocator<char> > const&, std::__cxx11::basic_string<char,
> std::char_traits<char>, std::allocator<char> > const&, boost::function<void
> ()>, impala::ThreadDebugInfo const*, impala::Promise<long,
> (impala::PromiseMode)0>*),
> boost::_bi::list5<boost::_bi::value<std::__cxx11::basic_string<char,
> std::char_traits<char>, std::allocator<char> > >,
> boost::_bi::value<std::__cxx11::basic_string<char, std::char_traits<char>,
> std::allocator<char> > >, boost::_bi::value<boost::function<void ()> >,
> boost::_bi::value<impala::ThreadDebugInfo*>,
> boost::_bi::value<impala::Promise<long, (impala::PromiseMode)0>*> > >
> >::run() [thread.hpp : 116 + 0x12]
> 22 impalad!thread_proxy + 0x72
> 23 libpthread-2.23.so + 0x76ba
> 24 libc-2.23.so + 0x1074dd
> {code}
--
This message was sent by Atlassian Jira
(v8.3.4#803005)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]