[ https://issues.apache.org/jira/browse/IMPALA-8257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16780991#comment-16780991 ]
ASF subversion and git services commented on IMPALA-8257:
---------------------------------------------------------
Commit 1e6c6724bc2c62e062244df62be9bd950d9d684c in impala's branch
refs/heads/master from Zoltan Borok-Nagy
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=1e6c672 ]
IMPALA-8257: Parquet writer sometimes hits DCHECK when handling empty string
The DCHECK in ColumnStats<StringValue>::Merge() was too strict. It
verifies that we copy the StringValues out of RowBatch memory into
their own buffer; otherwise their values can be overwritten by
subsequent row batches.
The internal pointer of an empty StringValue is NULL, so there is no
need to copy it to another buffer. The DCHECKs are therefore
unnecessary for empty strings and, worse, they can cause crashes.
Now we only evaluate the DCHECKs when the corresponding StringValues
are not empty strings.
Testing:
I added an e2e test that inserts a lot of empty strings into a table.
Change-Id: I934b53c17720e41231e4d614fbc70f1937e19289
Reviewed-on: http://gerrit.cloudera.org:8080/12636
Reviewed-by: Impala Public Jenkins <[email protected]>
Tested-by: Impala Public Jenkins <[email protected]>
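
The guard described in the commit message can be illustrated with a minimal, standalone sketch. This is not the code from the commit: StringRef, UnconditionalCheck and GuardedCheck are hypothetical names introduced here, StringRef only mimics the pointer-plus-length shape of a StringValue, and skipping the comparison when either value is empty is an assumption about how "corresponding StringValues are not empty" might be applied.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for a StringValue: a non-owning pointer plus a length.
// An empty string is represented with a null pointer and length 0.
struct StringRef {
  const char* ptr;
  int len;
};

// The old-style invariant: the two values must never share a pointer.
// Two empty values both carry a null pointer, so this reports a violation
// (the "0 vs. 0" failure in the issue) even though nothing needs copying.
inline bool UnconditionalCheck(const StringRef& prev_page_value,
                               const StringRef& merged_value) {
  return static_cast<const void*>(prev_page_value.ptr) !=
         static_cast<const void*>(merged_value.ptr);
}

// The guarded invariant: only compare pointers when both values are
// non-empty (assumption: either side being empty exempts the pair).
inline bool GuardedCheck(const StringRef& prev_page_value,
                         const StringRef& merged_value) {
  if (prev_page_value.len == 0 || merged_value.len == 0) return true;
  return UnconditionalCheck(prev_page_value, merged_value);
}
```

With two empty values, UnconditionalCheck returns false while GuardedCheck returns true. For non-empty values the guarded form still flags a pair that aliases the same buffer, which is the case the original DCHECK was meant to catch.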
> Parquet writer sometimes hits DCHECK when handling empty string
> ---------------------------------------------------------------
>
> Key: IMPALA-8257
> URL: https://issues.apache.org/jira/browse/IMPALA-8257
> Project: IMPALA
> Issue Type: Bug
> Components: Backend
> Affects Versions: Impala 2.13.0, Impala 3.1.0, Impala 3.2.0
> Reporter: Tim Armstrong
> Assignee: Zoltán Borók-Nagy
> Priority: Blocker
> Labels: crash, parquet
>
> Encountered while doing a large insert into Parquet.
> {code}
> create table customer like tpcds_300_text.customer stored as parquetfile;
> insert overwrite table customer select * from tpcds_300_text.customer;
> {code}
> {noformat}
> F0227 01:34:53.052708 131295 parquet-column-stats.inline.h:213]
> 794c051ae3f3913c:71f00bc400000001] Check failed:
> static_cast<void*>(prev_page_min_value_.ptr) !=
> static_cast<void*>(cs->min_value_.ptr) (0 vs. 0)
> *** Check failure stack trace: ***
> @ 0x47ec7ec google::LogMessage::Fail()
> @ 0x47ee091 google::LogMessage::SendToLog()
> @ 0x47ec1c6 google::LogMessage::Flush()
> @ 0x47ef78d google::LogMessageFatal::~LogMessageFatal()
> @ 0x27e973c impala::ColumnStats<>::Merge()
> @ 0x27e3c74 impala::HdfsParquetTableWriter::BaseColumnWriter::FinalizeCurrentPage()
> @ 0x27ee65f impala::HdfsParquetTableWriter::BaseColumnWriter::AppendRow()
> @ 0x27e653b impala::HdfsParquetTableWriter::AppendRows()
> @ 0x23177fc impala::HdfsTableSink::WriteRowsToPartition()
> @ 0x231aeeb impala::HdfsTableSink::Send()
> @ 0x1f53888 impala::FragmentInstanceState::ExecInternal()
> @ 0x1f4fefa impala::FragmentInstanceState::Exec()
> @ 0x1f63333 impala::QueryState::ExecFInstance()
> @ 0x1f61615 _ZZN6impala10QueryState15StartFInstancesEvENKUlvE_clEv
> @ 0x1f64774 _ZN5boost6detail8function26void_function_obj_invoker0IZN6impala10QueryState15StartFInstancesEvEUlvE_vE6invokeERNS1_15function_bufferE
> @ 0x1d76b9f boost::function0<>::operator()()
> @ 0x22245ee impala::Thread::SuperviseThread()
> @ 0x222c972 boost::_bi::list5<>::operator()<>()
> @ 0x222c896 boost::_bi::bind_t<>::operator()()
> @ 0x222c859 boost::detail::thread_data<>::run()
> @ 0x3716329 thread_proxy
> @ 0x7fba207e8dd4 start_thread
> @ 0x7fba20511eac __clone
> {noformat}
> This actually happened on multiple machines at almost exactly the same time:
> {noformat}
> Running on machine: vc1328.halxg.cloudera.com
> Log line format: [IWEF]mmdd hh:mm:ss.uuuuuu threadid file:line] msg
> F0227 01:34:53.025667 133025 parquet-column-stats.inline.h:213]
> 794c051ae3f3913c:71f00bc400000005] Check failed:
> static_cast<void*>(prev_page_min_value_.ptr) !=
> static_cast<void*>(cs->min_value_.ptr) (0 vs. 0)
> ...
> F0227 01:34:53.025352 131082 parquet-column-stats.inline.h:213]
> 794c051ae3f3913c:71f00bc400000007] Check failed:
> static_cast<void*>(prev_page_min_value_.ptr) !=
> static_cast<void*>(cs->min_value_.ptr) (0 vs. 0)
> {noformat}
> The coordinator log shows that the query failed less than a second after execution started:
> {noformat}
> I0227 01:34:48.157472 147928 impala-server.cc:1063]
> 794c051ae3f3913c:71f00bc400000000] Registered query
> query_id=794c051ae3f3913c:71f00bc400000000
> session_id=3345ef7013ba6bb2:55d8d105d3690a8e
> I0227 01:34:48.157711 147928 Frontend.java:1251]
> 794c051ae3f3913c:71f00bc400000000] Analyzing query: insert overwrite table
> customer select * from tpcds_300_text.customer db: tpcds_300_decimal_parquet
> I0227 01:34:48.158025 147928 FeSupport.java:285]
> 794c051ae3f3913c:71f00bc400000000] Requesting prioritized load of table(s):
> tpcds_300_decimal_parquet.customer
> I0227 01:34:52.049566 147928 Frontend.java:1292]
> 794c051ae3f3913c:71f00bc400000000] Analysis finished.
> I0227 01:34:52.067458 147991 admission-controller.cc:627]
> 794c051ae3f3913c:71f00bc400000000] Schedule for
> id=794c051ae3f3913c:71f00bc400000000 in pool_name=root.systest
> per_host_mem_estimate=1.62 GB PoolConfig: max_requests=-1 max_queued=200
> max_mem=-1.00 B
> I0227 01:34:52.067562 147991 admission-controller.cc:632]
> 794c051ae3f3913c:71f00bc400000000] Stats: agg_num_running=0,
> agg_num_queued=0, agg_mem_reserved=0, local_host(local_mem_admitted=0,
> num_admitted_running=0, num_queued=0, backend_mem_reserved=0)
> I0227 01:34:52.067620 147991 admission-controller.cc:664]
> 794c051ae3f3913c:71f00bc400000000] Admitted query
> id=794c051ae3f3913c:71f00bc400000000
> I0227 01:34:52.067771 147991 coordinator.cc:93]
> 794c051ae3f3913c:71f00bc400000000] Exec()
> query_id=794c051ae3f3913c:71f00bc400000000 stmt=insert overwrite table
> customer select * from tpcds_300_text.customer
> I0227 01:34:52.068926 147991 coordinator.cc:359]
> 794c051ae3f3913c:71f00bc400000000] starting execution on 9 backends for
> query_id=794c051ae3f3913c:71f00bc400000000
> I0227 01:34:52.070919 47659 impala-internal-service.cc:50]
> 794c051ae3f3913c:71f00bc400000000] ExecQueryFInstances():
> query_id=794c051ae3f3913c:71f00bc400000000
> coord=vc1326.halxg.cloudera.com:22000 #instances=1
> I0227 01:34:52.071800 147994 query-state.cc:624]
> 794c051ae3f3913c:71f00bc400000003] Executing instance.
> instance_id=794c051ae3f3913c:71f00bc400000003 fragment_idx=0
> per_fragment_instance_idx=3 coord_state_idx=0 #in-flight=1
> I0227 01:34:52.072952 147991 coordinator.cc:373]
> 794c051ae3f3913c:71f00bc400000000] started execution on 9 backends for
> query_id=794c051ae3f3913c:71f00bc400000000
> I0227 01:34:52.074553 147992 coordinator.cc:611] Coordinator waiting for
> backends to finish, 9 remaining. query_id=794c051ae3f3913c:71f00bc400000000
> F0227 01:34:52.949759 147994 parquet-column-stats.inline.h:213]
> 794c051ae3f3913c:71f00bc400000003] Check failed:
> static_cast<void*>(prev_page_min_value_.ptr) !=
> static_cast<void*>(cs->min_value_.ptr) (0 vs. 0)
> {noformat}
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)