[
https://issues.apache.org/jira/browse/IMPALA-8479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16836804#comment-16836804
]
Vihang Karajgaonkar commented on IMPALA-8479:
---------------------------------------------
The snippet provided above reproduces the issue because of the way the metastore
provides transaction isolation. By default the metastore provides read-committed
transaction isolation (unless the user has overridden the
{{datanucleus.transactionIsolation}} configuration on the metastore to the
{{repeatable-read}} level, which is highly unlikely).
When Impala refreshes a table it requests partitions in two steps. First it
fetches all the partition names, and then it requests the partitions for the
fetched names. The two steps happen in separate transactions. This should be
fine, since we don't verify that the number of fetched names equals the number
of partitions received. The problem occurs when, for example, another
transaction is dropping some partitions of the same table at the same time.
Since fetching partitions in the directSQL path is done by issuing many select
statements, it is possible that the other transaction removes some of the
partitions in between two of the select statements fired by the
{{get_partitions_by_names}} API. In that case a partially filled partition can
be returned, which here triggers the error. For example, the following sequence
of events can cause this problem.
Session 1 (S1) issues the {{get_partitions_by_names}} API, which internally
issues many select statements, while session 2 (S2) drops a partition. The
interleaving looks something like this:
S1: {{get_partitions_by_names}}
S1: --> open_transaction
S1: --> select all partition ids for the given table and names
S2: {{drop_partition_by_name}}
S2: --> open_transaction
S1: --> select on params table for fetching parameters etc
S2: --> drops a partition by name
S2: --> commit
S1: --> select on partition values table for fetching partition_values. This
will not return values for the partitions which were dropped by S2. All such
partitions will be incomplete and cause this problem on the Impala side.
S1: --> commit
S1: the call above returns some partially filled partitions.
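The interleaving above can be mimicked with a small in-memory model. This is only a sketch: the two dictionaries stand in for the metastore backing tables, and every name in it (the dictionaries, the helper functions, the {{concurrent_drop}} parameter) is hypothetical rather than the real HMS schema or API.

```python
# Toy stand-ins for the metastore backing tables (hypothetical, not the HMS schema).
partitions = {1: "part=1", 2: "part=2", 3: "part=3"}   # part_id -> partition name
partition_values = {1: ["1"], 2: ["2"], 3: ["3"]}      # part_id -> partition values


def select_partition_ids(names):
    """S1, first select: resolve partition ids for the requested names."""
    return [pid for pid, name in partitions.items() if name in names]


def drop_partition(name):
    """S2: drop a partition and commit (rows vanish from both tables)."""
    for pid in [p for p, n in partitions.items() if n == name]:
        del partitions[pid]
        partition_values.pop(pid, None)


def select_partition_values(part_ids):
    """S1, later select: with read-committed it sees S2's committed delete."""
    return {pid: partition_values.get(pid, []) for pid in part_ids}


def get_partitions_by_names(names, concurrent_drop=None):
    """Model of the multi-select fetch, with an optional drop in the middle."""
    ids = select_partition_ids(names)
    if concurrent_drop is not None:
        drop_partition(concurrent_drop)  # S2 commits between S1's selects
    values = select_partition_values(ids)
    return [{"id": pid, "values": values[pid]} for pid in ids]


result = get_partitions_by_names(
    {"part=1", "part=2", "part=3"}, concurrent_drop="part=2"
)
incomplete = [p for p in result if not p["values"]]
print(incomplete)  # the dropped partition comes back with an empty value list
```

The partition dropped by S2 is still in the result because its id was read before the drop, but its value list is empty, which matches the {{'[]'}} in the exception message.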
In my opinion this is a problem on the metastore side. The solution is either to
fetch all the partition information in one select statement joining all the
relevant tables, or to handle such cases in metastore code after the results are
fetched (at least in the case of drops, make sure such partitions are not
returned). The second approach will still have the same problem when a
partition is altered from another session.
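A minimal sketch of the second approach, assuming fetched partitions are plain dictionaries with a {{values}} list. The function name and the data shape are hypothetical and are not taken from the metastore code. It only catches partitions that lost their values to a concurrent drop; a concurrent alter would pass the check unnoticed.

```python
def drop_incomplete_partitions(fetched, num_partition_keys):
    """Split fetched partitions into complete ones and ones to skip.

    A partition counts as complete when it carries exactly one value per
    partition key of the table.
    """
    complete, skipped = [], []
    for part in fetched:
        if len(part["values"]) == num_partition_keys:
            complete.append(part)
        else:
            skipped.append(part)
    return complete, skipped


# Example input: partition 2 was dropped between two selects of the fetch.
fetched = [
    {"id": 1, "values": ["1"]},
    {"id": 2, "values": []},
    {"id": 3, "values": ["3"]},
]
complete, skipped = drop_incomplete_partitions(fetched, num_partition_keys=1)
print([p["id"] for p in complete], [p["id"] for p in skipped])
```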
On the other hand, it could also be argued that Impala needs to play well with
other HMS clients (like Hive) by taking shared locks when reading table
metadata, so that other clients are blocked from dropping partitions while
Impala is reading them. Once we start supporting transactional tables in
Impala, this problem will hopefully be resolved on its own.
The issue should be rare in practice unless there are concurrent workloads
running in Hive and Impala at the same time on the same set of tables. In such
cases, it is recommended that the metastore use the {{repeatable-read}}
transaction isolation level to avoid this problem. This can be done by setting
the following configuration in the hive-site.xml of the metastore:
{code}
<property>
<name>datanucleus.transactionIsolation</name>
<value>repeatable-read</value>
</property>
{code}
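To check which isolation level a given hive-site.xml sets, something like the following could be used. It is a sketch only: the {{get_property}} helper and the inline {{SAMPLE}} document are made up for illustration, and a real file would be read from wherever the metastore configuration lives.

```python
import xml.etree.ElementTree as ET

# Inline sample standing in for the contents of a hive-site.xml file.
SAMPLE = """<configuration>
  <property>
    <name>datanucleus.transactionIsolation</name>
    <value>repeatable-read</value>
  </property>
</configuration>"""


def get_property(xml_text, name):
    """Return the value of the named property, or None if it is not set."""
    root = ET.fromstring(xml_text)
    for prop in root.iter("property"):
        if prop.findtext("name") == name:
            return prop.findtext("value")
    return None


isolation = get_property(SAMPLE, "datanucleus.transactionIsolation")
print(isolation)
```

A result of {{None}} means the property is not overridden, in which case the read-committed default described above applies.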
> REFRESH may fail if table metadata mutates during load
> ------------------------------------------------------
>
> Key: IMPALA-8479
> URL: https://issues.apache.org/jira/browse/IMPALA-8479
> Project: IMPALA
> Issue Type: Bug
> Components: Catalog
> Affects Versions: Impala 3.3.0
> Reporter: Vincent Tran
> Assignee: Vihang Karajgaonkar
> Priority: Major
>
> Reproduction:
> 1) Create a partitioned table
> {noformat}
> create table t1 (c1 int) partitioned by (part int);{noformat}
> 2) Generate a decent number of partitions
> {noformat}
> for i in {1..5000}; do impala-shell.sh -q "insert into t1 partition(part=$i)
> values ($i)"; done
> {noformat}
> 3) Start an INVALIDATE METADATA; REFRESH; loop
> {noformat}
> while :; do impala-shell.sh -q "invalidate metadata; refresh t1;";
> done{noformat}
> 4) Start dropping partitions in Hive.
> {noformat}
> for i in {1..5000}; do hive -e "alter table t1 drop partition (part=$i)";
> done{noformat}
> Eventually, when the "REFRESH" and "ALTER TABLE ... DROP ..." coincide in
> HMS, catalogd will encounter this TableLoadingException (as seen on trunk):
> {noformat}
> I0501 14:06:14.552676 38522 TableLoadingMgr.java:70] Loading metadata for
> table: vt.t1
> I0501 14:06:14.552776 38927 TableLoader.java:61] Loading metadata for: vt.t1
> I0501 14:06:14.552850 38522 TableLoadingMgr.java:72] Remaining items in
> queue: 0. Loads in progress: 1
> I0501 14:06:14.566756 38927 HdfsTable.java:941] Fetching partition metadata
> from the Metastore: vt.t1
> I0501 14:06:16.305446 38927 HdfsTable.java:945] Fetched partition metadata
> from the Metastore: vt.t1
> I0501 14:06:16.367607 38927 TableLoader.java:101] Loaded metadata for: vt.t1
> (1814ms)
> I0501 14:06:16.368847 38522 jni-util.cc:256]
> org.apache.impala.catalog.TableLoadingException: Failed to load metadata for
> table: vt.t1
> at org.apache.impala.catalog.HdfsTable.load(HdfsTable.java:956)
> at org.apache.impala.catalog.HdfsTable.load(HdfsTable.java:877)
> at org.apache.impala.catalog.TableLoader.load(TableLoader.java:84)
> at
> org.apache.impala.catalog.TableLoadingMgr$2.call(TableLoadingMgr.java:241)
> at
> org.apache.impala.catalog.TableLoadingMgr$2.call(TableLoadingMgr.java:238)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
> Caused by: java.lang.IllegalArgumentException: Cannot parse partition values
> '[]' for table vt.t1: expected %d values but got %d [1, 0]
> at
> com.google.common.base.Preconditions.checkArgument(Preconditions.java:119)
> at
> org.apache.impala.catalog.FeCatalogUtils.parsePartitionKeyValues(FeCatalogUtils.java:224)
> at
> org.apache.impala.catalog.HdfsTable.createPartition(HdfsTable.java:698)
> at
> org.apache.impala.catalog.HdfsTable.loadAllPartitions(HdfsTable.java:532)
> at org.apache.impala.catalog.HdfsTable.load(HdfsTable.java:946)
> ... 8 more
> I0501 14:06:16.440848 38522 status.cc:124] TableLoadingException: Failed to
> load metadata for table: vt.t1
> CAUSED BY: IllegalArgumentException: Cannot parse partition values '[]' for
> table vt.t1: expected %d values but got %d [1, 0]
> @ 0x1a91ff0 impala::Status::Status()
> @ 0x221518c impala::JniUtil::GetJniExceptionMsg()
> @ 0x1a7a369 impala::JniCall::Call<>()
> @ 0x1a78475 impala::JniUtil::CallJniMethod<>()
> @ 0x1a7663c impala::Catalog::ResetMetadata()
> @ 0x1a4e5df CatalogServiceThriftIf::ResetMetadata()
> @ 0x1aea49d
> impala::CatalogServiceProcessor::process_ResetMetadata()
> @ 0x1ae8b9b impala::CatalogServiceProcessor::dispatchCall()
> @ 0x1a36feb apache::thrift::TDispatchProcessor::process()
> @ 0x1e8e8a0
> apache::thrift::server::TAcceptQueueServer::Task::run()
> @ 0x1e8509e impala::ThriftThread::RunRunnable()
> @ 0x1e867c4 boost::_mfi::mf2<>::operator()()
> @ 0x1e8665a boost::_bi::list3<>::operator()<>()
> @ 0x1e863a6 boost::_bi::bind_t<>::operator()()
> @ 0x1e862b9
> boost::detail::function::void_function_obj_invoker0<>::invoke()
> @ 0x1daa4eb boost::function0<>::operator()()
> @ 0x2286cd0 impala::Thread::SuperviseThread()
> @ 0x228f054 boost::_bi::list5<>::operator()<>()
> @ 0x228ef78 boost::_bi::bind_t<>::operator()()
> @ 0x228ef3b boost::detail::thread_data<>::run()
> @ 0x37967b9 thread_proxy
> @ 0x7f1dc6fef6b9 start_thread
> @ 0x7f1dc6d2541c clone
> E0501 14:06:16.440897 38522 catalog-server.cc:122] TableLoadingException:
> Failed to load metadata for table: vt.t1
> CAUSED BY: IllegalArgumentException: Cannot parse partition values '[]' for
> table vt.t1: expected %d values but got %d [1, 0]
> {noformat}
> In earlier versions, this may manifest as a distinct message in a similar
> context.
> In Impala 2.10, for example:
> {noformat}
> I0501 09:51:20.349366 21296 CatalogServiceCatalog.java:996] Refreshing table
> metadata: default.pt1
> I0501 09:51:20.359350 21296 HdfsTable.java:1067] Incrementally loading table
> metadata for: default.pt1
> I0501 09:51:20.393787 21296 jni-util.cc:211]
> org.apache.impala.catalog.TableLoadingException: Failed to load metadata for
> table: pt1
> at org.apache.impala.catalog.HdfsTable.load(HdfsTable.java:1092)
> at org.apache.impala.catalog.HdfsTable.load(HdfsTable.java:1020)
> at
> org.apache.impala.catalog.CatalogServiceCatalog.reloadTable(CatalogServiceCatalog.java:1016)
> at
> org.apache.impala.service.CatalogOpExecutor.execResetMetadata(CatalogOpExecutor.java:3067)
> at
> org.apache.impala.service.JniCatalog.resetMetadata(JniCatalog.java:156)
> Caused by: java.lang.IndexOutOfBoundsException: Index: 0, Size: 0
> at java.util.ArrayList.rangeCheck(ArrayList.java:653)
> at java.util.ArrayList.get(ArrayList.java:429)
> at
> org.apache.hadoop.hive.common.FileUtils.makePartName(FileUtils.java:155)
> at
> org.apache.hadoop.hive.common.FileUtils.makePartName(FileUtils.java:135)
> at
> org.apache.impala.catalog.HdfsPartition.getPartitionName(HdfsPartition.java:491)
> at
> org.apache.impala.catalog.HdfsTable.updatePartitionsFromHms(HdfsTable.java:1171)
> at org.apache.impala.catalog.HdfsTable.load(HdfsTable.java:1073)
> ... 4 more
> I0501 09:51:20.402027 21296 status.cc:122] TableLoadingException: Failed to
> load metadata for table: pt1
> CAUSED BY: IndexOutOfBoundsException: Index: 0, Size: 0
> @ 0x83efc9 impala::Status::Status()
> @ 0xb747d2 impala::JniUtil::GetJniExceptionMsg()
> @ 0x8317cb impala::Catalog::ResetMetadata()
> @ 0x8244db CatalogServiceThriftIf::ResetMetadata()
> @ 0x8cb329
> impala::CatalogServiceProcessor::process_ResetMetadata()
> @ 0x8c8f39 impala::CatalogServiceProcessor::dispatchCall()
> @ 0x80ecec apache::thrift::TDispatchProcessor::process()
> @ 0x9db65f
> apache::thrift::server::TAcceptQueueServer::Task::run()
> @ 0x9d5c79 impala::ThriftThread::RunRunnable()
> @ 0x9d6a52
> boost::detail::function::void_function_obj_invoker0<>::invoke()
> @ 0xbd6ff2 impala::Thread::SuperviseThread()
> @ 0xbd7754 boost::detail::thread_data<>::run()
> @ 0xe6418a (unknown)
> @ 0x7f0faf4a4dd5 start_thread
> @ 0x7f0faf1cdead __clone
> E0501 09:51:20.402400 21296 catalog-server.cc:82] TableLoadingException:
> Failed to load metadata for table: pt1
> CAUSED BY: IndexOutOfBoundsException: Index: 0, Size: 0
> {noformat}