Vihang Karajgaonkar created IMPALA-8366:
-------------------------------------------
Summary: Consistent synchronization model for Db updates
Key: IMPALA-8366
URL: https://issues.apache.org/jira/browse/IMPALA-8366
Project: IMPALA
Issue Type: Improvement
Components: Catalog
Reporter: Vihang Karajgaonkar
The synchronization model of updates to {{Db}} is different than for {{Table}}.
For example, adding a new function to a Db take a lock on {{functions_}} in a
synchronized block below:
{code}
public boolean addFunction(Function fn, boolean addToDbParams) {
Preconditions.checkState(fn.dbName().equals(getName()));
synchronized (functions_) {
if (getFunction(fn, Function.CompareMode.IS_INDISTINGUISHABLE) != null) {
return false;
}
List<Function> fns = functions_.get(fn.functionName());
if (fns == null) {
fns = new ArrayList<>();
functions_.put(fn.functionName(), fns);
}
if (addToDbParams && !addFunctionToDbParams(fn)) return false;
fns.add(fn);
Collections.sort(fns, FunctionUtils.FUNCTION_RESOLUTION_ORDER);
return true;
}
}
{code}
While this works for most cases, it is confusing. Any updates to {{Table}}
ideally should take a lock on {{mestoreDdlLock_}}, {{catalogVersion_}} and the
table lock itself in that order.
While in case of Db updates, changes are done in-place without taking a write
lock on {{catalogVersion_}}. This can cause problems like the one described in
IMPALA-8312 where readers can see race conditions during updates to the Db
object. It would be good and consistent if we take readlock on
{{catalogVersion_}} too so that it is less confusing and cleaner.
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]