[ 
https://issues.apache.org/jira/browse/IMPALA-9532?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

liuyao updated IMPALA-9532:
---------------------------
    Description: 
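The global invalidate metadata takes a write lock on the {{versionLock_}}. 
However, the locking protocol for DDLs releases the {{versionLock_}} as soon as 
the table-level lock is acquired. This allows a concurrent {{invalidate 
metadata}} to run while the DDL operation is in progress, which can lead to 
race conditions. In one such example, shown below, a function disappears 
from the catalog until invalidate metadata is issued again.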

The following sequence of events can reproduce this race condition:
{noformat}
[localhost:21000] default> create function default.f() returns int location 
'/test-warehouse/libTestUdfs.so' symbol='NoArgs';
Query: create function default.f() returns int location 
'/test-warehouse/libTestUdfs.so' symbol='NoArgs'
+----------------------------+
| summary                    |
+----------------------------+
| Function has been created. |
+----------------------------+
Fetched 1 row(s) in 10.26s

--> Session 2 invokes invalidate metadata concurrently
[localhost:21001] default> invalidate metadata;
Query: invalidate metadata
Query submitted at: 2020-03-18 15:04:25 (Coordinator: http://vihang-Precision-21575:25001)
Query progress can be monitored at: http://<redacted>/query_plan?query_id=d3463484ff635684:620fbfef00000000
Fetched 0 row(s) in 4.30s

--> drop function from session 1 says the function does not exist, but show functions shows it.
[localhost:21000] default> drop function f();
Query: drop function f()
ERROR: CatalogException: Function: f() does not exist.

[localhost:21000] default> show functions;
Query: show functions
+-------------+-----------+-------------+---------------+
| return type | signature | binary type | is persistent |
+-------------+-----------+-------------+---------------+
| INT         | f()       | NATIVE      | true          |
+-------------+-----------+-------------+---------------+
Fetched 1 row(s) in 0.01s
[localhost:21000] default> 

-- Session 2 never sees the function f:
[localhost:21001] default> show functions;
Query: show functions
Fetched 0 row(s) in 0.00s
{noformat}
When the create function statement is executing in {{CatalogOpExecutor}}, we 
apply the alterDatabase in HMS to persist the new db parameters here: 
[https://github.com/apache/impala/blob/master/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java#L1409]

Note that we have released the {{versionLock_}} by line 1409. Meanwhile, a 
concurrent {{invalidate metadata}} fetches the db params from HMS here 
[https://github.com/apache/impala/blob/master/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java#L1326]
 and overrides the parameters of the newly created Db object. Hence we are 
effectively removing the function from the parameters, since the 
alterDatabase call from the first operation has not yet been committed in HMS.

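All subsequent {{show functions}} and {{drop function}} commands will show 
inconsistent results. I was able to reproduce this race condition by adding a 
sleep statement just before the alterDatabase call in the createFunction method.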
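
The interleaving described above can be sketched as a small standalone Java model. This is only an illustration under stated assumptions, not Impala's actual code: the class {{InvalidateRaceSketch}}, the sets {{hms}} and {{catalogDb}}, and the method bodies are all hypothetical stand-ins. Only the name {{versionLock}} is borrowed from the report, and two latches play the role of the sleep so the ordering is deterministic.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class InvalidateRaceSketch {
    // Stand-in for the catalog-wide lock (named after versionLock_ in the report).
    static final ReentrantReadWriteLock versionLock = new ReentrantReadWriteLock();
    // Hypothetical stand-in for the function entries persisted in HMS db parameters.
    static final Set<String> hms = ConcurrentHashMap.newKeySet();
    // Hypothetical stand-in for the catalog's current in-memory Db object.
    static volatile Set<String> catalogDb = ConcurrentHashMap.newKeySet();

    // Global invalidate: under the write lock, rebuild the Db object from HMS.
    static void invalidateMetadata() {
        versionLock.writeLock().lock();
        try {
            Set<String> fresh = ConcurrentHashMap.newKeySet();
            fresh.addAll(hms);
            catalogDb = fresh;
        } finally {
            versionLock.writeLock().unlock();
        }
    }

    // DDL: looks up the Db under the lock, releases the lock, then commits to HMS.
    static void createFunction(String fn, CountDownLatch released, CountDownLatch resume)
            throws InterruptedException {
        Set<String> db;
        versionLock.writeLock().lock();
        try {
            db = catalogDb;          // reference to the Db object current at this point
        } finally {
            versionLock.writeLock().unlock();
        }
        released.countDown();        // the lock is free before the HMS commit
        resume.await();              // plays the role of the sleep before alterDatabase
        hms.add(fn);                 // the "alterDatabase" commit
        db.add(fn);                  // lands on a Db object the catalog no longer uses
    }

    // Returns {in HMS, in catalog after the race, in catalog after a second invalidate}.
    static boolean[] runRace() {
        hms.clear();
        catalogDb = ConcurrentHashMap.newKeySet();
        CountDownLatch released = new CountDownLatch(1);
        CountDownLatch resume = new CountDownLatch(1);
        Thread ddl = new Thread(() -> {
            try {
                createFunction("f()", released, resume);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        try {
            ddl.start();
            released.await();        // wait until the DDL has dropped the lock
            invalidateMetadata();    // reloads from HMS, which has no f() yet
            resume.countDown();
            ddl.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        boolean afterRace = catalogDb.contains("f()");
        invalidateMetadata();        // a second invalidate picks up the committed entry
        boolean afterSecond = catalogDb.contains("f()");
        return new boolean[] { hms.contains("f()"), afterRace, afterSecond };
    }

    public static void main(String[] args) {
        boolean[] r = runRace();
        System.out.println("in HMS: " + r[0]);
        System.out.println("in catalog after race: " + r[1]);
        System.out.println("in catalog after second invalidate: " + r[2]);
    }
}
```

In this model the function ends up persisted in the stand-in HMS but missing from the catalog's current Db object until invalidate runs again, which mirrors the symptom reported here. Holding the lock across the commit would close the window in the sketch; whether that is the right fix for the real catalog is a separate question.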

Note: The code links above are based on commit hash {{7dd13f72784514a59f82c9a7a5e2250503dbfaf0}}

  was:
The global invalidate metadata takes a write lock on the {{versionLock_}}. 
However, the locking protocol for DDLs releases the {{versionLock_}} as soon as 
the table-level lock is acquired. This allows a concurrent {{invalidate 
metadata}} to run while the DDL operation is in progress, which can lead to 
race conditions. In one such example, shown below, a function disappears 
from the catalog until invalidate metadata is issued again.

The following sequence of events can reproduce this race condition:
{noformat}
[localhost:21000] default> create function default.f() returns int location 
'/test-warehouse/libTestUdfs.so' symbol='NoArgs';
Query: create function default.f() returns int location 
'/test-warehouse/libTestUdfs.so' symbol='NoArgs'
+----------------------------+
| summary                    |
+----------------------------+
| Function has been created. |
+----------------------------+
Fetched 1 row(s) in 10.26s

--> Session 2 invokes invalidate metadata concurrently
[localhost:21001] default> invalidate metadata;
Query: invalidate metadata
Query submitted at: 2020-03-18 15:04:25 (Coordinator: http://vihang-Precision-21575:25001)
Query progress can be monitored at: http://<redacted>/query_plan?query_id=d3463484ff635684:620fbfef00000000
Fetched 0 row(s) in 4.30s

--> drop function from session1 says function does not exist but show functions 
shows it.
[localhost:21000] default> drop function f();
Query: drop function f()
ERROR: CatalogException: Function: f() does not exist.

[localhost:21000] default> show functions;
Query: show functions
+-------------+-----------+-------------+---------------+
| return type | signature | binary type | is persistent |
+-------------+-----------+-------------+---------------+
| INT         | f()       | NATIVE      | true          |
+-------------+-----------+-------------+---------------+
Fetched 1 row(s) in 0.01s
[localhost:21000] default> 

-- Session 2 never sees the function f:
[localhost:21001] default> show functions;
Query: show functions
Fetched 0 row(s) in 0.00s
{noformat}
When the create function statement is executing in {{CatalogOpExecutor}} we 
apply the alterDatabase in HMS to persist the new db parameters here: 
[https://github.com/apache/impala/blob/master/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java#L1409]

Note that we have released the {{versionLock_}} by line 1409. Meanwhile, a 
concurrent {{invalidate metadata}} fetches the db params from HMS here 
[https://github.com/apache/impala/blob/master/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java#L1326]
 and overrides the parameters of the newly created Db object. Hence we are 
effectively removing the function from the parameters, since the 
alterDatabase call from the first operation has not yet been committed in HMS.

All subsequent {{show functions}} and {{drop function}} commands will show 
inconsistent results. I was able to reproduce this race condition by adding a 
sleep statement just before the alterDatabase call in the createFunction method.

Note: The code links above are based on commit hash 
{{7dd13f72784514a59f82c9a7a5e2250503dbfaf0}}


> Functions can disappear when a concurrent invalidate metadata is running
> ------------------------------------------------------------------------
>
>                 Key: IMPALA-9532
>                 URL: https://issues.apache.org/jira/browse/IMPALA-9532
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Catalog
>            Reporter: Vihang Karajgaonkar
>            Priority: Major
>              Labels: concurrency
>



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
