[ https://issues.apache.org/jira/browse/IMPALA-13144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17864875#comment-17864875 ]

Quanlong Huang commented on IMPALA-13144:
-----------------------------------------

Saw the same error in an internal job. The query is
{code:java}
I0710 09:10:33.606647 21511 Frontend.java:2181] 
d1437e1c91c04902:ce4edcaa00000000] Analyzing query: select * from 
iceberg_migrated_alter_test db: 
test_migrated_table_field_id_resolution_b59d79db {code}
The error with backend stacktrace:
{noformat}
I0710 09:10:33.610320 27712 status.cc:71] Disk I/O error on 
impala-ec2-centos79-m6i-4xlarge-xldisk-0db9.vpc.cloudera.com:27000: Failed to 
open HDFS file 
hdfs://localhost:20500/test-warehouse/iceberg_migrated_alter_test/000000_0
Error(2): No such file or directory
Root cause: RemoteException: File does not exist: 
/test-warehouse/iceberg_migrated_alter_test/000000_0
        at 
org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:87)
        at 
org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:77)
        at 
org.apache.hadoop.hdfs.server.namenode.FSDirStatAndListingOp.getBlockLocations(FSDirStatAndListingOp.java:159)
        at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:2040)
        at 
org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getBlockLocations(NameNodeRpcServer.java:738)
        at 
org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getBlockLocations(ClientNamenodeProtocolServerSideTranslatorPB.java:454)
        at 
org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
        at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:533)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1070)
        at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:994)
        at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:922)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1899)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2899)

    @          0x10aa39a  impala::Status::Status()
    @          0x227bbb9  impala::io::OpenHdfsFileOp::Execute()
    @          0x227cd77  impala::SynchronousThreadPool::Worker()
    @          0x227c894  
boost::detail::function::void_function_invoker2<>::invoke()
    @          0x227fc40  impala::ThreadPool<>::WorkerThread()
    @          0x227c8bf  
boost::detail::function::void_function_obj_invoker0<>::invoke()
    @          0x1a63a1a  impala::Thread::SuperviseThread()
    @          0x1a64823  boost::detail::thread_data<>::run()
    @          0x24e6a27  thread_proxy
    @     0x7f18efd2bea5  start_thread
    @     0x7f18ecc26b0d  __clone{noformat}
In the output of the test, I can see the directory was removed before creating 
the external table:
{noformat}
24/07/10 09:10:01 INFO fs.TrashPolicyDefault: Moved: 
'hdfs://localhost:20500/test-warehouse/iceberg_migrated_alter_test' to trash 
at: 
hdfs://localhost:20500/user/jenkins/.Trash/Current/test-warehouse/iceberg_migrated_alter_test
Picked up JAVA_TOOL_OPTIONS:  
-javaagent:/data/jenkins/workspace/impala-private-basic-parameterized/repos/Impala/fe/target/dependency/jamm-0.4.0.jar
-- executing against localhost:21000

create external table 
test_migrated_table_field_id_resolution_b59d79db.iceberg_migrated_alter_test 
stored as iceberg location '/test-warehouse/iceberg_migrated_alter_test'
                        tblproperties('write.format.default'='parquet', 
'iceberg.catalog'=
                        'hadoop.tables');{noformat}
So I'm not sure where the file name comes from; it may come from the Iceberg manifest
file. Note that in my local run, I don't see the output of removing that directory.
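For reference, here is a minimal sketch (plain Python, not Impala code; all names are illustrative) of the failure mode I suspect: the table metadata records an absolute data-file path, the directory is removed afterwards, and opening the stale path then fails with Error(2):

```python
# Sketch of the suspected race: metadata keeps a data-file path that
# outlives the file itself. All names here are illustrative, not Impala's.
import os
import shutil
import tempfile

warehouse = tempfile.mkdtemp()
table_dir = os.path.join(warehouse, "iceberg_migrated_alter_test")
os.makedirs(table_dir)
data_file = os.path.join(table_dir, "000000_0")
with open(data_file, "w") as f:
    f.write("row data")

# The "manifest" records the absolute path while the file still exists.
manifest_paths = [data_file]

# Later the directory is removed (in the failed run it was moved to trash).
shutil.rmtree(warehouse)

# A scan that trusts the manifest now hits errno 2 ("No such file or directory"),
# matching the Error(2) in the backend log above.
errnos = []
for path in manifest_paths:
    try:
        open(path).close()
    except FileNotFoundError as exc:
        errnos.append(exc.errno)
print(errnos)  # [2]
```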

CC [~boroknagyz] 

> TestIcebergTable.test_migrated_table_field_id_resolution fails with Disk I/O 
> error
> ----------------------------------------------------------------------------------
>
>                 Key: IMPALA-13144
>                 URL: https://issues.apache.org/jira/browse/IMPALA-13144
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Backend
>    Affects Versions: Impala 4.5.0
>            Reporter: Joe McDonnell
>            Priority: Critical
>              Labels: broken-build, flaky
>
> A couple test jobs hit a failure on 
> TestIcebergTable.test_migrated_table_field_id_resolution:
> {noformat}
> query_test/test_iceberg.py:270: in test_migrated_table_field_id_resolution
>     vector, unique_database)
> common/impala_test_suite.py:725: in run_test_case
>     result = exec_fn(query, user=test_section.get('USER', '').strip() or None)
> common/impala_test_suite.py:660: in __exec_in_impala
>     result = self.__execute_query(target_impalad_client, query, user=user)
> common/impala_test_suite.py:1013: in __execute_query
>     return impalad_client.execute(query, user=user)
> common/impala_connection.py:216: in execute
>     fetch_profile_after_close=fetch_profile_after_close)
> beeswax/impala_beeswax.py:191: in execute
>     handle = self.__execute_query(query_string.strip(), user=user)
> beeswax/impala_beeswax.py:384: in __execute_query
>     self.wait_for_finished(handle)
> beeswax/impala_beeswax.py:405: in wait_for_finished
>     raise ImpalaBeeswaxException("Query aborted:" + error_log, None)
> E   ImpalaBeeswaxException: ImpalaBeeswaxException:
> E    Query aborted:Disk I/O error on 
> impala-ec2-centos79-m6i-4xlarge-xldisk-153e.vpc.cloudera.com:27000: Failed to 
> open HDFS file 
> hdfs://localhost:20500/test-warehouse/iceberg_migrated_alter_test/000000_0
> E   Error(2): No such file or directory
> E   Root cause: RemoteException: File does not exist: 
> /test-warehouse/iceberg_migrated_alter_test/000000_0
> E     at 
> org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:87)
> E     at 
> org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:77)
> E     at 
> org.apache.hadoop.hdfs.server.namenode.FSDirStatAndListingOp.getBlockLocations(FSDirStatAndListingOp.java:159)
> E     at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:2040)
> E     at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getBlockLocations(NameNodeRpcServer.java:738)
> E     at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getBlockLocations(ClientNamenodeProtocolServerSideTranslatorPB.java:454)
> E     at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
> E     at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:533)
> E     at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1070)
> E     at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:994)
> E     at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:922)
> E     at java.security.AccessController.doPrivileged(Native Method)
> E     at javax.security.auth.Subject.doAs(Subject.java:422)
> E     at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1899)
> E     at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2899){noformat}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
