waterlx edited a comment on issue #1349:
URL: https://github.com/apache/iceberg/issues/1349#issuecomment-676468945


@aokolnychyi I used a Spark job to count the records after writing into the Iceberg table, and it did not show the expected number. By accident, I found that the version number in version-hint.text was not the same as that of the latest metadata JSON.
A Flink job writes into Iceberg; I use MergeAppend and Transaction.commitTransaction() to do the commit, and the table instance is not cached.
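For reference, here is a minimal sketch of that commit pattern (the table location and the data file are placeholders, not our actual job code):

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.iceberg.AppendFiles;
import org.apache.iceberg.DataFile;
import org.apache.iceberg.Table;
import org.apache.iceberg.Transaction;
import org.apache.iceberg.hadoop.HadoopTables;

public class CommitSketch {
  void commit(DataFile dataFile) {
    // Load the table fresh each time; the instance is not cached.
    Table table = new HadoopTables(new Configuration())
        .load("hdfs://nn/warehouse/db/tbl"); // placeholder location
    Transaction txn = table.newTransaction();
    AppendFiles append = txn.newAppend(); // MergeAppend-backed append
    append.appendFile(dataFile);
    append.commit(); // stages the append within the transaction
    // commitTransaction() writes the new metadata JSON and then
    // updates version-hint.text via HadoopTableOperations.commit().
    txn.commitTransaction();
  }
}
```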
   
Sorry that I could not re-create it or recall all the actions I made that day. My guess is that the incorrect read happened because not only was version-hint.text not updated correctly, but at least one metadata JSON file was also not written correctly due to improper permissions.
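To explain why a stale hint causes an incorrect read: a reader resolves the current metadata file through version-hint.text, roughly like this simplified sketch (it paraphrases HadoopTableOperations behavior and is not the exact library code):

```java
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class VersionHintSketch {
  static Path currentMetadataFile(String tableLocation, Configuration conf) throws Exception {
    Path hint = new Path(tableLocation, "metadata/version-hint.text");
    FileSystem fs = hint.getFileSystem(conf);
    try (BufferedReader in = new BufferedReader(
        new InputStreamReader(fs.open(hint), StandardCharsets.UTF_8))) {
      int version = Integer.parseInt(in.readLine().trim());
      // If writing the hint failed (e.g. a permission error), this still points
      // at an older vN.metadata.json, so a count over the table sees stale data.
      return new Path(tableLocation, "metadata/v" + version + ".metadata.json");
    }
  }
}
```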
   
Another thing I would like to mention (though it might not relate to that incorrect read) is that the `java.io.FileNotFoundException` in the description was an error triggered on purpose to reproduce the issue in my dev environment; the actual error on our production system is the following `org.apache.hadoop.ipc.RemoteException` (we enabled Ranger against Hadoop):
```
org.apache.hadoop.ipc.RemoteException(org.apache.ranger.authorization.hadoop.exceptions.RangerAccessControlException): Permission denied: user=u_teg_tdbank, access=WRITE, inode="/xxxx/version-hint.text"
        at org.apache.ranger.authorization.hadoop.RangerHdfsAuthorizer$RangerAccessControlEnforcer.checkPermission(RangerHdfsAuthorizer.java:442)
        at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:190)
        at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPermission(FSDirectory.java:1663)
        at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPermission(FSDirectory.java:1647)
        at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPathAccess(FSDirectory.java:1597)
        at org.apache.hadoop.hdfs.server.namenode.FSDirWriteFileOp.resolvePathForStartFile(FSDirWriteFileOp.java:305)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInt(FSNamesystem.java:2284)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFile(FSNamesystem.java:2227)
        at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.createOriginal(NameNodeRpcServer.java:745)
        at org.apache.hadoop.hdfs.server.namenode.ProtectionManager.create(ProtectionManager.java:326)
        at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.create(NameNodeRpcServer.java:715)
        at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.create(ClientNamenodeProtocolServerSideTranslatorPB.java:421)
        at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:447)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991)
        at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:866)
        at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:809)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:2248)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2574)

        at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1489)
        at org.apache.hadoop.ipc.Client.call(Client.java:1435)
        at org.apache.hadoop.ipc.Client.call(Client.java:1345)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:227)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:116)
        at com.sun.proxy.$Proxy10.create(Unknown Source)
        at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.create(ClientNamenodeProtocolTranslatorPB.java:307)
        at sun.reflect.GeneratedMethodAccessor38.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:409)
        at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:163)
        at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:155)
        at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:346)
        at com.sun.proxy.$Proxy11.create(Unknown Source)
        at org.apache.hadoop.hdfs.DFSOutputStream.newStreamForCreate(DFSOutputStream.java:266)
        at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1308)
        at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1249)
        at org.apache.hadoop.hdfs.DistributedFileSystem$8.doCall(DistributedFileSystem.java:484)
        at org.apache.hadoop.hdfs.DistributedFileSystem$8.doCall(DistributedFileSystem.java:481)
        at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
        at org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:495)
        at org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:422)
        at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:946)
        at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:927)
        at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:824)
        at org.apache.iceberg.hadoop.HadoopTableOperations.writeVersionHint(HadoopTableOperations.java:273)
        at org.apache.iceberg.hadoop.HadoopTableOperations.commit(HadoopTableOperations.java:162)
        at org.apache.iceberg.BaseTransaction.lambda$commitSimpleTransaction$5(BaseTransaction.java:344)
        at org.apache.iceberg.util.Tasks$Builder.runTaskWithRetry(Tasks.java:403)
        at org.apache.iceberg.util.Tasks$Builder.runSingleThreaded(Tasks.java:212)
        at org.apache.iceberg.util.Tasks$Builder.run(Tasks.java:196)
        at org.apache.iceberg.util.Tasks$Builder.run(Tasks.java:188)
        at org.apache.iceberg.BaseTransaction.commitSimpleTransaction(BaseTransaction.java:329)
        at org.apache.iceberg.BaseTransaction.commitTransaction(BaseTransaction.java:220)
```
`org.apache.hadoop.ipc.RemoteException` is a subclass of `IOException`, so it is swallowed and logged as expected.
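For context, here is a paraphrased sketch of that behavior (not the exact Iceberg source): writeVersionHint treats the hint as best-effort, so the Ranger denial does not fail the commit, it only leaves the hint stale.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

class VersionHintWriteSketch {
  private static final Logger LOG = LoggerFactory.getLogger(VersionHintWriteSketch.class);

  static void writeVersionHint(FileSystem fs, Path versionHintFile, int versionToWrite) {
    try (FSDataOutputStream out = fs.create(versionHintFile, true /* overwrite */)) {
      out.write(String.valueOf(versionToWrite).getBytes(StandardCharsets.UTF_8));
    } catch (IOException e) {
      // RemoteException extends IOException, so the Ranger denial above lands
      // here: it is logged and swallowed, and the commit still succeeds.
      LOG.warn("Failed to update version hint", e);
    }
  }
}
```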

