[ https://issues.apache.org/jira/browse/HDDS-2203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16941398#comment-16941398 ]
Anu Engineer commented on HDDS-2203:
------------------------------------
Makes sense. Do you want this patch committed, or should we just move to the new model?
> Race condition in ByteStringHelper.init()
> -----------------------------------------
>
> Key: HDDS-2203
> URL: https://issues.apache.org/jira/browse/HDDS-2203
> Project: Hadoop Distributed Data Store
> Issue Type: Bug
> Components: Ozone Client, SCM
> Reporter: Istvan Fajth
> Assignee: Istvan Fajth
> Priority: Critical
> Labels: pull-request-available, pull-requests-available
> Time Spent: 1h
> Remaining Estimate: 0h
>
> The current init method:
> {code}
> public static void init(boolean isUnsafeByteOperation) {
>   final boolean set = INITIALIZED.compareAndSet(false, true);
>   if (set) {
>     ByteStringHelper.isUnsafeByteOperationsEnabled =
>         isUnsafeByteOperation;
>   } else {
>     // already initialized, check values
>     Preconditions.checkState(isUnsafeByteOperationsEnabled
>         == isUnsafeByteOperation);
>   }
> }
> {code}
> In a scenario where two threads access this method and the execution order
> is the following, the second thread runs into an exception from
> Preconditions.checkState() in the else branch.
> In an uninitialized state:
> - T1 arrives at the method with true as the parameter; at this point the
> class still has isUnsafeByteOperationsEnabled at its initial value, false
> - T1 sets INITIALIZED to true
> - T2 arrives at the method with true as the parameter
> - T2 reads the INITIALIZED value and, as it is not false, goes to the else
> branch
> - T2 checks whether the internal boolean property equals the true value it
> wanted to set, and as T1 has yet to set the value, checkState throws an
> IllegalStateException.
> This happens in certain Hive query cases, which is where the issue was found
> during testing. The exception we see there is the following:
> {code}
> Error: Error while processing statement: FAILED: Execution Error, return code
> 2 from org.apache.hadoop.hive.ql.exec.tez.TezTask. Vertex failed,
> vertexName=Map 2, vertexId=vertex_1569486223160_0334_1_02,
> diagnostics=[Vertex vertex_1569486223160_0334_1_02 [Map 2] killed/failed
> due to:ROOT_INPUT_INIT_FAILURE, Vertex Input: item initializer failed,
> vertex=vertex_1569486223160_0334_1_02 [Map 2], java.io.IOException: Couldn't
> create RpcClient protocol
>     at org.apache.hadoop.ozone.client.OzoneClientFactory.getClientProtocol(OzoneClientFactory.java:263)
>     at org.apache.hadoop.ozone.client.OzoneClientFactory.getClientProtocol(OzoneClientFactory.java:239)
>     at org.apache.hadoop.ozone.client.OzoneClientFactory.getRpcClient(OzoneClientFactory.java:203)
>     at org.apache.hadoop.ozone.client.OzoneClientFactory.getRpcClient(OzoneClientFactory.java:165)
>     at org.apache.hadoop.fs.ozone.BasicOzoneClientAdapterImpl.<init>(BasicOzoneClientAdapterImpl.java:158)
>     at org.apache.hadoop.fs.ozone.OzoneClientAdapterImpl.<init>(OzoneClientAdapterImpl.java:50)
>     at org.apache.hadoop.fs.ozone.OzoneFileSystem.createAdapter(OzoneFileSystem.java:102)
>     at org.apache.hadoop.fs.ozone.BasicOzoneFileSystem.initialize(BasicOzoneFileSystem.java:155)
>     at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:3315)
>     at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:136)
>     at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:3364)
>     at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:3332)
>     at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:491)
>     at org.apache.hadoop.fs.Path.getFileSystem(Path.java:361)
>     at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.generateSplitsInfo(OrcInputFormat.java:1821)
>     at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.getSplits(OrcInputFormat.java:2002)
>     at org.apache.hadoop.hive.ql.io.HiveInputFormat.addSplitsForGroup(HiveInputFormat.java:524)
>     at org.apache.hadoop.hive.ql.io.HiveInputFormat.getSplits(HiveInputFormat.java:781)
>     at org.apache.hadoop.hive.ql.exec.tez.HiveSplitGenerator.initialize(HiveSplitGenerator.java:243)
>     at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable$1.run(RootInputInitializerManager.java:278)
>     at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable$1.run(RootInputInitializerManager.java:269)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:422)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1876)
>     at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable.call(RootInputInitializerManager.java:269)
>     at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable.call(RootInputInitializerManager.java:253)
>     at com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:108)
>     at com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:41)
>     at com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:77)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>     at java.lang.Thread.run(Thread.java:748)
> Caused by: java.lang.IllegalStateException
>     at org.apache.hadoop.ozone.shaded.com.google.common.base.Preconditions.checkState(Preconditions.java:129)
>     at org.apache.hadoop.hdds.scm.ByteStringHelper.init(ByteStringHelper.java:47)
>     at org.apache.hadoop.ozone.client.rpc.RpcClient.<init>(RpcClient.java:241)
>     at org.apache.hadoop.ozone.client.OzoneClientFactory.getClientProtocol(OzoneClientFactory.java:256)
> ... 31 more
> {code}
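
For illustration only, here is a minimal sketch of one way the interleaving
described in the issue could be avoided. It is not the patch attached to
HDDS-2203 and not the "new model" mentioned in the comment. The class name
SafeByteStringHelperSketch and the accessor isUnsafeEnabled() are hypothetical,
and a plain IllegalStateException stands in for Guava's
Preconditions.checkState so that the snippet has no dependencies. The idea is
to keep the "initialized" marker and the configured value in a single
AtomicReference, so that no thread can observe the marker without the value.

```java
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical stand-in for ByteStringHelper; a sketch, not the real class. */
final class SafeByteStringHelperSketch {

  // null means "not initialized yet". The marker and the value are the same
  // reference, so they are always published together.
  private static final AtomicReference<Boolean> UNSAFE_ENABLED =
      new AtomicReference<>();

  private SafeByteStringHelperSketch() {
  }

  static void init(boolean isUnsafeByteOperation) {
    // Only the first caller succeeds in replacing null with its value.
    if (!UNSAFE_ENABLED.compareAndSet(null, isUnsafeByteOperation)) {
      // Already initialized: the reference is non-null and holds the value
      // chosen by the first caller, so the comparison is reliable.
      boolean current = UNSAFE_ENABLED.get();
      if (current != isUnsafeByteOperation) {
        throw new IllegalStateException(
            "Already initialized with a different value: " + current);
      }
    }
  }

  // Hypothetical accessor, added only so the sketch can be exercised.
  static boolean isUnsafeEnabled() {
    Boolean current = UNSAFE_ENABLED.get();
    if (current == null) {
      throw new IllegalStateException("init() has not been called");
    }
    return current;
  }
}
```

With this shape, a second thread calling init(true) either wins the
compareAndSet itself or sees the value already stored by the first thread, so
it cannot fail while the first thread is still between the two steps. A
conflicting call such as init(false) after init(true) still fails, which keeps
the intent of the original check.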
--
This message was sent by Atlassian Jira
(v8.3.4#803005)