[ 
https://issues.apache.org/jira/browse/HDDS-2203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16941398#comment-16941398
 ] 

Anu Engineer commented on HDDS-2203:
------------------------------------

makes sense. Do you want this patch committed? or just move to the new model ? 


> Race condition in ByteStringHelper.init()
> -----------------------------------------
>
>                 Key: HDDS-2203
>                 URL: https://issues.apache.org/jira/browse/HDDS-2203
>             Project: Hadoop Distributed Data Store
>          Issue Type: Bug
>          Components: Ozone Client, SCM
>            Reporter: Istvan Fajth
>            Assignee: Istvan Fajth
>            Priority: Critical
>              Labels: pull-request-available, pull-requests-available
>          Time Spent: 1h
>  Remaining Estimate: 0h
>
> The current init method:
> {code}
> public static void init(boolean isUnsafeByteOperation) {
>   final boolean set = INITIALIZED.compareAndSet(false, true);
>   if (set) {
>     ByteStringHelper.isUnsafeByteOperationsEnabled =
>        isUnsafeByteOperation;
>    } else {
>      // already initialized, check values
>      Preconditions.checkState(isUnsafeByteOperationsEnabled
>        == isUnsafeByteOperation);
>    }
> }
> {code}
> In a scenario when two thread accesses this method, and the execution order 
> is the following, then the second thread runs into an exception from 
> PreCondition.checkState() in the else branch.
> In an unitialized state:
> - T1 thread arrives to the method with true as the parameter, the class 
> initialises the isUnsafeByteOperationsEnabled to false
> - T1 sets INITIALIZED true
> - T2 arrives to the method with true as the parameter
> - T2 reads the INITALIZED value and as it is not false goes to else branch
> - T2 tries to check if the internal boolean property is the same true as it 
> wanted to set, and as T1 still to set the value, the checkState throws an 
> IllegalArgumentException.
> This happens in certain Hive query cases, as it came from that testing, the 
> exception we see there is the following:
> {code}
> Error: Error while processing statement: FAILED: Execution Error, return code 
> 2 from org.apache.hadoop.hive.ql.exec.tez.TezTask. Vertex failed, 
> vertexName=Map 2, vertexId=vertex_1569486223160_0334_1_02, 
> diagnostics=[Vertex vertex_1569486223160_0334_1_02 [Map 2] killed/failed
>  due to:ROOT_INPUT_INIT_FAILURE, Vertex Input: item initializer failed, 
> vertex=vertex_1569486223160_0334_1_02 [Map 2], java.io.IOException: Couldn't 
> create RpcClient protocol
>     at 
> org.apache.hadoop.ozone.client.OzoneClientFactory.getClientProtocol(OzoneClientFactory.java:263)
>     at 
> org.apache.hadoop.ozone.client.OzoneClientFactory.getClientProtocol(OzoneClientFactory.java:239)
>     at 
> org.apache.hadoop.ozone.client.OzoneClientFactory.getRpcClient(OzoneClientFactory.java:203)
>     at 
> org.apache.hadoop.ozone.client.OzoneClientFactory.getRpcClient(OzoneClientFactory.java:165)
>     at 
> org.apache.hadoop.fs.ozone.BasicOzoneClientAdapterImpl.<init>(BasicOzoneClientAdapterImpl.java:158)
>     at 
> org.apache.hadoop.fs.ozone.OzoneClientAdapterImpl.<init>(OzoneClientAdapterImpl.java:50)
>     at 
> org.apache.hadoop.fs.ozone.OzoneFileSystem.createAdapter(OzoneFileSystem.java:102)
>     at 
> org.apache.hadoop.fs.ozone.BasicOzoneFileSystem.initialize(BasicOzoneFileSystem.java:155)
>     at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:3315)
>     at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:136)
>     at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:3364)
>     at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:3332)
>     at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:491)
>     at org.apache.hadoop.fs.Path.getFileSystem(Path.java:361)
>     at 
> org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.generateSplitsInfo(OrcInputFormat.java:1821)
>     at 
> org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.getSplits(OrcInputFormat.java:2002)
>     at 
> org.apache.hadoop.hive.ql.io.HiveInputFormat.addSplitsForGroup(HiveInputFormat.java:524)
>     at 
> org.apache.hadoop.hive.ql.io.HiveInputFormat.getSplits(HiveInputFormat.java:781)
>     at 
> org.apache.hadoop.hive.ql.exec.tez.HiveSplitGenerator.initialize(HiveSplitGenerator.java:243)
>     at 
> org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable$1.run(RootInputInitializerManager.java:278)
>     at 
> org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable$1.run(RootInputInitializerManager.java:269)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:422)
>     at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1876)
>     at 
> org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable.call(RootInputInitializerManager.java:269)
>     at 
> org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable.call(RootInputInitializerManager.java:253)
>     at 
> com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:108)
>     at 
> com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:41)
>     at 
> com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:77)
>     at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>     at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>     at java.lang.Thread.run(Thread.java:748)
> Caused by: java.lang.IllegalStateException
>     at 
> org.apache.hadoop.ozone.shaded.com.google.common.base.Preconditions.checkState(Preconditions.java:129)
>     at 
> org.apache.hadoop.hdds.scm.ByteStringHelper.init(ByteStringHelper.java:47)
>     at org.apache.hadoop.ozone.client.rpc.RpcClient.<init>(RpcClient.java:241)
>     at 
> org.apache.hadoop.ozone.client.OzoneClientFactory.getClientProtocol(OzoneClientFactory.java:256)
>     ... 31 more
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org

Reply via email to