[
https://issues.apache.org/jira/browse/HADOOP-11959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14562490#comment-14562490
]
Hadoop QA commented on HADOOP-11959:
------------------------------------
\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch | 14m 45s | Pre-patch trunk compilation is healthy. |
| {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. |
| {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 3 new or modified test files. |
| {color:green}+1{color} | javac | 7m 35s | There were no new javac warning messages. |
| {color:green}+1{color} | javadoc | 9m 36s | There were no new javadoc warning messages. |
| {color:green}+1{color} | release audit | 0m 22s | The applied patch does not increase the total number of release audit warnings. |
| {color:red}-1{color} | checkstyle | 0m 22s | The applied patch generated 1 new checkstyle issue (total was 37, now 37). |
| {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. |
| {color:green}+1{color} | install | 1m 33s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse | 0m 35s | The patch built with eclipse:eclipse. |
| {color:green}+1{color} | findbugs | 0m 41s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | tools/hadoop tests | 1m 10s | Tests passed in hadoop-azure. |
| | | 36m 43s | |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | http://issues.apache.org/jira/secure/attachment/12735814/HADOOP-11959.2.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 50eeea1 |
| checkstyle | https://builds.apache.org/job/PreCommit-HADOOP-Build/6854/artifact/patchprocess/diffcheckstylehadoop-azure.txt |
| hadoop-azure test log | https://builds.apache.org/job/PreCommit-HADOOP-Build/6854/artifact/patchprocess/testrun_hadoop-azure.txt |
| Test Results | https://builds.apache.org/job/PreCommit-HADOOP-Build/6854/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf900.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | https://builds.apache.org/job/PreCommit-HADOOP-Build/6854/console |
This message was automatically generated.
> WASB should configure client side socket timeout in storage client blob
> request options
> ---------------------------------------------------------------------------------------
>
> Key: HADOOP-11959
> URL: https://issues.apache.org/jira/browse/HADOOP-11959
> Project: Hadoop Common
> Issue Type: Bug
> Components: tools
> Reporter: Ivan Mitic
> Assignee: Ivan Mitic
> Attachments: HADOOP-11959.2.patch, HADOOP-11959.patch
>
>
> On clusters/jobs where {{mapred.task.timeout}} is set to a larger value, we noticed that tasks can sometimes get stuck on the stack below.
> {code}
> Thread 1: (state = IN_NATIVE)
> - java.net.SocketInputStream.socketRead0(java.io.FileDescriptor, byte[], int, int, int) @bci=0 (Interpreted frame)
> - java.net.SocketInputStream.read(byte[], int, int, int) @bci=87, line=152 (Interpreted frame)
> - java.net.SocketInputStream.read(byte[], int, int) @bci=11, line=122 (Interpreted frame)
> - java.io.BufferedInputStream.fill() @bci=175, line=235 (Interpreted frame)
> - java.io.BufferedInputStream.read1(byte[], int, int) @bci=44, line=275 (Interpreted frame)
> - java.io.BufferedInputStream.read(byte[], int, int) @bci=49, line=334 (Interpreted frame)
> - sun.net.www.MeteredStream.read(byte[], int, int) @bci=16, line=134 (Interpreted frame)
> - java.io.FilterInputStream.read(byte[], int, int) @bci=7, line=133 (Interpreted frame)
> - sun.net.www.protocol.http.HttpURLConnection$HttpInputStream.read(byte[], int, int) @bci=4, line=3053 (Interpreted frame)
> - com.microsoft.azure.storage.core.NetworkInputStream.read(byte[], int, int) @bci=7, line=49 (Interpreted frame)
> - com.microsoft.azure.storage.blob.CloudBlob$10.postProcessResponse(java.net.HttpURLConnection, com.microsoft.azure.storage.blob.CloudBlob, com.microsoft.azure.storage.blob.CloudBlobClient, com.microsoft.azure.storage.OperationContext, java.lang.Integer) @bci=204, line=1691 (Interpreted frame)
> - com.microsoft.azure.storage.blob.CloudBlob$10.postProcessResponse(java.net.HttpURLConnection, java.lang.Object, java.lang.Object, com.microsoft.azure.storage.OperationContext, java.lang.Object) @bci=17, line=1613 (Interpreted frame)
> - com.microsoft.azure.storage.core.ExecutionEngine.executeWithRetry(java.lang.Object, java.lang.Object, com.microsoft.azure.storage.core.StorageRequest, com.microsoft.azure.storage.RetryPolicyFactory, com.microsoft.azure.storage.OperationContext) @bci=352, line=148 (Interpreted frame)
> - com.microsoft.azure.storage.blob.CloudBlob.downloadRangeInternal(long, java.lang.Long, byte[], int, com.microsoft.azure.storage.AccessCondition, com.microsoft.azure.storage.blob.BlobRequestOptions, com.microsoft.azure.storage.OperationContext) @bci=131, line=1468 (Interpreted frame)
> - com.microsoft.azure.storage.blob.BlobInputStream.dispatchRead(int) @bci=31, line=255 (Interpreted frame)
> - com.microsoft.azure.storage.blob.BlobInputStream.readInternal(byte[], int, int) @bci=52, line=448 (Interpreted frame)
> - com.microsoft.azure.storage.blob.BlobInputStream.read(byte[], int, int) @bci=28, line=420 (Interpreted frame)
> - java.io.BufferedInputStream.read1(byte[], int, int) @bci=39, line=273 (Interpreted frame)
> - java.io.BufferedInputStream.read(byte[], int, int) @bci=49, line=334 (Interpreted frame)
> - java.io.DataInputStream.read(byte[], int, int) @bci=7, line=149 (Interpreted frame)
> - org.apache.hadoop.fs.azure.NativeAzureFileSystem$NativeAzureFsInputStream.read(byte[], int, int) @bci=10, line=734 (Interpreted frame)
> - java.io.BufferedInputStream.read1(byte[], int, int) @bci=39, line=273 (Interpreted frame)
> - java.io.BufferedInputStream.read(byte[], int, int) @bci=49, line=334 (Interpreted frame)
> - java.io.DataInputStream.read(byte[]) @bci=8, line=100 (Interpreted frame)
> - org.apache.hadoop.util.LineReader.fillBuffer(java.io.InputStream, byte[], boolean) @bci=2, line=180 (Interpreted frame)
> - org.apache.hadoop.util.LineReader.readDefaultLine(org.apache.hadoop.io.Text, int, int) @bci=64, line=216 (Compiled frame)
> - org.apache.hadoop.util.LineReader.readLine(org.apache.hadoop.io.Text, int, int) @bci=19, line=174 (Interpreted frame)
> - org.apache.hadoop.mapreduce.lib.input.LineRecordReader.nextKeyValue() @bci=108, line=185 (Interpreted frame)
> - org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue() @bci=13, line=553 (Interpreted frame)
> - org.apache.hadoop.mapreduce.task.MapContextImpl.nextKeyValue() @bci=4, line=80 (Interpreted frame)
> - org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.nextKeyValue() @bci=4, line=91 (Interpreted frame)
> - org.apache.hadoop.mapreduce.Mapper.run(org.apache.hadoop.mapreduce.Mapper$Context) @bci=6, line=144 (Interpreted frame)
> - org.apache.hadoop.mapred.MapTask.runNewMapper(org.apache.hadoop.mapred.JobConf, org.apache.hadoop.mapreduce.split.JobSplit$TaskSplitIndex, org.apache.hadoop.mapred.TaskUmbilicalProtocol, org.apache.hadoop.mapred.Task$TaskReporter) @bci=228, line=784 (Interpreted frame)
> - org.apache.hadoop.mapred.MapTask.run(org.apache.hadoop.mapred.JobConf, org.apache.hadoop.mapred.TaskUmbilicalProtocol) @bci=148, line=341 (Interpreted frame)
> - org.apache.hadoop.mapred.YarnChild$2.run() @bci=29, line=163 (Interpreted frame)
> - java.security.AccessController.doPrivileged(java.security.PrivilegedExceptionAction, java.security.AccessControlContext) @bci=0 (Interpreted frame)
> - javax.security.auth.Subject.doAs(javax.security.auth.Subject, java.security.PrivilegedExceptionAction) @bci=42, line=415 (Interpreted frame)
> - org.apache.hadoop.security.UserGroupInformation.doAs(java.security.PrivilegedExceptionAction) @bci=14, line=1628 (Interpreted frame)
> - org.apache.hadoop.mapred.YarnChild.main(java.lang.String[]) @bci=514, line=158 (Interpreted frame)
> {code}
> The issue is that, by default, the storage client does not set a socket timeout on its HTTP connections, so in some (rare) circumstances, e.g. when the server on the other side dies unexpectedly, a read can block forever and the task effectively deadlocks.
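> A minimal sketch (not from the patch) of the underlying behavior: {{java.net.URLConnection}} defaults its read timeout to 0, which the JDK interprets as "wait indefinitely", so a stalled peer never surfaces as an error. The URL is just a placeholder.
> {code}
> import java.net.HttpURLConnection;
> import java.net.URL;
>
> public class ReadTimeoutDemo {
>   public static void main(String[] args) throws Exception {
>     HttpURLConnection conn =
>         (HttpURLConnection) new URL("http://example.com/").openConnection();
>     // The JDK default is 0, i.e. read() blocks forever if the peer stalls:
>     System.out.println(conn.getReadTimeout()); // prints 0
>     // A non-zero timeout makes a stalled read() fail with
>     // java.net.SocketTimeoutException instead of hanging the task:
>     conn.setReadTimeout(30000); // 30 seconds, arbitrary example value
>   }
> }
> {code}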
> The fix is to configure the maximum operation time on the storage client request options, along the lines of the sketch below.
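> A minimal sketch of what that could look like with the {{com.microsoft.azure.storage}} SDK's {{BlobRequestOptions}} (illustrative only, not the attached patch; the class name and timeout value are arbitrary examples):
> {code}
> import com.microsoft.azure.storage.blob.BlobRequestOptions;
>
> public class WasbTimeoutSketch {
>   static BlobRequestOptions timeoutOptions() {
>     // Cap the total client-side time a blob operation may take (retries
>     // included), so a dead server surfaces as an exception rather than
>     // an indefinitely blocked socket read.
>     BlobRequestOptions options = new BlobRequestOptions();
>     options.setMaximumExecutionTimeInMs(5 * 60 * 1000); // example: 5 minutes
>     return options;
>   }
>   // The options are then passed on each blob call, e.g.:
>   //   blob.openInputStream(null /* accessCondition */, options, null /* opContext */);
> }
> {code}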
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)