[jira] [Commented] (HADOOP-14070) S3a: Failed to reset the request input stream/make S3A uploadPart() retriable
[ https://issues.apache.org/jira/browse/HADOOP-14070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16558730#comment-16558730 ]

Steve Loughran commented on HADOOP-14070:
------------------------------------------

We can fix this by having {{uploadPart()}} implement the retry logic itself, which will also handle other transient errors. This implies that {{com.amazonaws.ResetException}} will need to go into the retry table as retriable.
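A minimal sketch of that idea, assuming the part data can actually be rewound between attempts (e.g. a file-backed block). The class name, retry limit and backoff below are illustrative only, not the actual S3A code:

{code:java}
// Illustrative sketch, not the S3A implementation: retry the part upload inside the
// connector so a transient com.amazonaws.ResetException does not fail the whole
// multipart upload.
import com.amazonaws.ResetException;
import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.model.UploadPartRequest;
import com.amazonaws.services.s3.model.UploadPartResult;

public class RetryingPartUploader {

  // Made-up values for the sketch; a real connector would take these from its
  // retry configuration.
  private static final int MAX_ATTEMPTS = 3;
  private static final long SLEEP_MILLIS = 500;

  private final AmazonS3 s3;

  public RetryingPartUploader(AmazonS3 s3) {
    this.s3 = s3;
  }

  /**
   * Upload one part, retrying on ResetException. Re-sending the same part number
   * is idempotent, so a retry is safe.
   */
  public UploadPartResult uploadPart(UploadPartRequest request) throws InterruptedException {
    ResetException lastFailure = null;
    for (int attempt = 1; attempt <= MAX_ATTEMPTS; attempt++) {
      try {
        return s3.uploadPart(request);
      } catch (ResetException e) {
        // The SDK could not rewind the request body; treat as transient and retry.
        lastFailure = e;
        Thread.sleep(SLEEP_MILLIS * attempt);   // crude linear backoff
      }
    }
    throw lastFailure;   // out of attempts: surface the last failure
  }
}
{code}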
> S3a: Failed to reset the request input stream/make S3A uploadPart() retriable
> ------------------------------------------------------------------------------
>
>                 Key: HADOOP-14070
>                 URL: https://issues.apache.org/jira/browse/HADOOP-14070
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>    Affects Versions: 3.0.0-alpha2
>            Reporter: Seth Fitzsimmons
>            Priority: Major
>
> {code}
> Feb 07, 2017 8:05:46 AM com.google.common.util.concurrent.Futures$CombinedFuture setExceptionAndMaybeLog
> SEVERE: input future failed.
> com.amazonaws.ResetException: Failed to reset the request input stream; If the request involves an input stream, the maximum stream buffer size can be configured via request.getRequestClientOptions().setReadLimit(int)
>   at com.amazonaws.http.AmazonHttpClient$RequestExecutor.resetRequestInputStream(AmazonHttpClient.java:1221)
>   at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeOneRequest(AmazonHttpClient.java:1042)
>   at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:948)
>   at com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:661)
>   at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:635)
>   at com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:618)
>   at com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$300(AmazonHttpClient.java:586)
>   at com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(AmazonHttpClient.java:573)
>   at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:445)
>   at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:4041)
>   at com.amazonaws.services.s3.AmazonS3Client.doUploadPart(AmazonS3Client.java:3041)
>   at com.amazonaws.services.s3.AmazonS3Client.uploadPart(AmazonS3Client.java:3026)
>   at org.apache.hadoop.fs.s3a.S3AFileSystem.uploadPart(S3AFileSystem.java:1114)
>   at org.apache.hadoop.fs.s3a.S3ABlockOutputStream$MultiPartUpload$1.call(S3ABlockOutputStream.java:501)
>   at org.apache.hadoop.fs.s3a.S3ABlockOutputStream$MultiPartUpload$1.call(S3ABlockOutputStream.java:492)
>   at com.amazonaws.http.AmazonHttpClient$RequestExecutor.resetRequestInputStream(AmazonHttpClient.java:1219)
>   at org.apache.hadoop.fs.s3a.SemaphoredDelegatingExecutor$CallableWithPermitRelease.call(SemaphoredDelegatingExecutor.java:222)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: java.io.IOException: Resetting to invalid mark
>   at java.io.BufferedInputStream.reset(BufferedInputStream.java:448)
>   at com.amazonaws.internal.SdkBufferedInputStream.reset(SdkBufferedInputStream.java:106)
>   at com.amazonaws.internal.SdkFilterInputStream.reset(SdkFilterInputStream.java:102)
>   at com.amazonaws.event.ProgressInputStream.reset(ProgressInputStream.java:169)
>   at com.amazonaws.internal.SdkFilterInputStream.reset(SdkFilterInputStream.java:102)
>   at org.apache.hadoop.fs.s3a.SemaphoredDelegatingExecutor$CallableWithPermitRelease.call(SemaphoredDelegatingExecutor.java:222)
>   ... 20 more
> 2017-02-07 08:05:46 WARN S3AInstrumentation:777 - Closing output stream statistics while data is still marked as pending upload in OutputStreamStatistics{blocksSubmitted=519, blocksInQueue=0, blocksActive=1, blockUploadsCompleted=518, blockUploadsFailed=2, bytesPendingUpload=82528300, bytesUploaded=54316236800, blocksAllocated=519, blocksReleased=519, blocksActivelyAllocated=0, exceptionsInMultipartFinalize=0, transferDuration=2637812 ms, queueDuration=839 ms, averageQueueTime=1 ms, totalUploadDuration=2638651 ms, effectiveBandwidth=2.05848506680118E7 bytes/s}
> Exception in thread "main" org.apache.hadoop.fs.s3a.AWSClientIOException: Multi-part upload with id 'uDonLgtsyeToSmhyZuNb7YrubCDiyXCCQy4mdVc5ZmYWPPHyZ3H3ZlFZzKktaPUiYb7uT4.oM.lcyoazHF7W8pK4xWmXV4RWmIYGYYhN6m25nWRrBEE9DcJHcgIhFD8xd7EKIjijEd1k4S5JY1HQvA--' to 2017/history-170130.orc on 2017/history-170130.orc: com.amazonaws.ResetException: Failed to reset the request input stream; If the request involves an input stream, the maximum stream buffer size can be
[jira] [Commented] (HADOOP-14070) S3a: Failed to reset the request input stream
[ https://issues.apache.org/jira/browse/HADOOP-14070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16522551#comment-16522551 ]

Seth Fitzsimmons commented on HADOOP-14070:
--------------------------------------------

I did not set fs.s3a.fast.upload.buffer explicitly.
[jira] [Commented] (HADOOP-14070) S3a: Failed to reset the request input stream
[ https://issues.apache.org/jira/browse/HADOOP-14070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16522437#comment-16522437 ]

Steve Loughran commented on HADOOP-14070:
------------------------------------------

Looking at this again, tracing the code paths in the IDE:

* When disk is being used as the buffer for block upload, we pass in the File reference; the expectation is that the AWS SDK does mark/reset against it, reopening the file if needed. We explicitly moved to passing in the file reference for better recovery.
* When a byte array or bytebuffer is used to buffer blocks, we pass in an associated stream which has no limit on where it can reset...it's just an offset into an array.

The stack trace here doesn't show us what is being used in the S3A connector, just that the AWS buffering couldn't reset its buffering wrapper because it had gone too far. And why was that {{SdkBufferedInputStream}} being used at all? That's the interesting question.

# Tracing things through the IDE, it would only seem to happen if {{InputStream.markSupported() == false}}.
# The byte array and our own block array output streams do return true, and do let you reset to anywhere.
# So either that wasn't picked up and they were wrapped anyway, or this was a file source (it's the default, after all) and somehow that got wrapped by a buffer. (A small demonstration of the difference is sketched at the end of this comment.)

[~mojodna] it's been a while, but do you remember if you explicitly set {{fs.s3a.fast.upload.buffer}} to buffer via byte array ("array") or byte buffer ("bytebuffer")? If not, something isn't right with what the SDK claims to do for wrapping file sources and handling failures there.

W.r.t. actions: if the bug is in the SDK, that's the place for the fix. It's happening in the transfer manager threads and only surfaces for us in the waitForUploads() code; there's no way we could wrap and fix this ourselves.
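A self-contained illustration (plain JDK, not S3A code) of the difference traced above: a byte-array-backed stream reports {{markSupported() == true}} and can reset to any position, while a {{BufferedInputStream}} can only rewind within the read limit passed to {{mark()}}; once a read runs past that limit, {{reset()}} fails with the same "Resetting to invalid mark" error seen in the stack trace.

{code:java}
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

public class MarkResetDemo {
  public static void main(String[] args) throws IOException {
    byte[] data = new byte[1 << 20];   // 1 MB of zeros stands in for an upload block

    // Case 1: byte-array source. markSupported() is true and reset() always works,
    // no matter how much was read since mark().
    ByteArrayInputStream arrayStream = new ByteArrayInputStream(data);
    arrayStream.mark(0);                // read-ahead limit is ignored by this class
    arrayStream.read(new byte[data.length]);
    arrayStream.reset();                // fine: back to offset 0
    System.out.println("array stream reset OK, markSupported=" + arrayStream.markSupported());

    // Case 2: a buffered wrapper with a small read limit. Reading past the limit
    // invalidates the mark, so reset() throws IOException("Resetting to invalid mark").
    InputStream wrapped = new BufferedInputStream(new ByteArrayInputStream(data), 128);
    wrapped.mark(128);                  // only 128 bytes of rewind is guaranteed
    wrapped.read(new byte[4096]);       // read well past the mark limit
    try {
      wrapped.reset();
    } catch (IOException e) {
      System.out.println("buffered stream reset failed: " + e.getMessage());
    }
  }
}
{code}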
[jira] [Commented] (HADOOP-14070) S3a: Failed to reset the request input stream
[ https://issues.apache.org/jira/browse/HADOOP-14070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16369474#comment-16369474 ]

Steve Loughran commented on HADOOP-14070:
------------------------------------------

Reviewing. Sorry, I'd missed this before.

# 80 MB parts of a ~40 GB upload; 519 blocks uploaded.
# The write of one block failed: SdkBufferedInputStream/BufferedInputStream couldn't return to a previous mark, so it reported a failure to the progress listener and then threw {{com.amazonaws.ResetException}}, which extends {{SdkClientException}}.
# The {{BlockOutputStream.close()}} call got a failure from the upload, so it failed too.

I actually think we are close to being able to handle this, as that part upload goes through {{WriteOperationsHelper.uploadPart()}}, which had the retry handler. And with HADOOP-14028, which went in after this was filed, we are handing the File directly to AWS S3, so the mark/reset should work. But that doesn't have any explicit code for {{com.amazonaws.ResetException}}, which is what we need to look at.

Adding explicit handling for this in the S3ARetryPolicy would guarantee that in any retryable POST/PUT operation this failure would trigger a retry (see the sketch below). We can do this without needing to wait and see whether the problem is already completely fixed: it's a one-line change to add a new idempotent retry policy, and it will isolate us from any AWS client regressions.

Targeting Hadoop 3.2.
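A hedged sketch of that one-line idea, using the stock Hadoop {{RetryPolicies}} helpers: map {{com.amazonaws.ResetException}} to a retrying policy in an exception-to-policy table. The class name, retry count and sleep below are illustrative; the real S3ARetryPolicy takes its limits from configuration and covers many more exception types.

{code:java}
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.TimeUnit;

import com.amazonaws.ResetException;
import org.apache.hadoop.io.retry.RetryPolicies;
import org.apache.hadoop.io.retry.RetryPolicy;

public class ResetAwareRetryPolicy {

  /** Build a policy that retries ResetException a few times and fails everything else fast. */
  public static RetryPolicy build() {
    // Retry up to 3 times with a 500 ms pause; values are illustrative only.
    RetryPolicy retryTransient =
        RetryPolicies.retryUpToMaximumCountWithFixedSleep(3, 500, TimeUnit.MILLISECONDS);

    Map<Class<? extends Exception>, RetryPolicy> policyMap = new HashMap<>();
    // The key addition: a failure to rewind the request stream is treated as
    // retriable, because re-sending the same part is idempotent.
    policyMap.put(ResetException.class, retryTransient);

    // Anything not listed falls back to a single attempt.
    return RetryPolicies.retryByException(RetryPolicies.TRY_ONCE_THEN_FAIL, policyMap);
  }
}
{code}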
[jira] [Commented] (HADOOP-14070) S3a: Failed to reset the request input stream
[ https://issues.apache.org/jira/browse/HADOOP-14070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16369423#comment-16369423 ]

Steve Loughran commented on HADOOP-14070:
------------------------------------------

Looks like the stream reset logic isn't going back to the filesystem, which can reset to anywhere it likes in the block.
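For illustration only (not SDK or S3A code), this is roughly what "going back to the filesystem" buys: a file-backed part source can reopen the file and seek back to the part's start offset as often as needed, with no in-memory mark limit to invalidate. The class and method names here are hypothetical.

{code:java}
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

public class FileBackedPartSource {
  private final File file;
  private final long partOffset;   // where this part begins in the file

  public FileBackedPartSource(File file, long partOffset) {
    this.file = file;
    this.partOffset = partOffset;
  }

  /** Open a fresh stream positioned at the start of the part; callable any number of times. */
  public InputStream openPart() throws IOException {
    FileInputStream in = new FileInputStream(file);
    long toSkip = partOffset;
    while (toSkip > 0) {
      long skipped = in.skip(toSkip);
      if (skipped <= 0) {
        in.close();
        throw new IOException("Could not seek to offset " + partOffset + " in " + file);
      }
      toSkip -= skipped;
    }
    return in;   // the caller reads at most one part length from here
  }
}
{code}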