[ https://issues.apache.org/jira/browse/HADOOP-14012?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15832197#comment-15832197 ]
Steve Loughran commented on HADOOP-14012:
-----------------------------------------
Triggered in an async Spark streaming run when S3 was shut down underneath it.
{code}
2017-01-20 15:49:08,992 [JobScheduler] INFO scheduler.JobScheduler (Logging.scala:logInfo(54)) - Finished job streaming job 1484927337000 ms.0 from job set of time 1484927337000 ms
2017-01-20 15:49:08,993 [JobScheduler] INFO scheduler.JobScheduler (Logging.scala:logInfo(54)) - Total delay: 11.992 s for time 1484927337000 ms (execution: 0.000 s)
2017-01-20 15:49:08,997 [JobGenerator] WARN dstream.FileInputDStream (Logging.scala:logWarning(87)) - Error finding new files
org.apache.hadoop.fs.s3a.AWSClientIOException: get on s3a://hwdev-steve-new/spark-cloud/S3AStreamingSuite/streaming/streaming: com.amazonaws.AbortedException: :
    at org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:128)
    at org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:101)
    at org.apache.hadoop.fs.s3a.s3guard.DynamoDBMetadataStore.get(DynamoDBMetadataStore.java:361)
    at org.apache.hadoop.fs.s3a.S3AFileSystem.getFileStatus(S3AFileSystem.java:1617)
    at org.apache.hadoop.fs.s3a.S3AFileSystem.innerListStatus(S3AFileSystem.java:1436)
    at org.apache.hadoop.fs.s3a.S3AFileSystem.listStatus(S3AFileSystem.java:1412)
    at org.apache.hadoop.fs.Globber.listStatus(Globber.java:76)
    at org.apache.hadoop.fs.Globber.doGlob(Globber.java:234)
    at org.apache.hadoop.fs.Globber.glob(Globber.java:148)
    at org.apache.hadoop.fs.FileSystem.globStatus(FileSystem.java:1978)
    at org.apache.hadoop.fs.s3a.S3AFileSystem.globStatus(S3AFileSystem.java:2130)
    at org.apache.spark.streaming.dstream.FileInputDStream.findNewFiles(FileInputDStream.scala:205)
    at org.apache.spark.streaming.dstream.FileInputDStream.compute(FileInputDStream.scala:149)
    at org.apache.spark.streaming.dstream.DStream$$anonfun$getOrCompute$1$$anonfun$1$$anonfun$apply$7.apply(DStream.scala:342)
    at org.apache.spark.streaming.dstream.DStream$$anonfun$getOrCompute$1$$anonfun$1$$anonfun$apply$7.apply(DStream.scala:342)
    at scala.util.DynamicVariable.withValue(DynamicVariable.scala:58)
    at org.apache.spark.streaming.dstream.DStream$$anonfun$getOrCompute$1$$anonfun$1.apply(DStream.scala:341)
    at org.apache.spark.streaming.dstream.DStream$$anonfun$getOrCompute$1$$anonfun$1.apply(DStream.scala:341)
    at org.apache.spark.streaming.dstream.DStream.createRDDWithLocalProperties(DStream.scala:416)
    at org.apache.spark.streaming.dstream.DStream$$anonfun$getOrCompute$1.apply(DStream.scala:336)
    at org.apache.spark.streaming.dstream.DStream$$anonfun$getOrCompute$1.apply(DStream.scala:334)
    at scala.Option.orElse(Option.scala:289)
    at org.apache.spark.streaming.dstream.DStream.getOrCompute(DStream.scala:331)
    at org.apache.spark.streaming.dstream.MappedDStream.compute(MappedDStream.scala:36)
    at org.apache.spark.streaming.dstream.DStream$$anonfun$getOrCompute$1$$anonfun$1$$anonfun$apply$7.apply(DStream.scala:342)
    at org.apache.spark.streaming.dstream.DStream$$anonfun$getOrCompute$1$$anonfun$1$$anonfun$apply$7.apply(DStream.scala:342)
    at scala.util.DynamicVariable.withValue(DynamicVariable.scala:58)
    at org.apache.spark.streaming.dstream.DStream$$anonfun$getOrCompute$1$$anonfun$1.apply(DStream.scala:341)
    at org.apache.spark.streaming.dstream.DStream$$anonfun$getOrCompute$1$$anonfun$1.apply(DStream.scala:341)
    at org.apache.spark.streaming.dstream.DStream.createRDDWithLocalProperties(DStream.scala:416)
    at org.apache.spark.streaming.dstream.DStream$$anonfun$getOrCompute$1.apply(DStream.scala:336)
    at org.apache.spark.streaming.dstream.DStream$$anonfun$getOrCompute$1.apply(DStream.scala:334)
    at scala.Option.orElse(Option.scala:289)
    at org.apache.spark.streaming.dstream.DStream.getOrCompute(DStream.scala:331)
    at org.apache.spark.streaming.dstream.FilteredDStream.compute(FilteredDStream.scala:36)
    at org.apache.spark.streaming.dstream.DStream$$anonfun$getOrCompute$1$$anonfun$1$$anonfun$apply$7.apply(DStream.scala:342)
    at org.apache.spark.streaming.dstream.DStream$$anonfun$getOrCompute$1$$anonfun$1$$anonfun$apply$7.apply(DStream.scala:342)
    at scala.util.DynamicVariable.withValue(DynamicVariable.scala:58)
    at org.apache.spark.streaming.dstream.DStream$$anonfun$getOrCompute$1$$anonfun$1.apply(DStream.scala:341)
    at org.apache.spark.streaming.dstream.DStream$$anonfun$getOrCompute$1$$anonfun$1.apply(DStream.scala:341)
    at org.apache.spark.streaming.dstream.DStream.createRDDWithLocalProperties(DStream.scala:416)
    at org.apache.spark.streaming.dstream.DStream$$anonfun$getOrCompute$1.apply(DStream.scala:336)
    at org.apache.spark.streaming.dstream.DStream$$anonfun$getOrCompute$1.apply(DStream.scala:334)
    at scala.Option.orElse(Option.scala:289)
    at org.apache.spark.streaming.dstream.DStream.getOrCompute(DStream.scala:331)
    at org.apache.spark.streaming.dstream.MappedDStream.compute(MappedDStream.scala:36)
    at org.apache.spark.streaming.dstream.DStream$$anonfun$getOrCompute$1$$anonfun$1$$anonfun$apply$7.apply(DStream.scala:342)
    at org.apache.spark.streaming.dstream.DStream$$anonfun$getOrCompute$1$$anonfun$1$$anonfun$apply$7.apply(DStream.scala:342)
    at scala.util.DynamicVariable.withValue(DynamicVariable.scala:58)
    at org.apache.spark.streaming.dstream.DStream$$anonfun$getOrCompute$1$$anonfun$1.apply(DStream.scala:341)
    at org.apache.spark.streaming.dstream.DStream$$anonfun$getOrCompute$1$$anonfun$1.apply(DStream.scala:341)
    at org.apache.spark.streaming.dstream.DStream.createRDDWithLocalProperties(DStream.scala:416)
    at org.apache.spark.streaming.dstream.DStream$$anonfun$getOrCompute$1.apply(DStream.scala:336)
    at org.apache.spark.streaming.dstream.DStream$$anonfun$getOrCompute$1.apply(DStream.scala:334)
    at scala.Option.orElse(Option.scala:289)
    at org.apache.spark.streaming.dstream.DStream.getOrCompute(DStream.scala:331)
    at org.apache.spark.streaming.dstream.ForEachDStream.generateJob(ForEachDStream.scala:48)
    at org.apache.spark.streaming.DStreamGraph$$anonfun$1.apply(DStreamGraph.scala:122)
    at org.apache.spark.streaming.DStreamGraph$$anonfun$1.apply(DStreamGraph.scala:121)
    at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:241)
    at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:241)
    at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
    at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
    at scala.collection.TraversableLike$class.flatMap(TraversableLike.scala:241)
    at scala.collection.AbstractTraversable.flatMap(Traversable.scala:104)
    at org.apache.spark.streaming.DStreamGraph.generateJobs(DStreamGraph.scala:121)
    at org.apache.spark.streaming.scheduler.JobGenerator$$anonfun$3.apply(JobGenerator.scala:249)
    at org.apache.spark.streaming.scheduler.JobGenerator$$anonfun$3.apply(JobGenerator.scala:247)
    at scala.util.Try$.apply(Try.scala:192)
    at org.apache.spark.streaming.scheduler.JobGenerator.generateJobs(JobGenerator.scala:247)
    at org.apache.spark.streaming.scheduler.JobGenerator.org$apache$spark$streaming$scheduler$JobGenerator$$processEvent(JobGenerator.scala:183)
    at org.apache.spark.streaming.scheduler.JobGenerator$$anon$1.onReceive(JobGenerator.scala:89)
    at org.apache.spark.streaming.scheduler.JobGenerator$$anon$1.onReceive(JobGenerator.scala:88)
    at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
Caused by: com.amazonaws.AbortedException:
    at com.amazonaws.internal.SdkFilterInputStream.abortIfNeeded(SdkFilterInputStream.java:51)
    at com.amazonaws.internal.SdkFilterInputStream.markSupported(SdkFilterInputStream.java:107)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:912)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:661)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:635)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:618)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$300(AmazonHttpClient.java:586)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(AmazonHttpClient.java:573)
    at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:445)
    at com.amazonaws.services.dynamodbv2.AmazonDynamoDBClient.doInvoke(AmazonDynamoDBClient.java:1722)
    at com.amazonaws.services.dynamodbv2.AmazonDynamoDBClient.invoke(AmazonDynamoDBClient.java:1698)
    at com.amazonaws.services.dynamodbv2.AmazonDynamoDBClient.getItem(AmazonDynamoDBClient.java:1158)
    at com.amazonaws.services.dynamodbv2.document.internal.GetItemImpl.doLoadItem(GetItemImpl.java:77)
    at com.amazonaws.services.dynamodbv2.document.internal.GetItemImpl.getItem(GetItemImpl.java:66)
    at com.amazonaws.services.dynamodbv2.document.Table.getItem(Table.java:594)
    at org.apache.hadoop.fs.s3a.s3guard.DynamoDBMetadataStore.get(DynamoDBMetadataStore.java:340)
    ... 71 more
2017-01-20 15:49:09,001 [JobGenerator] INFO dstream.FileInputDStream (Logging.scala:logInfo(54)) - New files at time 1484927338000 ms:
2017-01-20 15:49:09,004 [JobGenerator] INFO scheduler.JobScheduler (Logging.scala:logInfo(54)) - Added jobs for time 1484927338000 ms
2017-01-20 15:49:09,004 [JobScheduler] INFO scheduler.JobScheduler (Logging.scala:logInfo(54)) - Starting job streaming job 1484927338000 ms.0 from job set of time 1484927338000 ms
{code}
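Note the abort is wrapped as a generic {{AWSClientIOException}} with an empty message, which tells the caller nothing about what happened. A minimal sketch of the kind of DynamoDB-side mapping this issue is asking for, assuming we translate {{com.amazonaws.AbortedException}} into {{InterruptedIOException}} ({{translateDynamoException}} is a hypothetical helper for illustration, not the committed patch):
{code}
import java.io.IOException;
import java.io.InterruptedIOException;

import com.amazonaws.AbortedException;
import com.amazonaws.AmazonClientException;

public class DynamoTranslationSketch {

  /**
   * Sketch: map DynamoDB client exceptions to meaningful IOEs.
   * Shape is illustrative only.
   */
  public static IOException translateDynamoException(String operation,
      String path, AmazonClientException e) {
    if (e instanceof AbortedException) {
      // The SDK raises AbortedException when a request is aborted or the
      // calling thread is interrupted; InterruptedIOException is the closest
      // java.io equivalent, and callers can still catch it as an IOE.
      return (IOException) new InterruptedIOException(
          operation + " on " + path + ": " + e).initCause(e);
    }
    // Anything unrecognised: wrap as a plain IOException so that no raw
    // AmazonClientException escapes through the filesystem API.
    return new IOException(operation + " on " + path + ": " + e, e);
  }
}
{code}
Mapping to {{InterruptedIOException}} keeps the contract that everything escaping the FS API is an IOE, while leaving interrupts distinguishable for callers that care.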
> make sure dynamo exceptions are handled in translateException
> -------------------------------------------------------------
>
> Key: HADOOP-14012
> URL: https://issues.apache.org/jira/browse/HADOOP-14012
> Project: Hadoop Common
> Issue Type: Sub-task
> Components: fs/s3
> Affects Versions: HADOOP-13345
> Reporter: Steve Loughran
>
> {{translateException}} is good at translating S3 exceptions into meaningful IOEs. We need to make sure that DynamoDB exceptions are also mapped appropriately.
> Action plan
> # break things (see the reproduction sketch below)
> # collect the stack traces
> # translate them where possible
> # discuss in troubleshooting section of s3guard
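For step 1 of that action plan, one way to provoke the {{AbortedException}} above without actually shutting S3 down is to interrupt a thread mid-request: the SDK's {{abortIfNeeded()}} (visible in the caused-by trace) converts the interrupt into an {{AbortedException}}. A minimal sketch, assuming an S3Guard-enabled s3a client on the classpath; the bucket path is reused from the log above and the class name is made up:
{code}
import java.io.InterruptedIOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class AbortedExceptionRepro {
  public static void main(String[] args) throws Exception {
    // Any S3Guard-enabled s3a path would do here.
    Path path = new Path("s3a://hwdev-steve-new/spark-cloud/");
    FileSystem fs = FileSystem.get(path.toUri(), new Configuration());
    Thread worker = new Thread(() -> {
      try {
        // Hammer metadata lookups so the interrupt lands mid-request,
        // inside the DynamoDB client.
        while (!Thread.currentThread().isInterrupted()) {
          fs.getFileStatus(path);
        }
      } catch (InterruptedIOException translated) {
        System.out.println("translated to a meaningful IOE: " + translated);
      } catch (Exception raw) {
        // With the current code this arrives as a generic AWSClientIOException
        // wrapping com.amazonaws.AbortedException, message-free.
        System.out.println("not usefully translated: " + raw);
      }
    });
    worker.start();
    Thread.sleep(500);   // give a request time to get in flight
    worker.interrupt();  // abort it mid-request
    worker.join();
    fs.close();
  }
}
{code}
Stack traces collected this way (step 2) would then feed the translation table and the s3guard troubleshooting docs (steps 3 and 4).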