[ 
https://issues.apache.org/jira/browse/BEAM-11016?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Beam JIRA Bot updated BEAM-11016:
---------------------------------
    Labels: stale-assigned  (was: )

> SqsIO exception when moving to AWS2 SDK
> ---------------------------------------
>
>                 Key: BEAM-11016
>                 URL: https://issues.apache.org/jira/browse/BEAM-11016
>             Project: Beam
>          Issue Type: Bug
>          Components: io-java-aws
>    Affects Versions: 2.24.0
>            Reporter: Alexey Romanenko
>            Assignee: Alexey Romanenko
>            Priority: P2
>              Labels: stale-assigned
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> [email protected] reported in [email 
> thread|https://lists.apache.org/thread.html/r70b5561fde7d9c750475e44d1b8049709dcbcbe43f19233814a2b97c%40%3Cuser.beam.apache.org%3E]:
>  
> I've been attempting to migrate to the AWS2 SDK (version 2.24.0) on Apache 
> Spark 2.4.7. However, when switching over to the new API and running it I 
> keep getting the following exceptions: 
> {code}
>  2020-09-30 20:38:32,428 ERROR streaming.StreamingContext: Error starting the 
> context, marking it as stopped org.apache.spark.SparkException: Job aborted 
> due to stage failure: Task 0 in stage 2.0 failed 4 times, most recent 
> failure: Lost task 0.3 in stage 2.0 (TID 6, 10.42.180.18, executor 0): 
> java.lang.NullPointerException at 
> org.apache.beam.sdk.io.aws2.sqs.SqsUnboundedSource.lambda$new$e39e7721$1(SqsUnboundedSource.java:41)
>  
> at 
> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Suppliers$MemoizingSupplier.get(Suppliers.java:129)
>  
> at 
> org.apache.beam.sdk.io.aws2.sqs.SqsUnboundedSource.getSqs(SqsUnboundedSource.java:74)
>  
> at 
> org.apache.beam.sdk.io.aws2.sqs.SqsUnboundedReader.pull(SqsUnboundedReader.java:165)
>  
> at 
> org.apache.beam.sdk.io.aws2.sqs.SqsUnboundedReader.advance(SqsUnboundedReader.java:108)
>  
> at 
> org.apache.beam.runners.spark.io.MicrobatchSource$Reader.advanceWithBackoff(MicrobatchSource.java:246)
>  
> at 
> org.apache.beam.runners.spark.io.MicrobatchSource$Reader.start(MicrobatchSource.java:224)
>  
> at 
> org.apache.beam.runners.spark.stateful.StateSpecFunctions$1.apply(StateSpecFunctions.java:168)
>  
> at 
> org.apache.beam.runners.spark.stateful.StateSpecFunctions$1.apply(StateSpecFunctions.java:107)
>  
> at org.apache.spark.streaming.StateSpec$$anonfun$1.apply(StateSpec.scala:181)
>  
> at org.apache.spark.streaming.StateSpec$$anonfun$1.apply(StateSpec.scala:180)
>  
> at 
> org.apache.spark.streaming.rdd.MapWithStateRDDRecord$$anonfun$updateRecordWithData$1.apply(MapWithStateRDD.scala:57)
>  {code}
>  
> Examining the source of SqsUnboundedSource reveals a lambda where it's trying 
> to chain a few references:
> {code}
>     read.sqsClientProvider().getSqsClient()
> {code}
>  
> Which is odd as I explicitly set the client provider on the read transform.  
> This was working well enough with the old SqsIO API to connect and process 
> messages off the queue.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

Reply via email to