[
https://issues.apache.org/jira/browse/SPARK-24273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16483908#comment-16483908
]
Jami Malikzade commented on SPARK-24273:
----------------------------------------
[~kiszk]
I went deeper and found more:
This way it works and creates an empty RDD, since the filter returns 0 rows:
val df = spark.read
  .option("header", "true")
  .option("sep", ",")
  .schema(testschema)
  .csv("s3a://phub-1526909295-81/salary.csv")
  .filter('salary > 300)
  .withColumn("month", when('name === "Smith", "6").otherwise("3"))
df.checkpoint()
df.show()
This way it fails on df.show(), and a non-empty file is created (even though the filter returns 0 rows):
val df = spark.read
  .option("header", "true")
  .option("sep", ",")
  .schema(testschema)
  .csv("s3a://phub-1526909295-81/salary.csv")
  .filter('salary > 300)
  .withColumn("month", when('name === "Smith", "6").otherwise("3"))
  .checkpoint()
df.show()
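A side note that may explain the difference between the two snippets (my reading of the Dataset API, not verified as the root cause of this bug): Dataset.checkpoint() is not an in-place operation; it writes the data out and returns a new Dataset backed by the checkpoint files, leaving the original untouched. So the first snippet discards the checkpointed Dataset and never reads the (empty) checkpoint files back, while the second snippet's show() reads them, which is where the 416 InvalidRange would surface. A minimal local sketch of that distinction (the object name CheckpointSketch, the /tmp checkpoint dir, and the toy data are mine; with a local checkpoint dir this runs clean, the failure above was observed with an S3-backed path):

```scala
import org.apache.spark.sql.SparkSession

object CheckpointSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("checkpoint-sketch")
      .getOrCreate()
    // Reliable (non-local) checkpoints require a checkpoint directory.
    spark.sparkContext.setCheckpointDir("/tmp/checkpoint-sketch")
    import spark.implicits._

    // Toy frame whose filter returns 0 rows, mirroring the report above.
    val df = Seq(("Smith", 100), ("Jones", 200))
      .toDF("name", "salary")
      .filter($"salary" > 300)

    // checkpoint() is NOT in-place: it materializes df to the checkpoint
    // dir and returns a NEW Dataset; df itself is unchanged.
    val cp = df.checkpoint()

    df.show() // reads via df's original lineage (what the first snippet does)
    cp.show() // reads back the checkpoint files (what the second snippet does)

    spark.stop()
  }
}
```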
> Failure while using .checkpoint method
> --------------------------------------
>
> Key: SPARK-24273
> URL: https://issues.apache.org/jira/browse/SPARK-24273
> Project: Spark
> Issue Type: Bug
> Components: Spark Shell
> Affects Versions: 2.3.0
> Reporter: Jami Malikzade
> Priority: Major
>
> We are getting following error:
> com.amazonaws.services.s3.model.AmazonS3Exception: Status Code: 416, AWS
> Service: Amazon S3, AWS Request ID:
> tx000000000000000014126-005ae9bfd9-9ed9ac2-default, AWS Error Code:
> InvalidRange, AWS Error Message: null, S3 Extended Request ID:
> 9ed9ac2-default-default"
> when we use the checkpoint method as below.
> val streamBucketDF = streamPacketDeltaDF
> .filter('timeDelta > maxGap && 'timeDelta <= 30000)
> .withColumn("bucket", when('timeDelta <= mediumGap, "medium")
> .otherwise("large")
> )
> .checkpoint()
> Do you have an idea how to prevent the invalid Range header from being sent,
> or how this can be worked around or fixed?
> Thanks.
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]