[jira] [Commented] (SPARK-17945) Writing to S3 should allow setting object metadata
[ https://issues.apache.org/jira/browse/SPARK-17945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15715396#comment-15715396 ]

Jeff Schobelock commented on SPARK-17945:
-----------------------------------------

Just wanted to add some more justification to this (not sure anyone will look). The reason I wanted this is that, once I have written out from Spark, setting the object metadata afterwards via the Java SDK APIs requires a copy of the object. I was hoping for a way to do this in one step that doesn't use the AWS CLI via shell commands from within Spark, or other "hacky" things like that.

> Writing to S3 should allow setting object metadata
> --------------------------------------------------
>
>                 Key: SPARK-17945
>                 URL: https://issues.apache.org/jira/browse/SPARK-17945
>             Project: Spark
>          Issue Type: New Feature
>          Components: Spark Core
>    Affects Versions: 2.0.1
>            Reporter: Jeff Schobelock
>            Priority: Minor
>
> I can't find any possible way to use Spark to write to S3 and set user object
> metadata. This seems like such a simple thing that I feel I must be missing
> somewhere how to do it, but I have yet to find anything.
> I don't know what all work adding this would entail. My idea would be that
> there is something like:
> rdd.saveAsTextFile(s3://testbucket/file).withMetadata(Map> data).

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
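To illustrate the two-step workaround described in the comment above (write from Spark, then copy each object onto itself to replace its metadata), here is a minimal sketch in Python rather than the Java SDK. It assumes a boto3-style S3 client is passed in; the helper names `build_copy_request` and `set_object_metadata` are hypothetical, and nothing here is a Spark API.

```python
from typing import Any, Dict


def build_copy_request(bucket: str, key: str, metadata: Dict[str, str]) -> Dict[str, Any]:
    """Build keyword arguments for a self-copy that replaces user metadata."""
    return {
        "Bucket": bucket,
        "Key": key,
        # Source and destination are the same object: the metadata is
        # changed by copying the object onto itself.
        "CopySource": {"Bucket": bucket, "Key": key},
        "Metadata": dict(metadata),
        # REPLACE makes S3 use the metadata given here instead of carrying
        # over the metadata already stored on the source object.
        "MetadataDirective": "REPLACE",
    }


def set_object_metadata(s3_client: Any, bucket: str, key: str,
                        metadata: Dict[str, str]) -> Any:
    """Rewrite the user metadata of an existing object via a self-copy.

    `s3_client` is assumed to expose a boto3-style `copy_object` method.
    """
    return s3_client.copy_object(**build_copy_request(bucket, key, metadata))
```

With boto3 installed, this could be called as `set_object_metadata(boto3.client("s3"), "testbucket", "file/part-00000", {"delimiter": ","})`, where the bucket, key, and metadata values are placeholders. Because `saveAsTextFile` writes a directory of part files, the call would have to be repeated for every object Spark produced, which is the extra copy step the comment is trying to avoid.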
[jira] [Commented] (SPARK-17945) Writing to S3 should allow setting object metadata
[ https://issues.apache.org/jira/browse/SPARK-17945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15578064#comment-15578064 ]

Jeff Schobelock commented on SPARK-17945:
-----------------------------------------

That's true enough. The use case on my end is that we use S3 object metadata to get things like file delimiters, etc., when we process files.
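As a rough sketch of the use case described above, a downstream job might look up a file's delimiter from the object's user metadata before parsing it. This assumes a boto3-style S3 client; the metadata key `delimiter`, the default value, and the helper name `read_delimiter` are all hypothetical.

```python
from typing import Any


def read_delimiter(s3_client: Any, bucket: str, key: str, default: str = ",") -> str:
    """Return the delimiter stored in an object's user metadata, or a default.

    `s3_client` is assumed to expose a boto3-style `head_object` method.
    """
    # head_object returns the object's headers without downloading its body.
    response = s3_client.head_object(Bucket=bucket, Key=key)
    # User-defined metadata is returned under the "Metadata" key.
    return response.get("Metadata", {}).get("delimiter", default)
```

The returned value could then be handed to whatever parser processes the file, so the delimiter travels with the object instead of being hard-coded in the job.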
[jira] [Commented] (SPARK-17945) Writing to S3 should allow setting object metadata
[ https://issues.apache.org/jira/browse/SPARK-17945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15577593#comment-15577593 ]

Sean Owen commented on SPARK-17945:
-----------------------------------

You can just set this with the S3 APIs directly? This borders on something that isn't worth Spark passing through just for S3, unless there's a common use case for it, and possibly for other FSes.