[ https://issues.apache.org/jira/browse/SPARK-18017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15643864#comment-15643864 ]
Steve Loughran commented on SPARK-18017:
----------------------------------------
you can check what's been picked up by grabbing a copy of the filesystem
instance and then logging the value returned by {{getDefaultBlockSize()}}.
If you switch to S3A, which you should be using anyway, calling {{toString()}}
on the FS instance is generally sufficient to dump the block size and lots of
other useful bits of information. Its relevant property is {{fs.s3a.block.size}}.
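
A rough sketch of that check, assuming a SparkSession named {{spark}} and a placeholder bucket {{s3a://my-bucket}} (the hadoop-aws connector and valid credentials must already be on the classpath):

{code:scala}
import java.net.URI
import org.apache.hadoop.fs.{FileSystem, Path}

// Hadoop configuration as Spark sees it, including any fs.s3a.* overrides
val hadoopConf = spark.sparkContext.hadoopConfiguration

// Grab the filesystem instance for the bucket
val fs = FileSystem.get(new URI("s3a://my-bucket/"), hadoopConf)

// Block size the connector actually picked up
println(fs.getDefaultBlockSize(new Path("s3a://my-bucket/data.csv")))

// On S3A, toString() dumps the block size and other useful settings
println(fs.toString)
{code}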
> Changing Hadoop parameter through
> sparkSession.sparkContext.hadoopConfiguration doesn't work
> --------------------------------------------------------------------------------------------
>
> Key: SPARK-18017
> URL: https://issues.apache.org/jira/browse/SPARK-18017
> Project: Spark
> Issue Type: Bug
> Affects Versions: 2.0.0
> Environment: Scala version 2.11.8; Java 1.8.0_91;
> com.databricks:spark-csv_2.11:1.2.0
> Reporter: Yuehua Zhang
>
> My Spark job tries to read csv files on S3. I need to control the number of
> partitions created, so I set the Hadoop parameter fs.s3n.block.size. However, it
> stopped working after we upgraded Spark from 1.6.1 to 2.0.0. Not sure if it is
> related to https://issues.apache.org/jira/browse/SPARK-15991.
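
For reference, a minimal sketch of the setup the reporter describes, assuming a SparkSession named {{spark}}; the bucket, path, and 64 MB value are placeholders:

{code:scala}
// Set the block size on the SparkContext's Hadoop configuration before reading
spark.sparkContext.hadoopConfiguration
  .setLong("fs.s3n.block.size", 64L * 1024 * 1024)

// Read the CSV files and check how many partitions were created
val df = spark.read.option("header", "true").csv("s3n://my-bucket/path/*.csv")
println(df.rdd.getNumPartitions)
{code}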