[
https://issues.apache.org/jira/browse/SPARK-6151?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14491881#comment-14491881
]
Littlestar commented on SPARK-6151:
-----------------------------------
>> The HDFS block size is set once when you first install Hadoop.
The block size can actually be overridden per file when the file is created:
FSDataOutputStream org.apache.hadoop.fs.FileSystem.create(Path f, boolean
overwrite, int bufferSize, short replication, long blockSize) throws IOException
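As a minimal sketch of what per-file block size selection looks like through that Hadoop FileSystem API (the path, buffer size, and 256 MB value below are illustrative assumptions, not values from this issue):

    import org.apache.hadoop.conf.Configuration
    import org.apache.hadoop.fs.{FileSystem, Path}

    object PerFileBlockSize {
      def main(args: Array[String]): Unit = {
        val fs = FileSystem.get(new Configuration())
        val path = new Path("/tmp/example.dat")    // hypothetical path
        val blockSize = 256L * 1024 * 1024         // assumed 256 MB, overriding the cluster default
        // The block size is fixed per file at create() time; it cannot be changed afterwards.
        val out = fs.create(path, true, 4096, fs.getDefaultReplication(path), blockSize)
        out.writeBytes("hello")
        out.close()
      }
    }

So the cluster-wide default is only a default; any writer that goes through create() can ask for a different block size for that file.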
> Control the HDFS block size when saving a SchemaRDD to a Parquet file with saveAsParquetFile
> --------------------------------------------------------------------------------------------
>
> Key: SPARK-6151
> URL: https://issues.apache.org/jira/browse/SPARK-6151
> Project: Spark
> Issue Type: Improvement
> Components: SQL
> Affects Versions: 1.2.1
> Reporter: Littlestar
> Priority: Trivial
>
> How can the HDFS block size be controlled when saving a SchemaRDD to a
> Parquet file with saveAsParquetFile? A Configuration option may be needed.
> Related questions from other users:
> http://apache-spark-user-list.1001560.n3.nabble.com/HDFS-block-size-for-parquet-output-tt21183.html
> http://qnalist.com/questions/5054892/spark-sql-parquet-and-impala
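As a sketch of what a configuration-based approach might look like in the Spark 1.2-era API (the property names dfs.blocksize and parquet.block.size, the 256 MB value, and the paths are assumptions here, not this issue's confirmed resolution):

    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.sql.SQLContext

    case class Record(id: Int, value: String)

    object ParquetBlockSizeSketch {
      def main(args: Array[String]): Unit = {
        val sc = new SparkContext(new SparkConf().setAppName("parquet-blocksize"))
        val sqlContext = new SQLContext(sc)
        import sqlContext.createSchemaRDD  // implicit RDD[Record] -> SchemaRDD (Spark 1.2 API)

        // Assumed knobs: dfs.blocksize sets the HDFS block size for files this
        // job writes; parquet.block.size sets the Parquet row group size.
        // Matching them keeps one row group per HDFS block.
        val blockSize = (256L * 1024 * 1024).toString
        sc.hadoopConfiguration.set("dfs.blocksize", blockSize)
        sc.hadoopConfiguration.set("parquet.block.size", blockSize)

        val records = sc.parallelize(1 to 100).map(i => Record(i, "v" + i))
        records.saveAsParquetFile("/tmp/records.parquet")  // hypothetical output path
      }
    }

Aligning the Parquet row group size with the HDFS block size matters for downstream readers such as Impala, which is the concern raised in the second linked thread.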