[ https://issues.apache.org/jira/browse/SPARK-6151?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14491502#comment-14491502 ]

Sree Vaddi commented on SPARK-6151:
-----------------------------------

[~cnstar9988]
The default HDFS block size is set when you first install Hadoop.
It is possible to change the HDFS block size in your Hadoop configuration and
restart Hadoop for the change to take effect (read the documentation and make
sure you are comfortable before you make this change).
Then you can run saveAsParquetFile(), which will use the new HDFS block
size.
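For reference, here is a minimal sketch of that write path from the Spark 1.2.x Scala API. The per-job override via sc.hadoopConfiguration is an assumption (HDFS applies the block size at file-creation time, so a job-level setting should also take effect); the app name, input path, and the parquet.block.size tuning are illustrative, not part of this issue:

{code:scala}
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

object ParquetBlockSizeSketch {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("parquet-block-size"))
    val sqlContext = new SQLContext(sc)

    // Cluster-wide: set dfs.blocksize in hdfs-site.xml and restart HDFS.
    // Per-job alternative (assumption: HDFS reads the block size at
    // file-creation time, so a job-level override also takes effect):
    sc.hadoopConfiguration.set("dfs.blocksize", (256L * 1024 * 1024).toString)

    // Parquet's row-group size; commonly tuned to match the HDFS block size:
    sc.hadoopConfiguration.setInt("parquet.block.size", 256 * 1024 * 1024)

    // jsonFile returns a SchemaRDD in 1.2.x; the path is a placeholder.
    val people = sqlContext.jsonFile("hdfs:///data/people.json")
    people.saveAsParquetFile("hdfs:///data/people.parquet")

    sc.stop()
  }
}
{code}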


> schemaRDD to parquetfile with saveAsParquetFile control the HDFS block size
> ---------------------------------------------------------------------------
>
>                 Key: SPARK-6151
>                 URL: https://issues.apache.org/jira/browse/SPARK-6151
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 1.2.1
>            Reporter: Littlestar
>            Priority: Trivial
>
> How can a SchemaRDD written to a Parquet file with saveAsParquetFile() control
> the HDFS block size? A Configuration option may be needed.
> Related questions from others:
> http://apache-spark-user-list.1001560.n3.nabble.com/HDFS-block-size-for-parquet-output-tt21183.html
> http://qnalist.com/questions/5054892/spark-sql-parquet-and-impala


