[jira] [Commented] (SPARK-2883) Spark Support for ORCFile format

Zhan Zhang (JIRA) Fri, 26 Jun 2015 14:58:30 -0700

    [ 
https://issues.apache.org/jira/browse/SPARK-2883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14603669#comment-14603669
 ]


Zhan Zhang commented on SPARK-2883:
-----------------------------------

[~philclaridge] 
I tried spark-shell in local mode and did not hit the OOM issue. 
This is what I ran:
    case class AllDataTypes(
        stringField: String,
        intField: Int,
        longField: Long,
        floatField: Float,
        doubleField: Double,
        shortField: Short,
        byteField: Byte,
        booleanField: Boolean)
    val sqlContext = new org.apache.spark.sql.hive.HiveContext(sc)
    import sqlContext.implicits._
    val range = (0 to 255)
    val data = sc.parallelize(range).map(x =>
      AllDataTypes(s"$x", x, x.toLong, x.toFloat, x.toDouble, x.toShort, x.toByte, x % 2 == 0))
    data.toDF().write.format("orc").save("orcRDD")
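
To confirm that the write succeeded, the saved data can be read back in the same spark-shell session. This is only a sketch: it assumes the sqlContext and the "orcRDD" output path from the snippet above, and that the write completed into a path that did not exist before.

```scala
// Run in the same spark-shell session, after the write above has finished.
// Assumes `sqlContext` is the HiveContext created earlier and that
// "orcRDD" is the directory passed to save().
val loaded = sqlContext.read.format("orc").load("orcRDD")

// The schema should list the eight fields of AllDataTypes.
loaded.printSchema()

// The input range 0 to 255 is inclusive, so 256 rows were written.
println(loaded.count())

// Look at a few of the rows whose booleanField is true (even values of x).
loaded.filter(loaded("booleanField") === true)
  .select("stringField", "intField")
  .show(5)
```

If the row count and schema match, the ORC write path works in local mode for this set of column types.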

If you hit OutOfMemoryError: PermGen space, you need to set -XX:PermSize=512M.
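
To check whether a PermGen flag actually reached the JVM that runs spark-shell, the standard java.lang.management API can be queried from inside the shell. This is a rough sketch, not part of the original test. The pool-name match on "Perm" is an assumption based on common HotSpot pool names, and on Java 8 or later there is no permanent generation, so the second part prints nothing there.

```scala
import java.lang.management.ManagementFactory
import scala.collection.JavaConverters._

// JVM options this process was started with.
val jvmArgs = ManagementFactory.getRuntimeMXBean.getInputArguments.asScala

// Show any PermGen-related flags (matches both PermSize and MaxPermSize).
jvmArgs.filter(_.contains("PermSize")).foreach(println)

// List memory pools whose name mentions "Perm" and print their usage.
// Assumption: the permanent generation pool has "Perm" in its name.
ManagementFactory.getMemoryPoolMXBeans.asScala
  .filter(_.getName.contains("Perm"))
  .foreach { pool =>
    val usage = pool.getUsage
    println(s"${pool.getName}: used=${usage.getUsed} max=${usage.getMax}")
  }
```

If no PermGen flag is printed on a Java 7 JVM, the option did not reach the driver process and the launch command needs to be checked.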

For further questions, as [~marmbrus] said, please ask on the 
user list.

> Spark Support for ORCFile format
> --------------------------------
>
>                 Key: SPARK-2883
>                 URL: https://issues.apache.org/jira/browse/SPARK-2883
>             Project: Spark
>          Issue Type: New Feature
>          Components: Input/Output, SQL
>            Reporter: Zhan Zhang
>            Assignee: Zhan Zhang
>            Priority: Critical
>             Fix For: 1.4.0
>
>         Attachments: 2014-09-12 07.05.24 pm Spark UI.png, 2014-09-12 07.07.19 
> pm jobtracker.png, orc.diff
>
>
> Verify the support of OrcInputFormat in Spark, fix any issues found, and add 
> documentation of its usage.



