[
https://issues.apache.org/jira/browse/SPARK-5813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14321748#comment-14321748
]
Florian Verhein commented on SPARK-5813:
----------------------------------------
No specific technical reason, especially with respect to Spark... It's more of an
attempt to stay in line with recommendations for Hadoop in production (relevant
since Hadoop is included in spark-ec2, and CDH seems to be favoured). For
example, CDH supports OracleJDK, Hortonworks didn't support OpenJDK before 1.7,
and OracleJDK still seems to be the favoured choice in production deployments,
e.g. http://wiki.apache.org/hadoop/HadoopJavaVersions.
I don't have first-hand data about how they compare performance-wise. I've
heard OracleJDK being preferred for Hadoop on that front, but I also found this:
http://www.slideshare.net/PrincipledTechnologies/big-data-technology-on-red-hat-enterprise-linux-openjdk-vs-oracle-jdk,
so perhaps performance is less of a reason these days?
Do you know of any performance analysis done with Spark, Tachyon on OpenJDK vs
OracleJDK?
In terms of difficulty, scripting the installation of OracleJDK is not hard.
For example, I've gone down the path of supporting both, for the above reasons,
here (link may break in the future):
https://github.com/florianverhein/spark-ec2/blob/packer/packer/java-setup.sh
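The gist of such a setup script can be sketched as below. This is my own
illustration, not an excerpt from java-setup.sh: the `jdk_vendor` helper name is
hypothetical, and the package names and RPM filename in the comments are
placeholders rather than verified values.

```shell
#!/bin/sh
# Classify the running JVM's vendor from `java -version` output,
# so the script can decide whether a swap to OracleJDK is needed.
jdk_vendor() {
  case "$1" in
    *OpenJDK*)       echo openjdk ;;  # OpenJDK builds identify themselves explicitly
    *"HotSpot(TM)"*) echo oracle  ;;  # Oracle builds print "Java HotSpot(TM) ..."
    *)               echo unknown ;;
  esac
}

# Example usage (2>&1 because `java -version` writes to stderr):
# vendor=$(jdk_vendor "$(java -version 2>&1)")
#
# On a yum-based AMI, swapping OpenJDK out might then look roughly like
# (package names and RPM path are illustrative only):
# if [ "$vendor" = openjdk ]; then
#   sudo yum remove -y java-1.7.0-openjdk
#   sudo yum localinstall -y jdk-7uXX-linux-x64.rpm
#   sudo alternatives --set java /usr/java/default/bin/java
# fi
```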
Aside: based on the bugs you mentioned, is there a list somewhere of which JDK
versions to avoid with respect to Spark?
> Spark-ec2: Switch to OracleJDK
> ------------------------------
>
> Key: SPARK-5813
> URL: https://issues.apache.org/jira/browse/SPARK-5813
> Project: Spark
> Issue Type: Improvement
> Components: EC2
> Reporter: Florian Verhein
> Priority: Minor
>
> Currently using OpenJDK; however, Oracle JDK is generally recommended
> instead, especially for Hadoop deployments.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)