Re: Spark 1.1.0 with Hadoop 2.5.0
That's a Hive version issue, not a Hadoop version issue.

On Tue, Oct 7, 2014 at 7:21 AM, Li HM <hmx...@gmail.com> wrote:
> Thanks for the reply. Please refer to my other post, entitled "How to make
> ./bin/spark-sql work with hive"; it has all the errors/exceptions I am getting.
> If I understand you correctly, I can build the package with
>
>     mvn -Phive,hadoop-2.4 -Dhadoop.version=2.5.0 clean package
>
> This is what I actually tried.

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org
Re: Spark 1.1.0 with Hadoop 2.5.0
The build command should be correct. What exact error did you encounter when trying Spark 1.1 + Hive 0.12 + Hadoop 2.5.0?

On 10/7/14 2:21 PM, Li HM wrote:
> Thanks for the reply. Please refer to my other post, entitled "How to make
> ./bin/spark-sql work with hive"; it has all the errors/exceptions I am getting.
> If I understand you correctly, I can build the package with
>
>     mvn -Phive,hadoop-2.4 -Dhadoop.version=2.5.0 clean package
>
> This is what I actually tried.
>
> On Mon, Oct 6, 2014 at 11:03 PM, Sean Owen <so...@cloudera.com> wrote:
>> The hadoop-2.4 profile is really intended to be Hadoop 2.4+. It should
>> compile and run fine with Hadoop 2.5 as far as I know. CDH 5.2 is Hadoop 2.5
>> + Spark 1.1, so there is evidence it works. You didn't say what doesn't work.
>>
>> On Tue, Oct 7, 2014 at 6:07 AM, hmxxyy <hmx...@gmail.com> wrote:
>>> Does Spark 1.1.0 work with Hadoop 2.5.0? The maven build instructions only
>>> have command options up to hadoop-2.4. Has anybody ever made it work? I am
>>> trying to run spark-sql with Hive 0.12 on top of Hadoop 2.5.0 but can't
>>> make it work.
>>>
>>> --
>>> View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Spark-1-1-0-with-Hadoop-2-5-0-tp15827.html
>>> Sent from the Apache Spark User List mailing list archive at Nabble.com.
Re: Spark 1.1.0 with Hadoop 2.5.0
Thanks Cheng. Here is the error message after a fresh build.

$ mvn -Pyarn -Phadoop-2.4 -Dhadoop.version=2.5.0 -Phive -DskipTests clean package
[INFO] ------------------------------------------------------------------------
[INFO] Reactor Summary:
[INFO]
[INFO] Spark Project Parent POM .......................... SUCCESS [19.117s]
[INFO] Spark Project Core ................................ SUCCESS [11:24.009s]
[INFO] Spark Project Bagel ............................... SUCCESS [1:09.498s]
[INFO] Spark Project GraphX .............................. SUCCESS [3:41.113s]
[INFO] Spark Project Streaming ........................... SUCCESS [4:25.378s]
[INFO] Spark Project ML Library .......................... SUCCESS [5:43.323s]
[INFO] Spark Project Tools ............................... SUCCESS [44.647s]
[INFO] Spark Project Catalyst ............................ SUCCESS [4:48.658s]
[INFO] Spark Project SQL ................................. SUCCESS [4:56.966s]
[INFO] Spark Project Hive ................................ SUCCESS [3:45.269s]
[INFO] Spark Project REPL ................................ SUCCESS [2:11.617s]
[INFO] Spark Project YARN Parent POM ..................... SUCCESS [6.723s]
[INFO] Spark Project YARN Stable API ..................... SUCCESS [2:20.860s]
[INFO] Spark Project Hive Thrift Server .................. SUCCESS [1:15.231s]
[INFO] Spark Project Assembly ............................ SUCCESS [1:41.245s]
[INFO] Spark Project External Twitter .................... SUCCESS [50.839s]
[INFO] Spark Project External Kafka ...................... SUCCESS [1:15.888s]
[INFO] Spark Project External Flume Sink ................. SUCCESS [57.807s]
[INFO] Spark Project External Flume ...................... SUCCESS [1:26.589s]
[INFO] Spark Project External ZeroMQ .................... SUCCESS [54.361s]
[INFO] Spark Project External MQTT ....................... SUCCESS [53.901s]
[INFO] Spark Project Examples ............................ SUCCESS [2:39.407s]
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------

spark-sql> use mydb;
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.metastore.HiveMetaStoreClient
org.apache.spark.sql.execution.QueryExecutionException: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask.
java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.metastore.HiveMetaStoreClient
	at org.apache.spark.sql.hive.HiveContext.runHive(HiveContext.scala:302)
	at org.apache.spark.sql.hive.HiveContext.runSqlHive(HiveContext.scala:272)
	at org.apache.spark.sql.hive.execution.NativeCommand.sideEffectResult$lzycompute(NativeCommand.scala:35)
	at org.apache.spark.sql.hive.execution.NativeCommand.sideEffectResult(NativeCommand.scala:35)
	at org.apache.spark.sql.hive.execution.NativeCommand.execute(NativeCommand.scala:38)
	at org.apache.spark.sql.hive.HiveContext$QueryExecution.toRdd$lzycompute(HiveContext.scala:360)
	at org.apache.spark.sql.hive.HiveContext$QueryExecution.toRdd(HiveContext.scala:360)
	at org.apache.spark.sql.SchemaRDDLike$class.$init$(SchemaRDDLike.scala:58)
	at org.apache.spark.sql.SchemaRDD.<init>(SchemaRDD.scala:103)
	at org.apache.spark.sql.hive.HiveContext.sql(HiveContext.scala:98)
	at org.apache.spark.sql.hive.thriftserver.SparkSQLDriver.run(SparkSQLDriver.scala:58)
	at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.processCmd(SparkSQLCLIDriver.scala:291)
	at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:413)
	at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver$.main(SparkSQLCLIDriver.scala:226)
	at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.main(SparkSQLCLIDriver.scala)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:601)
	at org.apache.spark.deploy.SparkSubmit$.launch(SparkSubmit.scala:328)
	at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:75)
	at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)

On Tue, Oct 7, 2014 at 6:19 AM, Cheng Lian <lian.cs@gmail.com> wrote:
> The build command should be correct. What exact error did you encounter when
> trying Spark 1.1 + Hive 0.12 + Hadoop 2.5.0?
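An "Unable to instantiate HiveMetaStoreClient" failure like the one above usually means the Hive client configuration is not visible to Spark SQL, or the remote metastore endpoint cannot be reached or authenticated against. A quick sanity check that is independent of the Spark build is to parse the hive.metastore.uris value and probe the endpoint with a plain TCP connect. This is a minimal sketch; the URI below is a placeholder, not the poster's real metastore host:

```python
import socket
from urllib.parse import urlparse

def parse_metastore_uri(uri):
    """Split a hive.metastore.uris entry like 'thrift://host:9083'
    into (host, port)."""
    parsed = urlparse(uri)
    if parsed.scheme != "thrift":
        raise ValueError("expected a thrift:// URI, got %r" % uri)
    return parsed.hostname, parsed.port

def metastore_reachable(uri, timeout=3.0):
    """True if a TCP connection to the metastore endpoint succeeds.
    This only proves the port is open; it says nothing about whether
    SASL/Kerberos authentication will succeed once connected."""
    host, port = parse_metastore_uri(uri)
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

host, port = parse_metastore_uri("thrift://metastore.example.com:9083")
print(host, port)
```

If the port is reachable but the error persists, the next thing to check is whether hive-site.xml is actually on Spark's classpath (e.g. in Spark's conf directory), since a missing config makes Hive fall back to a local Derby metastore with different failure modes.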
Re: Spark 1.1.0 with Hadoop 2.5.0
Here is the hive-site.xml:

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>

  <!-- Hive Execution Parameters -->

  <property>
    <name>hive.metastore.local</name>
    <value>false</value>
    <description>controls whether to connect to remote metastore server or open a new metastore server in Hive Client JVM</description>
  </property>

  <property>
    <name>hive.metastore.uris</name>
    <value>thrift://*:2513</value>
    <description>Remote location of the metastore server</description>
  </property>

  <property>
    <name>hive.metastore.warehouse.dir</name>
    <value>/projects/hcatalog-warehouse</value>
    <description>location of default database for the warehouse</description>
  </property>

  <property>
    <name>hive.metastore.sasl.enabled</name>
    <value>true</value>
    <description>If true, the metastore thrift interface will be secured with SASL. Clients must authenticate with Kerberos.</description>
  </property>

  <property>
    <name>hive.metastore.kerberos.principal</name>
    <value>hcat/*.com@.COM</value>
    <description>The service principal for the metastore thrift server. The special string _HOST will be replaced automatically with the correct host name.</description>
  </property>

  <property>
    <name>hive.metastore.client.socket.timeout</name>
    <value>200</value>
    <description>MetaStore Client socket timeout in seconds</description>
  </property>

  <property>
    <name>hive.exec.mode.local.auto</name>
    <value>false</value>
    <description>Let hive determine whether to run in local mode automatically</description>
  </property>

  <property>
    <name>hive.hadoop.supports.splittable.combineinputformat</name>
    <value>true</value>
    <description>Hive internal, should be set to true as MAPREDUCE-1597 is present in Hadoop</description>
  </property>

  <property>
    <name>hive.exec.scratchdir</name>
    <value>/tmp</value>
    <description>HDFS Scratch space for Hive jobs</description>
  </property>

  <property>
    <name>hive.querylog.location</name>
    <value>${user.home}/hivelogs</value>
    <description>Local Directory where structured hive query logs are created. One file per session is created in this directory. If this variable set to empty string structured log will not be created.</description>
  </property>

  <property>
    <name>mapreduce.job.queuename</name>
    <value>default</value>
    <description>Set a default queue name for execution of the Hive queries</description>
  </property>

  <property>
    <name>hadoop.clientside.fs.operations</name>
    <value>true</value>
    <description>FS operations related to DDL operations are owned by Hive client</description>
  </property>

  <property>
    <name>hive.exec.compress.output</name>
    <value>true</value>
    <description>This controls whether the final outputs of a query (to a local/hdfs file or a hive table) is compressed. The compression codec and other options are determined from hadoop config variables mapred.output.compress*</description>
  </property>

  <property>
    <name>hive.exec.compress.intermediate</name>
    <value>true</value>
    <description>This controls whether intermediate files produced by hive between multiple map-reduce jobs are compressed. The compression codec and other options are determined from hadoop config variables mapred.output.compress*</description>
  </property>

  <property>
    <name>hive.auto.convert.join</name>
    <value>false</value>
    <description>This controls whether intermediate files produced by hive between multiple map-reduce jobs are compressed. The compression codec and other options are determined from hadoop config variables mapred.output.compress*</description>
  </property>

  <property>
    <name>hive.optimize.partition.prune.metadata</name>
    <value>true</value>
    <description>This controls whether metadata optimizations are applied during partition pruning</description>
  </property>

  <property>
    <name>hive.mapred.mode</name>
    <value>nonstrict</value>
    <description>The mode in which the hive operations are being performed. In strict mode, some risky queries are not allowed to run</description>
  </property>

  <property>
    <name>io.seqfile.compression.type</name>
    <value>BLOCK</value>
    <description>Determines how the compression is performed. Can take NONE, RECORD or BLOCK</description>
  </property>

  <property>
    <name>hive.input.format</name>
    <value>org.apache.hadoop.hive.ql.io.CombineHiveInputFormat</value>
    <description>Determines the input format. Can take org.apache.hadoop.hive.ql.io.HiveInputFormat or org.apache.hadoop.hive.ql.io.CombineHiveInputFormat (default)</description>
  </property>

  <property>
    <name>mapreduce.input.fileinputformat.split.minsize</name>
    <value>134217728</value>
    <description>Size of the minimum split for CombineFileInputFormat (128MB recommended)</description>
  </property>

  <property>
    <name>mapreduce.input.fileinputformat.split.maxsize</name>
    <value>1073741824</value>
    <description>Size of maximum split for CombineFileInputFormat (1GB recommended)</description>
  </property>

  <property>
    <name>mapreduce.input.fileinputformat.split.minsize.per.rack</name>
    <value>134217728</value>
    <description>Size of minimum split size per rack (128MB recommended)</description>
  </property>

  <property>
    <name>mapreduce.input.fileinputformat.split.minsize.per.node</name>
    <value>134217728</value>
    <description>Size of minimum split size per node (128MB
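One quick way to rule out config-parsing mistakes in a hive-site.xml like the one posted is to flatten its property/name/value entries into a dict and inspect the values the Hive client will actually see. A small self-contained sketch; the fragment below is modeled on the posted config, with a placeholder metastore host rather than the poster's real one:

```python
import xml.etree.ElementTree as ET

# Minimal hive-site.xml fragment, modeled on the configuration above
# (the metastore host is a placeholder).
HIVE_SITE = """<configuration>
  <property>
    <name>hive.metastore.uris</name>
    <value>thrift://metastore.example.com:9083</value>
  </property>
  <property>
    <name>hive.metastore.sasl.enabled</name>
    <value>true</value>
  </property>
</configuration>"""

def hive_conf(xml_text):
    """Flatten <property><name>/<value> pairs into a plain dict."""
    root = ET.fromstring(xml_text)
    return {p.findtext("name"): p.findtext("value")
            for p in root.findall("property")}

conf = hive_conf(HIVE_SITE)
print(conf)
```

With a SASL-enabled metastore as in the posted config, a valid Kerberos ticket on the client is also required, so a kinit problem can surface as the same HiveMetaStoreClient instantiation error.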
Spark 1.1.0 with Hadoop 2.5.0
Does Spark 1.1.0 work with Hadoop 2.5.0? The maven build instructions only have command options up to hadoop-2.4. Has anybody ever made it work? I am trying to run spark-sql with Hive 0.12 on top of Hadoop 2.5.0 but can't make it work.

--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Spark-1-1-0-with-Hadoop-2-5-0-tp15827.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.