Re: Spark 1.1.0 with Hadoop 2.5.0

2014-10-07 Thread Sean Owen
That's a Hive version issue, not a Hadoop version issue.
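
For context: the -Phive profile in Spark 1.1 builds against Hive 0.12.0, so
whatever metastore you point spark-sql at has to be 0.12-compatible. A quick
sanity check on a finished build (the assembly jar path is illustrative;
yours depends on how you built):

$ jar tf assembly/target/scala-2.10/spark-assembly-1.1.0-hadoop2.5.0.jar \
    | grep -c 'org/apache/hadoop/hive/metastore'
# a non-zero count means the bundled Hive classes made it into the assembly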

On Tue, Oct 7, 2014 at 7:21 AM, Li HM <hmx...@gmail.com> wrote:
> Thanks for the reply.
>
> Please refer to my other post, entitled "How to make ./bin/spark-sql
> work with hive". It has all the errors/exceptions I am getting.
>
> If I understand you correctly, I can build the package with
> mvn -Phive,hadoop-2.4 -Dhadoop.version=2.5.0 clean package
>
> This is what I actually tried.

-
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org



Re: Spark 1.1.0 with Hadoop 2.5.0

2014-10-07 Thread Cheng Lian
The build command should be correct. What exact error did you encounter 
when trying Spark 1.1 + Hive 0.12 + Hadoop 2.5.0?
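
For reference, a complete build line along these lines should work (the
MAVEN_OPTS values follow the Spark 1.1 building-with-Maven docs; add -Pyarn
only if you run on YARN):

$ export MAVEN_OPTS="-Xmx2g -XX:MaxPermSize=512M -XX:ReservedCodeCacheSize=512m"
$ mvn -Pyarn -Phive -Phadoop-2.4 -Dhadoop.version=2.5.0 -DskipTests clean package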


On 10/7/14 2:21 PM, Li HM wrote:

> Thanks for the reply.
>
> Please refer to my other post, entitled "How to make ./bin/spark-sql
> work with hive". It has all the errors/exceptions I am getting.
>
> If I understand you correctly, I can build the package with
> mvn -Phive,hadoop-2.4 -Dhadoop.version=2.5.0 clean package
>
> This is what I actually tried.
>
> On Mon, Oct 6, 2014 at 11:03 PM, Sean Owen <so...@cloudera.com> wrote:
>
>> The hadoop-2.4 profile is really intended to be Hadoop 2.4+. It
>> should compile and run fine with Hadoop 2.5 as far as I know. CDH 5.2
>> is Hadoop 2.5 + Spark 1.1, so there is evidence it works. You didn't
>> say what doesn't work.
>>
>> On Tue, Oct 7, 2014 at 6:07 AM, hmxxyy <hmx...@gmail.com> wrote:
>>
>>> Does Spark 1.1.0 work with Hadoop 2.5.0?
>>>
>>> The Maven build instructions only list options up to the hadoop-2.4 profile.
>>>
>>> Has anybody made it work?
>>>
>>> I am trying to run spark-sql with hive 0.12 on top of hadoop 2.5.0 but
>>> can't make it work.




-
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org



Re: Spark 1.1.0 with Hadoop 2.5.0

2014-10-07 Thread Li HM
Thanks Cheng.

Here is the error message after a fresh build.

$ mvn -Pyarn -Phadoop-2.4 -Dhadoop.version=2.5.0 -Phive -DskipTests
clean package
[INFO] ------------------------------------------------------------------------
[INFO] Reactor Summary:
[INFO]
[INFO] Spark Project Parent POM .. SUCCESS [19.117s]
[INFO] Spark Project Core  SUCCESS [11:24.009s]
[INFO] Spark Project Bagel ... SUCCESS [1:09.498s]
[INFO] Spark Project GraphX .. SUCCESS [3:41.113s]
[INFO] Spark Project Streaming ... SUCCESS [4:25.378s]
[INFO] Spark Project ML Library .. SUCCESS [5:43.323s]
[INFO] Spark Project Tools ... SUCCESS [44.647s]
[INFO] Spark Project Catalyst  SUCCESS [4:48.658s]
[INFO] Spark Project SQL . SUCCESS [4:56.966s]
[INFO] Spark Project Hive  SUCCESS [3:45.269s]
[INFO] Spark Project REPL  SUCCESS [2:11.617s]
[INFO] Spark Project YARN Parent POM . SUCCESS [6.723s]
[INFO] Spark Project YARN Stable API . SUCCESS [2:20.860s]
[INFO] Spark Project Hive Thrift Server .. SUCCESS [1:15.231s]
[INFO] Spark Project Assembly  SUCCESS [1:41.245s]
[INFO] Spark Project External Twitter  SUCCESS [50.839s]
[INFO] Spark Project External Kafka .. SUCCESS [1:15.888s]
[INFO] Spark Project External Flume Sink . SUCCESS [57.807s]
[INFO] Spark Project External Flume .. SUCCESS [1:26.589s]
[INFO] Spark Project External ZeroMQ . SUCCESS [54.361s]
[INFO] Spark Project External MQTT ... SUCCESS [53.901s]
[INFO] Spark Project Examples  SUCCESS [2:39.407s]
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------

spark-sql> use mydb;
FAILED: Execution Error, return code 1 from
org.apache.hadoop.hive.ql.exec.DDLTask. java.lang.RuntimeException:
Unable to instantiate
org.apache.hadoop.hive.metastore.HiveMetaStoreClient
org.apache.spark.sql.execution.QueryExecutionException: FAILED:
Execution Error, return code 1 from
org.apache.hadoop.hive.ql.exec.DDLTask. java.lang.RuntimeException:
Unable to instantiate
org.apache.hadoop.hive.metastore.HiveMetaStoreClient
at org.apache.spark.sql.hive.HiveContext.runHive(HiveContext.scala:302)
at org.apache.spark.sql.hive.HiveContext.runSqlHive(HiveContext.scala:272)
at org.apache.spark.sql.hive.execution.NativeCommand.sideEffectResult$lzycompute(NativeCommand.scala:35)
at org.apache.spark.sql.hive.execution.NativeCommand.sideEffectResult(NativeCommand.scala:35)
at org.apache.spark.sql.hive.execution.NativeCommand.execute(NativeCommand.scala:38)
at org.apache.spark.sql.hive.HiveContext$QueryExecution.toRdd$lzycompute(HiveContext.scala:360)
at org.apache.spark.sql.hive.HiveContext$QueryExecution.toRdd(HiveContext.scala:360)
at org.apache.spark.sql.SchemaRDDLike$class.$init$(SchemaRDDLike.scala:58)
at org.apache.spark.sql.SchemaRDD.<init>(SchemaRDD.scala:103)
at org.apache.spark.sql.hive.HiveContext.sql(HiveContext.scala:98)
at org.apache.spark.sql.hive.thriftserver.SparkSQLDriver.run(SparkSQLDriver.scala:58)
at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver$.main(SparkSQLCLIDriver.scala:226)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:413)
at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.processCmd(SparkSQLCLIDriver.scala:291)
at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.main(SparkSQLCLIDriver.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:601)
at org.apache.spark.deploy.SparkSubmit$.launch(SparkSubmit.scala:328)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:75)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
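
The "Unable to instantiate HiveMetaStoreClient" message is a generic wrapper
around whatever stopped the Thrift client from connecting. Two common causes
worth ruling out before suspecting the build: spark-sql cannot see your
hive-site.xml (it then starts a local Derby metastore instead of connecting
to the remote one), or the secured metastore rejects the connection. A first
check, with illustrative paths:

$ cp /path/to/hive/conf/hive-site.xml $SPARK_HOME/conf/
$ ./bin/spark-sql
spark-sql> use mydb;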

On Tue, Oct 7, 2014 at 6:19 AM, Cheng Lian <lian.cs@gmail.com> wrote:
> The build command should be correct. What exact error did you encounter when
> trying Spark 1.1 + Hive 0.12 + Hadoop 2.5.0?



-
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org



Re: Spark 1.1.0 with Hadoop 2.5.0

2014-10-07 Thread Li HM
Here is the hive-site.xml:

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<configuration>

<!-- Hive Execution Parameters -->

<property>
  <name>hive.metastore.local</name>
  <value>false</value>
  <description>controls whether to connect to a remote metastore server
or open a new metastore server in the Hive Client JVM</description>
</property>

<property>
  <name>hive.metastore.uris</name>
  <value>thrift://*:2513</value>
  <description>Remote location of the metastore server</description>
</property>

<property>
  <name>hive.metastore.warehouse.dir</name>
  <value>/projects/hcatalog-warehouse</value>
  <description>location of the default database for the warehouse</description>
</property>

<property>
  <name>hive.metastore.sasl.enabled</name>
  <value>true</value>
  <description>If true, the metastore thrift interface will be secured
with SASL. Clients must authenticate with Kerberos.</description>
</property>

<property>
  <name>hive.metastore.kerberos.principal</name>
  <value>hcat/*.com@.COM</value>
  <description>The service principal for the metastore thrift server.
The special string _HOST will be replaced automatically with the
correct host name.</description>
</property>

<property>
  <name>hive.metastore.client.socket.timeout</name>
  <value>200</value>
  <description>MetaStore Client socket timeout in seconds</description>
</property>

<property>
  <name>hive.exec.mode.local.auto</name>
  <value>false</value>
  <description>Let hive determine whether to run in local mode
automatically</description>
</property>

<property>
  <name>hive.hadoop.supports.splittable.combineinputformat</name>
  <value>true</value>
  <description>Hive internal, should be set to true as MAPREDUCE-1597
is present in Hadoop</description>
</property>

<property>
  <name>hive.exec.scratchdir</name>
  <value>/tmp</value>
  <description>HDFS scratch space for Hive jobs</description>
</property>

<property>
  <name>hive.querylog.location</name>
  <value>${user.home}/hivelogs</value>
  <description>Local directory where structured hive query logs are
created. One file per session is created in this directory. If this
variable is set to the empty string, structured logs will not be
created.</description>
</property>

<property>
  <name>mapreduce.job.queuename</name>
  <value>default</value>
  <description>Set a default queue name for execution of the Hive
queries</description>
</property>

<property>
  <name>hadoop.clientside.fs.operations</name>
  <value>true</value>
  <description>FS operations related to DDL operations are owned by
the Hive client</description>
</property>

<property>
  <name>hive.exec.compress.output</name>
  <value>true</value>
  <description>This controls whether the final outputs of a query (to
a local/hdfs file or a hive table) are compressed. The compression
codec and other options are determined from the hadoop config variables
mapred.output.compress*</description>
</property>

<property>
  <name>hive.exec.compress.intermediate</name>
  <value>true</value>
  <description>This controls whether intermediate files produced by
hive between multiple map-reduce jobs are compressed. The compression
codec and other options are determined from the hadoop config variables
mapred.output.compress*</description>
</property>

<property>
  <name>hive.auto.convert.join</name>
  <value>false</value>
  <description>Whether Hive enables the optimization of converting a
common join into a map join based on the input file size</description>
</property>

<property>
  <name>hive.optimize.partition.prune.metadata</name>
  <value>true</value>
  <description>This controls whether metadata optimizations are
applied during partition pruning</description>
</property>

<property>
  <name>hive.mapred.mode</name>
  <value>nonstrict</value>
  <description>The mode in which the hive operations are being
performed. In strict mode, some risky queries are not allowed to
run</description>
</property>

<property>
  <name>io.seqfile.compression.type</name>
  <value>BLOCK</value>
  <description>Determines how the compression is performed. Can take
NONE, RECORD or BLOCK</description>
</property>

<property>
  <name>hive.input.format</name>
  <value>org.apache.hadoop.hive.ql.io.CombineHiveInputFormat</value>
  <description>Determines the input format. Can take
org.apache.hadoop.hive.ql.io.HiveInputFormat or
org.apache.hadoop.hive.ql.io.CombineHiveInputFormat
(default)</description>
</property>

<property>
  <name>mapreduce.input.fileinputformat.split.minsize</name>
  <value>134217728</value>
  <description>Size of the minimum split for CombineFileInputFormat
(128MB recommended)</description>
</property>

<property>
  <name>mapreduce.input.fileinputformat.split.maxsize</name>
  <value>1073741824</value>
  <description>Size of maximum split for CombineFileInputFormat (1GB
recommended)</description>
</property>

<property>
  <name>mapreduce.input.fileinputformat.split.minsize.per.rack</name>
  <value>134217728</value>
  <description>Size of minimum split size per rack (128MB
recommended)</description>
</property>

<property>
  <name>mapreduce.input.fileinputformat.split.minsize.per.node</name>
  <value>134217728</value>
  <description>Size of minimum split size per node (128MB
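
Worth noting from this config: hive.metastore.sasl.enabled is true, so the
Thrift connection to the metastore needs a valid Kerberos ticket, and the
HiveMetaStoreClient instantiation error above is exactly what you see when
none is present. A sketch of the check (the principal is a placeholder;
spark-sql accepts the same -e flag as the Hive CLI):

$ kinit your_user@YOUR.REALM
$ ./bin/spark-sql -e 'show databases;'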

Spark 1.1.0 with Hadoop 2.5.0

2014-10-06 Thread hmxxyy
Does Spark 1.1.0 work with Hadoop 2.5.0?

The Maven build instructions only list options up to the hadoop-2.4 profile.

Has anybody made it work?

I am trying to run spark-sql with hive 0.12 on top of hadoop 2.5.0 but can't
make it work.



--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/Spark-1-1-0-with-Hadoop-2-5-0-tp15827.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

-
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org