RE: SparkSQL 1.1 hang when "DROP" or "LOAD"

2014-09-16 Thread linkpatrickliu
Hi, Hao Cheng. I have done other tests, and the results show that the thriftServer can connect to Zookeeper. However, I found something more interesting, and I think I have found a bug! Test procedure: Test1: (0) Use beeline to connect to thriftServer. (1) Switch database "use dw_op1"; (OK) The log

RE: SparkSQL 1.1 hang when "DROP" or "LOAD"

2014-09-15 Thread linkpatrickliu
Seems like the thriftServer cannot connect to Zookeeper, so it cannot get the lock. This is how the log looks when I run SparkSQL: "load data inpath 'kv1.txt' into table src;" log: 14/09/16 14:40:47 INFO Driver: 14/09/16 14:40:47 INFO ClientCnxn: Opening socket connection to server SVR4044HW2285.h

RE: SparkSQL 1.1 hang when "DROP" or "LOAD"

2014-09-15 Thread linkpatrickliu
Besides, when I use bin/spark-sql, I can load data and drop tables freely. Only when I use sbin/start-thriftserver.sh and connect with beeline does the client hang! -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/SparkSQL-1-1-hang-when-DROP-or-LOAD-tp14222

RE: SparkSQL 1.1 hang when "DROP" or "LOAD"

2014-09-15 Thread linkpatrickliu
Hi, Hao Cheng, This is my Spark assembly jar name: spark-assembly-1.1.0-hadoop2.0.0-cdh4.6.0.jar. I compiled Spark 1.1.0 with the following cmd: export MAVEN_OPTS="-Xmx2g -XX:MaxPermSize=512M -XX:ReservedCodeCacheSize=512m" mvn -Dhadoop.version=2.0.0-cdh4.6.0 -Phive -Pspark-ganglia-lgpl -DskipTests pa

RE: SparkSQL 1.1 hang when "DROP" or "LOAD"

2014-09-15 Thread linkpatrickliu
Hi, Hao Cheng, Here is the Spark/Hadoop version: Spark version = 1.1.0; Hadoop version = 2.0.0-cdh4.6.0. And hive-site.xml: fs.default.name = hdfs://ns; dfs.nameservices = ns; dfs.ha.namenodes.ns = machine01,machine02; dfs.namenode.rpc-address.ns.mach

SparkSQL 1.1 hang when "DROP" or "LOAD"

2014-09-14 Thread linkpatrickliu
I started the SparkSQL thrift server: "sbin/start-thriftserver.sh". Then I used beeline to connect to it: "bin/beeline" "!connect jdbc:hive2://localhost:1 op1 op1". I have created a database for user op1: "create database dw_op1"; and granted all privileges to user op1: "grant all on database dw_op1

SparkSQL hang due to

2014-09-11 Thread linkpatrickliu
I am running Spark Standalone mode with Spark 1.1. I started the SparkSQL thrift server as follows: ./sbin/start-thriftserver.sh Then I use beeline to connect to it. Now, I can "CREATE", "SELECT", "SHOW" the databases or the tables; but when I "DROP" or "Load data inpath 'kv1.txt' into table src", the

RE: The concurrent model of spark job/stage/task

2014-08-29 Thread linkpatrickliu
Hi, I think an example will help illustrate the model better. /*** SimpleApp.scala ***/ import org.apache.spark.SparkContext import org.apache.spark.SparkContext._ object SimpleApp { def main(args: Array[String]) { val logFile = "$YOUR_SPARK_HOME/README.md" val sc = new SparkContext("local

RE: problem connection to hdfs on localhost from spark-shell

2014-08-28 Thread linkpatrickliu
Change your conf/spark-env.sh: export HADOOP_CONF_DIR="/etc/hadoop/conf" export YARN_CONF_DIR="/etc/hadoop/conf" Date: Thu, 28 Aug 2014 16:19:05 -0700 From: ml-node+s1001560n13074...@n3.nabble.com To: linkpatrick...@live.com Subject: problem connection to hdfs on localhost from spark-shell

RE: The concurrent model of spark job/stage/task

2014-08-28 Thread linkpatrickliu
Hi, Please see the answers following each question. If there's any mistake, please let me know. Thanks! I am not sure which mode you are running, so I will assume you are using the spark-submit script to submit Spark applications to a Spark cluster (Spark standalone or YARN). 1. how to start 2 or more

RE: org.apache.hadoop.io.compress.SnappyCodec not found

2014-08-28 Thread linkpatrickliu
Hi, You can set the settings in conf/spark-env.sh like this: export SPARK_LIBRARY_PATH=/usr/lib/hadoop/lib/native/ SPARK_JAVA_OPTS+="-Djava.library.path=$SPARK_LIBRARY_PATH " SPARK_JAVA_OPTS+="-Dspark.io.compression.codec=org.apache.spark.io.SnappyCompressionCodec " SPARK_JAVA_OPTS+="-Dio.compress

[Compile error] Spark 1.0.2 against cloudera 2.0.0-cdh4.6.0 error

2014-08-07 Thread linkpatrickliu
Hi, Following the "" document: # Cloudera CDH 4.2.0 mvn -Pyarn-alpha -Dhadoop.version=2.0.0-cdh4.2.0 -DskipTests clean package I compile Spark 1.0.2 with this cmd: mvn -Pyarn-alpha -Dhadoop.version=2.0.0-cdh4.6.0 -DskipTests clean package However, I got two errors: [INFO] Compiling 14 Scala so

Cannot connect to hive metastore

2014-07-17 Thread linkpatrickliu
Seems like the MySQL connector jar is not included in the classpath. Where can I add the jar to the classpath? hive-site.xml: javax.jdo.option.ConnectionURL = jdbc:mysql://localhost:3306/metastore?createDatabaseIfNotExist=true&characterEncoding=UTF-8 (JDBC connect string for a JDBC met