RE: HiveContext throws org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient

2015-07-07 Thread Cheng, Hao
Caused by: java.lang.NoClassDefFoundError: Could not initialize class org.apache.derby.jdbc.EmbeddedDriver

It is usually included in the assembly jar, so I'm not sure what's wrong. But can 
you try adding the Derby jar to the driver classpath and try again? 
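For example, something along these lines should work (the Derby jar path below is only an example; point it at the Derby jar that ships with your Spark or Hive installation):

    spark-shell --driver-class-path /path/to/derby-10.10.2.0.jar

or, equivalently:

    spark-shell --conf spark.driver.extraClassPath=/path/to/derby-10.10.2.0.jar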

-----Original Message-----
From: bdev [mailto:buntu...@gmail.com] 
Sent: Tuesday, July 7, 2015 5:07 PM
To: user@spark.apache.org
Subject: HiveContext throws org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient

Just trying to get started with Spark and attempting to use HiveContext via 
spark-shell to interact with existing Hive tables on my CDH cluster, but I keep 
running into the errors below when I do 'hiveContext.sql("show tables")'. 
Wanted to know which JARs need to be included to get this working. Thanks!


java.lang.RuntimeException: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient
    at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:472)
    at org.apache.spark.sql.hive.HiveContext.sessionState$lzycompute(HiveContext.scala:229)
    at org.apache.spark.sql.hive.HiveContext.sessionState(HiveContext.scala:225)
    at org.apache.spark.sql.hive.HiveContext.hiveconf$lzycompute(HiveContext.scala:241)
    at org.apache.spark.sql.hive.HiveContext.hiveconf(HiveContext.scala:240)
    at org.apache.spark.sql.hive.HiveContext.sql(HiveContext.scala:86)
    at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:31)
    at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:36)
    at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:38)
    at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:40)
    at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:42)
    at $iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:44)
    at $iwC$$iwC$$iwC$$iwC.<init>(<console>:46)
    at $iwC$$iwC$$iwC.<init>(<console>:48)
    at $iwC$$iwC.<init>(<console>:50)
    at $iwC.<init>(<console>:52)
    at <init>(<console>:54)
    at .<init>(<console>:58)
    at .<clinit>(<console>)
    at .<init>(<console>:7)
    at .<clinit>(<console>)
    at $print(<console>)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.spark.repl.SparkIMain$ReadEvalPrint.call(SparkIMain.scala:1065)
    at org.apache.spark.repl.SparkIMain$Request.loadAndRun(SparkIMain.scala:1338)
    at org.apache.spark.repl.SparkIMain.loadAndRunReq$1(SparkIMain.scala:840)
    at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:871)
    at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:819)
    at org.apache.spark.repl.SparkILoop.reallyInterpret$1(SparkILoop.scala:856)
    at org.apache.spark.repl.SparkILoop.interpretStartingWith(SparkILoop.scala:901)
    at org.apache.spark.repl.SparkILoop.command(SparkILoop.scala:813)
    at org.apache.spark.repl.SparkILoop.processLine$1(SparkILoop.scala:656)
    at org.apache.spark.repl.SparkILoop.innerLoop$1(SparkILoop.scala:664)
    at org.apache.spark.repl.SparkILoop.org$apache$spark$repl$SparkILoop$$loop(SparkILoop.scala:669)
    at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply$mcZ$sp(SparkILoop.scala:996)
    at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply(SparkILoop.scala:944)
    at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply(SparkILoop.scala:944)
    at scala.tools.nsc.util.ScalaClassLoader$.savingContextLoader(ScalaClassLoader.scala:135)
    at org.apache.spark.repl.SparkILoop.org$apache$spark$repl$SparkILoop$$process(SparkILoop.scala:944)
    at org.apache.spark.repl.SparkILoop.process(SparkILoop.scala:1058)
    at org.apache.spark.repl.Main$.main(Main.scala:31)
    at org.apache.spark.repl.Main.main(Main.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:569)
    at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:166)
    at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:189)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:110)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.lang.RuntimeException: Unable to 

Re: How to implement top() and filter() on object List for JavaRDD

2015-07-07 Thread Hafsa Asif
Rusty,

I am very thankful for your help. Actually, I am facing difficulty with
objects. My plan is that I have a list of User objects. After parallelizing
it through the Spark context, I apply a comparator on user.getUserName().
As the user names are sorted, their related User objects are sorted along
with them. In the end, when I apply top(), I get the whole User object back.

Something like this:

public static Comparator<User> UserComparator
        = new Comparator<User>() {

    public int compare(User usr1, User usr2) {
        String userName1 = usr1.getUserName().toUpperCase();
        String userName2 = usr2.getUserName().toUpperCase();

        // ascending order
        return userName1.compareTo(userName2);

        // descending order
        // return userName2.compareTo(userName1);
    }
};

JavaRDD<User> rdd = context.parallelize(usersList);

2015-07-07 16:05 GMT+02:00 rusty [via Apache Spark User List] 
ml-node+s1001560n23684...@n3.nabble.com:

 JavaRDD<String> lines2 = ctx.parallelize(Arrays.asList("3", "6", "2", "5", "8", "6", "7"));
 List<String> top = lines2.top(7, new CustomComaprator<String>());
 for (String integer : top) {
     System.out.println(integer);
 }

 class CustomComaprator<T> implements Serializable, Comparator<T> {

     public CustomComaprator() {
         // TODO Auto-generated constructor stub
     }

     private static final long serialVersionUID = 2004092520677431781L;

     @Override
     public int compare(T o11, T o12) {
         int o1 = Integer.parseInt(String.valueOf(o11));
         int o2 = Integer.parseInt(String.valueOf(o12));

         return o1 > o2 ? 1 : o1 == o2 ? 0 : -1;
     }
 }







--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/How-to-implement-top-and-filter-on-object-List-for-JavaRDD-tp23669p23688.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

Re: The auxService:spark_shuffle does not exist

2015-07-07 Thread roy
We tried --master yarn-client, with no different result.
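For reference, this error usually indicates that the Spark external shuffle service is not registered as an auxiliary service on the YARN NodeManagers. A rough sketch of the usual yarn-site.xml entries (property names assumed from the stock external shuffle service setup; the spark-<version>-yarn-shuffle.jar also has to be on the NodeManager classpath and the NodeManagers restarted afterwards):

<property>
  <name>yarn.nodemanager.aux-services</name>
  <value>mapreduce_shuffle,spark_shuffle</value>
</property>
<property>
  <name>yarn.nodemanager.aux-services.spark_shuffle.class</name>
  <value>org.apache.spark.network.yarn.YarnShuffleService</value>
</property>

On the Spark side, spark.shuffle.service.enabled must be set to true.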



--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/The-auxService-spark-shuffle-does-not-exist-tp23662p23689.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

-
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org



Re: How to implement top() and filter() on object List for JavaRDD

2015-07-07 Thread Hafsa Asif
Sorry, ignore my last reply.
Rusty,

I am very thankful for your help. Actually, I am facing difficulty with
objects. My plan is that I have a list of User objects. After parallelizing
it through the Spark context, I apply a comparator on user.getUserName().
As the user names are sorted, their related User objects are sorted along
with them. In the end, when I apply top(), I get the whole User object back.

Something like this:

public static Comparator<User> UserComparator
        = new Comparator<User>() {

    public int compare(User usr1, User usr2) {
        String userName1 = usr1.getUserName().toUpperCase();
        String userName2 = usr2.getUserName().toUpperCase();

        // ascending order
        return userName1.compareTo(userName2);

        // descending order
        // return userName2.compareTo(userName1);
    }
};

JavaRDD<User> rdd = context.parallelize(usersList);

List<User> top = rdd.top(1, UserComparator); // but it is giving me serialization issues

Can you guide me on how to sort out this issue?
I also applied your code and it runs fine, but I want it to work with my
objects (you know, objects give me a tough time).
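Is something like the following the right way around the serialization issue: a named comparator class that implements both Comparator<User> and Serializable (User is the POJO from my snippet above, assumed to expose getUserName())?

import java.io.Serializable;
import java.util.Comparator;
import java.util.List;

public class UserComparator implements Comparator<User>, Serializable {
    private static final long serialVersionUID = 1L;

    @Override
    public int compare(User usr1, User usr2) {
        // ascending order by user name
        return usr1.getUserName().toUpperCase()
                   .compareTo(usr2.getUserName().toUpperCase());
    }
}

// usage with the RDD from above:
JavaRDD<User> rdd = context.parallelize(usersList);
List<User> top = rdd.top(1, new UserComparator()); // the comparator is Serializable, so top() can ship it to the executors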

Best
Hafsa

2015-07-07 16:54 GMT+02:00 Hafsa Asif hafsa.a...@matchinguu.com:

 Rusty,

 I am very thankful for your help. Actually, I am facing difficulty with
 objects. My plan is that I have a list of User objects. After parallelizing
 it through the Spark context, I apply a comparator on user.getUserName().
 As the user names are sorted, their related User objects are sorted along
 with them. In the end, when I apply top(), I get the whole User object back.

 Something like this:

 public static Comparator<User> UserComparator
         = new Comparator<User>() {

     public int compare(User usr1, User usr2) {
         String userName1 = usr1.getUserName().toUpperCase();
         String userName2 = usr2.getUserName().toUpperCase();

         // ascending order
         return userName1.compareTo(userName2);

         // descending order
         // return userName2.compareTo(userName1);
     }
 };

 JavaRDD<User> rdd = context.parallelize(usersList);

 2015-07-07 16:05 GMT+02:00 rusty [via Apache Spark User List] [hidden email]:

 JavaRDD<String> lines2 = ctx.parallelize(Arrays.asList("3", "6", "2", "5", "8", "6", "7"));
 List<String> top = lines2.top(7, new CustomComaprator<String>());
 for (String integer : top) {
     System.out.println(integer);
 }

 class CustomComaprator<T> implements Serializable, Comparator<T> {

     public CustomComaprator() {
         // TODO Auto-generated constructor stub
     }

     private static final long serialVersionUID = 2004092520677431781L;

     @Override
     public int compare(T o11, T o12) {
         int o1 = Integer.parseInt(String.valueOf(o11));
         int o2 = Integer.parseInt(String.valueOf(o12));

         return o1 > o2 ? 1 : o1 == o2 ? 0 : -1;
     }
 }





Spark Kafka Direct Streaming

2015-07-07 Thread abi_pat
Hi,

I am using the new experimental Direct Stream API. Everything is working
fine, but when it comes to fault tolerance I am not sure how to achieve it.
Presently, my Kafka config map looks like this:

configMap.put("zookeeper.connect", "192.168.51.98:2181");
configMap.put("group.id", UUID.randomUUID().toString());
configMap.put("auto.offset.reset", "smallest");
configMap.put("auto.commit.enable", "true");
configMap.put("topics", "IPDR31");
configMap.put("kafka.consumer.id", "kafkasparkuser");
configMap.put("bootstrap.servers", "192.168.50.124:9092");
Set<String> topic = new HashSet<String>();
topic.add("IPDR31");

JavaPairInputDStream<byte[], byte[]> kafkaData =
    KafkaUtils.createDirectStream(js, byte[].class, byte[].class, DefaultDecoder.class, DefaultDecoder.class, configMap, topic);

Questions -

Q1 - Is my Kafka configuration correct, or should it be changed?

Q2 - I also looked into checkpointing, but in my use case data checkpointing
is not required while metadata checkpointing is. Can I achieve this, i.e.
enable metadata checkpointing but not data checkpointing?
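Something like the following is what I have in mind for the recovery side (checkpoint directory and batch interval are just placeholders). Would this give me metadata checkpointing, including the Kafka offsets of the direct stream, without data checkpointing as long as I use no stateful transformations?

import org.apache.spark.SparkConf;
import org.apache.spark.streaming.Durations;
import org.apache.spark.streaming.api.java.JavaStreamingContext;
import org.apache.spark.streaming.api.java.JavaStreamingContextFactory;

final String checkpointDir = "hdfs:///tmp/spark-checkpoints"; // placeholder

JavaStreamingContextFactory factory = new JavaStreamingContextFactory() {
    @Override
    public JavaStreamingContext create() {
        SparkConf conf = new SparkConf().setAppName("kafka-direct-stream");
        JavaStreamingContext js = new JavaStreamingContext(conf, Durations.seconds(10));
        js.checkpoint(checkpointDir); // enables metadata checkpointing
        // the direct stream (KafkaUtils.createDirectStream as above) must be created
        // inside this factory so it can be recovered from the checkpoint on restart
        return js;
    }
};

JavaStreamingContext js = JavaStreamingContext.getOrCreate(checkpointDir, factory);
js.start();
js.awaitTermination();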



Thanks
Abhishek Patel



--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/Spark-Kafka-Direct-Streaming-tp23685.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

-
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org



Re: is it possible to disable -XX:OnOutOfMemoryError=kill %p for the executors?

2015-07-07 Thread Kostas Kougios
I've recompiled Spark after deleting the -XX:OnOutOfMemoryError=kill declaration,
but I am still getting a SIGTERM!



--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/is-it-possible-to-disable-XX-OnOutOfMemoryError-kill-p-for-the-executors-tp23680p23687.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

-
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org



Remoting started followed by a Remoting shut down straight away

2015-07-07 Thread Kostas Kougios
Is this normal?

15/07/07 15:27:04 INFO Remoting: Remoting started; listening on addresses
:[akka.tcp://sparkExecutor@cruncher02.stratified:50063]
15/07/07 15:27:04 INFO util.Utils: Successfully started service
'sparkExecutor' on port 50063.
15/07/07 15:27:04 INFO remote.RemoteActorRefProvider$RemotingTerminator:
Remoting shut down.

The executor keeps running for a while, but then suddenly:

15/07/07 15:29:55 ERROR executor.CoarseGrainedExecutorBackend: RECEIVED
SIGNAL 15: SIGTERM





--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/Remoting-started-followed-by-a-Remoting-shut-down-straight-away-tp23686.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

-
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org



Re: Error when connecting to Spark SQL via Hive JDBC driver

2015-07-07 Thread ratio
Hi,

the problem is not solved yet. Compiling Spark myself is not an option; I have
neither the permissions nor the skills for that. Could someone please explain
what exactly is causing the problem? If Spark is distributed as pre-compiled
versions, why not add the corresponding JDBC driver jars?

At least, Spark SQL advertises "Standard Connectivity: Connect through
JDBC or ODBC. Spark SQL includes a server mode with industry standard JDBC
and ODBC connectivity."

It seems like a great piece of software is shipped with a small gap that makes
it unusable for part of the community...
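
For context, the client side is just the plain Hive JDBC driver talking to the Thrift/JDBC server, roughly like this (host, port and user are placeholders; the hive-jdbc jar and its dependencies would need to be on the client classpath):

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

Class.forName("org.apache.hive.jdbc.HiveDriver");
Connection conn = DriverManager.getConnection("jdbc:hive2://sparksql-host:10000/default", "user", "");
Statement stmt = conn.createStatement();
ResultSet rs = stmt.executeQuery("show tables");
while (rs.next()) {
    System.out.println(rs.getString(1));
}
rs.close();
stmt.close();
conn.close();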

Thank you.




--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/Error-when-connecting-to-Spark-SQL-via-Hive-JDBC-driver-tp23397p23691.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

-
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org


