RE: HiveContext throws org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient
Caused by: java.lang.NoClassDefFoundError: Could not initialize class org.apache.derby.jdbc.EmbeddedDriver

The Derby driver is usually included in the assembly jar, so I am not sure what is wrong. But can you try adding the derby jar to the driver classpath and trying again?

-----Original Message-----
From: bdev [mailto:buntu...@gmail.com]
Sent: Tuesday, July 7, 2015 5:07 PM
To: user@spark.apache.org
Subject: HiveContext throws org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient

Just trying to get started with Spark and attempting to use HiveContext from spark-shell to interact with existing Hive tables on my CDH cluster, but I keep running into the errors below when I do 'hiveContext.sql("show tables")'. I wanted to know which JARs need to be included to get this working. Thanks!

java.lang.RuntimeException: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient
        at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:472)
        at org.apache.spark.sql.hive.HiveContext.sessionState$lzycompute(HiveContext.scala:229)
        at org.apache.spark.sql.hive.HiveContext.sessionState(HiveContext.scala:225)
        at org.apache.spark.sql.hive.HiveContext.hiveconf$lzycompute(HiveContext.scala:241)
        at org.apache.spark.sql.hive.HiveContext.hiveconf(HiveContext.scala:240)
        at org.apache.spark.sql.hive.HiveContext.sql(HiveContext.scala:86)
        at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:31)
        at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:36)
        at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:38)
        at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:40)
        at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:42)
        at $iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:44)
        at $iwC$$iwC$$iwC$$iwC.<init>(<console>:46)
        at $iwC$$iwC$$iwC.<init>(<console>:48)
        at $iwC$$iwC.<init>(<console>:50)
        at $iwC.<init>(<console>:52)
        at <init>(<console>:54)
        at .<init>(<console>:58)
        at .<clinit>(<console>)
        at .<init>(<console>:7)
        at .<clinit>(<console>)
        at $print(<console>)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.apache.spark.repl.SparkIMain$ReadEvalPrint.call(SparkIMain.scala:1065)
        at org.apache.spark.repl.SparkIMain$Request.loadAndRun(SparkIMain.scala:1338)
        at org.apache.spark.repl.SparkIMain.loadAndRunReq$1(SparkIMain.scala:840)
        at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:871)
        at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:819)
        at org.apache.spark.repl.SparkILoop.reallyInterpret$1(SparkILoop.scala:856)
        at org.apache.spark.repl.SparkILoop.interpretStartingWith(SparkILoop.scala:901)
        at org.apache.spark.repl.SparkILoop.command(SparkILoop.scala:813)
        at org.apache.spark.repl.SparkILoop.processLine$1(SparkILoop.scala:656)
        at org.apache.spark.repl.SparkILoop.innerLoop$1(SparkILoop.scala:664)
        at org.apache.spark.repl.SparkILoop.org$apache$spark$repl$SparkILoop$$loop(SparkILoop.scala:669)
        at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply$mcZ$sp(SparkILoop.scala:996)
        at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply(SparkILoop.scala:944)
        at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply(SparkILoop.scala:944)
        at scala.tools.nsc.util.ScalaClassLoader$.savingContextLoader(ScalaClassLoader.scala:135)
        at org.apache.spark.repl.SparkILoop.org$apache$spark$repl$SparkILoop$$process(SparkILoop.scala:944)
        at org.apache.spark.repl.SparkILoop.process(SparkILoop.scala:1058)
        at org.apache.spark.repl.Main$.main(Main.scala:31)
        at org.apache.spark.repl.Main.main(Main.scala)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:569)
        at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:166)
        at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:189)
        at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:110)
        at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.lang.RuntimeException: Unable to
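One way to put the Derby jar on the driver classpath (a sketch only; the jar name and location vary by installation, so the path below is a placeholder):

  spark-shell --driver-class-path /path/to/derby.jar

or, equivalently, set spark.driver.extraClassPath in conf/spark-defaults.conf.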
Re: How to implement top() and filter() on object List for JavaRDD
Rusty, I am very thankful for your help. Actually, I am facing difficulty with objects. My plan is this: I have a list of User objects. After parallelizing it through the Spark context, I apply a comparator on user.getUserName(). As the user names are sorted, their related User objects are sorted according to user name. In the end, when I apply top(), I get the whole User object. Something like this:

public static Comparator<User> UserComparator = new Comparator<User>() {
    public int compare(User usr1, User usr2) {
        String userName1 = usr1.getUserName().toUpperCase();
        String userName2 = usr2.getUserName().toUpperCase();
        // ascending order
        return userName1.compareTo(userName2);
        // descending order
        // return userName2.compareTo(userName1);
    }
};

JavaRDD<User> rdd = context.parallelize(usersList);

2015-07-07 16:05 GMT+02:00 rusty [via Apache Spark User List] ml-node+s1001560n23684...@n3.nabble.com:

JavaRDD<Integer> lines2 = ctx.parallelize(Arrays.asList(3, 6, 2, 5, 8, 6, 7));
List<Integer> top = lines2.top(7, new CustomComaprator<Integer>());
for (Integer integer : top) {
    System.out.println(integer);
}

class CustomComaprator<T> implements Serializable, Comparator<T> {

    private static final long serialVersionUID = 2004092520677431781L;

    public CustomComaprator() {
        // TODO Auto-generated constructor stub
    }

    @Override
    public int compare(T o11, T o12) {
        int o1 = Integer.parseInt(String.valueOf(o11));
        int o2 = Integer.parseInt(String.valueOf(o12));
        return o1 > o2 ? 1 : o1 == o2 ? 0 : -1;
    }
}
Re: The auxService:spark_shuffle does not exist
we tried --master yarn-client with no different result.
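This error usually means the external shuffle service is enabled on the Spark side (for example for dynamic allocation) but the YARN NodeManagers do not have the spark_shuffle auxiliary service configured. A sketch of the usual fix, assuming that is the case here: add the entries below to yarn-site.xml on every NodeManager, put the spark-<version>-yarn-shuffle.jar from the Spark distribution on the NodeManager classpath, and restart the NodeManagers.

<property>
  <name>yarn.nodemanager.aux-services</name>
  <value>mapreduce_shuffle,spark_shuffle</value>
</property>
<property>
  <name>yarn.nodemanager.aux-services.spark_shuffle.class</name>
  <value>org.apache.spark.network.yarn.YarnShuffleService</value>
</property>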
Re: How to implement top() and filter() on object List for JavaRDD
Sorry, ignore my last reply.

Rusty, I am very thankful for your help. Actually, I am facing difficulty with objects. My plan is this: I have a list of User objects. After parallelizing it through the Spark context, I apply a comparator on user.getUserName(). As the user names are sorted, their related User objects are sorted according to user name. In the end, when I apply top(), I get the whole User object. Something like this:

public static Comparator<User> UserComparator = new Comparator<User>() {
    public int compare(User usr1, User usr2) {
        String userName1 = usr1.getUserName().toUpperCase();
        String userName2 = usr2.getUserName().toUpperCase();
        // ascending order
        return userName1.compareTo(userName2);
        // descending order
        // return userName2.compareTo(userName1);
    }
};

JavaRDD<User> rdd = context.parallelize(usersList);
List<User> top = rdd.top(1, UserComparator); // but it is giving me serialization issues.

Can you guide me on how to sort out this issue? I also applied your code and it runs fine, but I want it with my own object (you know, objects give me a tough time).

Best
Hafsa
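A minimal sketch of one way past the serialization error (not from the original thread; it assumes the User class and getUserName() described above): make the comparator a named class that implements both java.util.Comparator and java.io.Serializable, since the comparator passed to JavaRDD.top() is shipped to the executors. Note that User itself must also be serializable for the RDD to work.

import java.io.Serializable;
import java.util.Comparator;

public class UserNameComparator implements Comparator<User>, Serializable {
    private static final long serialVersionUID = 1L;

    @Override
    public int compare(User usr1, User usr2) {
        // compare case-insensitively by user name
        return usr1.getUserName().toUpperCase()
                   .compareTo(usr2.getUserName().toUpperCase());
    }
}

// usage:
JavaRDD<User> rdd = context.parallelize(usersList);
List<User> top = rdd.top(1, new UserNameComparator());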
Spark Kafka Direct Streaming
Hi, I am using the new experimental Direct Stream API. Everything is working fine, but when it comes to fault tolerance I am not sure how to achieve it. Presently my Kafka config map looks like this:

configMap.put("zookeeper.connect", "192.168.51.98:2181");
configMap.put("group.id", UUID.randomUUID().toString());
configMap.put("auto.offset.reset", "smallest");
configMap.put("auto.commit.enable", "true");
configMap.put("topics", "IPDR31");
configMap.put("kafka.consumer.id", "kafkasparkuser");
configMap.put("bootstrap.servers", "192.168.50.124:9092");

Set<String> topic = new HashSet<String>();
topic.add("IPDR31");

JavaPairInputDStream<byte[], byte[]> kafkaData =
    KafkaUtils.createDirectStream(js, byte[].class, byte[].class,
        DefaultDecoder.class, DefaultDecoder.class, configMap, topic);

Questions:
Q1 - Is my Kafka configuration correct, or should it be changed?
Q2 - I also looked into checkpointing, but in my use case data checkpointing is not required while metadata checkpointing is. Can I achieve this, i.e. enable metadata checkpointing but not data checkpointing?

Thanks
Abhishek Patel
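Not an authoritative answer, but a sketch of the usual pattern with the direct stream (class names are from the spark-streaming-kafka module; the checkpoint path and the Java 8 lambda style are assumptions): metadata checkpointing only needs a checkpoint directory on the streaming context, and data checkpointing is only triggered by stateful operations, so plain direct-stream jobs get metadata checkpointing alone. The ZooKeeper-based group.id / auto.commit.enable settings are not used for offset tracking by the direct API; the offsets handled in each batch can be read from the RDDs if you want to store them yourself.

// import org.apache.spark.streaming.kafka.HasOffsetRanges;
// import org.apache.spark.streaming.kafka.OffsetRange;

// enable checkpointing (metadata; RDD data is only checkpointed for stateful operations)
js.checkpoint("hdfs:///tmp/spark-checkpoints");   // example path

// read the Kafka offset ranges processed in each batch
kafkaData.foreachRDD(rdd -> {
    OffsetRange[] ranges = ((HasOffsetRanges) rdd.rdd()).offsetRanges();
    for (OffsetRange r : ranges) {
        System.out.println(r.topic() + " " + r.partition() + " "
            + r.fromOffset() + " -> " + r.untilOffset());
    }
    return null; // the Spark 1.x Java API expects a Function returning Void here
});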
Re: is it possible to disable -XX:OnOutOfMemoryError=kill %p for the executors?
I've recompiled Spark, deleting the -XX:OnOutOfMemoryError=kill declaration, but I am still getting a SIGTERM!
Remoting started followed by a Remoting shut down straight away
Is this normal?

15/07/07 15:27:04 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://sparkExecutor@cruncher02.stratified:50063]
15/07/07 15:27:04 INFO util.Utils: Successfully started service 'sparkExecutor' on port 50063.
15/07/07 15:27:04 INFO remote.RemoteActorRefProvider$RemotingTerminator: Remoting shut down.

The executor keeps running for a while, but then suddenly:

15/07/07 15:29:55 ERROR executor.CoarseGrainedExecutorBackend: RECEIVED SIGNAL 15: SIGTERM
Re: Error when connecting to Spark SQL via Hive JDBC driver
Hi, the problem is not solved yet. Compiling Spark myself is not an option; I have neither the permissions nor the skills for that. Could someone please explain what exactly is causing the problem? If Spark is distributed as pre-compiled versions, why not add the corresponding JDBC driver jars? After all, Spark SQL advertises standard connectivity: "Connect through JDBC or ODBC. Spark SQL includes a server mode with industry standard JDBC and ODBC connectivity." It seems like a great piece of software is shipped with a small gap that makes it unusable for part of the community... Thank you.
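For reference, a sketch of how the server mode can be exercised without compiling anything, using only the scripts bundled with a pre-built distribution that includes the thrift server (the host and port below are the defaults and are assumptions on my part):

  sbin/start-thriftserver.sh --master local[2]
  bin/beeline -u jdbc:hive2://localhost:10000

If that works, the open question is mainly which Hive JDBC driver jars an external client needs on its own classpath, not whether Spark SQL itself supports JDBC.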