Output the data to a txt file
I get results from an RDD, like: Array(Array(1,2,3), Array(2,3,4), Array(3,4,6)). How can I output them to 1.txt like this?

    1 2 3
    2 3 4
    3 4 6

--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/output-the-datas-txt-tp26350.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org
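A minimal sketch of the formatting step: join each inner array with spaces using mkString. The object and method names below are made up for illustration; on an actual RDD you would do the same transformation and then save it, e.g. `rdd.map(_.mkString(" ")).saveAsTextFile("1.txt")` (which writes a directory of part files, not a single file).

```scala
object FormatLines {
  // Turn each inner array into one space-separated line.
  def format(rows: Array[Array[Int]]): Array[String] =
    rows.map(_.mkString(" "))

  def main(args: Array[String]): Unit = {
    val rows = Array(Array(1, 2, 3), Array(2, 3, 4), Array(3, 4, 6))
    // For a single local file you could write these lines with java.io.PrintWriter.
    format(rows).foreach(println)
  }
}
```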
Dealing with the data structure
I have a map:

    val ji = scala.collection.mutable.Map[String, scala.collection.mutable.ArrayBuffer[String]]()

holding data such as:

    ji = Map("a" -> ArrayBuffer("1","2","3"), "b" -> ArrayBuffer("1","2","3"), "c" -> ArrayBuffer("2","3"))

If "a" chooses "1", then "b" and "c" can no longer choose "1"; the remaining choices look like Map("b" -> ArrayBuffer("2","3"), "c" -> ArrayBuffer("2","3")). If "b" then chooses "2", "c" can no longer choose "2", leaving Map("c" -> ArrayBuffer("3")). I need to enumerate all the possibilities, sort them from small to big, and write them to result.txt. The results look like:

    a:1 b:2 c:3
    a:1 c:2 b:3
    b:1 a:2 c:3
    b:1 c:2 a:3

so finally result.txt contains:

    a b c
    a c b
    b a c
    b c a

What should I do? This seems difficult to me.

--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/deal-with-datas-structure-tp26349.html
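This is a backtracking enumeration: give each key one value from its allowed set, never reusing a value. A minimal sketch (names are mine, and I use immutable collections rather than the mutable Map/ArrayBuffer above; sorting each result by value afterwards reproduces the a:1 b:2 c:3 style of output):

```scala
object Assignments {
  // All ways to give each key a value from its allowed set, with no value
  // used twice. Recurses key by key, tracking the values already taken.
  def assign(keys: List[String],
             allowed: Map[String, Set[String]],
             used: Set[String] = Set.empty): List[List[(String, String)]] =
    keys match {
      case Nil => List(Nil)
      case k :: rest =>
        for {
          v    <- (allowed(k) -- used).toList.sorted  // remaining choices for k
          tail <- assign(rest, allowed, used + v)     // solve the other keys
        } yield (k, v) :: tail
    }
}
```

For the data above, `Assignments.assign(List("a","b","c"), Map("a" -> Set("1","2","3"), "b" -> Set("1","2","3"), "c" -> Set("2","3")))` yields exactly four assignments, matching the four result lines in the question.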
When I merge some data, I can't go on...
I have a file, 1.txt:

    1 2
    1 3
    1 4
    1 5
    1 6
    1 7
    2 4
    2 5
    2 7
    2 9

I want to merge the pairs into a result like Map(1 -> List(2,3,4,5,6,7), 2 -> List(4,5,7,9)). What should I do? I started with:

    val file1 = sc.textFile("1.txt")
    val q1 = file1.flatMap(_.split(' '))

but that flattens everything into a single RDD of strings. Maybe I should change RDD[Int] to RDD[(Int, Int)]?

--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/When-I-merge-some-datas-can-t-go-on-tp26341.html
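Yes: build key/value pairs and group by the key. A minimal sketch on plain Scala collections (object and method names are mine); with Spark the equivalent would be `sc.textFile("1.txt").map { l => val Array(k, v) = l.split(' '); (k.toInt, v.toInt) }.groupByKey()`:

```scala
object MergePairs {
  // Parse "k v" lines into (k, v) pairs, then group values by key.
  def merge(lines: Seq[String]): Map[Int, List[Int]] =
    lines.map(_.split(' '))
         .collect { case Array(k, v) => (k.toInt, v.toInt) }  // skip malformed lines
         .groupBy(_._1)
         .map { case (k, kvs) => k -> kvs.map(_._2).toList }
}
```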
sql:Exception in thread "main" scala.MatchError: StringType
The (sbt) Scala source:

    import org.apache.spark.SparkContext
    import org.apache.spark.SparkConf
    import org.apache.spark.sql

    object SimpleApp {
      def main(args: Array[String]) {
        val conf = new SparkConf()
        conf.setAppName("mytest").setMaster("spark://Master:7077")
        val sc = new SparkContext(conf)
        val sqlContext = new sql.SQLContext(sc)
        val d = sqlContext.read.json("/home/hadoop/2015data_test/Data/Data/100808cb11e9898816ef15fcdde4e1d74cbc0b/Db6Jh2XeQ.json")
        sc.stop()
      }
    }

After sbt package I submit it with:

    ./spark-submit --class "SimpleApp" /home/hadoop/Downloads/sbt/bin/target/scala-2.10/simple-project_2.10-1.0.jar

The JSON file:

    {
      "programmers": [
        { "firstName": "Brett", "lastName": "McLaughlin", "email": "" },
        { "firstName": "Jason", "lastName": "Hunter", "email": "" },
        { "firstName": "Elliotte", "lastName": "Harold", "email": "" }
      ],
      "authors": [
        { "firstName": "Isaac", "lastName": "Asimov", "genre": "sciencefiction" },
        { "firstName": "Tad", "lastName": "Williams", "genre": "fantasy" },
        { "firstName": "Frank", "lastName": "Peretti", "genre": "christianfiction" }
      ],
      "musicians": [
        { "firstName": "Eric", "lastName": "Clapton", "instrument": "guitar" },
        { "firstName": "Sergei", "lastName": "Rachmaninoff", "instrument": "piano" }
      ]
    }

The job fails with:

    Exception in thread "main" scala.MatchError: StringType (of class org.apache.spark.sql.types.StringType$)
      at org.apache.spark.sql.json.InferSchema$.apply(InferSchema.scala:58)
      at org.apache.spark.sql.json.JSONRelation$$anonfun$schema$1.apply(JSONRelation.scala:139)

Why?

--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/sql-Exception-in-thread-main-scala-MatchError-StringType-tp25868.html
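One thing worth checking (I cannot confirm it is the cause of this particular MatchError): in Spark 1.x, `sqlContext.read.json` expects one complete JSON object per line of the input file, not a single pretty-printed document spanning many lines. A workaround that also sidesteps schema inference entirely is to supply the schema explicitly. A sketch, untested against this exact file, with field names mirroring the JSON above (only the "programmers" array is shown):

```scala
import org.apache.spark.sql.types._

val person = StructType(Seq(
  StructField("firstName", StringType),
  StructField("lastName", StringType),
  StructField("email", StringType)))

val schema = StructType(Seq(
  StructField("programmers", ArrayType(person))))

// Inference is skipped when a schema is given explicitly.
val d = sqlContext.read.schema(schema).json("/path/to/file.json")
```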
get parameters of spark-submit
1. I wrote my Scala class and packaged it (the HDFS file paths are not hard-coded; they should come from spark-submit's parameters).
2. I then submit it like this (placeholders elided as in my script):

    ${SPARK_HOME/bin}/spark-submit \
      --master \
      \
      hdfs:// \
      hdfs:// \

What should I do, in my Scala class's code (before packaging the jar), to get the two HDFS file paths?

--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/get-parameters-of-spark-submit-tp25749.html
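spark-submit passes everything after the application jar straight through to your `main(args)` as application arguments, so the two paths arrive as `args(0)` and `args(1)`. A minimal sketch (the object name and helper are mine):

```scala
object ArgsDemo {
  // spark-submit --master <url> app.jar hdfs://in1 hdfs://in2
  // puts "hdfs://in1" and "hdfs://in2" into args(0) and args(1).
  def paths(args: Array[String]): (String, String) = {
    require(args.length >= 2, "usage: <input1> <input2>")
    (args(0), args(1))
  }

  def main(args: Array[String]): Unit = {
    val (in1, in2) = paths(args)
    // val rdd1 = sc.textFile(in1); val rdd2 = sc.textFile(in2)
    println(s"inputs: $in1, $in2")
  }
}
```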
worker:java.lang.ClassNotFoundException: ttt.test$$anonfun$1
    package ttt

    import org.apache.spark.SparkConf
    import org.apache.spark.SparkContext

    object test {
      def main(args: Array[String]) {
        val conf = new SparkConf()
        conf.setAppName("mytest")
          .setMaster("spark://Master:7077")
          .setSparkHome("/usr/local/spark")
          .setJars(Array(
            "/home/hadoop/spark-assembly-1.4.0-hadoop2.4.0.jar",
            "/home/hadoop/datanucleus-core-3.2.10.jar",
            "/home/hadoop/spark-1.4.0-yarn-shuffle.jar",
            "spark-examples-1.4.0-hadoop2.4.0.jar"))
        val sc = new SparkContext(conf)
        val rawData = sc.textFile("/home/hadoop/123.csv")
        val secondData = rawData.flatMap(_.split(",").toString)
        println(secondData.first)  // line 32
        sc.stop()
      }
    }

The problem is (219.216.64.55 is my worker's IP):

    15/12/14 03:18:34 WARN TaskSetManager: Lost task 0.0 in stage 0.0 (TID 0, 219.216.64.55): java.lang.ClassNotFoundException: ttt.test$$anonfun$1
      at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
      at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
      at java.security.AccessController.doPrivileged(Native Method)
      at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
      at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
      at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
      at java.lang.Class.forName0(Native Method)
      ...

I can run the examples that Spark provides, but when I try my own program, the workers can't find the class.

--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/worker-java-lang-ClassNotFoundException-ttt-test-anonfun-1-tp25696.html
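The missing class, `ttt.test$$anonfun$1`, is the compiled closure from this very program, so the jar the workers need is the application's own jar, which the setJars list above does not contain (it lists only Spark's jars). A sketch of the fix (the jar path below is hypothetical; it must point at the jar sbt or your IDE built from this project, or equivalently be the jar you pass to spark-submit):

```scala
val conf = new SparkConf()
  .setAppName("mytest")
  .setMaster("spark://Master:7077")
  // Ship the jar that contains ttt.test and its anonymous functions
  // to the workers; Spark's own jars are already on their classpath.
  .setJars(Array("/home/hadoop/target/scala-2.10/my-app_2.10-1.0.jar"))
```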
Re: Re: HELP! I get "java.lang.String cannot be cast to java.lang.Integer" for a long time.
Thank you. I found the problem: my package was test, but I had written package org.apache.spark.examples, and IDEA had imported spark-examples-1.5.2-hadoop2.6.0.jar, so the code ran but caused lots of problems.

Now I changed the package like this:

    package test

    import org.apache.spark.SparkConf
    import org.apache.spark.SparkContext

    object test {
      def main(args: Array[String]) {
        val conf = new SparkConf().setAppName("mytest").setMaster("spark://Master:7077")
        val sc = new SparkContext(conf)
        sc.addJar("/home/hadoop/spark-assembly-1.5.2-hadoop2.6.0.jar")  // It doesn't work!?
        val rawData = sc.textFile("/home/hadoop/123.csv")
        val secondData = rawData.flatMap(_.split(",").toString)
        println(secondData.first)  // line 32
        sc.stop()
      }
    }

It causes:

    15/12/11 18:41:06 WARN TaskSetManager: Lost task 0.0 in stage 0.0 (TID 0, 219.216.65.129): java.lang.ClassNotFoundException: test.test$$anonfun$1
      at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
      at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
      at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
      at java.lang.Class.forName0(Native Method)
      at java.lang.Class.forName(Class.java:348)
      at org.apache.spark.serializer.JavaDeserializationStream$$anon$1.resolveClass(JavaSerializer.scala:67)

Notes: 219.216.65.129 is my worker computer; I can connect to it; Spark starts successfully; addFile also doesn't work, and the tmp file likewise disappears.

At 2015-12-10 22:32:21, "Himanshu Mehra [via Apache Spark User List]" wrote:

> You are trying to print an array, but in any case it will print the object ID of the array if the input is as you have shown here. Try flatMap() instead of map() and check whether the problem is the same.
> --Himanshu
>
> If you reply to this email, your message will be added to the discussion below:
> http://apache-spark-user-list.1001560.n3.nabble.com/HELP-I-get-java-lang-String-cannot-be-cast-to-java-lang-Intege-for-a-long-time-tp25666p25667.html

--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/HELP-I-get-java-lang-String-cannot-be-cast-to-java-lang-Intege-for-a-long-time-tp25666p25689.html
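Separately from the classpath trouble, the line `rawData.flatMap(_.split(",").toString)` has a bug worth noting: calling `.toString` on the Array returned by split yields a JVM object descriptor such as "[Ljava.lang.String;@1b6d3586", and flatMapping over that String then iterates its characters. A small demonstration (object name is mine):

```scala
object SplitDemo {
  // Buggy version: Array.toString gives the array's object descriptor,
  // not its contents.
  def wrong(line: String): String = line.split(",").toString

  // Intended version: keep the array of fields and let flatMap flatten it.
  def right(line: String): Array[String] = line.split(",")
}
```

With an RDD, `rawData.flatMap(_.split(","))` (no toString) produces one element per CSV field, which is almost certainly what was intended.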
Re: HELP! I get "java.lang.String cannot be cast to java.lang.Integer" for a long time.
I did this:

    val secondData = rawData.flatMap(_.split("\t").take(3))

and I see:

    15/12/10 22:36:55 WARN TaskSetManager: Lost task 0.0 in stage 0.0 (TID 0, 219.216.65.129): java.lang.ClassCastException: java.lang.String cannot be cast to java.lang.Integer
      at scala.runtime.BoxesRunTime.unboxToInt(BoxesRunTime.java:106)
      at org.apache.spark.examples.SparkPi$$anonfun$1.apply(SparkPi.scala:32)
      at scala.collection.Iterator$$anon$13.hasNext(Iterator.scala:371)
      at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:308)

--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/HELP-I-get-java-lang-String-cannot-be-cast-to-java-lang-Intege-for-a-long-time-tp25666p25668.html