output the data (txt)

2016-02-27 Thread Bonsen
I get results from an RDD, like:
Array(Array(1,2,3), Array(2,3,4), Array(3,4,6))
How can I output them to 1.txt like:
1 2 3
2 3 4
3 4 6
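
A minimal sketch of one way to do this (assuming the rows sit in an RDD[Array[Int]]; the variable name rows is a placeholder): render each row as a space-separated string and write it out. Note that saveAsTextFile creates a directory named 1.txt containing part files, not a bare file.

val rows = sc.parallelize(Seq(Array(1, 2, 3), Array(2, 3, 4), Array(3, 4, 6)))
rows.map(_.mkString(" "))    // "1 2 3", "2 3 4", "3 4 6"
  .coalesce(1)               // one partition, so a single part file
  .saveAsTextFile("1.txt")   // writes the directory 1.txt/part-00000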







deal with the data's structure

2016-02-27 Thread Bonsen
Now I have a map:
val ji = scala.collection.mutable.Map[String, scala.collection.mutable.ArrayBuffer[String]]()
It holds many entries, like:
ji = Map("a" -> ArrayBuffer("1","2","3"), "b" -> ArrayBuffer("1","2","3"), "c" -> ArrayBuffer("2","3"))

if "a" choose "1","b" and "c" can't choose "1",
for example,
ji = map("b"->ArrayBuffer["2","3"],"c"->ArrayBuffer["2","3"])
if "b" choose "2","c" can't choose "2",
for example,
ji = map("c"->ArrayBuffer["3"])

I need to get all the possibilities, sort them from smallest to largest, and output them to result.txt. Results like:
a:1  b:2  c:3
a:1  c:2  b:3
b:1  a:2  c:3
b:1  c:2  a:3

Finally, we get result.txt:
 a b c
 a c b
 b a c
 b c a

What should I do? This seems difficult to me.
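
A minimal sketch of the enumeration (assuming the keys and candidate values fit in driver memory, so plain Scala collections suffice; the sample data mirrors the example above): walk every permutation of the keys, assign "1", "2", "3" in order, and keep the orderings where each key's candidate set contains its assigned value.

import java.io.PrintWriter

val ji = Map(
  "a" -> Set("1", "2", "3"),
  "b" -> Set("1", "2", "3"),
  "c" -> Set("2", "3"))

val keys   = ji.keys.toSeq.sorted               // Seq("a", "b", "c")
val values = (1 to keys.size).map(_.toString)   // Seq("1", "2", "3")

// keep each ordering of the keys in which the i-th key may take value i
val valid = keys.permutations
  .filter(_.zip(values).forall { case (k, v) => ji(k)(v) })
  .map(_.mkString(" "))
  .toSeq
  .sorted

val out = new PrintWriter("result.txt")
valid.foreach(out.println)   // a b c / a c b / b a c / b c a
out.close()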






When I merge some data, I can't go on...

2016-02-25 Thread Bonsen
I have a file, 1.txt, like this:
1 2
1 3 
1 4 
1 5 
1 6 
1 7
2 4
2 5
2 7
2 9

I want to merge them; the result should look like:
Map(1 -> List(2,3,4,5,6,7), 2 -> List(4,5,7,9))
What should I do?
val file1 = sc.textFile("1.txt")
val q1 = file1.flatMap(_.split(' '))   // ??? stuck here
Maybe I should change the RDD[Int] into an RDD[(Int, Int)]?
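
A minimal sketch of one approach (assuming each line holds exactly two whitespace-separated integers): parse each line into a (key, value) pair, then group by key.

val file1 = sc.textFile("1.txt")
val pairs = file1.map { line =>
  val Array(k, v) = line.trim.split("\\s+")   // trim guards against trailing spaces
  (k.toInt, v.toInt)
}                                             // RDD[(Int, Int)]
val merged = pairs.groupByKey().mapValues(_.toList.sorted)   // RDD[(Int, List[Int])]
println(merged.collect().toMap)   // Map(1 -> List(2,3,4,5,6,7), 2 -> List(4,5,7,9))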






sql:Exception in thread "main" scala.MatchError: StringType

2016-01-03 Thread Bonsen
Scala code (built with sbt):
import org.apache.spark.SparkContext
import org.apache.spark.SparkConf
import org.apache.spark.sql

object SimpleApp {
  def main(args: Array[String]) {
    val conf = new SparkConf()
    conf.setAppName("mytest").setMaster("spark://Master:7077")
    val sc = new SparkContext(conf)
    val sqlContext = new sql.SQLContext(sc)
    val d = sqlContext.read.json("/home/hadoop/2015data_test/Data/Data/100808cb11e9898816ef15fcdde4e1d74cbc0b/Db6Jh2XeQ.json")
    sc.stop()
  }
}
__
After sbt package:
./spark-submit --class "SimpleApp" \
  /home/hadoop/Downloads/sbt/bin/target/scala-2.10/simple-project_2.10-1.0.jar
___
JSON file:
{
  "programmers": [
    { "firstName": "Brett",    "lastName": "McLaughlin",  "email": "" },
    { "firstName": "Jason",    "lastName": "Hunter",      "email": "" },
    { "firstName": "Elliotte", "lastName": "Harold",      "email": "" }
  ],
  "authors": [
    { "firstName": "Isaac", "lastName": "Asimov",   "genre": "sciencefiction" },
    { "firstName": "Tad",   "lastName": "Williams", "genre": "fantasy" },
    { "firstName": "Frank", "lastName": "Peretti",  "genre": "christianfiction" }
  ],
  "musicians": [
    { "firstName": "Eric",   "lastName": "Clapton",      "instrument": "guitar" },
    { "firstName": "Sergei", "lastName": "Rachmaninoff", "instrument": "piano" }
  ]
}
___
Exception in thread "main" scala.MatchError: StringType (of class org.apache.spark.sql.types.StringType$)
at org.apache.spark.sql.json.InferSchema$.apply(InferSchema.scala:58)
at org.apache.spark.sql.json.JSONRelation$$anonfun$schema$1.apply(JSONRelation.scala:139)
___
Why does this happen?
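
One plausible cause, not confirmed in this thread: Spark 1.x's JSON data source expects one complete, self-contained JSON object per line, so a pretty-printed file like the one above can derail schema inference. A minimal workaround sketch (the path is a placeholder):

// Read each pretty-printed file whole (one (path, content) pair per file),
// then hand the JSON strings to the reader, which also accepts an RDD[String].
val raw = sc.wholeTextFiles("/path/to/file.json").map(_._2)
val d = sqlContext.read.json(raw)
d.printSchema()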






get parameters of spark-submit

2015-12-21 Thread Bonsen
1. I write my Scala class and package it; the HDFS file paths are not hard-coded, they come in as spark-submit parameters.
2. Then I invoke it like this:
${SPARK_HOME}/bin/spark-submit \
  --master <master-url> \
  <application-jar> \
  hdfs://<path1> \
  hdfs://<path2>

What should I do to get the two HDFS file paths in my Scala class's code (before packaging the jar)?
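
A minimal sketch (the object name MyApp is a placeholder): everything listed after the application jar on the spark-submit command line reaches main() as the args array, so the two paths arrive as args(0) and args(1).

import org.apache.spark.{SparkConf, SparkContext}

object MyApp {
  def main(args: Array[String]): Unit = {
    val path1 = args(0)   // first hdfs:// path from the command line
    val path2 = args(1)   // second hdfs:// path from the command line
    val sc = new SparkContext(new SparkConf().setAppName("mytest"))
    println(sc.textFile(path1).count())
    println(sc.textFile(path2).count())
    sc.stop()
  }
}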







worker:java.lang.ClassNotFoundException: ttt.test$$anonfun$1

2015-12-14 Thread Bonsen
package ttt

import org.apache.spark.SparkConf
import org.apache.spark.SparkContext

object test {
  def main(args: Array[String]) {
    val conf = new SparkConf()
    conf.setAppName("mytest")
      .setMaster("spark://Master:7077")
      .setSparkHome("/usr/local/spark")
      .setJars(Array(
        "/home/hadoop/spark-assembly-1.4.0-hadoop2.4.0.jar",
        "/home/hadoop/datanucleus-core-3.2.10.jar",
        "/home/hadoop/spark-1.4.0-yarn-shuffle.jar",
        "spark-examples-1.4.0-hadoop2.4.0.jar"))
    val sc = new SparkContext(conf)
    val rawData = sc.textFile("/home/hadoop/123.csv")
    val secondData = rawData.flatMap(_.split(",").toString)
    println(secondData.first)   // line 32
    sc.stop()
  }
}
__
The problem is (219.216.64.55 is my worker's IP):
15/12/14 03:18:34 WARN TaskSetManager: Lost task 0.0 in stage 0.0 (TID 0, 219.216.64.55): java.lang.ClassNotFoundException: ttt.test$$anonfun$1
at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
at java.lang.Class.forName0(Native Method)
...
__
I can run the examples that Spark provides, but when I try my own program, it can't find the class.
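
A hedged guess at the fix: ttt.test$$anonfun$1 is the compiled closure inside the application's own jar, and that jar never appears in the setJars list above, so the executors cannot load it. A sketch (the jar path is an assumption about where the build put the output):

// Ship the application's own jar - the one that contains ttt.test and its
// closures - rather than the Spark assembly jars, which the cluster already has.
conf.setJars(Seq("/home/hadoop/myapp/target/scala-2.10/myapp_2.10-1.0.jar"))

Alternatively, submitting the jar through spark-submit ships it automatically, with no setJars call needed.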






Re:Re: HELP! I get "java.lang.String cannot be cast to java.lang.Intege " for a long time.

2015-12-11 Thread Bonsen
Thank you. I found the problem: my package is test, but I had written package org.apache.spark.examples, and IDEA had imported spark-examples-1.5.2-hadoop2.6.0.jar, so it could run, and that caused lots of problems.
__
Now I change the package like this:


package test

import org.apache.spark.SparkConf
import org.apache.spark.SparkContext

object test {
  def main(args: Array[String]) {
    val conf = new SparkConf().setAppName("mytest").setMaster("spark://Master:7077")
    val sc = new SparkContext(conf)

    sc.addJar("/home/hadoop/spark-assembly-1.5.2-hadoop2.6.0.jar")  // It doesn't work!?

    val rawData = sc.textFile("/home/hadoop/123.csv")
    val secondData = rawData.flatMap(_.split(",").toString)
    println(secondData.first)   // line 32
    sc.stop()
  }
}

It causes this:
15/12/11 18:41:06 WARN TaskSetManager: Lost task 0.0 in stage 0.0 (TID 0, 219.216.65.129): java.lang.ClassNotFoundException: test.test$$anonfun$1
at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:348)
at org.apache.spark.serializer.JavaDeserializationStream$$anon$1.resolveClass(JavaSerializer.scala:67)




// 219.216.65.129 is my worker computer.
// I can connect to my worker computer.
// Spark starts successfully.
// addFile doesn't work either; the temp file also disappears.








At 2015-12-10 22:32:21, "Himanshu Mehra [via Apache Spark User List]" wrote:
You are trying to print an array; it will print the array's object ID if the input is as you have shown here. Try flatMap() instead of map() and check whether the problem is the same.

   --Himanshu
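
For reference, a small standalone sketch of the pitfall Himanshu describes (plain Scala, nothing Spark-specific assumed):

val fields = "1,2,3".split(",")
println(fields.toString)        // prints something like [Ljava.lang.String;@6d06d69c - the object ID
println(fields.mkString(","))   // prints 1,2,3 - a readable rendering
// likewise, rawData.flatMap(_.split(",")) yields the fields themselves as an RDD[String]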






Re: HELP! I get "java.lang.String cannot be cast to java.lang.Intege " for a long time.

2015-12-10 Thread Bonsen
I did this:
val secondData = rawData.flatMap(_.split("\t").take(3))
and I get:
15/12/10 22:36:55 WARN TaskSetManager: Lost task 0.0 in stage 0.0 (TID 0, 219.216.65.129): java.lang.ClassCastException: java.lang.String cannot be cast to java.lang.Integer
at scala.runtime.BoxesRunTime.unboxToInt(BoxesRunTime.java:106)
at org.apache.spark.examples.SparkPi$$anonfun$1.apply(SparkPi.scala:32)
at scala.collection.Iterator$$anon$13.hasNext(Iterator.scala:371)
at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:308)

