Thanks Ted,
that helped me. It turned out that I had formatted the server name
incorrectly; I had to add spark:// in front of the server name.
Cheers,
Andrejs
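The fix above (prefixing the standalone master with spark://) can be sketched as a quick sanity check before the string reaches setMaster(). This is a plain-Scala sketch; MasterUrlCheck and looksValid are hypothetical names, not part of any Spark API, and the prefix list is an assumption covering the common master URL schemes:

```scala
// Hypothetical helper: checks that a master string looks like a valid
// Spark master URL. A standalone cluster needs the spark:// scheme,
// e.g. "spark://myServerName:7077"; a bare hostname is rejected.
object MasterUrlCheck {
  private val ValidPrefixes = Seq("spark://", "mesos://", "yarn", "local")

  def looksValid(master: String): Boolean =
    ValidPrefixes.exists(master.startsWith)
}
```

For example, looksValid("myServerName") is false, while looksValid("spark://myServerName:7077") and looksValid("local[2]") are true.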
On 11/11/15 14:26, Ted Yu wrote:
Please take a look at
launcher/src/test/java/org/apache/spark/launcher/SparkLauncherSuite.java
to see how it is used.
spark-1.4.1-bin-hadoop2.6")
  .setAppResource("/home/user/MyCode/forSpark/wordcount.py")
  .addPyFile("/home/andabe/MyCode/forSpark/wordcount.py")
  .setMaster("myServerName")
  .setAppName("pytho2word")
  .launch();
println("finishing")
spark.waitFor();
println("finished")
Any help is appreciated.
Cheers,
Andrejs
Thank you for the information.
Cheers,
Andrejs
On 04/18/2015 10:23 AM, Nick Pentreath wrote:
> ES-hadoop uses a scan & scroll search to efficiently retrieve large
> result sets. Scores are not tracked in a scan, and sorting is not
> supported, hence the 0 scores.
>
> http://www.
hip, Butler County, Ohio, in
the United States. It is located about ten miles southwest of Hamilton on
Howards Creek, a tributary of the Great Miami River in section 28 of R1ET3N of
the Congress Lands. It is three miles west of Shandon and two miles south of
Okeana.", _metadata -> Map(_index -> dbpedia, _type -> docs, _id ->
AUy5aQs7895C6HE5GmG4, _score -> 0.0))
As you can see, the _score is 0.
Would appreciate any help,
Cheers,
Andrejs
Hi,
Can someone please suggest the best way to output Spark data as a
JSON file (a file where each line is a JSON object)?
Cheers,
Andrejs
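For the one-JSON-object-per-line layout asked about above, Spark 1.x DataFrames can do this directly with df.toJSON.saveAsTextFile(path). The per-line formatting itself can be sketched in plain Scala; JsonLines and jsonLine are hypothetical names, and this toy version only handles flat string-to-string records:

```scala
// Hypothetical helper that renders one record as a single JSON line,
// mirroring the one-object-per-line output of df.toJSON in Spark.
object JsonLines {
  // Minimal JSON string escaping for quotes, backslashes and newlines.
  private def escape(s: String): String =
    s.flatMap {
      case '"'  => "\\\""
      case '\\' => "\\\\"
      case '\n' => "\\n"
      case c    => c.toString
    }

  // One record becomes one line: {"key":"value",...}
  def jsonLine(record: Map[String, String]): String =
    record
      .map { case (k, v) => "\"" + escape(k) + "\":\"" + escape(v) + "\"" }
      .mkString("{", ",", "}")
}
```

With an RDD of such records, rdd.map(JsonLines.jsonLine).saveAsTextFile("out") would write the line-delimited JSON file, though for real data a proper JSON library (or the built-in df.toJSON) is the safer choice.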
I found my problem. Based on the TF-IDF article on Wikipedia, I assumed
that log base 10 is used, but as I found in this discussion
<https://groups.google.com/forum/#!topic/scala-language/K5tbYSYqQc8>, in
Scala it is actually ln (the natural logarithm).
Regards,
Andrejs
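The log-base difference above is just a constant factor of ln(10), so rankings are unaffected but the raw values differ. A small sketch with hypothetical counts (MLlib 1.x documents its IDF as the natural log of (numDocs + 1) / (docFreq + 1); treat the exact smoothing as an assumption for other versions):

```scala
// IDF computed with the natural log (as MLlib does) vs. base-10 log
// (as the Wikipedia example uses). Same counts, different scale.
object IdfLogBase {
  val numDocs = 100.0 // hypothetical corpus size
  val docFreq = 10.0  // hypothetical document frequency of a term

  // MLlib-style IDF (natural log, with +1 smoothing):
  val idfNatural = math.log((numDocs + 1.0) / (docFreq + 1.0))
  // The same formula in base 10:
  val idfBase10 = math.log10((numDocs + 1.0) / (docFreq + 1.0))
}
```

The two values differ exactly by the factor ln(10) ≈ 2.3026, which explains seeing larger-than-expected IDF weights when expecting base-10 logs.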
On Thu, Oct 30, 2014 at 10:49 PM,
Best regards,
Andrejs
Hi,
I'm new to MLlib and Spark. I'm trying to use TF-IDF and use those
values for term ranking.
I'm getting tf values in vector format, but how can I get the values
out of the vector?
val sc = new SparkContext(conf)
val documents: RDD[Seq[String]] =
sc.textFile("/home/andr
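The snippet above is truncated, but to the question of reading values out of a tf vector: in MLlib, org.apache.spark.mllib.linalg.Vector exposes toArray (and per-index apply), so tf.map(_.toArray) yields the raw weights. Since Spark is not available here, a plain-Scala sketch of what toArray does for a sparse vector; SparseVec is a hypothetical stand-in, not the MLlib class:

```scala
// Plain-Scala sketch mirroring how a sparse MLlib vector expands to a
// dense array of values. With MLlib you would simply call tfVector.toArray.
final case class SparseVec(size: Int, indices: Array[Int], values: Array[Double]) {
  def toArray: Array[Double] = {
    val dense = new Array[Double](size) // zero-initialised
    var i = 0
    while (i < indices.length) {
      dense(indices(i)) = values(i) // place each stored value at its index
      i += 1
    }
    dense
  }
}
```

For example, SparseVec(5, Array(1, 3), Array(2.0, 7.0)).toArray gives Array(0.0, 2.0, 0.0, 7.0, 0.0). In Spark, pairing those arrays with the vocabulary indices then gives term-to-weight pairs for ranking.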