What you say does not imply that numpy can interoperate with existing
Spark machine learning code. It is also certainly the case that no numpy
code currently uses Spark.
It may well be that users could use numpy in closures being sent to Spark,
but that is a far walk from useful parallel numerical computing.
Ted, I am not too sure, but this FAQ entry, https://spark.apache.org/faq.html,
suggests otherwise, I think: "Does Spark require modified versions of Scala
or Python? No. Spark requires no changes to Scala or compiler plugins. The
Python API uses the standard CPython implementation, and can call into
existing C libraries for Python such as NumPy."
I am trying to create a vector file from Lucene within Java. Mahout is
trying to invoke Lucene 3.x classes, not me.
I would be grateful if someone could give me sample code showing how to
create a vector file from a Lucene 4.x index directory.
http://mahout.apache.org/users/basics/creating-vectors-from-text.html
You can't be using Lucene 4.x with Lucene 3.x; Lucene 4.x is not backward
compatible with Lucene 3.x.
Are you trying to set term vectors and offsets? If so, it is done
differently with Lucene 4.x; see TestClusterDumper.java for an example.
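For concreteness, here is a minimal sketch of indexing a field with term
vectors and offsets enabled under Lucene 4.x, where that configuration moved
onto FieldType. The field name, directory, and analyzer are illustrative
choices, not taken from TestClusterDumper.java.

// Minimal sketch for Lucene 4.6: term vectors are configured on a FieldType,
// not via the old 3.x Field.TermVector enum. Paths and names are examples.
import java.io.File;

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.FieldType;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.store.FSDirectory;
import org.apache.lucene.util.Version;

public class TermVectorIndexer {
    public static void main(String[] args) throws Exception {
        FieldType type = new FieldType();
        type.setIndexed(true);
        type.setTokenized(true);
        type.setStored(true);
        type.setStoreTermVectors(true);        // replaces Field.TermVector.YES
        type.setStoreTermVectorPositions(true);
        type.setStoreTermVectorOffsets(true);  // the offsets asked about above
        type.freeze();                         // make the FieldType immutable

        IndexWriterConfig conf = new IndexWriterConfig(
            Version.LUCENE_46, new StandardAnalyzer(Version.LUCENE_46));
        try (IndexWriter writer = new IndexWriter(
                FSDirectory.open(new File("ressources/mahout/tmp")), conf)) {
            Document doc = new Document();
            doc.add(new Field("title", "an example title", type));
            writer.addDocument(doc);
        }
    }
}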
On Thu, Oct 23, 2014 at 7:15 PM, Benjamin Eckstein wrote:
What information do you need?
I use Mahout 0.9 and Lucene 4.6.1 via a Maven dependency.
These two lines in the main method produce the error:
// Each flag and its value must be separate array elements for the option parser.
String[] args = {"--field", "title", "--dir", "ressources/mahout/tmp",
    "--dictOut", "term_dictionary.txt", "--output", "sequence.file",
    "--idfield", "isbn"};
org.apache.mahout.utils.vectors.lucene.Driver.main(args);
I have posted more details on Stack Overflow; see
http://stackoverflow.com/q
Can you please provide more information?
On Thu, Oct 23, 2014 at 3:51 PM, Benjamin Eckstein wrote:
> Hello, I have two lines of code that produce a ClassNotFoundException.
Hello, I have two lines of code that produce a ClassNotFoundException.
Hmmm
I don't think that the array formats used by Spark are compatible with the
formats used by numpy.
I could be wrong, but even if there isn't outright incompatibility, there
is likely to be some significant overhead in format conversion.
On Tue, Oct 21, 2014 at 6:12 PM, Vibhanshu Prasad wrote:
Off the list I've heard of problems using the Maven artifacts for Spark even
when you are not building Spark. There have been reported problems with the
serialization class UIDs generated when building Mahout. If you encounter
those, try the build method in the PR and report them to the Spark folks.
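As background on the UID problem: unless a serialVersionUID is pinned
explicitly, Java derives one from the class's shape and compilation details,
so two builds of the "same" class can refuse to deserialize each other's
objects. A made-up illustration follows; the class is a placeholder, not
Mahout or Spark code.

import java.io.Serializable;

// Placeholder class, not from Mahout or Spark. Without the explicit UID
// below, the JVM generates one from the class's fields, methods, and
// compiler output; if two builds generate different UIDs, deserialization
// fails with InvalidClassException ("local class incompatible").
public class ShippedClosure implements Serializable {
    private static final long serialVersionUID = 1L; // pinning avoids the mismatch
    final double weight;
    public ShippedClosure(double weight) { this.weight = weight; }
}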
Hi Ted,
What are MapR classifiers? Do you mean MapReduce?
Since the data is streaming, should we store it in a database such as a
NoSQL DB, export it to Hadoop (if the data is huge), build the model, and
then deploy the model in production to classify the streaming data in real
time?
But ho
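To make the proposed pipeline concrete, here is a rough sketch of its shape:
train in batch offline, then keep only the trained model on the streaming
path. Every name below is a hypothetical placeholder, not a Mahout, Hadoop,
or NoSQL API.

import java.util.List;

// Hypothetical sketch of the pipeline described above: train offline over
// historical data (e.g. exported from a NoSQL store into Hadoop), then
// deploy only the trained model on the real-time path.
public class StreamingClassificationSketch {

    // Stands in for whatever model the offline batch job produces.
    static final class Model {
        private final double threshold;
        Model(double threshold) { this.threshold = threshold; }
        String classify(double score) {
            return score >= threshold ? "positive" : "negative";
        }
    }

    // Offline phase: a placeholder "training" pass over stored history.
    static Model trainOffline(List<Double> history) {
        double mean = history.stream()
            .mapToDouble(Double::doubleValue).average().orElse(0.5);
        return new Model(mean);
    }

    public static void main(String[] args) {
        Model model = trainOffline(List.of(0.2, 0.9, 0.4));
        // Online phase: each arriving record is scored immediately; the
        // batch system never sits on the hot path.
        for (double record : new double[] {0.95, 0.1}) {
            System.out.println(model.classify(record));
        }
    }
}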