We are moving to higher-performance platforms than Hadoop MapReduce, such as
Spark. You can still write map/reduce-style code, but Mahout is not accepting
new Hadoop MR code.
On Oct 1, 2014, at 6:30 AM, Arian Pasquali ar...@arianpasquali.com wrote:
Yes Suneel,
Indeed, it is in MR fashion.
What exactly do
Hey guys,
I think it is fair to give you some feedback.
I managed to implement BM25+ (http://en.wikipedia.org/wiki/Okapi_BM25) term
scoring on Mahout.
It was straightforward using the current TFIDF implementation as an example.
Basically what I did was implement the interface
How did you implement BM25PartialVectorReducer and BM25Converter? The
present implementations of TFIDFConverter and Reducer are MR.
Mahout is not accepting any new MapReduce code.
On Wed, Oct 1, 2014 at 7:18 AM, Arian Pasquali ar...@arianpasquali.com
wrote:
Hey guys,
I think it is fair to give
Thanks so much for the feedback. Glad to hear it was straightforward.
But the important question is
how did BM25 work for you?
On Wed, Oct 1, 2014 at 6:18 AM, Arian Pasquali ar...@arianpasquali.com
wrote:
Hey guys,
I think it is fair to give you some feedback.
I managed to
Hi Ted,
My dataset is a collection of documents in German, and I can say that the
scores seem better compared to my TFIDF scores. Results make more sense
now, especially my bi-grams.
Arian Pasquali
http://about.me/arianpasquali
2014-10-01 13:09 GMT+01:00 Ted Dunning ted.dunn...@gmail.com:
Yes Suneel,
Indeed, it is in MR fashion.
What exactly do you mean when you said Mahout is not accepting any new
MapReduce code?
Do you mean for submitting a patch?
I'm sure there might be better ways to implement it, but I'm more
interested in the results right now.
What would be your
On Wed, Oct 1, 2014 at 7:52 AM, Arian Pasquali ar...@arianpasquali.com
wrote:
My dataset is a collection of documents in German, and I can say that the
scores seem better compared to my TFIDF scores. Results make more sense
now, especially my bi-grams.
OK.
I will take note.
Yes,
I'm studying his work (http://nlp.uned.es/~jperezi/Lucene-BM25/) and
Mahout's current TFIDF code.
I'm trying to understand how I would port that to MR.
I'll try to share something if I succeed.
Arian Pasquali
http://about.me/arianpasquali
2014-09-24 5:12 GMT+01:00 Suneel Marthi
Hello everyone,
I'm very sorry to barge in like this. I have been added to the mailing list
(I think), but it seems that I'm somehow unable to ask a question; that
is, I asked a question four times and got no answer. I hope this way
will work.
I'm new to Mahout and I've been struggling with
@Marko, Subject: Streaming KMeans
See
http://stackoverflow.com/questions/17272296/how-to-use-mahout-streaming-k-means/18090471#18090471
for how to invoke Streaming KMeans.
Also look at examples/bin/cluster-reuters.sh for the Streaming KMeans
option.
On Wed, Sep 24, 2014 at 11:34 AM, Marko
Marko,
Suneel's answer is much better than mine.
On Wed, Sep 24, 2014 at 10:10 PM, Suneel Marthi suneel.mar...@gmail.com
wrote:
@Marko, Subject: Streaming KMeans
See
http://stackoverflow.com/questions/17272296/how-to-use-mahout-streaming-k-means/18090471#18090471
for how to invoke
Hi,
I was wondering if it would be possible to support BM25 term weighting by
extending Mahout's tf-idf implementation.
I was curious to know if anyone here has already tried to do so.
If not, what would be your suggestion for such an implementation in Mahout?
Arian Pasquali
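
For readers following the thread, here is a minimal sketch of what a BM25 (and BM25+) term weight could look like next to a tf-idf weight. It is not the implementation discussed in this thread and does not use Mahout's or Lucene's API: the class name Bm25Sketch, its method names, and the parameter values k1 = 1.2, b = 0.75 and delta = 1.0 are assumptions chosen for illustration, and the non-negative IDF form is one common variant among several.

```java
// Hypothetical, self-contained sketch; not Mahout or Lucene code.
final class Bm25Sketch {
    // Commonly used defaults, assumed here for illustration only.
    static final double K1 = 1.2;    // term-frequency saturation
    static final double B = 0.75;    // strength of document-length normalization
    static final double DELTA = 1.0; // BM25+ lower bound added per matching term

    // Non-negative IDF variant: ln(1 + (N - n + 0.5) / (n + 0.5)).
    static double idf(long numDocs, long docFreq) {
        return Math.log(1.0 + (numDocs - docFreq + 0.5) / (docFreq + 0.5));
    }

    // Saturated, length-normalized term-frequency component shared by both scores.
    static double tfComponent(double tf, double docLen, double avgDocLen) {
        double norm = K1 * (1.0 - B + B * docLen / avgDocLen);
        return (tf * (K1 + 1.0)) / (tf + norm);
    }

    // Okapi BM25 weight of one term in one document.
    static double bm25(double tf, long docFreq, long numDocs,
                       double docLen, double avgDocLen) {
        if (tf <= 0) {
            return 0.0; // term absent from the document
        }
        return idf(numDocs, docFreq) * tfComponent(tf, docLen, avgDocLen);
    }

    // BM25+ adds DELTA to the tf component so long documents are not over-penalized.
    static double bm25Plus(double tf, long docFreq, long numDocs,
                           double docLen, double avgDocLen) {
        if (tf <= 0) {
            return 0.0; // term absent from the document
        }
        return idf(numDocs, docFreq) * (tfComponent(tf, docLen, avgDocLen) + DELTA);
    }

    public static void main(String[] args) {
        // Made-up numbers: term occurs 3 times in a 120-token document,
        // appears in 50 of 10,000 documents, average length 100 tokens.
        System.out.println("BM25  = " + bm25(3, 50, 10_000, 120, 100));
        System.out.println("BM25+ = " + bm25Plus(3, 50, 10_000, 120, 100));
    }
}
```

Compared with a plain tf-idf weight, the inputs that are new are the document length and the corpus-wide average document length, so any port (MR or otherwise) would need those two statistics available wherever the per-term weight is computed.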
Should be pretty easy. I haven't heard of anyone doing it.
Sent from my iPhone
On Sep 23, 2014, at 18:53, Arian Pasquali ar...@arianpasquali.com wrote:
Hi,
I was wondering if it would be possible to support BM25 term weighting by
extending Mahout's tf-idf implementation.
I was curious to
Lucene 4.x supports Okapi BM25, so it should be easy to implement.
On Tue, Sep 23, 2014 at 11:57 PM, Ted Dunning ted.dunn...@gmail.com wrote:
Should be pretty easy. I haven't heard of anyone doing it.
Sent from my iPhone
On Sep 23, 2014, at 18:53, Arian Pasquali ar...@arianpasquali.com