stuff. I am new to both one and I
> want to choose one to focus on.
>
>
>
> On Tue, Jul 10, 2012 at 4:35 PM, Ted Dunning
> wrote:
>
> > Note that on page 6 they explicitly say that if they had to actually read
> > their input, this wouldn't help. Since they
On Fri, Jul 13, 2012 at 12:09 PM, Masoud Moshref Javadi wrote:
> First of all thank you for your response with pictures.
> That's true. Some features are 1 in many points and some are not. That's
> the nature of my problem. But I did not scale features.
> Should I do scaling? may be using a dimens
For document-like objects, search using text retrieval-like
techniques (but in batch) is good.
For reduced-dimension document-like things, you need to go to
alternative methods to do full-scale nearest neighbor computations. With a
strong metric like L_2, you can do all-points nearest n
Solr would do this well. The upcoming knn package would do it differently
and for different purposes, but also would do it well.
On Sat, Jul 14, 2012 at 8:17 AM, Pat Ferrel wrote:
> Interesting.
>
> I have another requirement, which is to do something like real time vector
> based queries. Imagi
I would call it kinda-cosine distance. There are some intricate
normalization factors.
On Sat, Jul 14, 2012 at 5:22 PM, Lance Norskog wrote:
> Lucene's MoreLikeThis feature does cosine distance (I think) directly
> against term vectors.
>
> On Sat, Jul 14, 2012 at 11:1
It is possibly sparseness, but more likely this is the known pathology of
the adaptive logistic regression in which it gets over-confident and locks
down training rate too early.
I have a few suggestions:
1) try the OnlineLogisticRegression. I think that you can find decent
training parameters p
For picking terms from a document that stand apart from those in a large
corpus, this tf*idf trick is nearly identical to using the latent log
likelihood test. It produces pretty darned good results.
On Tue, Jul 17, 2012 at 8:22 PM, Ken Krugler wrote:
> The simplistic approach I used was to extr
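A minimal sketch of that tf*idf trick for picking standout terms (the corpus statistics and document below are invented for illustration):

```python
import math

# Toy corpus statistics (invented): document frequencies over N documents.
N = 1000
df = {"the": 990, "mahout": 12, "cluster": 40}

doc = ["the", "mahout", "cluster", "mahout", "the", "the"]
tf = {w: doc.count(w) for w in set(doc)}

# Score each term by tf * idf; high scores mark terms that stand apart
# from the large corpus.
score = {w: tf[w] * math.log(N / df[w]) for w in tf}
top = max(score, key=score.get)
```

Common words like "the" score near zero despite high term frequency, while corpus-rare terms float to the top.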
Folks have done SVD on very large matrices with Mahout, but not necessarily
for spectral clustering.
Are you sure that you actually need 4000 vectors? As sparse as your data
is, I would expect that no more than a few hundred are anything but
statistical noise.
On Thu, Jul 19, 2012 at 6:32 PM, An
> Hi Ted,
> Thanks for your reply.
> I am doing clustering of 10^6 objects (thus affinity matrix of that size)
> and expect 4000-10,000 clusters. That's why I need those many eigenvectors.
>
> Will SVD be faster in this case ?
>
> Aniruddha
>
>
>
> On Jul 19, 2012, a
I am unaware of such comparisons. I also don't know of any practical
implementations for doing really huge decompositions in parallel.
On Sat, Jul 28, 2012 at 10:27 AM, mohsen jadidi wrote:
> Thank you for your replies. What I am interested to know is that if I want
> to compute the SVD for hug
The algorithm used doesn't change this. If U S V' = A is the SVD of A, then
A' A = (U S V')' U S V'
= V S U' U S V'
= V S^2 V'
On Thu, Jul 26, 2012 at 4:31 PM, John Stewart wrote:
> With Lanczos, the eigenvectors of A'A give you the orthogonal matrix V of
> SVD, and th
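The identity above is easy to check numerically; a quick sketch in NumPy (illustration only, not Mahout code):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((6, 4))

# Thin SVD: A = U S V'
U, s, Vt = np.linalg.svd(A, full_matrices=False)

# A'A = V S^2 V': the eigenvalues of A'A are the squared singular values,
# and V diagonalizes A'A.
evals = np.linalg.eigvalsh(A.T @ A)[::-1]   # eigvalsh is ascending; reverse
assert np.allclose(evals, s ** 2)
assert np.allclose(Vt @ (A.T @ A) @ Vt.T, np.diag(s ** 2))
```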
Pat,
Seed selection is a big deal. See this paper for some ideas:
http://www.math.uwaterloo.ca/~cswamy/papers/kmeansfnl.pdf
On Mon, Jul 30, 2012 at 11:33 AM, Pat Ferrel wrote:
> I need to create groups of items that are similar to a seed item. This
> seed item may be a synthetic vector or may
Why are you using Lanczos? Why not use something more recent?
On Tue, Jul 31, 2012 at 7:00 PM, Aniruddha Basak wrote:
> Hi,
> I am working on Spectral Kmeans which involves an eigen-decomposition step
> using Lanczos. As I did not get exact similar results as expected, I tried
> to understand th
ternative of Lanczos.
>
> Thanks,
> Aniruddha
>
>
> -Original Message-
> From: Ted Dunning [mailto:ted.dunn...@gmail.com]
> Sent: Tuesday, July 31, 2012 6:24 PM
> To: user@mahout.apache.org
> Cc: Jake Mannix
> Subject: Re: Mahout LanczosSolver explanation
>
>
I would like to endorse this point.
If your sparse data fits in memory on a single machine, it is very unlikely
that you will be able to improve on the cost of doing a stochastic
projection on that one machine using any Hadoop based solution.
Even with MPI and crazy RDMA networking, I doubt that
tf-idf is a good approximation of the LLR score for many applications and
often gives useful signatures although not always super pretty.
It helps to have an overall minimum document frequency for terms to be
considered as tags. This is the same as an IDF maximum.
On Fri, Aug 3, 2012
Unstemming is pretty simple. Just build an unstemming dictionary based on
seeing what word forms have led to a stemmed form. Include frequencies.
When unstemming in the context of a document, pick the most popular
(corpus-wide) version that actually appears in the document.
On Fri, Aug 3, 2012
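A toy sketch of that unstemming scheme (the stemmer and corpus here are invented for illustration):

```python
from collections import Counter, defaultdict

# Corpus-wide counts of the surface forms behind each stem.
stem = lambda w: "run" if w.startswith("run") else w   # toy stemmer
unstem = defaultdict(Counter)
for word in ["running", "runs", "run", "running", "runner"]:
    unstem[stem(word)][word] += 1

def unstem_in_doc(stemmed, doc_words):
    # Pick the most popular (corpus-wide) form that actually appears
    # in this document; fall back to the stem itself.
    candidates = [(n, w) for w, n in unstem[stemmed].items() if w in doc_words]
    return max(candidates)[1] if candidates else stemmed
```

Given a document containing "running" and "runs", the corpus-wide counts break the tie in favor of "running".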
This is definitely just the first step. Similar goofs happen with
inappropriate stemming. For instance, AIDS should not stem to aid.
A reasonable way to find and classify exceptional cases is to look at
cooccurrence statistics. The contexts of original forms can be examined to
find cases where
Later diagrams in the classifier section were created using Omnigraffle.
Again, nothing too fancy.
On Fri, Aug 3, 2012 at 2:53 PM, Sean Owen wrote:
> (You can ask in the book forum if it is specific to the book rather than
> the project. Maybe I can follow up with you directly off list.)
>
> Wh
I didn't think that Java supports jars inside jars.
On Sat, Aug 4, 2012 at 5:04 PM, Lance Norskog wrote:
> The Maven build does a grand project unpacking multiple jars into one
> big one. Java apparently supports packing jars inside other jars- the
> outer jar needs a classpath property for the
upId}:${artifact.artifactId}
> org.apache.hadoop:hadoop-core
>
>
>
> true
>
> ${artifact.groupId}:${artifact.artifactId}
>
>
>
>
>
>
> On Sun, Aug 5, 2012 at 1:10 AM, Ted Dunning wrote:
> > I didn't
On Fri, Aug 3, 2012 at 1:08 PM, Dawid Weiss
> > > wrote:
> > >> I know, I know. :) Just wanted to mention that it could lead to funny
> > >> results, that's all. There are lots of way of doing proper form
> > >> disambiguation, includ
The upcoming knn package has a file based matrix implementation that uses
memory mapping to allow sharing a copy of a large matrix between processes and
threads.
Sent from my iPhone
On Aug 9, 2012, at 1:48 AM, Abramov Pavel wrote:
> Hello,
>
> If think Zipf's law is relevant for my data.
/knn
Any help in testing these new capabilities or plumbing them into the
standard Mahout capabilities would be very much appreciated.
On Thu, Aug 9, 2012 at 7:05 AM, Ted Dunning wrote:
> The upcoming knn package has a file based matrix implementation that uses
> memory mapping to allow sha
Recommenders and classifiers are very similar animals in general
except for the training data.
You can view a recommender as an engine that invents a classifier for
each user but it does this by using other user histories as training
data.
This means that there can be a lot of confusion when look
The Watchmaker implementation was not very scalable and there was no
perceptible user demand for it. There was also nobody who was maintaining
it.
So we nuked it.
There is still a limited evolutionary algorithm that is part of the
AdaptiveLogisticRegression. It is likely to be pretty good on pr
'll take a look at the old Watchmaker code and maybe try to improve on it.
> Thanks for the help.
>
> -Jason
>
>
> On Sat, Aug 11, 2012 at 6:20 PM, Ted Dunning
> wrote:
>
> > The Watchmaker implementation was not very scalable and there was no
> > perceptible
2012 at 1:36 AM, Ted Dunning
> wrote:
>
> > That sounds like a continuous optimization problem.
> >
> > Look at the org.apache.mahout.ep.EvolutionaryProcess
> >
> > It is an implementation of recorded step meta-mutation and does quite
> well
> > on many pr
Mattie,
Would this help?
https://github.com/tdunning/knn/blob/master/src/main/java/org/apache/mahout/knn/cluster/BallKmeans.java
and
https://github.com/tdunning/knn/blob/master/docs/scaling-k-means/scaling-k-means.pdf
On Wed, Aug 15, 2012 at 10:45 AM, Whitmore, Mattie wrote:
> Hi!
>
> I have
If your data is dense and numerical, then you don't need anything but
trivial encoding. Just copy the values from your CSV file into the vector,
converting to numbers as you go. If some of your data are categorical or
textual, you will need fancier footwork.
On Thu, Aug 16, 2012 at 3:28 AM, Chan
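For the dense numeric case, the encoding really is just a direct copy; a trivial sketch (the CSV content is invented):

```python
import csv
import io

# Dense numeric CSV rows become vectors by straight conversion; no
# hashing or feature encoding is needed.
raw = io.StringIO("1.5,2.0,3\n4,5.5,6\n")
vectors = [[float(x) for x in row] for row in csv.reader(raw)]
```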
Most algorithms have non-hadoop versions.
On Thu, Aug 16, 2012 at 9:22 AM, Chandra Mohan, Ananda Vel Murugan <
ananda.muru...@honeywell.com> wrote:
> I think Mahout can be used as a library too. Some algorithms are
> implemented in map-reduce fashion and they may need Hadoop, but rewriting
> them
footwork? Should I convert categories into some numbers
> and store in vector? Thanks!!
>
> -Original Message-
> From: Ted Dunning [mailto:ted.dunn...@gmail.com]
> Sent: Thursday, August 16, 2012 8:08 PM
> To: user@mahout.apache.org
> Cc: mahout-u...@apache.org
> Subject
>= clusterClassificationThreshold;
> > }
> >
> > On 17-08-2012 20:06, Whitmore, Mattie wrote:
> >
> >> Hi Ted,
> >>
> >> Yes this is great! I hope to start working with this algorithm in the
> next couple weeks.
> >>
> >&
algorithm from dropping non-distinct vectors/data
> points (which is what I THINK but have yet to verify is what is going on)?
>
> Thanks,
>
> Mattie
>
> -----Original Message-
> From: Ted Dunning [mailto:ted.dunn...@gmail.com]
> Sent: Wednesday, August 22, 2012 1:18 PM
&
Not yet, but it makes a lot of sense to allow an InputProvider from the
guava library in addition to a single file. Not a lot of sense in things
in between.
On Wed, Aug 22, 2012 at 8:55 PM, Ahmed Elgohary wrote:
> Hi,
>
> I was wondering why the constructor of DistributedRowMatrix restricts the
Obviously, you need to refer also to scores of other items as well.
One handy stat is AUC, which you can compute by averaging to get the
probability that a relevant (viewed) item has a higher recommendation score
than a non-relevant (not viewed) item.
On Sun, Aug 26, 2012 at 5:55 PM, Sean Owen wr
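That averaging view of AUC can be sketched directly (the scores below are made up):

```python
# AUC as the probability that a relevant (viewed) item outscores a
# non-relevant (not viewed) one, estimated by averaging over all pairs.
viewed = [0.9, 0.8, 0.4]       # recommendation scores of viewed items
not_viewed = [0.7, 0.3, 0.2]   # scores of not-viewed items

pairs = [(v, n) for v in viewed for n in not_viewed]
auc = sum(1.0 if v > n else 0.5 if v == n else 0.0
          for v, n in pairs) / len(pairs)
```

Here 8 of the 9 pairs rank the viewed item higher, so AUC = 8/9.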
Here is some pretty old work that did the same sort of thing. The self
organizing map (SOM) is an interesting alternative to MDS since it allows
mapping a low dimensional approximate manifold to a linear space. The
basic idea is that it preserves close distances and doesn't much care about
distan
In another forum, I responded to this question this way:
One short answer is that you only need enough test data to drive the
> accuracy of your PR estimates to the point you need them. That isn't all
> that much data so the sequential version should do rather well.
> The gold standard, of course,
These are fairly straightforward to generate from random data.
Not particularly realistic, but highly parametrizable.
RCV1 should be almost in that range. I think that the recent KDD music
classification exercise would be in that range if viewed as a
classification exercise. See
http://jmlr.csa
The single most effective thing you can do with malicious users like this
is to let them think that they have won. In the ideal case, you can detect
simple click frauds and maintain a per user play adjustment so that they
see the fraudulent stats and everybody else sees the corrected stats. If
yo
can solve this case that happened to
> Amazon
> http://news.cnet.com/2100-1023-976435.html
>
> Thanks
>
>
>
>
> On Tue, Aug 28, 2012 at 8:23 PM, Ted Dunning
> wrote:
> > The single most effective thing you can do with malicious users like this
> > is to
It isn't a big deal to increase the Znode size, but it is bad practice. ZK
isn't a file store. It is a coordination server. The size limit is
intended to prevent large operations slowing down other operations. If you
aren't sharing your ZK or your neighbors don't have response time
expectations
distinct (albeit the data point is the same as other points
> in the set) will this keep the algorithm from dropping non-distinct
> vectors/data points (which is what I THINK but have yet to verify is what
> is going on)?
> >>
> >> Thanks,
> >>
> >> Mattie
&
Karl,
I don't think that I understand your request.
What I think I hear is that you want an implementation (with unknown inputs
and outputs) that encodes a Voronoi tessellation using boundary vertices
instead of centroids.
Is that correct?
If so, it is relatively easy to go from centroid form to
, Whitmore, Mattie wrote:
> I need to be using the matrices for BallKmeans. Can matrices be named? By
> this I mean can I assign a column of my matrix to be the "name" of each row?
>
> Thanks!
>
> -Original Message-
> From: Ted Dunning [mailto:ted.dunn...@gmail.co
But columns aren't what I would expect you to want labeled. I think that
row labels might be nicer. Happily, each named vector has a name for the
entire vector as well.
On Thu, Aug 30, 2012 at 2:48 PM, Ted Dunning wrote:
> The input to the BallKmeans is actually not a matrix.
s for the guidance!
>
>
> -Original Message-
> From: Ted Dunning [mailto:ted.dunn...@gmail.com]
> Sent: Thursday, August 30, 2012 2:52 PM
> To: user@mahout.apache.org
> Subject: Re: Mahout-279/kmeans++
>
> But columns aren't what I would expect you to want labeled.
No. The algorithm works either way. The algorithm doesn't need the full
capabilities of a matrix since it just makes a few sequential passes
through the data.
On Thu, Aug 30, 2012 at 3:25 PM, Whitmore, Mattie wrote:
> Would the algorithm implement better as if given a matrix? I'm thinking of
>
Yes. Essentially this means construct the Voronoi tessellation for all
points and for each post code, use the union of the regions for each point
in that post code. You will not necessarily have convex hulls for each
post-code, but you will have hulls and will almost certainly have a single
hull f
First, this is a tiny training set. You are well outside the intended
application range so you are likely to find less experience in the
community in that range. That said, the algorithm should still produce
reasonably stable results.
Here are a few questions:
a) which class are you using to tr
longs to.
>
> b) I am passing through the data once (at least this is what I think). I
> followed the 20newsgroup example code (in Java) and didn't find that the data
> was passed more than once.
> Yes I randomize the order every time.
>
> a) I am using AdaptiveLearningRegression (j
>
> And randomize the order each time?
>
> On Fri, Aug 31, 2012 at 9:04 AM, Salman Mahmood
> wrote:
> > Cheers ted. Appreciate the input!
> >
> > Sent from my iPhone
> >
> > On 31 Aug 2012, at 17:53, Ted Dunning wrote:
> >
> >> OK.
> &g
] http://en.wikipedia.org/wiki/Bootstrapping_(statistics)
On Fri, Aug 31, 2012 at 11:24 PM, Ted Dunning wrote:
> That would be best, but practically speaking, randomizing once is usually
> OK. With a tiny data set like this that is in memory anyway, I wouldn't
> take any chances.
>
&
Can you provide your test code?
What difference did you observe?
Did you account for the fact that your matrix is small enough that it
probably wasn't divided correctly?
On Sat, Sep 1, 2012 at 1:27 AM, Ahmed Elgohary wrote:
> Hi,
>
> I used mahout's stochastic svd implementation to find the si
t singular vectors. The only thing is that they
> > seem to change the sign between R and Mahout's version but otherwise
> > they fit more or less exactly.
> >
> > So yeah I am seeing some stochastic effects in these for k and p being
> > so low -- so are you saying your errors are greater than those? I did
> > not test sequential version with similar paramet
With 57 crawled docs, you can't reasonably set p > 57. That is your second
error.
On Sat, Sep 1, 2012 at 10:32 AM, Pat Ferrel wrote:
> I have a small data set that I am using in local mode for debugging
> purposes. The data is 57 crawled docs with something like 2200 terms. I run
> this through
>
> On Sep 1, 2012, at 7:53 AM, Ted Dunning wrote:
>
> With 57 crawled docs, you can't reasonably set p > 57. That is your second
> error.
>
> On Sat, Sep 1, 2012 at 10:32 AM, Pat Ferrel wrote:
>
> > I have a small data set that I am using in local mode for
On Sun, Sep 2, 2012 at 12:26 AM, Ahmed Elgohary wrote:
> - I am using k = 30 and p = 2 so (k+p)<99 (Rank(A))
> - I am attaching the csv file of the matrix A
>
Brilliant. And the attachment actually made it through.
> - yes, the difference is significant. Here is the output of the sequential
>
Did Ahmed even use a power iteration?
On Sun, Sep 2, 2012 at 1:35 AM, Dmitriy Lyubimov wrote:
> but there is still a concern in a sense that power iterations
> should've helped more than they did. I'll take a closer look but it
> will take me a while to figure if there's something we can improve
spectrum. Flat spectrum just means you don't have
> > trends in those directions, i.e. essentially a random noise. If you
> > have random noise, direction of that noise is usually of little
> > interest, but because spectrum (i.e. singular values) is measured
> > b
A quick t-test on these differences gives the same result: no
significant difference.
On Mon, Sep 3, 2012 at 11:34 PM, Dmitriy Lyubimov wrote:
> Then i subtracted error means between two methods (+ sign means
> smaller error for MR version, -sign means smaller error for R
> sequential versio
The model size is very simple. If you have k categories and m features,
the model size will be (k-1) x m x s1 + m * s2 + s3 where s1 is roughly 8
bytes and s2 is about 4 bytes and s3 is probably around 100 bytes. These
are approximate numbers and could be off by 2 if I forgot something. The
firs
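As a worked example of the formula above (using the approximate byte counts as stated; the k and m values are invented):

```python
# Model size: (k-1)*m*s1 + m*s2 + s3 with s1 ~ 8 bytes, s2 ~ 4 bytes,
# s3 ~ 100 bytes. These are the rough figures from the thread, so the
# result is an estimate, not an exact measurement.
def model_bytes(k, m, s1=8, s2=4, s3=100):
    return (k - 1) * m * s1 + m * s2 + s3

# e.g. 20 categories and 10,000 features:
size = model_bytes(k=20, m=10_000)
```

That works out to about 1.5 MB for the example above.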
Yes. (A-M)V is U \Sigma. You may actually want something like U \sqrt
\Sigma instead, though.
On Wed, Sep 5, 2012 at 4:10 PM, Dmitriy Lyubimov wrote:
> Hello,
>
> I have a question w.r.t what to advise people in the SSVD manual for PCA.
>
> So we have
>
> (A-M) \approx U \Sigma V^t
>
> and st
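The relation (A-M)V = U \Sigma is easy to verify numerically; a NumPy sketch, assuming M is the matrix of column means as in PCA centering:

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((8, 5))
M = A.mean(axis=0)          # column means (PCA centering assumption)

U, s, Vt = np.linalg.svd(A - M, full_matrices=False)

# (A - M) V = U Sigma: projecting the centered data onto V recovers
# the PCA scores U Sigma.
scores = (A - M) @ Vt.T
```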
This sounds pretty exciting. Beyond that, it is hard to say much.
Can you say a bit more about how you would see introducing the code into
Mahout?
On Thu, Sep 6, 2012 at 9:14 AM, Gokhan Capan wrote:
> By the way, I want to mention that my thesis is advised by Ozgur Yilmazel,
> who is a foundin
Try transforming them as well, likely with a log if they are positive and
have heavily skewed values.
Can you suck the data into R and paste in the results of summary(x)?
(assuming you put the data into the variable x). This should look
something like:
> summary(x)
>v1 v2
t above are welcome...to help me
> validate my thought process.
>
> Thanks for the hints, I will let you know how it turns out.
>
> Mike
>
> On Thu, Sep 6, 2012 at 8:14 PM, Ted Dunning wrote:
> >
> > Try transforming them as well, likely with a log if they are positi
You are using lots of threads but the sparse matrix structure is not thread
safe. Setting a value in the SparseMatrix causes mutation to internal data
structures.
If you can have each thread do all the updates for a single row, that
would be much better. Another option is to synchronize on th
Great.
If the update has a huge impact on existing code, can you break it into
manageable pieces?
If it is just an addition, having a big blob of stuff is probably fine.
On Sun, Sep 9, 2012 at 7:01 AM, Gokhan Capan wrote:
> On Fri, Sep 7, 2012 at 12:48 AM, Ted Dunning
> wrote:
>
Multi-threading at the cell level will not likely help.
Multi-threading at the row level might help.
I would recommend that you use a threaded pool executor and feed the rows
into the pool. You won't need locks this way and you will maximize your
use of your cores.
The basic code would look rou
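A rough illustration of that row-level pattern, in Python rather than Java (not the actual code referred to above): each task owns one whole row, so no locks are needed on the shared structure.

```python
from concurrent.futures import ThreadPoolExecutor

# Shared dense matrix; safe because exactly one task writes each row.
matrix = [[0.0] * 4 for _ in range(8)]

def update_row(i):
    row = matrix[i]              # only this task touches row i
    for j in range(len(row)):
        row[j] = i + j
    return i

# Feed the rows into a thread pool; no synchronization required.
with ThreadPoolExecutor(max_workers=4) as pool:
    done = list(pool.map(update_row, range(len(matrix))))
```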
Yes.
I have been working (slowly) on moving some very fast single pass
clustering into Mahout. My work in progress currently does very fast
clustering of small dense vectors and it should scale to sparse vectors
fairly well with some small changes.
See https://github.com/tdunning/knn for more in
Also, with 500MB of data, this is likely to only take a few minutes on a
single machine with the new clustering stuff. It is hard to estimate
precisely, however, due to the difference between dense and sparse cases.
On Wed, Sep 12, 2012 at 8:42 PM, Pat Ferrel wrote:
> 200 iterations?
>
> What i
Yes. It is a grave embarrassment to us, but not a functional requirement.
On Thu, Sep 13, 2012 at 6:42 AM, I-Scarlatti, David <
david.scarla...@boeing.com> wrote:
> Ok. So tests are just tests... not needed for having mahout running
>
> Thanks!
>
>
> -Original Message-
> From: Parito
Hi Ted,
> >>
> >> Sorry to bother you again.
> >>
> >> One quick question: Does Mahout support SVM, what is the Java class
> name ?
> >> Any inputs on its stability and performance ?
> >>
> >>
> >> Thanks
> >> Ra
And if you want the reduced rank representation of A, you have it already
with
A_k = U_k S_k V_k'
Assume that A is n x m in size. This means that U_k is n x k and V_k is m
x k
The rank reduced projection of an n x 1 column vector is
u_k = U_k U_k' u
Beware that v_k is probably not spa
pends on V?
>
> On Sun, Sep 16, 2012 at 5:33 PM, Ted Dunning
> wrote:
> > And if you want the reduced rank representation of A, you have it already
> > with
> >
> > A_k = U_k S_k V_k'
> >
> > Assume that A is n x m in size. This means that U_
. (Try to figure out Figure
> 1.) And it proceeds in its analysis by basically saying that the
> projection is Uk' times the new vector, so, I never understood this
> expression.
>
> On Sun, Sep 16, 2012 at 7:13 PM, Ted Dunning
> wrote:
> > A is in there implicitly.
> &
u_/A
If you shove u through U_k U_k' you get this:
U_k U_k' u = U_k U_k' (u_A + u_/A) = U_k U_k' (u_A) + 0 = u_A
This is another way of showing that U_k U_k' projects a vector into span A.
On Sun, Sep 16, 2012 at 12:55 PM, Ted Dunning wrote:
> U_k ' U_k =
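The projection argument above can be checked numerically (a NumPy sketch, with k taken equal to rank(A)):

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((6, 3))
U_k, s, Vt = np.linalg.svd(A, full_matrices=False)  # here k = rank(A) = 3

u = rng.standard_normal(6)
u_A = U_k @ (U_k.T @ u)   # component of u in span(A)
u_perp = u - u_A          # component orthogonal to span(A)

# U_k U_k' keeps the in-span part and kills the orthogonal part,
# while U_k' U_k = I because the columns of U_k are orthonormal.
```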
are talking about expressing things in terms of the
latent variables.
> On Sun, Sep 16, 2012 at 8:55 PM, Ted Dunning
> wrote:
> > U_k ' U_k = I
> >
> > U_k U_k ' != I
>
't even think that your claim that decreasing k increases recall is
correct.
> On Sun, Sep 16, 2012 at 4:11 PM, Ted Dunning
> wrote:
> > On Sun, Sep 16, 2012 at 1:49 PM, Sean Owen wrote:
> >
> >> Oh right. It's the columns that are orthogonal. Cancel that.
>
If a classifier is presented text with no words in common with the training
data, it will give you back the most common category in the training data.
That said, it is likely to be quite rare when a new document consists
*entirely* of new words. Any overlap with trained vocabulary is likely to
ov
PM, Lance Norskog wrote:
> Shouldn't this be 'unclassified'? I think I have seen data in the
> unclassified buckets with both Bayes and SGD.
>
> ----- Original Message -
> | From: "Ted Dunning"
> | To: user@mahout.apache.org
> | Sent: Wednesday, Se
On the other hand, the only way that I have been able to do a major version
upgrade of Hadoop is to start a new company.
It is really hard to change code and platform at the same time. If you
don't have enough hardware to have two clusters temporarily, things will be
really hard moving off of 0.1
This changes the initial learning rate. Changing this can definitely
change convergence properties.
On Fri, Sep 21, 2012 at 9:33 AM, Watson Watson wrote:
> Hi,
> My question is why changing the rate parameter we always change the
> coefficients (results of RunLogistic)?
>
> I encounter the enig
I think that there is an excessive stability issue, actually.
What seems to happen is that the adaptive part locks down the learning rate
too quickly.
This is related to several other issues:
- the cross fold learning paradigm is kind of dangerous since it depends on
the user not having duplicat
Combiners can be called zero or more times. That can happen on the map
side or on the reduce side.
On Thu, Sep 27, 2012 at 4:56 AM, Sigurd Spieckermann <
sigurd.spieckerm...@gmail.com> wrote:
> @Jake: Could you please elaborate on how exactly the combiner can be called
> before the reducer gets
Other experiments have shown that 60-80% of perception of music "likes" is
due to social factors.
Factoring this out may or may not be a good thing. My feeling is that if
you are trying to make people happy with what you recommend then you need
to go with whatever makes them happy whether it is i
Johannes,
Funny you should mention matrix factorization and k-means at the same
moment. I am talking this afternoon in Oxford about just this topic.
Yes, you can use the proximity to near clusters as a useful modeling
feature, but as Sean said, the cost of matrix factorization should not be
the
On Fri, Oct 5, 2012 at 4:57 PM, Johannes Schulte wrote:
> Hi Ted,
>
> thanks for the hints. I am however wondering what the reverse projection
> would be needed for. Do you mean for explaining stuff only? Or validating a
> model manually?
>
Or for converting recommendations back to items.
> Al
e a more sparse feature
> vector or pre clustering. It probably depends :)
>
> Thanks for the feedback Ted!
>
> I will continue my quest how to construct a ctr prediction for a
> recommendation delivery. Maybe I should have pointed that goal out before.
>
> On Fri, Oct 5,
See this page: http://leon.bottou.org/research/stochastic
Google is your friend.
This API is, however, not particularly friendly. Therefore, you will have
to read about the basics and be able to figure these things out from first
principles. There is some documentation in the code. You can al
Sgd is more suitable for large data. I will take a look later today.
Sent from my iPhone
On Oct 9, 2012, at 11:29 PM, Rajesh Nikam wrote:
> Hi Ted,
>
> Putting specific question with data for getting problem with SGD.
>
> I am using Iris Plants Database from Michael Marshall. PFA iris.arff
This might work, but the messages indicate that the environment is
seriously messed up. Just getting the code isn't going to help. The tests
are indicating that there is a real problem (and it isn't likely Mahout).
That problem needs fixing and once fixed running the tests isn't a bad
thing.
On
ks
> Rajesh
>
>
> On Wed, Oct 10, 2012 at 8:08 PM, Ted Dunning
> wrote:
>
> > Sgd is more suitable for large data. I will take a look later today.
> >
> > Sent from my iPhone
> >
> > On Oct 9, 2012, at 11:29 PM, Rajesh Nikam wrote:
> >
You have to tokenize your text and then use some form of vector encoding.
If you have a known dictionary of all interesting words, you can simply
make a vector as long as the number of words in your dictionary and put a 1
in the right place.
If you don't want to do that either because you don't k
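A minimal sketch of the known-dictionary case (the dictionary contents are invented):

```python
# One slot per dictionary word; put a 1 at each present word's index.
dictionary = {"apache": 0, "mahout": 1, "vector": 2, "hadoop": 3}

def encode(tokens):
    v = [0] * len(dictionary)
    for t in tokens:
        if t in dictionary:
            v[dictionary[t]] = 1
    return v
```

Words outside the dictionary are simply dropped, which is why this only works when the vocabulary is known up front.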
Not sure just offhand. Need to look in more detail in a debugger. Need
to find time to do that.
On Thu, Oct 11, 2012 at 1:58 AM, Rajesh Nikam wrote:
> what could be the problem with data formatting ?
> Could you please update on the same.
>
> On Thu, Oct 11, 2012 at 11:31 AM,
I would love to help and will before long. Just can't do it in the first
part of this week.
On Mon, Oct 15, 2012 at 6:28 AM, Rajesh Nikam wrote:
> Hello,
>
> I have asked below question on issue with using sgd on mahout forum.
>
> Similar issue with sgd is reported by
>
> http://stackoverflow.c
ion: [[*26563.0, 23006.0*], [0.0, 0.0]]
> entropy: [[-0.0, -0.0], [-46.1, -21.4]]
>
> I am not sure why this is failing all the time.
>
> Looking forward for your reply.
>
> Thanks
> Rajesh
>
>
>
> On Tue, Oct 16, 2012 at 3:57 AM, Ted Dunning
> wrote:
>
>
Computing the svd with the stochastic projection is your best bet.
Sent from my iPhone
On Oct 17, 2012, at 10:42 PM, Ranjith Uthaman
wrote:
> Hi,
>
> Does map reduce implementation of Pseudo-Inverse of a matrix exist in the
> current Mahout framework? What are the various ways to achieve it
If we have descended to personal advertising, then I should mention that I
am speaking as well.
http://strataconf.com/stratany2012/public/schedule/speaker/126559
I will also have office hours afterwards during which the topic is
unlimited. Drop by!
On Sun, Oct 21, 2012 at 11:20 AM, Josh Patters