[
https://issues.apache.org/jira/browse/MAHOUT-369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12860690#action_12860690
]
Jake Mannix commented on MAHOUT-369:
Danny, thanks for looking into this so carefully
[
https://issues.apache.org/jira/browse/MAHOUT-369?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Jake Mannix reassigned MAHOUT-369:
--
Assignee: Jake Mannix
Issues with DistributedLanczosSolver output
On Mon, Apr 19, 2010 at 9:13 AM, Sean Owen sro...@gmail.com wrote:
More on Vector, as I'm browsing through it:
AbstractVector.minus(Vector) says:
//snip
The stanza after the instanceof checks can just become the body of an
overriding method in these two subclasses right?
Yep, sure.
, Sean Owen sro...@gmail.com wrote:
On Mon, Apr 19, 2010 at 5:33 PM, Jake Mannix jake.man...@gmail.com
wrote:
result.times(-1.0)
with
result.assign(Functions.negate)
Cool, good one.
The efficiency points are twofold: number of nonzero elements, and
the impl: you don't want to iterate
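
The swap discussed in this snippet (result.times(-1.0) versus result.assign(Functions.negate)) can be sketched without Mahout on the classpath. The class below is a hypothetical stand-in built on plain arrays, not Mahout's Vector API. It assumes that a times-style call allocates a new vector while an assign-style call mutates the receiver in place, which appears to be one of the efficiency points raised here.

```java
import java.util.Arrays;
import java.util.function.DoubleUnaryOperator;

// Hypothetical stand-in for illustration only; these are not Mahout classes.
final class NegateSketch {

    // Stand-in for a negate function object.
    static final DoubleUnaryOperator NEGATE = x -> -x;

    // Copying style: allocates and returns a new array, leaving v untouched.
    static double[] times(double[] v, double scalar) {
        double[] result = new double[v.length];
        for (int i = 0; i < v.length; i++) {
            result[i] = v[i] * scalar;
        }
        return result;
    }

    // In-place style: applies f to every element of v and returns v itself.
    static double[] assign(double[] v, DoubleUnaryOperator f) {
        for (int i = 0; i < v.length; i++) {
            v[i] = f.applyAsDouble(v[i]);
        }
        return v;
    }

    public static void main(String[] args) {
        double[] v = {1.0, -2.0, 3.5};
        double[] copy = times(v, -1.0);      // second array allocated
        double[] same = assign(v, NEGATE);   // no allocation, v is mutated
        System.out.println(Arrays.toString(copy));
        System.out.println(same == v);
    }
}
```

Both calls produce the same values; the difference in this sketch is only whether a second array is allocated.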
[
https://issues.apache.org/jira/browse/MAHOUT-364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12858659#action_12858659
]
Jake Mannix commented on MAHOUT-364:
Moving this discussion over to MAHOUT-383
Which one is this? Wrapping Vector impls into a
NamedVector/LabeledVector,
or seeing if we even need the label *inside* of the Vector itself, and
instead
just having those live in the key part of the key-value pair in hadoop,
like
DistributedRowMatrix has it?
-jake
On Sun, Apr 18, 2010 at
:41 PM, Jake Mannix jake.man...@gmail.com
wrote:
Which one is this? Wrapping Vector impls into a
NamedVector/LabeledVector,
or seeing if we even need the label *inside* of the Vector itself, and
instead
just having those live in the key part of the key-value pair in hadoop,
like
.
Am I convincing?
On Sun, Apr 18, 2010 at 6:45 PM, Jake Mannix jake.man...@gmail.com wrote:
Ok this is a good con...
What would be the Writable hierarchy with this NamedVector proposal?
On Apr 18, 2010 11:05 AM, Sean Owen sro...@gmail.com wrote: On
keeping 'name': sure, I ...
On Sun, Apr 18, 2010 at 6:45 PM, Jake Mannix jake.man...@gmail.com wrote:
Ok this is a good con...
this a decorator pattern rather than subclass.
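
A minimal sketch of the decorator idea mentioned here: the name lives in a wrapper that delegates all numeric access to the wrapped vector, so the math never sees the label. Vec, DenseVec, and NamedVec are hypothetical names used only for illustration and are not Mahout classes; how Writable serialization would fit is not addressed.

```java
// Hypothetical types for illustration only; these are not Mahout classes.
final class NamedVecDemo {

    interface Vec {
        int size();
        double get(int i);
    }

    // Plain mathematical vector with no notion of a name.
    static final class DenseVec implements Vec {
        private final double[] values;
        DenseVec(double... values) { this.values = values.clone(); }
        public int size() { return values.length; }
        public double get(int i) { return values[i]; }
    }

    // Decorator: adds a name and forwards every numeric call to the delegate.
    static final class NamedVec implements Vec {
        private final Vec delegate;
        private final String name;
        NamedVec(Vec delegate, String name) {
            this.delegate = delegate;
            this.name = name;
        }
        String getName() { return name; }
        Vec getDelegate() { return delegate; }
        public int size() { return delegate.size(); }
        public double get(int i) { return delegate.get(i); }
    }

    // Math code is written against Vec and is unaware of names.
    static double dot(Vec a, Vec b) {
        if (a.size() != b.size()) {
            throw new IllegalArgumentException("size mismatch");
        }
        double sum = 0.0;
        for (int i = 0; i < a.size(); i++) {
            sum += a.get(i) * b.get(i);
        }
        return sum;
    }

    public static void main(String[] args) {
        Vec plain = new DenseVec(1.0, 2.0);
        NamedVec named = new NamedVec(plain, "doc-1");
        System.out.println(named.getName() + " " + dot(named, plain));
    }
}
```

Because the wrapper only forwards calls, any existing vector implementation could be named without a subclass per implementation, which is the usual argument for a decorator over inheritance.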
On Sun, Apr 18, 2010 at 7:26 PM, Jake Mannix jake.man...@gmail.com wrote:
What would be the Wri...
putting the name into the vector and
accepting
whatever strange semantics that result (missing == instead of null,
for
instance) more attractive as a temporary measure.
On Sun, Apr 18, 2010 at 11:44 AM, Jake Mannix jake.man...@gmail.com
wrote:
It's not just that it is complicated
On Sat, Apr 17, 2010 at 2:14 PM, Robin Anil robin.a...@gmail.com wrote:
For this bug, let's put the id back in and remove it from the
comparator/equals. Let's focus on getting the document structure correct
You mean put the 'name' back in?
Since Sean has done the initial work of possibly
clustering algorithm so that the number of clusters does
not need to be specified in advance. Has anyone done anything like
this in Mahout yet? Also, I'd be happy to contribute the code to
Mahout if anyone is interested.
Thanks,
Anthony
On Fri, Apr 16, 2010 at 9:50 AM, Jake Mannix jman
So here's my take: once we're a TLP (next month sometime?), it is
a good time to start allowing subprojects or submodules which are
scripting layers on top of Mahout - whether they are PigLatin, or
Cascalog, JRuby, or Clojure. If it's JVM-based, especially, having
code/scripts which are drivers
On Fri, Apr 16, 2010 at 11:31 AM, Robin Anil robin.a...@gmail.com wrote:
Hmm... this was a bit scattered of a response, but I'm really loath
to turn away a) nice hooks between Solr and Mahout, b) scripting-style
wrappers which could expand our community, and c) simply new
On Fri, Apr 16, 2010 at 11:26 AM, Grant Ingersoll gsing...@apache.org wrote:
On Apr 16, 2010, at 2:21 PM, Jake Mannix wrote:
So here's my take: once we're a TLP (next month sometime?), it is
a good time to start allowing subprojects or submodules which are
Submodules, yes, subprojects
On Fri, Apr 16, 2010 at 11:56 AM, Sean Owen sro...@gmail.com wrote:
On Fri, Apr 16, 2010 at 7:39 PM, Jake Mannix jake.man...@gmail.com
wrote:
I will start playing around with Anthony's github-based stuff, and
see where a patch can be made. The question is where it would
go? It's a fully
Hey Sean,
On Thu, Apr 15, 2010 at 7:16 AM, Sean Owen sro...@gmail.com wrote:
Along the way to a patch for MAHOUT-379, I'm having some trouble
figuring out SequentialAccessSparseVector.DenseVector. I think it can
be simplified, but unless I'm misunderstanding there are several bugs
here. I'd
Ok, back on list with this then (Thanks Danny for reminding us to deal with
this perennial issue we have!)
On Wed, Apr 14, 2010 at 2:26 AM, Sean Owen (JIRA) j...@apache.org wrote:
Yeah let's take some time to get this right. At the moment I see four
notions of equivalence in Vector (which is
+1
-jake
On Apr 14, 2010 3:20 PM, Jeff Eastman j...@windwardsolutions.com wrote:
Ted Dunning wrote: On Wed, Apr 14, 2010 at 12:53 PM, Sean Owen
sro...@gmail.com wrote: ...
+1 from the creator thereof, even. Especially since they never got used.
[
https://issues.apache.org/jira/browse/MAHOUT-364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12856711#action_12856711
]
Jake Mannix commented on MAHOUT-364:
Zoran,
Any form of BSD-style license _is_
From what Grant said last time we talked about this, we need to wait
until the next Apache directors meeting (or whatever it's called) before
we move forward with that, I thought.
-jake
On Mon, Apr 12, 2010 at 2:43 PM, Robin Anil robin.a...@gmail.com wrote:
Hi everyone,
I am
[
https://issues.apache.org/jira/browse/MAHOUT-364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12855742#action_12855742
]
Jake Mannix commented on MAHOUT-364:
Hi Zoran,
Neuroph looks very interesting
I haven't had a chance to read your attached pdf, but I *have* had a chance
to code up an impl of this jira. Patch coming soon.
On Apr 11, 2010 6:50 AM, Ted Dunning (JIRA) j...@apache.org wrote:
[
[
https://issues.apache.org/jira/browse/MAHOUT-363?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12854950#action_12854950
]
Jake Mannix commented on MAHOUT-363:
If possible, Shannon, if you could simply add
[
https://issues.apache.org/jira/browse/MAHOUT-369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12855014#action_12855014
]
Jake Mannix commented on MAHOUT-369:
Hold on that Sean, I made the loop like
I agree in principle, but having a whole different set of versionings seems
kinda... messy? If m-collections goes 1.0, and then 1.1, and then m-math
goes 1.0, and core goes to 0.5, we have a whole pile of different version
numbers to keep track of.
Didn't Lucene and Solr just intentionally do
[
https://issues.apache.org/jira/browse/MAHOUT-364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12854304#action_12854304
]
Jake Mannix commented on MAHOUT-364:
I've got to say, this is a fantastically well
Hi Richard,
A few notes about what would be required to get a nice distributed SVD
recommender in Mahout: if you look at the current distributed recommenders
(in org.apache.mahout.cf.taste.hadoop package and children), you can see
how it works: using HDFS-backed data, a batch of
[
https://issues.apache.org/jira/browse/MAHOUT-363?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12853308#action_12853308
]
Jake Mannix commented on MAHOUT-363:
... and actually, there is no need for Hama
Umm, I actually depend pretty heavily on the logging in the SVD solvers.
They are very long-running processes, and give off a ton of useful
information about what the heck is going on.
Reducing dependencies is great, but logging? I think the math stuff could
really use logging. I haven't been
thought but for me collections and Math are just tools to aid
complex algorithms in Mahout core. Maybe we can move it under core and
adding the required logging.
Robin
On Mon, Apr 5, 2010 at 11:03 AM, Jake Mannix jake.man...@gmail.com
wrote:
Umm, I actually depend pretty heavily
thanks.
On Sun, Apr 4, 2010 at 10:40 PM, Sean Owen sro...@gmail.com wrote:
Oh OK I'll revert the change then, didn't know you wanted that. Some
of the other statements could probably go but not worth digging
through it.
On Mon, Apr 5, 2010 at 6:33 AM, Jake Mannix jake.man...@gmail.com wrote
[
https://issues.apache.org/jira/browse/MAHOUT-350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12852477#action_12852477
]
Jake Mannix commented on MAHOUT-350:
bq. I suppose I hadn't wanted to be presumptuous
Awesome, thanks guys.
Doesn't Maven do this kind of thing for us, if we tell it to?
(ie can't we also have daily updates of the 0.4-SNAPSHOT javadocs
automagically posted up there too?)
-jake
On Tue, Mar 30, 2010 at 6:28 AM, Sean Owen sro...@gmail.com wrote:
Done, they're all up under
[
https://issues.apache.org/jira/browse/MAHOUT-350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12851015#action_12851015
]
Jake Mannix commented on MAHOUT-350:
Don't the jobs which implement Tool allow
Hey gang,
Where are the 0.3 javadocs on the web? All I can find right now are the
0.1's http://lucene.apache.org/mahout/javadoc/core/index.html.
-jake
+1
-jake
Hi Pallavi,
I personally agree that keeping the name as part of the mathematical
vector is wrong, because it leads to not only the issues you've brought up,
but also means we still have these *4* different ways of saying that two
vectors are the same: ==, equals(), equivalent(), and
[
https://issues.apache.org/jira/browse/MAHOUT-337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12845434#action_12845434
]
Jake Mannix commented on MAHOUT-337:
So a question about this: do we really want to do
+1 from over here.
On Mon, Mar 15, 2010 at 11:36 AM, Drew Farris drew.far...@gmail.com wrote:
+1 as well.
On Mon, Mar 15, 2010 at 2:34 PM, Ted Dunning ted.dunn...@gmail.com
wrote:
Dang. I can only second second it now.
On Mon, Mar 15, 2010 at 11:28 AM, Robin Anil robin.a...@gmail.com
[
https://issues.apache.org/jira/browse/MAHOUT-337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12845495#action_12845495
]
Jake Mannix commented on MAHOUT-337:
[quote]
Yes it's possible to fix this by forcing
[
https://issues.apache.org/jira/browse/MAHOUT-322?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Jake Mannix updated MAHOUT-322:
---
Fix Version/s: (was: 0.3)
pulling this out of the track for 0.3
DistributedRowMatrix should
[
https://issues.apache.org/jira/browse/MAHOUT-315?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Jake Mannix resolved MAHOUT-315.
Resolution: Fixed
Fix Version/s: (was: 0.4)
0.3
Committed
On Thu, Mar 4, 2010 at 7:41 AM, Robin Anil robin.a...@gmail.com wrote:
Based on what I have in mind, the usage will just be
mahout vectorize -i s3://input -o s3://output -tmp hdfs://file (here, there
is a risk of fixing an exact path and not knowing the hadoop user, I would
have preferred a
Hi Mike,
Welcome to the long journey down the road of dimensional reduction. :)
On Fri, Mar 5, 2010 at 5:05 PM, mike bowles m...@mbowles.com wrote:
Really large matrices require using one of the randomizing methods to get
done.
Require is a strong term. Really really large (but still
[
https://issues.apache.org/jira/browse/MAHOUT-310?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Jake Mannix resolved MAHOUT-310.
Resolution: Fixed
Fix Version/s: 0.3
committed
LanczosSolver and DistributedLanczosSolver
[
https://issues.apache.org/jira/browse/MAHOUT-313?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Jake Mannix resolved MAHOUT-313.
Resolution: Fixed
Fix Version/s: 0.3
Committed, code piggybacks on timesSquared
[
https://issues.apache.org/jira/browse/MAHOUT-314?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Jake Mannix resolved MAHOUT-314.
Resolution: Fixed
Fix Version/s: 0.3
Committed.
Current implementation is a map-side
[
https://issues.apache.org/jira/browse/MAHOUT-322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12842213#action_12842213
]
Jake Mannix commented on MAHOUT-322:
It should actually be noted that Danny's original
[
https://issues.apache.org/jira/browse/MAHOUT-322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12841382#action_12841382
]
Jake Mannix commented on MAHOUT-322:
Meaning what, Robin?
We can certainly come up
On Thu, Mar 4, 2010 at 8:54 AM, Ted Dunning ted.dunn...@gmail.com wrote:
I haven't examined the out-of-core scenarios at all, but in-memory, it is
possible to have labels with no performance cost if you add the
constraint that labeled matrices are only conformable if they share the
Adding a skipZero() method to all the functions is probably better here,
because that will be faster than an instanceof check, and easier to
document than other interfaces.
On Tue, Mar 2, 2010 at 1:22 AM, Sean Owen sro...@gmail.com wrote:
How about merely a flag/method on BinaryFunction /
On Tue, Mar 2, 2010 at 5:21 AM, Sean Owen sro...@gmail.com wrote:
I'll have a look there. May be worth piling in one more little thing
like this in the 'code freeze'.
Incidentally Hadoop announced version 0.20.2 a few days ago -- still
looking for it on Maven but I will be starting up our
[
https://issues.apache.org/jira/browse/MAHOUT-301?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Jake Mannix resolved MAHOUT-301.
Resolution: Fixed
Checked in a version of this which works, not sure if it had the most updated
[
https://issues.apache.org/jira/browse/MAHOUT-311?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Jake Mannix updated MAHOUT-311:
---
Resolution: Fixed
Status: Resolved (was: Patch Available)
committed
Update assemblies
Hey all,
Just an update on the new-and-improved command-line UI we have now.
After a ton of iterations back and forth with Drew (thanks!), MAHOUT-301
has been committed, and brings with it the easy ability to trim down your
long long command lines for most of our *Driver main() methods, by
[
https://issues.apache.org/jira/browse/MAHOUT-310?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Jake Mannix updated MAHOUT-310:
---
Attachment: MAHOUT-lots.diff
I hope we get this release out soon, I've got a giant pile of code
. It
does this I think, though it has not yet been wired into
ClusterDumper.printClusters. I wanted to give the ClusterDumper users a
chance to critique my formatting but it is like the below.
Jeff
Jake Mannix (JIRA) wrote:
VectorDumper should also do printing to simple {index
I thought you were doing the secondary sort idea? That's certainly the
way to make sure you need nothing significant kept in memory, and this
clearly won't scale without that optimization...
I'd say this should get fixed before we release 0.3
-jake
On Sun, Feb 28, 2010 at 7:30 AM, Drew
Project: Mahout
Issue Type: Bug
Affects Versions: 0.3
Reporter: Jake Mannix
Assignee: Jake Mannix
DistributedRowMatrixIterator does not properly handle file glob paths of the
various part-0 files.
--
This message is automatically generated by JIRA.
-
You can
Feature
Affects Versions: 0.3
Reporter: Jake Mannix
Assignee: Jake Mannix
pretty self-explanatory.
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
/jira/browse/MAHOUT-314
Project: Mahout
Issue Type: New Feature
Affects Versions: 0.3
Reporter: Jake Mannix
Assignee: Jake Mannix
If the matrix which is being multiplied by has been transformed into a
column-sparse matrix backed
URL: https://issues.apache.org/jira/browse/MAHOUT-315
Project: Mahout
Issue Type: Improvement
Affects Versions: 0.2
Reporter: Jake Mannix
Assignee: Jake Mannix
Fix For: 0.4
I've got a patch for this, tied up in other code
was
Key: MAHOUT-316
URL: https://issues.apache.org/jira/browse/MAHOUT-316
Project: Mahout
Issue Type: Improvement
Components: Math
Affects Versions: 0.2
Reporter: Jake Mannix
Fix For: 0.4
CardinalityException already has
What's the final size of the vectorized output?
-jake
On Feb 28, 2010 6:47 PM, Robin Anil robin.a...@gmail.com wrote:
Finally some good news: tried with Cloudera 4-node c1.medium on 6 GB
compressed (26 GB uncompressed Wikipedia)
org.apache.mahout.text.SparseVectorsFromSequenceFiles -i wikipedia/
Hey Robin,
Couple questions: what are the contents of this sequence file? Is this
the
output of the SparseVectorsFromSequenceFiles? Do you know the number
of key-value pairs, and the cardinality of the rows? Or is this just the
Text,Text raw contents sequence files?
Also - how do we get
Hey Robin, that http url gives me a permission denied response... I'm not
too S3 savvy, not sure if I'm checking on it right...
On Sat, Feb 27, 2010 at 12:40 PM, Robin Anil robin.a...@gmail.com wrote:
It's uploaded here and it's public. I will monitor usage and see if my
credits
don't get run
the url you tried
On Sun, Feb 28, 2010 at 2:59 AM, Jake Mannix jake.man...@gmail.com
wrote:
Hey Robin, that http url gives me a permission denied response... I'm not
too S3 savvy, not sure if I'm checking on it right...
On Sat, Feb 27, 2010 at 12:40 PM, Robin Anil robin.a...@gmail.com
wrote
On Sun, Feb 28, 2010 at 3:04 AM, Jake Mannix jake.man...@gmail.com
wrote:
Er, the one you posted!
http://mahout-wikipedia.s3.amazonaws.com/wikipedia-jan-2010-seqfile-deflate-chunk-[0-5]
15GB of tokenized documents, not bad, not bad. We're not going
to get a multi-billion entry matrix out of this though, are we?
-jake
On Sat, Feb 27, 2010 at 2:06 PM, Robin Anil robin.a...@gmail.com wrote:
Update:
In 20 mins the tokenization stage is complete, but it's not evident in the
said only 5 mil articles. Maybe you can generate a co-occurrence
matrix :) every ngram to every other ngram :) Sounds fun? It will be HUGE!
On Sun, Feb 28, 2010 at 3:43 AM, Jake Mannix jake.man...@gmail.com
wrote:
15GB of tokenized documents, not bad, not bad. We're not going
to get a multi
. So bye
Robin
On Sun, Feb 28, 2010 at 3:57 AM, Robin Anil robin.a...@gmail.com wrote:
Like I said, only 5 mil articles. Maybe you can generate a co-occurrence
matrix :) every ngram to every other ngram :) Sounds fun? It will be
HUGE!
On Sun, Feb 28, 2010 at 3:43 AM, Jake Mannix
URL: https://issues.apache.org/jira/browse/MAHOUT-310
Project: Mahout
Issue Type: Improvement
Affects Versions: 0.3
Reporter: Jake Mannix
Assignee: Jake Mannix
LanczosSolver calls inputMatrix.timesSquared(Vector) as its Krylov iteration
[
https://issues.apache.org/jira/browse/MAHOUT-310?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Jake Mannix updated MAHOUT-310:
---
Attachment: MAHOUT-310.patch
Patch has newly modified unit tests to test the symmetric case
Hmm...
code: *check*
desire to add stochastic decomp to code: *check*
amazon credits: *check* (my account today: almost $300 left burning hole in
pocket)
relatively gigantic social graph: *check*
legal ability to put gigantic social graph on ec2: not so check, but maybe
some
clever anonymization
On Thu, Feb 25, 2010 at 12:38 PM, Robin Anil robin.a...@gmail.com wrote:
What's the largest dataset available? BixoLabs? Wikipedia (5 Mil
articles)...
I don't know anything public that is that big
5 million articles, if you take all the 1,2,3,4, and 5-grams data out of it,
you
could easily hit
On Thu, Feb 25, 2010 at 12:42 PM, Grant Ingersoll gsing...@apache.org wrote:
I'd be a little wary of that and I'd hate to see anything happen to it (AOL
comes to mind). That being said, if you just export the vectors w/o the
key, it really is pretty anonymous. What other sources can we
On Thu, Feb 25, 2010 at 12:49 PM, Robin Anil robin.a...@gmail.com wrote:
unigrams 3 = 384 MB dictionary... with all ngrams(pruned by llr 1) we
might hit some 5-10GB of entries. With some 25 char average for 5 grams it
might be safe to say that we might hit 100 million rows easily?
Wait
:
Stochastic decomposition doesn't care about this, I don't think.
On Thu, Feb 25, 2010 at 1:43 PM, Jake Mannix jake.man...@gmail.com
wrote:
Of course, at this point we've
got
too many terms to properly do the decomposition directly on the input
matrix,
--
Ted Dunning, CTO
DeepDyve
in yet, but it's a pretty critically
useful
enhancement to DistributedSparseRowMatrix which we need anyways.
-jake
-jake
On Thu, Feb 25, 2010 at 1:43 PM, Jake Mannix jake.man...@gmail.com
wrote:
Of course, at this point we've
got
too many terms to properly do the decomposition
On Thu, Feb 25, 2010 at 2:09 PM, Jake Mannix jake.man...@gmail.com wrote:
On Thu, Feb 25, 2010 at 1:48 PM, Ted Dunning ted.dunn...@gmail.com wrote:
After we delete hapax, we may have considerably fewer tokens. But the LLR
step that Robin implied may have already dealt with that.
The more I
[
https://issues.apache.org/jira/browse/MAHOUT-301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12838725#action_12838725
]
Jake Mannix commented on MAHOUT-301:
Drew, do you have a patch with your last changes
[
https://issues.apache.org/jira/browse/MAHOUT-301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12837917#action_12837917
]
Jake Mannix commented on MAHOUT-301:
Awesome Drew, I'll check it out.
{quote}
One
Type: Improvement
Components: Math
Affects Versions: 0.3
Environment: all
Reporter: Jake Mannix
Assignee: Jake Mannix
Fix For: 0.4
DistributedLanczosSolver currently keeps all Lanczos vectors in memory on the
driver (client) computer while
[
https://issues.apache.org/jira/browse/MAHOUT-308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12838069#action_12838069
]
Jake Mannix commented on MAHOUT-308:
Of course, making sure that individual mappers
Reporter: Jake Mannix
Assignee: Jake Mannix
Fix For: 0.4
Techniques reviewed in Halko, Martinsson, and Tropp
(http://arxiv.org/abs/0909.4061).
The basic idea of the implementation is as follows: if the input matrix is
represented
why is this not showing up in the unit tests?
On Wed, Feb 24, 2010 at 6:36 PM, Jeff Eastman
jeast...@windwardsolutions.com wrote:
AbstractVector.minus has a bug in the first if clause. Don't know if my fix
or this one would do what is intended by the optimization:
On Wed, Feb 24, 2010 at 6:43 PM, Jeff Eastman j...@windwardsolutions.com wrote:
The unit test is subtracting a vector from itself and testing for zero :)
Egads!
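
To make the point about the test concrete, here is a sketch of why subtracting a vector from itself is a weak check. The buggy method below is a hypothetical example of a sparse-style shortcut gone wrong; it is an assumption for illustration and is not the actual bug in AbstractVector.minus, which this snippet does not show.

```java
import java.util.Arrays;

// Hypothetical example; the real AbstractVector.minus bug is not shown in the thread.
final class MinusTestSketch {

    // Buggy shortcut: only visits positions where a is non-zero,
    // so entries that are non-zero only in b are silently dropped.
    static double[] buggyMinus(double[] a, double[] b) {
        double[] result = new double[a.length];
        for (int i = 0; i < a.length; i++) {
            if (a[i] != 0.0) {
                result[i] = a[i] - b[i];
            }
        }
        return result;
    }

    // Reference implementation: visits every position.
    static double[] minus(double[] a, double[] b) {
        double[] result = new double[a.length];
        for (int i = 0; i < a.length; i++) {
            result[i] = a[i] - b[i];
        }
        return result;
    }

    static boolean isZero(double[] v) {
        for (double x : v) {
            if (x != 0.0) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        double[] x = {1.0, 0.0, 3.0};
        double[] y = {0.0, 2.0, 3.0};
        // The self-subtraction check passes even with the bug present.
        System.out.println(isZero(buggyMinus(x, x)));
        // Two different vectors expose the difference.
        System.out.println(Arrays.toString(buggyMinus(x, y)));
        System.out.println(Arrays.toString(minus(x, y)));
    }
}
```

A test that uses two vectors with different sparsity patterns would catch this class of bug, whereas x minus x comes out zero under both implementations.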
[
https://issues.apache.org/jira/browse/MAHOUT-301?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Jake Mannix updated MAHOUT-301:
---
Attachment: MAHOUT-301.patch
Improve command-line shell script by allowing default properties files
[
https://issues.apache.org/jira/browse/MAHOUT-301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12838159#action_12838159
]
Jake Mannix commented on MAHOUT-301:
Ok, new patch, with the modification that indeed
[
https://issues.apache.org/jira/browse/MAHOUT-301?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Jake Mannix updated MAHOUT-301:
---
Fix Version/s: (was: 0.4)
0.3
Let's release this. Others want to try it out
[
https://issues.apache.org/jira/browse/MAHOUT-180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12837324#action_12837324
]
Jake Mannix commented on MAHOUT-180:
Hi Danny, thanks for trying this out!
You have
So to be an annoying voice of dissent... I'm going to keep iterating on
MAHOUT-301,
targeted for 0.4, and I will keep it in patch form (not checked in) _for
now_... but
if it can get its wrinkles ironed out before Hadoop gets its act together, I
really
think it should get committed to 0.3.
It's
[
https://issues.apache.org/jira/browse/MAHOUT-301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12837345#action_12837345
]
Jake Mannix commented on MAHOUT-301:
Hey Drew, thanks for looking at this. Problems
...@gmail.com wrote:
What about a new follow-on JIRA so 301 can stay in the official release
notes?
On Tue, Feb 23, 2010 at 9:40 AM, Jake Mannix jake.man...@gmail.com
wrote:
So to be an annoying voice of dissent... I'm going to keep iterating on
MAHOUT-301,
targeted for 0.4, and I
[
https://issues.apache.org/jira/browse/MAHOUT-301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12837351#action_12837351
]
Jake Mannix commented on MAHOUT-301:
Ok, Drew, got your patch in diff mode against mine
[
https://issues.apache.org/jira/browse/MAHOUT-301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12837428#action_12837428
]
Jake Mannix commented on MAHOUT-301:
{quote}
Something else I noticed
[
https://issues.apache.org/jira/browse/MAHOUT-301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12837440#action_12837440
]
Jake Mannix commented on MAHOUT-301:
{quote}
Jake, the basic idea is that you would
[
https://issues.apache.org/jira/browse/MAHOUT-301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12837472#action_12837472
]
Jake Mannix commented on MAHOUT-301:
{quote}
Ahh, I see where you're coming from, so