Yeah this was my change that didn't work:
public class DummyOutputCollector<K extends WritableComparable, V
extends Writable>
public class DummyOutputCollector<K extends WritableComparable<?>, V
extends Writable>
The latter is more correct and as far as I know identical. I don't see
why this doesn't
On Mon, Jan 18, 2010 at 2:24 AM, Drew Farris drew.far...@gmail.com wrote:
On Sun, Jan 17, 2010 at 9:10 PM, Sean Owen sro...@gmail.com wrote:
There are already cases where code needs to control the seed (mostly
to serialize/deserialize the exact state of an object). I don't think
that's the
Same here, I don't like Spring myself as it smells like
overengineering -- certainly for this case. I'm otherwise a luddite
though and could more broadly be convinced.
On Mon, Jan 18, 2010 at 2:49 AM, Ted Dunning ted.dunn...@gmail.com wrote:
I have had too many unpleasant experiences using
[
https://issues.apache.org/jira/browse/MAHOUT-260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12801708#action_12801708
]
Sean Owen commented on MAHOUT-260:
--
I still don't understand what this solves. We already
This just avoids the class load in the test. I don't think it is necessary.
On Mon, Jan 18, 2010 at 1:04 AM, Sean Owen (JIRA) j...@apache.org wrote:
I still don't understand what this solves. We already 'fixed' the
performance issue.
--
Ted Dunning, CTO
DeepDyve
[
https://issues.apache.org/jira/browse/MAHOUT-153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12801716#action_12801716
]
Pallavi Palleti commented on MAHOUT-153:
Hi all,
I am ready with my patch.
2010/1/18 Jeff Eastman j...@windwardsolutions.com:
Sean Owen wrote:
Could be. I took an indirect stab at mitigating possible sources of
this issue by increasing encapsulation in the tests -- I still believe
fields should never be non-private. This may start to surface the
behind-the-scenes
As I troll through the code at times trying to polish here and there I
notice small issues to bring up --
Line separators. Lots of code independently reads
System.getProperty("line.separator") in order to output a
platform-specific line break. I argue this is actually slightly bad, since it
means
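The message above is cut off before its reasoning, so the following does not try to reconstruct it. It only sketches how the repeated lookups it describes could be collapsed into one place. The class name `LineSeparators` and the `join` helper are hypothetical and are not existing Mahout code.

```java
// Hypothetical helper, not part of Mahout: reads the platform line
// separator once and exposes it as a shared constant.
final class LineSeparators {

  /** Platform-specific line break, looked up a single time at class load. */
  static final String LINE_SEPARATOR = System.getProperty("line.separator");

  private LineSeparators() {
    // no instances; constant and static helper only
  }

  /** Joins the given lines with the platform line break between them. */
  static String join(String... lines) {
    StringBuilder sb = new StringBuilder();
    for (int i = 0; i < lines.length; i++) {
      if (i > 0) {
        sb.append(LINE_SEPARATOR);
      }
      sb.append(lines[i]);
    }
    return sb.toString();
  }
}
```

With something like this in place, callers would reference `LineSeparators.LINE_SEPARATOR` instead of querying the system property themselves, and a later decision to switch to a fixed separator would touch a single line.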
[
https://issues.apache.org/jira/browse/MAHOUT-260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12801748#action_12801748
]
Benson Margulies commented on MAHOUT-260:
-
Well,
I thought I saw email go by to
It's this kind of thing that forced the move to sequence files instead of
TextKeyValueInput format and other text-based/CSV-based formats. Kind of
regretting the decision to go with the tab-separated format for BayesClassifier,
which I wrote 2 years ago. I will be modifying this to use sparse vectors
[
https://issues.apache.org/jira/browse/MAHOUT-260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12801750#action_12801750
]
Sean Owen commented on MAHOUT-260:
--
My take is that we have injection already, via
[
https://issues.apache.org/jira/browse/MAHOUT-153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12801755#action_12801755
]
Grant Ingersoll commented on MAHOUT-153:
Please keep the same issue. That way the
On Mon, Jan 18, 2010 at 3:58 AM, Sean Owen sro...@gmail.com wrote:
The real fix is centralizing management of Random, tracking them, and
being able to reset them all remotely.
In what cases would you want to reset them all remotely, at the
beginning of each test?
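As a rough illustration of "centralizing, tracking, and resetting" generators, here is a minimal sketch. The class `TrackedRandoms`, its methods, and the seed value are assumptions made for the example; this is not the code of Mahout's RandomUtils.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Hypothetical sketch: hands out Random instances, remembers them,
// and can reseed all of them to a known state.
final class TrackedRandoms {

  private static final long FIXED_SEED = 42L; // arbitrary seed, an assumption
  private static final List<Random> INSTANCES = new ArrayList<Random>();

  private TrackedRandoms() {
  }

  /** Creates a Random with the fixed seed and registers it for later resets. */
  static synchronized Random create() {
    Random random = new Random(FIXED_SEED);
    INSTANCES.add(random);
    return random;
  }

  /** Reseeds every Random handed out so far, e.g. at the start of each test. */
  static synchronized void resetAll() {
    for (Random random : INSTANCES) {
      random.setSeed(FIXED_SEED);
    }
  }
}
```

A test base class could call `TrackedRandoms.resetAll()` in its setup method. Note that this sketch holds strong references to every generator it creates; a longer-lived version would probably need weak references or explicit deregistration.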
It is injected already --
On Jan 17, 2010, at 8:35 PM, Ted Dunning wrote:
We should have a beer some time anyway and the beers we owe you for cleaning
up Colt more than cancel any potential beer on this issue so I will be happy
to buy (Sean, you are included for similar reasons if we ever see each
other).
After the
Hello,
I am currently testing the MAHOUT-228-3.patch applied to the current
trunk. The merge went mostly well except a couple of duplicated chunks
in the patches (probably applied otherwise to the trunk) and a
duplicated wordlist.
However to make the tests pass I had to reduce the precision of
On Mon, Jan 18, 2010 at 2:00 PM, Drew Farris drew.far...@gmail.com wrote:
In what cases would you want to reset them all remotely, at the
beginning of each test?
You pretty much said it -- tests should start from a known, fixed
state, so that the result is the same each time, and we can assert
Could you be specific about which map/reduce job you encountered the error on?
On Mon, Jan 18, 2010 at 7:28 PM, Olivier Grisel olivier.gri...@ensta.org wrote:
2010/1/18 Robin Anil robin.a...@gmail.com:
It's this kind of thing that forced the move to sequence files instead of
TextKeyValueInput
2010/1/18 Robin Anil robin.a...@gmail.com:
Could you be specific about which map/reduce job you encountered the error on?
I thought it was on:
hadoop jar examples/target/mahout-examples-0.3-SNAPSHOT.job
org.apache.mahout.classifier.bayes.WikipediaDatasetCreatorDriver -i
wikipediadump/chunk-0001.xml
On Mon, Jan 18, 2010 at 9:06 AM, Sean Owen sro...@gmail.com wrote:
(Separately you could argue we're going about this all wrong, by
trying to depend on the exact output of the RNG..
No argument here. In practice I don't think we can really get around
using a pre-seeded RNG for tests.
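For readers unfamiliar with the point about pre-seeded generators: two `java.util.Random` instances built from the same seed produce the same sequence, which is what lets a test assert on exact values. A small sketch follows; the class name `SeededRandomDemo` and the seed are arbitrary choices for the example.

```java
import java.util.Arrays;
import java.util.Random;

// Sketch: the same seed yields the same sequence of draws.
final class SeededRandomDemo {

  /** Draws n ints in [0, 100) from a Random created with the given seed. */
  static int[] draw(long seed, int n) {
    Random random = new Random(seed);
    int[] values = new int[n];
    for (int i = 0; i < n; i++) {
      values[i] = random.nextInt(100);
    }
    return values;
  }

  public static void main(String[] args) {
    int[] first = draw(1234L, 5);
    int[] second = draw(1234L, 5);
    System.out.println(Arrays.equals(first, second)); // prints "true"
  }
}
```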
You've
You're suggesting the class choose between a regular and test-friendly
RNG, by calling one of two methods. Doesn't that put the decision with
the class instead of externally? Right now it's already external.
RandomUtils decides what to instantiate.
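The "decision is external" arrangement described here can be pictured roughly as follows. This is a hypothetical sketch: `RngFactory`, `useTestSeed`, `getRandom`, and `Sampler` are names invented for the illustration, and the code is not the actual RandomUtils source.

```java
import java.util.Random;

// Hypothetical factory: the choice between a regular and a test-friendly
// RNG is made here, not in the classes that consume randomness.
final class RngFactory {

  private static volatile boolean testMode = false;

  private RngFactory() {
  }

  /** Called by test setup code, never by the consuming classes. */
  static void useTestSeed() {
    testMode = true;
  }

  /** Consumers call this without knowing which kind of Random they get. */
  static Random getRandom() {
    return testMode ? new Random(17L) : new Random();
  }
}

// A consumer: it asks the factory and makes no choice of its own.
final class Sampler {

  private final Random random = RngFactory.getRandom();

  int nextIndex(int bound) {
    return random.nextInt(bound);
  }
}
```

In this shape the consuming class has one code path, and only test setup flips the switch; the trade-off is that the switch is global static state.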
On Mon, Jan 18, 2010 at 2:21 PM, Drew Farris
On Mon, Jan 18, 2010 at 9:23 AM, Sean Owen sro...@gmail.com wrote:
You're suggesting the class choose between a regular and test-friendly
RNG, by calling one of two methods. Doesn't that put the decision with
the class instead of externally? Right now it's already external.
RandomUtils decides
On Mon, Jan 18, 2010 at 2:36 PM, Drew Farris drew.far...@gmail.com wrote:
I'm suggesting that the instantiator/caller of the class choose
between a regular and test-friendly RNG. In some classes that creator
will be a unit test; in other cases the creator will be another piece
of production
On Mon, Jan 18, 2010 at 9:42 AM, Sean Owen sro...@gmail.com wrote:
You can punt the choice all the way up to fix that. Then regular
callers are forced to instantiate and supply the RNG in all cases, and
the API has Randoms all over the place, and I suppose I don't quite
like that
[
https://issues.apache.org/jira/browse/MAHOUT-259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Benson Margulies resolved MAHOUT-259.
-
Resolution: Fixed
Fix Version/s: 0.3
committed.
Remove all code for Object
On Mon, Jan 18, 2010 at 2:59 PM, Benson Margulies bimargul...@gmail.com wrote:
Doing significant work in static code blocks leads to nothing but
trouble, as the Random situation demonstrates.
I don't know that this is the conclusion? You're critiquing one means
of implementing injection, but
[
https://issues.apache.org/jira/browse/MAHOUT-261?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Benson Margulies updated MAHOUT-261:
Status: Patch Available (was: Open)
Give the primitive-value maps an adjustOrPutValue
Give the primitive-value maps an adjustOrPutValue call, like Trove.
---
Key: MAHOUT-261
URL: https://issues.apache.org/jira/browse/MAHOUT-261
Project: Mahout
Issue Type:
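Assuming the requested call is meant to behave like Trove's method of the same name (adjust the value if the key is present, otherwise insert an initial value), the semantics can be sketched on a plain `java.util.HashMap`. The class `CountingMap` is hypothetical and uses boxed values only for brevity; a primitive-value map would store the ints directly.

```java
import java.util.HashMap;
import java.util.Map;

// Illustration of adjust-or-put semantics; not the MAHOUT-261 patch.
final class CountingMap {

  private final Map<String, Integer> values = new HashMap<String, Integer>();

  /**
   * Adds adjustAmount to the value stored under key if the key is present;
   * otherwise stores putAmount. Returns the value now associated with key.
   */
  int adjustOrPutValue(String key, int adjustAmount, int putAmount) {
    Integer current = values.get(key);
    int updated = (current == null) ? putAmount : current + adjustAmount;
    values.put(key, updated);
    return updated;
  }
}
```

The usual use is counting: `map.adjustOrPutValue(word, 1, 1)` increments an existing count or starts it at 1 in a single call, which a primitive-value implementation can do with one hash lookup.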
[
https://issues.apache.org/jira/browse/MAHOUT-261?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Benson Margulies updated MAHOUT-261:
Attachment: MAHOUT-261.patch
Give the primitive-value maps an adjustOrPutValue call, like
I created this subject thread so that you could use the other one for
repeatability.
[
https://issues.apache.org/jira/browse/MAHOUT-261?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Benson Margulies updated MAHOUT-261:
Resolution: Fixed
Status: Resolved (was: Patch Available)
Done.
Give the
I think I might be done with collections. I can't work up any
enthusiasm for iterators, or java.util. decorators, and I think I have
the basic functionality all in place. There are a number of perhaps
pointless ways in which Colt diverges from Java collections,
particularly in the area of return
On Mon, Jan 18, 2010 at 10:10 AM, Sean Owen sro...@gmail.com wrote:
... can I try again to drag attention to an actual problem? the
repeatability issue. This injection discussion is orthogonal to it.
Is the repeatability issue caused by the switch to forkOnce? What
specifically is the issue
On Mon, Jan 18, 2010 at 10:47 AM, Drew Farris drew.far...@gmail.com wrote:
On Mon, Jan 18, 2010 at 10:10 AM, Sean Owen sro...@gmail.com wrote:
... can I try again to drag attention to an actual problem? the
repeatability issue. This injection discussion is orthogonal to it.
Arrrg. Could we
Could you check the logs? You will see a bigger stack trace that might lead
back to Mahout classes.
On Mon, Jan 18, 2010 at 9:19 PM, Olivier Grisel olivier.gri...@ensta.org wrote:
2010/1/18 Olivier Grisel olivier.gri...@ensta.org:
2010/1/18 Robin Anil robin.a...@gmail.com:
could you be specific
My 2 cents:
I wouldn't mind having all components that are non-deterministic in
nature take an RNG instance explicitly through their constructor
(instead of using static magic).
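A minimal sketch of passing the RNG through the constructor, assuming a made-up component called `RandomShuffler` (nothing here is existing Mahout code):

```java
import java.util.Random;

// Hypothetical component: the caller supplies the RNG, so a test can pass
// a seeded instance and production code can pass an unseeded one.
final class RandomShuffler {

  private final Random random;

  RandomShuffler(Random random) {
    this.random = random;
  }

  /** Fisher-Yates shuffle of a copy of the input, driven by the injected RNG. */
  int[] shuffle(int[] input) {
    int[] result = input.clone();
    for (int i = result.length - 1; i > 0; i--) {
      int j = random.nextInt(i + 1);
      int tmp = result[i];
      result[i] = result[j];
      result[j] = tmp;
    }
    return result;
  }
}
```

A unit test would construct `new RandomShuffler(new Random(7L))` and get the same permutation on every run, while each experiment in a parameter sweep could be handed its own independently seeded generator.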
That can be helpful when running several versions of the same
algorithms with different hyper-parameters in separate
CXF has a very different requirement profile than Mahout. People want
to plug web service clients and servers into all kinds of
environments, and get all huffy if forced to use something like Spring
or Guice. Mahout, at this point in its career, at least, probably
doesn't have this problem.
The
2010/1/18 Robin Anil robin.a...@gmail.com:
Could you check the logs? You will see a bigger stack trace that might lead
back to Mahout classes.
In the tasktracker logs I could find a more complete stack trace (Jetty
related, no sign of Mahout classes) and Google pointed me to
this:
I'm planning on attending
Jeff
Grant Ingersoll wrote:
On Jan 17, 2010, at 8:35 PM, Ted Dunning wrote:
We should have a beer some time anyway and the beers we owe you for cleaning
up Colt more than cancel any potential beer on this issue so I will be happy
to buy (Sean, you are included
[
https://issues.apache.org/jira/browse/MAHOUT-261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12801872#action_12801872
]
Jake Mannix commented on MAHOUT-261:
Ooooh, we need this in the Vectors.
Give the
That looks like a bug, to me... not sure where it is though...
-jake
On Mon, Jan 18, 2010 at 6:03 AM, Olivier Grisel olivier.gri...@ensta.org wrote:
Hello,
I am currently testing the MAHOUT-228-3.patch applied to the current
trunk. The merge went mostly well except a couple of duplicated
If it's SF on Thursday, someone will have to have a beer as my proxy.
I'll be back here in the snow.
On Mon, Jan 18, 2010 at 12:21 PM, Jeff Eastman
j...@windwardsolutions.com wrote:
I'm planning on attending
Jeff
Grant Ingersoll wrote:
On Jan 17, 2010, at 8:35 PM, Ted Dunning wrote:
We
[
https://issues.apache.org/jira/browse/MAHOUT-251?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Jeff Eastman resolved MAHOUT-251.
-
Resolution: Fixed
r900519 wrapped up loose ends in the patch, adding new command line arguments
I am going to address IoC issues only on this thread. The other
repeatability issues should be addressed, but on the other thread.
On Mon, Jan 18, 2010 at 7:10 AM, Sean Owen sro...@gmail.com wrote:
I am not especially in favor of my own Random patch. If people are
willing to run in
These bounds were too tight in any case. I had to loosen other bounds
during development and should have loosened these as well.
Your change is a good one.
On Mon, Jan 18, 2010 at 6:03 AM, Olivier Grisel olivier.gri...@ensta.org wrote:
Is this a consequence of the recent
2010/1/18 Ted Dunning ted.dunn...@gmail.com:
These bounds were too tight in any case. I had to loosen other bounds
during development and should have loosened these as well.
Your change is a good one.
Great! so here is the sequel:
I have written a real training convergence test and
THANK YOU.
I have been very grumpy that I couldn't get to doing this yet.
I will coordinate closely with you. I haven't used git yet in anger so it
will be a learning experience. Don't expect me to have time, though. (I
will try ... but expect not to find a hole.)
On Mon, Jan 18, 2010 at
[
https://issues.apache.org/jira/browse/MAHOUT-153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12801960#action_12801960
]
Ted Dunning commented on MAHOUT-153:
+1 to what Grant said. Go ahead and post a patch
I'll be there.
Sean, are you really going to be there? That would be fantastic.
On Mon, Jan 18, 2010 at 6:02 AM, Grant Ingersoll gsing...@apache.org wrote:
On Jan 17, 2010, at 8:35 PM, Ted Dunning wrote:
We should have a beer some time anyway and the beers we owe you for
cleaning
up
For the past... 5 years? I've been using Spring as a DI container
at every job I've had. At LinkedIn, in fact we have extended
Spring extensively
(see here: http://www.springsource.com/files/SpringAtLinkedIn.pdf
for some details). It's incredibly powerful, and while the config files
can be
Hmm, if all you guys are going to be there, I may need to push back my
flight -
I'm scheduled to fly *out* of SFO right around the time of the Meetup, but
if I can push back that flight, I will.
-jake
On Mon, Jan 18, 2010 at 1:24 PM, Ted Dunning ted.dunn...@gmail.com wrote:
I'll be there.
Hi,
mloss.org will be hosting the workshop on Machine Learning Open Source
Software at the International Conference on Machine Learning (MLOSS
'10), following similar workshops at NIPS. I believe it would be a
great venue to not only present Mahout but also to get in touch with
other MLOSS
On Mon, Jan 18, 2010 at 3:20 PM, Olivier Grisel olivier.gri...@ensta.org wrote:
In the mean time could you please give me a hint on how to value the
probes of the binary randomizer w.r.t. the window size?
The basic trade-off is the standard hashed learning trade-off between number
of training
On Jan 18, 2010, at 12:34 PM, Benson Margulies wrote:
If it's SF on Thursday, someone will have to have a beer as my proxy.
I volunteer ;-)
Sounds like we have a post-meetup meetup brewing. I'm not familiar with the
area, anyone know where we can go afterwards? Also, I'll need a ride
I would love to, but there is no chance I could make it that far.
On Mon, Jan 18, 2010 at 2:32 PM, Markus Weimer mar...@weimo.de wrote:
Hi,
mloss.org will be hosting the workshop on Machine Learning Open Source
Software at the International Conference on Machine Learning (MLOSS
'10),
On Mon, Jan 18, 2010 at 4:46 PM, Grant Ingersoll gsing...@apache.org wrote:
On Jan 18, 2010, at 12:34 PM, Benson Margulies wrote:
If it's SF on Thursday, someone will have to have a beer as my proxy.
I volunteer ;-)
You're on.
Sounds like we have a post-meetup meetup brewing. I'm
Hi Drew,
Including source code in snapshots would be great.
Currently, the HDFS reader does not work in 0.20.2. Without source code,
it's not convenient for me to debug the code.
Cheers,
Zhendong
On Sat, Jan 9, 2010 at 12:25 AM, Drew Farris drew.far...@gmail.com wrote:
I wonder if we
[
https://issues.apache.org/jira/browse/MAHOUT-228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12802106#action_12802106
]
Ted Dunning commented on MAHOUT-228:
{quote}
make sure that L1 is sparsity inducing my
[
https://issues.apache.org/jira/browse/MAHOUT-232?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
zhao zhendong updated MAHOUT-232:
-
Attachment: SequentialSVM_0.4.patch
1) Supporting sequential multi-classification (both
[
https://issues.apache.org/jira/browse/MAHOUT-232?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
zhao zhendong updated MAHOUT-232:
-
Description:
After discussing with people in this community, I decided to re-implement a