+1
On Mon, Apr 12, 2010 at 4:50 AM, Ted Dunning wrote:
> +1 (on trust, really)
>
> On Sun, Apr 11, 2010 at 6:49 PM, Benson Margulies
> wrote:
>
>> https://repository.apache.org/content/repositories/orgapachemahout-015/
>>
>> contains (this time for sure) all the artifacts for release 1.0 of the
+1
On Thu, Apr 8, 2010 at 2:57 AM, Drew Farris wrote:
> +1
>
> On Tue, Apr 6, 2010 at 9:08 PM, Benson Margulies
> wrote:
>> In order to decouple the mahout-collections library from the rest of
>> Mahout, to allow more frequent releases and other good things, we
>> propose to release the code ge
[x] +1 I'm for Mahout being a TLP and the resolution below.
On 3/19/10, Drew Farris wrote:
> [x] +1 I'm for Mahout being a TLP and the resolution below.
>
> On Fri, Mar 19, 2010 at 10:50 AM, Grant Ingersoll
> wrote:
>
>>
>> [1]
>> X. Establish the Apache Mahout Project
>>
>> WHEREAS, the Boar
close, actually:
عبد الحكيم
=D
On Thu, Mar 18, 2010 at 6:41 PM, Benson Margulies wrote:
> Or perhaps:
>
> عبدل حكيم
>
> ?
>
>
> On Thu, Mar 18, 2010 at 1:34 PM, deneche abdelhakim
> wrote:
>> should be "Abdelhakim Deneche <...>" cause my fir
VED, that the persons listed immediately below be and
> hereby are appointed to serve as the initial members of the
> Apache Mahout Project:
>
> • Deneche Abdelhakim
> • Isabel Drost (isa...@...)
> • Ted Dunning (tdunn...@...)
> • Jeff Eastman (jeast
I'm in too
On Thu, Mar 18, 2010 at 2:31 AM, Benson Margulies wrote:
> Oh, sorry. I'm in.
>
> On Wed, Mar 17, 2010 at 9:17 PM, Grant Ingersoll wrote:
>>
>> On Mar 17, 2010, at 9:43 AM, Grant Ingersoll wrote:
>>
>>> Formalizing a bit more and updating the resolution based on the "opt-in"
>>> emai
+1 too
On Mon, Mar 15, 2010 at 7:53 PM, Jake Mannix wrote:
> +1 from over here.
>
> On Mon, Mar 15, 2010 at 11:36 AM, Drew Farris wrote:
>
>> +1 as well.
>>
>> On Mon, Mar 15, 2010 at 2:34 PM, Ted Dunning
>> wrote:
>> > Dang. I can only second second it now.
>> >
>> > On Mon, Mar 15, 2010 at 1
just to get it right: not being on the PMC doesn't mean I'm no longer a
committer, right?
On Mon, Mar 15, 2010 at 6:08 PM, Jake Mannix wrote:
> +1 and I'm in (my email @apache is just jmannix btw, for some reason its not
> listed on those resolutions)
>
> On Mar 15, 2010 9:07 AM, "Robin Anil" wro
yes, I'm planning to make DF "look" more like a Mahout classifier. I
will take a look at bayes.
On Sun, Mar 7, 2010 at 7:09 PM, Robin Anil (JIRA) wrote:
>
> [
> https://issues.apache.org/jira/browse/MAHOUT-323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommen
oops, will attach it as soon as possible. I really wonder why submitting a
patch and attaching a patch are two different operations in JIRA?
On Sat, Mar 6, 2010 at 10:08 PM, Robin Anil (JIRA) wrote:
>
> [
> https://issues.apache.org/jira/browse/MAHOUT-323?page=com.atlassian.jira.plugin.system.issue
The current implementation of Random Forests looks good indeed. My
latest tests on NSL-KDD (http://nsl.cs.unb.ca/NSL-KDD/) show recognition
rates similar to those reported in the paper. After the
release of Mahout 0.3 (and the end of the current code freeze), I
should post some additions, for
Welcome Drew
=D
On Fri, Feb 19, 2010 at 5:02 AM, Grant Ingersoll wrote:
>
> On Feb 18, 2010, at 8:32 PM, Drew Farris wrote:
>
>> There's lots more stuff I'd like to get in there,
>> now I only need to figure how to squeeze 48 hours of consciousness
>> into a day.
>
> I believe there is a compre
> One important question in my mind here is how does this affect 0.20-based
> jobs and pre-0.20-based jobs. I had written pfpgrowth in the pure 0.20 API, and
> deneche is also maintaining two versions it seems. I will check the
> AbstractJob and see
although I maintain two versions of Decision Forests,
The only example that actually uses watchmaker-swing is "Travelling
Salesman", mainly because it was a direct port of an existing
Watchmaker example. And if I remember correctly, it does not actually use
JFreeChart... so I think it's safe to exclude it.
On Sat, Jan 30, 2010 at 5:19 AM, Drew Farris wrot
13:53:25 +0100 (Sat, 09 Jan 2010) | 1 line
Code style adjustments; enabled/fixed TestSamplingIterator
On Sun, Jan 17, 2010 at 5:47 AM, deneche abdelhakim wrote:
> I'm getting similar slowdowns with my VirtualBox Ubuntu 9.04
>
> I'm suspecting that the problem is not -only- cau
I'm getting similar slowdowns with my VirtualBox Ubuntu 9.04
I'm suspecting that the problem is not -only- caused by RandomUtils because:
1. I'm familiar with MersenneTwisterRNG slowdowns (I use it a lot), but
the test time used to be reported accurately by Maven. Now Maven
reports that a test took
Yeah, it's probably due to the way I used to generate random data... the
problem is that I never get this error =P so it's very difficult to
fix... I'll try my best as soon as I have some time. In the meantime,
rerunning 'mvn clean install' generally does the trick.
On Sat, Jan 16, 2010 at 6:5
Welcome =D
On Wed, Jan 13, 2010 at 10:36 PM, Drew Farris wrote:
> Congratulations Benson. It is wonderful to see your great work in the
> mahout-math (and the future mahout-collections?) come together quickly.
>
> On Wed, Jan 13, 2010 at 3:28 PM, Grant Ingersoll wrote:
>
>> The Lucene PMC is plea
the build is successful, thanks =D
On Fri, Jan 8, 2010 at 9:23 AM, Robin Anil wrote:
> Try Now
>
yep :p
On Sun, Jan 3, 2010 at 4:41 PM, Sean Owen (JIRA) wrote:
>
> [
> https://issues.apache.org/jira/browse/MAHOUT-71?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
> ]
>
> Sean Owen resolved MAHOUT-71.
> -
>
> Resolution: Later
> Fix
last time I tried, running Hadoop 0.20 on Windows was impossible for
me... should we still try to support Windows? I found that installing
Ubuntu on Windows using VirtualBox is the easiest way to use Hadoop
on Windows.
On Mon, Dec 28, 2009 at 8:47 PM, Benson Margulies wrote:
> Robin & I just
I'm not planning to make new changes to 'mapred'; my new code should go
to 'mapreduce'.
On Thu, Dec 3, 2009 at 3:34 PM, Isabel Drost wrote:
> On Thu Sean Owen wrote:
>
>> I suggest our current stance be that we use 0.20.x, with the old APIs.
>> When 0.21 comes out and stabilizes, we move. So I sug
df/mapred works with the old hadoop API
df/mapreduce works with hadoop 0.20 API
On Saturday, November 28, 2009, Sean Owen wrote:
> I'm all for generating and publishing this.
>
>
> The CPD results highlight a question I had: what's up with the amount
> of duplication between org/apache/mahout/df/
please use "Decision Forests" instead of "Random Forests"
On Thu, Nov 12, 2009 at 9:01 AM, Robin Anil wrote:
> Please edit/add stuff.
>
> Robin
>
>
> ==
>
> Apache Mahout 0.2 has been released and is now available for public
> download. Apache Mahout is a sub
I'm getting the following build failure when running 'mvn clean install':
...
[INFO]
[INFO] Building Mahout core
[INFO]    task-segment: [clean, install]
[INFO] ---
There is also a main() method in:
./examples/src/main/java/org/apache/mahout/ga/watchmaker/cd/CDGA.java
I should be able to post a patch Saturday concerning CDInfosTool and CDGA.
On Thu, Oct 8, 2009 at 7:29 AM, Isabel Drost (JIRA) wrote:
>
> [
> https://issues.apache.org/jira/browse/MAHOUT-
Sure.
On Fri, Oct 2, 2009 at 8:59 AM, Isabel Drost (JIRA) wrote:
>
> [
> https://issues.apache.org/jira/browse/MAHOUT-184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12761501#action_12761501
> ]
>
> Isabel Drost commented on MAHOUT-184:
> --
> On Sep 27, 2009, at 6:47 AM, Simon Willnauer wrote:
>
>> Are you commiting into a http or https path. you must check out via
>> https in order to commit this has been an issue for many new
>> commiters.
>>
>> Simon
>>
>> On Sun, Sep 27, 2009
I'm trying to commit [MAHOUT-122 |
https://issues.apache.org/jira/browse/MAHOUT-122], but I'm getting the
following error:
svn: Commit failed (details follow):
svn: Server sent unexpected return value (403 Forbidden) in response
to MKACTIVITY request for
'/repos/asf/!svn/act/de296129-b366-459b-b18
yes, it's meant to be run twice: one time selecting the training samples
and the next time the testing samples. It assumes that the RNG will return
exactly the same numbers twice.
On Mon, Sep 21, 2009 at 1:54 PM, Sean Owen wrote:
> I rolled it back. So the reader depends on the seed and the exact
> behav
The change in
"examples/src/main/java/org/apache/mahout/ga/watchmaker/cd/hadoop/DatasetSplit.java"
could lead to a bug. The problem is in the following modification:
- rng = new MersenneTwisterRNG(split.getSeed());
+ rng = RandomUtils.getRandom();
rng is supposed to use the seed given
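For illustration only (this is not the actual DatasetSplit code), here is a minimal sketch of the property the split relies on, using plain java.util.Random and a made-up selectTraining() helper: the same seed must yield the same draws on both passes, which an unseeded RandomUtils.getRandom() cannot guarantee.

import java.util.Arrays;
import java.util.Random;

public class SeededSplitSketch {
  // Hypothetical helper: decide for each sample whether it belongs to the training set.
  static boolean[] selectTraining(long seed, int nbSamples, double trainingFraction) {
    Random rng = new Random(seed); // seeded, so the selection is reproducible
    boolean[] inTraining = new boolean[nbSamples];
    for (int i = 0; i < nbSamples; i++) {
      inTraining[i] = rng.nextDouble() < trainingFraction;
    }
    return inTraining;
  }

  public static void main(String[] args) {
    long seed = 42L;
    boolean[] trainingPass = selectTraining(seed, 10, 0.7); // first run: keep the selected samples
    boolean[] testingPass = selectTraining(seed, 10, 0.7);  // second run: keep the complement
    System.out.println(Arrays.equals(trainingPass, testingPass)); // true: both passes agree
  }
}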
> Are there tests that could benefit from a little
> optimization to run faster?
on my machine, the examples tests are very (very...) slow. Those tests are
related to the Watchmaker integration (my code =P). I wrote them a year ago and I
think they can be made faster
--- On Fri 18.9
Grant Ingersoll wrote:
>
> > Now when I did a forrest clean I get the same error.
> >
> > On Sep 16, 2009, at 9:44 AM, deneche abdelhakim
> wrote:
> >
> >> 'forrest site' gives me:
> >>
> ***
wrote:
> From: Grant Ingersoll
> Subject: Re: Updating the Web site
> To: mahout-dev@lucene.apache.org
> Date: Wednesday 16 September 2009, 15:35
> What's the full log say?
>
> On Sep 16, 2009, at 7:15 AM, deneche abdelhakim wrote:
>
> > forest is installed in
ns to write on the forrest
> install. I believe Forrest downloads stuff to its
> directories. I
> recall seeing similar things. Very annoying.
>
> On Sep 15, 2009, at 7:12 AM, deneche abdelhakim wrote:
>
> > I'm already using Java 1.5 !
> >
> > ---
Thanks!
--- On Tue 15.9.09, Isabel Drost wrote:
> From: Isabel Drost
> Subject: Re: JIRA permission?
> To: mahout-dev@lucene.apache.org
> Date: Tuesday 15 September 2009, 17:23
> On Tue, 15 Sep 2009 14:52:28 +
> (GMT)
> deneche abdelhakim
> wrote:
>
now that I'm a committer ( 8D ) I suppose I can assign JIRA issues to myself.
Do I need special permission to do that? Because I'm not able to find a way
to do it =P
.5 for it and it should
> work.
>
> On Sep 15, 2009, at 6:24 AM, deneche abdelhakim wrote:
>
> > I followed the instructions available here:
> >
> > http://cwiki.apache.org/MAHOUT/howtoupdatethewebsite.html
> >
> > in order to add my name to the committer
On Tue 15.9.09, Isabel Drost wrote:
> From: Isabel Drost
> Subject: Re: Re: Welcome the newest Mahouts!
> To: mahout-dev@lucene.apache.org
> Date: Tuesday 15 September 2009, 12:29
> On Tue, 15 Sep 2009 10:11:56 +
> (GMT)
> deneche abdelhakim
> wrote:
>
> > Got
I followed the instructions available here:
http://cwiki.apache.org/MAHOUT/howtoupdatethewebsite.html
in order to add my name to the committer list =P
but when running 'forrest run' I'm getting broken links:
X [0] skin/images/current.gif
BROKEN: /home/hakim/apache-forrest-0.8/main/webapp/.
learn much more.
--- On Wed 26.8.09, Grant Ingersoll wrote:
> From: Grant Ingersoll
> Subject: Welcome the newest Mahouts!
> To: mahout-u...@lucene.apache.org, "Mahout Dev List"
>
> Date: Wednesday 26 August 2009, 16:57
> I am pleased to announce that the
>
Thanks Robin =D
--- On Mon 14.9.09, Robin Anil wrote:
> From: Robin Anil
> Subject: Comprehensive study on Java Memory Optimization
> To: "mahout-dev"
> Date: Monday 14 September 2009, 9:08
> Hope it would be useful.
> Link:
> http://www.cs.virginia.edu/kim/publicity/pldi09tutorials/mem
done.
--- On Tue 8.9.09, Grant Ingersoll wrote:
> From: Grant Ingersoll
> Subject: [GSOC] Code Submissions
> To: "Mahout Dev List"
> Date: Tuesday 8 September 2009, 13:09
> Hi Robin, David and Deneche,
>
> You will need to submit code samples. Please see
> http://groups.google.com/gro
I used MersenneTwisterRNG a lot and noticed it somehow slows down the unit
tests, maybe because it always tries to get a random seed from the net.
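A minimal sketch, assuming the Uncommons Maths MersenneTwisterRNG API, of passing an explicit 16-byte seed so the generator skips its default seeding strategy (which may try random.org over the network):

import org.uncommons.maths.random.MersenneTwisterRNG;

public class FixedSeedSketch {
  public static void main(String[] args) {
    // MersenneTwisterRNG expects a 16-byte seed; supplying one avoids the
    // default seed-generation chain and its possible network access.
    byte[] seed = new byte[16]; // a constant seed is fine for unit tests
    MersenneTwisterRNG rng = new MersenneTwisterRNG(seed);
    System.out.println(rng.nextInt());
  }
}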
--- On Mon 7.9.09, Sean Owen wrote:
> From: Sean Owen
> Subject: Standardize on Mersenne Twister RNG?
> To: mahout-dev@lucene.apache.org
> D
I'll try... it may take some time but I'll surely learn a lot (will also need a
refill on my painkillers)
--- On Sun 6.9.09, Ted Dunning wrote:
> From: Ted Dunning
> Subject: Re: [jira] Commented: (MAHOUT-145) PartialData mapreduce Random
> Forests
> To: mahout-dev@lucene.apache.org
>
just got the same error; nuking .m2 AND installing Maven 2.2.1 solved the
problem
--- On Tue 25.8.09, Ted Dunning wrote:
> From: Ted Dunning
> Subject: Re: build failure
> To: mahout-dev@lucene.apache.org, isa...@apache..org
> Date: Tuesday 25 August 2009, 0:58
> Tried the -U solution. N
lucene.apache.org
> Date: Monday 17 August 2009, 14:46
>
> On Aug 17, 2009, at 7:36 AM, deneche abdelhakim wrote:
>
> > I moved recently some of the "Decision Forest"
> examples from the core project to the examples project.
> While in core they worked perfectly in h
I recently moved some of the "Decision Forest" examples from the core project
to the examples project. In core they worked perfectly in Hadoop 0.19.1
(pseudo-distributed), but now they don't!
For example, running my org.apache.mahout.df.BuildForest gives the following
exception:
Maven 2.1.0
Deleting the local repository solves the problem; I just hope I won't have to do
it often.
- Original Message
From: Grant Ingersoll
To: mahout-dev@lucene.apache.org
Sent: Wednesday, 22 July 2009, 19:42:04
Subject: Re: Error building Mahout
What version of Mvn? W
I'm getting it too when building from the base directory
- Original Message
From: Robin Anil
To: mahout-dev
Sent: Wednesday, 22 July 2009, 19:15:38
Subject: Error building Mahout
I am getting this error on building mahout. mvn clean install -e take
a look at the debug outpu
Actually, I'm not using any reducer at all; the output of the mappers is
collected and handled by the main program after the end of the job.
Running the job with 10 map tasks on a 10-instance (c1.medium) cluster takes
0h 11m 39s 209; speculative execution is on, so 12 map tasks have been launche
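A rough sketch of such a map-only job with the old org.apache.hadoop.mapred API (class names and paths are illustrative, not the actual driver):

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;

public class MapOnlyJobSketch {
  public static void main(String[] args) throws Exception {
    JobConf conf = new JobConf(MapOnlyJobSketch.class);
    conf.setJobName("map-only sketch");
    // No reducer: each mapper's output is written directly to the output
    // directory, and the driver can collect it once the job has finished.
    conf.setNumReduceTasks(0);
    // conf.setMapperClass(MyMapper.class); // hypothetical mapper class
    FileInputFormat.setInputPaths(conf, new Path(args[0]));
    FileOutputFormat.setOutputPath(conf, new Path(args[1]));
    JobClient.runJob(conf);
  }
}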
The students' mid-term survey is available online. I'm posting this because I
almost forgot it =P
--- On Wed 17.6.09, Grant Ingersoll wrote:
> From: Grant Ingersoll
> Subject: [GSOC] July 6 is mid-term evaluations
> To: mahout-dev@lucene.apache.org
> Date: Wednesday 17 June 2009, 15:54
: Tue 30.6.09, Grant Ingersoll wrote:
> From: Grant Ingersoll
> Subject: Re: problems downloading lucene-analyzers
> To: mahout-dev@lucene.apache.org
> Date: Tuesday 30 June 2009, 15:20
>
> FWIW, it works for me.
>
> On Jun 30, 2009, at 6:54 AM, deneche abdelhakim wrote:
>
I'm having problems with the lucene-analyzers (2.9-SNAPSHOT) dependency: because
it's a snapshot, "mvn install" downloads a new version every day, and most of the
time I get checksum failures! Is anybody else having the same problem?
mvn -version :
Maven version: 2.0.9
Java version: 1.6.0_0
OS n
nteresting to
> modify the original
> algorithm to build multiple trees for different portions of
> the data. That
> loses some of the solidity of the original method, but
> could actually do
> better if the splits exposed non-stationary behavior.
>
> On Wed, Jun 17, 2009 at 3:
As we talked about in the following discussion (A), I'm considering two ways to
implement a distributed map-reduce builder.
Given the reference implementation, the easiest implementation is the following:
* the data is distributed to the slave nodes using the DistributedCache
* each mapper load
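A rough sketch (old Hadoop API; the class and method names are illustrative) of the DistributedCache step described above: the driver ships the dataset to every node, and each mapper locates its local copy, typically from configure():

import java.io.IOException;
import java.net.URI;
import org.apache.hadoop.filecache.DistributedCache;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapred.JobConf;

public class DistributedCacheSketch {
  // Driver side: ship the dataset (already on HDFS) to every slave node.
  static void addDataset(JobConf conf, String datasetOnHdfs) {
    DistributedCache.addCacheFile(URI.create(datasetOnHdfs), conf);
  }

  // Mapper side (e.g. in Mapper.configure(JobConf)): locate the local copy.
  static Path localDataset(JobConf conf) throws IOException {
    Path[] cached = DistributedCache.getLocalCacheFiles(conf);
    return cached[0]; // each mapper then loads the whole dataset from this path
  }
}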
OK, I'm going to add the tests...
--- On Wed 27.5.09, Ted Dunning wrote:
> From: Ted Dunning
> Subject: Re: [jira] Updated: (MAHOUT-122) Random Forests Reference
> Implementation
> To: mahout-dev@lucene.apache.org
> Date: Wednesday 27 May 2009, 19:38
> Tests are incredibly valuable,
>
Tue 21.4.09, David Hall wrote:
> From: David Hall
> Subject: Re: [GSOC] Accepted Students
> To: mahout-dev@lucene.apache.org
> Date: Tuesday 21 April 2009, 8:30
> On Mon, Apr 20, 2009 at 11:18 PM,
> deneche abdelhakim
> wrote:
> >
> > Hi,
> >
> > =D
> >
Hi,
=D
I've been accepted. And I'll be working on Random Forests
=P
Given it's my second participation, I have one piece of advice: don't be shy to ask
about anything related to your project on this list (starting from now); it's
the fastest way to learn about Mahout.
Who else has been accepted?
ee your application on the GSOC web site.
> Nor on the apache wiki.
>
> Time is running out and I would hate to not see you in the
> program. Is it
> just that I can't see the application yet?
>
> On Tue, Mar 31, 2009 at 1:05 PM, deneche abdelha
Here is a draft of my proposal
**
Title/Summary: [Apache Mahout] Implement parallel Random/Regression Forests
Student: AbdelHakim Deneche
Student e-mail: ...
Student Major: PhD in Computer Science
Student Degree: Master in Computer Science
Student
at would help.
>
> You will know much more about this after you finish the
> non-parallel
> implementation than either of us knows now.
>
> On Mon, Mar 30, 2009 at 7:24 AM, deneche abdelhakim wrote:
>
> > There is still one case that this approach, even
> out
e
> remedied trivially.
>
> Another way to put this is that the key question is how
> single node
> computation scales with input size. If the scaling is
> relatively linear
> with data size, then your approach (3) will work no matter
> the data size.
> If scaling shows an
you should read in . 2a
. This implementation is, relatively, "easy" given...
--- On Sat 28.3.09, deneche abdelhakim wrote:
> From: deneche abdelhakim
> Subject: Re: [gsoc] random forests
> To: mahout-dev@lucene.apache.org
> Date: Saturday 28 March 2009, 16:14
>
about the nose-bleed tendency between the
> two methods.
>
> On Sat, Mar 21, 2009 at 4:46 AM, deneche abdelhakim wrote:
>
> > I can't find a no-nose-bleeding algorithm
>
>
>
>
> --
> Ted Dunning, CTO
> DeepDyve
>
talking about Random Forests, I think there are two possible ways to actually
implement them:
The first implementation is useful when the dataset is not that big (<= 2GB
perhaps) and thus can be distributed via Hadoop's DistributedCache. In this
case each mapper has access to the whole dataset a
pdf>
>
> (these seem to be versions of the same paper).
>
> On Sun, Mar 15, 2009 at 1:53 AM, deneche abdelhakim wrote:
>
> >
> > I added a page to the wiki that describes how to build
> a random forest and
> > how to use it to classify new cases.
> >
or a long time is
> whether the choice of
> variables to use for splits is chosen once per tree or
> again at each split.
>
> I think that the latter interpretation is actually the
> correct one. You
> should check my thought.
>
> On Sun, Mar 15, 2009 at 1:53 AM, denech
I added a page to the wiki that describes how to build a random forest and how
to use it to classify new cases.
http://cwiki.apache.org/confluence/display/MAHOUT/Random+Forests
The following classes use the Deque interface, which is not available in Java
1.5 (a 1.5-compatible alternative is sketched after the list):
. org.apache.mahout.classifier.bayes.BayesClassifier
. org.apache.mahout.classifier.cbayes.CBayesClassifier
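A tiny sketch of that 1.5-compatible workaround, assuming nothing more than java.util.LinkedList (the class name here is made up): LinkedList already provides deque-style operations, so java.util.Deque (added in Java 6) isn't strictly needed.

import java.util.LinkedList;

public class NoDequeSketch {
  public static void main(String[] args) {
    // java.util.Deque only appeared in Java 6; on 1.5 a LinkedList can play
    // the same role through addFirst/addLast/removeFirst/removeLast.
    LinkedList<String> deque = new LinkedList<String>();
    deque.addLast("a");
    deque.addFirst("b");
    System.out.println(deque.removeFirst()); // b
    System.out.println(deque.removeLast());  // a
  }
}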
--- On Mon 9.3.09, Sean Owen wrote:
> From: Sean Owen
> Subject: Re: Mahout for 1.5 JVM
>
I'm seriously considering Random Forests (RF) as my GSoC project; they seem
interesting, and judging by how often they have been suggested, they are very
useful to Mahout. I found the following discussion:
http://markmail.org/message/dancn3n76ken6thb
that gives much useful information about RF
ople how to do this on EC2, but
> the bigger focus to me should be on demoing/documenting
> Mahout's capabilities, versus showing how to run Mahout
> on any particular platform.
>
>
> On Feb 26, 2009, at 9:58 AM, deneche abdelhakim wrote:
>
> >
> > Hi,
Hi,
I'm planning to participate in GSoC again, and I want to do it, again, with
Mahout.
This year, let's make Mahout run on Amazon EC2. This means building the proper
AMIs, running some Mahout projects (the GA examples) on EC2, giving feedback and
writing simple, clear how-tos about running a Mahout
It's a silly question :P but can you use Mahout without using Hadoop? Do you
mean that, when you have a single "multi-core" machine, you can use Mahout alone?
(OK, that's two silly questions)
--- On Mon 2.2.09, Sean Owen wrote:
> From: Sean Owen
> Subject: Re: Thought: offering EC2
About MAHOUT-102 (https://issues.apache.org/jira/browse/MAHOUT-102), the patch
is already available, if someone could just commit it.
Also, I'm not able to make my patches delete files (or directories) when
applied; is it because I'm not a committer or because I'm using TortoiseSVN?
--- On
nnotations
> To: mahout-dev@lucene.apache.org
> Date: Thursday 22 January 2009, 10:05
> I think mahout should compile with both 1.5 and 1.6.
>
> On Wed, Jan 21, 2009 at 11:23 PM, deneche abdelhakim
> wrote:
>
> > Last time I tried to compile the Mahout trunk, I got a
> si
Last time I tried to compile the Mahout trunk, I got a similar problem. In my
case, I'm using Eclipse and the errors were caused by the JDK Compliance Level
(in the project properties). In short, I was using a 1.6 JRE but with 5.0
compliance level (I forgot to change it!).
I found the answer i
--- On Sun 19.10.08, Grant Ingersoll <[EMAIL PROTECTED]> wrote:
> From: Grant Ingersoll <[EMAIL PROTECTED]>
> Subject: Re: More proposed changes across code
> To: mahout-dev@lucene.apache.org
> Date: Sunday 19 October 2008, 18:30
> On Oct 19, 2008, at 11:16 AM, Sean Owen wrote:
>
> >
> 5. BruteForceTravellingSalesman says "copyright Daniel
> Dwyer" -- can
> this be replaced by the standard copyright header?
Oops, I thought I changed them all! Yes, you can replace it.
AM, Karl Wettin
> <[EMAIL PROTECTED]> wrote:
> > Hmm, if this is test/resources, shouldn't they be
> accessed using
> > getResourceAsStream instead? I'll see what I can
> do.
> >
> > 22 sep 2008 kl. 10.15 skrev Sean Owen:
> >
> >> Oh O
> Dumb question: why does example code depend on test code?
> Can this be solved by severing that dependency?
It's not from the example code but from the example's test code. In this case
the example's test tries to access a directory (wdbc) put into test/resources. The
content of test/resources is a
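A small sketch of the getResourceAsStream approach suggested in the quoted message; the class name and resource path are only illustrative:

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;

public class ResourceLoadingSketch {
  public static void main(String[] args) throws IOException {
    // Load a test resource from the classpath instead of a filesystem path.
    InputStream in = ResourceLoadingSketch.class.getResourceAsStream("/wdbc/wdbc.data");
    if (in == null) {
      throw new IOException("resource not found on the classpath");
    }
    BufferedReader reader = new BufferedReader(new InputStreamReader(in));
    System.out.println(reader.readLine()); // first line of the dataset
    reader.close();
  }
}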
; once.
> >
> > But the same is true of the console - you won't be
> able to interact
> > with the program that way either.
> >
> > It does sound good, in any event, to separate out
> Swing client code
> > from the core logic.
> >
> > On 9/2
Sounds cool :)
I'll do the TSP part, but it may take some time because I'm a bit busy (PhD
administrative stuff).
There are many large TSP benchmarks available, and it seems that there is a
common file format for them, TSPLIB
(http://www.informatik.uni-heidelberg.de/groups/comopt/software/TSP
I came across the following competition
http://www.netflixprize.com/index
It's about recommender systems, so I think it's Taste stuff. The training
dataset consists of more than 100M ratings.
- Original Message
From: Josh Myer <[EMAIL PROTECTED]>
To: mahout-dev@lucene.apache.org
Sent
Go on, I will do my part, I just hope GA likes Java 6 :P
- Original Message
From: Sean Owen <[EMAIL PROTECTED]>
To: mahout-dev@lucene.apache.org
Sent: Saturday, 30 August 2008, 21:26:45
Subject: Re: Going to move us to Hadoop 0.18.0, Java 6
So I should hold off on committing change
You should run the "job" task in the examples directory (ant job); it will
generate a file (in examples/build) called
"apache-mahout-examples-0.1-dev.job". This is the jar (even though it ends with
.job) that contains both the examples and the core.
- Original Message
From: Robin Anil <[
I did a fresh svn checkout of Mahout and got two Hadoop jars!
hadoop-0.17.0-core.jar
hadoop-0.17.1-core.jar
I don't think we need both of them, do we? :P
> On Thu 10.7.08, Grant Ingersoll <[EMAIL PROTECTED]> wrote:
>
> On Jul 8, 2008, at 5:56 AM, deneche abdelhakim wrote:
> >
> > Now that the Class Discovery (CD) example is up and running, it's
> > time to think about what to do next. I alread
Now that the Class Discovery (CD) example is up and running, it's time to think
about what to do next. I already have some ideas, but I want to check with the
community first.
I see two possible ways ahead of me:
A. Enhance the (CD) example
a1. handle categorical attributes
a2. generate datase
I had to use -Xmx256m to get the tests to run without heap problems.
Jeff
deneche abdelhakim wrote:
> I've been using Eclipse for all my testing and all just works fine.
But now I want to build and test the examples using ant. I managed to modify
the build.xml to generate the examples j
I've been using Eclipse for all my testing and it all just works fine. But now I
want to build and test the examples using Ant. I managed to modify the
build.xml to generate the examples job. But when I run one of the examples (for
example: ...clustering.syntheticcontrol.canopy.Job) I get the foll
In my case I used a SequenceFileOutputFormat, then I used the following code to
merge, sort and read the output:
...
import org.apache.hadoop.io.SequenceFile.Reader;
import org.apache.hadoop.io.SequenceFile.Sorter;
...
// list all files in the output path (each reducer should generate a differen
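Since the snippet above is cut off, here is a rough sketch (with placeholder key/value classes and paths, not the original code) of what the merge, sort and read could look like with the Hadoop SequenceFile API:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.FileUtil;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.SequenceFile;
import org.apache.hadoop.io.Text;

public class ReadMapperOutputSketch {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(conf);
    Path outputDir = new Path(args[0]);          // the job's output directory
    Path sorted = new Path(outputDir, "sorted"); // merged and sorted result

    // list all files in the output path (each reducer generates a different part file)
    Path[] parts = FileUtil.stat2Paths(fs.listStatus(outputDir));

    // merge and sort them into a single sequence file
    SequenceFile.Sorter sorter =
        new SequenceFile.Sorter(fs, LongWritable.class, Text.class, conf);
    sorter.sort(parts, sorted, false);

    // read the sorted output back
    SequenceFile.Reader reader = new SequenceFile.Reader(fs, sorted, conf);
    LongWritable key = new LongWritable();
    Text value = new Text();
    while (reader.next(key, value)) {
      System.out.println(key + "\t" + value);
    }
    reader.close();
  }
}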
I just did a fresh checkout and all the tests are successful!
--- On Sat 21.6.08, Allen Day <[EMAIL PROTECTED]> wrote:
> From: Allen Day <[EMAIL PROTECTED]>
> Subject: getting started with mahout, failing tests
> To: mahout-dev@lucene.apache.org
> Date: Saturday 21 June 2008, 8:00
> Hi
links to basic
> papers there, so people that aren't familiar can go do
> some background
> reading.
>
> I will try to get to MAHOUT-56 this week, but others can
> jump in and
> review as well.
>
> -Grant
>
> On May 27, 2008, at 4:52 AM, deneche abde
doop and for that I
> am reading up on
> >> Hadoop.
> >
> > If you have any questions, feel free to ask us or post
> your questions to the
> > Hadoop mailinglists.
> >
> >
> >> Could you tell me again if this fits well with
> Mahou
PROTECTED]> wrote:
> > I'm getting the same errors upon upgrading. We
> should look at the Hadoop 17
> > changes.
> >
> > On Jun 2, 2008, at 5:50 AM, deneche abdelhakim wrote:
> >
> >> I checked the last version of Mahout (rev. 662372)
> and got t
n Mahout ?
>
>
> On 2 Jun 2008, at 11:50, deneche abdelhakim wrote:
> > I checked the last version of Mahout (rev. 662372) and
> got the
> > following exception with many tests (the list of this
> tests is at
> > the end of this post):
> >
I checked the latest version of Mahout (rev. 662372) and got the following
exception with many tests (the list of these tests is at the end of this post):
java.io.IOException: Job failed!
The following message is printed to System.err:
java.lang.OutOfMemoryError: Java heap space
I think it's som
--- On Wed 28.5.08, Ted Dunning <[EMAIL PROTECTED]> wrote:
> How about writing the population to the file and using it as
> input to map-reduce directly? Evaluations that fit into a map can
> obviously handle that.
This is exactly what I did in [MAHOUT-56]
> Evaluations that need t
> Ted Dunning <[EMAIL PROTECTED]> wrote:
>
> Conceptually, at least, it would be good to have the option for fitness
> functions to be expressed as map-reduce programs. Unfortunately, having
> mappers spawn MR programs runs the real risk of dead-lock, especially on
> less than grandiose clusters.