Re: VOTE: take 2: mahout-collections-1.0

2010-04-11 Thread deneche abdelhakim
+1 On Mon, Apr 12, 2010 at 4:50 AM, Ted Dunning wrote: > +1 (on trust, really) > > On Sun, Apr 11, 2010 at 6:49 PM, Benson Margulies > wrote: > >> https://repository.apache.org/content/repositories/orgapachemahout-015/ >> >> contains (this time for sure) all the artifacts for release 1.0 of the

Re: VOTE: release mahout-collections-codegen 1.0

2010-04-07 Thread deneche abdelhakim
+1 On Thu, Apr 8, 2010 at 2:57 AM, Drew Farris wrote: > +1 > > On Tue, Apr 6, 2010 at 9:08 PM, Benson Margulies > wrote: >> In order to decouple the mahout-collections library from the rest of >> Mahout, to allow more frequent releases and other good things, we >> propose to release the code ge

Re: [VOTE] Mahout as TLP

2010-03-19 Thread deneche abdelhakim
[x] +1 I'm for Mahout being a TLP and the resolution below. On 3/19/10, Drew Farris wrote: > [x] +1 I'm for Mahout being a TLP and the resolution below. > > On Fri, Mar 19, 2010 at 10:50 AM, Grant Ingersoll > wrote: > >> >> [1] >> X. Establish the Apache Mahout Project >> >>  WHEREAS, the Boar

Re: [DISCUSS] Mahout TLP Board Resolution

2010-03-18 Thread deneche abdelhakim
close, actually: عبد الحكيم =D On Thu, Mar 18, 2010 at 6:41 PM, Benson Margulies wrote: > Or perhaps: > > عبدل حكيم > > ? > > > On Thu, Mar 18, 2010 at 1:34 PM, deneche abdelhakim > wrote: >> should be "Abdelhakim Deneche <...>" cause my fir

Re: [DISCUSS] Mahout TLP Board Resolution

2010-03-18 Thread deneche abdelhakim
VED, that the persons listed immediately below be and >   hereby are appointed to serve as the initial members of the >   Apache Mahout Project: > >        • Deneche Abdelhakim >        • Isabel Drost (isa...@...) >        • Ted Dunning (tdunn...@...) >        • Jeff Eastman (jeast

Re: [DISCUSS] Mahout TLP Board Resolution

2010-03-18 Thread deneche abdelhakim
I'm in too On Thu, Mar 18, 2010 at 2:31 AM, Benson Margulies wrote: > Oh, sorry. I'm in. > > On Wed, Mar 17, 2010 at 9:17 PM, Grant Ingersoll wrote: >> >> On Mar 17, 2010, at 9:43 AM, Grant Ingersoll wrote: >> >>> Formalizing a bit more and updating the resolution based on the "opt-in" >>> emai

Re: [NOMINATION] Sean Owen as Mahout PMC Chair

2010-03-15 Thread deneche abdelhakim
+1 too On Mon, Mar 15, 2010 at 7:53 PM, Jake Mannix wrote: > +1 from over here. > > On Mon, Mar 15, 2010 at 11:36 AM, Drew Farris wrote: > >> +1 as well. >> >> On Mon, Mar 15, 2010 at 2:34 PM, Ted Dunning >> wrote: >> > Dang.  I can only second second it now. >> > >> > On Mon, Mar 15, 2010 at 1

Re: [DISCUSS] Mahout TLP Board Resolution

2010-03-15 Thread deneche abdelhakim
Just to get it right: not being in the PMC doesn't mean I'm no longer a committer, right? On Mon, Mar 15, 2010 at 6:08 PM, Jake Mannix wrote: > +1 and I'm in (my email @apache is just jmannix btw, for some reason its not > listed on those resolutions) > > On Mar 15, 2010 9:07 AM, "Robin Anil" wro

Re: [jira] Commented: (MAHOUT-323) Classify new data using Decision Forest

2010-03-07 Thread deneche abdelhakim
yes, I'm planning to make DF "look" more like a Mahout classifier. I will take a look at bayes. On Sun, Mar 7, 2010 at 7:09 PM, Robin Anil (JIRA) wrote: > >    [ > https://issues.apache.org/jira/browse/MAHOUT-323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommen

Re: [jira] Commented: (MAHOUT-323) Classify new data using Decision Forest

2010-03-07 Thread deneche abdelhakim
oops, will attach it as soon as possible. I really wonder why submit a patch and attach a patch are two different operations in JIRA ? On Sat, Mar 6, 2010 at 10:08 PM, Robin Anil (JIRA) wrote: > >    [ > https://issues.apache.org/jira/browse/MAHOUT-323?page=com.atlassian.jira.plugin.system.issue

Re: [jira] Created: (MAHOUT-318) Decision Tree Learning

2010-03-01 Thread deneche abdelhakim
The current implementation of Random Forests looks good indeed. My latest tests on NSL-KDD (http://nsl.cs.unb.ca/NSL-KDD/) show recognition rates similar to those reported in the paper. After the release of Mahout 0.3 (and the end of the current code freeze), I should post some additions, for

Re: Welcome Drew Farris

2010-02-18 Thread deneche abdelhakim
Welcome Drew =D On Fri, Feb 19, 2010 at 5:02 AM, Grant Ingersoll wrote: > > On Feb 18, 2010, at 8:32 PM, Drew Farris wrote: > >>  There's lots more stuff I'd like to get in there, >> now I only need to figure how to squeeze 48 hours of consciousness >> into a day. > > I believe there is a compre

Re: Mahout 0.3 Plan and other changes

2010-02-04 Thread deneche abdelhakim
> One important question in my mind here is how does this effect 0.20 based > jobs and pre 0.20 based jobs. I had written pfpgrowth in pure 0.20 api. and > deneche is also maintaining two version it seems. I will check the > AbstractJob and see although I maintain two versions of Decision Forests,

Re: dependency question: mahout-examples <- watchmaker-swing <- jfreechart <- jcommons?

2010-01-29 Thread deneche abdelhakim
The only example that actually uses watchmaker-swing is "Travelling Salesman", mainly because it was a direct port of an existing watchmaker example. And if I remember well, it does not actually use JFreeChart...so I think it's safe to exclude it. On Sat, Jan 30, 2010 at 5:19 AM, Drew Farris wrot

Re: Unit test lag?

2010-01-16 Thread deneche abdelhakim
13:53:25 +0100 (Sat, 09 Jan 2010) | 1 line Code style adjustments; enabled/fixed TestSamplingIterator On Sun, Jan 17, 2010 at 5:47 AM, deneche abdelhakim wrote: > I'm getting similar slowdowns with my VirtualBox Ubuntu 9.04 > > I'm suspecting that the problem is not -only- cau

Re: Unit test lag?

2010-01-16 Thread deneche abdelhakim
I'm getting similar slowdowns with my VirtualBox Ubuntu 9.04. I suspect that the problem is not -only- caused by RandomUtils because: 1. I'm familiar with MersenneTwisterRNG slowdowns (I use it a lot) but the test time used to be reported accurately by maven. Now maven reports that a test took

Re: Unit test failure

2010-01-16 Thread deneche abdelhakim
Yeah, it's probably due to the way I used to generate random data...the problem is that I never get this error =P so it's very difficult to fix...I'll try my best as soon as I have some time. In the meantime, rerunning 'mvn clean install' generally does the trick. On Sat, Jan 16, 2010 at 6:5

Re: Welcome Benson Marguiles as Mahout Committer

2010-01-13 Thread deneche abdelhakim
Welcome =D On Wed, Jan 13, 2010 at 10:36 PM, Drew Farris wrote: > Congratulations Benson. It is wonderful to see your great work in the > mahout-math (and the future mahout-collections?) come together quickly. > > On Wed, Jan 13, 2010 at 3:28 PM, Grant Ingersoll wrote: > >> The Lucene PMC is plea

Re: svn commit: r896922 [1/3] - in /lucene/mahout/trunk: core/src/main/java/org/apache/mahout/common/ core/src/main/java/org/apache/mahout/fpm/pfpgrowth/ core/src/main/java/org/apache/mahout/fpm/pfp

2010-01-08 Thread deneche abdelhakim
the build is successful, thanks =D On Fri, Jan 8, 2010 at 9:23 AM, Robin Anil wrote: > Try Now >

Re: [jira] Resolved: (MAHOUT-71) Dataset to Matrix Reader

2010-01-03 Thread deneche abdelhakim
yep :p On Sun, Jan 3, 2010 at 4:41 PM, Sean Owen (JIRA) wrote: > >     [ > https://issues.apache.org/jira/browse/MAHOUT-71?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel > ] > > Sean Owen resolved MAHOUT-71. > - > >       Resolution: Later >    Fix

Re: [math] watch out for Windows

2009-12-29 Thread deneche abdelhakim
last time I tried, running Hadoop 0.20 on Windows was impossible for me...should we still try to support Windows ? I found that installing Ubuntu on Windows using Virtual Box is the easiest way to use Hadoop inside Windows On Mon, Dec 28, 2009 at 8:47 PM, Benson Margulies wrote: > Robin & I just

Re: Publish code quality reports on web-site?

2009-12-03 Thread deneche abdelhakim
I'm not planning to make new changes to 'mapred'; my new code should go to 'mapreduce'. On Thu, Dec 3, 2009 at 3:34 PM, Isabel Drost wrote: > On Thu Sean Owen wrote: > >> I suggest our current stance be that we use 0.20.x, with the old APIs. >> When 0.21 comes out and stabilizes, we move. So I sug

Re: Publish code quality reports on web-site?

2009-11-28 Thread deneche abdelhakim
df/mapred works with the old hadoop API df/mapreduce works with hadoop 0.20 API On Saturday, November 28, 2009, Sean Owen wrote: > I'm all for generating and publishing this. > > > The CPD results highlight a question I had: what's up with the amount > of duplication between org/apache/mahout/df/

Re: 0.2 status

2009-11-12 Thread deneche abdelhakim
please use "Decision Forests" instead of "Random Forests" On Thu, Nov 12, 2009 at 9:01 AM, Robin Anil wrote: > Please edit/add stuff. > > Robin > > > == > > Apache Mahout 0.2 has been released and is now available for public > download. Apache Mahout is a sub

build failure

2009-11-09 Thread deneche abdelhakim
I'm getting the following build failure when running 'mvn clean install' : ... [INFO] [INFO] Building Mahout core [INFO]task-segment: [clean, install] [INFO] ---

Re: [jira] Commented: (MAHOUT-138) Convert main() methods to use Commons CLI for argument processing

2009-10-08 Thread deneche abdelhakim
There is also a main() method in: ./examples/src/main/java/org/apache/mahout/ga/watchmaker/cd/CDGA.java I should be able to post a patch saturday concerning CDInfosTool and CDGA. On Thu, Oct 8, 2009 at 7:29 AM, Isabel Drost (JIRA) wrote: > >    [ > https://issues.apache.org/jira/browse/MAHOUT-

Re: [jira] Commented: (MAHOUT-184) Code tweaks for .df.* code

2009-10-02 Thread deneche abdelhakim
Sure. On Fri, Oct 2, 2009 at 8:59 AM, Isabel Drost (JIRA) wrote: > >    [ > https://issues.apache.org/jira/browse/MAHOUT-184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12761501#action_12761501 > ] > > Isabel Drost commented on MAHOUT-184: > --

Re: commit rights ?

2009-09-27 Thread deneche abdelhakim
> On Sep 27, 2009, at 6:47 AM, Simon Willnauer wrote: > >> Are you commiting into a http or https path. you must check out via >> https in order to commit this has been an issue for many new >> commiters. >> >> Simon >> >> On Sun, Sep 27, 2009

commit rights ?

2009-09-26 Thread deneche abdelhakim
I'm trying to commit [MAHOUT-122 | https://issues.apache.org/jira/browse/MAHOUT-122], but I'm getting the following error: svn: Commit failed (details follow): svn: Server sent unexpected return value (403 Forbidden) in response to MKACTIVITY request for '/repos/asf/!svn/act/de296129-b366-459b-b18

Re: svn commit: r816569 - in /lucene/mahout/trunk/examples/src: main/java/org/apache/mahout/classifier/bayes/ main/java/org/apache/mahout/clustering/meanshift/ main/java/org/apache/mahout/clustering

2009-09-21 Thread deneche abdelhakim
Yes, it's meant to be run twice: once to select the training samples and once more to select the testing samples. It assumes that the RNG will return the exact same numbers both times. On Mon, Sep 21, 2009 at 1:54 PM, Sean Owen wrote: > I rolled it back. So the reader depends on the seed and the exact > behav
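
The two-pass behaviour described in this reply can be illustrated with a minimal sketch. It is not the actual DatasetSplit code: it uses java.util.Random in place of MersenneTwisterRNG, and the class name, method name, and threshold parameter are hypothetical. What it shows is why the seed matters: both passes must replay the same random sequence so that every row lands in exactly one of the two sets.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Hypothetical illustration; not the Mahout DatasetSplit implementation.
class SeededSplitSketch {

    /**
     * Returns the row indices that belong to the training set (training == true)
     * or to the testing set (training == false). The RNG is rebuilt from the
     * seed on every call, so two calls with the same seed replay the same draws.
     */
    static List<Integer> select(long seed, int size, double threshold, boolean training) {
        Random rng = new Random(seed); // assumption: stands in for MersenneTwisterRNG
        List<Integer> picked = new ArrayList<Integer>();
        for (int row = 0; row < size; row++) {
            boolean isTraining = rng.nextDouble() < threshold;
            if (isTraining == training) {
                picked.add(row);
            }
        }
        return picked;
    }

    public static void main(String[] args) {
        long seed = 42L;
        // First pass selects the training rows, second pass the testing rows.
        System.out.println("training: " + select(seed, 10, 0.7, true));
        System.out.println("testing:  " + select(seed, 10, 0.7, false));
    }
}
```

If the second pass used an unseeded generator, the two sets could overlap or miss rows, which appears to be the bug this thread is about.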

Re: svn commit: r816569 - in /lucene/mahout/trunk/examples/src: main/java/org/apache/mahout/classifier/bayes/ main/java/org/apache/mahout/clustering/meanshift/ main/java/org/apache/mahout/clustering

2009-09-20 Thread deneche abdelhakim
The change in "examples/src/main/java/org/apache/mahout/ga/watchmaker/cd/hadoop/DatasetSplit.java" could lead to a bug. The problem is in the following modification: - rng = new MersenneTwisterRNG(split.getSeed()); + rng = RandomUtils.getRandom(); rng is supposed to use the seed given

Re : Unit Tests pretty slow?

2009-09-18 Thread deneche abdelhakim
> Are there tests that could benefit from a little > optimization to run faster? in my machine, the examples tests are very (very...) slow. Those tests are related to watchmaker integration (my code =P ). I wrote them a year ago and I think that they can be made faster --- En date de : Ven 18.9

Re: Updating the Web site

2009-09-16 Thread deneche abdelhakim
Grant Ingersoll wrote: > > > Now when I did a forrest clean I get the same error. > > > > On Sep 16, 2009, at 9:44 AM, deneche abdelhakim > wrote: > > > >> 'forrest site' gives me: > >> > ***

Re: Updating the Web site

2009-09-16 Thread deneche abdelhakim
écrit : > De: Grant Ingersoll > Objet: Re: Updating the Web site > À: mahout-dev@lucene.apache.org > Date: Mercredi 16 Septembre 2009, 15h35 > What's the full log say? > > On Sep 16, 2009, at 7:15 AM, deneche abdelhakim wrote: > > > forest is installed in

Re: Updating the Web site

2009-09-16 Thread deneche abdelhakim
ns to write on the forrest  > install.  I believe Forrest downloads stuff to its > directories.  I  > recall seeing similar things.  Very annoying. > > On Sep 15, 2009, at 7:12 AM, deneche abdelhakim wrote: > > > I'm already using Java 1.5 ! > > > > ---

Re: JIRA permission ?

2009-09-15 Thread deneche abdelhakim
Thanks! --- En date de : Mar 15.9.09, Isabel Drost a écrit : > De: Isabel Drost > Objet: Re: JIRA permission ? > À: mahout-dev@lucene.apache.org > Date: Mardi 15 Septembre 2009, 17h23 > On Tue, 15 Sep 2009 14:52:28 + > (GMT) > deneche abdelhakim > wrote: > &

JIRA permission ?

2009-09-15 Thread deneche abdelhakim
now that I'm a committer ( 8D ) I suppose I can assign JIRA issues to myself. Do I need a special permission to do that ? because I'm not able to find a way to do it =P

Re: Updating the Web site

2009-09-15 Thread deneche abdelhakim
.5 for it and it should  > work. > > On Sep 15, 2009, at 6:24 AM, deneche abdelhakim wrote: > > > I followed the instructions available here: > > > > http://cwiki.apache.org/MAHOUT/howtoupdatethewebsite.html > > > > in order to add my name to the committer

Re: Re : Welcome the newest Mahouts!

2009-09-15 Thread deneche abdelhakim
te de : Mar 15.9.09, Isabel Drost a écrit : > De: Isabel Drost > Objet: Re: Re : Welcome the newest Mahouts! > À: mahout-dev@lucene.apache.org > Date: Mardi 15 Septembre 2009, 12h29 > On Tue, 15 Sep 2009 10:11:56 + > (GMT) > deneche abdelhakim > wrote: > > > Got

Updating the Web site

2009-09-15 Thread deneche abdelhakim
I followed the instructions available here: http://cwiki.apache.org/MAHOUT/howtoupdatethewebsite.html in order to add my name to the committer list =P when running 'forrest run' but I'm getting broken links: X [0] skin/images/current.gif BROKEN: /home/hakim/apache-forrest-0.8/main/webapp/.

Re : Welcome the newest Mahouts!

2009-09-15 Thread deneche abdelhakim
learn much more. --- En date de : Mer 26.8.09, Grant Ingersoll a écrit : > De: Grant Ingersoll > Objet: Welcome the newest Mahouts! > À: mahout-u...@lucene.apache.org, "Mahout Dev List" > > Date: Mercredi 26 Août 2009, 16h57 > I am pleased to announce that the >

Re : Comprehensive study on Java Memory Optimization

2009-09-14 Thread deneche abdelhakim
Thanks Robin =D --- En date de : Lun 14.9.09, Robin Anil a écrit : > De: Robin Anil > Objet: Comprehensive study on Java Memory Optimization > À: "mahout-dev" > Date: Lundi 14 Septembre 2009, 9h08 > Hope it would be useful. > Link: > http://www.cs.virginia.edu/kim/publicity/pldi09tutorials/mem

Re : [GSOC] Code Submissions

2009-09-08 Thread deneche abdelhakim
done. --- En date de : Mar 8.9.09, Grant Ingersoll a écrit : > De: Grant Ingersoll > Objet: [GSOC] Code Submissions > À: "Mahout Dev List" > Date: Mardi 8 Septembre 2009, 13h09 > Hi Robin, David and Deneche, > > You will need to submit code samples.  Please see > http://groups.google.com/gro

Re : Standardize on Mersenne Twister RNG?

2009-09-07 Thread deneche abdelhakim
I used MersenneTwisterRNG a lot and noticed it somehow slows down the unit tests, maybe because it always tries to get a random seed from the net. --- On Mon 7.9.09, Sean Owen wrote: > From: Sean Owen > Subject: Standardize on Mersenne Twister RNG? > To: mahout-dev@lucene.apache.org > D

Re: [jira] Commented: (MAHOUT-145) PartialData mapreduce Random Forests

2009-09-06 Thread deneche abdelhakim
I'll try...may take some time but I 'll surely learn a lot (will also need a refill on my pain killers) --- En date de : Dim 6.9.09, Ted Dunning a écrit : > De: Ted Dunning > Objet: Re: [jira] Commented: (MAHOUT-145) PartialData mapreduce Random > Forests > À: mahout-dev@lucene.apache.org >

Re: build failure

2009-08-26 Thread deneche abdelhakim
just got the same error, nuking .m2 AND installing maven 2.2.1 solved the problem --- En date de : Mar 25.8.09, Ted Dunning a écrit : > De: Ted Dunning > Objet: Re: build failure > À: mahout-dev@lucene.apache.org, isa...@apache..org > Date: Mardi 25 Août 2009, 0h58 > Tried the -U solution.  N

Re: class not found bug ?

2009-08-18 Thread deneche abdelhakim
lucene.apache.org > Date: Lundi 17 Août 2009, 14h46 > > On Aug 17, 2009, at 7:36 AM, deneche abdelhakim wrote: > > > I moved recently some of the "Decision Forest" > examples from the core project to the examples project. > While in core they worked perfectly in h

class not found bug ?

2009-08-17 Thread deneche abdelhakim
I recently moved some of the "Decision Forest" examples from the core project to the examples project. While in core they worked perfectly in Hadoop 0.19.1 (pseudo-distributed), but now they don't! For example, running my org.apache.mahout.df.BuildForest gives the following exception:

Re : Error building Mahout

2009-07-23 Thread deneche abdelhakim
Maven 2.1.0. Deleting the local repository solves the problems; I just hope I won't have to do it often. - Original message - From: Grant Ingersoll To: mahout-dev@lucene.apache.org Sent: Wednesday, 22 July 2009, 19:42:04 Subject: Re: Error building Mahout What version of Mvn? W

Re : Error building Mahout

2009-07-22 Thread deneche abdelhakim
I'm getting it too when building from the base directory - Message d'origine De : Robin Anil À : mahout-dev Envoyé le : Mercredi, 22 Juillet 2009, 19h15mn 38s Objet : Error building Mahout I am getting this error on building mahout. mvn clean install -e take a look at the debug outpu

Re: [jira] Commented: (MAHOUT-140) In-memory mapreduce Random Forests

2009-07-18 Thread deneche abdelhakim
Actually, I'm not using any reducer at all; the output of the mappers is collected and handled by the main program after the end of the job. Running the job with 10 map tasks on a 10-instance (c1.medium) cluster takes 0h 11m 39s 209; speculative execution is on, so 12 map tasks have been launche

Re : [GSOC] July 6 is mid-term evaluations

2009-07-07 Thread deneche abdelhakim
The students mid-term survey is available online. I'm posting this because I almost forgot it =P --- En date de : Mer 17.6.09, Grant Ingersoll a écrit : > De: Grant Ingersoll > Objet: [GSOC] July 6 is mid-term evaluations > À: mahout-dev@lucene.apache.org > Date: Mercredi 17 Juin 2009, 15h54

Re: problems downloading lucene-analyzers

2009-07-01 Thread deneche abdelhakim
: Mar 30.6.09, Grant Ingersoll a écrit : > De: Grant Ingersoll > Objet: Re: problems downloading lucene-analyzers > À: mahout-dev@lucene.apache.org > Date: Mardi 30 Juin 2009, 15h20 > > FWIW, it works for me. > > On Jun 30, 2009, at 6:54 AM, deneche abdelhakim wrote: >

problems downloading lucene-analyzers

2009-06-30 Thread deneche abdelhakim
I'm having problems with the lucene-analyzers (2.9-SNAPSHOT) dependency: because it's a snapshot, "mvn install" downloads a new version every day, and most of the time I get checksum failures! Is anybody else having the same problem? mvn -version : Maven version: 2.0.9 Java version: 1.6.0_0 OS n

Re: [GSOC] Thoughts about Random forests map-reduce implementation

2009-06-18 Thread deneche abdelhakim
nteresting to > modify the original > algorithm to build multiple trees for different portions of > the data.  That > loses some of the solidity of the original method, but > could actually do > better if the splits exposed non-stationary behavior. > > On Wed, Jun 17, 2009 at 3:

[GSOC] Thoughts about Random forests map-reduce implementation

2009-06-17 Thread deneche abdelhakim
As we talked about in the following discussion (A), I'm considering two ways to implement a distributed map-reduce builder. Given the reference implementation, the easiest implementation is the following: * the data is distributed to the slave nodes using the DistributedCache * each mapper load

Re: [jira] Updated: (MAHOUT-122) Random Forests Reference Implementation

2009-05-28 Thread deneche abdelhakim
Ok I'm going to add the tests... --- En date de : Mer 27.5.09, Ted Dunning a écrit : > De: Ted Dunning > Objet: Re: [jira] Updated: (MAHOUT-122) Random Forests Reference > Implementation > À: mahout-dev@lucene.apache.org > Date: Mercredi 27 Mai 2009, 19h38 > Tests are incredibly valuable, >

Re: [GSOC] Accepted Students

2009-04-21 Thread deneche abdelhakim
ar 21.4.09, David Hall a écrit : > De: David Hall > Objet: Re: [GSOC] Accepted Students > À: mahout-dev@lucene.apache.org > Date: Mardi 21 Avril 2009, 8h30 > On Mon, Apr 20, 2009 at 11:18 PM, > deneche abdelhakim > wrote: > > > > Hi, > > > > =D > >

[GSOC] Accepted Students

2009-04-20 Thread deneche abdelhakim
Hi, =D I've been accepted. And I'll be working on Random Forests =P Given it's my second participation, I have one piece of advice: don't be shy to ask about anything related to your project on this list (starting from now); it's the fastest way to learn about Mahout. Who else has been accepted?

Re: [gsoc] random forests

2009-04-01 Thread deneche abdelhakim
ee your application on the GSOC web site.  > Nor on the apache wiki. > > Time is running out and I would hate to not see you in the > program.  Is it > just that I can't see the application yet? > > On Tue, Mar 31, 2009 at 1:05 PM, deneche abdelha

Re: [gsoc] random forests

2009-03-31 Thread deneche abdelhakim
Here is a draft of my proposal ** Title/Summary: [Apache Mahout] Implement parallel Random/Regression Forests Student: AbdelHakim Deneche Student e-mail: ... Student Major: Phd in Computer Science Student Degree: Master in Computer Science Student

Re: [gsoc] random forests

2009-03-30 Thread deneche abdelhakim
at would help. > > You will know much more about this after you finish the > non-parallel > implementation than either of us knows now. > > On Mon, Mar 30, 2009 at 7:24 AM, deneche abdelhakim wrote: > > > There is still one case that this approach, even > out

Re: [gsoc] random forests

2009-03-30 Thread deneche abdelhakim
e > remedied trivially. > > Another way to put this is that the key question is how > single node > computation scales with input size.  If the scaling is > relatively linear > with data size, then your approach (3) will work no matter > the data size. > If scaling shows an

Re: [gsoc] random forests

2009-03-28 Thread deneche abdelhakim
you should read in . 2a . This implementation is, relatively, "easy" given... --- En date de : Sam 28.3.09, deneche abdelhakim a écrit : > De: deneche abdelhakim > Objet: Re: [gsoc] random forests > À: mahout-dev@lucene.apache.org > Date: Samedi 28 Mars 2009, 16h14 >

Re: [gsoc] random forests

2009-03-28 Thread deneche abdelhakim
about the nose-bleed tendency between the > two methods. > > On Sat, Mar 21, 2009 at 4:46 AM, deneche abdelhakim wrote: > > > I can't find a no-nose-bleeding algorithm > > > > > -- > Ted Dunning, CTO > DeepDyve >

Re: GSoC 2009-Discussion

2009-03-24 Thread deneche abdelhakim
talking about Random Forests, I think there are two possible ways to actually implement them: The first implementation is useful when the dataset is not that big (<= 2Go perhaps) and thus can be distributed via Hadoop's DistributedCache. In this case each mapper has access to all the dataset a

Re: [gsoc] random forests

2009-03-21 Thread deneche abdelhakim
pdf> > > (these seem to be versions of the same paper). > > On Sun, Mar 15, 2009 at 1:53 AM, deneche abdelhakim wrote: > > > > > I added a page to the wiki that describes how to build > a random forest and > > how to use it to classify new cases. > > &

Re: [gsoc] random forests

2009-03-17 Thread deneche abdelhakim
or a long time is > whether the choice of > variables to use for splits is chosen once per tree or > again at each split. > > I think that the latter interpretation is actually the > correct one.  You > should check my thought. > > On Sun, Mar 15, 2009 at 1:53 AM, denech

[gsoc] random forests

2009-03-15 Thread deneche abdelhakim
I added a page to the wiki that describes how to build a random forest and how to use it to classify new cases. http://cwiki.apache.org/confluence/display/MAHOUT/Random+Forests

Re: Mahout for 1.5 JVM

2009-03-09 Thread deneche abdelhakim
The following classes use the Deque interface, which is not available in Java 1.5: . org.apache.mahout.classifier.bayes.BayesClassifier . org.apache.mahout.classifier.cbayes.CBayesClassifier --- On Mon 9.3.09, Sean Owen wrote: > From: Sean Owen > Subject: Re: Mahout for 1.5 JVM >
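
For context, java.util.Deque was introduced in Java 6, while java.util.LinkedList has offered addLast, removeFirst and related methods since well before Java 5. The sketch below shows one way such code could stay compatible with a 1.5 JVM by declaring the variable as LinkedList instead of Deque. How the two classifiers actually use Deque is not shown in the snippet, so the sliding-window use case, the class name, and the method name here are assumptions chosen only for illustration.

```java
import java.util.LinkedList;

// Hypothetical illustration; not taken from BayesClassifier or CBayesClassifier.
class Java5DequeSketch {

    /**
     * Keeps only the last `capacity` values, using LinkedList methods that
     * already exist on a Java 1.5 runtime (no java.util.Deque needed).
     */
    static LinkedList<String> lastN(String[] values, int capacity) {
        LinkedList<String> window = new LinkedList<String>();
        for (String value : values) {
            window.addLast(value);        // append at the tail
            if (window.size() > capacity) {
                window.removeFirst();     // drop the oldest element from the head
            }
        }
        return window;
    }

    public static void main(String[] args) {
        String[] tokens = {"a", "b", "c", "d"};
        System.out.println(lastN(tokens, 2));
    }
}
```

The trade-off is that callers depend on a concrete class instead of an interface, which is usually acceptable for a private field or local variable.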

Re: Google SoC 2009

2009-03-03 Thread deneche abdelhakim
I'm seriously considering Random Forests (RF) as my GSoC project. They seem interesting and, judging by how often they have been suggested, they would be very useful to Mahout. I found the following discussion: http://markmail.org/message/dancn3n76ken6thb that gives a lot of useful information about RF

Re: GSoC 2009 proposition

2009-02-26 Thread deneche abdelhakim
ople how to do this on EC2, but > the bigger focus to me should be on demoing/documenting > Mahout's capabilities, versus showing how to run Mahout > on any particular platform. > > > On Feb 26, 2009, at 9:58 AM, deneche abdelhakim wrote: > > > > > Hi, &g

GSoC 2009 proposition

2009-02-26 Thread deneche abdelhakim
Hi, I'm planning to participate, again, in GSoC and I want to do it, again, with Mahout. This year, let's make Mahout run over Amazon EC2. This means building the proper AMIs, running some Mahout projects (the GA examples) over EC2, giving feedback and writing simple, clear How-Tos about running a Mahout

Re: Thought: offering EC2/S3-based services

2009-02-03 Thread deneche abdelhakim
It's a silly question :P but can you use Mahout without using Hadoop ? do you mean that when having one single 'multi-core" machine, one can use Mahout alone ? (Ok, that's two silly questions) --- En date de : Lun 2.2.09, Sean Owen a écrit : > De: Sean Owen > Objet: Re: Thought: offering EC2

Re: Towards 0.1

2009-01-30 Thread deneche abdelhakim
About MAHOUT-102 (https://issues.apache.org/jira/browse/MAHOUT-102), the patch is already available, if someone could just commit it. Also, I'm not able to make my patches delete files (or directories) when applied; is it because I'm not a committer or because I'm using TortoiseSVN? --- En date

Re: Re : @Override annotations

2009-01-22 Thread deneche abdelhakim
nnotations > À: mahout-dev@lucene.apache.org > Date: Jeudi 22 Janvier 2009, 10h05 > I think mahout should compile with both 1.5 and 1.6. > > On Wed, Jan 21, 2009 at 11:23 PM, deneche abdelhakim > wrote: > > > Last time I tried to compile the Mahout trunk, I got a > si

Re : @Override annotations

2009-01-21 Thread deneche abdelhakim
Last time I tried to compile the Mahout trunk, I got a similar problem. In my case, I'm using Eclipse and the errors were caused by the JDK Compliance Level (in the project properties). In short, I was using JVM 1.6 JRE but with 5.0 compliance level (forgot to change it !). I found the answer i
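
The snippet is cut off before it names the exact errors, so the following is only a sketch of the most common way this mismatch shows up, assuming the thread is about @Override on methods that implement an interface. At source level 1.6 and later the annotation is accepted there; at source level 1.5 both javac and Eclipse reject it. All names below are hypothetical.

```java
// Hypothetical illustration of the compliance-level difference.
class OverrideComplianceSketch {

    interface Evaluator {
        double evaluate(double x);
    }

    static class Doubler implements Evaluator {
        // Accepted at source level 1.6+. At source level 1.5 this annotation is
        // a compile error, because the method implements an interface method
        // and does not override a superclass method.
        @Override
        public double evaluate(double x) {
            return 2.0 * x;
        }
    }

    public static void main(String[] args) {
        Evaluator evaluator = new Doubler();
        System.out.println(evaluator.evaluate(21.0));
    }
}
```

In Eclipse the setting in question lives under the project properties (Java Compiler, compiler compliance level), which matches the fix described in the message.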

Re: More proposed changes across code

2008-10-20 Thread deneche abdelhakim
--- En date de : Dim 19.10.08, Grant Ingersoll <[EMAIL PROTECTED]> a écrit : > De: Grant Ingersoll <[EMAIL PROTECTED]> > Objet: Re: More proposed changes across code > À: mahout-dev@lucene.apache.org > Date: Dimanche 19 Octobre 2008, 18h30 > On Oct 19, 2008, at 11:16 AM, Sean Owen wrote: > > >

Re : More proposed changes across code

2008-10-19 Thread deneche abdelhakim
> 5. BruteForceTravellingSalesman says "copyright Daniel > Dwyer" -- can > this be replaced by the standard copyright header? Oops, I thought I changed them all! Yes, you can replace it.

Re: Hardcoded paths in examples

2008-09-22 Thread deneche abdelhakim
AM, Karl Wettin > <[EMAIL PROTECTED]> wrote: > > Hmm, if this is test/resources, shouldn't they be > accessed using > > getResourceAsStream instead? I'll see what I can > do. > > > > 22 sep 2008 kl. 10.15 skrev Sean Owen: > > > >> Oh O
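
The getResourceAsStream suggestion quoted above can be sketched as follows. This is a minimal illustration, not the fix that was eventually committed: the class name, the helper, and the resource file name /wdbc/wdbc.data are hypothetical (the thread only mentions a wdbc directory under test/resources). The point is that a resource looked up on the classpath does not depend on the working directory the way a hardcoded path does.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration; file and class names are assumptions.
class ClasspathResourceSketch {

    /**
     * Counts the lines of a classpath resource. Returns -1 when the resource
     * is not on the classpath, so callers can report a clear error.
     */
    static int countLines(String resourcePath) {
        InputStream in = ClasspathResourceSketch.class.getResourceAsStream(resourcePath);
        if (in == null) {
            return -1; // resource not found on the classpath
        }
        try {
            BufferedReader reader =
                new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8));
            try {
                int lines = 0;
                while (reader.readLine() != null) {
                    lines++;
                }
                return lines;
            } finally {
                reader.close();
            }
        } catch (IOException e) {
            throw new IllegalStateException("Could not read " + resourcePath, e);
        }
    }

    public static void main(String[] args) {
        // A leading slash makes the lookup absolute from the classpath root.
        System.out.println(countLines("/wdbc/wdbc.data"));
    }
}
```

One limitation worth noting: getResourceAsStream opens a single file, so code that expects to list a whole directory would need a different approach, such as an index file naming the data files.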

Re: Hardcoded paths in examples

2008-09-22 Thread deneche abdelhakim
> Dumb question: why does example code depend on test code? > Can this be solved by severing that dependency? It's not from the example code but from the example's test code. In this case the example's tries to access a directory (wdbc) put into test/resources. The content of test/resources is a

Re: Mahout on EC2

2008-09-21 Thread deneche abdelhakim
; once. > > > > But the same is true of the console - you won't be > able to interact > > with the program that way either. > > > > It does sound good, in any event, to separate out > Swing client code > > from the core logic. > > > > On 9/2

Re : Mahout on EC2

2008-09-20 Thread deneche abdelhakim
Sounds cool :) I'll do the TSP part, but it may take some time because I'm a bit busy (PhD's administrative stuff). There are many available large TSP benchmarks, and it seems that there is a common file format for them TSPLIB (http://www.informatik.uni-heidelberg.de/groups/comopt/software/TSP

Re : FYI Cloud Computing Resources

2008-09-03 Thread deneche abdelhakim
I came across the following competition http://www.netflixprize.com/index It's about recommender systems, so I think it's a Taste stuff. The training dataset consists of more than 100M ratings. - Message d'origine De : Josh Myer <[EMAIL PROTECTED]> À : mahout-dev@lucene.apache.org En

Re : Going to move us to Hadoop 0.18.0, Java 6

2008-08-31 Thread deneche abdelhakim
Go on, I will do my part, I just hope GA likes Java 6 :P - Message d'origine De : Sean Owen <[EMAIL PROTECTED]> À : mahout-dev@lucene.apache.org Envoyé le : Samedi, 30 Août 2008, 21h26mn 45s Objet : Re: Going to move us to Hadoop 0.18.0, Java 6 So I should hold off on committing change

Re : the Job jar file doesn't contain the core jar in it.

2008-08-17 Thread deneche abdelhakim
You should run the "job" task in the examples directory (ant job). It will generate a file (in examples/build) called "apache-mahout-examples-0.1-dev.job"; this is the jar (even if it ends with .job) that contains both the examples and the core. - Original message - From: Robin Anil <[

double Hadoop !

2008-07-12 Thread deneche abdelhakim
I did a fresh svn checkout of mahout and I got two hadoop jars! hadoop-0.17.0-core.jar hadoop-0.17.1-core.jar I don't think we need both of them, do we? :P

Re: Mahout.GA, what comes next ?

2008-07-10 Thread deneche abdelhakim
> En date de : Jeu 10.7.08, Grant Ingersoll <[EMAIL PROTECTED]> a écrit : > > On Jul 8, 2008, at 5:56 AM, deneche abdelhakim wrote: > > > > Now that the Class Discovery (CD) example is up and running, it's > > time to think about what to do next. I alread

Mahout.GA, what comes next ?

2008-07-08 Thread deneche abdelhakim
Now that the Class Discovery (CD) example is up and running, it's time to think about what to do next. I already have some ideas, but I want to check with the community first. I see two possible ways ahead of me: A.Enhance the (CD) example  a1. handle categorical attributes  a2. generate datase

Re: Problems running the examples

2008-07-01 Thread deneche abdelhakim
I had to use an -Xmx 256m to get the tests to run without heap problems. Jeff deneche abdelhakim wrote: > I've been using Eclipse for all my testing and all just works fine. But now I want to build and test the examples using ant. I managed to modify the build.xml to generate the examples j

Problems running the examples

2008-07-01 Thread deneche abdelhakim
I've been using Eclipse for all my testing and all just works fine. But now I want to build and test the examples using ant. I managed to modify the build.xml to generate the examples job. But when I run one of the examples (for example : ...clustering.syntheticcontrol.canopy.Job) I get the foll

Re: About the Bayes TrainerDriver

2008-06-30 Thread deneche abdelhakim
In my case I used a SequenceFileOutputFormat, then I use the following code to merge, sort and read the output: ... import org.apache.hadoop.io.SequenceFile.Reader; import org.apache.hadoop.io.SequenceFile.Sorter; ...   // list all files in the output path (each reducer should generate a differen

Re : getting started with mahout, failing tests

2008-06-21 Thread deneche abdelhakim
I just did a fresh checkout and all the tests are successful! --- On Sat 21.6.08, Allen Day <[EMAIL PROTECTED]> wrote: > From: Allen Day <[EMAIL PROTECTED]> > Subject: getting started with mahout, failing tests > To: mahout-dev@lucene.apache.org > Date: Saturday 21 June 2008, 8:00 > Hi

Re: GSOC Mahout.GA, next steps ?

2008-06-09 Thread deneche abdelhakim
links to basic > papers there, so people that aren't familiar can go do > some background > reading. > > I will try to get to MAHOUT-56 this week, but others can > jump in and > review as well. > > -Grant > > On May 27, 2008, at 4:52 AM, deneche abde

Re: Gene Expression Programming in Mahout

2008-06-02 Thread deneche abdelhakim
doop and for that I > am reading up on > >> Hadoop. > > > > If you have any questions, feel free to ask us or post > your questions to the > > Hadoop mailinglists. > > > > > >> Could you tell me again if this fits well with > Mahou

Re: OutOfMemory Exception !

2008-06-02 Thread deneche abdelhakim
PROTECTED]> wrote: > > I'm getting the same errors upon upgrading. We > should look at the Hadoop 17 > > changes. > > > > On Jun 2, 2008, at 5:50 AM, deneche abdelhakim wrote: > > > >> I checked the last version of Mahout (rev. 662372) > and got t

Re: OutOfMemory Exception !

2008-06-02 Thread deneche abdelhakim
n Mahout ? > > > 2 jun 2008 kl. 11.50 skrev deneche abdelhakim: > > I checked the last version of Mahout (rev. 662372) and > got the > > following exception with many tests (the list of this > tests is at > > the end of this post): > > > &g

OutOfMemory Exception !

2008-06-02 Thread deneche abdelhakim
I checked out the latest version of Mahout (rev. 662372) and got the following exception with many tests (the list of these tests is at the end of this post): java.io.IOException: Job failed! The following message is printed in System.err : java.lang.OutOfMemoryError: Java heap space I think it's som

Re: GSOC Mahout.GA, next steps ?

2008-05-29 Thread deneche abdelhakim
--- On Wed 28.5.08, Ted Dunning <[EMAIL PROTECTED]> wrote: > How about writing the population to the file and using it as > input to map-reduce directly? Evaluations that fit into a map can > obviously handle that. This is exactly what I did in [Mahout-56] > Evaluations that need t

Re: GSOC Mahout.GA, next steps ?

2008-05-28 Thread deneche abdelhakim
> Ted Dunning <[EMAIL PROTECTED]> wrote: > > Conceptually, at least, it would be good to have the option for fitness > functions to be expressed as map-reduce programs. Unfortunately, having > mappers spawn MR programs runs the real risk of dead-lock, especially on > less than grandiose clusters.
