Re: Dangling collections in front of commons

2010-04-03 Thread Dawid Weiss
>> I'm neutral... maybe let it marinate longer in Mahout, prove it's used >> and worthwhile and such? > > Yeah, I'd tend to agree here.  Let's see if we get some contributions on it > and how it plays out for us. "Marination" is exactly my motive why I work on HPPC in separation from Mahout... On

Re: [collections] and what about 'identity'?

2010-04-03 Thread Dawid Weiss
-) > HPPC. > > We'll see who gets where first. > > --benson > > > On Fri, Apr 2, 2010 at 10:06 AM, Dawid Weiss wrote: > >> > What's the use case for needing to vary the hash function? It's one of >> > those things where I assume there are incor

Re: [collections] and what about 'identity'?

2010-04-02 Thread Dawid Weiss
> What's the use case for needing to vary the hash function? It's one of > those things where I assume there are incorrect ways to do it, and > correct ways, and among the correct ways fairly clear arguments about > which function will be better -- i.e. the object should provide the > best function

Re: Collocations docs

2010-03-17 Thread Dawid Weiss
I am by no means an expert in English, but it seems the double l is etymologically justified: ORIGIN Latin, from collocare ‘place together’. (Oxford dictionary). Dawid On Wed, Mar 17, 2010 at 4:28 PM, Drew Farris wrote: > In the corpus-linguistics sense collocation is proper although to me > i

[ANN] Carrot2 release 3.2.0.

2010-03-03 Thread Dawid Weiss
m for details. Carrot Search Labs shares some small pieces of software we created when working on Carrot2 and Lingo3G. Please see http://labs.carrotsearch.com for details and downloads. Thanks! Dawid Weiss, Stanislaw Osinski Carrot Search, i...@carrot-search.com

Re: [off-topic] Maven and SCP deploy.

2010-02-24 Thread Dawid Weiss
Thanks Ted, will try it. D. On Wed, Feb 24, 2010 at 12:09 AM, Ted Dunning wrote: > barely.  This article might help: > > > http://unitstep.net/blog/2009/05/18/resolving-log4j-1215-dependency-problems-in-maven-using-exclusions/ > > On Tue, Feb 23, 2010 at 12:20 PM, Dawid Wei

Re: [off-topic] Maven and SCP deploy.

2010-02-23 Thread Dawid Weiss
> Can you over-ride that dependency? > > On Tue, Feb 23, 2010 at 12:07 PM, Dawid Weiss wrote: > >> but apparently the problem is due to an outdated jsch > > > > > -- > Ted Dunning, CTO > DeepDyve >

Re: [off-topic] Maven and SCP deploy.

2010-02-23 Thread Dawid Weiss
e page on the Wiki. > > On Feb 23, 2010, at 3:16 AM, Dawid Weiss wrote: > >> There are many folks knowledgeable about maven on this list, so I >> thought I'd ask -- I'm trying to write a POM with scp deployment, but >> maven consistently fails for me with authentic

[off-topic] Maven and SCP deploy.

2010-02-23 Thread Dawid Weiss
There are many folks knowledgeable about maven on this list, so I thought I'd ask -- I'm trying to write a POM with scp deployment, but maven consistently fails for me with authentication errors -- this is most likely caused by an outdated (and buggy) jsch dependency (0.1.38 instead of 0.1.42). Any

Re: Mahout as TLP

2010-02-12 Thread Dawid Weiss
> 1.  We'd like to organize several subprojects we wish to introduce (Core, > NLP, Recommenders/Taste, Ports - C++, etc.) that wouldn't really fit as > Lucene subprojects. And the collections package, vectors, verification and evaluation code, potential test data sets... yes, makes sense to make

Re: ]math[ 1.5, really?

2010-02-08 Thread Dawid Weiss
> It's only some @Overrides -- those on interfaces. My vote is to > eliminate  them  from the 1.5-compatible projects. +1 from me. D.

Re: ]math[ 1.5, really?

2010-02-08 Thread Dawid Weiss
> Your experience is the reverse of mine. In maven, no javac complaints. > In eclipse, plenty-o-complaints. Ooops, had the wrong class as the test. Correct, just tried to compile this: public class OverrideIsHell { public interface A { public void a(); } public class B im
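
A minimal sketch of the situation this message describes, not the truncated class from the snippet. The names `Greeter` and `PlainGreeter` are hypothetical. A compiler running at source level 1.5 rejects `@Override` on a method that only implements an interface method, while source level 1.6 and later accept it.

```java
// Hypothetical illustration; the names are not taken from the thread.
interface Greeter {
    String greet(String name);
}

class PlainGreeter implements Greeter {
    // At source level 1.5 this annotation is an error, because greet()
    // implements an interface method instead of overriding a superclass
    // method. From source level 1.6 onwards it compiles.
    @Override
    public String greet(String name) {
        return "Hello, " + name;
    }
}
```

Removing the annotation from interface implementations, as proposed elsewhere in the thread, keeps the same source compilable at both levels.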

Re: ]math[ 1.5, really?

2010-02-08 Thread Dawid Weiss
>> was in the pom.xml file.  But it worked. Until like yesterday. Color me >> confused. Perhaps you had these classes compiled from the previous runs (when 1.6 flag was on)? I don't see how it could work with javac. Oh, the third option is to remove @Override; it's not that useful anyway. D.

Re: ]math[ 1.5, really?

2010-02-08 Thread Dawid Weiss
was in the pom.xml file.  But it worked. Until like yesterday. Color me > confused. > > On Feb 8, 2010 9:00 AM, "Dawid Weiss" wrote: > > I wrote a post about this a while ago. You need to use the 1.6 > compiler, but set the target to 1.5 -- this way you can keep @Overr

Re: ]math[ 1.5, really?

2010-02-08 Thread Dawid Weiss
I wrote a post about this a while ago. You need to use the 1.6 compiler, but set the target to 1.5 -- this way you can keep @Override annotations, but emit valid 1.5 code anyway. I don't know about Maven (javac), but it definitely works in Eclipse (can be set manually via project properties). D.

Re: MAHOUT-275, collections on its own two feet

2010-02-07 Thread Dawid Weiss
Hi Benson, Apologies for my latest inactivity on this -- urgent family matters in the form of a 60cm little newborn... Just to share some of my thoughts on the collections stuff. I will still wait for a numbered release of Mahout with colt and collections in -- we need this to proceed with Carrot

Re: [math] dependency issue

2010-02-07 Thread Dawid Weiss
Kill it. Shuffling can be easily done externally should somebody need it. Dawid On Sun, Feb 7, 2010 at 7:43 AM, Ted Dunning wrote: > +1 > > On Sat, Feb 6, 2010 at 5:12 PM, Jake Mannix wrote: > >> Kill it if you don't see internal use. >> >>  +1 >> >>  -jake >> >> On Feb 6, 2010 4:42 PM, "Benson

Re: [math] More hash questions

2010-01-27 Thread Dawid Weiss
I don't think this one qualifies as a dumb bug. If it makes you (Sean) feel better, this bug is still present in PCJ... and it was never fixed in the library's 10 years or so of history. Dawid On Wed, Jan 27, 2010 at 7:30 PM, Ted Dunning wrote: > I was when I coded that algorithm! > > It can def

Re: [math] More hash questions

2010-01-27 Thread Dawid Weiss
> Yep, that's a good point. It'll involve a little copy-and-paste to > implement this alternate way of looking for a slot efficiently, but > probably worth it. Depends if you're doing interleaved put/gets, but it may be, yes. I like the guard sentinel object for marking removed keys -- I think I'l

Re: [math] More hash questions

2010-01-27 Thread Dawid Weiss
Ooops, apologies, didn't analyze this condition properly, you're right, it will go past REMOVED: while (currentKey != null && (currentKey == REMOVED || !key.equals(currentKey))) { But then -- the same thing applies to put; if you don't find the key in the map and there is a removed slot on th
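
The point about lookups walking past `REMOVED` and inserts reusing a removed slot can be sketched as follows. This is an illustration only, not the Colt or Mahout code: the class `TombstoneSet` is hypothetical and uses linear probing for brevity, whereas the code under discussion uses a second hash to pick the step.

```java
// Hypothetical open-addressing set with a sentinel for removed keys.
final class TombstoneSet {
    private static final Object REMOVED = new Object(); // marks a deleted slot
    private final Object[] keys;

    TombstoneSet(int capacity) {
        keys = new Object[capacity];
    }

    private int home(Object key) {
        return (key.hashCode() & 0x7fffffff) % keys.length;
    }

    boolean contains(Object key) {
        return find(key) >= 0;
    }

    boolean remove(Object key) {
        int index = find(key);
        if (index < 0) return false;
        keys[index] = REMOVED; // keep the probe chain intact for later keys
        return true;
    }

    boolean add(Object key) {
        int index = home(key);
        int firstRemoved = -1;
        // Scan the whole chain first: the key may live beyond a removed slot.
        for (int probes = 0; probes < keys.length && keys[index] != null; probes++) {
            if (keys[index] == REMOVED) {
                if (firstRemoved < 0) firstRemoved = index;
            } else if (key.equals(keys[index])) {
                return false; // already present
            }
            index = (index + 1) % keys.length;
        }
        if (firstRemoved < 0 && keys[index] != null) {
            throw new IllegalStateException("table is full");
        }
        // Prefer the first removed slot seen; otherwise use the empty slot.
        keys[firstRemoved >= 0 ? firstRemoved : index] = key;
        return true;
    }

    private int find(Object key) {
        int index = home(key);
        for (int probes = 0; probes < keys.length && keys[index] != null; probes++) {
            if (keys[index] != REMOVED && key.equals(keys[index])) return index;
            index = (index + 1) % keys.length; // skip REMOVED and foreign keys
        }
        return -1;
    }
}
```

The insert path only writes after the scan ends, so a key stored past a removed slot is never duplicated into that slot.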

Re: [math] More hash questions

2010-01-27 Thread Dawid Weiss
ntKey == REMOVED || key != currentKey)) > { >      if (index < jump) { >        index += hashSize - jump; >      } else { >        index -= jump; >      } >      currentKey = keys[index]; >    } >    return index; >  } > > > On Wed, Jan 27, 2010 at 8:38

Re: [math] More hash questions

2010-01-27 Thread Dawid Weiss
The implementation in Colt is correct, it is double addressing, the value of the second hash is always relatively prime to the first one (and must not be zero). The colt's implementation can be rewritten as: const_increment = 1 + h % (m - 2); if you do a loop while (true) { slot = (slot + cons
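
A small sketch of the probing scheme described here, assuming a non-negative hash `h` and a prime table size `m` greater than 2. The class and method names are hypothetical. Because the increment `1 + h % (m - 2)` lies in `[1, m - 2]`, it is never zero and shares no factor with a prime `m`, so the probe sequence visits every slot once before repeating.

```java
// Hypothetical helper; it only illustrates the probe order.
final class DoubleHashingProbe {
    /** Returns the slots in the order they would be probed. */
    static int[] probeSequence(int h, int m) {
        int increment = 1 + h % (m - 2); // second hash: constant step, never zero
        int slot = h % m;                // first hash: starting slot
        int[] order = new int[m];
        for (int i = 0; i < m; i++) {
            order[i] = slot;
            slot = (slot + increment) % m;
        }
        return order;
    }
}
```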

[jira] Updated: (MAHOUT-266) Broken Sorting can result in AIOOB exception.

2010-01-25 Thread Dawid Weiss (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-266?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dawid Weiss updated MAHOUT-266: --- Attachment: AIOOBInSortingTest.java Definitely a bug. Attaching a test case from Carrot2 that causes

Re: [math] Another minor colt mystery

2010-01-25 Thread Dawid Weiss
> (Out of curiosity what does the distribution have to do with it -- > what's a distribution for which something besides identity is better?) Oh, I don't know... let's say you know that your keys are always even, then you could have a hash that divides by two so that you avoid collisions in the ha
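
The even-keys case can be made concrete with a short sketch. It assumes non-negative keys and a table whose size is even (here a power of two); the class `EvenKeyHash` is hypothetical. With the identity hash, even keys can only land in even slots, so half the table is never used; halving the key first spreads them over all slots.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper that counts how many distinct slots a key set occupies.
final class EvenKeyHash {
    static int distinctSlots(int[] keys, int tableSize, boolean halve) {
        Set<Integer> slots = new HashSet<Integer>();
        for (int key : keys) {
            int h = halve ? key >>> 1 : key; // halve: drop the always-zero low bit
            slots.add(h % tableSize);
        }
        return slots.size();
    }
}
```

For the keys 0, 2, ..., 14 and a table of size 8, the identity hash occupies 4 slots and the halving hash occupies all 8.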

Re: Release thinking

2010-01-25 Thread Dawid Weiss
I strongly support this -- ironically, we in Carrot2 also need such a release (versioned, with Maven artefact to refer to). HPPC works more than fine for us, but portions of the code are bound to Colt and we can't easily switch all of it to HPPC yet. I'd apply that patch for sorting first though,

Re: [math] Another minor colt mystery

2010-01-25 Thread Dawid Weiss
eason. > > On Mon, Jan 25, 2010 at 10:46 AM, Sean Owen wrote: >> Dumb question, what would be better? >> >> On Jan 25, 2010 3:24 PM, "Dawid Weiss" wrote: >> >> It's consistent with standard Java library. I guess it does not matter >> much, un

Re: [math] Another minor colt mystery

2010-01-25 Thread Dawid Weiss
It's consistent with standard Java library. I guess it does not matter much, unless you have a really weird distribution of the input values. D. On Mon, Jan 25, 2010 at 4:13 PM, Benson Margulies wrote: >  Why do you think they decided that the best hash function for an int > was the int? > >  /*

Re: Sorting invariants.

2010-01-25 Thread Dawid Weiss
This looks like a bug, it's a different pattern everywhere else. See my patch. D. On Mon, Jan 25, 2010 at 4:16 PM, Benson Margulies wrote: > Well, > > I deleted my Harmony tree, I'll refetch it tonight and check. > > --benson > > > On Mon, Jan 25, 2010 at

[jira] Created: (MAHOUT-266) Broken Sorting can result in AIOOB exception.

2010-01-25 Thread Dawid Weiss (JIRA)
Reporter: Dawid Weiss Attachments: MAHOUT-266.patch The sorting condition is checked too eagerly; probably a typo while porting from Harmony (all other sorting routines have similar pattern except this one). -- This message is automatically generated by JIRA. - You can reply to this email to

[jira] Updated: (MAHOUT-266) Broken Sorting can result in AIOOB exception.

2010-01-25 Thread Dawid Weiss (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-266?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dawid Weiss updated MAHOUT-266: --- Attachment: MAHOUT-266.patch Patch that solves the issue. > Broken Sorting can result in AI

Sorting invariants.

2010-01-25 Thread Dawid Weiss
Hi Benson (and others), Say, when you moved the code from Apache Harmony, did you modify it along the way, or is it what's found in the Harmony's source code? I'm asking because we're still hitting those array out of bounds exceptions sometimes. They are tough to isolate, so I reverted to white-bo

[jira] Issue Comment Edited: (MAHOUT-264) Make mahout-math compatible with Java 1.5 (bytecode and standard library).

2010-01-22 Thread Dawid Weiss (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-264?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12803346#action_12803346 ] Dawid Weiss edited comment on MAHOUT-264 at 1/22/10 8:2

[jira] Commented: (MAHOUT-264) Make mahout-math compatible with Java 1.5 (bytecode and standard library).

2010-01-21 Thread Dawid Weiss (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-264?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12803346#action_12803346 ] Dawid Weiss commented on MAHOUT-264: Because these methods in java.util.Arrays

Re: Compiling mahout-math in 1.5-compatibility mode.

2010-01-21 Thread Dawid Weiss
d > >  -jake > > On Wed, Jan 20, 2010 at 10:57 AM, Dawid Weiss wrote: > >> I must have compiled to 1.5-bytecode, but using 1.6 standard library. >> There are calls to Arrays#copyOf and, as far as I can tell, it's the >> only thing there that is 1.6-specific.

Re: [math] collections cooked?

2010-01-20 Thread Dawid Weiss
I think your suggestion makes a lot of sense. D. On Wed, Jan 20, 2010 at 8:54 PM, Benson Margulies wrote: > On Wed, Jan 20, 2010 at 2:04 PM, Ted Dunning wrote: >> I think that is the brave and wise choice. > > If no one objects in the next day or so, I'll set up a patch to do the > splitting. >

[jira] Updated: (MAHOUT-264) Make mahout-math compatible with Java 1.5 (bytecode and standard library).

2010-01-20 Thread Dawid Weiss (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-264?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dawid Weiss updated MAHOUT-264: --- Attachment: MAHOUT-264.patch As far as I can tell, this patch solves this issue. You still _must_

[jira] Created: (MAHOUT-264) Make mahout-math compatible with Java 1.5 (bytecode and standard library).

2010-01-20 Thread Dawid Weiss (JIRA)
Issue Type: Wish Components: Math Reporter: Dawid Weiss Assignee: Benson Margulies Priority: Minor -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.

Re: Compiling mahout-math in 1.5-compatibility mode.

2010-01-20 Thread Dawid Weiss
I must have compiled to 1.5-bytecode, but using 1.6 standard library. There are calls to Arrays#copyOf and, as far as I can tell, it's the only thing there that is 1.6-specific. Will file a patch for this. Dawid On Wed, Jan 20, 2010 at 7:14 PM, Dawid Weiss wrote: >> Gee, I was sure s
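
`Arrays.copyOf` was added to the standard library in Java 1.6, which is why code calling it compiles to 1.5 bytecode but fails to link on a 1.5 runtime. One possible 1.5-compatible replacement is sketched below; the class name `ArraysCompat` is hypothetical and this is not necessarily what the actual patch does.

```java
// Hypothetical stand-in for Arrays.copyOf(int[], int) using only 1.5 APIs.
final class ArraysCompat {
    static int[] copyOf(int[] original, int newLength) {
        int[] copy = new int[newLength]; // extra elements stay zero
        System.arraycopy(original, 0, copy, 0, Math.min(original.length, newLength));
        return copy;
    }
}
```

The same pattern would need repeating for each primitive element type the code copies.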

Re: Compiling mahout-math in 1.5-compatibility mode.

2010-01-20 Thread Dawid Weiss
> Gee, I was sure something in there was using a 1.6 feature, perhaps of Arrays. Really? I recompiled it under 1.5... or so I thought... Might have been 1.6 JRE with 1.5 compatibility switch for the produced bytecode... Will look into it. Dawid

Compiling mahout-math in 1.5-compatibility mode.

2010-01-20 Thread Dawid Weiss
Hi. Is it possible to compile mahout-math in 1.5-compatibility mode? This would require adding compiler plugin rules to POM. Mahout-math does not use any of the Java 1.6-specific API, I checked. Dawid

Re: [math] collections cooked?

2010-01-20 Thread Dawid Weiss
I have integrated HPPC collections with our open source and commercial stuff, replacing PCJ. All tests pass, which is a good sign in addition to the tests already included in HPPC. The code is temporarily released in Carrot2 SVN at: https://carrot2.svn.sourceforge.net/svnroot/carrot2/labs/hppc/hpp

Re: A modest proposal for the Carrot integration

2010-01-16 Thread Dawid Weiss
f it's all right. D. > > If you think you can refine a patch to go straight into the mahout > trunk, don't let me stop you. > > > On Sat, Jan 16, 2010 at 3:48 PM, Dawid Weiss wrote: >> Have you finished with Colt? I think this is still worth completing >&

Re: A modest proposal for the Carrot integration

2010-01-16 Thread Dawid Weiss
Have you finished with Colt? I think this is still worth completing before we proceed to HPPC. Just talked to Staszek, we will move HPPC code to Carrot2 labs SVN repository (sourceforge) because we want to get rid of PCJ as soon as possible and need something versioned and sticky. I plan to make a

Re: A modest proposal for the Carrot integration

2010-01-16 Thread Dawid Weiss
> I propose a branch. Diffs from the branch to the trunk can still be > posted on the JIRA, but I think that a branch would be worthwhile in > facilitating collaboration. Do you mean -- for merging with the code I posted earlier? By the way, I've integrated Colt from Mahout with our code base. I

[jira] Updated: (MAHOUT-253) Proposal for high performance primitive collections.

2010-01-16 Thread Dawid Weiss (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-253?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dawid Weiss updated MAHOUT-253: --- Attachment: hppc-1.0-dev.zip > Proposal for high performance primitive collecti

[jira] Created: (MAHOUT-253) Proposal for high performance primitive collections.

2010-01-16 Thread Dawid Weiss (JIRA)
: Utils Reporter: Dawid Weiss Assignee: Dawid Weiss Priority: Minor A proposal for template-driven collections library (lists, sets, maps, deques), with specializations for Java primitive types to save memory and increase performance. The "templates" a

Re: Collections of primitives.

2010-01-14 Thread Dawid Weiss
m that was not related to > LGPL). I think that Jake and Sean get the credit for the heavy > lifting. > > On Thu, Jan 14, 2010 at 6:52 AM, Dawid Weiss wrote: >> Oh, as a side note to Benson -- your effort on porting these COLT >> collections is appreciated from more tha

Re: Collections of primitives.

2010-01-14 Thread Dawid Weiss
http://issues.carrot2.org/browse/CARROT-614 Dawid On Thu, Jan 14, 2010 at 9:07 AM, Dawid Weiss wrote: > Let's do this, guys: I have finished the implementation of basic data > structures. I will try to merge this code with Carrot2, replacing PCJ; > this should give me an additional

Re: Collections of primitives.

2010-01-14 Thread Dawid Weiss
Let's do this, guys: I have finished the implementation of basic data structures. I will try to merge this code with Carrot2, replacing PCJ; this should give me an additional level of confidence that everything is working fine. I plan to have this step done by Friday. Then, I will make this code a

Re: Welcome Benson Marguiles as Mahout Committer

2010-01-14 Thread Dawid Weiss
Congratulations, Benson! D. On Wed, Jan 13, 2010 at 9:28 PM, Grant Ingersoll wrote: > The Lucene PMC is pleased to welcome the addition of Benson Marguiles as a > committer on Mahout.  I hope you'll join me in offering Benson a warm welcome. > > Benson, Lucene tradition is that new committers pr

Re: Collections of primitives.

2010-01-12 Thread Dawid Weiss
Thanks for the clarification and understanding of my motives, Benson. I know Trove and I know other libraries of this type -- PCJ has been our favorite so far, but it's LGPL and our persistent attempts to ask Soren Bak to distribute that code under a different license have failed. Adapters are a

Collections of primitives.

2010-01-12 Thread Dawid Weiss
Hi guys, I see Benson working really hard on converting Colt primitive collections to Mahout -- this is great effort, really, since no such library currently exists with an Apache or BSD license. I wanted to ask you if compatibility with Java Collections is something you consider crucial for a se

[jira] Commented: (MAHOUT-121) Speed up distance calculations for sparse vectors

2009-05-23 Thread Dawid Weiss (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12712388#action_12712388 ] Dawid Weiss commented on MAHOUT-121: There is even a hash map implementation in Lu

Re: [jira] Created: (MAHOUT-120) Site search powered by Solr

2009-05-22 Thread Dawid Weiss
This looks very cool, Grant. Dawid Grant Ingersoll (JIRA) wrote: Site search powered by Solr --- Key: MAHOUT-120 URL: https://issues.apache.org/jira/browse/MAHOUT-120 Project: Mahout Issue Type: Improvement

Re: BI Over Petabytes: Meet Apache Mahout

2009-04-22 Thread Dawid Weiss
Nice, Jeff. Darn, I need to post a nicer picture of myself, the current one has a penitentiary feel to it ;) D.

Re: GSoC 2009-Discussion

2009-03-23 Thread Dawid Weiss
> [snip] a web crawler. By doing this, a crawler, for instance, can use the output of the classification to only follow certain links that lie on informative content parts. Is this interesting & make sense for you guys? Hi Samuel. This would be of great interest for the Nutch folks, I thi

Re: [VOTE] Mahout 0.1

2009-03-20 Thread Dawid Weiss
tar: A lone zero block at 14473 I assume you are on a Mac? I get that too, but it always seems to be fine. Nope, it's OpenSuSE (Linux), 64-bit. I've seen these warnings with gzip and bzip-compressed tar files occasionally, but they never meant anything that would indicate data corruption.

Re: [VOTE] Mahout 0.1

2009-03-20 Thread Dawid Weiss
Hi guys. Not much activity from me -- really ashamed of it, but swamped in other duties. Anyway, downloaded mahout-0.1-project.tar.bz2 and (OpenSuSE 10.3): tar -jxf *.bz2 gives a warning: tar: A lone zero block at 14473 Running mvn:install (Maven 2.0.9) hangs for a long time on one of the t

Re: Mahout for 1.5 JVM

2009-03-11 Thread Dawid Weiss
Once upon a time I used a simple (read: naive) workaround of copying all the sources with token replace "@Override" -> "", then compiling against Java 1.5 (to avoid potential class library linking problems). This was integrated in the project's ANT build process, was quite ugly, but did the jo

Re: Welcome Ted Dunning as Mahout Committer

2008-04-30 Thread Dawid Weiss
Cheers Ted, good to have you. How many projects can one man handle? You're a machine, mate! :) Dawid Grant Ingersoll wrote: Hi Mahouters, I'm pleased to announce that the Lucene PMC has elected Ted Dunning as a committer for Mahout (what, you mean he wasn't already? :-) ) As is probabl

[jira] Commented: (MAHOUT-27) Canopy/KMeans unit tests failing

2008-04-07 Thread Dawid Weiss (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-27?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12586362#action_12586362 ] Dawid Weiss commented on MAHOUT-27: --- This is a Hadoop issue, really. In case the JV

Re: kMeans

2008-03-27 Thread Dawid Weiss
Hi Marko, Is it acceptable solution for Google Summer of Code? I don't think it's an acceptable project for Mahout -- Mahout goals are in large data set processing, supported by Map-Reduce. Clustering search results is usually in-memory, on-line clustering with few information sources (titl

Re: [?? Probable Spam] Reply: kMeans

2008-03-27 Thread Dawid Weiss
Carrot2 is for clustering web search results -- it's not exactly the same thing. D. shunkai.fu wrote: There is one project called Carrot2 focusing on this topic already. -Original Message- From: Marko Novakovic [mailto:[EMAIL PROTECTED] Sent: March 27, 2008 7:03 To: mahout-dev@lucene.apache.org Subject:

Re: Demos/Tutorials

2008-03-18 Thread Dawid Weiss
This is absolutely necessary, if not for just showing off with the project, then certainly for verification of correctness of algorithms inside it. I will certainly hop in to such a subtask to the extent of my current available time resources (not much, sadly). D. Grant Ingersoll wrote: No

Re: Hama contribution, [was Re: [jira] Commented: (MAHOUT-16) Hama contrib package for the mahout]

2008-03-17 Thread Dawid Weiss
I would still wait a bit until the code we have is actually put to use. Having some real-time applications and demos is the best way to convince people the project has a future. D. Grant Ingersoll wrote: What I would do is ask on Hadoop if there is interest in making it a subproject. In do

Re: Mahout logo proposal (second refactoring)

2008-03-17 Thread Dawid Weiss
Good points, Andrzej. * it should be meaningful and acceptable for the target audience, or abstract enough that it doesn't matter. I'm not sure how the IBM-type suits would react to the beach-ball if it were to appear in the documentation of their product ;) Oh, the collar worker developers

Re: Mahout logo proposal (second refactoring)

2008-03-17 Thread Dawid Weiss
ent: Friday, February 29, 2008 12:45 AM To: mahout-dev@lucene.apache.org Subject: Re: Mahout logo proposal (second refactoring) hi, i am new to mahout-dev. shouldn't the mahout, the guy, be a bit more prominent? when this logo is in thumbnail size he would be hardly visible. On Fri, Feb

Re: Welcome Jeff Eastman as Mahout Committer

2008-03-17 Thread Dawid Weiss
Welcome Jeff, you're so fast I can't catch up with reviewing your code, not to mention writing at that pace ;) D. Grant Ingersoll wrote: It is my pleasure to welcome Jeff Eastman as a committer to the Mahout project on behalf of the Lucene PMC. Jeff has already made a number of nice cont

Re: [jira] Updated: (MAHOUT-16) Hama contrib package for the mahout

2008-03-14 Thread Dawid Weiss
I don't think you can make a branch of Apache's SVN available to non-committers (even if fine grained access level is possible to set up, you still need to log on to the branch). Working with JIRA patches is a pain, I agree, but it seems like the only sensible way to go. Another is to set up

[jira] Updated: (MAHOUT-13) Investigate Mahout jar loading

2008-03-14 Thread Dawid Weiss (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-13?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dawid Weiss updated MAHOUT-13: -- Fix Version/s: 0.1 > Investigate Mahout jar load

[jira] Resolved: (MAHOUT-13) Investigate Mahout jar loading

2008-03-14 Thread Dawid Weiss (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-13?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dawid Weiss resolved MAHOUT-13. --- Resolution: Fixed Resolved in trunk. > Investigate Mahout jar load

Re: Class Loader Problem

2008-03-12 Thread Dawid Weiss
I have no problem with your proposal, but have not tested it. If the unit tests still run then go ahead and commit it. If this means we no longer need They do run fine. No need to provide the JAR name (this has been included in the patch). D.

Re: Class Loader Problem

2008-03-11 Thread Dawid Weiss
Jeff, did you have a chance to try it? Can we close this issue? D. Dawid Weiss wrote: Hi Jeff, Like I said -- it seems that this issue is actually quite trivial to solve by changing to the context class loader. See attached patch at MAHOUT-13. Please check if it works (I did some testing

Re: [jira] Commented: (MAHOUT-6) Need a matrix implementation

2008-03-11 Thread Dawid Weiss
Try turning on verbose gc. Benchmarking is difficult stuff, really. Much harder than folks usually consider it. Looking at GC, for example, you have to be aware not only of the GC type used (and its settings), but also of the machine the benchmarking is run on. In Jason's case, if he's r

[jira] Commented: (MAHOUT-6) Need a matrix implementation

2008-03-10 Thread Dawid Weiss (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-6?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12577080#action_12577080 ] Dawid Weiss commented on MAHOUT-6: -- If you're looking at collection implementati

[jira] Commented: (MAHOUT-6) Need a matrix implementation

2008-03-10 Thread Dawid Weiss (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-6?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12577037#action_12577037 ] Dawid Weiss commented on MAHOUT-6: -- A quickie: 1. Make many, many rounds through the

Re: Committing was Re: [jira] Commented: (MAHOUT-10) Replace fall-through exception handlers with propagated unchecked exception.

2008-03-09 Thread Dawid Weiss
It will happen again, trust me :-) I wouldn't worry about it, though. Yeah, well, this was, hmm... let's call it Sunday negligence ;) I didn't think the point patch could affect anything else, so I failed to run unit tests before the commit; it turned out canopy clustering used some very s

Re: Committing was Re: [jira] Commented: (MAHOUT-10) Replace fall-through exception handlers with propagated unchecked exception.

2008-03-09 Thread Dawid Weiss
Definitely. You have full powers :-) I know. I committed all pending stuff -- one with some problems so you'll see multiple commits for a single issue (my fault, won't happen again). change, then ask around and/or say something like "I plan on committing in 2 days or something like that."

Re: Class Loader Problem

2008-03-09 Thread Dawid Weiss
code worked fine when running locally from Eclipse, and I only saw the failures when running on a remote cluster. Evidently, Eclipse's classpath environment is different than that of a deployed map task. Jeff -Original Message- From: Dawid Weiss [mailto:[EMAIL PROTECTED] Sent: Thursday, M

[jira] Updated: (MAHOUT-13) Investigate Mahout jar loading

2008-03-09 Thread Dawid Weiss (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-13?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dawid Weiss updated MAHOUT-13: -- Attachment: mah-13.patch - A patch changing Class.forName to context class loader, - removing all
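
A sketch of what changing `Class.forName` to the context class loader can look like. The helper `ContextLoading` is hypothetical and not taken from the attached patch. The single-argument `Class.forName(name)` resolves against the loader of the calling class, while the version below asks the loader attached to the current thread, which in a deployed task may see jars the calling class's loader does not.

```java
// Hypothetical helper; the real patch may be structured differently.
final class ContextLoading {
    static Class<?> load(String className) {
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        if (loader == null) {
            // No context loader set: fall back to the loader of this class.
            loader = ContextLoading.class.getClassLoader();
        }
        try {
            return Class.forName(className, true, loader);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException("Class not found: " + className, e);
        }
    }
}
```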

[jira] Resolved: (MAHOUT-12) Point formatting and parsing improved (StringBuilder, no need for trailing comma).

2008-03-09 Thread Dawid Weiss (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-12?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dawid Weiss resolved MAHOUT-12. --- Resolution: Fixed Implemented in trunk. > Point formatting and parsing improved (StringBuilder,

[jira] Resolved: (MAHOUT-10) Replace fall-through exception handlers with propagated unchecked exception.

2008-03-09 Thread Dawid Weiss (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-10?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dawid Weiss resolved MAHOUT-10. --- Resolution: Fixed Applied to trunk. > Replace fall-through exception handlers with propaga

[jira] Commented: (MAHOUT-10) Replace fall-through exception handlers with propagated unchecked exception.

2008-03-09 Thread Dawid Weiss (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-10?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12576749#action_12576749 ] Dawid Weiss commented on MAHOUT-10: --- Hey guys. What's our committing policy? Can

Re: [jira] Assigned: (MAHOUT-12) Point formatting and parsing improved (StringBuilder, no need for trailing comma).

2008-03-08 Thread Dawid Weiss
I concur that we ought to have additional Writable representations to make intra-Hadoop transfers more streamlined. This is certainly *not* too late to pursue. I would encourage you to propose a record for Point (which is in trunk) and these could be added to Vector and Matrix later (once we get

Re: Class Loader Problem

2008-03-07 Thread Dawid Weiss
BTW, the original code worked fine when running locally from Eclipse, and I only saw the failures when running on a remote cluster. Evidently, Eclipse's classpath environment is different than that of a deployed map task. Jeff -Original Message----- From: Dawid Weiss [mailto:[EMAIL P

Re: Google Summer of Code

2008-03-07 Thread Dawid Weiss
What about encouraging your students to submit their work at Mahout? Just a naive thought of mine. Those students I'm in charge of have their area of interest defined already -- too late to change it. Good idea for the future, I have been thinking about it, actually. D.

Re: [jira] Assigned: (MAHOUT-11) Static fields used throughout clustering code (Canopy, K-Means).

2008-03-07 Thread Dawid Weiss
browse/MAHOUT-11?page=com.atlassian.jira. plugin.system.issuetabpanels:all-tabpanel ] Dawid Weiss reassigned MAHOUT-11: ----- Assignee: Dawid Weiss Static fields used throughout clustering code (Canopy, K-Means). -

Re: [jira] Assigned: (MAHOUT-12) Point formatting and parsing improved (StringBuilder, no need for trailing comma).

2008-03-07 Thread Dawid Weiss
move forward. Jeff -----Original Message- From: Dawid Weiss (JIRA) [mailto:[EMAIL PROTECTED] Sent: Thursday, March 06, 2008 5:01 AM To: mahout-dev@lucene.apache.org Subject: [jira] Assigned: (MAHOUT-12) Point formatting and parsing improved (StringBuilder, no need for trailing comma). [ ht

[jira] Assigned: (MAHOUT-13) Investigate Mahout jar loading

2008-03-06 Thread Dawid Weiss (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-13?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dawid Weiss reassigned MAHOUT-13: - Assignee: Dawid Weiss > Investigate Mahout jar load

[jira] Assigned: (MAHOUT-10) Replace fall-through exception handlers with propagated unchecked exception.

2008-03-06 Thread Dawid Weiss (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-10?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dawid Weiss reassigned MAHOUT-10: - Assignee: Dawid Weiss > Replace fall-through exception handlers with propagated unchec

[jira] Assigned: (MAHOUT-12) Point formatting and parsing improved (StringBuilder, no need for trailing comma).

2008-03-06 Thread Dawid Weiss (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-12?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dawid Weiss reassigned MAHOUT-12: - Assignee: Dawid Weiss > Point formatting and parsing improved (StringBuilder, no need

[jira] Assigned: (MAHOUT-11) Static fields used throughout clustering code (Canopy, K-Means).

2008-03-06 Thread Dawid Weiss (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-11?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dawid Weiss reassigned MAHOUT-11: - Assignee: Dawid Weiss > Static fields used throughout clustering code (Canopy, K-Me

Re: Class Loader Problem

2008-03-06 Thread Dawid Weiss
. On Mar 6, 2008, at 5:30 AM, Dawid Weiss wrote: As a side note -- Hadoop uses the simplest trick possible to figure out the JAR location of the originating class -- it attempts to load a resource named after the class' bytecode... private static String findContainingJar(Class my_
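
The trick mentioned here, looking up a resource named after the class's bytecode, can be sketched as below. This is not Hadoop's `findContainingJar`, whose body is cut off in the snippet; `ClassLocation` and its methods are hypothetical names for illustration.

```java
import java.net.URL;

// Hypothetical helper showing the resource-name trick.
final class ClassLocation {
    /** Turns a class into the resource path of its .class file. */
    static String resourceNameOf(Class<?> type) {
        return type.getName().replace('.', '/') + ".class";
    }

    /** Returns the URL the class file is served from, or null if unknown. */
    static URL locate(Class<?> type) {
        String resource = resourceNameOf(type);
        ClassLoader loader = type.getClassLoader();
        // A null loader means the bootstrap loader; ask the system loader instead.
        return loader != null ? loader.getResource(resource)
                              : ClassLoader.getSystemResource(resource);
    }
}
```

When the class comes from an archive, the returned URL typically uses the `jar:` scheme and contains the archive path, which is the piece a caller would extract.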

Re: Class Loader Problem

2008-03-06 Thread Dawid Weiss
hey seem to work according to my intuition I presented earlier (thread context class loader has pointers to the invoked JAR plus all jars under lib/), there should be no need to specify jars explicitly. I even tend to think this is a headache for the future... Dawid Dawid Weiss wrote: I ch

Re: Class Loader Problem

2008-03-06 Thread Dawid Weiss
I changed the main's to pass in the location of the jar, since the ANT task puts the jar in basedir/dist. I made a comment about it on Mahout-3. The Canopy driver should do the right thing? I also did the same thing w/ the k-means. I honestly don't think the JAR file must be specified

[jira] Updated: (MAHOUT-12) Point formatting and parsing improved (StringBuilder, no need for trailing comma).

2008-03-06 Thread Dawid Weiss (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-12?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dawid Weiss updated MAHOUT-12: -- Attachment: mah-12.patch Patch implementing the change. > Point formatting and parsing impro

[jira] Created: (MAHOUT-12) Point formatting and parsing improved (StringBuilder, no need for trailing comma).

2008-03-06 Thread Dawid Weiss (JIRA)
Issue Type: Improvement Components: Clustering Affects Versions: 0.1 Reporter: Dawid Weiss Priority: Trivial Added test case to point class, improved parsing (no need to recompile the pattern all over again) and concatenation of points

[jira] Created: (MAHOUT-11) Static fields used throughout clustering code (Canopy, K-Means).

2008-03-06 Thread Dawid Weiss (JIRA)
Components: Clustering Affects Versions: 0.1 Reporter: Dawid Weiss I file this as a bug, even though I'm not 100% sure it is one. In the current code the information is exchanged via static fields (for example, distance measure and thresholds for Canopies are static fields).

[jira] Updated: (MAHOUT-10) Replace fall-through exception handlers with propagated unchecked exception.

2008-03-06 Thread Dawid Weiss (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-10?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dawid Weiss updated MAHOUT-10: -- Attachment: mah-10.patch Patch replacing printStackTrace with rethrowing of a RuntimeException
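
The change described here, replacing `printStackTrace` with a rethrown `RuntimeException`, follows a common pattern sketched below. The helper `Rethrow.callOrFail` is hypothetical and is not code from the patch; it only shows the checked exception being propagated with its cause instead of being printed and ignored.

```java
import java.util.concurrent.Callable;

// Hypothetical helper illustrating the propagate-as-unchecked pattern.
final class Rethrow {
    static String callOrFail(Callable<String> action) {
        try {
            return action.call();
        } catch (Exception e) {
            // Before: e.printStackTrace(); and execution carried on.
            // After: fail fast and keep the original exception as the cause.
            throw new RuntimeException(e);
        }
    }
}
```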

[jira] Created: (MAHOUT-10) Replace fall-through exception handlers with propagated unchecked exception.

2008-03-06 Thread Dawid Weiss (JIRA)
Issue Type: Improvement Components: Clustering Affects Versions: 0.1 Reporter: Dawid Weiss Priority: Minor I am doing a belated code review. There are certain issues that I would like to change, for example fall-through exception handlers like this one: try
