Hi Urun and Josh,

I'd also be interested in helping out in whatever way I can. 

One question: I noticed that MAHOUT-334 was not ultimately adopted. Do we
know the reason for this?

Would it be best to finish out the patch in MAHOUT-232, or instead add the
functionality to the existing SGD modules as Ted suggested?

On Nov 15, 2011, at 9:46 AM, Josh Patterson <[email protected]> wrote:

> Urun,
> I've been looking at MAHOUT-232 and reading Nello Cristianini's book
> on SVMs. It sounds like you've done considerably more work than I have
> in this arena. I'd be interested in collaborating with you on finishing
> out this patch, if you are interested in that type of arrangement (there
> is plenty of work to do, and splitting it might be an interesting
> path), as it would help in terms of "bandwidth" for both of us.
> 
> I can also help you get used to building hadoop, mahout, tools, etc, if 
> needed.
> 
> JP
> 
> On Tue, Nov 15, 2011 at 2:29 PM, urun dogan <[email protected]> wrote:
>> Dear Josh and Ted;
>> 
>> Both ideas are very attractive. Honestly, I want to do both of them. I am
>> completely aware that this is quite some work to do. As I mentioned before,
>> I am a postdoc now and I am trying to develop new techniques using ASGD.
>> During my PhD I developed an efficient solver for multiclass SVMs which
>> uses SMO-based techniques. To compare my solver with others, I implemented
>> Pegasos for a single-core machine in C++. I have a theoretical background
>> in both methods. Further, I believe I have enough time for coding these
>> kinds of techniques. I would appreciate your supervision. I think that
>> implementing and optimizing an algorithm for cloud computing is very
>> different from implementing it for a workstation / desktop PC. As I said,
>> I am willing to contribute on these issues because these projects fit my
>> experience. However, if you think that this is too much work for one
>> person, your comments are welcome. Then, if you tell me the priority of
>> these two features, I will implement the more important one first.
>> Further, if nobody implements the second one before I finish the first,
>> I will implement the second one as well.
>> 
>> Best regards,
>> Ürün
>> 
>> 
>> 
>> 
>> On Tue, Nov 15, 2011 at 6:34 PM, Ted Dunning <[email protected]> wrote:
>> 
>>> ASGD is also an opportunity lying on the table.
>>> 
>>> http://leon.bottou.org/projects/sgd
>>> 
>>> It would be lovely to have the current SGD system upgraded to use ASGD and
>>> allow multiple loss functions to allow SVM training as well as the current
>>> logistic regression.  I would be happy to supervise, but can't do the code
>>> right now.
>>> 
>>> On Tue, Nov 15, 2011 at 9:31 AM, Josh Patterson <[email protected]> wrote:
>>> 
>>>> Urun,
>>>> Sounds like you have quite a bit of SVM experience. There is always:
>>>> 
>>>> https://issues.apache.org/jira/browse/MAHOUT-232
>>>> 
>>>> to take a look at, which involves getting SVMs going in Mahout. I've
>>>> looked at it a bit while working on some smaller patches; I'd be
>>>> interested in discussing it with you given your experience, if you are
>>>> interested.
>>>> 
>>>> I can help you get a development env going and send some tips your
>>>> way if you have any questions about getting started with developing for
>>>> Mahout.
>>>> 
>>>> Josh
>>>> 
>>>> On Mon, Nov 14, 2011 at 6:04 PM, urun dogan <[email protected]> wrote:
>>>>> Hi All;
>>>>> 
>>>>> I want to congratulate all of the contributors to the project. I found
>>>>> the idea of this project very nice, and I want to contribute to it.
>>>>> 
>>>>> I am a postdoctoral researcher involved in developing machine learning
>>>>> algorithms. During my PhD I developed several multiclass SVM
>>>>> techniques and solvers. Now I am involved in a European Union project
>>>>> which deals with large-scale machine learning problems. I have 5-6 years
>>>>> of C++ development experience and I like developing and implementing new
>>>>> machine learning techniques (yes, I know that Mahout uses Java :) , I
>>>>> will try my best).
>>>>> 
>>>>> My main areas of expertise are classification, regression, and transfer
>>>>> learning. I have seen several open topics on http://mahout.apache.org/
>>>>> and these are:
>>>>> 
>>>>> 1) Locally Weighted Linear Regression
>>>>> 
>>>>> 2) Gaussian Discriminative Analysis
>>>>> 
>>>>> 3) Independent Component Analysis
>>>>> 
>>>>> 4) Principal Components Analysis
>>>>> 
>>>>> 5) Classification with Perceptron or Winnow
>>>>> 
>>>>> 6) Neural Network
>>>>> 
>>>>> I am aware that there are also some open issues in Jira. I can work on
>>>>> anything. I think that before starting any kind of coding I need to get
>>>>> the comments of the experts in this project. What do you recommend that
>>>>> I start with?
>>>>> 
>>>>> Cheers
>>>>> 
>>>>> Ueruen
>>>>> 
>>>> 
>>>> 
>>>> 
>>>> --
>>>> Twitter: @jpatanooga
>>>> Solution Architect @ Cloudera
>>>> hadoop: http://www.cloudera.com
>>>> 
>>> 
>> 
> 
> 
> 
> -- 
> Twitter: @jpatanooga
> Solution Architect @ Cloudera
> hadoop: http://www.cloudera.com
