Hi,
I am a graduate student at IIIT Hyderabad, working in the field of
Information Extraction at the IE Lab at IIIT (http://search.iiit.ac.in/).
I have access to a Hadoop cluster of 10 nodes. We are using this cluster
exclusively for running Nutch jobs. Most of my current work focuses on
making minor changes to Hadoop code to suit our requirements. I can use
this cluster to run tests. However, this cluster is not accessible
from outside the campus.
I am also interested in contributing code to this project. Many students
in our lab need to use machine learning tools on large sets of crawled
data (sometimes > 5 TB), so this project will help us a lot.
Please let me know if I can be of any help.
Regards,
Jaideep

> Hi Jeff,
>
> does this mean you are offering a 12+ node cluster for free experiments for
> the Mahout project?
> If yes, then what is the best way to contact you, and is there any formal
> way to request such a service? Is it free "as in beer or as in speech"? ;-)
>
> I am trying to figure out the status of such service because I think I
> would
> need one in the future for some Mahout experiments.
>
> Lukas
>
> On Jan 27, 2008 7:53 PM, Jeff Eastman <[EMAIL PROTECTED]> wrote:
>
>> I don't think I can contribute much to the algorithms themselves, but
>> I've got a 12+ node Hadoop cluster and I'd be keen on helping to run
>> them on it.
>>
>> Jeff Eastman
>>
>> -----Original Message-----
>> From: Grant Ingersoll [mailto:[EMAIL PROTECTED]
>> Sent: Sunday, January 27, 2008 10:39 AM
>> To: [email protected]
>> Subject: Machine Resources [was Re: Confluence Wiki]
>>
>>
>>
>> On Jan 25, 2008, at 11:43 PM, Mason Tang wrote:
>> >
>> > Also, is there any chance we'll be able to get a small (and I mean
>> > small) cluster to run some tests on?  Local Hadoop testing only gets
>> > you so far...
>>
>> Yeah, this type of thing is perennially a problem.  I think we will
>> have to beg/borrow/steal (just kidding on the steal).  I think the key
>> will be to get local stuff running and then start looking around for
>> resources.  Amazon EC2 is an obvious place, but short of someone
>> donating time on it, I am not sure how we would come about it.
>>
>> I don't know enough about Apache's infrastructure to know whether
>> there is enough to cobble together.  Committers can get access to
>> Lucene's zones (virtual server) machine.  Presumably it is a problem
>> that Nutch faces as well.  Hadoop, luckily, is fairly well
>> supported by Yahoo! and other companies with machine access.  My hope
>> is if we can show some promise with code that runs well on single or
>> small clusters that maybe we can garner some interest from bigger
>> supporters.  And, of course, most machines are multi-core these days
>> and Hadoop can leverage that, as I understand it.
>>
>> Perhaps, if we can organize it and make sure it is secure, we can
>> figure out a way for the various people here to pool our resources.
>>
>> Just thinking out loud...
>>
>> -Grant
>>
>>
>>
>
>
> --
> http://blog.lukas-vlcek.com/


