Yeah, you are right about the Google FS.

I have also heard on this list that some people are planning to add
append functionality to Hadoop, but it's just not there yet.  I am not
sure why.

Perhaps my "inefficient" comment was premature.  The term logging
stuck in my head and I have preconceived ideas of what you are doing.
I am thinking that continuously writing extremely small chucks to a
distributed file system would cause a lot of latency that would
probably slow your system down considerably. But again, I am not sure
of your situation.
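
If it helps, the workaround I have seen people use is to buffer records
in a local file and "roll" it at some size threshold, so that only whole
files ever touch the DFS.  Rough sketch below (untested; the spool
directory and the 64 MB threshold are just made-up examples):

    import java.io.BufferedWriter;
    import java.io.File;
    import java.io.FileWriter;
    import java.io.IOException;

    // Sketch: append records to a local file, and rename ("roll") it once
    // it passes a size threshold.  Only rolled files get bulk-copied into
    // HDFS (see the copyFromLocal step below), so the DFS never sees the
    // stream of tiny writes.
    public class LocalLogBuffer {
        private static final long ROLL_BYTES = 64L * 1024 * 1024; // made up
        private final File dir = new File("/var/spool/logs");     // made up
        private File current = new File(dir, "current.log");

        public synchronized void write(String record) throws IOException {
            BufferedWriter out = new BufferedWriter(new FileWriter(current, true));
            out.write(record);
            out.newLine();
            out.close();                       // local appends are cheap
            if (current.length() >= ROLL_BYTES) {
                roll();
            }
        }

        private void roll() {
            // unique name; a shipper process picks up rolled files later
            File rolled = new File(dir, "rolled-" + System.currentTimeMillis() + ".log");
            current.renameTo(rolled);
            current = new File(dir, "current.log");
        }
    }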

As for the way Hadoop is now, you would have to "copyFromLocal"
(bin/hadoop dfs -copyFromLocal <localsrc> <dst>), which probably sucks
in your situation.  I can understand your pain in this area.
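
And if you'd rather not shell out to the command line, the FileSystem
API will do the same copy programmatically.  Something like this
(untested; both paths are placeholders):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    // Sketch: programmatic equivalent of "copyFromLocal" -- bulk-copy a
    // rolled local log file into HDFS in one shot.
    public class ShipToHdfs {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();  // reads hadoop-site.xml
            FileSystem fs = FileSystem.get(conf);
            fs.copyFromLocalFile(new Path("/var/spool/logs/rolled-1.log"), // placeholder
                                 new Path("/logs/"));                      // placeholder
            fs.close();
        }
    }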

Anyone else have any ideas?


On 6/13/07, Phantom <[EMAIL PROTECTED]> wrote:
Hmm I was under the impression that HDFS is, like GFS, optimized for
appends, although GFS also supports random writes. So let's say I want
to process logs using Hadoop. The only way I can do it is to move the
entire log into Hadoop from some place else and then perhaps run
Map/Reduce jobs against it. That seems to kind of defeat the purpose.
Am I missing something?

Thanks
A

On 6/13/07, Briggs <[EMAIL PROTECTED]> wrote:
>
> No appending, AFAIK.  Hadoop is not intended for writing in this way.
> It's more of a write-few, read-many system.  Such granular writes
> would be inefficient.
>
> On 6/13/07, Phantom <[EMAIL PROTECTED]> wrote:
> > Hi
> >
> > Can this only be done in read-only and write-only mode? How do I do
> > appends? Because if I am using this for writing logs then I would
> > want to append to the file rather than overwrite, which is what the
> > write-only mode is doing.
> >
> > Thanks
> > A
> >
>
>
> --
> "Conscious decisions by conscious minds are what make reality real"
>



--
"Conscious decisions by conscious minds are what make reality real"
