Yeah, you are right about the Google FS. I have also heard on this list that some people are planning to add append functionality to Hadoop, but it's just not there yet. I am not sure why.
Perhaps my "inefficient" comment was premature. The term "logging" stuck in my head, and I had preconceived ideas about what you are doing. My thinking was that continuously writing extremely small chunks to a distributed file system would add enough latency to slow your system down considerably. But again, I am not sure of your situation. As for the way Hadoop is now, you would have to "copyFromLocal", which probably sucks in your situation. I can understand your pain in this area. Anyone else have any ideas?
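If it helps, the programmatic equivalent of the shell's copyFromLocal is only a couple of lines against the FileSystem API. An untested sketch (the class name and paths here are made up; FileSystem.copyFromLocalFile is the real call):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class LogUploader {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration(); // picks up your site config
            FileSystem fs = FileSystem.get(conf);     // the configured DFS

            // Copy a closed, rotated log file into the DFS in one shot.
            Path local = new Path("/var/log/myapp/app.log.2007-06-13");
            Path remote = new Path("/logs/myapp/app.log.2007-06-13");
            fs.copyFromLocalFile(local, remote);
        }
    }

So the usual workaround is to batch: let logs roll locally, then push each finished file in whole before running your Map/Reduce jobs over the directory.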
On 6/13/07, Phantom <[EMAIL PROTECTED]> wrote:

Hmm, I was under the impression that HDFS is, like GFS, optimized for appends, although GFS also supports random writes. So let's say I want to process logs using Hadoop. The only way I can do it is to move the entire log into Hadoop from some place else and then perhaps run Map/Reduce jobs against it. That seems to kind of defeat the purpose. Am I missing something?

Thanks
A

On 6/13/07, Briggs <[EMAIL PROTECTED]> wrote:
>
> No appending, AFAIK. Hadoop is not intended for writing in this way.
> It's more of a write-few, read-many system. Such granular writes would
> be inefficient.
>
> On 6/13/07, Phantom <[EMAIL PROTECTED]> wrote:
> > Hi
> >
> > Can this only be done for read-only and write-only mode? How do I do
> > appends? Because if I am using this for writing logs then I would want
> > to append to the file rather than overwrite, which is what the
> > write-only mode is doing.
> >
> > Thanks
> > A
>
> --
> "Conscious decisions by conscious minds are what make reality real"
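P.S. To make the "write only" part concrete: as far as I know, the FileSystem API gives you create() for writing a new file and open() for reading, and that's it; there is no append() call. Roughly like this (untested sketch, file name invented):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FSDataInputStream;
    import org.apache.hadoop.fs.FSDataOutputStream;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class WriteOnceDemo {
        public static void main(String[] args) throws Exception {
            FileSystem fs = FileSystem.get(new Configuration());
            Path p = new Path("/logs/demo.log"); // invented path

            // Write once: create() starts a brand-new file (by default it
            // overwrites anything already at that path); closing seals it.
            FSDataOutputStream out = fs.create(p);
            out.write("first and only write\n".getBytes());
            out.close();

            // Read many: open() works as often as you like.
            FSDataInputStream in = fs.open(p);
            byte[] buf = new byte[64];
            int n = in.read(buf);
            System.out.write(buf, 0, n);
            in.close();

            // There is no fs.append(p) -- calling create() on the same
            // path again is an overwrite, not an append.
        }
    }

Which is exactly the overwrite-instead-of-append behavior you ran into.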
--
"Conscious decisions by conscious minds are what make reality real"
