Hi, can't we append in hadoop-0.20.203.0?
Regards,
Jagaran

________________________________
From: Dmitriy Ryaboy <dvrya...@gmail.com>
To: user@pig.apache.org
Sent: Wed, 15 June, 2011 9:57:20 AM
Subject: Re: Running multiple Pig jobs simultaneously on same data

Yong,
You can't. Hence, immutable. It's not a database; it's a write-once file system.
Approaches to handling updates include:
1) rewrite everything
2) write a separate set of "deltas" into other files and join them in at read time
3) do 2, and occasionally run a "compaction" which does a complete rewrite based on the existing deltas
4) write to something like HBase, which handles all of this under the covers

D

On June 15, 2011, 勇胡 <yongyong...@gmail.com> wrote:
> Jon,
>
> If I want to modify data (insert or delete) in HDFS, how can I do it?
> From the description, I cannot directly modify the data itself (update
> the data), and I cannot append new data to the file. How does HDFS
> implement data modification? I am a little confused.
>
> Yong
>
> On June 15, 2011 at 3:36 PM, Jonathan Coveney <jcove...@gmail.com> wrote:
>
>> Yong,
>>
>> Currently, HDFS does not support appending to a file. So once a file is
>> created, it literally cannot be changed (although it can be deleted, I
>> suppose). This lets you avoid issues where one client does a SELECT * on
>> the entire database while the DBA can't update a row, or other things
>> like that. There are some append patches in the works, but I am not sure
>> how they handle the concurrency implications.
>>
>> Make sense?
>> Jon
>>
>> On June 15, 2011, 勇胡 <yongyong...@gmail.com> wrote:
>>
>>> I read the link, and I felt that HDFS is designed for read-frequent
>>> operations, not write-frequent ones ("A file once created, written, and
>>> closed need not be changed.").
>>>
>>> Regarding your description ("Immutable means that after creation it
>>> cannot be modified."), if I understand correctly, you mean that HDFS
>>> cannot implement "update" semantics the same way as in the database
>>> area?
>>> The write operation cannot be applied directly to a specific tuple or
>>> record? The result of a write operation is just appended at the end of
>>> the file?
>>>
>>> Regards
>>>
>>> Yong
>>>
>>> On June 15, 2011, Nathan Bijnens <nat...@nathan.gs> wrote:
>>>
>>>> Immutable means that after creation it cannot be modified.
>>>>
>>>> "HDFS applications need a write-once-read-many access model for files.
>>>> A file once created, written, and closed need not be changed. This
>>>> assumption simplifies data coherency issues and enables high
>>>> throughput data access. A MapReduce application or a web crawler
>>>> application fits perfectly with this model. There is a plan to support
>>>> appending-writes to files in the future."
>>>>
>>>> http://hadoop.apache.org/hdfs/docs/current/hdfs_design.html#Simple+Coherency+Model
>>>>
>>>> Best regards,
>>>> Nathan
>>>> ---
>>>> nat...@nathan.gs : http://nathan.gs : http://twitter.com/nathan_gs
>>>>
>>>> On Wed, Jun 15, 2011 at 12:58 PM, 勇胡 <yongyong...@gmail.com> wrote:
>>>>
>>>>> How should I understand "immutable"? I mean, does HDFS implement a
>>>>> lock mechanism to guarantee immutable data access when concurrent
>>>>> tasks process the same set of data, or does it use some other
>>>>> strategy to implement immutability?
>>>>>
>>>>> Thanks
>>>>>
>>>>> Yong
>>>>>
>>>>> On June 14, 2011, Bill Graham <billgra...@gmail.com> wrote:
>>>>>
>>>>>> Yes, this is possible. Data in HDFS is immutable and MR tasks are
>>>>>> spawned in their own VM, so multiple concurrent jobs acting on the
>>>>>> same input data are fine.
>>>>>>
>>>>>> On Tue, Jun 14, 2011 at 11:18 AM, Pradipta Kumar Dutta <
>>>>>> pradipta.du...@me.com> wrote:
>>>>>>
>>>>>>> Hi All,
>>>>>>>
>>>>>>> We have a requirement where we have to process the same set of data
>>>>>>> (in a Hadoop cluster) by running multiple Pig jobs simultaneously.
>>>>>>>
>>>>>>> Any idea whether this is possible in Pig?
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Pradipta
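[Editor's note] Dmitriy's approaches 2 and 3 (write "deltas" into separate files, merge them with the base data at read time, and occasionally compact) can be sketched in miniature. This is only an illustration of the pattern, not Hadoop or Pig code: plain local CSV files stand in for immutable HDFS files, and the record layout, tombstone convention, and function names are all invented for the example.

```python
# Sketch of the delta + read-time-merge + compaction pattern.
# Base and delta files are each written once and never modified,
# mirroring HDFS's write-once model.
import csv
import os

def write_records(path, records):
    """Write (key, value) pairs once; the file is never modified after."""
    with open(path, "w", newline="") as f:
        csv.writer(f).writerows(records)

def read_records(path):
    with open(path) as f:
        return [(k, v) for k, v in csv.reader(f)]

def merged_view(base_path, delta_paths):
    """Read-time join: later deltas win; an empty value marks a delete."""
    view = dict(read_records(base_path))
    for dp in delta_paths:
        for k, v in read_records(dp):
            if v == "":
                view.pop(k, None)  # tombstone: the key was deleted
            else:
                view[k] = v        # update or insert
    return view

def compact(base_path, delta_paths):
    """Approach 3: completely rewrite the base from the merged view,
    then drop the deltas that have been folded in."""
    view = merged_view(base_path, delta_paths)
    write_records(base_path, sorted(view.items()))
    for dp in delta_paths:
        os.remove(dp)

write_records("base.csv", [("1", "alice"), ("2", "bob")])
write_records("delta1.csv", [("2", "bobby"), ("3", "carol")])  # update + insert
write_records("delta2.csv", [("1", "")])                        # delete key 1
print(merged_view("base.csv", ["delta1.csv", "delta2.csv"]))
# → {'2': 'bobby', '3': 'carol'}
compact("base.csv", ["delta1.csv", "delta2.csv"])
```

In Pig terms, the read-time merge would typically be a join or cogroup of the base relation with the delta relation(s); HBase (approach 4) performs essentially this merge-plus-compaction cycle internally.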