On Sat, 2005-11-12 at 20:56 -0600, David Masover wrote:
> Ming Zhang wrote:
> > On Sat, 2005-11-12 at 15:46 -0600, David Masover wrote:
> >
> >>Ming Zhang wrote:
> >>
> >>>On Fri, 2005-11-11 at 16:56 -0800, Peter van Hardenberg wrote:
> >>>
> >>>>On November 11, 2005 05:59 am, John Gilmore wrote:
> >>>>
> >>>>>Does anybody remember GoBack? It was a versioning
> >>>>>system for windows 95/98 that was incredibly flexible and useful. Tracked
> >>>>>all changes to the whole disk. Old versions of a file? no problem. grab an
> >>>>>old version of a directory for referance temporarily? easy. Got a virus?
> >>>>>revert the whole HD, and then grab the newer copies of your documents and
> >>>>>saved games as needed.
> >>>>
> >>>>My thoughts on this:
> >>>>
> >>>>The versioning would be an audit plugin. When the file is modified, tag the
> >>>>current version, copy it into a sub-directory (oh, I don't know, say
> >>>>file/.revisions/<number/date>), and disable write access to it. You might not
> >>>>even need extended filesystem attributes for this, but they would be handy
> >>>>for tagging particular versions.
> >>>
> >>>if a file is opened, modified 2 times, then closed. u will only generate
> >>>1 version right? so "When the file is modified" is inaccurate.
> >>
> >>How about "When the transaction was completed?" Why does it matter?
> >
> > then how u define a transaction? i mean we first need to choose a good
> > event/period to define what is a good meaningful version.
> >
> >>>>Copy-on-write would make this action extremely cheap, only adding a couple of
> >>>>extra writes to make it work.
> >>>
> >>>add 1 line at the beginning of a 100MB text file will make this uncheap.
> >>
> >>Who has to work with 100 meg text files? And why has this person not
> >>broken them down into 100 kilobyte text files? Storage efficiency isn't
> >>really an issue there...
> >
> > yes, 100MB/s text file is an extreme example, but a common case can be u
> > delete 1 frame in a streaming media file.
>
> What do you mean by "streaming"? (To me, "streaming media" usually
> means "over the Internet", which makes no sense here.)

what i mean is that the frames are independent of each other, so when u
delete one frame, the data of the other frames stays unchanged, like
changing ABCDEFG to ACDEFG.

>
> > basically, a cow is not good
> > for a data shift situation. u have >99% data unchanged, just their
> > offset in file is changed. this lead to all blocks changed, then COW
> > will need to copy a lot.
>
> When do you have a data shift situation where this is significant enough
> to impact COW, but not significant enough to affect normal performance?
>
> As far as I know, *nix has no way to append to the beginning of a file,
> so if you're editing a large video file, say several gigs of DVD, you
> have to write out several gigs worth of data all over again because you
> want it shifted.

yes, this is also what i know. thanks for your analysis, i now agree that
COW should be ok for this case, considering the overhead.

but another issue with COW is that when u have lots of versions, any write
to the original data will lead to a lot of new writes to the COW storage.

is there any place i can find a document about how to write a plugin for
reiser? sounds interesting. :P

ming

>
> The filesystem may eventually provide more intelligent ways of messing
> with a file, and the COW system should be able to handle when a program
> appends to or chops off the beginning of a file.
>
> Until then, we can rely somewhat on programs optimizing for speed --
> rather than rewrite several gigs, it could split the file into smaller
> files (thus, only the file which was changed is copied), or make it a
> sort of mini-FS in that it fragments the logical structure of the file
> so that it writes as little as possible -- for instance, inserting a
> clip in the middle might write to the end of a "project" file, instead
> of shifting half of that file over first. You'd keep versions of the
> project file, not the stream (properly defragmented) you'd export when
> you're done.
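
The data-shift concern discussed above can be made concrete with a small sketch. It assumes a COW layer that tracks fixed-size blocks by their offset in the file; the 4096-byte block size and the helper names (sample_data, block_digests, changed_blocks) are made up for illustration and say nothing about how reiser4 stores data internally. It compares an in-place overwrite with a deletion at the front of the file.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed block size, chosen only for illustration


def sample_data(num_blocks: int, block_size: int = BLOCK_SIZE) -> bytes:
    """Build deterministic, non-repeating bytes to stand in for file content."""
    chunks = (hashlib.sha256(str(i).encode()).digest()
              for i in range(num_blocks * block_size // 32))
    return b"".join(chunks)


def block_digests(data: bytes, block_size: int = BLOCK_SIZE) -> list[str]:
    """Split data into fixed-size blocks and return one digest per block."""
    return [hashlib.sha256(data[i:i + block_size]).hexdigest()
            for i in range(0, len(data), block_size)]


def changed_blocks(old: bytes, new: bytes, block_size: int = BLOCK_SIZE) -> int:
    """Count block positions that a naive offset-based COW would have to copy."""
    old_d = block_digests(old, block_size)
    new_d = block_digests(new, block_size)
    differing = sum(1 for a, b in zip(old_d, new_d) if a != b)
    # Blocks that exist in only one of the two versions also count as changed.
    return differing + abs(len(old_d) - len(new_d))


if __name__ == "__main__":
    original = sample_data(16)

    # Overwrite a single byte in place: only the block containing it differs.
    overwritten = original[:5000] + bytes([original[5000] ^ 0xFF]) + original[5001:]
    print(changed_blocks(original, overwritten))

    # Drop 100 bytes from the front: every later byte moves to a new offset,
    # so every block position now holds different content.
    shifted = original[100:]
    print(changed_blocks(original, shifted))
```

Under these assumptions the overwrite touches one block while the front deletion touches all of them, even though almost all of the data is unchanged. A COW layer that could record "drop this range" instead of comparing blocks by offset would not pay that cost.
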
>
> For cases where developers didn't have to deal with the speed issues, we
> don't have to worry about it. In the case of audio editing, if it's
> actually messing with the sound itself, no COW in the world will catch
> that. If it's a mixing/sequencing program, that's usually stored as a
> "project", accompanied by lots of little WAV files, which don't change,
> and a tiny "project" file describing how they go together, which does
> change.
>
> And for text files and office documents, the sizes just aren't usually
> enough for us to care. My biggest OpenOffice.org document probably
> isn't a hundred kilobytes, and my disk space is measured in gigabytes.
> It'd take over ten thousand revisions to fill a gig with copies of one
> of those files. Sure, we could make an Oasis plugin for OO.o to use, so
> all the contents of the document are stored as individual files, turned
> into a zipfile on demand to match the current standard -- but that's not
> worth it in the short term, and only really helps with presentations in
> the long term.
>
> Actually, while I think it'd be nice to be able to more advanced
> splicing in a file (append or delete from the beginning or middle), I
> think it's more important to come up with a sane way for a program to
> access a file as a lot of little pieces, and to have a standard way of
> serializing them for transport (email or otherwise). Kind of like XML,
> only it could be more efficient than the old model, instead of less.
>
> Like XML in that XML allows programmers to dump internal structures to a
> human-readable file without writing parsers and serializers. Move the
> serializing logic out to the FS, let it handle the performance, version
> control, and export issues.
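
The "project file plus unchanging media files" layout described in the quoted text can be sketched roughly as follows. This is only an illustration under stated assumptions: the Segment and Project classes, their fields, and the clip name "take1" are all hypothetical and do not correspond to any real editor format or to a reiser4 plugin interface. The point is that deleting a frame only adds a small new version of the project description, while the stored clip data is never rewritten.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Segment:
    clip: str    # name of an immutable clip (hypothetical)
    start: int   # index of the first frame taken from that clip
    length: int  # number of frames taken


@dataclass
class Project:
    clips: dict[str, tuple[str, ...]] = field(default_factory=dict)
    versions: list[tuple[Segment, ...]] = field(default_factory=list)

    def add_clip(self, name: str, frames) -> None:
        """Store clip data once; it is never modified afterwards."""
        self.clips[name] = tuple(frames)

    def commit(self, segments) -> None:
        """Record a new version of the (small) project description."""
        self.versions.append(tuple(segments))

    def delete_frame(self, index: int) -> None:
        """Delete one frame by splitting the segment that contains it."""
        result, pos, found = [], 0, False
        for seg in self.versions[-1]:
            if not found and pos <= index < pos + seg.length:
                offset = index - pos
                if offset > 0:
                    result.append(Segment(seg.clip, seg.start, offset))
                remaining = seg.length - offset - 1
                if remaining > 0:
                    result.append(
                        Segment(seg.clip, seg.start + offset + 1, remaining))
                found = True
            else:
                result.append(seg)
            pos += seg.length
        if not found:
            raise IndexError("frame index out of range")
        self.commit(result)

    def render(self, version: int = -1) -> list[str]:
        """Export the flat stream for one version of the project."""
        out: list[str] = []
        for seg in self.versions[version]:
            out.extend(self.clips[seg.clip][seg.start:seg.start + seg.length])
        return out


if __name__ == "__main__":
    project = Project()
    project.add_clip("take1", "ABCDEFG")
    project.commit([Segment("take1", 0, 7)])
    project.delete_frame(1)        # drop frame "B"
    print(project.render())        # newest version
    print(project.render(0))       # first version is still available
```

In this sketch each version costs a handful of small Segment records, so keeping many revisions of the project is cheap, and only the exported stream ever needs a full rewrite.
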
