Ming Zhang wrote:
>On Sat, 2005-11-12 at 20:56 -0600, David Masover wrote:
>
>>Ming Zhang wrote:
>>
>>>On Sat, 2005-11-12 at 15:46 -0600, David Masover wrote:
>>>
>>>>Ming Zhang wrote:
>>>>
>>>>>On Fri, 2005-11-11 at 16:56 -0800, Peter van Hardenberg wrote:
>>>>>
>>>>>>On November 11, 2005 05:59 am, John Gilmore wrote:
>>>>>>
>>>>>>>Does anybody remember GoBack? It was a versioning
>>>>>>>system for windows 95/98 that was incredibly flexible and useful. Tracked
>>>>>>>all changes to the whole disk. Old versions of a file? no problem. grab an
>>>>>>>old version of a directory for referance temporarily? easy. Got a virus?
>>>>>>>revert the whole HD, and then grab the newer copies of your documents and
>>>>>>>saved games as needed.
>>>>>>
>>>>>>My thoughts on this:
>>>>>>
>>>>>>The versioning would be an audit plugin. When the file is modified, tag the
>>>>>>current version, copy it into a sub-directory (oh, I don't know, say
>>>>>>file/.revisions/<number/date>), and disable write access to it. You might not
>>>>>>even need extended filesystem attributes for this, but they would be handy
>>>>>>for tagging particular versions.
>>>>>
>>>>>if a file is opened, modified 2 times, then closed. u will only generate
>>>>>1 version right? so "When the file is modified" is inaccurate.
>>>>
>>>>How about "When the transaction was completed?" Why does it matter?
>>>
>>>then how u define a transaction? i mean we first need to choose a good
>>>event/period to define what is a good meaningful version.
>>>
>>>>>>Copy-on-write would make this action extremely cheap, only adding a couple of
>>>>>>extra writes to make it work.
>>>>>
>>>>>add 1 line at the beginning of a 100MB text file will make this uncheap.
>>>>
>>>>Who has to work with 100 meg text files? And why has this person not
>>>>broken them down into 100 kilobyte text files? Storage efficiency isn't
>>>>really an issue there...
>>>
>>>yes, 100MB/s text file is an extreme example, but a common case can be u
>>>delete 1 frame in a streaming media file.
>>
>>What do you mean by "streaming"? (To me, "streaming media" usually
>>means "over the Internet", which makes no sense here.)
>
>what i mean is frame is independent from each other, so when u delete
>one frame, other frame data keep unchanged, like change ABCDEFG from
>ACDEFG.
>
>>>basically, a cow is not good
>>>for a data shift situation. u have >99% data unchanged, just their
>>>offset in file is changed. this lead to all blocks changed, then COW
>>>will need to copy a lot.
>>
>>When do you have a data shift situation where this is significant enough
>>to impact COW, but not significant enough to affect normal performance?
>>
>>As far as I know, *nix has no way to append to the beginning of a file,
>>so if you're editing a large video file, say several gigs of DVD, you
>>have to write out several gigs worth of data all over again because you
>>want it shifted.
>
>yes, this is also what i know. thanks for u analysis, i now agree that
>COW should be ok for this case, considering the overhead.
>
>but another issue about COW is that when u have lots of versions, any
>write to original data will lead a lot of new writings to these COW
>storage.
>
>any place i can find document about how to write a plugin for reiser?
>sounds like interesting. :P
>
>ming
>
>>The filesystem may eventually provide more intelligent ways of messing
>>with a file, and the COW system should be able to handle when a program
>>appends to or chops off the beginning of a file.
>>
>>Until then, we can rely somewhat on programs optimizing for speed --
>>rather than rewrite several gigs, it could split the file into smaller
>>files (thus, only the file which was changed is copied), or make it a
>>sort of mini-FS in that it fragments the logical structure of the file
>>so that it writes as little as possible -- for instance, inserting a
>>clip in the middle might write to the end of a "project" file, instead
>>of shifting half of that file over first. You'd keep versions of the
>>project file, not the stream (properly defragmented) you'd export when
>>you're done.
>>
>>For cases where developers didn't have to deal with the speed issues, we
>>don't have to worry about it. In the case of audio editing, if it's
>>actually messing with the sound itself, no COW in the world will catch
>>that. If it's a mixing/sequencing program, that's usually stored as a
>>"project", accompanied by lots of little WAV files, which don't change,
>>and a tiny "project" file describing how they go together, which does
>>change.
>>
>>And for text files and office documents, the sizes just aren't usually
>>enough for us to care. My biggest OpenOffice.org document probably
>>isn't a hundred kilobytes, and my disk space is measured in gigabytes.
>>It'd take over ten thousand revisions to fill a gig with copies of one
>>of those files. Sure, we could make an Oasis plugin for OO.o to use, so
>>all the contents of the document are stored as individual files, turned
>>into a zipfile on demand to match the current standard -- but that's not
>>worth it in the short term, and only really helps with presentations in
>>the long term.
>>
>>Actually, while I think it'd be nice to be able to more advanced
>>splicing in a file (append or delete from the beginning or middle), I
>>think it's more important to come up with a sane way for a program to
>>access a file as a lot of little pieces, and to have a standard way of
>>serializing them for transport (email or otherwise). Kind of like XML,
>>only it could be more efficient than the old model, instead of less.
>>
>>Like XML in that XML allows programmers to dump internal structures to a
>>human-readable file without writing parsers and serializers. Move the
>>serializing logic out to the FS, let it handle the performance, version
>>control, and export issues.

well, frames should be handled by inheritance, because there are times you want to see them as separate objects....
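
For reference, the "data shift" point quoted above (prepending a byte or deleting a frame moves every later byte to a new offset, so a block-level COW sees all later blocks as changed) can be sketched roughly as below. This is only an illustration and not how reiser4 or any real plugin tracks changes: the 4096-byte block size, the fixed-offset block comparison, and the names block_hashes and changed_blocks are all assumptions made up for the example.

```python
import hashlib

BLOCK_SIZE = 4096  # hypothetical block size, chosen only for illustration


def block_hashes(data: bytes, block_size: int = BLOCK_SIZE) -> list[str]:
    """Hash each fixed-size block of data, in file order."""
    return [hashlib.sha256(data[i:i + block_size]).hexdigest()
            for i in range(0, len(data), block_size)]


def changed_blocks(old: bytes, new: bytes, block_size: int = BLOCK_SIZE) -> int:
    """Count blocks of `new` that differ from the block at the same index in `old`.

    Blocks that exist only in `new` (the file grew) also count as changed.
    """
    old_h = block_hashes(old, block_size)
    new_h = block_hashes(new, block_size)
    differing = sum(1 for a, b in zip(old_h, new_h) if a != b)
    return differing + max(0, len(new_h) - len(old_h))


# 64 blocks of deterministic, non-repeating sample data.
original = b"".join(hashlib.sha256(str(i).encode()).digest()
                    for i in range(64 * BLOCK_SIZE // 32))
mid = len(original) // 2

# Case 1: overwrite one byte in place (no offsets move).
buf = bytearray(original)
buf[mid] = (buf[mid] + 1) % 256
overwritten = bytes(buf)

# Case 2: prepend one byte (every offset moves by one).
prepended = b"X" + original

# Case 3: delete one byte in the middle, like ABCDEFG -> ACDEFG.
deleted = original[:mid] + original[mid + 1:]

print("overwrite:", changed_blocks(original, overwritten))
print("prepend:  ", changed_blocks(original, prepended))
print("delete:   ", changed_blocks(original, deleted))
```

With this sample data the in-place overwrite touches a single block, the prepend touches every block, and the mid-file delete touches every block from the deletion point onward, which is the case being argued about above.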
