Re: Versioning Plugin

Hans Reiser Sat, 12 Nov 2005 20:23:49 -0800

Ming Zhang wrote:


>On Sat, 2005-11-12 at 20:56 -0600, David Masover wrote:
>  
>
>>Ming Zhang wrote:
>>    
>>
>>>On Sat, 2005-11-12 at 15:46 -0600, David Masover wrote:
>>>
>>>      
>>>
>>>>Ming Zhang wrote:
>>>>
>>>>        
>>>>
>>>>>On Fri, 2005-11-11 at 16:56 -0800, Peter van Hardenberg wrote:
>>>>>
>>>>>
>>>>>          
>>>>>
>>>>>>On November 11, 2005 05:59 am, John Gilmore wrote:
>>>>>>
>>>>>>
>>>>>>            
>>>>>>
>>>>>>>Does anybody remember GoBack? It was a versioning
>>>>>>>system for Windows 95/98 that was incredibly flexible and useful. Tracked
>>>>>>>all changes to the whole disk. Old versions of a file? No problem. Grab an
>>>>>>>old version of a directory for reference temporarily? Easy. Got a virus?
>>>>>>>Revert the whole HD, and then grab the newer copies of your documents and
>>>>>>>saved games as needed.
>>>>>>>              
>>>>>>>
>>>>>>My thoughts on this:
>>>>>>
>>>>>>The versioning would be an audit plugin. When the file is modified, tag
>>>>>>the current version, copy it into a sub-directory (oh, I don't know, say
>>>>>>file/.revisions/<number/date>), and disable write access to it. You might
>>>>>>not even need extended filesystem attributes for this, but they would be
>>>>>>handy for tagging particular versions.
>>>>>>            
>>>>>>
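The scheme quoted above (tag the current version, copy it into a revisions directory, remove write access) can be approximated in userspace. This is a minimal Python sketch, not taken from the thread. The helper name `snapshot_before_write` and the sibling `<file>.revisions` directory are assumptions: an ordinary filesystem cannot put a directory underneath a regular file the way the proposed `file/.revisions/` layout would. It also makes a full copy rather than using copy-on-write.

```python
import os
import shutil
import stat
import time


def snapshot_before_write(path):
    """Copy `path` into a revisions directory and make the copy read-only.

    Hypothetical userspace stand-in for the plugin idea; a real filesystem
    plugin would do this internally and could share blocks instead of copying.
    """
    # Assumption: a sibling directory named "<file>.revisions".
    rev_dir = path + ".revisions"
    os.makedirs(rev_dir, exist_ok=True)

    # Tag the version with a running number and a timestamp.
    number = len(os.listdir(rev_dir)) + 1
    stamp = time.strftime("%Y%m%dT%H%M%S")
    dest = os.path.join(rev_dir, f"{number:04d}-{stamp}")

    shutil.copy2(path, dest)  # copy data and metadata
    # Disable write access: read-only for user, group, and others.
    os.chmod(dest, stat.S_IRUSR | stat.S_IRGRP | stat.S_IROTH)
    return dest
```

Calling `snapshot_before_write("notes.txt")` just before each save would leave one read-only copy per saved version under `notes.txt.revisions/`.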
>>>>>If a file is opened, modified 2 times, then closed, you will only generate
>>>>>1 version, right? So "When the file is modified" is inaccurate.
>>>>>          
>>>>>
>>>>How about "When the transaction was completed?"  Why does it matter?
>>>>        
>>>>
>>>Then how do you define a transaction? I mean we first need to choose a good
>>>event/period to define what a meaningful version is.
>>>
>>>
>>>
>>>      
>>>
>>>>>>Copy-on-write would make this action extremely cheap, only adding a
>>>>>>couple of extra writes to make it work.
>>>>>>            
>>>>>>
>>>>>Adding 1 line at the beginning of a 100MB text file will make this far from cheap.
>>>>>          
>>>>>
>>>>Who has to work with 100 meg text files?  And why has this person not
>>>>broken them down into 100 kilobyte text files?  Storage efficiency isn't
>>>>really an issue there...
>>>>        
>>>>
>>>Yes, a 100MB text file is an extreme example, but a common case would be
>>>deleting 1 frame in a streaming media file.
>>>      
>>>
>>What do you mean by "streaming"?  (To me, "streaming media" usually
>>means "over the Internet", which makes no sense here.)
>>    
>>
>
>What I mean is that frames are independent of each other, so when you delete
>one frame, the other frames' data stays unchanged, like changing ABCDEFG to
>ACDEFG.
>
>
>  
>
>>>Basically, COW is not good
>>>for a data-shift situation. You have >99% of the data unchanged; only its
>>>offset in the file has changed. This leads to all blocks changing, so COW
>>>will need to copy a lot.
>>>      
>>>
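The data-shift concern quoted above can be made concrete with a small Python sketch, not taken from the thread. It assumes a block-granular COW scheme that compares fixed-offset blocks; the 4096-byte block size and the helper name `changed_blocks` are illustrative assumptions, not Reiser4 specifics.

```python
BLOCK_SIZE = 4096  # assumed block size, for illustration only


def changed_blocks(old, new, block_size=BLOCK_SIZE):
    """Count fixed-offset blocks whose contents differ between two byte strings."""
    n_blocks = (max(len(old), len(new)) + block_size - 1) // block_size
    changed = 0
    for i in range(n_blocks):
        start = i * block_size
        if old[start:start + block_size] != new[start:start + block_size]:
            changed += 1
    return changed


# 100 blocks of deterministic, non-repeating-per-block data.
data = bytes(i % 251 for i in range(100 * BLOCK_SIZE))

overwrite = b"X" + data[1:]  # change one byte in place
shifted = b"X" + data        # insert one byte at the front, shifting everything

print(changed_blocks(data, overwrite))  # 1
print(changed_blocks(data, shifted))    # 101
```

An in-place change dirties one block, while a one-byte insertion at the front makes every block differ, including the new partial block at the end. A block-level COW layer would have to keep old copies of all of them.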
>>When do you have a data shift situation where this is significant enough
>>to impact COW, but not significant enough to affect normal performance?
>>
>>As far as I know, *nix has no way to append to the beginning of a file,
>>so if you're editing a large video file, say several gigs of DVD, you
>>have to write out several gigs worth of data all over again because you
>>want it shifted.
>>    
>>
>
>Yes, this is also my understanding. Thanks for your analysis; I now agree that
>COW should be OK for this case, considering the overhead.
>
>But another issue with COW is that when you have lots of versions, any
>write to the original data will lead to a lot of new writes to the COW
>storage.
>
>Is there any place I can find documentation on how to write a plugin for
>Reiser? Sounds interesting. :P
>
>ming
>
>  
>
>>The filesystem may eventually provide more intelligent ways of messing
>>with a file, and the COW system should be able to handle when a program
>>appends to or chops off the beginning of a file.
>>
>>Until then, we can rely somewhat on programs optimizing for speed --
>>rather than rewrite several gigs, it could split the file into smaller
>>files (thus, only the file which was changed is copied), or make it a
>>sort of mini-FS in that it fragments the logical structure of the file
>>so that it writes as little as possible -- for instance, inserting a
>>clip in the middle might write to the end of a "project" file, instead
>>of shifting half of that file over first.  You'd keep versions of the
>>project file, not the stream (properly defragmented) you'd export when
>>you're done.
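One possible reading of the "project file" idea quoted above, sketched in Python and not taken from the thread: the media files stay immutable and the project is only a list of references into them, so inserting a clip rewrites a few small records instead of shifting gigabytes. The `Segment` type, its fields, and `insert_clip` are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    source: str  # name of an immutable media file (hypothetical)
    start: int   # first frame taken from the source
    end: int     # one past the last frame taken


def insert_clip(timeline, position, clip):
    """Return a new timeline with `clip` inserted at frame `position`."""
    result = []
    offset = 0  # frame offset of the current segment within the timeline
    inserted = False
    for seg in timeline:
        length = seg.end - seg.start
        if not inserted and offset <= position < offset + length:
            cut = seg.start + (position - offset)
            if cut > seg.start:
                # Keep the part of the segment before the insertion point.
                result.append(Segment(seg.source, seg.start, cut))
            result.append(clip)
            result.append(Segment(seg.source, cut, seg.end))
            inserted = True
        else:
            result.append(seg)
        offset += length
    if not inserted:
        result.append(clip)  # position at or past the end: append
    return result
```

For example, inserting `Segment("b.wav", 0, 10)` at frame 40 of `[Segment("a.wav", 0, 100)]` yields three small records, and neither `a.wav` nor `b.wav` is rewritten. Only the project description would need versioning.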
>>
>>For cases where developers didn't have to deal with the speed issues, we
>>don't have to worry about it.  In the case of audio editing, if it's
>>actually messing with the sound itself, no COW in the world will catch
>>that.  If it's a mixing/sequencing program, that's usually stored as a
>>"project", accompanied by lots of little WAV files, which don't change,
>>and a tiny "project" file describing how they go together, which does
>>change.
>>
>>And for text files and office documents, the sizes just aren't usually
>>enough for us to care.  My biggest OpenOffice.org document probably
>>isn't a hundred kilobytes, and my disk space is measured in gigabytes.
>>It'd take over ten thousand revisions to fill a gig with copies of one
>>of those files.  Sure, we could make an Oasis plugin for OO.o to use, so
>>all the contents of the document are stored as individual files, turned
>>into a zipfile on demand to match the current standard -- but that's not
>>worth it in the short term, and only really helps with presentations in
>>the long term.
>>
>>Actually, while I think it'd be nice to be able to do more advanced
>>splicing in a file (append or delete from the beginning or middle), I
>>think it's more important to come up with a sane way for a program to
>>access a file as a lot of little pieces, and to have a standard way of
>>serializing them for transport (email or otherwise).  Kind of like XML,
>>only it could be more efficient than the old model, instead of less.
>>
>>Like XML in that XML allows programmers to dump internal structures to a
>>human-readable file without writing parsers and serializers.  Move the
>>serializing logic out to the FS, let it handle the performance, version
>>control, and export issues.
>>    
>>
>
>
>
>  
>
well, frames should be handled by inheritance, because there are times
you want to see them as separate objects....
