Re: FSFS format7 and compressed XML bundles

2013-03-06 Thread Vincent Lefevre
On 2013-03-06 18:55:55 +, Julian Foad wrote: > I don't know if anything like that would be feasible.  It may be > possible in theory but too complex in practice.  The parameters we > need to extract would include such things as the Huffman coding > tables used and also parameters that influence

Re: FSFS format7 and compressed XML bundles

2013-03-06 Thread Julian Foad
Vincent Lefevre wrote: > On 2013-03-05 16:52:30 +, Julian Foad wrote: >> Vincent Lefevre wrote: > [about server-side vs client-side] [...] > Because the diff between two huge compressed files is generally huge > (unless some rsync-friendly option has been applied, when available). > So, if th

Re: FSFS format7 and compressed XML bundles

2013-03-06 Thread Magnus Thor Torfason
This is all very insightful and informative. For fun, I threw together a quick script which commits a series of extremely minor changes to a MS Word file and monitors how the repository size evolves. I then added the following lines to the script to commit not the original Word file but an unzip

Re: FSFS format7 and compressed XML bundles

2013-03-06 Thread Vincent Lefevre
On 2013-03-05 16:52:30 +, Julian Foad wrote: > Vincent Lefevre wrote: [about server-side vs client-side] > > But even if there would be no problems with the > > construction/reconstruction, it would be a bad solution, IMHO. > > Indeed, for a commit, it is the client that is supposed to expand >

Re: FSFS format7 and compressed XML bundles

2013-03-05 Thread Julian Foad
I (Julian Foad) wrote: > Vincent Lefevre wrote: >>  On 2013-03-05 13:30:28 +, Julian Foad wrote: >>> Vincent Lefevre wrote: On 2013-03-01 14:58:10 +, Philip Martin wrote: > A server-side solution is difficult.  Suppose the client has some > uncompressed content U which it compr

Re: FSFS format7 and compressed XML bundles

2013-03-05 Thread Julian Foad
Vincent Lefevre wrote: > On 2013-03-05 13:30:28 +, Julian Foad wrote: >> Vincent Lefevre wrote: >> > On 2013-03-01 14:58:10 +, Philip Martin wrote: >> >>  A server-side solution is difficult.  Suppose the client has some >> >>  uncompressed content U which it compresses to C and sends

Re: FSFS format7 and compressed XML bundles

2013-03-05 Thread Vincent Lefevre
Hi Julian, On 2013-03-05 13:30:28 +, Julian Foad wrote: > Vincent Lefevre wrote: > > > On 2013-03-01 14:58:10 +, Philip Martin wrote: > >> A server-side solution is difficult.  Suppose the client has some > >> uncompressed content U which it compresses to C and sends to the server. > >>

Re: FSFS format7 and compressed XML bundles

2013-03-05 Thread Julian Foad
Vincent Lefevre wrote: > On 2013-03-01 14:58:10 +, Philip Martin wrote: >> A server-side solution is difficult.  Suppose the client has some >> uncompressed content U which it compresses to C and sends to the server. >> The server can uncompress C to get U but unless the compression scheme

Re: FSFS format7 and compressed XML bundles

2013-03-05 Thread Vincent Lefevre
On 2013-03-01 14:58:10 +, Philip Martin wrote: > A server-side solution is difficult. Suppose the client has some > uncompressed content U which it compresses to C and sends to the server. > The server can uncompress C to get U but unless the compression scheme > has a canonical compressed for

Re: FSFS format7 and compressed XML bundles

2013-03-01 Thread Ben Reser
On Fri, Mar 1, 2013 at 5:49 PM, Julian Foad wrote: > No, that's not true. I think the article Ben read was inaccurate. The > '--rsyncable' option doesn't reset the compression after a fixed number of > bytes, but rather at every point where a rolling checksum of the last N bytes > leading up
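
The difference between fixed-length blocks and boundaries chosen by a rolling checksum can be sketched in Python. This is only an illustration of the general idea, not gzip's implementation: the window size, the modulus, the plain byte-sum checksum, and the function names are all assumptions made for the example.

```python
import random

WINDOW = 32    # hypothetical N: bytes covered by the rolling checksum
MODULUS = 256  # hypothetical: gives boundaries roughly every 256 bytes


def content_chunks(data):
    """Split where the sum of the last WINDOW bytes is 0 modulo MODULUS."""
    chunks, start, rolling = [], 0, 0
    for i, byte in enumerate(data):
        rolling += byte
        if i >= WINDOW:
            rolling -= data[i - WINDOW]  # drop the byte that left the window
        if i + 1 >= WINDOW and rolling % MODULUS == 0:
            chunks.append(data[start:i + 1])
            start = i + 1
    if start < len(data):
        chunks.append(data[start:])
    return chunks


def fixed_chunks(data, size=256):
    """Split every `size` bytes, regardless of content."""
    return [data[i:i + size] for i in range(0, len(data), size)]


rng = random.Random(0)
original = bytes(rng.randrange(256) for _ in range(20000))
edited = original[:100] + b"INSERTED" + original[100:]  # small early insertion

# Count the chunks that survive the edit unchanged under each scheme.
shared_content = len(set(content_chunks(original)) & set(content_chunks(edited)))
shared_fixed = len(set(fixed_chunks(original)) & set(fixed_chunks(edited)))
print(shared_content, shared_fixed)
```

Because each content-defined boundary depends only on the preceding WINDOW bytes, the boundaries re-synchronise shortly after the inserted bytes, whereas every fixed-length block after the insertion is shifted.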

Re: FSFS format7 and compressed XML bundles

2013-03-01 Thread Julian Foad
> Vincent Lefevre wrote: >> Ben Reser wrote: >>> It resets the compression algorithm every 1000 bytes and thus makes >>> blocks that can be saved between revisions of the file. >> >> Wouldn't this work only when data are appended to the file? >> If data are inserted or deleted, this would cha

Re: FSFS format7 and compressed XML bundles

2013-03-01 Thread Ben Reser
On Thu, Feb 28, 2013 at 12:08 PM, Stefan Fuhrmann wrote: > ZIP - in contrast to .tar.gz - compresses each of these files > individually and then mainly concatenates them into the > result file. As long as you don't change the template or > any of the existing pictures, for instance, larger parts o
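
A minimal Python sketch of the per-member compression described here, using the standard `zipfile` module. The member names and contents are made up for the example, and `build_bundle` and `raw_member` are hypothetical helpers; the point is only that the compressed bytes of an untouched member stay the same when another member changes.

```python
import io
import struct
import zipfile


def build_bundle(document_xml, image):
    """Build a small ZIP bundle in memory with two deflated members."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr("word/media/image1.bin", image)
        zf.writestr("word/document.xml", document_xml)
    return buf.getvalue()


def raw_member(archive, name):
    """Return the compressed bytes of one member, exactly as stored."""
    with zipfile.ZipFile(io.BytesIO(archive)) as zf:
        info = zf.getinfo(name)
    # The local file header is 30 fixed bytes; the name and extra-field
    # lengths are stored at offset 26 of that header.
    name_len, extra_len = struct.unpack_from("<HH", archive, info.header_offset + 26)
    start = info.header_offset + 30 + name_len + extra_len
    return archive[start:start + info.compress_size]


image = bytes(range(256)) * 64
old = build_bundle(b"<w:t>first draft</w:t>" * 100, image)
new = build_bundle(b"<w:t>second draft</w:t>" * 100, image)

image_unchanged = raw_member(old, "word/media/image1.bin") == raw_member(new, "word/media/image1.bin")
document_changed = raw_member(old, "word/document.xml") != raw_member(new, "word/document.xml")
print(image_unchanged, document_changed)
```

In a .tar.gz, by contrast, the whole archive is one compressed stream, so a change in an early member can alter the compressed bytes of everything that follows.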

Re: FSFS format7 and compressed XML bundles

2013-03-01 Thread Ben Reser
On Fri, Mar 1, 2013 at 5:44 AM, Vincent Lefevre wrote: > Wouldn't this work only when data are appended to the file? > If data are inserted or deleted, this would change the block > boundaries. Instead of fixed-length blocks, I'd rather see > boundaries based on the file contents. That's true, th

Re: FSFS format7 and compressed XML bundles

2013-03-01 Thread Ben Reser
On Fri, Mar 1, 2013 at 6:30 AM, Vincent Lefevre wrote: > On 2013-03-01 14:24:07 +, Philip Martin wrote: >> $ gzip --help | grep rsync >> --rsyncable Make rsync-friendly archive >> $ dpkg -s gzip | grep Version >> Version: 1.5-1.1 > > OK, then it seems that its man page is out-of-date.

Re: FSFS format7 and compressed XML bundles

2013-03-01 Thread Philip Martin
Julian Foad writes: > Yes, a client-side plug-in -- either to Subversion or to OpenOffice -- > seems to me the best practical solution. A server-side solution is difficult. Suppose the client has some uncompressed content U which it compresses to C and sends to the server. The server can uncomp
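
The lack of a canonical compressed form can be illustrated with a short Python sketch using the standard `zlib` module. The content `U` is made up, and the two compression levels simply stand in for two clients or tools with different settings.

```python
import zlib

# Hypothetical uncompressed content U (repetitive, so it compresses well).
U = b"<w:p><w:r><w:t>Hello, Subversion</w:t></w:r></w:p>\n" * 200

# Two clients (or two settings of the same tool) compress the same U.
C_fast = zlib.compress(U, 1)
C_best = zlib.compress(U, 9)

# Both streams decode to the same U ...
same_content = zlib.decompress(C_fast) == zlib.decompress(C_best) == U
# ... but the streams themselves differ, so a server that stores only U
# cannot know which C to hand back without recording extra parameters.
same_stream = C_fast == C_best
print(same_content, same_stream)
```

A server that stores only U would therefore have to record enough of the compressor's parameters to regenerate the exact bytes of C, or accept returning a different, though equivalent, compressed file.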

Re: FSFS format7 and compressed XML bundles

2013-03-01 Thread Vincent Lefevre
On 2013-03-01 14:24:07 +, Philip Martin wrote: > $ gzip --help | grep rsync > --rsyncable Make rsync-friendly archive > $ dpkg -s gzip | grep Version > Version: 1.5-1.1 OK, then it seems that its man page is out-of-date.

Re: FSFS format7 and compressed XML bundles

2013-03-01 Thread Philip Martin
Vincent Lefevre writes: > On 2013-02-28 10:58:07 -0800, Ben Reser wrote: >> Speaking with Julian here at ApacheCon he mentioned that gzip has a >> rsyncable option. Looking into this turns out that there is a patch >> applied to Debian's gzip that provides this option. > > I can't see such an op

Re: FSFS format7 and compressed XML bundles

2013-03-01 Thread Vincent Lefevre
On 2013-02-28 10:58:07 -0800, Ben Reser wrote: > Speaking with Julian here at ApacheCon he mentioned that gzip has a > rsyncable option. Looking into this turns out that there is a patch > applied to Debian's gzip that provides this option. I can't see such an option in Debian's gzip (from unstab

Re: FSFS format7 and compressed XML bundles

2013-03-01 Thread Vincent Lefevre
On 2013-02-28 08:37:51 -0800, Ben Reser wrote: > 3) You'd be saving storage at the expense of using time (read: CPU) on > every client that's working with those files when checking out. So > the end result may be worse than the current problem. But storage is permanent, while a checkout may occur

RE: FSFS format7 and compressed XML bundles

2013-02-28 Thread Bert Huijben
> -----Original Message----- > From: Julian Foad [mailto:[email protected]] > Sent: Thursday 28 February 2013 20:54 > To: Ben Reser > Cc: Magnus Thor Torfason; Subversion Development > Subject: Re: FSFS format7 and compressed XML bundles > > Ben Reser wro

Re: FSFS format7 and compressed XML bundles

2013-02-28 Thread Stefan Fuhrmann
On Thu, Feb 28, 2013 at 5:04 PM, Magnus Thor Torfason < [email protected]> wrote: > Hey all, > Sorry that I have to disagree with what most people said. I guess Mark got close to what the current intent is. > I've been following the discussion about FSFS format7, and had a question: > I

Re: FSFS format7 and compressed XML bundles

2013-02-28 Thread Julian Foad
Ben Reser wrote: > Speaking with Julian here at ApacheCon he mentioned that gzip has a > rsyncable option.  Looking into this turns out that there is a patch > applied to Debian's gzip that provides this option.  It resets the > compression algorithm every 1000 bytes and thus makes blocks that can

Re: FSFS format7 and compressed XML bundles

2013-02-28 Thread Mark Phippard
On Thu, Feb 28, 2013 at 1:45 PM, Ben Reser wrote: > On Thu, Feb 28, 2013 at 8:28 AM, Mark Phippard wrote: > > FWIW, the Branch Readme does imply he intends to work on some things that > > might have an impact here. > I pasted the contents of the readme merely to point out that he indicates that

Re: FSFS format7 and compressed XML bundles

2013-02-28 Thread Ben Reser
On Thu, Feb 28, 2013 at 8:37 AM, Ben Reser wrote: > I just don't see this happening unless someone has a very clever idea > that I haven't thought of. Speaking with Julian here at ApacheCon he mentioned that gzip has a rsyncable option. Looking into this turns out that there is a patch applied t

Re: FSFS format7 and compressed XML bundles

2013-02-28 Thread Ben Reser
On Thu, Feb 28, 2013 at 8:28 AM, Mark Phippard wrote: > FWIW, the Branch Readme does imply he intends to work on some things that > might have an impact here. Specifically: > > TxDelta v2 > -- > > Version 1 of txdelta turns out to be limited in its effectiveness for > larger files when da

Re: FSFS format7 and compressed XML bundles

2013-02-28 Thread Ben Reser
On Thu, Feb 28, 2013 at 8:04 AM, Magnus Thor Torfason wrote: > I've been following the discussion about FSFS format7, and had a question: > Is there any chance that the format would improve storage efficiency for > documents that are stored as compressed (zipped) bundles of XML files and > other r

Re: FSFS format7 and compressed XML bundles

2013-02-28 Thread Mark Phippard
On Thu, Feb 28, 2013 at 11:25 AM, Branko Čibej wrote: > On 28.02.2013 08:04, Magnus Thor Torfason wrote: > > Hey all, > > > > I've been following the discussion about FSFS format7, and had a > > question: Is there any chance that the format would improve storage > > efficiency for documents that

Re: FSFS format7 and compressed XML bundles

2013-02-28 Thread Branko Čibej
On 28.02.2013 08:04, Magnus Thor Torfason wrote: > Hey all, > > I've been following the discussion about FSFS format7, and had a > question: Is there any chance that the format would improve storage > efficiency for documents that are stored as compressed (zipped) > bundles of XML files and other r