I received several off-list replies to this, including a pointer to previous experiments by Data 61 colleagues that concluded LZO [0] was preferable to Gzip, due to faster compression and decompression times. For the representative heap below, LZO produces a 490M output. Compression and decompression take only a couple of seconds, compared with ~40s for compression and ~9s for decompression with Gzip. LZO also has the advantage of almost zero memory overhead for both operations. These measurements are fairly coarse grained, using the default settings of the respective command line tools but I expect them to be representative.

It looks available for most platforms, but is also ANSI C and GPLed so we could include it in-tree. I do not know what people's preferences are in this area.

From my perspective, LZO seems the clear front runner for now. Please let me know if you have an alternative suggestion or if you outright object to compressing saved states.

  [0]: http://www.oberhumer.com/opensource/lzo/

On 12/02/16 09:11, Matthew Fernandez wrote:
Hi Poly folk,

I am an Isabelle/HOL user interested in reducing the on-disk size of my 
Isabelle/HOL heaps. These are read and written
via the save state functionality in Poly/ML. Saved states compress very well 
via conventional algorithms. On a
representative Isabelle/HOL heap I see the following:

     $ ls -lh
     -r--r--r-- 1 matthew matthew 1.7G Feb 10 11:22 MyHeap
     -r--r--r-- 1 matthew matthew 279M Feb 10 11:22 MyHeap.bz2
     -r--r--r-- 1 matthew matthew 331M Feb 10 11:22 MyHeap.gz
     -r--r--r-- 1 matthew matthew 173M Feb 10 11:22 MyHeap.xz

I would like to propose integrating compression into the save state 
functionality itself. The runtime of Bzip and XZ are
probably slow enough to irritate some users, but Gzip is reasonably fast for 
the benefit it shows. Gzipping saved states
would require linking against zlib or something similar.

David Matthews has suggested the save state code detect whether a state is 
compressed or not on loading and act
accordingly, which would provide a nice transition path for users with existing 
saved states.

I'm interested in hearing others' opinions about this proposal. If we feel this 
is something reasonable to do, I'll work
on implementing this. When replying to this email, please CC me directly as I'm 
not subscribed to the list.

Thanks,
Matthew
_______________________________________________
polyml mailing list
[email protected]
http://lists.inf.ed.ac.uk/mailman/listinfo/polyml

Reply via email to