Indeed, this is a population genomic dataset with very few site patterns
relative to the size of the full dataset. Cool!
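
For anyone who wants to try this on a toy case, here is a rough sketch of what I think is going on. It is not the actual dataset or the actual zip tool. It assumes the compressor is DEFLATE (what zip normally uses, reached here through Python's zlib), which can only refer back about 32 KB. The alignment is synthetic, and every name and parameter below (make_alignment, n_variable, the 60-column block width, and so on) is made up for illustration. With few variable sites, the rows within an interleaved block are nearly identical and sit well inside that window, while in the sequential layout the homologous positions are a full sequence length apart.

```python
import random
import zlib


def make_alignment(n_taxa=20, n_sites=50_000, n_variable=100, seed=1):
    """Synthetic alignment: one random base sequence shared by all taxa,
    with a small number of variable sites (hypothetical parameters)."""
    rng = random.Random(seed)
    base = [rng.choice("ACGT") for _ in range(n_sites)]
    variable = rng.sample(range(n_sites), n_variable)
    seqs = []
    for _ in range(n_taxa):
        s = base[:]
        for pos in variable:
            if rng.random() < 0.3:
                s[pos] = rng.choice("ACGT")
        seqs.append("".join(s))
    return seqs


def sequential(seqs):
    """Non-interleaved layout: each taxon's whole sequence on one line."""
    lines = [f"{len(seqs)} {len(seqs[0])}"]
    for i, s in enumerate(seqs):
        lines.append(f"taxon{i:<5d}{s}")
    return "\n".join(lines) + "\n"


def interleaved(seqs, width=60):
    """Interleaved layout: blocks of `width` columns, all taxa per block,
    names only in the first block, blank line between blocks."""
    lines = [f"{len(seqs)} {len(seqs[0])}"]
    for start in range(0, len(seqs[0]), width):
        for i, s in enumerate(seqs):
            name = f"taxon{i:<5d}" if start == 0 else ""
            lines.append(name + s[start:start + width])
        lines.append("")
    return "\n".join(lines)


def ratio(text):
    """Uncompressed size divided by zlib (DEFLATE) compressed size."""
    raw = text.encode("ascii")
    return len(raw) / len(zlib.compress(raw, 6))


if __name__ == "__main__":
    seqs = make_alignment()
    print(f"interleaved: {ratio(interleaved(seqs)):.1f}:1")
    print(f"sequential:  {ratio(sequential(seqs)):.1f}:1")
```

The sequence length is deliberately longer than 32 KB so that, in the sequential layout, one taxon's copy of a region is out of reach by the time the next taxon's copy comes along. I would expect the gap between the two ratios to shrink as n_variable goes up, but the real numbers will depend on the data and the compressor settings.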
Jake
> On Apr 18, 2017, at 2:18 PM, Joe Felsenstein wrote:
>
> I would guess that the compressibility of interleaved sequences would
> be highest when the sequences are clos
Jacob Berv noted:
>
> I noticed today that the compression ratio for an interleaved phylip file
> (zip compressed) was about 84:1 (390 MB uncompressed -> 4.6 MB compressed),
> whereas the compression ratio for the same data non-interleaved was a much
> worse 3.4:1 (390 MB uncompressed -> 113.9 MB