On Thursday, 15 March 2018 at 18:45:51 UTC, Anton Fediushin wrote:
```
$ dd if=test.raw | gzip -c | wc -c
67108864 bytes (67 MB, 64 MiB) copied, 5.49022 s, 12.2 MB/s
67119122 # Raw files are terrible for compression
$ dd if=test.raw | ./ecoji-d | gzip -c | wc -c
67108864 bytes (67 MB, 64 MiB) copied, 27.9972 s, 2.4 MB/s
32178275 # 48% improvement
$ dd if=test.raw | base64 | gzip -c | wc -c
67108864 bytes (67 MB, 64 MiB) copied, 10.3381 s, 6.5 MB/s
68892893 # Pretty bad, yeah
```

Random data isn't compressible. If running it through ecoji-d makes it compress by anything more than about 1%, that only shows there is a bug in your library:

```
$ dd if=/dev/urandom bs=4K count=16K of=test.raw
16384+0 records in
16384+0 records out
67108864 bytes (67 MB, 64 MiB) copied, 0.373423 s, 180 MB/s

$ dd if=test.raw | ./ecoji-d | gzip -c | gzip -cd | ./ecoji-d -d > test2.raw
131072+0 records in
131072+0 records out
67108864 bytes (67 MB, 64 MiB) copied, 24.9523 s, 2.7 MB/s

$ wc -c test.raw test2.raw
67108864 test.raw
11185155 test2.raw
```

So the files are definitely not the same before and after the compression/decompression round trip. However, the beginning is the same:

```
$ xxd test.raw | head
00000010: a05f c801 bf01 13c1 04a2 556a 6d79 a09c  ._........Ujmy..
00000020: 8032 523e 851d 419a b0d3 0c4f e7ba 93e1  .2R>..A....O....
00000030: 9fdc 7c55 2645 f6e7 3f9e f5db bc92 1e29  ..|U&E..?......)
00000040: 457a a3b9 c274 3b08 6bde 486a 1798 f281  Ez...t;.k.Hj....
00000050: 9d91 e97a f13f db8b 5d0c 114a 27be 2154  ...z.?..]..J'.!T
00000060: a9a2 3a17 36e4 9181 64f2 35b6 aa91 064d  ..:.6...d.5....M
00000070: 863a ddbd 8776 f87d 3eb2 634f 12dc 6e7f  .:...v.}>.cO..n.
00000080: 46c9 bc95 2620 b315 e84d 9ee4 8651 d172  F...& ...M...Q.r
00000090: 836d 7bf8 9e1c 09c3 0e10 b787 7e06 bc39  .m{.........~..9

$ xxd test2.raw | head
00000010: a05f c801 bf01 13c1 04a2 556a 6d79 a09c  ._........Ujmy..
00000020: 8032 523e 851d 419a b0d3 0c4f e7ba 93e1  .2R>..A....O....
00000030: 9fdc 7c55 2645 f6e7 3f9e f5db bc92 1e29  ..|U&E..?......)
00000040: 457a a3b9 c274 3b08 6bde 486a 1798 f281  Ez...t;.k.Hj....
00000050: 9d91 e97a f13f db8b 5d0c 114a 27be 2154  ...z.?..]..J'.!T
00000060: a9a2 3a17 36e4 9181 64f2 35b6 aa91 064d  ..:.6...d.5....M
00000070: 863a ddbd 8776 f87d 3eb2 634f 12dc 6e7f  .:...v.}>.cO..n.
00000080: 46c9 bc95 2620 b315 e84d 9ee4 8651 d172  F...& ...M...Q.r
00000090: 836d 7bf8 9e1c 09c3 0e10 b787 7e06 bc39  .m{.........~..9
```

So I think ecoji-d just truncates its input at some point.
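
Comparing the first few lines of `xxd` output only shows that the files start the same way. To check the truncation theory properly, one could test whether the whole of `test2.raw` is a byte-for-byte prefix of `test.raw`. A minimal sketch, not something I ran for this post: `is_prefix` is a hypothetical helper name, and it assumes `head -c` is available (GNU and BSD both ship it, though it is not in POSIX).

```shell
# is_prefix SHORT LONG
# Succeeds (exit status 0) if SHORT is byte-for-byte identical to the
# first N bytes of LONG, where N is the size of SHORT.
is_prefix() {
    # The arithmetic expansion strips the padding some wc versions emit.
    short_size=$(( $(wc -c < "$1") ))
    # Feed the first N bytes of LONG to cmp and compare them with SHORT.
    # -s keeps cmp quiet; only the exit status is used.
    head -c "$short_size" "$2" | cmp -s - "$1"
}
```

With the files from above it would be used as `is_prefix test2.raw test.raw && echo "clean truncation"`. If the check fails, the round trip corrupts data somewhere rather than simply stopping early.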
