Re: Ecoji-d v1.0.0 is released - Base1024 using emojis 

2018-03-18 Thread Abdulhaq via Digitalmars-d-announce
On Wednesday, 14 March 2018 at 17:30:18 UTC, Anton Fediushin 
wrote:
I'm glad to announce that ecoji-d - pure D implementation of 
ecoji encoding version 1️⃣.0️⃣.0️⃣ is finally released❗


[...]


Congratulations, it's a nice bit of fun.


Re: Ecoji-d v1.0.0 is released - Base1024 using emojis 

2018-03-18 Thread bauss via Digitalmars-d-announce

On Sunday, 18 March 2018 at 12:51:23 UTC, Anton Fediushin wrote:

On Friday, 16 March 2018 at 08:25:30 UTC, bauss wrote:
Besides your encoding isn't going to work with actual 
web-pages anyway, because your encoder doesn't have browser 
support.


Well, the encoding is not *mine*, only the D implementation is. What do 
you mean by "browser support"? Indeed, ecoji-d cannot be used 
on the client side, but since the algorithm is simple and the code is 
publicly available, anyone can implement decoding in 
JavaScript or any other language.




Yes, but that makes your example pointless, because having to 
decode in JavaScript is not exactly something that anybody in 
their right mind would ever do with a webpage or anything like 
that anyway.




Re: Ecoji-d v1.0.0 is released - Base1024 using emojis 

2018-03-18 Thread Anton Fediushin via Digitalmars-d-announce

On Sunday, 18 March 2018 at 11:25:45 UTC, Cym13 wrote:

So I think ecoji-d just truncates its input at some point.


Indeed, there's an error somewhere. For some reason it stops 
after 7457792 bytes. I'll create an issue for that and look 
into it later.


Re: Ecoji-d v1.0.0 is released - Base1024 using emojis 

2018-03-18 Thread Anton Fediushin via Digitalmars-d-announce

On Friday, 16 March 2018 at 08:25:30 UTC, bauss wrote:
Besides your encoding isn't going to work with actual web-pages 
anyway, because your encoder doesn't have browser support.


Well, the encoding is not *mine*, only the D implementation is. What do 
you mean by "browser support"? Indeed, ecoji-d cannot be used on 
the client side, but since the algorithm is simple and the code is 
publicly available, anyone can implement decoding in JavaScript 
or any other language.


Sure you can encode your data and gzip it, but once it reaches 
the browser and it unzips it, then what? The browser doesn't 
know what to do with the data. You can't even use base64 for 
http headers.


Then you use a client-side decoder, of course!



Re: Ecoji-d v1.0.0 is released - Base1024 using emojis 

2018-03-18 Thread Cym13 via Digitalmars-d-announce

On Thursday, 15 March 2018 at 18:45:51 UTC, Anton Fediushin wrote:

$ dd if=test.raw | gzip -c | wc -c
67108864 bytes (67 MB, 64 MiB) copied, 5.49022 s, 12.2 MB/s
67119122 # Raw files are terrible for compression
$ dd if=test.raw | ./ecoji-d | gzip -c | wc -c
67108864 bytes (67 MB, 64 MiB) copied, 27.9972 s, 2.4 MB/s
32178275 # 48% improvement
$ dd if=test.raw | base64 | gzip -c | wc -c
67108864 bytes (67 MB, 64 MiB) copied, 10.3381 s, 6.5 MB/s
68892893 # Pretty bad, yeah


Randomness isn't compressible. The fact that ecoji-d compresses 
anything above 1% shows only that there is a bug in your library:


```
$ dd if=/dev/urandom bs=4K count=16K of=test.raw
16384+0 records in
16384+0 records out
67108864 bytes (67 MB, 64 MiB) copied, 0.373423 s, 180 MB/s

$ dd if=test.raw | ./ecoji-d | gzip -c | gzip -cd | ./ecoji-d -d > test2.raw

131072+0 records in
131072+0 records out
67108864 bytes (67 MB, 64 MiB) copied, 24.9523 s, 2.7 MB/s

$ wc -c test.raw test2.raw
67108864 test.raw
11185155 test2.raw
```

So definitely not the same files before and after 
compression/decompression. However the beginning is the same:


```
$ xxd test.raw | head
0010: a05f c801 bf01 13c1 04a2 556a 6d79 a09c  ._........Ujmy..
0020: 8032 523e 851d 419a b0d3 0c4f e7ba 93e1  .2R>..A....O....
0030: 9fdc 7c55 2645 f6e7 3f9e f5db bc92 1e29  ..|U&E..?......)
0040: 457a a3b9 c274 3b08 6bde 486a 1798 f281  Ez...t;.k.Hj....
0050: 9d91 e97a f13f db8b 5d0c 114a 27be 2154  ...z.?..]..J'.!T
0060: a9a2 3a17 36e4 9181 64f2 35b6 aa91 064d  ..:.6...d.5....M
0070: 863a ddbd 8776 f87d 3eb2 634f 12dc 6e7f  .:...v.}>.cO..n.
0080: 46c9 bc95 2620 b315 e84d 9ee4 8651 d172  F...& ...M...Q.r
0090: 836d 7bf8 9e1c 09c3 0e10 b787 7e06 bc39  .m{.........~..9

$ xxd test2.raw | head
0010: a05f c801 bf01 13c1 04a2 556a 6d79 a09c  ._........Ujmy..
0020: 8032 523e 851d 419a b0d3 0c4f e7ba 93e1  .2R>..A....O....
0030: 9fdc 7c55 2645 f6e7 3f9e f5db bc92 1e29  ..|U&E..?......)
0040: 457a a3b9 c274 3b08 6bde 486a 1798 f281  Ez...t;.k.Hj....
0050: 9d91 e97a f13f db8b 5d0c 114a 27be 2154  ...z.?..]..J'.!T
0060: a9a2 3a17 36e4 9181 64f2 35b6 aa91 064d  ..:.6...d.5....M
0070: 863a ddbd 8776 f87d 3eb2 634f 12dc 6e7f  .:...v.}>.cO..n.
0080: 46c9 bc95 2620 b315 e84d 9ee4 8651 d172  F...& ...M...Q.r
0090: 836d 7bf8 9e1c 09c3 0e10 b787 7e06 bc39  .m{.........~..9
```

So I think ecoji-d just truncates its input at some point.
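
For anyone who wants to automate this kind of round-trip check, here is a minimal Python sketch. It is not part of ecoji-d: `roundtrip_report` is a hypothetical helper, and base64 is used only as a stand-in codec so the example runs without the library. The second call simulates a decoder that silently truncates its output.

```python
import base64
import hashlib
import os
from typing import Callable


def roundtrip_report(data: bytes,
                     encode: Callable[[bytes], bytes],
                     decode: Callable[[bytes], bytes]) -> dict:
    """Encode then decode `data` and report whether it survived intact."""
    restored = decode(encode(data))
    return {
        "original_size": len(data),
        "restored_size": len(restored),
        # Compare digests so the check also works for very large inputs.
        "identical": hashlib.sha256(data).digest() == hashlib.sha256(restored).digest(),
        # True when the output is a (possibly shorter) prefix of the input,
        # which is the pattern a truncation bug produces.
        "is_prefix": data.startswith(restored),
    }


data = os.urandom(1 << 16)

# A correct codec: sizes match and the digests are equal.
ok = roundtrip_report(data, base64.b64encode, base64.b64decode)

# A stand-in for a buggy codec that drops everything after 1000 bytes.
bad = roundtrip_report(data, base64.b64encode,
                       lambda enc: base64.b64decode(enc)[:1000])

print(ok)
print(bad)
```

A result with `identical` false but `is_prefix` true matches what the `wc -c` and `xxd` output above suggests: the beginning is right, the rest is missing.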


Re: Ecoji-d v1.0.0 is released - Base1024 using emojis 

2018-03-17 Thread Manu via Digitalmars-d-announce
On 15 March 2018 at 11:45, Anton Fediushin via Digitalmars-d-announce <
digitalmars-d-announce@puremagic.com> wrote:

>
> Even though each emoji is 4 bytes long, there is a noticeable difference in
> size when we are talking about larger chunks of data:
>

This doesn't make sense. For every 10 bits, you're emitting 32 bits...
you're more than tripling the size of the data.
Base64 takes 6 bits and emits 8 bits, which is a third larger. 1.333x is
smaller than 3.2x. O_o
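
The arithmetic can be written out as a small Python sketch. The only assumption is the one already stated in the thread, that every emoji takes 4 bytes in UTF-8; `expansion` is a hypothetical helper name, and padding and line breaks are ignored.

```python
from fractions import Fraction


def expansion(bits_in: int, bytes_out: int) -> Fraction:
    """Output bytes per input byte for a code mapping bits_in bits to bytes_out bytes."""
    return Fraction(bytes_out * 8, bits_in)


# Base64: 6 input bits become one ASCII character (1 byte).
base64_ratio = expansion(6, 1)

# Base1024 with emojis: 10 input bits become one emoji (assumed 4 bytes in UTF-8).
emoji_ratio = expansion(10, 4)

print(f"base64: {float(base64_ratio):.3f}x")
print(f"emoji:  {float(emoji_ratio):.3f}x")
```

Counted in characters rather than bytes the emoji encoding is denser (10 bits per symbol against 6), but `wc -c` counts bytes.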


Re: Ecoji-d v1.0.0 is released - Base1024 using emojis 

2018-03-17 Thread Faux Amis via Digitalmars-d-announce

On 2018-03-14 18:30, Anton Fediushin wrote:
I'm glad to announce that ecoji-d - pure D implementation of ecoji 
encoding version 1️⃣.0️⃣.0️⃣ is finally released❗


What is ecoji?

Ecoji encodes data as base1024 with an emoji character set. It can be 
used instead of boring and old base64.


Encoding example:

---
$ echo "Base64 is so 1999, isn't there something better?" | ecoji-d
論撚若 



Useful feature: Easy manual verification.


Re: Ecoji-d v1.0.0 is released - Base1024 using emojis 

2018-03-16 Thread Rainer Schuetze via Digitalmars-d-announce



On 15/03/2018 19:45, Anton Fediushin wrote:

$ dd if=test.raw | ./ecoji-d | gzip -c | wc -c
67108864 bytes (67 MB, 64 MiB) copied, 27.9972 s, 2.4 MB/s
32178275 # 48% improvement


If you can compress random data to 52% of the original data, you should 
repeat this step until there is a single byte left.
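
This is easy to check without ecoji-d at all. Below is a minimal Python sketch, under two assumptions: `zlib` (the same DEFLATE algorithm gzip uses, with a different header) stands in for `gzip -c`, and a seeded pseudo-random generator stands in for /dev/urandom so the run is repeatable.

```python
import random
import zlib

rng = random.Random(0)          # fixed seed so the run is repeatable
n = 1 << 20                     # 1 MiB of pseudo-random bytes
data = rng.getrandbits(8 * n).to_bytes(n, "big")

compressed = zlib.compress(data, 9)
ratio = len(compressed) / n

# For random input the compressed size stays at about the input size,
# so a ratio near 0.5 would point to lost data rather than good compression.
print(f"{n} -> {len(compressed)} bytes (ratio {ratio:.4f})")
```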


Re: Ecoji-d v1.0.0 is released - Base1024 using emojis 

2018-03-16 Thread bauss via Digitalmars-d-announce

On Thursday, 15 March 2018 at 18:45:51 UTC, Anton Fediushin wrote:

On Thursday, 15 March 2018 at 09:32:50 UTC, bauss wrote:

Fun, but seems pretty useless in practice.


I disagree. Ecoji (base1024) has bigger character set meaning 
that it can encode more information per emoji than base64 can 
encode per character.


For example ecoji encoded "abcde" looks like this: ""
And base64 encoded one looks like this: "YWJjZGU=".

Even though each emoji is 4 bytes long, there is a noticeable 
difference in size when we are talking about larger chunks of 
data:


---
$ dd if=/dev/urandom bs=4K count=16K of=test.raw
16384+0 records in
16384+0 records out
67108864 bytes (67 MB, 64 MiB) copied, 1.90423 s, 35.2 MB/s
$ dd if=test.raw | ./ecoji-d |  wc -c
67108864 bytes (67 MB, 64 MiB) copied, 6.7699 s, 9.9 MB/s
71591534 # Size increased just by 6%
$ dd if=test.raw | base64 |  wc -c
67108864 bytes (67 MB, 64 MiB) copied, 0.750174 s, 89.5 MB/s
90655837 # 35%(!) increase in size
---

And if we move to real-world scenarios, where web pages are 
gzip'ped most of the time:


---
$ dd if=test.raw | gzip -c | wc -c
67108864 bytes (67 MB, 64 MiB) copied, 5.49022 s, 12.2 MB/s
67119122 # Raw files are terrible for compression
$ dd if=test.raw | ./ecoji-d | gzip -c | wc -c
67108864 bytes (67 MB, 64 MiB) copied, 27.9972 s, 2.4 MB/s
32178275 # 48% improvement
$ dd if=test.raw | base64 | gzip -c | wc -c
67108864 bytes (67 MB, 64 MiB) copied, 10.3381 s, 6.5 MB/s
68892893 # Pretty bad, yeah
---

So yeah, ecoji is better than base64 in everything but speed. 
Speed will be improved. Later.


If you care about the size of the data then you're not going to encode 
anyway.

Same goes for speed.

Besides your encoding isn't going to work with actual web-pages 
anyway, because your encoder doesn't have browser support.


Sure you can encode your data and gzip it, but once it reaches 
the browser and it unzips it, then what? The browser doesn't know 
what to do with the data. You can't even use base64 for http 
headers.


At most it could be used for email clients, since they do support 
"Content-Transfer-Encoding" but browsers don't. Browsers only support 
"Content-Encoding", which is limited to compression schemes such as gzip.
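
To illustrate the email side of that distinction, here is a minimal sketch using Python's standard `email` package. The subject, filename and payload are made up for the example; the point is only that a binary attachment gets a `Content-Transfer-Encoding` header and that the receiving side decodes it from that header.

```python
from email.message import EmailMessage

msg = EmailMessage()
msg["Subject"] = "binary attachment demo"   # illustrative value
msg.set_content("See attachment.")

# Attach raw bytes; the library picks a transfer encoding for them.
payload = b"\x00\x01\x02\xff"
msg.add_attachment(payload,
                   maintype="application",
                   subtype="octet-stream",
                   filename="blob.bin")      # illustrative filename

part = next(msg.iter_attachments())

# The header tells the mail client how to get the original bytes back.
print(part["Content-Transfer-Encoding"])
print(part.get_content() == payload)
```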


Re: Ecoji-d v1.0.0 is released - Base1024 using emojis 

2018-03-15 Thread Anton Fediushin via Digitalmars-d-announce

On Thursday, 15 March 2018 at 09:32:50 UTC, bauss wrote:

Fun, but seems pretty useless in practice.


I disagree. Ecoji (base1024) has bigger character set meaning 
that it can encode more information per emoji than base64 can 
encode per character.


For example ecoji encoded "abcde" looks like this: ""
And base64 encoded one looks like this: "YWJjZGU=".

Even though each emoji is 4 bytes long, there is a noticeable 
difference in size when we are talking about larger chunks of 
data:


---
$ dd if=/dev/urandom bs=4K count=16K of=test.raw
16384+0 records in
16384+0 records out
67108864 bytes (67 MB, 64 MiB) copied, 1.90423 s, 35.2 MB/s
$ dd if=test.raw | ./ecoji-d |  wc -c
67108864 bytes (67 MB, 64 MiB) copied, 6.7699 s, 9.9 MB/s
71591534 # Size increased just by 6%
$ dd if=test.raw | base64 |  wc -c
67108864 bytes (67 MB, 64 MiB) copied, 0.750174 s, 89.5 MB/s
90655837 # 35%(!) increase in size
---
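
As a sanity check on the base64 figure, the expected output size can be computed directly. This is a minimal Python sketch, assuming GNU `base64` with its default wrapping at 76 characters per line; `base64_wrapped_size` is a hypothetical helper name.

```python
import base64


def base64_wrapped_size(n_bytes: int, wrap: int = 76) -> int:
    """Bytes produced by base64 with a newline after every `wrap` characters."""
    chars = 4 * ((n_bytes + 2) // 3)          # 4 output chars per 3 input bytes, padded
    newlines = (chars + wrap - 1) // wrap     # one newline per (possibly partial) line
    return chars + newlines


# Cross-check the formula against the standard library on a small input;
# base64.encodebytes also wraps at 76 characters and ends each line with "\n".
assert base64_wrapped_size(100) == len(base64.encodebytes(bytes(100)))

size = base64_wrapped_size(67108864)
print(size)
print(f"{size / 67108864 - 1:.1%} larger than the input")
```

For 67108864 input bytes this gives 90655837, the same number `wc -c` reported, so the base64 measurement behaves as expected.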

And if we move to real-world scenarios, where web pages are 
gzip'ped most of the time:


---
$ dd if=test.raw | gzip -c | wc -c
67108864 bytes (67 MB, 64 MiB) copied, 5.49022 s, 12.2 MB/s
67119122 # Raw files are terrible for compression
$ dd if=test.raw | ./ecoji-d | gzip -c | wc -c
67108864 bytes (67 MB, 64 MiB) copied, 27.9972 s, 2.4 MB/s
32178275 # 48% improvement
$ dd if=test.raw | base64 | gzip -c | wc -c
67108864 bytes (67 MB, 64 MiB) copied, 10.3381 s, 6.5 MB/s
68892893 # Pretty bad, yeah
---

So yeah, ecoji is better than base64 in everything but speed. 
Speed will be improved. Later.




Re: Ecoji-d v1.0.0 is released - Base1024 using emojis 

2018-03-15 Thread bauss via Digitalmars-d-announce
On Wednesday, 14 March 2018 at 17:30:18 UTC, Anton Fediushin 
wrote:
I'm glad to announce that ecoji-d - pure D implementation of 
ecoji encoding version 1️⃣.0️⃣.0️⃣ is finally released❗


What is ecoji?

Ecoji encodes data as base1024 with an emoji character set. It 
can be used instead of boring and old base64.


Encoding example:

---
$ echo "Base64 is so 1999, isn't there something better?" | ecoji-d

論撚若
---

And decoding:

---
$ echo -n "論撚若" | ecoji-d -d

Base64 is so 1999, isn't there something better?
---


Ecoji-d's features:

✔️ Range interface
✔️ Lazy encoding/decoding
✔️ Low memory usage
✔️ @safe and pure when possible
✔️ Many tests
✔️ Can be used as a library and as a CLI utility


API consists of just 2️⃣ functions:

 `encode`, which does encoding
 `decode`, which does decoding
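
To give an idea of what a lazy base1024 encoder does internally, here is a minimal Python sketch of the bit-packing step. It is not ecoji-d's API and not the real ecoji alphabet: `ten_bit_indices` is a hypothetical helper that only yields indices in the range 0..1023, and a trailing partial group is simply zero-filled, so the real encoding's padding rules are not modeled.

```python
from typing import Iterable, Iterator


def ten_bit_indices(data: Iterable[int]) -> Iterator[int]:
    """Lazily yield 10-bit alphabet indices (0..1023) from a stream of bytes."""
    acc = 0      # bit accumulator
    nbits = 0    # number of unconsumed bits currently held in acc
    for byte in data:
        acc = (acc << 8) | byte
        nbits += 8
        while nbits >= 10:
            nbits -= 10
            yield (acc >> nbits) & 0x3FF
        acc &= (1 << nbits) - 1   # drop the bits that were already emitted
    if nbits:
        # Zero-fill the last partial group (an assumption of this sketch).
        yield (acc << (10 - nbits)) & 0x3FF


# 5 bytes are 40 bits, i.e. exactly four 10-bit groups.
indices = list(ten_bit_indices(b"abcde"))
print(indices)
```

Each index would then be mapped to one entry of a 1024-symbol table. Because the function is a generator, it consumes input byte by byte and never needs the whole input in memory.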


Links:

DUB package page: http://code.dlang.org/packages/ecoji-d
GitHub repository: https://github.com/ohdatboi/ecoji-d
GitHub repository of the reference Go implementation: https://github.com/keith-turner/ecoji


Fun, but seems pretty useless in practice.