The paper mentioned improving compression of enwik8 from 0.99 to 0.93 bits
per character but gives no details or citation. enwik8 is from my large
text benchmark and is the test file for the Hutter prize. The current
record is actually 1.22 bits per character and I haven't received an entry
from them. I am on the prize committee.
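For readers unfamiliar with the metric: bits per character (bpc) is just the compressed size in bits divided by the number of input characters. A minimal sketch, with an illustrative compressed size (not an actual contest result):

```python
# bpc = 8 * compressed_size_in_bytes / original_size_in_characters.
# enwik8 is 10^8 bytes, so 1.22 bpc corresponds to a compressed
# size of about 15.25 MB. The size below is illustrative only.

def bits_per_character(compressed_bytes: int, original_chars: int) -> float:
    return 8 * compressed_bytes / original_chars

ENWIK8_SIZE = 100_000_000  # 10^8 bytes

print(bits_per_character(15_250_000, ENWIK8_SIZE))  # 1.22
```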

text8 is a clean version of enwik8 with only lowercase letters and spaces.
enwik8 is 100 MB of Wikipedia text with some XML formatting.
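A rough sketch of the text8-style cleanup described above, reduced to the letters-and-spaces part (the real preprocessing also strips the XML/wiki markup and spells out digits, which this minimal version does not attempt):

```python
import re

def clean(text: str) -> str:
    # Lowercase, then collapse every run of non-letters into one space,
    # leaving only the characters a-z and single spaces.
    text = text.lower()
    text = re.sub(r'[^a-z]+', ' ', text)
    return text.strip()

print(clean("Hello, World 42!"))  # hello world
```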

On Thu, Feb 14, 2019, 5:28 PM Robert Levy <[email protected]> wrote:

> https://blog.openai.com/better-language-models/
>
> Impressive work. They're using the technique introduced in the "Attention
> Is All You Need" paper, called "transformers". See also:
> http://jalammar.github.io/illustrated-transformer/
>

------------------------------------------
Artificial General Intelligence List: AGI
Permalink: 
https://agi.topicbox.com/groups/agi/T709a492ffd52fb84-M35a1e191bf61bcdb8fb065f6
Delivery options: https://agi.topicbox.com/groups/agi/subscription
