You can use a language model to insert the missing characters. Insert one
wherever doing so reduces the compressed size of the text under that model.
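
A minimal sketch of that idea, assuming a character trigram-context model
with additive smoothing stands in for the language model and that only 't'
is missing. The training text, ORDER, alpha, and the function names are all
made up for illustration. The ideal code length in bits stands in for the
compressed size.

```python
import math
from collections import Counter

ORDER = 3  # context length in characters (an arbitrary choice)

def train(text, order=ORDER):
    """Count how often each character follows each context of `order` chars."""
    padded = " " * order + text
    ctx_counts, pair_counts = Counter(), Counter()
    for i in range(order, len(padded)):
        ctx = padded[i - order:i]
        ctx_counts[ctx] += 1
        pair_counts[ctx, padded[i]] += 1
    # +1 reserves one slot for "any character never seen in training"
    return ctx_counts, pair_counts, len(set(padded)) + 1

def code_length(text, model, order=ORDER, alpha=0.1):
    """Bits an ideal coder would spend on `text` under the smoothed model."""
    ctx_counts, pair_counts, vocab = model
    padded = " " * order + text
    bits = 0.0
    for i in range(order, len(padded)):
        ctx = padded[i - order:i]
        p = (pair_counts[ctx, padded[i]] + alpha) / (ctx_counts[ctx] + alpha * vocab)
        bits -= math.log2(p)
    return bits

def restore(text, model, missing="t"):
    """Greedy left-to-right pass: keep an insertion only if it saves bits."""
    best = code_length(text, model)
    i = 0
    while i <= len(text):
        candidate = text[:i] + missing + text[i:]
        bits = code_length(candidate, model)
        if bits < best:
            text, best = candidate, bits
            i += 1  # skip the inserted character (so a doubled 't' is never tried)
        i += 1
    return text

# Hypothetical training text; it must contain the character to be restored.
corpus = ("the cat ate food at night. the cat sat on the mat. "
          "at night the cat ate the food. that night the rat ate food too. ")
model = train(corpus)

damaged = "he ca ae food a nigh"
restored = restore(damaged, model)
print(restored, code_length(damaged, model), code_length(restored, model))
```

The greedy pass never makes the text more expensive to code, but it is not
guaranteed to find the best set of insertions. How well it recovers the
original depends on how much training text the model has seen.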

But for spaces, you don't even need them in your training set. The
conditional entropy of the next character is higher where you cross a word
boundary. That is how babies learn to segment continuous speech at 7-10
months, before they learn their first word. I demonstrated this in 2000 on
text without spaces:
https://cs.fit.edu/~mmahoney/dissertation/lex1.html
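
A rough sketch of segmenting by conditional entropy, not necessarily the
method used in the experiment linked above. It assumes a fixed-order
character model trained on text with no spaces, and it places a boundary
wherever the entropy of the next-character distribution is a local peak.
The function names, the order of 2, and the toy corpus are assumptions.

```python
import math
from collections import Counter, defaultdict

def train(text, order=2):
    """Map each context of `order` characters to counts of the next character."""
    followers = defaultdict(Counter)
    for i in range(order, len(text)):
        followers[text[i - order:i]][text[i]] += 1
    return followers

def next_char_entropy(followers, context):
    """Entropy in bits of the next-character distribution after `context`."""
    counts = followers.get(context)
    if not counts:
        return 0.0  # unseen context: no estimate, treat as no evidence
    total = sum(counts.values())
    return -sum(c / total * math.log2(c / total) for c in counts.values())

def segment(text, followers, order=2):
    """Insert a space wherever the next-character entropy is a local peak."""
    h = [0.0] * (len(text) + 1)  # h[i] = uncertainty about text[i]
    for i in range(order, len(text)):
        h[i] = next_char_entropy(followers, text[i - order:i])
    out = []
    for i, ch in enumerate(text):
        if i >= order and h[i] > h[i - 1] and h[i] > h[i + 1]:
            out.append(" ")
        out.append(ch)
    return "".join(out)

# Hypothetical training text with every space removed before training.
corpus = ("the cat ate food at night the dog ate food at noon "
          "the cat sat at night the dog sat at noon").replace(" ", "")
followers = train(corpus)
print(segment("thecatatefoodatnight", followers))
```

With so little training text the boundaries will be noisy. The point is only
that no space ever appears in the training data.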

On Tue, Aug 31, 2021, 6:39 AM <[email protected]> wrote:

> But @Matt, he asked what if spaces are removed how it'll recognize, what
> if 't's are removed??? Ex. 'the cat ate food at night' ... 'he ca ae food a
> nigh'
>
> Can you read wha I am rying o say if I ake away all the ou of his senence
> I jus wroe
>
> ???!?!??!?!!!????

------------------------------------------
Artificial General Intelligence List: AGI
Permalink: 
https://agi.topicbox.com/groups/agi/T90b7756a48658254-Ma9e1bb56af64707cec4dbfd4
Delivery options: https://agi.topicbox.com/groups/agi/subscription
