Natural language is ambiguous at every level, including the token level. Is "someone" one word or two? Language models handle this by mixing the predictions given by the contexts "some", "one", and "someone".
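As a rough illustration of what mixing predictions from several contexts can look like, here is a minimal sketch of logistic mixing, the approach used in context-mixing compressors. It is an assumption for illustration only: the function names, the example probabilities, the initial weights, and the learning rate are all made up, and nothing here is meant as how any particular language model actually does it.

```python
import math

def stretch(p):
    # Map a probability in (0, 1) to the logistic domain: ln(p / (1 - p)).
    return math.log(p / (1.0 - p))

def squash(x):
    # Inverse of stretch: map a real number back to a probability.
    return 1.0 / (1.0 + math.exp(-x))

def mix(probs, weights):
    # Combine per-context predictions as a weighted sum in the stretched domain.
    return squash(sum(w * stretch(p) for w, p in zip(weights, probs)))

def update(probs, weights, bit, lr=0.1):
    # One gradient step on coding cost: contexts that pointed toward the
    # observed bit gain weight, contexts that pointed away lose weight.
    err = bit - mix(probs, weights)
    return [w + lr * err * stretch(p) for w, p in zip(weights, probs)]

# Hypothetical predictions that the next bit is 1, given each context.
probs = [0.6, 0.3, 0.9]      # contexts "some", "one", "someone" (made-up values)
weights = [0.3, 0.3, 0.3]    # arbitrary starting weights

before = mix(probs, weights)
weights = update(probs, weights, bit=1)
after = mix(probs, weights)
print(before, after)
```

If the observed bit is 1, the weight on the "someone" context grows the most, because it predicted 1 most strongly. So the mixer learns which tokenization of the input is the more useful context instead of committing to one in advance.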
Using a fixed dictionary is a compromise that trades accuracy for less computation, like all tradeoffs in data compressors.
------------------------------------------
Artificial General Intelligence List: AGI
Permalink: https://agi.topicbox.com/groups/agi/T682a307a763c1ced-M44b0fc5b236911fe9a971c6d