You can't compress data to much less than about 10% of its original size, and that figure is for lossy music (MP3), or a 1-bit ADC if you're technically clever. Text will never get below the point where it can't be lossless anymore. Compression is guaranteed to fail if you ask for more than a sensible entropy limit allows.
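To make the entropy limit concrete, here is a rough sketch in Python using the standard zlib module. The helper names (order0_entropy_bits, compressed_ratio) and the two sample inputs are my own, made up for illustration. Random bytes sit at the limit and do not shrink at all, while highly repetitive text shrinks a lot because it carries very little information to begin with.

```python
import math
import os
import zlib
from collections import Counter


def order0_entropy_bits(data: bytes) -> float:
    """Shannon entropy in bits per byte, treating each byte as an independent symbol."""
    counts = Counter(data)
    n = len(data)
    return -sum(c / n * math.log2(c / n) for c in counts.values())


def compressed_ratio(data: bytes) -> float:
    """Compressed size divided by original size, using zlib at its highest level."""
    return len(zlib.compress(data, 9)) / len(data)


# Hypothetical sample inputs, 100,000 bytes each.
random_bytes = os.urandom(100_000)           # close to 8 bits of entropy per byte
repetitive = b"the quick brown fox " * 5_000  # almost no new information after the first phrase

for name, data in [("random", random_bytes), ("repetitive", repetitive)]:
    print(name,
          "entropy (bits/byte):", round(order0_entropy_bits(data), 3),
          "compressed/original:", round(compressed_ratio(data), 4))
```

Note that the per-byte entropy here is only an order-0 estimate: it ignores repetition across bytes, which is why the repetitive input compresses far below what that number alone would suggest. The random input shows the real wall: no lossless compressor can shrink it on average.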
But there is an answer, and it doesn't involve a dataset. It involves procedural generation, right?
------------------------------------------
Artificial General Intelligence List: AGI
Permalink: https://agi.topicbox.com/groups/agi/T8d109c89dd30f9b5-Me6c56faea408fff13c4fcaa6
Delivery options: https://agi.topicbox.com/groups/agi/subscription
