On Mon, Jun 7, 2021, 1:47 PM <[email protected]> wrote:
> https://encode.su/threads/3635-Hutter-Prize-Entry-quot-STARLIT-quot-Open-For-Comments
I am testing it now. It should win the Hutter Prize if it passes, but it won't claim the top spot on the Large Text Benchmark because of the prize's CPU time and memory constraints.

The trick is to reorder the articles in enwik9 to maximize mutual information between adjacent articles. This matters more in memory-constrained contests than when the complete model can fit in RAM. The text is then preprocessed using the dictionary from phda9 and compressed with a modified version of cmix, with parts removed to meet the requirements of 10 GB of memory and 100000/(Geekbench 5 score) hours. For my 2.8 GHz Lenovo laptop's i7-1165G7, with a Geekbench 5 score of 1427, that's 70 hours on one thread. I expect it to take about 50 hours based on smaller test files. The original cmix takes a week with 32 GB.

cmix uses an LSTM neural network. The leader, NNCP, uses a transformer network on a GPU, which is not eligible for the Hutter Prize.

------------------------------------------
Artificial General Intelligence List: AGI
Permalink: https://agi.topicbox.com/groups/agi/T950e4979da3bfbbf-M540c46fbc5ccc9888122b6f6
Delivery options: https://agi.topicbox.com/groups/agi/subscription
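P.S. To make the reordering idea concrete: one cheap way to approximate "mutual information between adjacent articles" is the normalized compression distance (NCD), then greedily chain articles so each one is followed by its nearest remaining neighbor. This is only my sketch of the general technique, not STARLIT's actual algorithm; the function names and the use of zlib (instead of a strong compressor) are my own simplifications.

```python
import zlib

def csize(x: bytes) -> int:
    # Compressed size of x; zlib is a cheap stand-in for a real compressor.
    return len(zlib.compress(x, 9))

def ncd(a: bytes, b: bytes) -> float:
    # Normalized compression distance: small when a and b share information,
    # because compressing their concatenation costs little extra.
    ca, cb, cab = csize(a), csize(b), csize(a + b)
    return (cab - min(ca, cb)) / max(ca, cb)

def greedy_order(articles):
    # Greedy nearest-neighbor chain: start with the first article, then
    # repeatedly append the remaining article most similar to the last one.
    # (Finding the optimal order is a traveling-salesman-type problem.)
    remaining = list(range(len(articles)))
    order = [remaining.pop(0)]
    while remaining:
        last = articles[order[-1]]
        nxt = min(remaining, key=lambda i: ncd(last, articles[i]))
        remaining.remove(nxt)
        order.append(nxt)
    return order

# Toy example: two animal snippets and two finance snippets.
articles = [b"cats purr and chase mice around the house",
            b"dogs bark at cats in the yard",
            b"stock markets fell sharply today",
            b"bond markets rallied after the report"]
print(greedy_order(articles))
```

With a real compressor and enwik9-sized articles this pairwise scan would be far too slow, which is presumably why a dedicated reordering pass (done once, offline) pays off in the time-limited contest.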
