Yes, but I won't be clustering vectors or using backprop. Later, though, I may store dog-related words under a node, which would use a cluster to quickly learn to relate new words.
Is the dictionary I use from cmix (which turns enwik8.txt from 100 MB to 61 MB) grouping related words? From what I can tell, it just gives words shorter codes. However, looking at the tests below, it often seems to give related words similar codes. If so, that can't be good, since it is not considering probabilities... I wonder how much of my compression gain comes from the shorter codes and how much from the grouping?

and and and and then then then then mom when dad
and and and and then then then then mother when father
and and and and then then then then dog when cat

Ô¤ Ô¤ Ô¤ Ô¤ mom ½ dad
Ô¤ Ô¤ Ô¤ Ô¤ ê ½ ê¡
Ô¤ Ô¤ Ô¤ Ô¤ ò¹ ½ ò¸

after after after after mom after after after after dad after after after after
after after after after mother after after after after father after after after after
after after after after dog after after after after cat after after after after horse after after after after pig after after after after man after after after after
after after after after food after after after after meal after after after after lunch after after after after dinner after after after after
after after after after walk after after after after run after after after after jog after after after after move after after after after

¾ ¾ ¾ ¾ mom ¾ ¾ ¾ ¾ dad ¾ ¾ ¾ ¾
¾ ¾ ¾ ¾ ê ¾ ¾ ¾ ¾ ê¡ ¾ ¾ ¾ ¾
¾ ¾ ¾ ¾ ò¹ ¾ ¾ ¾ ¾ ò¸ ¾ ¾ ¾ ¾ ò° ¾ ¾ ¾ ¾ pig ¾ ¾ ¾ ¾ ê± ¾ ¾ ¾ ¾
¾ ¾ ¾ ¾ æ¡ ¾ ¾ ¾ ¾ ÷ì› ¾ ¾ ¾ ¾ ÷é€ ¾ ¾ ¾ ¾ öÔÀ ¾ ¾ ¾ ¾
¾ ¾ ¾ ¾ ÑŒ ¾ ¾ ¾ ¾ ÐÉ ¾ ¾ ¾ ¾ jog ¾ ¾ ¾ ¾ а ¾ ¾ ¾ ¾

------------------------------------------
Artificial General Intelligence List: AGI
Permalink: https://agi.topicbox.com/groups/agi/T192296c5c5a27230-M7d50af85c3e90222e372d498
Delivery options: https://agi.topicbox.com/groups/agi/subscription
