Actually I made a class doing the same thing, "OccTree <http://code.botcompany.de/1003331>" (occurrence tree). With the "add" method, you can add a whole word list at once.
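For context, the full-word, many-words-at-once lookup discussed in this thread can be sketched with an ordinary character trie. This is only an illustration under stated assumptions, not the OccTree source and not the FastSearch code: the names WordTrieDemo, WordTrie and count are hypothetical (only the idea of an "add" that takes a whole word list comes from the message above), matching is case-sensitive, and a "word" is assumed to be a maximal run of letters or digits.

```java
import java.util.*;

public class WordTrieDemo {
    // One trie node: children keyed by character, plus an end-of-word flag.
    static final class Node {
        final Map<Character, Node> next = new HashMap<>();
        boolean isWord;
    }

    // Hypothetical stand-in for a multi-word index; not the real OccTree.
    static final class WordTrie {
        private final Node root = new Node();

        // Adds a whole word list at once.
        void add(Collection<String> words) {
            for (String w : words) {
                if (w.isEmpty()) continue;
                Node n = root;
                for (int i = 0; i < w.length(); i++)
                    n = n.next.computeIfAbsent(w.charAt(i), c -> new Node());
                n.isWord = true;
            }
        }

        // Counts full-word occurrences of the stored words in one left-to-right pass.
        Map<String, Integer> count(String text) {
            Map<String, Integer> hits = new TreeMap<>();
            int i = 0, len = text.length();
            while (i < len) {
                // Skip separators between words.
                if (!Character.isLetterOrDigit(text.charAt(i))) { i++; continue; }
                int start = i;
                Node n = root;
                // Walk the trie along the current word; n becomes null once no stored word can match.
                while (i < len && Character.isLetterOrDigit(text.charAt(i))) {
                    if (n != null) n = n.next.get(text.charAt(i));
                    i++;
                }
                if (n != null && n.isWord)
                    hits.merge(text.substring(start, i), 1, Integer::sum);
            }
            return hits;
        }
    }

    public static void main(String[] args) {
        WordTrie trie = new WordTrie();
        trie.add(Arrays.asList("search", "text", "tree"));
        // prints {search=1, text=2}
        System.out.println(trie.count("search the text; the text is not a subtext"));
    }
}
```

Each character of the text is visited once and costs one hash lookup, so the scan time does not grow with the number of stored words. Substrings such as "text" inside "subtext" are deliberately not reported, which is the full-word restriction described in the quoted message.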
OccTree was made in 2017. Do I beat you? :-D

On Mon, 9 Sep 2019 at 21:01, Stefan Reich <[email protected]> wrote:

> Interesting. The Huffman compression seems quite unrelated to the actual
> algorithm, but I see you do that to save memory.
>
> Your description is not entirely clear to me. You search for _multiple_
> words at once (possibly very many without loss of speed), right? But the
> system only finds full words (or maybe words with a given prefix, but not
> random substrings of words) because that way the search stays O(n) in the
> length of the text which you describe as "really fast".
>
> Searching for substrings too would make this somewhat slower (probably
> O(n*m) with m being the search word length).
>
> Correct so far?
>
> I do like this, I'm always looking for ways to search through a lot of
> text quickly. I usually look for only a few strings at a time though. In
> fact I search my code repository many times a day by simple brute force
> (~45 MB). Might be smart to add some indexing. Although in Java with
> Boyer-Moore, I can get the search time down to, I think, 50 ms.
>
> On Mon, 9 Sep 2019 at 03:35, <[email protected]> wrote:
>
>> Here's one of the things I just made. Try it out. I tested it on
>> Windows. You can search for any word in a huge amount of data really fast
>> as if it is a 1 word search. Runs on CPU. You can swap the 200MB in the src
>> folder. Run it in Visual Studio 2019. You can edit the words I search for
>> in main.cpp line 199.
>>
>> This is useful for when you have many many items you want to search for
>> and you have a large amount of data to look through.
>>
>> https://www.dropbox.com/s/v9vxy1bhpogppkq/FastSearch.rar?dl=0
>>
>> Artificial General Intelligence List <https://agi.topicbox.com/latest>
>> Permalink
>> <https://agi.topicbox.com/groups/agi/T44eb904095b7612b-Md51849a19ede6a3999e03c8e>

--
Stefan Reich
BotCompany.de // Java-based operating systems

------------------------------------------
Artificial General Intelligence List: AGI
Permalink: https://agi.topicbox.com/groups/agi/T44eb904095b7612b-M1fa9cd6af5e615f20aadcba5
Delivery options: https://agi.topicbox.com/groups/agi/subscription
