Hi Alec,

Now that you've made the changes we discussed on IRC, I've merged your work into the main repository. Thanks for the contribution!

One issue I found, however, is that these two unit tests from bloom-filters-tests.factor cause Factor to die with an 'out of memory' error on my 64-bit Mac OS X build:

    [ oversized-filter-params size-bloom-filter ] [ capacity-error? ] must-fail-with
    [ oversized-filter-params <bloom-filter> ] [ capacity-error? ] must-fail-with

I've commented these tests out for now, but I'd like you to fix the problem. Could it be related to the fact that the maximum array size on 64-bit is very, very large?

Slava

On Thu, May 7, 2009 at 10:21 PM, Alec Berryman <[email protected]> wrote:
> I implemented a Bloom filters vocab:
>
> git://github.com/alec/factor.git in the bloom-filters branch
>
> It's still a bit rough around the edges, but it's usable and has both
> tests and documentation. Any feedback is appreciated; if it looks
> useful, please pull it into Factor.
>
> On a 1.4GHz 32-bit Pentium M, I can create a filter from the ~100k words
> in /usr/share/dict/words in about a second and look them all back up in
> about the same amount of time. The false positive rate is ~10x what the
> math predicts it should be; there are some notes in the code about how
> that could be improved.
>
> I have a question on error handling. If my math is right,
> max-array-capacity on linux-x86-32 means that the largest bit-array I
> can create is about 16MB. That's a lot of bits, but not that many.
> What's the best way to signal to the user, "I can't create something
> that big?"
>
> I see that some arrays will signal from the VM, but that doesn't look
> particularly accessible for my code. The other behavior I saw was from
> the bit-arrays vocab, which will effectively mod the number of bits
> requested by max-array-capacity and return a surprisingly-sized array.
> SBCL will yell at you if you try to store a non-fixnum into a fixnum
> slot; I would find that behavior useful from Factor.
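
On the error-handling question quoted above, one possible way to signal "I can't create something that big" is to check the requested bit count against a limit before allocating, and throw a dedicated error instead of letting the count wrap. This is only a minimal sketch, not the code in the bloom-filters vocab: the names bit-count-too-large and check-bit-count are hypothetical, and the limit is passed in explicitly rather than derived from max-array-capacity.

```factor
USING: kernel math tools.test ;
IN: scratchpad

! Hypothetical error class (not from the bloom-filters vocab).
! ERROR: also defines the bit-count-too-large? predicate and a
! bit-count-too-large word that throws an instance.
ERROR: bit-count-too-large requested limit ;

! Leave n-bits on the stack if it does not exceed limit;
! otherwise throw bit-count-too-large.
: check-bit-count ( n-bits limit -- n-bits )
    2dup > [ bit-count-too-large ] [ drop ] if ;

! Same shape as the two tests above: the quotation must fail
! with an error that satisfies the predicate.
[ 200 100 check-bit-count ] [ bit-count-too-large? ] must-fail-with
```

Because the comparison happens before any array is created, a check like this fails the same way regardless of how large the platform's maximum array size is, provided the limit passed in is itself small enough to allocate.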

_______________________________________________
Factor-talk mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/factor-talk
