On 3 Sep., 17:14, Barney <[email protected]> wrote:
> Is it realistic to use HashSet to determine if a large amount of
> string data (2 000 000 strings of length 20) is composed of unique
> entry ?

I needed something like this recently and used a radix tree to store all the strings. It is quite space-saving: I stored 3M customer names and addresses in memory, and it was no problem memory-wise.

Practical implementation: http://code.google.com/p/radixtree/

While building up the radix tree, you can easily check whether you have any duplicates.

--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups "The Java Posse" group.
To post to this group, send email to [email protected]
To unsubscribe from this group, send email to [email protected]
For more options, visit this group at http://groups.google.com/group/javaposse?hl=en
-~----------~----~----~----~------~----~------~--~---
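
To make the "check for duplicates while building the tree" idea from the reply concrete, here is a rough Java sketch of a radix tree whose insert reports whether the string was already there. This is only an illustration under my own assumptions: the class `RadixDuplicateCheck` and its methods `add` and `allUnique` are hypothetical names, not the API of the radixtree project linked above, and the sketch is not tuned for memory use.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical minimal radix tree; add() returns false for a duplicate. */
public class RadixDuplicateCheck {

    private static final class Node {
        String label;      // edge label leading into this node (non-empty except for the root)
        boolean terminal;  // true if a stored string ends exactly at this node
        final Map<Character, Node> children = new HashMap<>();

        Node(String label, boolean terminal) {
            this.label = label;
            this.terminal = terminal;
        }
    }

    private final Node root = new Node("", false);

    /** Inserts s and returns true, or returns false if s was already stored. */
    public boolean add(String s) {
        Node node = root;
        int pos = 0;
        while (pos < s.length()) {
            Node child = node.children.get(s.charAt(pos));
            if (child == null) {
                // No edge starts with this character: hang the rest of s off as a new leaf.
                node.children.put(s.charAt(pos), new Node(s.substring(pos), true));
                return true;
            }
            // Length of the common prefix of the edge label and the rest of s.
            int max = Math.min(child.label.length(), s.length() - pos);
            int common = 0;
            while (common < max && child.label.charAt(common) == s.charAt(pos + common)) {
                common++;
            }
            if (common < child.label.length()) {
                // Only part of the edge matches: split it at the shared prefix.
                Node split = new Node(child.label.substring(0, common), false);
                node.children.put(split.label.charAt(0), split);
                child.label = child.label.substring(common);
                split.children.put(child.label.charAt(0), child);
                child = split;
            }
            pos += common;
            node = child;
        }
        if (node.terminal) {
            return false; // s ends on a node that already marks a stored string
        }
        node.terminal = true;
        return true;
    }

    /** Returns true if no string occurs twice; stops at the first duplicate. */
    public static boolean allUnique(Iterable<String> items) {
        RadixDuplicateCheck tree = new RadixDuplicateCheck();
        for (String item : items) {
            if (!tree.add(item)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        List<String> unique = Arrays.asList("romane", "romanus", "romulus");
        List<String> repeated = Arrays.asList("romane", "romanus", "romane");
        System.out.println(allUnique(unique));   // prints true
        System.out.println(allUnique(repeated)); // prints false
    }
}
```

The HashSet approach from the original question follows the same pattern, since `Set.add` also returns false when the element is already present. The difference is that the tree stores a shared prefix only once.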
