Hi, Dominique and Jon,

I believe that using the .int files is always more efficient than reading the strings from the raw data files. In terms of memory usage, reading the positions of the strings is nearly as expensive as reading the integer version of the string values (which is what the .int files contain). That is to say, the new option will not use more memory than the old option in normal cases. If the string positions do not fit in memory, the existing implementation will attempt to read them one at a time. That fallback definitely uses less memory, but it is extremely slow. As long as there is sufficient memory, we try to avoid this "low-memory" option.
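To make the memory comparison concrete, here is a minimal Python sketch, not FastBit code. It assumes a category column is stored as a sorted dictionary of distinct strings plus one integer code per row (standing in for the .int file), and that the raw data file holds null-terminated strings located by byte offsets. The function names, the 4-byte code width, and the 8-byte offset width are illustrative assumptions, not taken from the FastBit sources.

```python
from array import array


def encode_category(values):
    """Return (dictionary, codes): the sorted distinct strings and one integer code per row."""
    dictionary = sorted(set(values))
    index = {s: i for i, s in enumerate(dictionary)}
    # One unsigned integer per row; this stands in for the content of an .int file.
    codes = array("I", (index[v] for v in values))
    return dictionary, codes


def string_positions(values):
    """Return the byte offset of each string, assuming null-terminated strings in a raw file."""
    positions = array("q")  # assumed 8-byte signed offsets
    offset = 0
    for v in values:
        positions.append(offset)
        offset += len(v.encode("utf-8")) + 1  # +1 for the assumed null terminator
    return positions


def decode(dictionary, codes):
    """Recover the string values from the integer codes."""
    return [dictionary[c] for c in codes]


rows = ["red", "green", "red", "blue", "green", "red"]
dictionary, codes = encode_category(rows)
positions = string_positions(rows)

# Both approaches keep one integer per row in memory.
codes_bytes = len(codes) * codes.itemsize
positions_bytes = len(positions) * positions.itemsize
```

Under these assumptions both arrays hold exactly one entry per row, so the integer codes need no more memory than the string positions. Decoding from the codes is a dictionary lookup per row, whereas the positions still require a seek and read in the raw file for every string.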
Anyway, the code for utilizing .int files was incomplete, so we are very happy to get the contribution from Dominique. Thanks, Dominique.

John

On 1/23/12 2:03 PM, Dominique Prunier wrote:
> So far, i tried a ~20 values CATEGORY and a 100K values CATEGORY,
> both of them where significantly faster (with the .int file) but i
> haven't analyzed the memory footprint of these. I'd expect it to be
> slightly higher since i'm loading first the list of keys which is
> converted to the list of values.
>
> Thanks,
>
_______________________________________________
FastBit-users mailing list
[email protected]
https://hpcrdm.lbl.gov/cgi-bin/mailman/listinfo/fastbit-users
