All,

I'm new to FastBit and am investigating whether I can use it for a Windows 7 64-bit desktop application. For now I'm using ardea to convert the data to the required binary format. Most of my data are string values that could be stored as unique integer values with the help of a dictionary. I understand from the documentation that the FastBit library provides this functionality for category columns. However, I can't get ardea to create the dictionary files in the output directory for category columns.
- Can ardea be configured to create the dictionary files for category columns?
- Are category columns represented as strings or as integer values in the binary format (if a dictionary file is not created)?
- What happens if a category value doesn't occur in the user-supplied dictionary, or if the user-supplied dictionary file doesn't exist at all?
- Is it possible to query the category data using a range of integer values (that represent the category values)?

The reason I ask is that for this application it makes sense to group the category values together (on many levels for most columns). I hope to use an ordered integer representation to improve the performance of querying those category values (see example below) without storing additional grouping information for all category values for all records. The grouping/levels logic would then be implemented in the application logic (preferably by using the FastBit dictionary).

Best regards,
Matthieu

Example of grouping of category values:

IntegerRepresentation,Value,Level0,Level1,Level2,etc
1,Jimmy'sBoa,BoaConstrictor,Snake,Reptile
2,Kaa,BoaConstrictor,Snake,Reptile
3,Nagini,Python,Snake,Reptile
4,Winnie-the-Pooh,BrownBear,Bear,Mammal
5,Baloo,BrownBear,Bear,Mammal

Example of query: All Snakes would translate to

Value>0 and Value<4

instead of

Value==Jimmy'sBoa or Value==Kaa or Value==Nagini
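
The grouping idea can be sketched in plain Python, independent of FastBit. This is only an illustration under stated assumptions: it makes no FastBit or ardea calls, the helper names `group_range` and `range_condition` are hypothetical, and the condition string it produces is a generic format rather than confirmed FastBit query syntax. It looks up the integer codes belonging to a group at a given level and returns them as one inclusive range, refusing to do so when the codes are not contiguous.

```python
# Rows from the example: (integer code, value, Level0, Level1, Level2).
ROWS = [
    (1, "Jimmy'sBoa", "BoaConstrictor", "Snake", "Reptile"),
    (2, "Kaa", "BoaConstrictor", "Snake", "Reptile"),
    (3, "Nagini", "Python", "Snake", "Reptile"),
    (4, "Winnie-the-Pooh", "BrownBear", "Bear", "Mammal"),
    (5, "Baloo", "BrownBear", "Bear", "Mammal"),
]


def group_range(rows, level, group):
    """Return (lo, hi), the inclusive code range covering `group` at `level`.

    `level` is 0-based (0 = Level0). Assumes the integer codes are unique.
    Raises ValueError if the group is unknown or its codes are not
    contiguous, because a single range would then match other values too.
    """
    codes = sorted(row[0] for row in rows if row[2 + level] == group)
    if not codes:
        raise ValueError(f"unknown group: {group!r}")
    lo, hi = codes[0], codes[-1]
    if codes != list(range(lo, hi + 1)):
        raise ValueError(f"codes for {group!r} are not contiguous")
    return lo, hi


def range_condition(column, lo, hi):
    """Format the range as a condition string (hypothetical, generic syntax)."""
    return f"{column} >= {lo} and {column} <= {hi}"


lo, hi = group_range(ROWS, level=1, group="Snake")
print(range_condition("Value", lo, hi))  # Value >= 1 and Value <= 3
```

The range only works at every level if the codes are assigned in hierarchy order, so that each group at each level occupies one unbroken run of integers, as in the table above.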
_______________________________________________ FastBit-users mailing list [email protected] https://hpcrdm.lbl.gov/cgi-bin/mailman/listinfo/fastbit-users
