Dear Dr. Wu,
I have been using your FastBit program for a few months now and have finally come to understand it a bit better. I use it in economics research, looking through large real-time historical datasets. I like it a lot, and I have a Java JNI interface which we put together. The speed is really impressive :).

I have structured the data into three tables and further divided the data into one partition per day, so 10 days of data will have 30 partitions. The partition sizes are 12 GB, 2.6 MB, and 3.6 GB for a day's worth. Some days have more data. Each day is represented by a three-partition set.

Right now, running the calculations for one day takes about 120 minutes. I am trying to speed this up and have some questions. The code runs through the data, performing calculations for a set of N events. For each event I run a "select" on the three partitions for a day.

1) I noticed that the fileManager caches files, but it only seems to do so for the "select clause variables" and not for the "where clause variables". When I run strace on Linux, I see it opening and then mmapping the "where clause files" again and again for each select. I have also tried to GetFile the "where clause files", but noticed that the nacc, nref and last-used times are never incremented. Do the where clause variables ever hit the fileManager? I also never see a hit on an XXX.idx (index file), but I guess it must be using indexes somehow. I suspect my slow speed is due to the repeated opening and closing of the same file. I can open a roFile in the fileManager, but for some reason it never gets any hits, so it makes no difference to the speed.

2) In general, is it better to (A) run one big query and pull out a large chunk of data, or (B) run lots of little queries? (B) is easier to code, but would (A) be faster? I was thinking of selecting, say, 100 events (an "Event-Chunk") into an in-memory three-table set, and then doing another select on this "Event-Chunk". Is there a better way to do this using FastBit?
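To make options (A) and (B) in question 2 concrete, here is a rough Python sketch of the two access patterns. It does not use FastBit at all: the `scan` helper, the row layout, and the column names `event_id` and `price` are made up for illustration, and a plain list stands in for a partition. It only shows that both patterns return the same rows while (A) passes over the data once per chunk instead of once per event.

```python
# Hypothetical stand-in for one day's partition: a plain list of rows.
# The column names (event_id, price) are invented for this sketch.
rows = [{"event_id": i % 5, "price": float(i)} for i in range(20)]


def scan(partition, predicate):
    """Stand-in for one select: a single full pass over the partition."""
    return [row for row in partition if predicate(row)]


def per_event(partition, event_ids):
    """Option (B): one select per event, so one pass over the data per event."""
    return {
        e: scan(partition, lambda row, e=e: row["event_id"] == e)
        for e in event_ids
    }


def chunked(partition, event_ids):
    """Option (A): one select for the whole Event-Chunk, then split in memory."""
    wanted = set(event_ids)
    chunk = scan(partition, lambda row: row["event_id"] in wanted)
    by_event = {e: [] for e in event_ids}
    for row in chunk:
        by_event[row["event_id"]].append(row)
    return by_event


events = [0, 1, 2]
same_result = per_event(rows, events) == chunked(rows, events)
```

Whether (A) is actually faster in FastBit depends on how the where clause is evaluated and cached, which is exactly what I am unsure about in question 1.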
Many thanks for your kind help and advice.

Warmest regards,
Mike.

DISCLAIMER: This e-mail message and any attachments are intended solely for the use of the individual or entity to which it is addressed and may contain information that is confidential or legally privileged. If you are not the intended recipient, you are hereby notified that any dissemination, distribution, copying or other use of this message or its attachments is strictly prohibited. If you have received this message in error, please notify the sender immediately and permanently delete this message and any attachments.

_______________________________________________
FastBit-users mailing list
[email protected]
https://hpcrdm.lbl.gov/cgi-bin/mailman/listinfo/fastbit-users
