Dear Dr. Wu,

I have been using your FastBit program for a few months now and have finally 
got to understand it a bit better. I use it in economics research looking 
through large real-time historical datasets. I like it a lot, and I have a Java 
JNI interface which we put together. The speed is really impressive :).

I have structured the data into three tables, and further divided the data into 
a partition for each day, so 10 days of data will have 30 partitions. The 
partition sizes are 12 GB, 2.6 MB, and 3.6 GB for a day's worth; some days 
have more data. Each day is represented by a three-partition set.

Right now, running the calculations for one day takes about 120 minutes. I am 
trying to speed this up and have some questions. The code runs through the 
data, performing calculations for a set of N events. For each event I run a 
"select" on the three partitions for a day.

1)      I noticed that the fileManager caches files, but it only seems to do so 
for the "select clause variables" and not for the "where clause variables". 
When I run strace on Linux, I see it opening and then mmapping the "where 
clause files" again and again for each select. I have also tried to GetFile the 
"where clause files", but noticed that the nacc, nref and last used times are 
never incremented. Do the where clause variables ever hit the fileManager? I 
also never see a hit on an XXX.idx (index file), though I guess it must be 
using indexes somehow. I suspect my slow speed is due to the repeated opening 
and closing of the same files. I can open a roFile in the fileManager, but it 
never gets any hits for some reason, so it makes no difference to the speed.

2)      In general, is it better to (A) run one big query and pull out a large 
chunk of data, or (B) run lots of little queries? (B) is easier to code, but 
would (A) be faster? I was thinking of selecting, say, 100 events (an 
"Event-Chunk") into an in-memory three-table set, and then doing another select 
on this "Event-Chunk". Is there a better way to do this using FastBit?
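
To make the comparison in question 2 concrete, here is a rough sketch in plain 
Python. It does not use the FastBit API: run_query is a hypothetical stand-in 
for a single select on one partition, and the row layout (dicts with an 
event_id field) and the chunk size of 100 are assumptions made only for 
illustration. It shows how the number of selects differs between the two 
options; it says nothing about actual timings.

```python
def chunked(items, size):
    """Yield consecutive slices of `items` with at most `size` elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def run_query(table, event_ids):
    """Hypothetical stand-in for one select against a partition.

    Returns every row whose event_id is in `event_ids`, in table order.
    """
    wanted = set(event_ids)
    return [row for row in table if row["event_id"] in wanted]


def per_event(table, event_ids):
    """Option (B): issue one query per event."""
    results = {}
    queries = 0
    for eid in event_ids:
        results[eid] = run_query(table, [eid])
        queries += 1
    return results, queries


def per_chunk(table, event_ids, chunk_size=100):
    """Option (A): one query per Event-Chunk, then split the rows in memory.

    Assumes the ids in `event_ids` are unique.
    """
    results = {eid: [] for eid in event_ids}
    queries = 0
    for chunk in chunked(event_ids, chunk_size):
        rows = run_query(table, chunk)  # one larger select for the whole chunk
        queries += 1
        for row in rows:                # second-stage "select" done in memory
            results[row["event_id"]].append(row)
    return results, queries
```

Both functions return the same rows per event; with 250 events and a chunk 
size of 100, option (B) issues 250 selects and option (A) issues 3.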

Many thanks for your kind help and advice.

Warmest regards, Mike.

DISCLAIMER: This e-mail message and any attachments are intended solely for the 
use of the individual or entity to which it is addressed and may contain 
information that is confidential or legally privileged. If you are not the 
intended recipient, you are hereby notified that any dissemination, 
distribution, copying or other use of this message or its attachments is 
strictly prohibited. If you have received this message in error, please notify 
the sender immediately and permanently delete this message and any attachments.

_______________________________________________
FastBit-users mailing list
[email protected]
https://hpcrdm.lbl.gov/cgi-bin/mailman/listinfo/fastbit-users
