Hi, Mike,

Thanks for your interest in FastBit.

If you have enough memory on your machine, you can load all indexes 
into memory by calling ibis::part::loadIndexes with the second 
argument readall set to a value greater than 0.  This should prevent 
FastBit from ever attempting to read the index files again.

It seems you have quite a lot of data, so you might not want to force 
all the indexes to be loaded into memory.  In this case, you are 
probably looking for an option that memory maps the whole index 
file.  This can be accomplished by setting the parameter

preferMMapIndex = true

This parameter can be set in a parameter file or can be added in a 
program by calling

ibis::gParameters().add("preferMMapIndex", "true");

before performing queries.

It might be helpful to make memory mapping the default option in 
FastBit.  I will do some experiments and see how best to handle the 
default option.

I am not sure I understand your queries well enough to give you any 
useful advice.  Typically, if your small queries can be combined into 
larger ones, FastBit needs to do less work, such as reading files and 
parsing queries.  However, this is not always the case.  To be more 
specific, I would need to understand your queries a little better.
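
As a rough illustration of combining small queries, the sketch below 
joins several per-event conditions into a single where clause, so that 
one larger query could replace many small ones.  The helper name 
combineConditions and the column name ts are hypothetical, and nothing 
here calls FastBit itself.  Whether the combined query is actually 
faster depends on the data and the indexes.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: join several small conditions into one
// where-clause string.  Each condition is wrapped in parentheses and
// the pieces are joined with OR, so the result matches any row that
// satisfies at least one of the original conditions.
std::string combineConditions(const std::vector<std::string>& conds) {
    std::ostringstream out;
    for (std::size_t i = 0; i < conds.size(); ++i) {
        if (i > 0) {
            out << " OR ";
        }
        out << '(' << conds[i] << ')';
    }
    return out.str();
}

// Example use (ts is an assumed column name):
//   std::vector<std::string> perEvent = {"ts >= 100 AND ts < 200",
//                                        "ts >= 450 AND ts < 500"};
//   std::string where = combineConditions(perEvent);
//   // where == "(ts >= 100 AND ts < 200) OR (ts >= 450 AND ts < 500)"
```

The rows returned by such a combined query would still need to be 
assigned back to their individual events in the calling program.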

John


On 4/5/11 9:18 AM, Chong, Michael wrote:
>
>
> Dear Dr. Wu,
>
> I have been using your FastBit program for a few months now and
> have finally got to understand it a bit better. I use it in
> economics research looking through large real-time historical
> datasets. I like it a lot, and I have a Java JNI interface which we
> put together. The speed is really impressive :).
>
> I have structured the data into three tables, and further divided
> the data into a partition for each day; so 10 days of data will have
> 30 partitions. The partition sizes are 12 GB, 2.6 MB, and 3.6 GB
> for a day's worth. Some days have more data. Each day is
> represented by a three-partition set.
>
> Right now, running the calculations for a day takes about 120
> minutes. I am trying to speed this up and have some questions. The
> code runs through the data, performing calculations for a set of N
> events. For each event I run a "select" on the three partitions for
> a day.
>
> 1)      I noticed that the fileManager caches files. But it only
> seems to do so for the "select clause variables" and not for the
> "where clause variables". When I run a strace on Linux, I see it
> opening and then mmaping the "where clause files" again and again
> for each select. I have also tried to GetFile the "where clause
> files", but noticed that the nacc, nref and last used times are
> never incremented. Do the where clause variables ever hit the
> fileManager? I also never see a hit on a XXX.idx (index file)
> either. But I guess it must be using indexes somehow. I suspect my
> slow speed is due to the opening and closing of the same file. I
> can open a roFile in the fileManager, but this never gets any hits
> for some reason, so makes no difference to the speed.
>
> 2)      In general, is it better to (A) run one big query and pull
> out a large chunk of data, or (B) run lots of little queries? (B)
> is easier to code, but would (A) be faster? I was thinking of
> selecting, say, 100 events (an "Event-Chunk") into an in-memory
> three-table set, then doing another select on this "Event-Chunk".
> Is there a better way to do this using FastBit?
>
> Many thanks for your kind help and advice.
>
> Warmest regards, Mike.
_______________________________________________
FastBit-users mailing list
[email protected]
https://hpcrdm.lbl.gov/cgi-bin/mailman/listinfo/fastbit-users