Hi John,

On 15 October 2011 06:02, K. John Wu <[email protected]> wrote:
> Hi, Petr,
>
> Thanks for your interest in FastBit.
>
> Regarding isfinite, can you tell me what OS and compiler you are using?
> The function isfinite should be defined in math.h which is included in the
> header.

I use openSUSE Linux 11.4 with the gcc compiler (g++) in the latest version
available in the distro (not sure which it is right now, I can let you know
on Monday). The problem is that the C++ math header <cmath> is included. It
includes the C header <math.h>, but then does "#undef isfinite" and defines
its own isfinite function inside the std namespace. Thus the need for
"using std::isfinite;". I believe this header file is very similar to mine:
http://www.aoc.nrao.edu/php/tjuerges/ALMA/STL/html-3.4.6/cmath-source.html

> FastBit attempts to hold on to things that it has read into memory already
> until the memory is needed for something else. This may explain why it is
> holding on to so much memory. If the memory is needed for the next set of
> tasks, FastBit will give up the oldest data.

That might explain my experience. The problem is that when I try a query
that returns even more records, it uses up all the memory in my machine,
making it unusable. That is when I set
ibis::fileManager::adjustCacheSize(10GB - a lot);
I do this because I need to work with large files, bigger than my RAM. This
allows me to mmap those files into the process virtual memory, but
unfortunately it also allows fastbit to allocate too much memory for itself.

It might be worth a try to modify fileManager not to account for mmapped
memory in its cache size, at least on 64-bit systems where the address space
is really vast. Then the cache size might be set to some reasonable value.
Or, as I suggested earlier, use another cache size variable for mmapped
memory.

> Please let us know if valgrind reports memory leak or other issues.

Valgrind reported memory leaks in SVN version 425, but in 432 it does not
seem to be an issue anymore.
> John

Petr

> On 10/14/11 2:32 AM, Thorgrin wrote:
>
>> Dear John,
>>
>> we are using fastbit to store netflows in IPFIX format and we need to
>> work with large amounts of data.
>>
>> I am using the latest SVN version (432), though it needed some tuning to
>> compile (putting "using std::isfinite;" in part.cpp and mensa.cpp).
>>
>> The workflow is as follows:
>> 1) Open the partition using
>>      ibis::part *part = new ibis::part("path/to/partition", NULL, true);
>> 2) Create a table from this partition
>>      ibis::table *table = ibis::table::create(*part);
>> 3) Perform a query
>>      table = table->select("col1, col2, ...", "col1 = 80 AND ...");
>>      /* also delete original table */
>> 4) Use a cursor to print the result
>>      .....
>>
>> The problem is with the amount of allocated memory.
>> I had to raise the limit for memory allocation in fileManager to allow
>> for mmapping of large files. But this also means that fastbit is
>> allowed to allocate a large amount of physical memory. I believe that
>> there should be two limits, one for physical memory and one for mapping
>> files, because on a 64-bit system I just don't care that a 6GB file is
>> mmapped, but I certainly don't want fastbit to claim so much memory for
>> itself.
>>
>> Nonetheless, I don't understand why it takes so much memory while
>> performing a simple select.
>> Here are some measurements from valgrind:
>>
>> Query 1:
>>
>> Estimating between 26 and 26 records  /* table->estimate() */
>> Created new table, MB in use: 0       /* this is after table::create(*part) */
>> Table filtered, MB in use: 4527       /* this comes from
>>      ibis::fileManager::bytesInUse(), after table->select() */
>>
>> ==19057== HEAP SUMMARY:
>> ==19057==     in use at exit: 0 bytes in 0 blocks
>> ==19057==   total heap usage: 3,165 allocs, 3,165 frees, 63,483,310
>> bytes allocated
>>
>> Query 2:
>> Estimating between 1832 and 1832 records
>> Created new table, MB in use: 0
>> Table filtered, MB in use: 4130
>>
>> ==17557== HEAP SUMMARY:
>> ==17557==     in use at exit: 0 bytes in 0 blocks
>> ==17557==   total heap usage: 44,471 allocs, 44,471 frees, 614,111,365
>> bytes allocated
>>
>> The data for queries 1 and 2:
>> ~/Documents/devel/data/fi2/000000000001/1> ls -lh
>> total 4.3G
>> -rw-r--r-- 1 velan users 611M Oct 13 16:48 e0id1
>> -rw-r--r-- 1 velan users 153M Oct 13 16:48 e0id11
>> -rw-r--r-- 1 velan users 332M Oct 13 17:53 e0id11.idx
>> -rw-r--r-- 1 velan users 306M Oct 13 16:48 e0id12
>> -rw-r--r-- 1 velan users 379M Oct 14 09:03 e0id12.idx
>> -rw-r--r-- 1 velan users 611M Oct 13 16:48 e0id152
>> -rw-r--r-- 1 velan users 611M Oct 13 16:48 e0id153
>> -rw-r--r-- 1 velan users 611M Oct 13 16:48 e0id2
>> -rw-r--r-- 1 velan users  77M Oct 13 16:48 e0id4
>> -rw-r--r-- 1 velan users  23M Oct 13 17:54 e0id4.idx
>> -rw-r--r-- 1 velan users  77M Oct 13 16:48 e0id5
>> -rw-r--r-- 1 velan users  77M Oct 13 16:48 e0id6
>> -rw-r--r-- 1 velan users 153M Oct 13 16:48 e0id7
>> -rw-r--r-- 1 velan users 306M Oct 13 16:48 e0id8
>> -rw-r--r-- 1 velan users  870 Oct 13 16:48 -part.txt
>>
>> The query uses columns with indexes for filtering, and those indexes
>> were generated automatically.
>>
>> ------------------------------------------------------------------------------------------
>> Query 3:
>> Estimating between 3515 and 3515 records
>> Created new table, MB in use: 0
>> Table filtered, MB in use: 2715
>>
>> ==17773== HEAP SUMMARY:
>> ==17773==     in use at exit: 0 bytes in 0 blocks
>> ==17773==   total heap usage: 82,753 allocs, 82,753 frees,
>> 2,500,927,652 bytes allocated
>>
>>
>> Data for query 3:
>> ~/Documents/devel/data/fi2/000000000002/1> ls -lh
>> total 2.8G
>> -rw-r--r-- 1 velan users 400M Oct 13 17:06 e0id1
>> -rw-r--r-- 1 velan users 100M Oct 13 17:06 e0id11
>> -rw-r--r-- 1 velan users 227M Oct 13 17:51 e0id11.idx
>> -rw-r--r-- 1 velan users 200M Oct 13 17:06 e0id12
>> -rw-r--r-- 1 velan users 251M Oct 14 09:02 e0id12.idx
>> -rw-r--r-- 1 velan users 400M Oct 13 17:06 e0id152
>> -rw-r--r-- 1 velan users 400M Oct 13 17:06 e0id153
>> -rw-r--r-- 1 velan users 400M Oct 13 17:06 e0id2
>> -rw-r--r-- 1 velan users  50M Oct 13 17:06 e0id4
>> -rw-r--r-- 1 velan users  15M Oct 13 17:51 e0id4.idx
>> -rw-r--r-- 1 velan users  50M Oct 13 17:06 e0id5
>> -rw-r--r-- 1 velan users  50M Oct 13 17:06 e0id6
>> -rw-r--r-- 1 velan users 100M Oct 13 17:06 e0id7
>> -rw-r--r-- 1 velan users 200M Oct 13 17:06 e0id8
>> -rw-r--r-- 1 velan users  870 Oct 13 17:06 -part.txt
>>
>> The query uses columns with indexes for filtering, and those indexes
>> were generated automatically.
>>
>> ------------------------------------------------------------------------------------------
>>
>> Query 3 returns 3515 records but allocates 2.5GB of memory. This seems
>> like too much when all the data it processes is 2.8GB altogether.
>>
>> If I am doing something wrong here, could you please advise me how to
>> improve my interaction with the fastbit API?
>> Should you require more information, please ask; any advice would be
>> most welcome.
>>
>> Yours sincerely,
>>
>> Petr Velan
>>
>>
>> _______________________________________________
>> FastBit-users mailing list
>> [email protected]
>> https://hpcrdm.lbl.gov/cgi-bin/mailman/listinfo/fastbit-users
