Hi, I have a question about results I’m seeing from subqueries over multiple
partitions. The problem only occurs on queries where I point at the parent
directory of the partitions. The gist is that as I issue sub-selects where 1=1,
I seem to lose rows in the resulting table. In the output below, tbl1 is
correct. tbl2 is missing a 2. tbl3 is missing a 2 and a 5. Going from tbl3 to
tbl4 (and beyond) does not lose any more data. I see this issue on both CentOS
and OS X. I am using FastBit 1.3.8. Does anyone know why this could be?
Thanks,
Sean
OUTPUT:
* tbl1 (rows: 6) -----------------------
1
2
2
3
4
5
* tbl2 (rows: 5) -----------------------
1
2
3
4
5
* tbl3 (rows: 4) -----------------------
1
2
3
4
* tbl4 (rows: 4) -----------------------
1
2
3
4
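To double-check which rows went missing at each step, the dumps above can be compared as multisets. The following is only a minimal sketch using the standard library; `lostRows` is a hypothetical helper introduced here for illustration, not part of FastBit, and the values are copied by hand from the output above.

```cpp
#include <map>
#include <vector>

// Hypothetical helper (not a FastBit function): returns the values that
// appear in `before` more often than in `after`, once per lost copy,
// in ascending order.
std::vector<long> lostRows(const std::vector<long>& before,
                           const std::vector<long>& after) {
    std::map<long, long> counts;
    for (long v : before) ++counts[v];   // count occurrences before the select
    for (long v : after) --counts[v];    // subtract occurrences that survived
    std::vector<long> lost;
    for (const auto& kv : counts)
        for (long i = 0; i < kv.second; ++i)  // skipped when count <= 0
            lost.push_back(kv.first);
    return lost;
}
```

Called with the tbl1 and tbl2 values it should report a single lost 2, and with the tbl1 and tbl3 values a lost 2 and a lost 5, which matches the description at the top of this message.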
PROGRAM:
#include <iostream>
#include <ibis.h>
using namespace std;
/*
partitions generated with:
ardea -d index/a -m "visitor_id:long" -t a.csv
ardea -d index/b -m "visitor_id:long" -t b.csv
a.csv:
1
2
b.csv:
2
3
4
5
*/
int main(int argc, const char* argv[]) {
    ibis::init();
    // Open the parent directory that holds partitions a and b.
    ibis::table* tbl1 = ibis::table::create("data/parts/index");
    // Each select uses an always-true condition, so I expect every row to survive.
    ibis::table* tbl2 = tbl1->select("visitor_id", "1=1");
    ibis::table* tbl3 = tbl2->select("visitor_id", "1=1");
    ibis::table* tbl4 = tbl3->select("visitor_id", "1=1");
    cout << " * tbl1 (rows: " << tbl1->nRows() << ") -----------------------" << endl;
    tbl1->dump(cout);
    cout << " * tbl2 (rows: " << tbl2->nRows() << ") -----------------------" << endl;
    tbl2->dump(cout);
    cout << " * tbl3 (rows: " << tbl3->nRows() << ") -----------------------" << endl;
    tbl3->dump(cout);
    cout << " * tbl4 (rows: " << tbl4->nRows() << ") -----------------------" << endl;
    tbl4->dump(cout);
    // Release the tables in reverse order of creation.
    delete tbl4;
    delete tbl3;
    delete tbl2;
    delete tbl1;
    return 0;
}
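For comparison, here is a small standard-library model of what I expect the program above to produce. It assumes that selecting a column with an always-true condition over several partitions simply concatenates their rows and keeps duplicates. That is my expectation, not something confirmed from the FastBit sources, and `Partition` and `selectAll` are hypothetical names used only for this sketch.

```cpp
#include <vector>

// Stand-in for one data partition: just its visitor_id column.
using Partition = std::vector<long>;

// Hypothetical model of `select("visitor_id", "1=1")` over several
// partitions: concatenate all rows in partition order, no deduplication.
std::vector<long> selectAll(const std::vector<Partition>& parts) {
    std::vector<long> rows;
    for (const Partition& p : parts)
        rows.insert(rows.end(), p.begin(), p.end());
    return rows;
}
```

Under this model, `selectAll({{1, 2}, {2, 3, 4, 5}})` yields 6 rows, and feeding that result back in as a single partition still yields 6 rows, so tbl2, tbl3, and tbl4 would all be expected to report the same count as tbl1.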
_______________________________________________
FastBit-users mailing list
https://hpcrdm.lbl.gov/cgi-bin/mailman/listinfo/fastbit-users