Hello,
Sorry to send this email again; I realized that the earlier copy was not sent
to the FastBit users mailing list. My problem is as follows.
I tried the estimate function as you instructed before, but I got a wrong
answer from it (FastBit version 1.3.6). Could you help me?
I have data with the following distribution:

value range | # of elements in this range
[0 - 10)    | 8
[10 - 20)   | 7
[20 - 30)   | 12
[30 - 40)   | 11
[40 - 50)   | 10
[50 - 60)   | 9
[60 - 70)   | 15
[70 - 80)   | 10
[80 - 90)   | 7
[90 - 100)  | 11

The above data was binned into 4 bins, whose boundaries are "10, 40, 70,
100".
I applied the estimate function with the query "xxx where data value < 15";
it returns 0, which is not right. If I use the evaluate function with the
same query, the number of results is 15, which is correct.
Here is my code:

vector<uint32_t> RIDs;
ibis::part table("test", static_cast<const char*>(0));

// create a query object with the current user name.
ibis::query estimate_query(ibis::util::userName(), &table);
ibis::query evaluate_query(ibis::util::userName(), &table);

evaluate_query.setWhereClause("data < 15");
assert(evaluate_query.evaluate() >= 0);
evaluate_query.getHitRows(RIDs);
uint32_t evaluate_size = RIDs.size();
cout << "number of records where data < 15: evaluate() = " << evaluate_size
     << " records." << endl; // here it returns 15

RIDs.clear();
estimate_query.setWhereClause("data < 15");
estimate_query.getHitRows(RIDs);
uint64_t min_hits = estimate_query.getMinNumHits();
uint64_t max_hits = estimate_query.getMaxNumHits();
uint32_t estimate_size = RIDs.size();
cout << "number of records where data < 15: estimate() = " << estimate_size
     << " records between " << min_hits << " and " << max_hits << " hits." << endl;
// value of variable estimate_size is 0, and min_hits = 0, and max_hits = 100
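
For comparison, the bounds a bin-based estimate would be expected to report for this query can be worked out without FastBit. The sketch below is standalone C++ and does not use or reproduce the FastBit API; `Bin`, `estimate_less_than` and `example_bins` are hypothetical names, and the four bins are assumed to be [0, 10), [10, 40), [40, 70) and [70, 100), with counts summed from the distribution above.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

// One bin of a hypothetical binned index: values in [lo, hi), `count` rows.
struct Bin {
    double lo;
    double hi;
    std::uint64_t count;
};

// Bounds on the number of rows satisfying `value < threshold`, computed
// from bin counts only (no raw data is consulted).
//   first  = rows that certainly match (bins lying entirely below threshold)
//   second = rows that may match (also counts a bin the threshold cuts through)
std::pair<std::uint64_t, std::uint64_t>
estimate_less_than(const std::vector<Bin>& bins, double threshold) {
    std::uint64_t min_hits = 0;
    std::uint64_t max_hits = 0;
    for (const Bin& b : bins) {
        assert(b.lo < b.hi);  // each bin must be a non-empty half-open range
        if (b.hi <= threshold) {
            // every value in [lo, hi) is below the threshold
            min_hits += b.count;
            max_hits += b.count;
        } else if (b.lo < threshold) {
            // the threshold falls inside this bin: rows are only candidates
            max_hits += b.count;
        }
    }
    return {min_hits, max_hits};
}

// Assumed bins for boundaries 10, 40, 70, 100 over data in [0, 100).
std::vector<Bin> example_bins() {
    return {
        {0.0, 10.0, 8},               // [0, 10)
        {10.0, 40.0, 7 + 12 + 11},    // [10, 40)
        {40.0, 70.0, 10 + 9 + 15},    // [40, 70)
        {70.0, 100.0, 10 + 7 + 11},   // [70, 100)
    };
}
```

Under these assumptions, `estimate_less_than(example_bins(), 15.0)` returns the pair (8, 38): the 8 rows in [0, 10) are certain hits, and the 30 rows in [10, 40) are candidates that only the raw data can resolve.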
Any clue why it is not returning the right value? Thanks
Nan
From: [email protected]
To: [email protected]
Subject: RE: [FastBit-users] How to enable fastbit to answer the query without
touching raw data
Date: Thu, 9 May 2013 22:35:58 +0800
Thank you very much.
nan
> Date: Wed, 8 May 2013 14:52:31 -0700
> From: [email protected]
> To: [email protected]
> CC: [email protected]
> Subject: Re: [FastBit-users] How to enable fastbit to answer the query
> without touching raw data
>
> Yes, your understanding is correct.
>
> John
>
>
> On 5/8/13 1:38 PM, nan zhou wrote:
> > Hi, John,
> >
> > A further question is how the `estimate` function works. For
> > example, suppose I have six bin boundaries for column A: 0, 10, 20,
> > 30, 40, and 50 (bin 1: [0, 10), bin 2: [10, 20), bin 3: [20, 30),
> > bin 4: [30, 40), bin 5: [40, 50)), and the where clause is
> > 21 <= A <= 35. In such a case, all bit positions/RIDs in bin 3 and bin
> > 4 are retrieved, no matter whether the actual value is in the query
> > range or not. Do I understand it correctly?
> >
> > Thanks.
> >
> > nan
> >