The documentation of ibis::query::estimate states:
    "Returns 0 for success, a negative value for error."
Since the function call completed correctly, it should have
returned 0. The return value is a status code, not a hit count.
To find out the minimum and maximum number of hits determined by
ibis::query::estimate, you need to call
ibis::query::getMinNumHits and ibis::query::getMaxNumHits. You can
see an example of how they are used in examples/ibis.cpp, lines 3549
and 3550.
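
As a rough illustration of why an estimate yields a range rather than a
single count, here is a minimal standalone sketch. It does not use
FastBit, and HitBounds and boundsForLessThan are hypothetical names made
up for this example. It assumes the index only knows how many rows fall
in each bin, and it uses the bin boundaries 10, 40, 70, 100 and the
per-range counts from the message below, with 0 taken as the lower edge
of the first bin.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical type for this sketch only (not a FastBit class).
struct HitBounds {
    std::uint64_t min_hits;  // rows that certainly satisfy the condition
    std::uint64_t max_hits;  // rows that might satisfy the condition
};

// Bounds on the number of rows with value < cutoff, given only bin counts.
// Bin i covers the half-open range [edges[i], edges[i + 1]).
HitBounds boundsForLessThan(const std::vector<double>& edges,
                            const std::vector<std::uint64_t>& counts,
                            double cutoff) {
    assert(edges.size() == counts.size() + 1);
    HitBounds b{0, 0};
    for (std::size_t i = 0; i < counts.size(); ++i) {
        if (edges[i + 1] <= cutoff) {
            // Every value in this bin is below the cutoff: sure hits.
            b.min_hits += counts[i];
            b.max_hits += counts[i];
        } else if (edges[i] < cutoff) {
            // The bin straddles the cutoff: its rows are only candidates.
            b.max_hits += counts[i];
        }
        // Otherwise the whole bin is at or above the cutoff: no hits.
    }
    return b;
}
```

Called with edges {0, 10, 40, 70, 100}, counts {8, 30, 34, 28} and a
cutoff of 15, this gives a minimum of 8 and a maximum of 38, which
brackets the 15 reported by evaluate. The bounds FastBit itself reports
may differ, depending on how the index was actually built.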
John
On 6/11/13 2:50 PM, nan zhou wrote:
> Hello,
>
> Sorry to send this email again; I realized that the earlier email was
> not sent to the FastBit users mailing list. My problem is as follows.
>
> I tried the estimate function as you instructed before; however, I
> got a wrong answer from the estimate function (FastBit version is 1.3.6).
> Could you help me?
>
> I have data with the following distribution:
> value range | # of elements in this range
> [0 - 10) | 8
> [10 - 20) | 7
> [20 - 30) | 12
> [30 - 40) | 11
> [40 - 50) | 10
> [50 - 60) | 9
> [60 - 70) | 15
> [70 - 80) | 10
> [80 - 90) | 7
> [90 - 100) | 11
> The data above was binned into 4 bins, whose boundaries are "10, 40,
> 70, 100".
>
> I applied the estimate function with the query " xxx where data
> value < 15 "; the estimate function returns 0, which is not right.
> If I use the evaluate function with the same query, the number of
> results is 15, which is correct.
>
> Here is my code :
>
> vector <uint32_t> RIDs;
>
> ibis::part table ("test", static_cast<const char*>(0));
>
> // create a query object with the current user name.
> ibis::query estimate_query (ibis::util::userName(), &table);
> ibis::query evaluate_query (ibis::util::userName(), &table);
>
> evaluate_query.setWhereClause ("data < 15");
> assert (evaluate_query.evaluate () >= 0);
> evaluate_query.getHitRows (RIDs);
>
> uint32_t evaluate_size = RIDs.size ();
>
> cout << "number of records where data < 15: evaluate() = " <<
> evaluate_size << " records." << endl; // here it returns 15
>
> RIDs.clear ();
>
> estimate_query.setWhereClause ("data < 15");
> estimate_query.getHitRows (RIDs);
>
> uint64_t min_hits = estimate_query.getMinNumHits ();
> uint64_t max_hits = estimate_query.getMaxNumHits ();
> uint32_t estimate_size = RIDs.size ();
>
> cout << "number of records where data < 15: estimate() = " <<
> estimate_size << " records between " << min_hits << " and " <<
> max_hits << " hits." << endl;
> // estimate_size is 0, min_hits = 0, and max_hits = 100
>
> Any clue why it is not returning the right value? Thanks
>
> Nan
>
>
> ----------------------------------------------------------------------
> From: [email protected]
> To: [email protected]
> Subject: RE: [FastBit-users] How to enable fastbit to answer the query
> without touching raw data
> Date: Thu, 9 May 2013 22:35:58 +0800
>
> Thank you very much.
>
> nan
>
>> Date: Wed, 8 May 2013 14:52:31 -0700
>> From: [email protected]
>> To: [email protected]
>> CC: [email protected]
>> Subject: Re: [FastBit-users] How to enable fastbit to answer the
>> query without touching raw data
>>
>> Yes, your understanding is correct.
>>
>> John
>>
>>
>> On 5/8/13 1:38 PM, nan zhou wrote:
>> > Hi, John,
>> >
>> > A further question is how the `estimate` function works. For
>> > example, suppose I have six bin boundaries for column A: 0, 10, 20,
>> > 30, 40, and 50 (bin 1: [0, 10), bin 2: [10, 20), bin 3: [20, 30),
>> > bin 4: [30, 40), bin 5: [40, 50)), and the where clause is
>> > 21 <= A <= 35. In such a case, all bit positions/RIDs in bin 3 and
>> > bin 4 are retrieved, no matter whether the actual value is in the
>> > query range or not. Do I understand it correctly?
>> >
>> > Thanks.
>> >
>> > nan
>> >
_______________________________________________
FastBit-users mailing list
[email protected]
https://hpcrdm.lbl.gov/cgi-bin/mailman/listinfo/fastbit-users