Hi John,
The pos column has the large values; "CT" is UBYTE. So there is a problem with
the column order as well. Maybe fixing that will fix the orderby?
Andrew
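
To make the expected result concrete, here is a minimal standalone sketch (plain C++, no FastBit calls) that takes the three rows from the sample output quoted further down and reorders them by pos. The Row struct and the orderByPos helper are hypothetical names used only for this illustration, and CT is stored as uint32_t for simplicity even though the column itself is UBYTE.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <numeric>
#include <vector>

// One output row. Hypothetical type for this sketch, not a FastBit type.
struct Row {
    uint32_t pos;
    uint32_t ct;
};

// Reorder two parallel columns by ascending pos and return them as rows.
// The sort is stable, so rows with equal pos keep their original order.
std::vector<Row> orderByPos(const std::vector<uint32_t>& pos,
                            const std::vector<uint32_t>& ct) {
    assert(pos.size() == ct.size());

    // Sort row indices rather than values so both columns move together.
    std::vector<std::size_t> order(pos.size());
    std::iota(order.begin(), order.end(), std::size_t{0});
    std::stable_sort(order.begin(), order.end(),
                     [&pos](std::size_t a, std::size_t b) { return pos[a] < pos[b]; });

    std::vector<Row> rows;
    rows.reserve(order.size());
    for (std::size_t i : order) rows.push_back(Row{pos[i], ct[i]});
    return rows;
}
```

With pos = {107013068, 107013059, 107013070} and CT = {2, 5, 6}, the CT column is already ascending but pos is not, so a working orderby("pos") should move the row with pos 107013059 (CT 5) to the front, followed by 107013068 (CT 2) and 107013070 (CT 6).
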
On Sep 19, 2011, at 5:30 PM, K. John Wu wrote:
> Hi, Andrew,
>
> As far as I can tell, you are using the API correctly.
>
> In the printout, the first column should be "pos" and the second column "CT".
> Therefore, the first printout is already ordered by pos and the orderby
> function has nothing to do.
>
> If you are actually expecting the second column in the printout to be "pos",
> then it is possible that other parts of the FastBit code are not behaving as
> they should.
>
> John
>
>
> On 9/16/11 10:56 AM, Olson, Andrew wrote:
>> Hi John,
>> I noticed that the orderby isn't actually sorting the table::select results.
>> Below is a snippet of my code:
>>
>> ibis::table *res = tbl->select("pos,CT",qcnd);
>> res->dump(std::cout, "\t");
>> res->orderby("pos");
>> std::cout<< "nRows = "<< res->nRows()<< std::endl;
>> res->dump(std::cout, "\t");
>>
>> And this is the sample output:
>> 2 107013068
>> 5 107013059
>> 6 107013070
>> nRows = 3
>> binsize = 0
>> 2 107013068
>> 5 107013059
>> 6 107013070
>>
>> Am I using the API properly? The data is stored across multiple partitions.
>>
>> Andrew
>>
>> On Aug 31, 2011, at 5:01 PM, K. John Wu wrote:
>>
>>> Hi, Andrew,
>>>
>>> Currently, there is no logic in the code to recognize that the two parts of
>>> the data are sorted already. Therefore, the orderby call will use a
>>> generic sorting procedure.
>>>
>>> We are contemplating revamping the groupby and orderby operations, and we
>>> will keep this in mind when we redesign things.
>>>
>>> In the meantime, hopefully the sorting cost is not too high and the response
>>> time of the additional orderby is tolerable.
>>>
>>> John
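
For readers who need to avoid the generic sort, one client-side option is to merge the partitions directly. The sketch below is not what FastBit does internally. It assumes the caller has queried each partition separately (as Andrew says he usually does) and holds each result as a vector already sorted by position. The Hit struct and mergeSortedPartitions are hypothetical names for this illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <iterator>
#include <vector>

// One result row: a position and its score. Hypothetical type for this sketch.
struct Hit {
    uint32_t position;
    uint32_t score;
};

// Merge two partitions that are each already sorted by position.
// A linear merge does O(n + m) comparisons, whereas sorting the
// concatenated rows from scratch costs O((n + m) log(n + m)).
std::vector<Hit> mergeSortedPartitions(const std::vector<Hit>& a,
                                       const std::vector<Hit>& b) {
    auto byPosition = [](const Hit& x, const Hit& y) {
        return x.position < y.position;
    };
    // The merge is only valid if both inputs really are sorted.
    assert(std::is_sorted(a.begin(), a.end(), byPosition));
    assert(std::is_sorted(b.begin(), b.end(), byPosition));

    std::vector<Hit> merged;
    merged.reserve(a.size() + b.size());
    std::merge(a.begin(), a.end(), b.begin(), b.end(),
               std::back_inserter(merged), byPosition);
    return merged;
}
```

Keeping position and score in one struct means the two values cannot drift apart during the merge, which is the main hazard when sorting parallel arrays by hand.
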
>>>
>>>
>>>
>>> On 8/31/11 1:41 PM, Olson, Andrew wrote:
>>>> Hi,
>>>> I have a data set that is split into two partitions (A and B). Each
>>>> partition has columns position and score. The -part.txt file has a
>>>> metaTags entry corresponding to the partition (A or B). Usually I query
>>>> only one partition at a time, but I would like to query across both
>>>> partitions by pointing to the parent directory.
>>>>
>>>> tbl = ibis::table::create(datadir);
>>>> res = tbl->select("position,score",qcnd);
>>>>
>>>> Later on I retrieve the positions and scores as follows:
>>>>
>>>> uint64_t ierr = res->getColumnAsUInts("position", positions);
>>>> ierr = res->getColumnAsUInts("score", scores);
>>>>
>>>> The code that follows expects the positions to be sorted (they are sorted
>>>> in each partition - and -part.txt has "sorted = true").
>>>> I suspect I need to do this before populating positions[] and scores[]:
>>>> sorted_results = res->orderby("position");
>>>>
>>>> So my question is, since each partition is already sorted by position, is
>>>> there logic in place that uses this info to do the orderby more quickly?
>>>> On a related note, if I add the orderby() to my code, will it slow down
>>>> when querying only one partition (already sorted by position)?
>>>>
>>>> Thanks,
>>>> Andrew
>>>> _______________________________________________
>>>> FastBit-users mailing list
>>>> [email protected]
>>>> https://hpcrdm.lbl.gov/cgi-bin/mailman/listinfo/fastbit-users
>>