Hi John,

I think there is a slightly confusing issue with how result columns are 
accessible by the public methods in the ibis::table C++ API.

In my case, I was issuing a SELECT as follows:
SELECT int,AVG(double),COUNT(*) WHERE 1=1

"int" and "double" are the names of columns that have an int and a double 
data type, respectively.


Executing the select with the following code produces correct results:

ibis::part* database=NULL;
ibis::table* table=NULL;
ibis::table* select=NULL;

database=new ibis::part("/tmp",true);
table=ibis::table::create(*database);
select=table->select("int,AVG(double),COUNT(*)","1=1");

select->dumpNames(std::cout,",");
select->dump(std::cout,",");


Both dumpNames() and dump() return the columns in the order specified in the 
select.
The column names returned by dumpNames() are as expected: int,avg2,count3

So far, so good.

However, when iterating over the column names using select->columnNames(), the 
column names are: avg2, count3, int. 
This is not the order that dumpNames() returned.

When accessing the result set's column types using select->columnTypes(), the 
column types are: double, int, int.
Again, this does not match the order used by dump() and dumpNames(), but it 
does match the order of columnNames().

Obviously, the aggregate column (avg2) is moved to the front of the list. This 
also happens for other aggregators that change column types.


I think the reason is the different implementation of the accessor methods in 
bord.cpp. Some of the methods access the columns in order as defined in the 
"columns" property, some methods use the "colorder" property if it is populated:


columnTypes() always returns the columns in an order defined by the "columns" 
property:

ibis::table::typeList ibis::bord::columnTypes() const {
  ibis::table::typeList res(mypart.nColumns());
  for (uint32_t i = 0; i < mypart.nColumns(); ++ i) {
    // this accesses columns in order of the mypart.columns property
    ibis::column* col = mypart.getColumn(i);
    res[i] = (col != 0 ? col->type() : ibis::UNKNOWN_TYPE);
  }
  return res;
}

The same order is used by columnNames().


dumpNames() however uses the "colorder" property if initialized and "columns" 
if not:

void ibis::bord::part::dumpNames(std::ostream& out, const char* del) const {
    if (colorder.empty()) {
        for (ibis::part::columnList::const_iterator it = columns.begin();
             it != columns.end(); ++ it) {
            if (it != columns.begin())
                out << del; // this accesses columns in order of the columns property
            out << (*it).first;
        }
    }
    else if (colorder.size() == columns.size()) {
        out << colorder[0]->name();
        for (uint32_t i = 1; i < columns.size(); ++ i)
            out << del << colorder[i]->name(); // this accesses columns in order of the colorder property
    }
...


Overall, the results of columnNames() and columnTypes() might have a different 
order than the results returned by dumpNames() and dump().
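
To illustrate what I think is going on, here is a minimal standalone sketch 
that does not use FastBit at all. ToyColumn and ToyPart are hypothetical 
stand-ins, and I am assuming that "columns" behaves like a container sorted by 
column name (I have not verified this), while "colorder" keeps the order given 
in the select. Under that assumption the sketch produces the same two 
orderings I am seeing:

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for a result column; not the real ibis::column.
struct ToyColumn {
    std::string name;
    std::string type;
};

// Hypothetical stand-in for the two containers discussed above.
// ASSUMPTION: "columns" is sorted by name, "colorder" keeps the SELECT order.
// Note: copying a ToyPart would leave colorder pointing into the original.
struct ToyPart {
    std::map<std::string, ToyColumn> columns;
    std::vector<const ToyColumn*> colorder;

    // Register a column in both containers, as a select would.
    void add(const std::string& name, const std::string& type) {
        auto inserted = columns.emplace(name, ToyColumn{name, type});
        if (inserted.second)  // pointers to std::map elements stay valid
            colorder.push_back(&inserted.first->second);
    }

    // Order seen by accessors that walk "columns" (like columnNames()).
    std::vector<std::string> namesByMap() const {
        std::vector<std::string> out;
        for (const auto& entry : columns)
            out.push_back(entry.first);
        return out;
    }

    // Order seen by accessors that walk "colorder" (like dumpNames()).
    std::vector<std::string> namesByOrder() const {
        std::vector<std::string> out;
        for (const ToyColumn* col : colorder)
            out.push_back(col->name);
        return out;
    }
};
```

Adding "int", "avg2" and "count3" in that order gives avg2, count3, int from 
namesByMap() and int, avg2, count3 from namesByOrder(). If the assumption 
holds, the order of columnNames() would simply be alphabetical rather than 
something specific to aggregates.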

I was able to confirm the above parts are actually causing the issue by hacking 
ibis::part::getColumn(uint32_t) in part.h. This method is used internally by 
columnNames() and columnTypes():

inline ibis::column* ibis::part::getColumn(uint32_t ind) const {
  // also use colorder if valid
  if (!colorder.empty() && ind<colorder.size()) {
    return const_cast<ibis::column*>(colorder[ind]);
  }
...


This will make getColumn() use the colorder property as well so it is 
consistent with dump() and dumpNames().

I don't think the hack is very good as the above method is very frequently used 
and also an inline method. 
Do you think there is a proper fix for this?
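
Until there is a proper fix, a caller-side workaround might be to realign the 
types by column name instead of by position. The following is only a sketch 
under stated assumptions: splitNames() and typesInDisplayOrder() are 
hypothetical helpers of my own, not part of FastBit, and the types are plain 
strings here rather than the entries of ibis::table::typeList. The idea is to 
capture the output of dumpNames() in a string, split it, and look up each name 
in the (name, type) pairs obtained from columnNames() and columnTypes():

```cpp
#include <cstddef>
#include <sstream>
#include <stdexcept>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical helper: split a delimiter-separated line of column names,
// e.g. the captured output of dumpNames(). Trailing CR/LF is dropped.
std::vector<std::string> splitNames(std::string line, char del) {
    while (!line.empty() && (line.back() == '\n' || line.back() == '\r'))
        line.pop_back();
    std::vector<std::string> names;
    std::istringstream in(line);
    std::string token;
    while (std::getline(in, token, del))
        names.push_back(token);
    return names;
}

// Hypothetical helper: return the types in display order.
// accessorNames/accessorTypes are parallel lists (same order as each other),
// displayNames is the order in which the columns are dumped.
std::vector<std::string> typesInDisplayOrder(
    const std::vector<std::string>& displayNames,
    const std::vector<std::string>& accessorNames,
    const std::vector<std::string>& accessorTypes) {
    if (accessorNames.size() != accessorTypes.size())
        throw std::invalid_argument("names and types differ in length");

    // Build a name -> type lookup from the parallel accessor lists.
    std::unordered_map<std::string, std::string> typeOf;
    for (std::size_t i = 0; i < accessorNames.size(); ++i)
        typeOf[accessorNames[i]] = accessorTypes[i];

    // Walk the display order and pick the matching type for each name.
    std::vector<std::string> result;
    result.reserve(displayNames.size());
    for (const std::string& name : displayNames) {
        auto it = typeOf.find(name);
        if (it == typeOf.end())
            throw std::out_of_range("unknown column: " + name);
        result.push_back(it->second);
    }
    return result;
}
```

For the example above, the names int,avg2,count3 together with the accessor 
lists (avg2, count3, int) and (double, int, int) would yield int, double, int. 
This relies on the column names being unique within the result, and it costs 
an extra lookup per column.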

I hope the whole issue is not a mere misunderstanding from my end.

Best regards
Jan

--
Jan Steemann
Team Manager Development Panel | [email protected] Phone +49 2233 
7933 752 | Fax +49 2233 7933 788 

Globalpark AG | Kalscheurener Str. 19a | 50354 Huerth | Germany
Vorstand/Chief Executive Officer (CEO) | Dr. Lorenz Graef
Vorsitzender d. Aufsichtsrats/Chairperson of the Supervisory Board | Dr. 
Richard C. Geibel 
HRG Amtsgericht Koeln/Entered on Cologne Local Court Commercial Register | HRB 
64032

GLOBALPARK - manage what matters | http://www.globalpark.de

