I have come across some behavior in rbindlist that looks unexpected to me:

> rbindlist(list(data.table(a=1, b=2), data.table(b=4, a=3)))
   a b
1: 1 2
2: 4 3

So rbindlist appears to assume (without checking) that all objects have not 
only the same column names but also the same column order. As a result, the 
value assigned to column ‘a’ in the second object ended up in column ‘b’ of 
the result (and vice-versa).

I know the documentation says rbindlist uses the column types from the first 
entry of the list, but I didn’t see any mention of column order or names 
anywhere. 

I suggest that column names are matched, even if they are not in the same 
order. Perhaps a ‘use.names’ parameter could be used to ask for this behavior 
to avoid breaking backwards compatibility. 
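
In the meantime, here is a rough sketch of the workaround I have in mind. The 
helper name rbindlist_by_name is hypothetical (it is not part of data.table); 
it only assumes that every object has exactly the same set of column names as 
the first one, and it reorders the columns before binding by position:

```r
library(data.table)

# Hypothetical helper, not part of data.table: align every table's columns
# to the order used by the first table, then let rbindlist bind by position.
rbindlist_by_name <- function(tables) {
  ref <- names(tables[[1]])
  aligned <- lapply(tables, function(dt) {
    # Stop if the set of column names differs from the first table's
    stopifnot(setequal(names(dt), ref), length(names(dt)) == length(ref))
    # Select columns by name, in the reference order
    dt[, ref, with = FALSE]
  })
  rbindlist(aligned)
}

res <- rbindlist_by_name(list(data.table(a = 1, b = 2), data.table(b = 4, a = 3)))
print(res)
```

With this, the values given for ‘a’ and ‘b’ in the second object should stay 
in their own columns regardless of the order in which they were specified.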

Or, at the very least, I suggest the documentation of rbindlist be updated to 
explicitly mention that columns are matched by position only, and that callers 
need to ensure the column order of all objects matches exactly. It would also 
help if rbindlist issued a warning when the column names don’t match.
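
To illustrate the kind of check I mean, here is a minimal sketch that callers 
could run on their list before calling rbindlist. The function name 
check_column_order is my own invention, not an existing data.table function, 
and the warning text is only an example:

```r
library(data.table)

# Hypothetical check, not part of data.table: warn when any table's column
# names (including their order) differ from those of the first table.
check_column_order <- function(tables) {
  ref <- names(tables[[1]])
  ok <- vapply(tables, function(dt) identical(names(dt), ref), logical(1))
  if (!all(ok)) {
    warning("column names differ from the first table in element(s): ",
            paste(which(!ok), collapse = ", "))
  }
  # Return TRUE invisibly when every table matches, FALSE otherwise
  invisible(all(ok))
}
```

Calling it on the two objects from my example above would raise a warning for 
the second element, since its names are c("b", "a") rather than c("a", "b").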

-- 
Alexandre Sieira
CISA, CISSP, ISO 27001 Lead Auditor

"The truth is rarely pure and never simple."
Oscar Wilde, The Importance of Being Earnest, 1895, Act I
_______________________________________________
datatable-help mailing list
[email protected]
https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/datatable-help