Re: [datatable-help] Unexpected behavior in setnames()

Steve Lianoglou Wed, 06 Nov 2013 15:01:42 -0800

On Wed, Nov 6, 2013 at 2:50 PM, Arunkumar Srinivasan
<[email protected]> wrote:
> Eddi,
>
> 1) We can still allow duplicate names in "fread" and during creation of
> data.table with the data.table() command.
> 2) There's really no loss of data, since we can allow "setnames" to set
> or remove duplicate names (and users have the data anyway once they load
> it into R using fread).
> 3) The point is to decide upon where duplicate names are allowed and where
> it should give an error…
>
> As I said before, I think it's essential to allow duplicate names while
> loading a file (and therefore for consistency during creation of data.table
> as well). However, grouping, aggregating, subsetting, etc., wherever
> ambiguity can arise, should end in an error. At least this is my stance so
> far. Are we
> agreeing on this?


Add "evaluation in `j`" to the list of things that should throw an
error, and I'm OK with Arun's stance, too, since we should stay as
close to data.frame as possible (even though I still think duplicate
column names are "wrong" in principle).

setnames also needs cleverer handling, since it currently fails if the
target data.table has any duplicate names (I assume this has come up
already, but I'm only half tuned in to this discussion).
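
As a rough illustration of what "unduplicating" could look like, here is a
minimal sketch using base R's make.unique() on a vector of column names. The
names in `nms` are made up for the example, and how the result is written
back to the table (for instance, by position) is left open, since that is
exactly the setnames behavior under discussion.

```r
# Hypothetical column names containing a duplicate, e.g. as read from a file.
nms <- c("V1", "B", "V1")

# anyDuplicated() returns the index of the first duplicate (0 if none).
has_dups <- anyDuplicated(nms) > 0

# make.unique() appends a numeric suffix to the second and later occurrences.
unique_names <- make.unique(nms)
print(unique_names)
```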

I also think that the output of the aggregation example Eddi used
earlier should be changed, i.e.:

R> x <- data.table(V1=sample(letters[1:3], 10, replace=TRUE), B=rnorm(10))
R> x[, sum(B), by=V1]
   V1         V1
1:  b -0.8581098
2:  a  0.8762710
3:  c  1.3274762

It just feels wrong for the summed column to also be named V1, but maybe
this is an FR for another day.
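
In the meantime, the clash can be sidestepped by naming the result in `j`
explicitly. A minimal sketch, assuming the same toy table as above; the
column name `sum_B` and the seed are illustrative choices only:

```r
library(data.table)

set.seed(1)  # arbitrary seed, only so the toy data is reproducible
x <- data.table(V1 = sample(letters[1:3], 10, replace = TRUE), B = rnorm(10))

# Wrapping the expression in a named list gives the aggregate its own name
# (sum_B is illustrative), so it cannot collide with the grouping column V1.
res <- x[, list(sum_B = sum(B)), by = V1]
print(names(res))
```

This does not change the default naming of an unnamed `j` result; it only
avoids relying on it.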

-steve

-- 
Steve Lianoglou
Computational Biologist
Bioinformatics and Computational Biology
Genentech