@Matthew,  

On another note, are there plans to implement "drop=T/F" in data.table?  

Arun


On Saturday, May 18, 2013 at 5:21 PM, Arunkumar Srinivasan wrote:

> Matthew wrote "..the length of an input shouldn't change the type of the 
> output (only the type of the input should be able to change the type of the 
> output)."
> That's a very nice way to put it.
>  
>  
> Arun
>  
>  
> On Saturday, May 18, 2013 at 5:18 PM, Matthew Dowle wrote:
>  
> >   
> > And FAQ 2.17 has a little more on that :
> > "In [.data.frame we very often set drop=FALSE. When we forget, bugs can 
> > arise in edge cases
> > where single columns are selected and all of a sudden a vector is returned 
> > rather than a single
> > column data.frame. In [.data.table we took the opportunity to make it 
> > consistent and drop
> > drop."  
> >   
> > If it helps to know, I also use DT[["somename"]] quite a bit.  
> >   
> > Matthew
> >   
> > On 18.05.2013 10:04, Matthew Dowle wrote:
> > >   
> > > > All good points. The thinking here has this in mind:
> > >  
> > > myvars = c("col1","col2")
> > > DT[, myvars, with=FALSE]
> > >  
> > > We don't want the type of the result to depend on whether myvars is 
> > > length 1 or not. Otherwise we may end up with surprises (in production 
> > > code for example) if myvars becomes length 1 in future. That's a strong 
> > > principle that data.table follows : the length of an input shouldn't 
> > > change the type of the output (only the type of the input should be able 
> > > to change the type of the output).
> > >  
> > > I've just changed those two parts of ?data.table (thanks for 
> > > highlighting) :
> > >  
> > > was :
> > > "... or (when with=FALSE) same as j in [.data.frame."
> > > now :
> > > "... or (when with=FALSE) a vector of names or positions to select."
> > >  
> > > Matthew  
> > >   
> > > On 17.05.2013 20:34, Ricardo Saporta wrote:
> > > > > Hm... Eddi does seem to have a point here. While I agree with Frank 
> > > > > that once you're used to it, it is rather straightforward to deal with, 
> > > > > I can see why one would expect a vector, i.e., that the last of the 
> > > > > following `identical` statements should evaluate to `TRUE`:
> > > >     df <- as.data.frame(dt)
> > > >     > identical(df[, "a"], dt[, get("a")])
> > > >     [1] TRUE
> > > >     > identical(df[, "a"], dt[["a"]])
> > > >     [1] TRUE
> > > >     > identical(df[, "a"], dt[, "a", with=FALSE])
> > > >     [1] FALSE
> > > >     rm(df)
> > > >  
> > > > -Rick
> > > >  
> > > > Ricardo Saporta  
> > > > Graduate Student, Data Analytics
> > > > Rutgers University, New Jersey
> > > > > e: [email protected]
> > > >  
> > > >  
> > > >  
> > > >  
> > > > > On Fri, May 17, 2013 at 4:26 PM, Eduard Antonyan wrote:
> > > > > Well, looking at the documentation:  
> > > > > > j: A single column name, single expression of column names, list() of 
> > > > > expressions of column names, an expression or function call that 
> > > > > evaluates to list (including data.frame and data.table which are 
> > > > > lists, too), or (when with=FALSE) same as j in [.data.frame.
> > > > > ...
> > > > > with: By default with=TRUE and j is evaluated within the frame of x. 
> > > > > The column names can be used as variables. When with=FALSE, j works 
> > > > > as it does in [.data.frame.
> > > > >   
> > > > >  
> > > > > > The bolded part of the documentation (the with=FALSE clauses) doesn't 
> > > > > > match the actual behavior.
> > > > >  
> > > > >  
> > > > > > On Fri, May 17, 2013 at 2:44 PM, Frank Erickson wrote:
> > > > > > @Arun and eddi: This question has come up before.  
> > > > > > http://r.789695.n4.nabble.com/Better-hacks-getting-a-vector-AND-using-with-inserting-chunks-of-rows-tt4666592.html
> > > > > > (And I'm sure there are other times, too.) I can't say I've heard 
> > > > > > anyone arguing about it, though. :)
> > > > > > I guess it works that way because
> > > > > > ...in dt[ ,a], j is an expression which evaluates to a vector
> > > > > > ...in dt[,"a",with=FALSE] the option turns on the "you must want 
> > > > > > one or more columns" mode, translating j from "a" to list(a)
> > > > > > It's unintuitive if you're expecting data frame behavior (you know, 
> > > > > > drop=TRUE, as Arun mentioned), but if you've already seen 
> > > > > > dt[,list(a)], it shouldn't be much of a surprise. Adding the drop 
> > > > > > option, and maybe defaulting it to TRUE when with=FALSE might 
> > > > > > satisfy eddi's concern...?
> > > > > >  
> > > > > >  
> > > > > >  
> > > > > > > On Fri, May 17, 2013 at 10:22 AM, Eduard Antonyan wrote:
> > > > > > > I don't remember discussing this issue...? What is the conceptual 
> > > > > > > difference between dt[, a] and dt[, "a", with = F] and what does 
> > > > > > > 'drop' have to do with this??  
> > > > > > >  
> > > > > > >  
> > > > > > > > On Fri, May 17, 2013 at 10:02 AM, Arunkumar Srinivasan wrote:
> > > > > > > > Eduard, are we discussing the same thing again :)? Wasn't this 
> > > > > > > > somehow your question as well.. the discrepancy between:  
> > > > > > > > dt[, a] and dt[, "a", with=FALSE].  
> > > > > > > > There should be a drop=TRUE/FALSE option (as in the case of 
> > > > > > > > data.frame) that should be used when you use `with=FALSE`. 
> > > > > > > > Until then, the default option seems to be drop=FALSE, which 
> > > > > > > > results in a data.table.
> > > > > > > > Alexandre, as of now, it could be done as Eduard points out.
> > > > > > > > Arun
> > > > > > > >  
> > > > > > > >  
> > > > > > > > On Friday, May 17, 2013 at 4:59 PM, Eduard Antonyan wrote:
> > > > > > > >  
> > > > > > > > > Use dt[[colname]], but this seems like a bug to me - I 
> > > > > > > > > would've thought that dt[, a] and dt[, "a", with = F] should 
> > > > > > > > > return the exact same thing.
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > > > On Fri, May 17, 2013 at 9:42 AM, Alexandre Sieira wrote:
> > > > > > > > > > Sorry if this is a basic question.  
> > > > > > > > > >   
> > > > > > > > > > I'm using R 3.0.0 and data.table 1.8.8. The documentation 
> > > > > > > > > > for 'j' states that "A single column or single expression 
> > > > > > > > > > returns that type, usually a vector."
> > > > > > > > > >  
> > > > > > > > > > I am able to obtain this behavior if I know the column name 
> > > > > > > > > > in advance:  
> > > > > > > > > >  
> > > > > > > > > >    
> > > > > > > > > > > dt = data.table(a=c(1, 2, 3), b=c(4, 5, 6))
> > > > > > > > > > > dt
> > > > > > > > > >    a b
> > > > > > > > > > 1: 1 4
> > > > > > > > > > 2: 2 5
> > > > > > > > > > 3: 3 6
> > > > > > > > > > > str(dt[,a])
> > > > > > > > > >  num [1:3] 1 2 3
> > > > > > > > > >   
> > > > > > > > > > However, if I don't, no such luck:
> > > > > > > > > > > colname="a"
> > > > > > > > > > > str(dt[,colname,with=F])
> > > > > > > > > > > Classes ‘data.table’ and 'data.frame': 3 obs. of  1 variable:
> > > > > > > > > >  $ a: num  1 2 3
> > > > > > > > > >  - attr(*, ".internal.selfref")=<externalptr>  
> > > > > > > > > >  
> > > > > > > > > > > Is there a way to extract an entire column as a vector if I 
> > > > > > > > > > have the column name as a character scalar?
> > > > > > > > > > Thank you!
> > > > > > > > > > --  
> > > > > > > > > > Alexandre Sieira
> > > > > > > > > > CISA, CISSP, ISO 27001 Lead Auditor
> > > > > > > > > >  
> > > > > > > > > > "The truth is rarely pure and never simple."
> > > > > > > > > > Oscar Wilde, The Importance of Being Earnest, 1895, Act I  
> > > > > > > > > > _______________________________________________
> > > > > > > > > > datatable-help mailing list
> > > > > > > > > > > [email protected]
> > > > > > > > > > https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/datatable-help
> > > > > > > > > >   
