Matthew wrote "... the length of an input shouldn't change the type of the output
(only the type of the input should be able to change the type of the output)."
That's a very nice way to put it.
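
For anyone skimming the thread, here is a minimal sketch of that principle.
It assumes the data.table package is installed; the toy table mirrors the one
in the original question, and the variable names one_col and two_cols are made
up for illustration.

```r
library(data.table)

# Toy table, same shape as the one in the original question
dt <- data.table(a = c(1, 2, 3), b = c(4, 5, 6))

one_col  <- "a"           # hypothetical name: a length-1 character vector
two_cols <- c("a", "b")   # hypothetical name: a length-2 character vector

# with=FALSE returns a data.table whatever the length of the selector
res_one <- dt[, one_col, with = FALSE]
res_two <- dt[, two_cols, with = FALSE]
print(is.data.table(res_one))
print(is.data.table(res_two))

# [[ extracts the column itself, so the result is a plain vector
vec <- dt[[one_col]]
print(is.numeric(vec))
```

So dt[[name]] looks like the simplest route when a plain vector is wanted and
the column name is held in a character scalar.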
Arun
On Saturday, May 18, 2013 at 5:18 PM, Matthew Dowle wrote:
>
> And FAQ 2.17 has a little more on that :
> "In [.data.frame we very often set drop=FALSE. When we forget, bugs can
> arise in edge cases where single columns are selected and all of a sudden
> a vector is returned rather than a single column data.frame. In
> [.data.table we took the opportunity to make it consistent and drop drop."
>
> If it helps to know, I also use DT[["somename"]] quite a bit.
>
> Matthew
>
> On 18.05.2013 10:04, Matthew Dowle wrote:
> >
> > All good points. The thinking here has this in mind:
> >
> > myvars = c("col1","col2")
> > DT[, myvars, with=FALSE]
> >
> > We don't want the type of the result to depend on whether myvars is length
> > 1 or not. Otherwise we may end up with surprises (in production code for
> > example) if myvars becomes length 1 in future. That's a strong principle
> > that data.table follows : the length of an input shouldn't change the type
> > of the output (only the type of the input should be able to change the type
> > of the output).
> >
> > I've just changed those two parts of ?data.table (thanks for highlighting) :
> >
> > was :
> > "... or (when with=FALSE) same as j in [.data.frame."
> > now :
> > "... or (when with=FALSE) a vector of names or positions to select."
> >
> > Matthew
> >
> > On 17.05.2013 20:34, Ricardo Saporta wrote:
> > > Hm... Eddi does seem to have a point here. While I agree with Frank
> > > that once you're used to it, it is rather straightforward to deal with, I
> > > can see why one would have the expectation of a vector. ie, that the
> > > last of the following `identical` statements should evaluate to `TRUE`
> > > df <- as.data.frame(dt)
> > > > identical(df[, "a"], dt[, get("a")])
> > > [1] TRUE
> > > > identical(df[, "a"], dt[["a"]])
> > > [1] TRUE
> > > > identical(df[, "a"], dt[, "a", with=FALSE])
> > > [1] FALSE
> > > rm(df)
> > >
> > > -Rick
> > >
> > > Ricardo Saporta
> > > Graduate Student, Data Analytics
> > > Rutgers University, New Jersey
> > > e: [email protected] (mailto:[email protected])
> > >
> > >
> > >
> > >
> > > On Fri, May 17, 2013 at 4:26 PM, Eduard Antonyan
> > > <[email protected] (mailto:[email protected])> wrote:
> > > > Well, looking at the documentation:
> > > > > j: A single column name, single expression of column names, list() of
> > > > expressions of column names, an expression or function call that
> > > > evaluates to list (including data.frame and data.table which are lists,
> > > > too), or (when with=FALSE) same as j in [.data.frame.
> > > > ...
> > > > with: By default with=TRUE and j is evaluated within the frame of x.
> > > > The column names can be used as variables. When with=FALSE, j works as
> > > > it does in [.data.frame.
> > > >
> > > >
> > > > The bolded out part of the documentation doesn't match the actual
> > > > behavior.
> > > >
> > > >
> > > > On Fri, May 17, 2013 at 2:44 PM, Frank Erickson <[email protected]
> > > > (mailto:[email protected])> wrote:
> > > > > @Arun and eddi: This question has come up before.
> > > > > http://r.789695.n4.nabble.com/Better-hacks-getting-a-vector-AND-using-with-inserting-chunks-of-rows-tt4666592.html
> > > > > (And I'm sure there are other times, too.) I can't say I've heard
> > > > > anyone arguing about it, though. :)
> > > > > I guess it works that way because
> > > > > ...in dt[ ,a], j is an expression which evaluates to a vector
> > > > > ...in dt[,"a",with=FALSE] the option turns on the "you must want one
> > > > > or more columns" mode, translating j from "a" to list(a)
> > > > > It's unintuitive if you're expecting data frame behavior (you know,
> > > > > drop=TRUE, as Arun mentioned), but if you've already seen
> > > > > dt[,list(a)], it shouldn't be much of a surprise. Adding the drop
> > > > > option, and maybe defaulting it to TRUE when with=FALSE might satisfy
> > > > > eddi's concern...?
> > > > >
> > > > >
> > > > >
> > > > > On Fri, May 17, 2013 at 10:22 AM, Eduard Antonyan
> > > > > <[email protected] (mailto:[email protected])> wrote:
> > > > > > I don't remember discussing this issue...? What is the conceptual
> > > > > > difference between dt[, a] and dt[, "a", with = F] and what does
> > > > > > 'drop' have to do with this??
> > > > > >
> > > > > >
> > > > > > On Fri, May 17, 2013 at 10:02 AM, Arunkumar Srinivasan
> > > > > > <[email protected] (mailto:[email protected])> wrote:
> > > > > > > Eduard, are we discussing the same thing again :)? Wasn't this
> > > > > > > somehow your question as well.. the discrepancy between:
> > > > > > > dt[, a] and dt[, "a", with=FALSE].
> > > > > > > There should be a drop=TRUE/FALSE option (as in the case of
> > > > > > > data.frame) that should be used when you use `with=FALSE`. Until
> > > > > > > then, the default option seems to be drop=FALSE, which results in
> > > > > > > a data.table.
> > > > > > > Alexandre, as of now, it could be done as Eduard points out.
> > > > > > > Arun
> > > > > > >
> > > > > > >
> > > > > > > On Friday, May 17, 2013 at 4:59 PM, Eduard Antonyan wrote:
> > > > > > >
> > > > > > > > Use dt[[colname]], but this seems like a bug to me - I would've
> > > > > > > > thought that dt[, a] and dt[, "a", with = F] should return the
> > > > > > > > exact same thing.
> > > > > > > >
> > > > > > > >
> > > > > > > > On Fri, May 17, 2013 at 9:42 AM, Alexandre Sieira
> > > > > > > > <[email protected]
> > > > > > > > (mailto:[email protected])> wrote:
> > > > > > > > > Sorry if this is a basic question.
> > > > > > > > >
> > > > > > > > > I'm using R 3.0.0 and data.table 1.8.8. The documentation for
> > > > > > > > > 'j' states that "A single column or single expression returns
> > > > > > > > > that type, usually a vector."
> > > > > > > > >
> > > > > > > > > I am able to obtain this behavior if I know the column name
> > > > > > > > > in advance:
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > > dt = data.table(a=c(1, 2, 3), b=c(4, 5, 6))
> > > > > > > > > > dt
> > > > > > > > > a b
> > > > > > > > > 1: 1 4
> > > > > > > > > 2: 2 5
> > > > > > > > > 3: 3 6
> > > > > > > > > > str(dt[,a])
> > > > > > > > > num [1:3] 1 2 3
> > > > > > > > >
> > > > > > > > > However, if I don't, no such luck:
> > > > > > > > > > colname="a"
> > > > > > > > > > str(dt[,colname,with=F])
> > > > > > > > > Classes ‘data.table’ and 'data.frame': 3 obs. of 1 variable:
> > > > > > > > > $ a: num 1 2 3
> > > > > > > > > - attr(*, ".internal.selfref")=<externalptr>
> > > > > > > > >
> > > > > > > > > > Is there a way to extract an entire column as a vector if I
> > > > > > > > > have the column name as a character scalar?
> > > > > > > > > Thank you!
> > > > > > > > > --
> > > > > > > > > Alexandre Sieira
> > > > > > > > > CISA, CISSP, ISO 27001 Lead Auditor
> > > > > > > > >
> > > > > > > > > "The truth is rarely pure and never simple."
> > > > > > > > > Oscar Wilde, The Importance of Being Earnest, 1895, Act I
> > > > > > > > > _______________________________________________
> > > > > > > > > datatable-help mailing list
> > > > > > > > > [email protected]
> > > > > > > > > (mailto:[email protected])
> > > > > > > > > https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/datatable-help
> > > > > > > > >