@Matthew,
On another note, are there plans to implement "drop=T/F" in data.table?
Arun
On Saturday, May 18, 2013 at 5:21 PM, Arunkumar Srinivasan wrote:
> Matthew wrote "..the length of an input shouldn't change the type of the
> output (only the type of the input should be able to change the type of the
> output)."
> That's a very nice way to put it.
>
>
> Arun
>
>
> On Saturday, May 18, 2013 at 5:18 PM, Matthew Dowle wrote:
>
> >
> > And FAQ 2.17 has a little more on that:
> > "In [.data.frame we very often set drop=FALSE. When we forget, bugs can
> > arise in edge cases where single columns are selected and all of a sudden
> > a vector is returned rather than a single column data.frame. In
> > [.data.table we took the opportunity to make it consistent and drop drop."
> >
> > If it helps to know, I also use DT[["somename"]] quite a bit.
> >
> > Matthew
> >
> > On 18.05.2013 10:04, Matthew Dowle wrote:
> > >
> > > > All good points. The thinking here has this in mind:
> > >
> > > myvars = c("col1","col2")
> > > DT[, myvars, with=FALSE]
> > >
> > > We don't want the type of the result to depend on whether myvars is
> > > length 1 or not. Otherwise we may end up with surprises (in production
> > > code for example) if myvars becomes length 1 in future. That's a strong
> > > principle that data.table follows : the length of an input shouldn't
> > > change the type of the output (only the type of the input should be able
> > > to change the type of the output).
> > >
> > > I've just changed those two parts of ?data.table (thanks for
> > > highlighting) :
> > >
> > > was :
> > > "... or (when with=FALSE) same as j in [.data.frame."
> > > now :
> > > "... or (when with=FALSE) a vector of names or positions to select."
> > >
> > > Matthew
> > >
> > > On 17.05.2013 20:34, Ricardo Saporta wrote:
> > > > Hm... Eddi does seem to have a point here. While I agree with Frank
> > > > that once you're used to it, it is rather straightforward to deal with,
> > > > I can see why one would have the expectation of a vector. ie, that
> > > > the last of the following `identical` statements should evaluate to
> > > > `TRUE`
> > > > df <- as.data.frame(dt)
> > > > > identical(df[, "a"], dt[, get("a")])
> > > > [1] TRUE
> > > > > identical(df[, "a"], dt[["a"]])
> > > > [1] TRUE
> > > > > identical(df[, "a"], dt[, "a", with=FALSE])
> > > > [1] FALSE
> > > > rm(df)
> > > >
> > > > -Rick
> > > >
> > > > Ricardo Saporta
> > > > Graduate Student, Data Analytics
> > > > Rutgers University, New Jersey
> > > > > e: [email protected]
> > > >
> > > >
> > > >
> > > >
> > > > > On Fri, May 17, 2013 at 4:26 PM, Eduard Antonyan
> > > > > <[email protected]> wrote:
> > > > > Well, looking at the documentation:
> > > > > > j: A single column name, single expression of column names, list() of
> > > > > expressions of column names, an expression or function call that
> > > > > evaluates to list (including data.frame and data.table which are
> > > > > lists, too), or (when with=FALSE) same as j in [.data.frame.
> > > > > ...
> > > > > with: By default with=TRUE and j is evaluated within the frame of x.
> > > > > The column names can be used as variables. When with=FALSE, j works
> > > > > as it does in [.data.frame.
> > > > >
> > > > >
> > > > > The bolded out part of the documentation doesn't match the actual
> > > > > behavior.
> > > > >
> > > > >
> > > > > > On Fri, May 17, 2013 at 2:44 PM, Frank Erickson <[email protected]>
> > > > > > wrote:
> > > > > > @Arun and eddi: This question has come up before.
> > > > > > http://r.789695.n4.nabble.com/Better-hacks-getting-a-vector-AND-using-with-inserting-chunks-of-rows-tt4666592.html
> > > > > > (And I'm sure there are other times, too.) I can't say I've heard
> > > > > > anyone arguing about it, though. :)
> > > > > > I guess it works that way because
> > > > > > ...in dt[ ,a], j is an expression which evaluates to a vector
> > > > > > ...in dt[,"a",with=FALSE] the option turns on the "you must want
> > > > > > one or more columns" mode, translating j from "a" to list(a)
> > > > > > It's unintuitive if you're expecting data frame behavior (you know,
> > > > > > drop=TRUE, as Arun mentioned), but if you've already seen
> > > > > > dt[,list(a)], it shouldn't be much of a surprise. Adding the drop
> > > > > > option, and maybe defaulting it to TRUE when with=FALSE might
> > > > > > satisfy eddi's concern...?
> > > > > >
> > > > > >
> > > > > >
> > > > > > > On Fri, May 17, 2013 at 10:22 AM, Eduard Antonyan
> > > > > > > <[email protected]> wrote:
> > > > > > > I don't remember discussing this issue...? What is the conceptual
> > > > > > > difference between dt[, a] and dt[, "a", with = F] and what does
> > > > > > > 'drop' have to do with this??
> > > > > > >
> > > > > > >
> > > > > > > > On Fri, May 17, 2013 at 10:02 AM, Arunkumar Srinivasan
> > > > > > > > <[email protected]> wrote:
> > > > > > > > Eduard, are we discussing the same thing again :)? Wasn't this
> > > > > > > > somehow your question as well.. the discrepancy between:
> > > > > > > > dt[, a] and dt[, "a", with=FALSE].
> > > > > > > > There should be a drop=TRUE/FALSE option (as in the case of
> > > > > > > > data.frame) that should be used when you use `with=FALSE`.
> > > > > > > > Until then, the default option seems to be drop=FALSE, which
> > > > > > > > results in a data.table.
> > > > > > > > Alexandre, as of now, it could be done as Eduard points out.
> > > > > > > > Arun
> > > > > > > >
> > > > > > > >
> > > > > > > > On Friday, May 17, 2013 at 4:59 PM, Eduard Antonyan wrote:
> > > > > > > >
> > > > > > > > > Use dt[[colname]], but this seems like a bug to me - I
> > > > > > > > > would've thought that dt[, a] and dt[, "a", with = F] should
> > > > > > > > > return the exact same thing.
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > > On Fri, May 17, 2013 at 9:42 AM, Alexandre Sieira
> > > > > > > > > > <[email protected]> wrote:
> > > > > > > > > > Sorry if this is a basic question.
> > > > > > > > > >
> > > > > > > > > > I'm using R 3.0.0 and data.table 1.8.8. The documentation
> > > > > > > > > > for 'j' states that "A single column or single expression
> > > > > > > > > > returns that type, usually a vector."
> > > > > > > > > >
> > > > > > > > > > I am able to obtain this behavior if I know the column name
> > > > > > > > > > in advance:
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > > dt = data.table(a=c(1, 2, 3), b=c(4, 5, 6))
> > > > > > > > > > > dt
> > > > > > > > > > a b
> > > > > > > > > > 1: 1 4
> > > > > > > > > > 2: 2 5
> > > > > > > > > > 3: 3 6
> > > > > > > > > > > str(dt[,a])
> > > > > > > > > > num [1:3] 1 2 3
> > > > > > > > > >
> > > > > > > > > > However, if I don't, no such luck:
> > > > > > > > > > > colname="a"
> > > > > > > > > > > str(dt[,colname,with=F])
> > > > > > > > > > Classes ‘data.table’ and 'data.frame': 3 obs. of 1
> > > > > > > > > > variable:
> > > > > > > > > > $ a: num 1 2 3
> > > > > > > > > > - attr(*, ".internal.selfref")=<externalptr>
> > > > > > > > > >
> > > > > > > > > > > Is there a way to extract an entire column as a vector if I
> > > > > > > > > > have the column name as a character scalar?
> > > > > > > > > > Thank you!
> > > > > > > > > > --
> > > > > > > > > > Alexandre Sieira
> > > > > > > > > > CISA, CISSP, ISO 27001 Lead Auditor
> > > > > > > > > >
> > > > > > > > > > "The truth is rarely pure and never simple."
> > > > > > > > > > Oscar Wilde, The Importance of Being Earnest, 1895, Act I
> > > > > > > > > > _______________________________________________
> > > > > > > > > > datatable-help mailing list
> > > > > > > > > > [email protected]
> > > > > > > > > > (mailto:[email protected])
> > > > > > > > > > https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/datatable-help
> > > > > > > > > >
>
>
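For reference, the edge case that FAQ 2.17 describes (quoted above) can be
sketched in base R. This is only an illustration, not code from the thread:
select_cols is a hypothetical helper, and df mirrors the example table from
Alexandre's original question. It shows the result type of [.data.frame
changing with the length of the selection unless drop=FALSE is set.

```r
# Base R only; no packages needed.
df <- data.frame(a = c(1, 2, 3), b = c(4, 5, 6))

# Hypothetical helper (an assumption for illustration): select columns by name.
select_cols <- function(d, vars, drop = TRUE) d[, vars, drop = drop]

two  <- select_cols(df, c("a", "b"))        # two names: a data.frame
one  <- select_cols(df, "a")                # one name: silently a plain vector
safe <- select_cols(df, "a", drop = FALSE)  # one name: still a data.frame

# When a vector is what you want, asking for it explicitly avoids the surprise.
vec <- df[["a"]]

print(class(two))
print(class(one))
print(class(safe))
print(identical(one, vec))
```

The [[ form is the same idiom as the dt[[colname]] suggested earlier in the
thread: the caller decides whether a vector comes back, rather than the
length of the selection deciding it.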
_______________________________________________
datatable-help mailing list
[email protected]
https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/datatable-help