Hi Berto,

I think we may be experiencing a language barrier here. This is datatable-help; i.e., not r-help. You *can* write in another language on this list, if you'd like to, in case someone else here understands better. The rules are less strict here. Nobody has yet done so, but there is no rule against it. Why not?

Proceeding in English for now ...

I like the lack of spaces, but what do the * mean? In other words, you've presented
a line of code :
    DT[y>=3&*v<=7&w<=7*,sum(y), by=x]
but that doesn't actually evaluate to anything, does it? So that's pseudo-code. I don't even need to copy and paste that into R to know it's invalid. That cannot possibly give the expected result, because of the "*" characters.

Might you be looking for something like :

    sapply(.SD, `<`, 7)

?   Dunno. Guessing.

But, focussing on this part of your email :

But if the number of columns grows, I can't specify all columns anymore,
maybe should I use column names?

You actually do, really, honestly, need to show us, physically, in email, what you mean. Columns of what? Growing how? Show us 2, 3, 4, 5 columns. Show us the manual way. Show us the input and show us the output.

Your email can be very long. It can contain very little English. But you actually need to show the output you would like, for me (at least) to understand.

What I am certain of is that whatever you want to do is possible. And if it isn't, then
we will likely enhance data.table to do it.
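
For what it's worth, here is a rough sketch of one possible reading of the question: keep the rows where y is at least 3 and every remaining column is at most 7, then sum y by x. The contents of DT, the threshold values, and the helper name `below` are made up for illustration (the thread does not show the real data), so treat this as a guess and not as the answer.

```r
library(data.table)

# Hypothetical example data; the real DT is not shown in this thread.
DT <- data.table(
  x = c("a", "a", "b", "b"),
  y = c(4, 5, 3, 1),
  v = c(1, 2, 3, 9),
  w = c(2, 8, 4, 1)
)

# Every column except the grouping column x and the "high threshold" column y.
cols <- setdiff(names(DT), c("x", "y"))

# One logical per row: TRUE only where every column in cols is <= 7.
# `below` is a hypothetical helper name, not part of data.table.
below <- Reduce(`&`, lapply(DT[, cols, with = FALSE], `<=`, 7))

# Combine with the y threshold and aggregate by x.
DT[y >= 3 & below, sum(y), by = x]
```

Because cols is computed from names(DT), the last two statements stay the same however many columns are added.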

Matthew


On 16.11.2012 18:32, Berto wrote:
Hi Matthew, thanks for the quick reply.

I want to find all the rows that are above a threshold for one column (e.g. y>=3) and below another threshold for all the rest (e.g. v<=7&w<=7, for a threshold of 7).

Once I have this subsetting, I'd like to use the sum function (e.g. sum(y),
by=x).

I know how to do it for a small number of columns, by spelling out every column that must be below the threshold:

DT[y>=3&*v<=7&w<=7*,sum(y), by=x]

which gives the expected result:

   x V1
1: a  9
2: b  3

But if the number of columns grows, I can't specify all columns anymore,
maybe should I use column names?

# column names excluding the one with the higher threshold
cols <- names(DT)[!(names(DT) %in% "y")]

Hope to be clearer this time, otherwise please let me know!




--
View this message in context:

http://r.789695.n4.nabble.com/Subsetting-columns-in-data-table-tp4649736p4649779.html
Sent from the datatable-help mailing list archive at Nabble.com.
_______________________________________________
datatable-help mailing list
[email protected]

https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/datatable-help