The problem is that it works with the example data I gave. However, it does
NOT work with my actual data set, which has 600,000 rows. The class is still
a data frame.
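
One possible cause, offered only as a guess since the full data is not shown
in this thread: if LAT in the large data set is stored with more digits than R
prints by default, an exact `==` test against the printed value matches no
rows. The sketch below uses a made-up three-row data frame (`big`, `id`, and
`tol` are hypothetical names and values, not taken from the real data) to show
the exact test returning nothing and a tolerance-based test finding the row.

```r
# Hypothetical data: the first LAT has more digits than the 7 significant
# digits R prints by default, so it displays as 37.95978.
big <- data.frame(id = 1:3,
                  LAT = c(37.959781234, 38.276453, 41.671209))

# Exact comparison against the printed value: no row passes the test.
exact <- big[big$LAT == 37.95978, ]
nrow(exact)

# Compare within a tolerance instead. tol is an assumed value; pick one
# that suits the precision of the real data. The !is.na() guard keeps
# missing values from producing NA rows.
tol <- 1e-5
near <- big[!is.na(big$LAT) & abs(big$LAT - 37.95978) < tol, ]
nrow(near)
```

If this is what is happening, `print(big$LAT, digits = 15)` on a few values of
the real column would show the extra digits.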

On Mon, Jul 27, 2009 at 12:15 PM, Steve Lianoglou <
mailinglist.honey...@gmail.com> wrote:

>
> On Jul 27, 2009, at 2:54 PM, Mehdi Khan wrote:
>
>  i am able to return the first column, but anything else returns this:
>> <0 rows> (or 0-length row.names)
>>
>> any idea?
>>
>
> I'm not sure what you're doing.
>
> The result you're getting happens when no rows "pass" the logical test that
> you are using to index the rows of your data.frame.
>
> Can you show the code that you are using (based on the example data you
> gave) that is giving you the <0 rows> result?
>
> -steve
>
>
>
>> On Tue, Jul 21, 2009 at 12:49 PM, Steve Lianoglou <
>> mailinglist.honey...@gmail.com> wrote:
>>
>> On Jul 21, 2009, at 3:27 PM, Mehdi Khan wrote:
>>
>> I understand your explanation about the test for even numbers.  However I
>> am still a bit confused as to how to go about finding a particular value.
>>  Here is an example data set
>>
>> col #          attr1    attr2   attr 3    LON        LAT
>> 17209         D        NA    NA -122.9409 38.27645
>> 17210        BC        NA    NA -122.9581 38.36304
>> 17211         B        NA    NA -123.6851 41.67121
>> 17212        BC        NA    NA -123.0724 38.93073
>> 17213         C        NA    NA -123.7240 41.84403
>> 17214      <NA>       464    NA -122.9430 38.30988
>> 17215         C        NA    NA -123.4442 40.65369
>> 17216        BC        NA    NA -122.9389 38.31551
>> 17217         C        NA    NA -123.0747 38.97998
>> 17218         C        NA    NA -123.6580 41.59610
>> 17219         C        NA    NA -123.4513 40.70992
>> 17220         C        NA    NA -123.0901 39.06473
>> 17221        BC        NA    NA -123.0653 38.94845
>> 17222        BC        NA    NA -122.9464 38.36808
>> 17223      <NA>       464    NA -123.0143 38.70205
>> 17224      <NA>        NA     5 -122.8609 37.94137
>> 17225      <NA>        NA     5 -122.8628 37.95057
>> 17226      <NA>        NA     7 -122.8646 37.95978
>>
>> For future reference, perhaps paste this in a way that's easy for us to
>> paste into a running R session so we can use it, like so:
>>
>> df <- data.frame(
>> coln=c(17209, 17210, 17211, 17212, 17213, 17214, 17215, 17216, 17217,
>> 17218, 17219, 17220, 17221, 17222, 17223, 17224, 17225, 17226),
>>
>> attr1=c("D","BC","B","BC","C",NA,"C","BC","C","C","C","C","BC","BC",NA,NA,NA,NA),
>> attr2=c( NA,NA,NA,NA,NA,464,NA,NA,NA,NA,NA,NA,NA,NA,464,NA,NA,NA),
>> attr3=c(NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,5,5,7),
>> LON=c(
>> -122.9409,-122.9581,-123.6851,-123.0724,-123.7240,-122.9430,-123.4442,-122.9389,-123.0747,-123.6580,-123.4513,-123.0901,-123.0653,-122.9464,-123.0143,-122.8609,-122.8628,-122.8646),
>>
>> LAT=c(38.27645,38.36304,41.67121,38.93073,41.84403,38.30988,40.65369,38.31551,38.97998,41.59610,40.70992,39.06473,38.94845,38.36808,38.70205,37.94137,37.95057,37.95978))
>>
>>
>> If I wanted to find the row with Lat = 37.95978
>>
>> Using an "indexing vector":
>>
>> R> lats <- df$LAT == 37.95978
>> # or with the %~% from before:
>> # lats <- df$LAT %~% 37.95978
>> R> df[lats,]
>>   coln attr1 attr2 attr3       LON      LAT
>> 18 17226  <NA>    NA     7 -122.8646 37.95978
>>
>> Using the "subset" function:
>>
>> R> subset(df, LAT == 37.95978)
>>   coln attr1 attr2 attr3       LON      LAT
>> 18 17226  <NA>    NA     7 -122.8646 37.95978
>>
>>
>> How would I do that? How would I find the rows with BC?
>>
>> R> subset(df, attr1 == 'BC')
>>   coln attr1 attr2 attr3       LON      LAT
>> 2  17210    BC    NA    NA -122.9581 38.36304
>> 4  17212    BC    NA    NA -123.0724 38.93073
>> 8  17216    BC    NA    NA -122.9389 38.31551
>> 13 17221    BC    NA    NA -123.0653 38.94845
>> 14 17222    BC    NA    NA -122.9464 38.36808
>>
>>
>> If you try with an "indexing vector" the NA's will trip you up:
>>
>> R> df[df$attr1 == 'BC',]
>>     coln attr1 attr2 attr3       LON      LAT
>> 2    17210    BC    NA    NA -122.9581 38.36304
>> 4    17212    BC    NA    NA -123.0724 38.93073
>> NA      NA  <NA>    NA    NA        NA       NA
>> 8    17216    BC    NA    NA -122.9389 38.31551
>> 13   17221    BC    NA    NA -123.0653 38.94845
>> 14   17222    BC    NA    NA -122.9464 38.36808
>> NA.1    NA  <NA>    NA    NA        NA       NA
>> NA.2    NA  <NA>    NA    NA        NA       NA
>> NA.3    NA  <NA>    NA    NA        NA       NA
>> NA.4    NA  <NA>    NA    NA        NA       NA
>>
>> So you could do something like:
>>
>> R> df[df$attr1 == 'BC' & !is.na(df$attr1),]
>>   coln attr1 attr2 attr3       LON      LAT
>> 2  17210    BC    NA    NA -122.9581 38.36304
>> 4  17212    BC    NA    NA -123.0724 38.93073
>> 8  17216    BC    NA    NA -122.9389 38.31551
>> 13 17221    BC    NA    NA -123.0653 38.94845
>> 14 17222    BC    NA    NA -122.9464 38.36808
>>
>>
>> HTH,
>> -steve
>>
>> --
>> Steve Lianoglou
>> Graduate Student: Physiology, Biophysics and Systems Biology
>> Weill Medical College of Cornell University
>>
>> Contact Info: 
>> http://cbio.mskcc.org/~lianos/contact
>>
>>
>>
>>
>>
> --
> Steve Lianoglou
> Graduate Student: Computational Systems Biology
>  |  Memorial Sloan-Kettering Cancer Center
>  |  Weill Medical College of Cornell University
> Contact Info:
> http://cbio.mskcc.org/~lianos/contact
>
>


______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
