Re: [R] how to automatically select certain columns using for loop in dataframe

milton ruser Thu, 09 Apr 2009 20:34:45 -0700

Hi Ferry,

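It is not so elegant, but you can try filtering in a second step. paste()
returns a character string, which is never NA, so the is.na() test inside
your subset() call is always TRUE and drops nothing: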


for (each_name in col_names) {

       # select the NUM_ and NAME_ columns for this group
       sub.data <- subset( all.data,
                           select = c( paste("NUM_", each_name, sep = ''),
                                       paste("NAME_", each_name, sep = '') )
                         )
       # keep only the rows whose NAME_ value (column 2) is not NA
       sub.data.2 <- subset(sub.data, !is.na(sub.data[,2]))
       print(sub.data.2)
}
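
If you prefer a single step, here is a minimal sketch that looks the column
up by its name with [[ ]] instead of handing the name itself to is.na(). The
data are copied from your message; the names results, num_col, name_col and
keep are only my own illustration, not anything from your code.

```r
# Sample data copied from the question below.
all.data <- data.frame(
  NUM_A = 1:5, NAME_A = c("Andy", "Andrew", "Angus", "Alex", "Argo"),
  NUM_B = 1:5, NAME_B = c(NA, "Barn", "Bolton", "Bravo", NA),
  NUM_C = 1:5, NAME_C = c("Candy", NA, "Cecil", "Crayon", "Corey"),
  NUM_D = 1:5, NAME_D = c("David", "Delta", NA, NA, "Dummy")
)
col_names <- c("A", "B", "C", "D")

# paste() only builds a column name; that string itself is never NA.
stopifnot(!is.na(paste("NAME_", "B", sep = "")))

results <- list()  # hypothetical container so the subsets can be reused later
for (each_name in col_names) {
  num_col  <- paste("NUM_",  each_name, sep = "")
  name_col <- paste("NAME_", each_name, sep = "")
  # all.data[[name_col]] fetches the column whose name is stored in name_col
  keep <- !is.na(all.data[[name_col]])
  results[[each_name]] <- all.data[keep, c(num_col, name_col)]
  print(results[[each_name]])
}
```

Keeping the subsets in a list is optional; it just makes it easy to check
one of them afterwards, for example results$B.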

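You also mention that the value to drop might be zero or some other specific
value rather than NA. A rough sketch of one way to handle that, assuming the
unwanted values can be listed up front: the helper is_unwanted() and the
sample values "" and "0" are hypothetical, chosen only for illustration. An
atomic column cannot hold NULL, so only NA and listed values are tested.

```r
# Hypothetical sample: "" and "0" stand in for "other specific values".
demo <- data.frame(
  NUM_B  = 1:5,
  NAME_B = c(NA, "Barn", "", "Bravo", "0"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: TRUE where a value is NA or one of drop_values.
is_unwanted <- function(x, drop_values = c("", "0")) {
  is.na(x) | x %in% drop_values
}

# Keep only the rows whose NAME_B is neither NA nor a listed value.
kept <- demo[!is_unwanted(demo[["NAME_B"]]), c("NUM_B", "NAME_B")]
print(kept)
```

The same helper could replace the is.na() test inside the loop over
col_names.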

On Thu, Apr 9, 2009 at 6:30 PM, Ferry <[email protected]> wrote:

> Hi,
>
> I am trying to display / print certain columns in my data frame that share
> a certain condition (for example, part of the column name). I am using a
> for loop, as follows:
>
> # below is the sample data structure
> all.data <- data.frame( NUM_A = 1:5, NAME_A = c("Andy", "Andrew", "Angus", "Alex", "Argo"),
>                        NUM_B = 1:5, NAME_B = c(NA, "Barn", "Bolton", "Bravo", NA),
>                        NUM_C = 1:5, NAME_C = c("Candy", NA, "Cecil", "Crayon", "Corey"),
>                        NUM_D = 1:5, NAME_D = c("David", "Delta", NA, NA, "Dummy") )
>
> col_names <- c("A", "B", "C", "D")
>
> > all.data
>  NUM_A NAME_A NUM_B NAME_B NUM_C NAME_C NUM_D NAME_D
> 1     1   Andy     1   <NA>     1  Candy     1  David
> 2     2 Andrew     2   Barn     2   <NA>     2  Delta
> 3     3  Angus     3 Bolton     3  Cecil     3   <NA>
> 4     4   Alex     4  Bravo     4 Crayon     4   <NA>
> 5     5   Argo     5   <NA>     5  Corey     5  Dummy
> >
>
> Then, for each entry in col_names, I want to display the matching columns:
>
> for (each_name in col_names) {
>
>        sub.data <- subset( all.data,
>                            !is.na( paste("NAME_", each_name, sep = '') ),
>                            select = c( paste("NUM_", each_name, sep = ''),
>                                        paste("NAME_", each_name, sep = '') )
>                          )
>        print(sub.data)
> }
>
> the "incorrect" result:
>
> NUM_A NAME_A
> 1     1   Andy
> 2     2 Andrew
> 3     3  Angus
> 4     4   Alex
> 5     5   Argo
>  NUM_B NAME_B
> 1     1   <NA>
> 2     2   Barn
> 3     3 Bolton
> 4     4  Bravo
> 5     5   <NA>
>  NUM_C NAME_C
> 1     1  Candy
> 2     2   <NA>
> 3     3  Cecil
> 4     4 Crayon
> 5     5  Corey
>  NUM_D NAME_D
> 1     1  David
> 2     2  Delta
> 3     3   <NA>
> 4     4   <NA>
> 5     5  Dummy
> >
>
> What I want to achieve is that the result displays only the rows whose
> NAME is not NA. More generally, the value to exclude could be NULL, zero,
> or some other specific value instead of NA.
>
> the "correct" result:
>
> NUM_A NAME_A
> 1     1   Andy
> 2     2 Andrew
> 3     3  Angus
> 4     4   Alex
> 5     5   Argo
>  NUM_B NAME_B
> 2     2   Barn
> 3     3 Bolton
> 4     4  Bravo
>  NUM_C NAME_C
> 1     1  Candy
> 3     3  Cecil
> 4     4 Crayon
> 5     5  Corey
>  NUM_D NAME_D
> 1     1  David
> 2     2  Delta
> 5     5  Dummy
> >
>
> I am guessing that I am not using the correct type in the following
> statement (within the subset in the loop):
> !is.na( paste("NAME_", each_name, sep = '') )
>
> But then, I might be using the wrong approach entirely.
>
> Any idea is definitely appreciated.
>
> Thank you,
>
> Ferry
>
>        [[alternative HTML version deleted]]
>
> ______________________________________________
> [email protected] mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
