Re: [R] Adding a new conditional column to a list of dataframes

2018-04-15 Thread David Winsemius

> On Apr 15, 2018, at 4:08 AM, Allaisone 1  wrote:
> 
> 
> Hi all ..,
> 
> 
> I have a list of 7000 dataframes with similar column headers and I wanted to 
> add a new column to each dataframe based on a certain condition which is the 
> same for all dataframes.
> 
> 
> When I extract one dataframe and apply my code it works very well as follows 
> :-
> 
> 
> First suppose this is my first dataframe in the list
> 
>> OneDF <- Mylist[[1]]
> 
>> OneDF
> 
> 
> ID   Pdate        Tdate
> 1    2010-09-30   2011-05-10
> 2    2011-11-07   2009-09-31
> 3    2012-01-05   2008-06-23
> 
> 
> I want to add a new column in which "C" is written only if the date in the
> "Tdate" column is earlier than the first date (row 1) in the "Pdate" column.
> Otherwise "NA" is written.
> 
> I have written this code to do so :-
> 
> 
> OneDF$NewCol [ OneDF[ ,3] <  OneDF[ 1,2] ] <- "C"
> 
> 
> This gave me what I want as follows :-
> 
> 
> ID   Pdate        Tdate        NewCol
> 1    2010-09-30   2011-05-10   NA
> 2    2011-11-07   2009-09-31   C
> 3    2012-01-05   2008-06-23   C
> 
> 
> However, when I put this code in a function and applied that function to all
> dataframes using lapply(), I did not get what I want.
> 
> 
> I wrote this function first :-
> 
> 
> MyFunction <- function(x) x$NewCol [ x[ ,3] <  x[ 1,2] ] <- "C"
> 
> 
> Then I wrote this code to apply my function to all dataframes in "Mylist" :
> 
> 
> NewList <- lapply(names(Mylist), function(x) MyFunction(Mylist[[x]]))
> 
> 
> This returned a list of 7000 elements, each of which contains only the letter
> "C". Each dataframe has become a vector holding "C", which is far from what
> I need.
> 
> I expected to see a list of my 7000 dataframes and each of which looks like 
> the output
> 
> I have shown above with the new column.
> 
> 
> I spent a lot of time trying to find the mistake in these last two pieces of
> code but was not able to identify the issue.

A function returns the value of the last expression it evaluates. In your case 
that expression was an assignment built on `[<-`, and the value of an assignment 
expression is only its RHS (in your case "C"), returned invisibly. That 
assignment has its predominant action via side effect on the local copy of x 
rather than by a truly functional operation. The function might have been written:

 # return x explicitly so that the modified data frame gets returned
 MyFunction <- function(x) { x$NewCol[x[, 3] < x[1, 2]] <- "C"; x }

I say "might" since you have not included a reproducible example.
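
For anyone following along, here is a minimal reproducible sketch of the difference. The data are made up (the dates are chosen to be valid and are not the poster's), and the names Broken and Fixed are hypothetical labels for the two versions of the function:

```r
# Hypothetical data; dates invented for illustration only.
OneDF <- data.frame(
  ID    = 1:3,
  Pdate = as.Date(c("2010-09-30", "2011-11-07", "2012-01-05")),
  Tdate = as.Date(c("2011-05-10", "2009-09-30", "2008-06-23"))
)

# Version from the question: the last expression is an assignment,
# so the value of the function is the right-hand side, "C".
Broken <- function(x) x$NewCol[x[, 3] < x[1, 2]] <- "C"

# Corrected version: the modified local copy of x is returned explicitly.
Fixed <- function(x) {
  x$NewCol[x[, 3] < x[1, 2]] <- "C"
  x
}

broken_result <- Broken(OneDF)
fixed_result  <- Fixed(OneDF)

print(broken_result)  # a single string, not a data frame
print(fixed_result)   # the data frame with NewCol added
```

Note that OneDF itself is unchanged after either call, because the function only modifies its own local copy.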

Another point: The `$` operator is not ideal for work within functions.
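
One likely reading of this remark is that `$` takes a literal column name, so it cannot use a name held in a variable, and it can partially match names when extracting. A sketch using `[[` instead is below. The helper add_flag() and its argument names are hypothetical, the data are made up, and ifelse() is swapped in for the indexed assignment used in the question:

```r
# add_flag() is a hypothetical helper, not code from the thread.
# Column names are passed as strings, which `$` could not accept.
add_flag <- function(x, date_col = "Tdate", ref_col = "Pdate",
                     new_col = "NewCol", flag = "C") {
  ref <- x[[ref_col]][1]  # first date in the reference column
  # flag rows whose date is earlier than the reference, NA otherwise
  x[[new_col]] <- ifelse(x[[date_col]] < ref, flag, NA_character_)
  x
}

# Made-up data for illustration.
OneDF <- data.frame(
  ID    = 1:3,
  Pdate = as.Date(c("2010-09-30", "2011-11-07", "2012-01-05")),
  Tdate = as.Date(c("2011-05-10", "2009-09-30", "2008-06-23"))
)

flagged <- add_flag(OneDF)
renamed <- add_flag(OneDF, new_col = "Late", flag = "X")
print(flagged)
```

Passing the names as arguments also removes the dependence on column positions 2 and 3.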

And, noting this:

>   [[alternative HTML version deleted]]
Do read the Posting Guide.
-- 

David Winsemius
Alameda, CA, USA

'Any technology distinguishable from magic is insufficiently advanced.'   
-Gehm's Corollary to Clarke's Third Law

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Adding a new conditional column to a list of dataframes

2018-04-15 Thread Jeff Newmiller
Your failure to send your question using plain text format means that the 
mailing list tried to fix that and we are seeing your code all messed up. 
Please learn how to use your email program... or we may not even be able to 
figure out your question at all. 

I think you need to pay attention to what your function is returning... 
the value of the last expression evaluated in the function is key. Try testing 
your function separately from the lapply to ensure you get what you want. Also 
be sure to use braces when they are needed to define your function. 


-- 
Sent from my phone. Please excuse my brevity.



Re: [R] Adding a new conditional column to a list of dataframes

2018-04-15 Thread Duncan Murdoch

On 15/04/2018 7:08 AM, Allaisone 1 wrote:
> Hi all ..,
>
> I have a list of 7000 dataframes with similar column headers and I wanted to
> add a new column to each dataframe based on a certain condition which is the
> same for all dataframes.
>
> When I extract one dataframe and apply my code it works very well as follows :-
>
> First suppose this is my first dataframe in the list
>
>> OneDF <- Mylist[[1]]
>> OneDF
>
> ID   Pdate        Tdate
> 1    2010-09-30   2011-05-10
> 2    2011-11-07   2009-09-31
> 3    2012-01-05   2008-06-23
>
> I want to add a new column in which "C" is written only if the date in the
> "Tdate" column is earlier than the first date (row 1) in the "Pdate" column.
> Otherwise "NA" is written.
>
> I have written this code to do so :-
>
> OneDF$NewCol [ OneDF[ ,3] <  OneDF[ 1,2] ] <- "C"
>
> This gave me what I want as follows :-
>
> ID   Pdate        Tdate        NewCol
> 1    2010-09-30   2011-05-10   NA
> 2    2011-11-07   2009-09-31   C
> 3    2012-01-05   2008-06-23   C
>
> However, when I put this code in a function and applied that function to all
> dataframes using lapply(), I did not get what I want.
>
> I wrote this function first :-
>
> MyFunction <- function(x) x$NewCol [ x[ ,3] <  x[ 1,2] ] <- "C"
>
> Then I wrote this code to apply my function to all dataframes in "Mylist" :
>
> NewList <- lapply(names(Mylist), function(x) MyFunction(Mylist[[x]]))
>
> This returned a list of 7000 elements, each of which contains only the letter
> "C". Each dataframe has become a vector holding "C", which is far from what
> I need.
>
> I expected to see a list of my 7000 dataframes and each of which looks like
> the output I have shown above with the new column.
>
> I spent a lot of time trying to find the mistake in these last two pieces of
> code but was not able to identify the issue.
>
> Could you please let me know my mistake and how to correct my syntax ?



Your function should return x after modifying it.  As it is, it returns 
the value of x$NewCol [ x[ ,3] <  x[ 1,2] ] <- "C", which is "C".  So 
change it to


MyFunction <- function(x) {
  x$NewCol [ x[ ,3] <  x[ 1,2] ] <- "C"
  x
}
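
With the function corrected this way, the lapply() call can also be simplified. The sketch below uses a made-up two-element list (the names "a" and "b" and all dates are assumptions). It also shows one side effect of iterating over names(Mylist) as in the question: the result loses its names.

```r
MyFunction <- function(x) {
  x$NewCol[x[, 3] < x[1, 2]] <- "C"
  x
}

# Hypothetical stand-in for the list of 7000 data frames.
Mylist <- list(
  a = data.frame(
    ID    = 1:3,
    Pdate = as.Date(c("2010-09-30", "2011-11-07", "2012-01-05")),
    Tdate = as.Date(c("2011-05-10", "2009-09-30", "2008-06-23"))
  ),
  b = data.frame(
    ID    = 1:3,
    Pdate = as.Date(c("2012-03-01", "2012-06-15", "2013-02-20")),
    Tdate = as.Date(c("2013-01-15", "2011-12-31", "2010-07-04"))
  )
)

# Iterating over the list itself keeps the element names.
NewList <- lapply(Mylist, MyFunction)
names(NewList)

# Iterating over the names, as in the question, works but returns
# an unnamed list, because the character vector has no names.
by_name <- lapply(names(Mylist), function(nm) MyFunction(Mylist[[nm]]))
names(by_name)
```

If the names are needed with the second form, they can be restored afterwards with names(by_name) <- names(Mylist).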

Duncan Murdoch
