Re: [R] concatenating columns in data.frame

2021-07-01 Thread Jeff Newmiller
I use parts of the tidyverse frequently, but this post is the best argument I 
can imagine for learning base R techniques.

On July 1, 2021 8:41:06 PM PDT, Avi Gross via R-help  
wrote:
>Micha,
>
>Others have provided ways in standard R so I will contribute a somewhat
>odd solution using the dplyr and related packages in the tidyverse
>including a sample data.frame/tibble I made. It requires newer versions
>of R and other  packages as it uses some fairly esoteric features
>including "the big bang" and the new ":=" operator and more.

Re: [R] List / Matrix to Data Frame

2021-07-01 Thread Avi Gross via R-help
Bill,

A matrix can only contain one kind of data. I ran your code after modifying
it to be proper and took a transpose to get it right-side up:

t(wanted)
           date         netIncome    grossProfit
2020-09-30 "2020-09-30" "5741100.00" "10495600.00"
2019-09-30 "2019-09-30" "5525600.00" "9839200.00"
2018-09-30 "2018-09-30" "5953100.00" "10183900.00"
2017-09-30 "2017-09-30" "4835100.00" "8818600.00"

That looks better, I think.

So I did this:

wanted <- t(wanted)

It has rownames and colnames and you can make it a data frame easily enough:

mydf <- data.frame(wanted)

But the columns are all character strings, so CONVERT them as you wish:

> mydf
                 date  netIncome grossProfit
2020-09-30 2020-09-30 5741100.00 10495600.00
2019-09-30 2019-09-30 5525600.00  9839200.00
2018-09-30 2018-09-30 5953100.00 10183900.00

Your numbers are quite large and may or may not be meant to be integers:

> mydf$netIncome <- as.numeric(mydf$netIncome)
> mydf$grossProfit <- as.numeric(mydf$grossProfit)
> head(mydf)
                 date  netIncome grossProfit
2020-09-30 2020-09-30 5.7411e+10 1.04956e+11
2019-09-30 2019-09-30 5.5256e+10 9.83920e+10
2018-09-30 2018-09-30 5.9531e+10 1.01839e+11
2017-09-30 2017-09-30 4.8351e+10 8.81860e+10
2016-09-30 2016-09-30 4.5687e+10 8.42630e+10
2015-09-30 2015-09-30 5.3394e+10 9.36260e+10

The first entries may have something wrong as they become NA when I make
them integers.
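
One likely explanation, offered here as an assumption rather than something
confirmed in the thread: R integers are 32-bit, so any value above
.Machine$integer.max cannot be stored as an integer and as.integer() returns
NA with a warning. The sketch below uses a made-up value of the same
magnitude as the netIncome figures.

```r
# Hypothetical value, same order of magnitude as the netIncome column
x <- 5.7411e+10

# Largest value an R integer can hold (32-bit signed)
.Machine$integer.max

# Anything larger cannot be converted: the result is NA (with a warning)
y <- suppressWarnings(as.integer(x))
is.na(y)

# Scaling down first (here to millions) brings the value back into range
z <- as.integer(x / 1e6)
z
```

Keeping the column as numeric (double) avoids the problem entirely when the
full magnitude matters.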

The date column is the same as the rownames and is not in a normal vector
format. It shows as a list and you may want to convert it to one of several
formats R supports for dates or a more normal character string. 

So here is how I made it a character string:

> mydf <- as.data.frame(wanted)
> mydf$date <- as.character(mydf$date)
> mydf$netIncome <- as.numeric(mydf$netIncome)
> mydf$grossProfit <- as.numeric(mydf$grossProfit)
> head(mydf)
                 date  netIncome grossProfit
2020-09-30 2020-09-30 5.7411e+10 1.04956e+11
2019-09-30 2019-09-30 5.5256e+10 9.83920e+10
2018-09-30 2018-09-30 5.9531e+10 1.01839e+11
2017-09-30 2017-09-30 4.8351e+10 8.81860e+10
2016-09-30 2016-09-30 4.5687e+10 8.42630e+10
2015-09-30 2015-09-30 5.3394e+10 9.36260e+10

If you want a DATE, it can now be converted again using one of many methods.
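
As one possible method (a minimal sketch, using a small made-up vector in
place of mydf$date), base R's as.Date() parses ISO-style strings directly:

```r
# Toy stand-in for the date column after as.character()
dates_chr <- c("2020-09-30", "2019-09-30", "2018-09-30")

# The strings are in YYYY-MM-DD form; the format is spelled out for clarity
dates <- as.Date(dates_chr, format = "%Y-%m-%d")

class(dates)

# Once converted, date arithmetic and date axes in plots work as expected
gap <- dates[1] - dates[2]
gap
```

Other date classes (for example POSIXct) would work too; Date is simply the
lightest option when there is no time-of-day component.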

Just FYI, numbers that big and rounded might work just as well measured in
millions, which is what I did here to both columns:

> mydf$grossProfit <- as.integer(mydf$grossProfit/1000000)
> mydf$netIncome <- as.integer(mydf$netIncome/1000000)
> head(mydf)
                 date netIncome grossProfit
2020-09-30 2020-09-30     57411      104956
2019-09-30 2019-09-30     55256       98392
2018-09-30 2018-09-30     59531      101839
2017-09-30 2017-09-30     48351       88186
2016-09-30 2016-09-30     45687       84263
2015-09-30 2015-09-30     53394       93626

Some of the numbers are negative though.

If rownames are not needed:

> rownames(mydf) <- NULL
> head(mydf)
        date netIncome grossProfit
1 2020-09-30     57411      104956
2 2019-09-30     55256       98392
3 2018-09-30     59531      101839
4 2017-09-30     48351       88186
5 2016-09-30     45687       84263
6 2015-09-30     53394       93626

It may be easier to work with this, but again, you may need to convert the
dates to real dates, for example for graphing.

Hope that helps. 





Re: [R] concatenating columns in data.frame

2021-07-01 Thread Avi Gross via R-help
Micha,

Others have provided ways in standard R so I will contribute a somewhat odd 
solution using the dplyr and related packages in the tidyverse including a 
sample data.frame/tibble I made. It requires newer versions of R and other  
packages as it uses some fairly esoteric features including "the big bang" and 
the new ":=" operator and more.

You can use your own data with whatever columns you need, of course.

The goal is to take a tibble with umpteen columns and add one additional
column that is the result of concatenating, row by row, the contents of a
dynamically supplied vector of quoted column names. First we need something
to work with, so here is a sample:

#--start
# load required packages, or a bunch at once!
library(tidyverse)

# Pick how many rows you want. For a demo, 3 is plenty
N <- 3

# Make a sample tibble with N rows and the following 4 columns
mydf <- tibble(alpha = 1:N,
               beta = letters[1:N],
               gamma = N:1,
               delta = month.abb[1:N])

# show the original tibble
print(mydf)
#--end

In flat text mode, here is the output:

> print(mydf)
# A tibble: 3 x 4
  alpha beta  gamma delta
  <int> <chr> <int> <chr>
1     1 a         3 Jan
2     2 b         2 Feb
3     3 c         1 Mar

Now I want to make a function that is used instead of the mutate verb. I made a 
weird one-liner that is a tad hard to explain so first let me mention the 
requirements.

It will take a first argument that is a tibble and in a pipeline this would be 
passed invisibly.
The second required argument is a vector or list containing the names of the 
columns as strings. A column can be re-used multiple times.
The third optional argument is what to name the new column with a default if 
omitted.
The fourth optional argument allows you to choose a different separator than "" 
if you wish.

The function should be usable in a pipeline on both sides so it should also 
return the input tibble with an extra column to the output.

Here is the function:

my_mutate <- function(df, columns, colnew="concatenated", sep=""){
  df %>%
    mutate( "{colnew}" := paste(!!!rlang::syms(columns), sep = sep ))
}
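
To unpack the two less familiar pieces, here is a minimal sketch using only
rlang. The small data frame and the names `toy`, `e`, and `res` are made up
for illustration. syms() turns strings into symbols, and the "big bang" `!!!`
splices them into the call as separate arguments; the "{colnew}" := part is
glue-style naming of the new column, which needs a reasonably recent dplyr.

```r
library(rlang)

columns <- c("beta", "delta")

# syms() converts each string into a symbol, i.e. a bare column name
syms(columns)

# !!! splices the list of symbols into the call as individual arguments,
# so the expression built here is paste(beta, delta, sep = "")
e <- expr(paste(!!!syms(columns), sep = ""))
print(e)

# Evaluating the expression against a data frame mimics what mutate() does
toy <- data.frame(beta = c("a", "b"), delta = c("Jan", "Feb"))
res <- eval_tidy(e, data = toy)
res
```

Because the symbols are spliced in the order given, repeating a name in
`columns` simply repeats that column in the pasted result.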

Yes, the above can be done inline as a long one-liner:

my_mutate <- function(df, columns, colnew="concatenated", sep="") mutate(df, 
"{colnew}" := paste(!!!rlang::syms(columns), sep = sep ))

Here are examples of it running:


> choices <- c("beta", "delta", "alpha", "delta")
> mydf %>% my_mutate(choices, "me2")
# A tibble: 3 x 5
  alpha beta  gamma delta me2
  <int> <chr> <int> <chr> <chr>
1     1 a         3 Jan   aJan1Jan
2     2 b         2 Feb   bFeb2Feb
3     3 c         1 Mar   cMar3Mar
> mydf %>% my_mutate(choices, "me2",":")
# A tibble: 3 x 5
  alpha beta  gamma delta me2
  <int> <chr> <int> <chr> <chr>
1     1 a         3 Jan   a:Jan:1:Jan
2     2 b         2 Feb   b:Feb:2:Feb
3     3 c         1 Mar   c:Mar:3:Mar
> mydf %>% my_mutate(c("beta", "beta", "gamma", "gamma", "delta", "alpha"))
# A tibble: 3 x 5
  alpha beta  gamma delta concatenated
  <int> <chr> <int> <chr> <chr>
1     1 a         3 Jan   aa33Jan1
2     2 b         2 Feb   bb22Feb2
3     3 c         1 Mar   cc11Mar3
> mydf %>% my_mutate(list("beta", "beta", "gamma", "gamma", "delta", "alpha"))
# A tibble: 3 x 5
  alpha beta  gamma delta concatenated
  <int> <chr> <int> <chr> <chr>
1     1 a         3 Jan   aa33Jan1
2     2 b         2 Feb   bb22Feb2
3     3 c         1 Mar   cc11Mar3
> mydf %>% my_mutate(columns=list("alpha", "beta", "gamma", "delta",
+                                 "gamma", "beta", "alpha"),
+                    sep="/*/",
+                    colnew="NewRandomNAME"
+ )
# A tibble: 3 x 5
  alpha beta  gamma delta NewRandomNAME
  <int> <chr> <int> <chr> <chr>
1     1 a         3 Jan   1/*/a/*/3/*/Jan/*/3/*/a/*/1
2     2 b         2 Feb   2/*/b/*/2/*/Feb/*/2/*/b/*/2
3     3 c         1 Mar   3/*/c/*/1/*/Mar/*/1/*/c/*/3

Does this meet your normal need? Just to show it works in a pipeline, here is a 
variant:

mydf %>%
  tail(2) %>%
  my_mutate(c("beta", "beta"), "betabeta") %>%
  print() %>%
  my_mutate(list("alpha", "betabeta", "gamma"),
            "buildson",
            "&")

The above only keeps the last two lines of the tibble, makes a double copy of 
"beta" under a new name, prints the intermediate result, continues to make 
another concatenation using the variable created earlier then prints the result:

Here is the run:

> mydf %>%
+   tail(2) %>%
+   my_mutate(c("beta", "beta"), "betabeta") %>%
+   print() %>%
+   my_mutate(list("alpha", "betabeta", "gamma"),
+             "buildson",
+             "&")
# A tibble: 2 x 5
  alpha beta  gamma delta betabeta
  <int> <chr> <int> <chr> <chr>
1     2 b         2 Feb   bb
2     3 c         1 Mar   cc
# A tibble: 2 x 6
  alpha beta  gamma delta betabeta buildson
  <int> <chr> <int> <chr> <chr>    <chr>
1     2 b         2 Feb   bb       2&bb&2
2     3 c         1 Mar   cc       3&cc&1

As to how the darn function 

Re: [R] List / Matrix to Data Frame

2021-07-01 Thread Bill Dunlap
Does this do what you want?

> df <- data.frame(check.names=FALSE,
+   lapply(c(Date="date",netIncome="netIncome",`Gross Profit`="grossProfit"),
+     function(nm)vapply(ISY, "[[", nm, FUN.VALUE=NA_character_)))
> str(df)
'data.frame':   36 obs. of  3 variables:
 $ Date        : chr  "2020-09-30" "2019-09-30" "2018-09-30" "2017-09-30" ...
 $ netIncome   : chr  "5741100.00" "5525600.00" "5953100.00" "4835100.00" ...
 $ Gross Profit: chr  "10495600.00" "9839200.00" "10183900.00" "8818600.00" ...
> df$Date <- as.Date(df$Date)
> df$netIncome <- as.numeric(df$netIncome)
> df$`Gross Profit` <- as.numeric(df$`Gross Profit`)
> str(df)
'data.frame':   36 obs. of  3 variables:
 $ Date        : Date, format: "2020-09-30" "2019-09-30" "2018-09-30" "2017-09-30" ...
 $ netIncome   : num  5.74e+10 5.53e+10 5.95e+10 4.84e+10 4.57e+10 ...
 $ Gross Profit: num  1.05e+11 9.84e+10 1.02e+11 8.82e+10 8.43e+10 ...
> with(df, plot(Date, netIncome))
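
For anyone who wants to try the same pattern without the API call, here is a
self-contained sketch. `ISY_toy` is a made-up stand-in for ISY with invented
figures; the real object has more records and more fields per record.

```r
# Hypothetical stand-in for ISY: a named list of records, every field a string
ISY_toy <- list(
  "2020-09-30" = list(date = "2020-09-30", netIncome = "100.00",
                      grossProfit = "250.00"),
  "2019-09-30" = list(date = "2019-09-30", netIncome = "90.00",
                      grossProfit = "200.00")
)

# For each wanted field, pull that one field out of every record.
# FUN.VALUE = NA_character_ makes vapply() insist on exactly one string each.
fields <- c(Date = "date", netIncome = "netIncome",
            `Gross Profit` = "grossProfit")
cols <- lapply(fields, function(nm) {
  vapply(ISY_toy, "[[", nm, FUN.VALUE = NA_character_)
})

# A named list of equal-length vectors is what data.frame() expects
df_toy <- data.frame(cols, check.names = FALSE, stringsAsFactors = FALSE)

# Convert each column to its proper type
df_toy$Date <- as.Date(df_toy$Date)
df_toy$netIncome <- as.numeric(df_toy$netIncome)
df_toy$`Gross Profit` <- as.numeric(df_toy$`Gross Profit`)
str(df_toy)
```

Building one vector per column, rather than one row per record, is what
avoids the character matrix that sapply() produces.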


__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] List / Matrix to Data Frame

2021-07-01 Thread Sparks, John
Hi R-Helpers,

I am taking it upon myself to delve into the world of lists for R.  In no small 
part because I appear to have discovered a source of data for an exceptionally 
good price but that delivers much of that data in json format.

So over the last day or so I managed to fight the list processing tools to a 
draw and get a list that has only selected elements (actually it ends up in 
matrix form).  But when I try to convert that to a data frame I can't get it to 
a form that is workable.

I have visited some pages about converting a matrix to a data frame but they 
result in highly redundant and inelegant data.

I am thinking that someone who works with lists and matrices knows how to do 
this quite easily and would be willing to provide a solution.

The reproducible example is shown below.  Just to be explicit, what I am trying 
to get to is something along the lines of a data frame like this.

Date  netIncome  Gross Profit
2020-09-30 5741100 10495600
2019-09-30 5525600   9839200

.

The closest I get is a matrix that looks like this

> wanted
            2020-09-30    2019-09-30   2018-09-30    2017-09-30   2016-09-30   2015-09-30   2014-09-30
date        "2020-09-30"  "2019-09-30" "2018-09-30"  "2017-09-30" "2016-09-30" "2015-09-30" "2014-09-30"
netIncome   "5741100.00"  "5525600.00" "5953100.00"  "4835100.00" "4568700.00" "5339400.00" "3951000.00"
grossProfit "10495600.00" "9839200.00" "10183900.00" "8818600.00" "8426300.00" "9362600.00" "7053700.00"

Code for example

library(jsonlite)
test <- jsonlite::fromJSON("https://eodhistoricaldata.com/api/fundamentals/AAPL.US?api_token=OeAFFmMliFG5orCUuwAKQ8l4WWFQ67YX")

hist<-test[[13]]
ISY<-hist$Income_Statement$yearly
wanted<-sapply(ISY, "[", j = c("date","netIncome","grossProfit"))


Your guidance would be much appreciated.

--John J. Sparks, Ph.D.



Re: [R] get data frame using DBI package

2021-07-01 Thread Jeff Newmiller
Please make the effort to post plain text email... formatted email gets 
stripped and what remains is often garbled.

Most people use dbGetQuery... but reading the help page accessible via 
?dbSendQuery should clarify how it should be used.
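
A minimal sketch of both routes follows. It assumes an in-memory SQLite
database (via the RSQLite package) as a stand-in for the MSSQL/odbc
connection, and a made-up table called `demo`; with a real odbc connection
only the dbConnect() call and the SQL would differ.

```r
library(DBI)

# Assumption: in-memory SQLite stands in for the odbc/MSSQL connection
con <- dbConnect(RSQLite::SQLite(), ":memory:")
dbWriteTable(con, "demo", data.frame(id = 1:3, name = c("a", "b", "c")))

# Route 1: dbGetQuery() sends the query, fetches every row, and cleans up
df1 <- dbGetQuery(con, "SELECT * FROM demo")

# Route 2: dbSendQuery() only returns a result object, so the rows must be
# fetched explicitly and the result released afterwards
res <- dbSendQuery(con, "SELECT * FROM demo")
df2 <- dbFetch(res)
dbClearResult(res)

dbDisconnect(con)

str(df1)
```

The second route is mainly useful when the result is too large to fetch in
one go, since dbFetch() accepts an `n` argument for reading in chunks.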

On July 1, 2021 3:36:46 PM PDT, Kai Yang via R-help  
wrote:
>Hi List,I use odbc to connect a MSSQL server. When I run the script of
>"res <- dbSendQuery(con, "SELECT * FROM BIODBX.MECCUNIQUE2")", the res
>is "Formal class OdbcResult". Can someone help me to modify the code to
>get a data frame?Thanks,Kai

-- 
Sent from my phone. Please excuse my brevity.



[R] get data frame using DBI package

2021-07-01 Thread Kai Yang via R-help
Hi List,

I use odbc to connect to an MSSQL server. When I run

res <- dbSendQuery(con, "SELECT * FROM BIODBX.MECCUNIQUE2")

the res object is a "Formal class OdbcResult". Can someone help me modify
the code to get a data frame?

Thanks,
Kai


Re: [R-es] Consulta filtro múltiple.

2021-07-01 Thread Eric Concha M.


 Haha, what a great idea, Carlos, thanks for your contribution :)

 Just to clear up a doubt: did you use set.key() with the DT option?

 Regards!!

 Eric.




> 
> From: Carlos Ortega
> Sent: Thursday, July 1, 2021 11:57
> To: Rodríguez Muíños, Miguel Ángel
> Cc: juan manuel dias; Lista R
> Subject: Re: [R-es] Consulta filtro múltiple.
>
>  [ ... ]
>
> And the winner on execution time is...
>
> Unit: microseconds
>       expr       min         lq       mean    median         uq       max neval
>  datatable    99.376   198.8520   338.4575   232.609   300.4635  7845.888  1000
>      dplyr  1417.242  1939.4520  2598.0892  2285.436  2884.1000 21591.185  1000
>       base    62.119    99.7405   158.8749   119.255   156.1890 10826.685  1000
>      sqldf 13058.622 16870.2300 21358.6144 19247.554 24269.0985 64807.865  1000
>
>  [ ... ]
>
> Regards,
> Carlos Ortega
> www.qualityexcellence.es

___
R-help-es mailing list
R-help-es@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-help-es


Re: [R] concatenating columns in data.frame

2021-07-01 Thread Eric Berger
Lovely one-liner Bert. Chapeau



Re: [R] concatenating columns in data.frame

2021-07-01 Thread Berry, Charles



> On Jul 1, 2021, at 11:24 AM, Bert Gunter  wrote:
> 
> Why not simply:
> 
> ## reprex
> set.seed(123)
> df = data.frame("A"=sample(letters, 10), "B"=sample(letters, 10),
>                 "C"=sample(letters,10), "D"=sample(letters, 10))
> df
> use_columns = c("D", "B")
> 
> ## one liner
> df$combo_col <- do.call(paste,c(df[,use_columns], sep = "_"))
> df
> 
> In case you are wondering, this works because by definition *a data
> frame **is** a list*, so the concatenation is list concatenation.
> 

Why not? 

Because I erroneously thought that there is a "data.frame" method for `c` and 
that this would cause a problem.

But I was wrong, so your solution wins.

Best,
Chuck


Re: [R-es] Consulta filtro múltiple.

2021-07-01 Thread juan manuel dias
Many thanks to everyone! Great work, Carlos, comparing the execution times of
the different options: base R, dplyr, and sqldf. Regards! Juan.

On Thu, Jul 1, 2021 at 9:37 AM, Isidro Hidalgo Arellano <
ihida...@jccm.es> wrote:

> Carlos,
> Would you mind posting the code?
> Many thanks...
>
> Isidro Hidalgo Arellano
> Observatorio del Mercado de Trabajo
> Consejería de Economía, Empresas y Empleo
> http://www.castillalamancha.es/
>
> -Original Message-
> From: R-help-es  On behalf of
> miguel.angel.rodriguez.mui...@sergas.es
> Sent: Thursday, July 1, 2021 12:17
> To: c...@qualityexcellence.es
> CC: r-help-es@r-project.org
> Subject: Re: [R-es] Consulta filtro múltiple.
>
> Good work, Carlos.
>
> Indeed, sqldf is very inefficient (because it "scans" the data several
> times, besides having to translate the instructions).
>
> I only recommend it for people who come from the SQL world (and the great
> SELECT) and/or for lazy people who prefer to write little code.
>
> (I meet both conditions)
>
> :-)
>
> Regards,
>
> Miguel.
>


Re: [R] concatenating columns in data.frame

2021-07-01 Thread Bert Gunter
Why not simply:

## reprex
set.seed(123)
df = data.frame("A"=sample(letters, 10), "B"=sample(letters, 10),
"C"=sample(letters,10), "D"=sample(letters, 10))
df
use_columns = c("D", "B")

## one liner
df$combo_col <- do.call(paste,c(df[,use_columns], sep = "_"))
df

In case you are wondering, this works because by definition *a data
frame **is** a list*, so the concatenation is list concatenation.
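
The point about list concatenation can be checked on a tiny example. This is
only an illustrative sketch; the data frame `toy` and its values are made up.

```r
# Small fixed data frame (made up for illustration)
toy <- data.frame(B = c("x", "y"), D = c("p", "q"), stringsAsFactors = FALSE)

# c() applied to a data frame plus a named extra element gives a plain list:
# one element per selected column, followed by sep = "_"
args <- c(toy[, c("D", "B")], sep = "_")
str(args)

# do.call() passes each list element as one argument, so this is the same
# as calling paste(toy$D, toy$B, sep = "_")
combo <- do.call(paste, args)
combo
```

Because paste() is vectorised over its arguments, the whole column is built
in one call, with no loop over rows.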


Bert Gunter

"The trouble with having an open mind is that people keep coming along
and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )



On Thu, Jul 1, 2021 at 7:37 AM Micha Silver  wrote:
>
> I need to create a new data.frame column as a concatenation of existing
> character columns. But the number and name of the columns to concatenate
> needs to be passed in dynamically. The code below does what I want, but
> seems very clumsy. Any suggestions how to improve?
>
>
> df = data.frame("A"=sample(letters, 10), "B"=sample(letters, 10),
> "C"=sample(letters,10), "D"=sample(letters, 10))
>
> # Which columns to concat:
>
> use_columns = c("D", "B")
>
>
> UpdateCombo = function(df, use_columns) {
>  use_df = df[, use_columns]
>  combo_list = lapply(1:nrow(use_df), function(r) {
>  r_combo = paste(use_df[r,], collapse="_")
>  return(data.frame("Combo" = r_combo))
>  })
>  combo = do.call(rbind, combo_list)
>
>  names(combo) = "Combo"
>
>  return(combo)
>
> }
>
>
> combo_col = UpdateCombo(df, use_columns)
>
> df_combo = do.call(cbind, list(df, combo_col))
>
>
> Thanks
>
>
> --
> Micha Silver
> Ben Gurion Univ.
> Sde Boker, Remote Sensing Lab
> cell: +972-523-665918
>


Re: [R] concatenating columns in data.frame

2021-07-01 Thread Berry, Charles



> On Jul 1, 2021, at 7:36 AM, Micha Silver  wrote:
> 
> I need to create a new data.frame column as a concatenation of existing 
> character columns. But the number and name of the columns to concatenate 
> needs to be passed in dynamically. The code below does what I want, but seems 
> very clumsy. Any suggestions how to improve?
> 
> 
> df = data.frame("A"=sample(letters, 10), "B"=sample(letters, 10), 
> "C"=sample(letters,10), "D"=sample(letters, 10))
> 
> # Which columns to concat:
> 
> use_columns = c("D", "B")
> 
> 
> UpdateCombo = function(df, use_columns) {
> use_df = df[, use_columns]
> combo_list = lapply(1:nrow(use_df), function(r) {
> r_combo = paste(use_df[r,], collapse="_")
> return(data.frame("Combo" = r_combo))
> })
> combo = do.call(rbind, combo_list)
> 
> names(combo) = "Combo"
> 
> return(combo)
> 
> }
> 
> 
> combo_col = UpdateCombo(df, use_columns)
> 
> df_combo = do.call(cbind, list(df, combo_col))
> 
> 


I'd do this:


UpdateCombo <-
  function(df, use_columns){
    pasteUnder <- function(...) paste(..., sep="_")
    do.call(pasteUnder, df[, use_columns])
  }

df_combo <- cbind(df, Combo=UpdateCombo(df, use_columns))
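
A variation on the same idea, as a minimal sketch rather than a definitive version: the helper name concat_columns and its sep argument are made up for illustration. It selects with df[use_columns] so that a single column still arrives as a list, and it drops the column names so they cannot collide with paste()'s own sep and collapse arguments.

```r
# Hypothetical helper; the name and the 'sep' argument are assumptions.
concat_columns <- function(df, use_columns, sep = "_") {
  # df[use_columns] stays a data.frame (a list of columns) even when
  # only one column is requested, unlike df[, use_columns].
  cols <- unname(as.list(df[use_columns]))
  # Pass every column as a separate argument to the vectorised paste().
  do.call(paste, c(cols, sep = sep))
}

# Small fixed sample instead of sample(letters, 10), for reproducibility.
df <- data.frame(A = c("a", "b"), B = c("c", "d"),
                 C = c("e", "f"), D = c("g", "h"))
use_columns <- c("D", "B")

df_combo <- cbind(df, Combo = concat_columns(df, use_columns))
print(df_combo)
```

Because paste() is vectorised over rows, no per-row loop is needed.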

HTH,
Chuck



Re: [R-es] Multiple-filter query.

2021-07-01 Thread juan manuel dias
Hi! Yes, I had thought of the sqldf library! Thanks!

On Thu, Jul 1, 2021 at 4:25 a.m., <
miguel.angel.rodriguez.mui...@sergas.es> wrote:

> Hi Juan Manuel.
>
>
> Another approach (old school)
>
>
> base <- read.csv2("base_monodrogas.csv")
>
> library(sqldf)
>
> seleccion <- sqldf("select * from base where (monodroga='aciclovir' AND
> UNIDADES=20) OR (monodroga='paracetamol' AND UNIDADES=10) ")
>
>
> More info on the sqldf package here -> https://rquer.netlify.app/sql/
>
>
>
> Best regards,
>
> Miguel.
>
>
>
>
>
>
> --
> *From:* R-help-es  on behalf of juan
> manuel dias 
> *Sent:* Thursday, July 1, 2021 0:15
> *To:* Lista R
> *Subject:* [R-es] Multiple-filter query.
>
> Hi, how is everyone!
>
> I have a database of drugs (monodrogas, i.e. single active ingredients) with
> three variables: units, price and unit price. I need to end up with a data
> frame that keeps only the drugs meeting some condition on the units
> variable, with the condition differing from drug to drug.
>
> This is an excerpt of the database:
>
> Monodroga UNIDADES Precio PrecioUnit
> aciclovir 20 111272 55.636
> aciclovir 20 97464 48.732
> aciclovir 40 98322 432
> aciclovir 40 98322 324
> paracetamol 1 19291 192.91
> paracetamol 1 24702 247.02
> paracetamol 1 21120 211.2
> paracetamol 10 9993 9.993
> paracetamol 10 10443 10.443
> rosuvastatina 14 141134 100.81
> rosuvastatina 28 258262 92.2364286
> rosuvastatina 28 201590 71.9964286
> rosuvastatina 30 183717 61.239
> rosuvastatina 30 231935 77.3116667
>
> For example, for the drug "aciclovir" I need only the rows where
> Unidades==20, for paracetamol ==10 and for rosuvastatina ==30.
>
> I am working with tidyverse and have tried a few things that did not
> work.
>
> prom_max_min_base_precios_May_2021_final<-base_precios_May_2021_final %>%
>   ##unite("concat1",CodDrog,CodForma,sep="",remove = FALSE) %>%
>   ##unite("concat2",CodDrog,CodForma,Potencia,sep="",remove = FALSE) %>%
>   filter(!is.na(CodDrog)) %>%
>   ##filter(monodroga=="aciclovir", Unidades %in% c(20)) %>%
>   group_by(concat1,concat2,monodroga) %>%
>   summarize(min_may_2021=min(precio_unitario),
> max_may_2021=max(precio_unitario),
> prom_may_2021=mean(precio_unitario)) %>%
>   ungroup()
>
> I attach the database as a csv.
>
> Many thanks!
>

___
R-help-es mailing list
R-help-es@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-help-es


Re: [R-es] Multiple-filter query.

2021-07-01 Thread juan manuel dias
Many thanks! This proposal is interesting! I'm going to try it!

On Thu, Jul 1, 2021 at 2:53 a.m., Víctor Granda García <
victorgrandagar...@gmail.com> wrote:

> Another option is to combine case_when and filter with dplyr. With case_when
> you create a dummy variable and then filter on it:
>
> data %>%
>   mutate(
>     dummy = case_when(
>       Monodroga == "aciclovir" & UNIDADES == 20 ~ TRUE,
>       Monodroga == "paracetamol" & UNIDADES == 10 ~ TRUE,
>       Monodroga == "rosuvastatina" & UNIDADES == 30 ~ TRUE,
>       TRUE ~ FALSE
>     )
>   ) %>%
>   # filter() needs a logical vector, so pass the dummy column itself
>   filter(dummy)
>
>
> *Víctor Granda García*
> Data Scientist
> Ecosystem Modelling Facility - CREAF
>
>
> Tel. +34 93 581 33 53
> CREAF. Campus UAB. Edifici C. 08193 Bellaterra (Barcelona)
>
>
>
>
> On Thu, 1 Jul 2021 at 06:53, juan manuel dias  wrote:
>
>> Many thanks! It looks like a good option; tomorrow I will try it with a few
>> drugs to check that it works, and if so I will scale it up to the whole
>> database. Many thanks! Juan.
>>
>> On Wed, Jun 30, 2021 at 7:35 p.m., Eric Concha M. <
>> ericconchamu...@gmail.com> wrote:
>>
>> >
>> >  What about doing it with the data.table library? Assuming bd is your
>> >  database:
>> >
>> >  bd1 <- bd[monodroga=="aciclovir" & UNIDADES==20,]
>> >  bd2 <- bd[monodroga=="paracetamol" & UNIDADES==10,]
>> >  bd3 <- bd[monodroga=="rosuvastatina" & UNIDADES==30,]
>> >
>> > and then you bind them together:
>> >
>> >  bd.nueva <- rbind(bd1,bd2,bd3)
>> >
>> > Something like that could work ... there are many other ways to do it, but
>> > I like data.table for large databases because it is very fast, especially
>> > if you use it with setkey() ... see the R help for the details of
>> > data.table.
>> >
>> > Watch the details, such as whether the monodroga column is character or
>> > factor, that UNIDADES is numeric, and so on ...
>> >
>> > Good luck!!
>> >
>> > Eric.
>> >
>> >
>> >
>> >


Re: [R] concatenating columns in data.frame

2021-07-01 Thread Eric Berger
You can do the same steps but without so much intermediate saving to
shorten it

f <- function(x) {
  do.call(rbind,lapply(1:nrow(x),
   function(r) {paste(x[r,], collapse="_")}))
}

df_combo <- cbind(df,Combo=f(df[,c(4,2)]))
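
The same row-wise paste can also be driven by column names instead of positions. This is only a sketch under assumptions: the sample data is fixed here for reproducibility, and all selected columns are assumed to be character, since apply() first converts the data frame to a matrix and numeric columns may then be formatted (and padded) as text.

```r
# Small fixed sample; column names follow the original post.
df <- data.frame(A = c("a", "b"), B = c("c", "d"),
                 C = c("e", "f"), D = c("g", "h"))
use_columns <- c("D", "B")

# apply() walks the rows of the character matrix built from the
# selected columns and pastes each row into a single string.
combo <- apply(df[use_columns], 1, paste, collapse = "_")

# unname() drops any row names that apply() may carry over.
df_combo <- cbind(df, Combo = unname(combo))
print(df_combo)
```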

HTH,
Eric


On Thu, Jul 1, 2021 at 5:37 PM Micha Silver  wrote:

> I need to create a new data.frame column as a concatenation of existing
> character columns. But the number and name of the columns to concatenate
> needs to be passed in dynamically. The code below does what I want, but
> seems very clumsy. Any suggestions how to improve?
>
>
> df = data.frame("A"=sample(letters, 10), "B"=sample(letters, 10),
> "C"=sample(letters,10), "D"=sample(letters, 10))
>
> # Which columns to concat:
>
> use_columns = c("D", "B")
>
>
> UpdateCombo = function(df, use_columns) {
>  use_df = df[, use_columns]
>  combo_list = lapply(1:nrow(use_df), function(r) {
>  r_combo = paste(use_df[r,], collapse="_")
>  return(data.frame("Combo" = r_combo))
>  })
>  combo = do.call(rbind, combo_list)
>
>  names(combo) = "Combo"
>
>  return(combo)
>
> }
>
>
> combo_col = UpdateCombo(df, use_columns)
>
> df_combo = do.call(cbind, list(df, combo_col))
>
>
> Thanks
>
>
> --
> Micha Silver
> Ben Gurion Univ.
> Sde Boker, Remote Sensing Lab
> cell: +972-523-665918
>


[R] concatenating columns in data.frame

2021-07-01 Thread Micha Silver
I need to create a new data.frame column as a concatenation of existing 
character columns. But the number and name of the columns to concatenate 
needs to be passed in dynamically. The code below does what I want, but 
seems very clumsy. Any suggestions how to improve?



df = data.frame("A"=sample(letters, 10), "B"=sample(letters, 10), 
"C"=sample(letters,10), "D"=sample(letters, 10))


# Which columns to concat:

use_columns = c("D", "B")


UpdateCombo = function(df, use_columns) {
    use_df = df[, use_columns]
    combo_list = lapply(1:nrow(use_df), function(r) {
        r_combo = paste(use_df[r,], collapse="_")
        return(data.frame("Combo" = r_combo))
    })
    combo = do.call(rbind, combo_list)

    names(combo) = "Combo"

    return(combo)

}


combo_col = UpdateCombo(df, use_columns)

df_combo = do.call(cbind, list(df, combo_col))


Thanks


--
Micha Silver
Ben Gurion Univ.
Sde Boker, Remote Sensing Lab
cell: +972-523-665918



Re: [R-es] Multiple-filter query.

2021-07-01 Thread Isidro Hidalgo Arellano
Carlos,
Would you mind posting the code?
Many thanks...

Isidro Hidalgo Arellano
Observatorio del Mercado de Trabajo
Consejería de Economía, Empresas y Empleo
http://www.castillalamancha.es/

-Original message-
From: R-help-es  On behalf of 
miguel.angel.rodriguez.mui...@sergas.es
Sent: Thursday, July 1, 2021 12:17
To: c...@qualityexcellence.es
CC: r-help-es@r-project.org
Subject: Re: [R-es] Multiple-filter query.

Good work, Carlos.


Indeed, sqldf is very inefficient (because it "scans" the data several 
times, and on top of that it has to translate the instructions).


I only recommend it for people who come from the SQL world (and the mighty 
SELECT) and/or for lazy people who prefer to write little code.

(I meet both conditions)


:-)


Best regards,

Miguel.






From: Carlos Ortega 
Sent: Thursday, July 1, 2021 11:57
To: Rodríguez Muíños, Miguel Ángel
Cc: juan manuel dias; Lista R
Subject: Re: [R-es] Multiple-filter query.

 [ ... ]

And the winner on execution time is...

Unit: microseconds
      expr       min         lq       mean    median         uq       max neval
 datatable    99.376   198.8520   338.4575   232.609   300.4635  7845.888  1000
     dplyr  1417.242  1939.4520  2598.0892  2285.436  2884.1000 21591.185  1000
      base    62.119    99.7405   158.8749   119.255   156.1890 10826.685  1000
     sqldf 13058.622 16870.2300 21358.6144 19247.554 24269.0985 64807.865  1000

 [ ... ]



Regards,
Carlos Ortega
http://www.qualityexcellence.es







Re: [R] Unexpected date format coercion

2021-07-01 Thread Enrico Schumann
On Thu, 01 Jul 2021, Jeremie Juste writes:

> Hello 
>
> On Thursday,  1 Jul 2021 at 08:25, PIKAL Petr wrote:
>> Hm.
>>
>> Seems to me, that both your codes are wrong but printing in Linux is
>> different from Windows.
>>
>> With
>> as.Date("20-12-2020","%Y-%m-%d")
>> you say that 20 is year (actually year 20) and 2020 is day and only first
>> two values are taken (but with some valueas result is NA)
>>
>> I can confirm 4.0.3 in Windows behaves this way too.
>>> as.Date("20-12-2020","%Y-%m-%d")
>> [1] "0020-12-20"
>
> Many thanks for confirming this.
>
>
> On Thursday,  1 Jul 2021 at 18:22, Jim Lemon wrote:
>> Hi Jeremie,
>> Try:
>>
>> as.Date("20-12-2020","%y-%m-%d")
>> [1] "2020-12-20"
>
> Thanks for this info. I'm looking for something that produce NA if the
> date is not exactly in the specified format so that it can be
> corrected. I was relying on the format parameter of the date for that.
>
> The issue is that there can be so many variations in date format that for the 
> time
> being I still find it easier to delegate the correction to the user. A
> particular nasty case is when there are multiple date format in the same
> column.
>
>
> Best regards,
> Jeremie
>

You could explicitly test whether the specified format
is as expected, perhaps with a regex such as

s <- c("2020-01-20", "20-12-2020")
grepl("^[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]$", s)

and/or by checking the components of the dates:

valid_Date <- function(s) {
    tmp <- strsplit(s, "[-]")

    year <- as.numeric(sapply(tmp, `[[`, 1))
    valid.year <- year < 2500 & year > 1800

    ## months run from 1 to 12
    month <- as.numeric(sapply(tmp, `[[`, 2))
    valid.month <- month >= 1 & month <= 12

    day <- as.numeric(sapply(tmp, `[[`, 3))
    valid.day <- day >= 1 & day <= 31

    ans <- as.Date(s)
    ans[!(valid.year & valid.month & valid.day)] <- NA
    ans
}
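
The regex test and the conversion could also be folded into one helper. What follows is a rough sketch, not a tested utility: the name strict_date and its arguments are hypothetical, and it assumes the only acceptable layout is a four-digit year, two-digit month and two-digit day separated by hyphens.

```r
# Hypothetical helper: returns NA unless the string has exactly the
# YYYY-MM-DD shape AND names a real calendar date.
strict_date <- function(s, format = "%Y-%m-%d",
                        pattern = "^[0-9]{4}-[0-9]{2}-[0-9]{2}$") {
  # Gate on the shape first, so strings like "20-12-2020" never reach
  # as.Date() with a format they only partially match.
  ok <- !is.na(s) & grepl(pattern, s)
  out <- rep(as.Date(NA), length(s))
  # as.Date() itself returns NA for impossible dates such as Feb 30.
  out[ok] <- as.Date(s[ok], format = format)
  out
}

s <- c("2020-01-20", "20-12-2020", "2020-02-30")
print(strict_date(s))
```

Only the first element of s should survive; the other two come back as NA, one for its shape and one for its impossible day.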



-- 
Enrico Schumann
Lucerne, Switzerland
http://enricoschumann.net



Re: [R] Unexpected date format coercion

2021-07-01 Thread PIKAL Petr
Hi

Maybe you could inspect

parse_date_time {lubridate} R Documentation

from package lubridate.

Or see answers here

https://stackoverflow.com/questions/25463523/how-to-convert-variable-with-mixed-date-formats-to-one-format

Cheers
Petr

> -Original Message-
> From: Jeremie Juste 
> Sent: Thursday, July 1, 2021 11:00 AM
> To: PIKAL Petr 
> Cc: r-help 
> Subject: Re: [R] Unexpected date format coercion
>
> Hello
>
> On Thursday,  1 Jul 2021 at 08:25, PIKAL Petr wrote:
> > Hm.
> >
> > Seems to me, that both your codes are wrong but printing in Linux is
> > different from Windows.
> >
> > With
> > as.Date("20-12-2020","%Y-%m-%d")
> > you say that 20 is year (actually year 20) and 2020 is day and only
> > first two values are taken (but with some valueas result is NA)
> >
> > I can confirm 4.0.3 in Windows behaves this way too.
> >> as.Date("20-12-2020","%Y-%m-%d")
> > [1] "0020-12-20"
>
> Many thanks for confirming this.
>
>
> On Thursday,  1 Jul 2021 at 18:22, Jim Lemon wrote:
> > Hi Jeremie,
> > Try:
> >
> > as.Date("20-12-2020","%y-%m-%d")
> > [1] "2020-12-20"
>
> Thanks for this info. I'm looking for something that produce NA if the date 
> is
> not exactly in the specified format so that it can be corrected. I was 
> relying on
> the format parameter of the date for that.
>
> The issue is that there can be so many variations in date format that for 
> the
> time being I still find it easier to delegate the correction to the user. A
> particular nasty case is when there are multiple date format in the same
> column.
>
>
> Best regards,
> Jeremie



Re: [R-es] Multiple-filter query.

2021-07-01 Thread miguel.angel.rodriguez.muinos
Good work, Carlos.


Indeed, sqldf is very inefficient (because it "scans" the data several 
times, and on top of that it has to translate the instructions).


I only recommend it for people who come from the SQL world (and the mighty 
SELECT) and/or for lazy people who prefer to write little code.

(I meet both conditions)


:-)


Best regards,

Miguel.






From: Carlos Ortega 
Sent: Thursday, July 1, 2021 11:57
To: Rodríguez Muíños, Miguel Ángel
Cc: juan manuel dias; Lista R
Subject: Re: [R-es] Multiple-filter query.

 [ ... ]

And the winner on execution time is...

Unit: microseconds
      expr       min         lq       mean    median         uq       max neval
 datatable    99.376   198.8520   338.4575   232.609   300.4635  7845.888  1000
     dplyr  1417.242  1939.4520  2598.0892  2285.436  2884.1000 21591.185  1000
      base    62.119    99.7405   158.8749   119.255   156.1890 10826.685  1000
     sqldf 13058.622 16870.2300 21358.6144 19247.554 24269.0985 64807.865  1000

 [ ... ]



Regards,
Carlos Ortega
www.qualityexcellence.es







Re: [R] Bibliometix Package Question

2021-07-01 Thread Jim Lemon
Hi Adrianna,
I can see why you may be confused about the input file to create the
dataframe for bibliometrix. Looking at the 'convert2df' function, it
seems to shuffle the various formats to a common order used by other
functions. The help page describes that order. I would look at the
sample file 'scientometrics' as it seems to be a plain text file. If
you set up your CSV file in that format, it should be read by the
convert2df function with the default 'plaintext' format argument.

Jim

On Thu, Jul 1, 2021 at 6:06 PM Perryman, Adrianna
 wrote:
>
> Hello,
>
> My name is Adrianna, I am a student researcher at the University Health 
> Network in Toronto. I am working on a bibliometric analysis and would like to 
> use R. My search involves five databases: MEDLINE (Ovid), AJOL, EMBASE, 
> African Medicus Index and Web of Science. We expect to gather thousands of 
> results based on our search and we have to undergo a cleaning processes to 
> remove duplicates and exclude any articles that do not fit our inclusion 
> criteria.  Given the codes for running the bibliometric analysis I am 
> inquiring whether it is possible to upload a csv or excel file after cleaning 
> the data to then run the analysis?
>
> If you have any other tips, please let me know the help is greatly 
> appreciated.
>
> I look forward to hearing back from you.
>
> Thank you again!
>
> Best,
>
> Adrianna Perryman
>


Re: [R] Unexpected date format coercion

2021-07-01 Thread Jeremie Juste
Hello 

On Thursday,  1 Jul 2021 at 08:25, PIKAL Petr wrote:
> Hm.
>
> Seems to me, that both your codes are wrong but printing in Linux is
> different from Windows.
>
> With
> as.Date("20-12-2020","%Y-%m-%d")
> you say that 20 is year (actually year 20) and 2020 is day and only first
> two values are taken (but with some valueas result is NA)
>
> I can confirm 4.0.3 in Windows behaves this way too.
>> as.Date("20-12-2020","%Y-%m-%d")
> [1] "0020-12-20"

Many thanks for confirming this.


On Thursday,  1 Jul 2021 at 18:22, Jim Lemon wrote:
> Hi Jeremie,
> Try:
>
> as.Date("20-12-2020","%y-%m-%d")
> [1] "2020-12-20"

Thanks for this info. I'm looking for something that produces NA if the
date is not exactly in the specified format so that it can be
corrected. I was relying on the format parameter of the date for that.

The issue is that there can be so many variations in date format that for the
time being I still find it easier to delegate the correction to the user. A
particularly nasty case is when there are multiple date formats in the same
column.


Best regards,
Jeremie



Re: [R] Unexpected date format coercion

2021-07-01 Thread PIKAL Petr
Hm.

Seems to me that both of your codes are wrong, but printing in Linux is
different from Windows.

With
as.Date("20-12-2020","%Y-%m-%d")
you say that 20 is the year (actually year 20) and 2020 is the day, and only
the first two digits of it are taken (but with some values the result is NA).

I can confirm 4.0.3 in Windows behaves this way too.
> as.Date("20-12-2020","%Y-%m-%d")
[1] "0020-12-20"
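
One way to catch this kind of partial match is a round-trip check: parse the string, print the result back with the same format, and compare it with the input. This is only a sketch; the function name parse_strict is made up here, and it assumes the format is one that prints back exactly as it was read (which holds for zero-padded fields such as %Y-%m-%d with four-digit years).

```r
# Hypothetical helper: keep a parsed date only if formatting it back
# with the same format reproduces the input string exactly.
parse_strict <- function(s, format = "%Y-%m-%d") {
  d <- as.Date(s, format = format)
  # Leftover characters or short fields make the round trip differ.
  same <- !is.na(d) & format(d, format) == s
  d[!same] <- NA
  d
}

# The second element is parsed as year 20 by as.Date(), but it does
# not survive the round trip, so it becomes NA.
print(parse_strict(c("2020-12-20", "20-12-2020")))
```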

Cheers
Petr


> -Original Message-
> From: R-help  On Behalf Of Jeremie Juste
> Sent: Thursday, July 1, 2021 10:06 AM
> To: r-help 
> Subject: [R] Unexpected date format coercion
> 
> Hello,
> 
> I have been surprised when converting a character string to a date with
the
> following format,
> 
> in R 4.1.0 (linux debian 10)
> 
> as.Date("20-12-2020","%Y-%m-%d")
> [1] "20-12-20"
> 
> in R 4.0.5 (window 10)
> 
> as.Date("20-12-2020","%Y-%m-%d")
> [1] "0020-12-20"
> 
> 
> Here I was expecting a blunt and sharp NA, am I missing something?
> 
> Best regards,
> Jeremie
> 


Re: [R] Unexpected date format coercion

2021-07-01 Thread Jim Lemon
Hi Jeremie,
Try:

as.Date("20-12-2020","%y-%m-%d")
[1] "2020-12-20"

Jim

On Thu, Jul 1, 2021 at 6:16 PM Jeremie Juste  wrote:
>
> Hello,
>
> I have been surprised when converting a character string to a date with the 
> following
> format,
>
> in R 4.1.0 (linux debian 10)
>
> as.Date("20-12-2020","%Y-%m-%d")
> [1] "20-12-20"
>
> in R 4.0.5 (window 10)
>
> as.Date("20-12-2020","%Y-%m-%d")
> [1] "0020-12-20"
>
>
> Here I was expecting a blunt and sharp NA, am I missing something?
>
> Best regards,
> Jeremie
>


Re: [R] Unexpected date format coercion

2021-07-01 Thread Uwe Ligges




On 01.07.2021 10:06, Jeremie Juste wrote:

Hello,

I have been surprised when converting a character string to a date with the 
following
format,

in R 4.1.0 (linux debian 10)

as.Date("20-12-2020","%Y-%m-%d")
[1] "20-12-20"

in R 4.0.5 (window 10)

as.Date("20-12-2020","%Y-%m-%d")
[1] "0020-12-20"


Yes, it is rather strange to specify "2020" as the day and "20" as the 
4-digit year, so different implementations may print the year in 2 or 4 
digits. What you want is actually


as.Date("20-12-2020","%d-%m-%Y")


Best,
Uwe Ligges









Here I was expecting a blunt and sharp NA, am I missing something?

Best regards,
Jeremie



[R] Unexpected date format coercion

2021-07-01 Thread Jeremie Juste
Hello,

I have been surprised when converting a character string to a date with the 
following
format,

in R 4.1.0 (linux debian 10)

as.Date("20-12-2020","%Y-%m-%d")
[1] "20-12-20"

in R 4.0.5 (Windows 10)

as.Date("20-12-2020","%Y-%m-%d")
[1] "0020-12-20"


Here I was expecting a blunt and sharp NA, am I missing something?

Best regards,
Jeremie



[R] Bibliometix Package Question

2021-07-01 Thread Perryman, Adrianna
Hello,

My name is Adrianna, I am a student researcher at the University Health Network 
in Toronto. I am working on a bibliometric analysis and would like to use R. My 
search involves five databases: MEDLINE (Ovid), AJOL, EMBASE, African Medicus 
Index and Web of Science. We expect to gather thousands of results based on our 
search and we have to undergo a cleaning processes to remove duplicates and 
exclude any articles that do not fit our inclusion criteria.  Given the codes 
for running the bibliometric analysis I am inquiring whether it is possible to 
upload a csv or excel file after cleaning the data to then run the analysis?

If you have any other tips, please let me know the help is greatly appreciated.

I look forward to hearing back from you.

Thank you again!

Best,

Adrianna Perryman



Re: [R-es] Multiple filter query.

2021-07-01 Thread miguel.angel.rodriguez.muinos
Hi Juan Manuel.


Another approach (old school):


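# read.csv2() expects ';' as the field separator and ',' as the decimal mark
base <- read.csv2("base_monodrogas.csv")

library(sqldf)

# keep only the rows that match one of the (drug, units) pairs
seleccion <- sqldf("select * from base where (monodroga='aciclovir' AND 
UNIDADES=20) OR (monodroga='paracetamol' AND UNIDADES=10) ")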


More info on the sqldf package here -> https://rquer.netlify.app/sql/



Regards,

Miguel.
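
The same filter can also be written without SQL. The sketch below is a base R alternative, not the method from the reply above: it puts the wanted (drug, units) pairs in a small lookup table and inner-joins it to the data with merge(). The data frame here is a hand-typed subset of the excerpt in the original question, and the name `wanted` is invented for the example.

```r
# A few rows copied from the excerpt in the question (subset, for illustration).
base <- data.frame(
  Monodroga = c("aciclovir", "aciclovir", "paracetamol", "paracetamol",
                "rosuvastatina", "rosuvastatina"),
  UNIDADES  = c(20, 40, 1, 10, 28, 30),
  Precio    = c(111272, 98322, 19291, 10443, 258262, 183717),
  stringsAsFactors = FALSE
)

# Lookup table: one row per (drug, wanted units) pair.
wanted <- data.frame(
  Monodroga = c("aciclovir", "paracetamol", "rosuvastatina"),
  UNIDADES  = c(20, 10, 30),
  stringsAsFactors = FALSE
)

# merge() performs an inner join by default, so only rows of `base` whose
# (Monodroga, UNIDADES) pair appears in `wanted` are kept.
seleccion <- merge(base, wanted, by = c("Monodroga", "UNIDADES"))

print(seleccion)
```

Adding another drug then only requires another row in `wanted`, rather than another OR clause. In a tidyverse pipeline, dplyr's semi_join() with the same two key columns should give an equivalent result.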







From: R-help-es  on behalf of juan manuel dias 

Sent: Thursday, July 1, 2021 0:15
To: Lista R
Subject: [R-es] Multiple filter query.

Hi, how is everyone doing!

I have a database of medications (monodrogas, i.e. single active ingredients) 
with three variables: units, price and unit price. I need to end up with a data 
frame that contains only the monodrogas that meet a given condition on the units 
variable, with a different condition for each of several monodrogas.

This is an excerpt of the data:

Monodroga      UNIDADES  Precio  PrecioUnit
aciclovir      20        111272  55.636
aciclovir      20        97464   48.732
aciclovir      40        98322   432
aciclovir      40        98322   324
paracetamol    1         19291   192.91
paracetamol    1         24702   247.02
paracetamol    1         21120   211.2
paracetamol    10        9993    9.993
paracetamol    10        10443   10.443
rosuvastatina  14        141134  100.81
rosuvastatina  28        258262  92.2364286
rosuvastatina  28        201590  71.9964286
rosuvastatina  30        183717  61.239
rosuvastatina  30        231935  77.3116667

For example, for the monodroga "aciclovir" I need only the rows where 
Unidades==20, for paracetamol the rows where Unidades==10, and for 
rosuvastatina the rows where Unidades==30.

I am working with tidyverse and have tried a few things that did not work.

prom_max_min_base_precios_May_2021_final <- base_precios_May_2021_final %>%
  ##unite("concat1",CodDrog,CodForma,sep="",remove = FALSE) %>%
  ##unite("concat2",CodDrog,CodForma,Potencia,sep="",remove = FALSE) %>%
  filter(!is.na(CodDrog)) %>%
  ##filter(monodroga=="aciclovir", Unidades %in% c(20)) %>%
  group_by(concat1,concat2,monodroga) %>%
  summarize(min_may_2021=min(precio_unitario),
            max_may_2021=max(precio_unitario),
            prom_may_2021=mean(precio_unitario)) %>%
  ungroup()

I am attaching the data as a csv.

Many thanks!




___
R-help-es mailing list
R-help-es@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-help-es