Re: [R] Split

2020-09-22 Thread Bert Gunter
That was still slower and doesn't quite give what was requested:

> cbind(F1,utils::strcapture("([^_]*)_(.*)", F1$text,
proto=data.frame(Before_=character(), After_=character())))
  ID1 ID2  text Before_ After_
1  A1  B1  NONE    <NA>   <NA>
2  A1  B1 cf_12      cf     12
3  A1  B1  NONE    <NA>   <NA>
4  A2  B2 X2_25      X2     25
5  A2  B3 fd_15      fd     15

> system.time({
+ cbind(F2,utils::strcapture("([^_]*)_(.*)", F2$text,
+ proto=data.frame(Before_=character(), After_=character())))
+ }
+ )
   user  system elapsed
 32.712   0.736  33.587

Cheers,
Bert




On Tue, Sep 22, 2020 at 5:45 PM Bill Dunlap 
wrote:

> Another way to make columns out of the stuff before and after the
> underscore, with NAs if there is no underscore, is
>
> utils::strcapture("([^_]*)_(.*)", F1$text,
> proto=data.frame(Before_=character(), After_=character()))
>
> -Bill
>
> On Tue, Sep 22, 2020 at 4:25 PM Bert Gunter 
> wrote:
>
>> To be clear, I think Rui's solution is perfectly fine and probably better
>> than what I offer below. But just for fun, I wanted to do it without the
>> lapply().  Here is one way. I think my comments suffice to explain.
>>
>> > ## which are the  non "_" indices?
>> > wh <- grep("_",F1$text, fixed = TRUE, invert = TRUE)
>> > ## paste "_." to these
>> > F1[wh,"text"] <- paste(F1[wh,"text"],".",sep = "_")
>> > ## Now strsplit() and unlist() them to get a vector
>> > z <- unlist(strsplit(F1$text, "_"))
>> > ## now cbind() to the data frame
>> > F1 <- cbind(F1, matrix(z, ncol = 2, byrow = TRUE))
>> > F1
>>   ID1 ID2   text    1  2
>> 1  A1  B1 NONE_. NONE  .
>> 2  A1  B1  cf_12   cf 12
>> 3  A1  B1 NONE_. NONE  .
>> 4  A2  B2  X2_25   X2 25
>> 5  A2  B3  fd_15   fd 15
>> >## You can change the names of the 2 columns yourself
>>
>> Cheers,
>> Bert
>>
>> Bert Gunter
>>
>> "The trouble with having an open mind is that people keep coming along and
>> sticking things into it."
>> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
>>
>>
>> On Tue, Sep 22, 2020 at 12:19 PM Rui Barradas 
>> wrote:
>>
>> > Hello,
>> >
>> > A base R solution with strsplit, like in your code.
>> >
>> > F1$Y1 <- +grepl("_", F1$text)
>> >
>> > tmp <- strsplit(as.character(F1$text), "_")
>> > tmp <- lapply(tmp, function(x) if(length(x) == 1) c(x, ".") else x)
>> > tmp <- do.call(rbind, tmp)
>> > colnames(tmp) <- c("X1", "X2")
>> > F1 <- cbind(F1[-3], tmp)# remove the original column
>> > rm(tmp)
>> >
>> > F1
>> > #  ID1 ID2 Y1   X1 X2
>> > #1  A1  B1  0 NONE  .
>> > #2  A1  B1  1   cf 12
>> > #3  A1  B1  0 NONE  .
>> > #4  A2  B2  1   X2 25
>> > #5  A2  B3  1   fd 15
>> >
>> >
>> > Note that cbind dispatches on F1, an object of class "data.frame".
>> > Therefore it's the method cbind.data.frame that is called and the result
>> > is also a df, though tmp is a "matrix".
>> >
>> >
>> > Hope this helps,
>> >
>> > Rui Barradas
>> >
>> >
>> > At 20:07 on 22/09/20, Rui Barradas wrote:
>> > > Hello,
>> > >
>> > > Something like this?
>> > >
>> > >
>> > > F1$Y1 <- +grepl("_", F1$text)
>> > > F1 <- F1[c(1, 2, 4, 3)]
>> > > F1 <- tidyr::separate(F1, text, into = c("X1", "X2"), sep = "_", fill =
>> > > "right")
>> > > F1
>> > >
>> > >
>> > > Hope this helps,
>> > >
>> > > Rui Barradas
>> > >
>> > > At 19:55 on 22/09/20, Val wrote:
>> > >> HI All,
>> > >>
>> > >> I am trying to create   new columns based on another column string
>> > >> content. First I want to identify rows that contain a particular
>> > >> string.  If it contains, I want to split the string and create two
>> > >> variables.
>> > >>
>> > >> Here is my sample of data.
>> > >> F1<-read.table(text="ID1  ID2  text
>> > >> A1 B1   NONE
>> > >> A1 B1   cf_12
>> > >> A1 B1   NONE
>> > >> A2 B2   X2_25
>> > >> A2 B3   fd_15  ",header=TRUE,stringsAsFactors=F)
>> > >> If the variable "text" contains this "_" I want to create an indicator
>> > >> variable as shown below
>> > >>
>> > >> F1$Y1 <- ifelse(grepl("_", F1$text),1,0)
>> > >>
>> > >>
>> > >> Then I want to split that string in to two, before "_" and after "_"
>> > >> and create two variables as shown below
>> > >> x1= strsplit(as.character(F1$text),'_',2)
>> > >>
>> > >> My problem is how to combine this with the original data frame. The
>> > >> desired  output is shown   below,
>> > >>
>> > >>
>> > >> ID1 ID2  Y1   X1    X2
>> > >> A1  B1   0    NONE  .
>> > >> A1  B1   1    cf    12
>> > >> A1  B1   0    NONE  .
>> > >> A2  B2   1    X2    25
>> > >> A2  B3   1    fd    15
>> > >>
>> > >> Any help?
>> > >> Thank you.
>> > >>
>> > >> __
>> > >> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> > >> https://stat.ethz.ch/mailman/listinfo/r-help
>> > >> PLEASE do read the posting guide
>> > >> http://www.R-project.org/posting-guide.html
>> > >> and provide commented, minimal, self-contained, reproducible code.
>> > >>

[R-es] ORDERING A PLOT BY MONTH

2020-09-22 Thread Jesus MARTIN F.
  Hello,

  I am making a plot with:

#
## BAR CHART: MONTHLY DEBIT AMOUNTS
ggplot(Diario_S2, aes(x=mes_AAA, by = MES , y=ARS_DEB))+   # MAP VARIABLES
geom_bar(stat="identity", width=0.7,  # BAR WIDTH
 colour="grey", fill="darkgreen", # APPEARANCE (border and fill)
 position = "dodge")+
scale_fill_brewer(palette = "Paired")+  # COLOUR PALETTE
labs(x="MESES",  y="IMPORTES EN ARS",color="Tipo")+  # AXIS TITLES
ggtitle("VALORES AL DEBE POR MES")   # PLOT TITLE
#

  The problem is that the bars are being ordered alphabetically by month.

 The values of X are:

 [1] ENE FEB MAR ABR MAY JUN JUL AGO SEP OCT NOV DIC
Levels: ABR AGO DIC ENE FEB JUL JUN MAR MAY NOV OCT SEP

  The plot comes out ordered alphabetically, following "Levels", and I need
it ordered by month, respecting calendar order rather than alphabetical
order.

  Thanks,

  Jesús
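
One possible way to get calendar order, sketched under assumptions: ggplot2 draws a discrete axis in the order of the factor levels, so resetting the levels of mes_AAA should be enough. The data frame below is a made-up stand-in for Diario_S2 (only the column names are taken from the post), and the ggplot call is left as a comment so the sketch runs without extra packages.

```r
# Calendar order of the Spanish month abbreviations used in the post
meses <- c("ENE", "FEB", "MAR", "ABR", "MAY", "JUN",
           "JUL", "AGO", "SEP", "OCT", "NOV", "DIC")

# Hypothetical stand-in for Diario_S2; ARS_DEB values are invented
Diario_S2 <- data.frame(mes_AAA = factor(meses),
                        ARS_DEB = seq_along(meses))

# factor() sorts levels alphabetically by default, which is what the plot shows
levels_before <- levels(Diario_S2$mes_AAA)

# Re-create the factor with the levels given explicitly in calendar order
Diario_S2$mes_AAA <- factor(Diario_S2$mes_AAA, levels = meses)
levels_after <- levels(Diario_S2$mes_AAA)

print(levels_before)
print(levels_after)

# With ggplot2 loaded, the original plotting code can then be reused unchanged:
# ggplot(Diario_S2, aes(x = mes_AAA, y = ARS_DEB)) + geom_bar(stat = "identity")
```

Only the level order changes; the values in the column stay the same, so nothing else in the plotting code needs to be touched.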










_

*Jesús MARTÍN FRADE *
Skype:jmfpas
Tel (celular):(011) 154-946-2131 (Argentina)
(+54) 911-4946-2131 (Internacional)
Facebook http://www.facebook.com/jesusmartinfrade


___
R-help-es mailing list
R-help-es@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-help-es


Re: [R] Split

2020-09-22 Thread Bill Dunlap
Another way to make columns out of the stuff before and after the
underscore, with NAs if there is no underscore, is

utils::strcapture("([^_]*)_(.*)", F1$text,
proto=data.frame(Before_=character(), After_=character()))

-Bill
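
A minimal sketch of how the strcapture() result could be turned into the layout Val asked for. The column names X1, X2 and Y1 follow Val's desired output; filling the rows without an underscore with the original string and "." is an assumption about what is wanted, not something strcapture() does by itself.

```r
F1 <- read.table(text = "ID1 ID2 text
A1 B1 NONE
A1 B1 cf_12
A1 B1 NONE
A2 B2 X2_25
A2 B3 fd_15", header = TRUE, stringsAsFactors = FALSE)

# Capture the pieces before and after the first underscore;
# rows that do not match the pattern come back as NA in both columns
parts <- utils::strcapture("([^_]*)_(.*)", F1$text,
                           proto = data.frame(X1 = character(),
                                              X2 = character()))

no_us <- is.na(parts$X1)            # rows whose text had no underscore
parts$X1[no_us] <- F1$text[no_us]   # assumption: keep the whole string as X1
parts$X2[no_us] <- "."              # assumption: "." as the placeholder

# Indicator column plus the two new columns, dropping the original text
result <- cbind(F1[c("ID1", "ID2")], Y1 = as.integer(!no_us), parts)
print(result)
```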


Re: [R] Split

2020-09-22 Thread Bert Gunter
Oh, if efficiency is a consideration, then my code is about 15 times as
fast as Rui's:
> F2 <- F1[rep(1:5,1e6),]  ## 5 million rows
##Rui's
> system.time({
+ F2$Y1 <- +grepl("_", F2$text)
+ tmp <- strsplit(as.character(F2$text), "_")
+ tmp <- lapply(tmp, function(x) if(length(x) == 1) c(x, ".") else x)
+ tmp <- do.call(rbind, tmp)
+ colnames(tmp) <- c("X1", "X2")
+ F2 <- cbind(F2[-3], tmp)# remove the original column
+ })
   user  system elapsed
 20.072   0.625  20.786

## my version
> system.time({
+ wh <- grep("_",F2$text, fixed = TRUE, invert = TRUE)
+ F2[wh,"text"] <- paste(F2[wh,"text"],".",sep = "_")
+ z <- unlist(strsplit(F2$text,"_"))  ## must be F2$text; with F1$text only 5 strings are split and the timing below is too low
+ F2 <- cbind(F2, matrix(z, ncol = 2, byrow = TRUE))
+ F2
+ })
   user  system elapsed
  1.256   0.019   1.281

Cheers,
Bert
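
For anyone repeating the comparison, a small sketch that times both approaches on the same replicated text vector and checks that they agree. The helper names lapply_version and paste_version are made up for this illustration, and the 100,000-string size is arbitrary; timings differ by machine, so none are quoted here.

```r
txt <- rep(c("NONE", "cf_12", "NONE", "X2_25", "fd_15"), 2e4)  # 100,000 strings

# Rui's approach: pad the length-1 pieces inside lapply(), then rbind
lapply_version <- function(x) {
  tmp <- strsplit(x, "_")
  tmp <- lapply(tmp, function(p) if (length(p) == 1) c(p, ".") else p)
  do.call(rbind, tmp)
}

# Bert's approach: append "_." first, so every string splits into two pieces
paste_version <- function(x) {
  wh <- grep("_", x, fixed = TRUE, invert = TRUE)
  x[wh] <- paste(x[wh], ".", sep = "_")
  matrix(unlist(strsplit(x, "_")), ncol = 2, byrow = TRUE)
}

t_lapply <- system.time(m1 <- lapply_version(txt))
t_paste  <- system.time(m2 <- paste_version(txt))

# Both versions must work on the same input and give the same matrix
stopifnot(all(dim(m1) == dim(m2)), all(m1 == m2))

print(t_lapply)
print(t_paste)
```

Passing the same vector to both functions avoids accidentally timing one method on the small data frame and the other on the large one.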

Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Tue, Sep 22, 2020 at 5:04 PM Val  wrote:

> Thank you all for the help!
>
> LMH, Yes I would like to see the alternative.  I am using this for a
> large data set and if the  alternative is more efficient than this
> then I would be happy.

Re: [R] Split

2020-09-22 Thread Val
Thank you all for the help!

LMH, Yes I would like to see the alternative.  I am using this for a
large data set and if the  alternative is more efficient than this
then I would be happy.



Re: [R] Split

2020-09-22 Thread Bert Gunter
To be clear, I think Rui's solution is perfectly fine and probably better
than what I offer below. But just for fun, I wanted to do it without the
lapply().  Here is one way. I think my comments suffice to explain.

> ## which are the  non "_" indices?
> wh <- grep("_",F1$text, fixed = TRUE, invert = TRUE)
> ## paste "_." to these
> F1[wh,"text"] <- paste(F1[wh,"text"],".",sep = "_")
> ## Now strsplit() and unlist() them to get a vector
> z <- unlist(strsplit(F1$text, "_"))
> ## now cbind() to the data frame
> F1 <- cbind(F1, matrix(z, ncol = 2, byrow = TRUE))
> F1
  ID1 ID2   text    1  2
1  A1  B1 NONE_. NONE  .
2  A1  B1  cf_12   cf 12
3  A1  B1 NONE_. NONE  .
4  A2  B2  X2_25   X2 25
5  A2  B3  fd_15   fd 15
>## You can change the names of the 2 columns yourself

Cheers,
Bert
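
A sketch of the same idea that leaves F1$text untouched and names the two new columns up front. It assumes each string contains at most one underscore (otherwise the pieces no longer pair up into two columns); X1, X2 and Y1 are taken from Val's desired output.

```r
F1 <- read.table(text = "ID1 ID2 text
A1 B1 NONE
A1 B1 cf_12
A1 B1 NONE
A2 B2 X2_25
A2 B3 fd_15", header = TRUE, stringsAsFactors = FALSE)

txt <- F1$text                                   # work on a copy
wh <- grep("_", txt, fixed = TRUE, invert = TRUE)
txt[wh] <- paste(txt[wh], ".", sep = "_")        # "NONE" becomes "NONE_."

# Every element now splits into exactly two pieces; name the columns here
z <- matrix(unlist(strsplit(txt, "_", fixed = TRUE)),
            ncol = 2, byrow = TRUE,
            dimnames = list(NULL, c("X1", "X2")))

# cbind() on a data frame returns a data frame, even though z is a matrix
out <- cbind(F1[c("ID1", "ID2")],
             Y1 = +grepl("_", F1$text, fixed = TRUE),
             z)
print(out)
```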

Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Tue, Sep 22, 2020 at 12:19 PM Rui Barradas  wrote:

> Hello,
>
> A base R solution with strsplit, like in your code.
>
> F1$Y1 <- +grepl("_", F1$text)
>
> tmp <- strsplit(as.character(F1$text), "_")
> tmp <- lapply(tmp, function(x) if(length(x) == 1) c(x, ".") else x)
> tmp <- do.call(rbind, tmp)
> colnames(tmp) <- c("X1", "X2")
> F1 <- cbind(F1[-3], tmp)# remove the original column
> rm(tmp)
>
> F1
> #  ID1 ID2 Y1   X1 X2
> #1  A1  B1  0 NONE  .
> #2  A1  B1  1   cf 12
> #3  A1  B1  0 NONE  .
> #4  A2  B2  1   X2 25
> #5  A2  B3  1   fd 15
>
>
> Note that cbind dispatches on F1, an object of class "data.frame".
> Therefore it's the method cbind.data.frame that is called and the result
> is also a df, though tmp is a "matrix".
>
>
> Hope this helps,
>
> Rui Barradas
>
>
> At 20:07 on 22/09/20, Rui Barradas wrote:
> > Hello,
> >
> > Something like this?
> >
> >
> > F1$Y1 <- +grepl("_", F1$text)
> > F1 <- F1[c(1, 2, 4, 3)]
> > F1 <- tidyr::separate(F1, text, into = c("X1", "X2"), sep = "_", fill =
> > "right")
> > F1
> >
> >
> > Hope this helps,
> >
> > Rui Barradas
> >
> > At 19:55 on 22/09/20, Val wrote:
> >> HI All,
> >>
> >> I am trying to create   new columns based on another column string
> >> content. First I want to identify rows that contain a particular
> >> string.  If it contains, I want to split the string and create two
> >> variables.
> >>
> >> Here is my sample of data.
> >> F1<-read.table(text="ID1  ID2  text
> >> A1 B1   NONE
> >> A1 B1   cf_12
> >> A1 B1   NONE
> >> A2 B2   X2_25
> >> A2 B3   fd_15  ",header=TRUE,stringsAsFactors=F)
> >> If the variable "text" contains this "_" I want to create an indicator
> >> variable as shown below
> >>
> >> F1$Y1 <- ifelse(grepl("_", F1$text),1,0)
> >>
> >>
> >> Then I want to split that string in to two, before "_" and after "_"
> >> and create two variables as shown below
> >> x1= strsplit(as.character(F1$text),'_',2)
> >>
> >> My problem is how to combine this with the original data frame. The
> >> desired  output is shown   below,
> >>
> >>
> >> ID1 ID2  Y1   X1    X2
> >> A1  B1   0    NONE  .
> >> A1  B1   1    cf    12
> >> A1  B1   0    NONE  .
> >> A2  B2   1    X2    25
> >> A2  B3   1    fd    15
> >>
> >> Any help?
> >> Thank you.
> >>


Re: [R-es] EXTRACTING THE MONTH AS LETTERS, IN SPANISH

2020-09-22 Thread Javier Marcuzzi
Dear Jesus Martín,

I understand what you want, but you may run into problems.

The problems arise if you need the date for a calculation, a plot, etc.
Tinkering with dates is tricky.

Have a look at https://cran.r-project.org/web/packages/date/date.pdf

Javier Rubén Marcuzzi

On Tue, Sep 22, 2020 at 18:58, Jesus MARTIN F. wrote:

>   Good afternoon,
>
>   I need to create a new variable containing the month as three letters,
> for example ENE, FEB, MAR, ABR and so on, from the values I currently
> have in the dataset, which are 1, 2, 3, 4 and so on.
>
>   I understand this would be done with mutate, but I would like to know
> the complete command.
>
>   Thanks,
>
>   Jesús


[R-es] EXTRACTING THE MONTH AS LETTERS, IN SPANISH

2020-09-22 Thread Jesus MARTIN F.
  Good afternoon,

  I need to create a new variable containing the month as three letters,
for example ENE, FEB, MAR, ABR and so on, from the values I currently have
in the dataset, which are 1, 2, 3, 4 and so on.

  I understand this would be done with mutate, but I would like to know the
complete command.

  Thanks,

  Jesús
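
One way this could look, as a sketch: index a vector of abbreviations by the month number. The data frame datos and the column names MES and mes_AAA are assumptions made for the illustration; only the abbreviations come from the thread.

```r
# Spanish three-letter month abbreviations in calendar order
meses <- c("ENE", "FEB", "MAR", "ABR", "MAY", "JUN",
           "JUL", "AGO", "SEP", "OCT", "NOV", "DIC")

# Hypothetical data with month numbers 1..12
datos <- data.frame(MES = c(1, 2, 3, 4, 12))

# Look up each number in the vector; storing the result as a factor with
# explicit levels keeps calendar order for later tables and plots
datos$mes_AAA <- factor(meses[datos$MES], levels = meses)
print(datos)

# The dplyr equivalent, if that package is in use (not run here):
# datos <- dplyr::mutate(datos, mes_AAA = factor(meses[MES], levels = meses))
```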





Re: [R] Help with the Error Message in R "Error in 1:nchid : result would be too long a vector"

2020-09-22 Thread Rui Barradas

Hello,

Please keep this on the list so that others can give their contribution.

If you have reshaped your data can you post the code you ran to reshape 
it? Right now we only have the original attachment, in wide format, not 
the long format data.


Rui Barradas

At 21:55 on 22/09/20, Rahul Chakraborty wrote:

Hi,

Thank you so much for your reply.
Yes, thank you for pointing that out, I apologise for that error in
the variable name. However, my data is in long format.

See, my first column is IND which identifies my individuals,
second column is QES which identifies the question number each
individual faces, 3rd column is a stratification code that can be
ignored. Columns 6-13 are alternative specific variables and rest are
individual specific. So 1st 3 rows indicate 1st question faced by 1st
individual containing 3 alternatives, and so on. So, I have already
arranged the data in long format.

With that in mind, if I use shape="long" it still gives me an error.

Best  regards,

On Tue, Sep 22, 2020 at 11:00 PM Rui Barradas  wrote:


Hello,

I apologize if the rest of quotes prior to David's email are missing,
for some reason today my mail client is not including them.

As for the question, there are two other problems:

1) Alt_name is misspelled, it should be ALT_name;

2) the data is in wide, not long, format.

A third problem is that ?dfidx says

alt.var
the name of the variable that contains the alternative index (for a long
data.frame only) or the name under which the alternative index will be
stored (the default name is alt)


So if shape = "wide", alt.var is not needed.
But I am not a user of package mlogit, I'm just guessing.

The following seems to fix it (it doesn't throw errors).


mldata1 <- dfidx(mydata, shape = "wide",
   #alt.var = "ALT_name",
   choice = "Choice_binary",
   id.var = "IND")


Hope this helps,

Rui Barradas


At 16:15 on 22/09/20, David Winsemius wrote:

You were told two things about your code:


1) mlogit.data is deprecated by the package authors, so use dfidx.

2) dfidx does not allow duplicate ids in the first two columns.


Which one of those are you asserting is not accurate?










Re: [R] Split

2020-09-22 Thread LMH
Sometimes it just makes more sense to pre-process your data and get it into the
format you need. It just depends on whether you are more comfortable programming
in R or in some other text manipulation language like bash/sed/awk/grep etc.

If you know how to do this with other tools, you could write a script and
probably call the script from R. I could post a sample if you are interested.

LMH


Val wrote:
> HI All,
> 
> I am trying to create   new columns based on another column string
> content. First I want to identify rows that contain a particular
> string.  If it contains, I want to split the string and create two
> variables.
> 
> Here is my sample of data.
> F1<-read.table(text="ID1  ID2  text
> A1 B1   NONE
> A1 B1   cf_12
> A1 B1   NONE
> A2 B2   X2_25
> A2 B3   fd_15  ",header=TRUE,stringsAsFactors=F)
> If the variable "text" contains this "_" I want to create an indicator
> variable as shown below
> 
> F1$Y1 <- ifelse(grepl("_", F1$text),1,0)
> 
> 
> Then I want to split that string in to two, before "_" and after "_"
> and create two variables as shown below
> x1= strsplit(as.character(F1$text),'_',2)
> 
> My problem is how to combine this with the original data frame. The
> desired  output is shown   below,
> 
> 
> ID1 ID2  Y1   X1    X2
> A1  B1   0    NONE  .
> A1  B1   1    cf    12
> A1  B1   0    NONE  .
> A2  B2   1    X2    25
> A2  B3   1    fd    15
> 
> Any help?
> Thank you.
> 


Re: [R-es] Encontrar un dato y añadirlo a otra columna

2020-09-22 Thread Juan Carlos Lopez Mesa
Hello,

try this

df %>% mutate(var = parse_number(nombre1))


Regards
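
For the pattern-based case in the question, a base R sketch may be closer to the requested output than parse_number, which (as far as I can tell) would also return 34 for "PEPE 34". The sketch assumes the number should be kept only when the whole value has the form "AV" followed by two digits; the anchored pattern and the names `pat` and `hit` are illustrative assumptions, not part of the original posts.

```r
# Sample data from the question
df <- data.frame(
  nombre1 = c("AV 23", "PEPE 34", "QWE", "AV 24", "WERRR ER34", "AV 25"),
  stringsAsFactors = FALSE
)

# Assumed rule: keep the digits only when the value is exactly "AV <two digits>"
pat <- "^AV ([0-9]{2})$"
hit <- grepl(pat, df$nombre1)

# sub() with a capture group returns the matched digits; all other rows get NA
df$Nombre1_numero <- ifelse(hit, sub(pat, "\\1", df$nombre1), NA_character_)
df
```

If "AV nn" may also appear inside a longer string, the anchors `^` and `$` would need to be relaxed, and the replacement pattern adjusted to match.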

On Tue, Sep 22, 2020 at 15:46, Samura . ()
wrote:

> Hi,
> Let's see if anyone knows how to do the following:
>
> I have a df with letters and numbers, and when a particular number is
> detected I want that number added in another column.
>
> Something like this
>
> df<-data.frame(c("AV 23","PEPE 34","QWE","AV 24","WERRR ER34","AV 25"))
> colnames(df)<-c("nombre1")
>
> df[grepl("AV 23",df$nombre1), "Nombre1_numero"]= "23"
> df[grepl("AV 24",df$nombre1), "Nombre1_numero"]= "24"
> df[grepl("AV 25",df$nombre1), "Nombre1_numero"]= "25"
> df
>
>
> nombre1      Nombre1_numero
> AV 23        23
> PEPE 34      NA
> QWE          NA
> AV 24        24
> WERRR ER34   NA
> AV 25        25
>
> that is, look for AV 23, 24, 25 in the column; if found, put the number in
> another column, and NA for the rest of the data
>
> since there are many of them, to avoid repeating the same thing every time
> I had thought of something like this
>
>
> df[grepl("AV \\d{2}",df$nombre1), "Nombre1_numero"]= "\\d{2}"
>
> but I don't know how to write that last "\\d{2}" so that it fills in the number.
>
> Any ideas?
>
> [[alternative HTML version deleted]]
>
> ___
> R-help-es mailing list
> R-help-es@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-help-es
>




[R-es] Encontrar un dato y añadirlo a otra columna

2020-09-22 Thread Samura .
Hi,
Let's see if anyone knows how to do the following:

I have a df with letters and numbers, and when a particular number is
detected I want that number added in another column.

Something like this

df<-data.frame(c("AV 23","PEPE 34","QWE","AV 24","WERRR ER34","AV 25"))
colnames(df)<-c("nombre1")

df[grepl("AV 23",df$nombre1), "Nombre1_numero"]= "23"
df[grepl("AV 24",df$nombre1), "Nombre1_numero"]= "24"
df[grepl("AV 25",df$nombre1), "Nombre1_numero"]= "25"
df


nombre1      Nombre1_numero
AV 23        23
PEPE 34      NA
QWE          NA
AV 24        24
WERRR ER34   NA
AV 25        25

that is, look for AV 23, 24, 25 in the column; if found, put the number in
another column, and NA for the rest of the data

since there are many of them, to avoid repeating the same thing every time
I had thought of something like this


df[grepl("AV \\d{2}",df$nombre1), "Nombre1_numero"]= "\\d{2}"

but I don't know how to write that last "\\d{2}" so that it fills in the number.

Any ideas?



Re: [R-es] expresiones regulares

2020-09-22 Thread Samura .
Many thanks,

although it was not what I was looking for, because the structure of the data
varies and a "fixed" function cannot be applied to it.
In the end I am using gsub case by case.
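
For the one part of the requested transformation that looks consistent across all the examples in this thread (a space after the leading "digit.digit" token), a single gsub call may be enough. This is only a sketch under that assumption; it deliberately leaves the later, less regular splits (such as "ptdm23j" versus "v6sdp") untouched, and the pattern is my own, not something proposed by the posters.

```r
# Sentence from the original question
x <- "hola 1.3ptd 1.3ptdm 4.4ptdm23j 7.716s 1.4hola pepe 1.4hola.hola 5.5v6 5.5v6sdp 5.5v10sdp"

# At a word boundary, capture "digit.digit" and insert a space after it,
# but only when a non-space character follows (lookahead needs perl = TRUE)
out <- gsub("\\b(\\d\\.\\d)(?=\\S)", "\\1 ", x, perl = TRUE)
out
```

This reproduces the first-step result of the str_sub loop quoted below in one vectorised call, and it also leaves words without a leading "digit.digit" token (here "hola" and "pepe") unchanged.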





From: Eric 
Sent: Sunday, September 20, 2020 20:41
To: Carlos Ortega ; Samura . 

Cc: r-help-es@r-project.org 
Subject: Re: [R-es] expresiones regulares

Apparently you only need to remove the spaces, right?



On 20-09-20 13:32, Carlos Ortega wrote:
> Hello,
>
> Extracting the first three characters of each string can be done like this:
>
>> library(stringr)
>>
>> mis_str <-
> c('1.3ptd','1.3ptdm','4.4ptdm23j','7.716s','1.4hola','1.4hola.hola','5.5v6','5.5v6sdp','5.5v10sdp')
>> res_out <- vector()
>> for(i in 1:length(mis_str)) {
> +   wrd_tmp <- mis_str[i]
> +   pri_parte <- str_sub(wrd_tmp, 1, 3)
> +   sec_parte <- str_sub(wrd_tmp, 4, nchar(wrd_tmp))
> +   res_tmp <- c(pri_parte,sec_parte)
> +   res_out <- c(res_out, res_tmp)
> + }
>> paste0(res_out, collapse = " ")
> [1] "1.3 ptd 1.3 ptdm 4.4 ptdm23j 7.7 16s 1.4 hola 1.4 hola.hola 5.5 v6 5.5
> v6sdp 5.5 v10sdp"
> But that is the only clear pattern I see at first glance. There is some
> further pattern in what gets stored in "sec_parte", but you can handle it
> by following this idea.
>
> Regards,
> Carlos Ortega
> www.qualityexcellence.es
>
>
> On Sun, Sep 20, 2020 at 17:43, Samura . ()
> wrote:
>
>> Hello everyone,
>>
>> Does anyone know how to convert these strings with regular expressions?
>>
>> 1.3ptd  -> 1.3 ptd
>> 1.3ptdm -> 1.3 ptdm
>> 4.4ptdm23j -> 4.4 ptdm 23j
>> 7.716s -> 7.7 16s
>> 1.4hola -> 1.4 hola
>> 1.4hola.hola -> 1.4 hola.hola
>> 5.5v6  -> 5.5 v6
>> 5.5v6sdp  -> 5.5 v6 sdp
>> 5.5v10sdp  -> 5.5 v10 sdp
>>
>> so that this sentence
>>
>> "hola 1.3ptd 1.3ptdm 4.4ptdm23j 7.716s 1.4hola pepe 1.4hola.hola 5.5v6
>> 5.5v6sdp 5.5v10sdp"
>>
>>
>> would end up like this
>>
>> "hola 1.3 ptd 1.3 ptdm 4.4 ptdm 23j 7.7 16s 1.4 hola pepe 1.4 hola.hola
>> 5.5 v6 5.5 v6 sdp 5.5 v10 sdp"
>>
>> I am trying with gsub and can't get it right.
>>
>> Maybe there is a simpler way to change it without using regular
>> expressions.
>>
>>
>>
>>  [[alternative HTML version deleted]]
>>
>> ___
>> R-help-es mailing list
>> R-help-es@r-project.org
>> https://stat.ethz.ch/mailman/listinfo/r-help-es
>>
>



Re: [R] Split

2020-09-22 Thread Rui Barradas

Hello,

A base R solution with strsplit, like in your code.

F1$Y1 <- +grepl("_", F1$text)

tmp <- strsplit(as.character(F1$text), "_")
tmp <- lapply(tmp, function(x) if(length(x) == 1) c(x, ".") else x)
tmp <- do.call(rbind, tmp)
colnames(tmp) <- c("X1", "X2")
F1 <- cbind(F1[-3], tmp)# remove the original column
rm(tmp)

F1
#  ID1 ID2 Y1   X1 X2
#1  A1  B1  0 NONE  .
#2  A1  B1  1   cf 12
#3  A1  B1  0 NONE  .
#4  A2  B2  1   X2 25
#5  A2  B3  1   fd 15


Note that cbind dispatches on F1, an object of class "data.frame".
Therefore it's the method cbind.data.frame that is called and the result 
is also a df, though tmp is a "matrix".
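
The dispatch behaviour described above can be checked with a small self-contained sketch. The toy objects `df` and `m` are assumptions made up for illustration; only cbind itself comes from the message.

```r
# A one-column data frame and a character matrix with named columns
df <- data.frame(ID = c("A1", "A2"), stringsAsFactors = FALSE)
m  <- matrix(c("cf", "X2", "12", "25"), ncol = 2,
             dimnames = list(NULL, c("X1", "X2")))

# A data frame among the arguments: the data frame method is used
res <- cbind(df, m)
class(res)

# Only matrices: the default method is used and the result stays a matrix
mm <- cbind(m, m)
class(mm)
```

Each matrix column becomes its own data frame column, which is why X1 and X2 appear as separate variables in the printed F1.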



Hope this helps,

Rui Barradas


Às 20:07 de 22/09/20, Rui Barradas escreveu:

Hello,

Something like this?


F1$Y1 <- +grepl("_", F1$text)
F1 <- F1[c(1, 2, 4, 3)]
F1 <- tidyr::separate(F1, text, into = c("X1", "X2"), sep = "_", fill = 
"right")

F1


Hope this helps,

Rui Barradas

Às 19:55 de 22/09/20, Val escreveu:

HI All,

I am trying to create   new columns based on another column string
content. First I want to identify rows that contain a particular
string.  If it contains, I want to split the string and create two
variables.

Here is my sample of data.
F1<-read.table(text="ID1  ID2  text
A1 B1   NONE
A1 B1   cf_12
A1 B1   NONE
A2 B2   X2_25
A2 B3   fd_15  ",header=TRUE,stringsAsFactors=F)
If the variable "text" contains this "_" I want to create an indicator
variable as shown below

F1$Y1 <- ifelse(grepl("_", F1$text),1,0)


Then I want to split that string in to two, before "_" and after "_"
and create two variables as shown below
x1= strsplit(as.character(F1$text),'_',2)

My problem is how to combine this with the original data frame. The
desired  output is shown   below,


ID1 ID2  Y1   X1    X2
A1  B1    0   NONE   .
A1  B1   1    cf    12
A1  B1   0  NONE   .
A2  B2   1    X2    25
A2  B3   1    fd    15

Any help?
Thank you.



Re: [R] Split

2020-09-22 Thread Rui Barradas

Hello,

Something like this?


F1$Y1 <- +grepl("_", F1$text)
F1 <- F1[c(1, 2, 4, 3)]
F1 <- tidyr::separate(F1, text, into = c("X1", "X2"), sep = "_", fill = 
"right")

F1


Hope this helps,

Rui Barradas

Às 19:55 de 22/09/20, Val escreveu:

HI All,

I am trying to create   new columns based on another column string
content. First I want to identify rows that contain a particular
string.  If it contains, I want to split the string and create two
variables.

Here is my sample of data.
F1<-read.table(text="ID1  ID2  text
A1 B1   NONE
A1 B1   cf_12
A1 B1   NONE
A2 B2   X2_25
A2 B3   fd_15  ",header=TRUE,stringsAsFactors=F)
If the variable "text" contains this "_" I want to create an indicator
variable as shown below

F1$Y1 <- ifelse(grepl("_", F1$text),1,0)


Then I want to split that string in to two, before "_" and after "_"
and create two variables as shown below
x1= strsplit(as.character(F1$text),'_',2)

My problem is how to combine this with the original data frame. The
desired  output is shown   below,


ID1 ID2  Y1  X1    X2
A1  B1   0   NONE  .
A1  B1   1   cf    12
A1  B1   0   NONE  .
A2  B2   1   X2    25
A2  B3   1   fd    15

Any help?
Thank you.



[R] Split

2020-09-22 Thread Val
HI All,

I am trying to create   new columns based on another column string
content. First I want to identify rows that contain a particular
string.  If it contains, I want to split the string and create two
variables.

Here is my sample of data.
F1<-read.table(text="ID1  ID2  text
A1 B1   NONE
A1 B1   cf_12
A1 B1   NONE
A2 B2   X2_25
A2 B3   fd_15  ",header=TRUE,stringsAsFactors=F)
If the variable "text" contains this "_" I want to create an indicator
variable as shown below

F1$Y1 <- ifelse(grepl("_", F1$text),1,0)


Then I want to split that string in to two, before "_" and after "_"
and create two variables as shown below
x1= strsplit(as.character(F1$text),'_',2)

My problem is how to combine this with the original data frame. The
desired  output is shown   below,


ID1 ID2  Y1  X1    X2
A1  B1   0   NONE  .
A1  B1   1   cf    12
A1  B1   0   NONE  .
A2  B2   1   X2    25
A2  B3   1   fd    15

Any help?
Thank you.
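
One possible way to get the desired output without combining a strsplit() list with the data frame is to build both new columns directly with sub(). This is only a sketch in base R: the helper name `has_us` is an assumption, and it assumes at most one "_" matters per value (everything after the first "_" goes to X2). Note also that the third positional argument of strsplit() is `fixed`, not a limit on the number of pieces.

```r
F1 <- read.table(text = "ID1 ID2 text
A1 B1 NONE
A1 B1 cf_12
A1 B1 NONE
A2 B2 X2_25
A2 B3 fd_15", header = TRUE, stringsAsFactors = FALSE)

# Indicator: does the value contain an underscore?
has_us <- grepl("_", F1$text, fixed = TRUE)
F1$Y1  <- as.integer(has_us)

# X1: everything before the first "_" (the whole value if there is none)
F1$X1 <- sub("_.*$", "", F1$text)

# X2: everything after the first "_", or "." when there is no "_"
F1$X2 <- ifelse(has_us, sub("^[^_]*_", "", F1$text), ".")

# Drop the original column so the result has ID1, ID2, Y1, X1, X2
F1$text <- NULL
F1
```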



Re: [R] Help with the Error Message in R "Error in 1:nchid : result would be too long a vector"

2020-09-22 Thread Rahul Chakraborty
David,

My apologies with the first one. I was checking different tutorials on
mlogit where they were using mlogit.data, so I ended up using it.

I am not getting what you are saying by the "duplicates in first two
columns". See, my first column is IND which identifies my individuals,
second column is QES which identifies the question number each
individual faces, 3rd column is a stratification code that can be
ignored. Columns 6-13 are alternative specific variables and rest are
individual specific. So 1st 3 rows indicate 1st question faced by 1st
individual containing 3 alternatives, and so on. So, I have already
arranged the data in long format. Here, I could not get what the
"duplicate in first two columns" mean.


And I am really sorry that there was an error in my code as Rui has
pointed out. The correct code is
mldata1 <- dfidx(mydata, shape = "long",
  alt.var = "ALT_name",
  choice = "Choice_binary",
  id.var = "IND")

It still shows the error-  "the two indexes don't define unique observations"
It would be really helpful if you kindly help.

Regards,


On Tue, Sep 22, 2020 at 8:46 PM David Winsemius  wrote:
>
> You were told two things about your code:
>
>
> 1) mlogit.data is deprecated by the package authors, so use dfidx.
>
> 2) dfidx does not allow duplicate ids in the first two columns.
>
>
> Which one of those are you asserting is not accurate?
>
>
> --
>
> David.
>
> On 9/21/20 10:20 PM, Rahul Chakraborty wrote:
> > Hello David and everyone,
> >
> > I am really sorry for not abiding by the specific guidelines in my
> > prior communications. I tried to convert the present email in plain
> > text format (at least it is showing me so in my gmail client). I have
> > also converted the xlsx file into a csv format with .txt extension.
> >
> > So, my problem is I need to run panel mixed logit regression for a
> > choice model. There are 3 alternatives, 9 questions for each
> > individual and 516 individuals in data. I have created a csv file in
> > long format from the survey questionnaire. Apart from the alternative
> > specific variables I have many individual specific variables and most
> > of these are dummies (dummy coded). I will use subsets of these in my
> > alternative model specifications. So, in my data I have 100 columns
> > with 13932 rows (3*9*516). After reading the csv file and creating a
> > dataframe 'mydata' I used the following command for mlogit.
> >
> > mldata1<- mlogit.data(mydata, shape = "long", alt.var = "Alt_name",
> > choice = "Choice_binary", id.var = "IND")
> >
> > It gives me the same error message- Error in 1:nchid : result would be
> > too long a vector.
> >
> > The attached file (csv file with .txt extension) is an example of 2
> > individuals each with 3 questions. I have also reduced the number of
> > columns to 57. Now, there are 18 rows. But still if I use the same
> > command on my new data I get the same error message. Can anyone please
> > help me out with this? Because of this error I am stuck at the
> > dataframe level.
> >
> >
> > Thanks in advance.
> >
> >
> > Regards,
> > Rahul Chakraborty
> >
> > On Tue, Sep 22, 2020 at 4:50 AM David Winsemius  
> > wrote:
> >> @Rahul;
> >>
> >>
> >> You need to learn to post in plain text and attachments may not be xls
> >> or xlsx. They need to be text files. And even if they are comma
> >> separated files and text, they still need to be named with a txt extension.
> >>
> >>
> >> I'm the only one who got the xlsx file. I got the error regardless of
> >> how many columns I omitted, so my guess was possibly incorrect. But I did
> >> RTFM. See ?mlogit.datadfi The mlogit.data function is deprecated and you
> >> are told to use the dfidx function. Trying that you now get an error
> >> saying: " the two indexes don't define unique observations".
> >>
> >>
> >>   > sum(duplicated( dfrm[,1:2]))
> >> [1] 12
> >>   > length(dfrm[,1])
> >> [1] 18
> >>
> >> So of your 18 lines in the example file, most of them appear to be
> >> duplicated in their first two columns and apparently that is not allowed by
> >> dfidx.
> >>
> >>
> >> Caveat: I'm not a user of the mlogit package so I'm just reading the
> >> manual and possibly coming up with informed speculation.
> >>
> >> Please read the Posting Guide. You have been warned. Repeated violations
> >> of the policies laid down in that hallowed document will possibly result
> >> in postings being ignored.
> >>



-- 
Rahul Chakraborty
Research Fellow
National Institute of Public Finance and Policy
New Delhi- 110067



Re: [R] Help with the Error Message in R "Error in 1:nchid : result would be too long a vector"

2020-09-22 Thread Rui Barradas

Hello,

I apologize if the rest of quotes prior to David's email are missing, 
for some reason today my mail client is not including them.


As for the question, there are two other problems:

1) Alt_name is misspelled, it should be ALT_name;

2) the data is in wide, not long, format.

A third problem is that ?dfidx says

alt.var 
the name of the variable that contains the alternative index (for a long 
data.frame only) or the name under which the alternative index will be 
stored (the default name is alt)



So if shape = "wide", alt.var is not needed.
But I am not a user of package mlogit, I'm just guessing.

The following seems to fix it (it doesn't throw errors).


mldata1 <- dfidx(mydata, shape = "wide",
 #alt.var = "ALT_name",
 choice = "Choice_binary",
 id.var = "IND")


Hope this helps,

Rui Barradas


Às 16:15 de 22/09/20, David Winsemius escreveu:

You were told two things about your code:


1) mlogit.data is deprecated by the package authors, so use dfidx.

2) dfidx does not allow duplicate ids in the first two columns.


Which one of those are you asserting is not accurate?






Re: [R] text on curve

2020-09-22 Thread Berry, Charles



> On Sep 22, 2020, at 1:10 AM, Jinsong Zhao  wrote:
> 
> Hi there,
> 
> I write a simple function that could place text along a curve. Since I am not 
> familiar with the operation of rotating graphical elements, e.g., text, 
> rectangle, etc., I hope you could give suggestions or hints on how to improve 
> it. Thanks in advance.
> 
> # Here is the code:
> 


[code deleted]

For this kind of operation you might want to use tikz. 

R has the ability to produce tikz directives and to insert raw tikz into a 
'tikzDevice'.

If you search rseek.org for 'tikz' you will get plenty of good hits. 

The tikz/pgf manual has examples of flowing text, IIRC.

HTH,

Chuck

p.s. this is a plain text list. Do not submit html.


Re: [R] Help with the Error Message in R "Error in 1:nchid : result would be too long a vector"

2020-09-22 Thread David Winsemius

You were told two things about your code:


1) mlogit.data is deprecated by the package authors, so use dfidx.

2) dfidx does not allow duplicate ids in the first two columns.


Which one of those are you asserting is not accurate?


--

David.

On 9/21/20 10:20 PM, Rahul Chakraborty wrote:

Hello David and everyone,

I am really sorry for not abiding by the specific guidelines in my
prior communications. I tried to convert the present email in plain
text format (at least it is showing me so in my gmail client). I have
also converted the xlsx file into a csv format with .txt extension.

So, my problem is I need to run panel mixed logit regression for a
choice model. There are 3 alternatives, 9 questions for each
individual and 516 individuals in data. I have created a csv file in
long format from the survey questionnaire. Apart from the alternative
specific variables I have many individual specific variables and most
of these are dummies (dummy coded). I will use subsets of these in my
alternative model specifications. So, in my data I have 100 columns
with 13932 rows (3*9*516). After reading the csv file and creating a
dataframe 'mydata' I used the following command for mlogit.

mldata1<- mlogit.data(mydata, shape = "long", alt.var = "Alt_name",
choice = "Choice_binary", id.var = "IND")

It gives me the same error message- Error in 1:nchid : result would be
too long a vector.

The attached file (csv file with .txt extension) is an example of 2
individuals each with 3 questions. I have also reduced the number of
columns to 57. Now, there are 18 rows. But still if I use the same
command on my new data I get the same error message. Can anyone please
help me out with this? Because of this error I am stuck at the
dataframe level.


Thanks in advance.


Regards,
Rahul Chakraborty

On Tue, Sep 22, 2020 at 4:50 AM David Winsemius  wrote:

@Rahul;


You need to learn to post in plain text and attachments may not be xls
or xlsx. They need to be text files. And even if they are comma
separated files and text, they still need to be named with a txt extension.


I'm the only one who got the xlsx file. I got the error regardless of
how many columns I omitted, so my guess was possibly incorrect. But I did
RTFM. See ?mlogit.datadfi The mlogit.data function is deprecated and you
are told to use the dfidx function. Trying that you now get an error
saying: " the two indexes don't define unique observations".


  > sum(duplicated( dfrm[,1:2]))
[1] 12
  > length(dfrm[,1])
[1] 18

So of your 18 lines in the example file, most of them appear to be
duplicated in their first two columns and apparently that is not allowed by
dfidx.
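
The duplicate count can be reproduced on a toy long-format layout, which may help show where it comes from. The sketch below assumes the structure described elsewhere in this thread (2 individuals, 3 questions, 3 alternatives, columns IND, QES and ALT_name); the data and the `chid` column are made up for illustration, and no mlogit or dfidx call is used.

```r
# Toy long-format design: one row per individual x question x alternative
d <- expand.grid(ALT_name = c("A", "B", "C"), QES = 1:3, IND = 1:2)
d <- d[, c("IND", "QES", "ALT_name")]

# Every (IND, QES) pair appears once per alternative, so duplicates are expected
n_dup <- sum(duplicated(d[, c("IND", "QES")]))
n_dup
nrow(d)

# Adding the alternative makes each row unique
anyDuplicated(d[, c("IND", "QES", "ALT_name")])

# A choice-situation id (hypothetical name) that is unique per individual and question
d$chid <- paste(d$IND, d$QES, sep = "_")
length(unique(d$chid))
```

On this layout, 12 of 18 rows are duplicated in (IND, QES), the same numbers as in the output above. That suggests the individual id alone cannot serve as the choice-situation index and that something like `chid` together with the alternative is needed to identify a row, although I have not verified how dfidx expects this to be passed.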


Caveat: I'm not a user of the mlogit package so I'm just reading the
manual and possibly coming up with informed speculation.

Please read the Posting Guide. You have been warned. Repeated violations
of the policies laid down in that hallowed document will possibly result
in postings being ignored.





Re: [R] aggregate semi-hourly data not 00-24 but 9-9

2020-09-22 Thread Stefano Sofia
Yes, thank you so much.

Stefano

 (oo)
--oOO--( )--OOo
Stefano Sofia PhD
Civil Protection - Marche Region
Meteo Section
Snow Section
Via del Colle Ameno 5
60126 Torrette di Ancona, Ancona
Uff: 071 806 7743
E-mail: stefano.so...@regione.marche.it
---Oo-oO


From: Eric Berger [ericjber...@gmail.com]
Sent: Tuesday, September 22, 2020 11:00
To: Jeff Newmiller
Cc: Stefano Sofia; r-help mailing list
Subject: Re: [R] aggregate semi-hourly data not 00-24 but 9-9

Thanks Jeff.
Stefano, per Jeff's comment, you can replace the line

df1$data_POSIXminus9 <- df1$data_POSIX - lubridate::hours(9)

by

df1$data_POSIXminus9 <- df1$data_POSIX - as.difftime(9,units="hours")

On Mon, Sep 21, 2020 at 8:06 PM Jeff Newmiller  wrote:
>
> The base R as.difftime function is perfectly usable to create this offset 
> without pulling in lubridate.
>
> On September 21, 2020 8:06:51 AM PDT, Eric Berger  
> wrote:
> >Hi Stefano,
> >If you mean from 9am on one day to 9am on the following day, you can
> >do a trick. Simply subtract 9hrs from each timestamp and then you want
> >midnight to midnight for these adjusted times, which you can get using
> >the method you followed.
> >
> >I googled and found that lubridate::hours() can be used to add or
> >subtract hours from a POSIXct.
> >
> >library(lubridate)
> >
> >day_1 <- as.POSIXct("2020-02-19-00-00", format="%Y-%m-%d-%H-%M",
> >tz="Etc/GMT-1")
> >day_2 <- as.POSIXct("2020-02-24-12-00", format="%Y-%m-%d-%H-%M",
> >tz="Etc/GMT-1")
> >df1 <- data.frame(data_POSIX=seq(day_1, day_2, by="30 min"))
> >df1$hs <- rnorm(nrow(df1), 40, 10)
> >df1$diff[2:nrow(df1)] <- diff(df1$hs)
> >
> >df1$data_POSIXminus9 <- df1$data_POSIX - lubridate::hours(9)
> >df1$dayX <- format(df1$data_POSIXminus9,"%y-%m-%d")
> >df2X <- aggregate(diff ~ dayX, df1, sum)
> >df2X
> >
> >HTH,
> >Eric
> >
> >On Mon, Sep 21, 2020 at 5:30 PM Stefano Sofia
> > wrote:
> >>
> >> Dear R-list members,
> >> I have semi-hourly snowfall data.
> >> I should sum the semi-hourly increments (only the positive ones, but
> >this is not described in my example) day by day, not from 00 to 24 but
> >from 9 to 9.
> >>
> >> I am able to use the diff function, create a list of days and use the
> >function aggregate, but it works only from 0 to 24. Any suggestion for
> >an efficient way to do it?
> >> Here my code:
> >> day_1 <- as.POSIXct("2020-02-19-00-00", format="%Y-%m-%d-%H-%M",
> >tz="Etc/GMT-1")
> >> day_2 <- as.POSIXct("2020-02-24-12-00", format="%Y-%m-%d-%H-%M",
> >tz="Etc/GMT-1")
> >> df1 <- data.frame(data_POSIX=seq(day_1, day_2, by="30 min"))
> >> df1$hs <- rnorm(nrow(df1), 40, 10)
> >> df1$diff[2:nrow(df1)] <- diff(df1$hs)
> >> df1$day <- format(df1$data_POSIX,"%y-%m-%d")
> >> df2 <- aggregate(diff ~ day, df1, sum)
> >>
> >> Thank you for your help
> >> Stefano
> >>
> >>  (oo)
> >> --oOO--( )--OOo
> >> Stefano Sofia PhD
> >> Civil Protection - Marche Region
> >> Meteo Section
> >> Snow Section
> >> Via del Colle Ameno 5
> >> 60126 Torrette di Ancona, Ancona
> >> Uff: 071 806 7743
> >> E-mail: stefano.so...@regione.marche.it
> >> ---Oo-oO
> >>
> >> 
> >>

Re: [R] Quadratic programming

2020-09-22 Thread Maija Sirkjärvi
I really appreciate you helping me with this! I just don't seem to figure
it out.

(1) I don't know why you think bvec should be a matrix. The
documentation clearly says it should be a vector (implying not a
matrix).

- I've written it in a form of a matrix with one row and 2*J-3
columns. (0,2*J-3,1). I thought that it would pass as a vector. I made it a
vector now.

The thing is that I'm trying to replicate a C++ code with R. The C++ code
imposes shape restrictions on the function and works perfectly. The C++
code is attached and after that the same for R. You said you haven't used
the QP package for a decade. Is there a better/another package for these
types of problems?

Thanks again!
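
For reference, here is one way the two families of constraints from the quoted problem statement could be laid out for solve.QP. It is a sketch only: the toy values of J, Q and m_hat and the helper `is_feasible` are assumptions, and it has not been checked against the C++ program. As far as I read the quadprog documentation, solve.QP minimizes 1/2 b'Db - d'b subject to t(Amat) %*% b >= bvec, so least squares against m_hat uses Dmat = I and dvec = m_hat, and each constraint is one column of Amat.

```r
# Toy inputs (assumptions for illustration only)
J      <- 5
Q      <- c(5, 4, 3, 2, 1)               # strictly decreasing
m_hat  <- c(1.2, 3.9, 9.3, 15.8, 25.1)   # unconstrained estimates
Delta1 <- 0.01
Delta2 <- 0.0001

# 1/2 m'm - m_hat'm has the same minimizer as sum((m - m_hat)^2)
Dmat <- diag(J)
dvec <- m_hat

Amat <- matrix(0, nrow = J, ncol = 2 * J - 3)
bvec <- numeric(2 * J - 3)               # a plain vector, not a matrix

# Monotonicity: m[j] - m[j-1] >= Delta1 for j = 2..J (columns 1..J-1)
for (j in 2:J) {
  k <- j - 1
  Amat[j, k]     <-  1
  Amat[j - 1, k] <- -1
  bvec[k]        <- Delta1
}

# Shape constraint for j = 3..J (columns J..2J-3):
# a*(m[j-1] - m[j]) - b*(m[j-2] - m[j-1]) >= Delta2
for (j in 3:J) {
  k <- (J - 1) + (j - 2)
  a <- 1 / (Q[j] - Q[j - 1])
  b <- 1 / (Q[j - 1] - Q[j - 2])
  Amat[j, k]     <- -a
  Amat[j - 1, k] <-  a + b
  Amat[j - 2, k] <- -b
  bvec[k]        <- Delta2
}

# Hypothetical helper: does a candidate m satisfy every constraint?
is_feasible <- function(m) all(crossprod(Amat, m) >= bvec)

# Solve only if quadprog is installed
if (requireNamespace("quadprog", quietly = TRUE)) {
  fit   <- quadprog::solve.QP(Dmat, dvec, Amat, bvec)
  m_fit <- fit$solution
}
```

Compared with the R code quoted further down, the differences in this sketch are that the loops run over j up to J rather than over nrow(Amat) or ncol(bvec), the coefficient on m[j-1] in the second family collects both terms (a + b), and dvec is m_hat rather than its negative.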

C++ code:

int main()
{
Print("Begin");

/* Bootstrap Parameters */
long RandomSeed = -98345; // Set random seed
const int S = 100; // Number of bootstrap replications

/* Housing Demand Parameters */
const double Beta =  1.1613;
const double v=  0.7837;
const double Eta  = -0.5140;

/* Quadratic Programming Tolerance Parameters */
const double Delta1 =  0.01;
const double Delta2 =  0.0001;

/* Read in Dataset */
DataFileDelim d("K:\\Data\\Local Jurisdictions\\AEJ Data.dat",'\t');
d.SortBy("AfterTaxPrice");
int J = d.NumObs();
Vector Price = d.Get("AfterTaxPrice");
Vector Educ  = d.Get("EducAll");
Vector Crime = d.Get("CrimeIndex");
Vector Dist  = d.Get("RushTrav");

/* Adjust Crime Data */
for(int j=0; j Rank1(J);
for(int j=0; j X(J);
Matrix Y(J,2);
Vector Z(J);
for(int j=0; j RhoGrid1 = Grid(-1.2,-0.1,RhoK1);
Matrix gTrans(J,RhoK1);
for(int rk=0; rk hSmooth(J);
for(int j=0; j Q(J);
for(int j=0; j H(J,Zero);
Vector c(J,Zero);
Matrix Aeq(0,J);
Vector beq(0);
Matrix Aneq(2*J-3,J,Zero);
Vector bneq(2*J-3);
Vector lb(J,-Inf);
Vector ub(J,Inf);
for(int j=0; j I was wondering if you're trying to fit a curve, subject to
> monotonicity/convexity constraints...
> If you are, this is a challenging topic, best of luck...
>
>
> On Tue, Sep 22, 2020 at 8:12 AM Abby Spurdle  wrote:
> >
> > Hi,
> >
> > Sorry, for my rushed responses, last night.
> > (Shouldn't post when I'm about to log out).
> >
> > I haven't used the quadprog package for nearly a decade.
> > And I was hoping that an expert using optimization in finance in
> > economics would reply.
> >
> > Some comments:
> > (1) I don't know why you think bvec should be a matrix. The
> > documentation clearly says it should be a vector (implying not a
> > matrix).
> > The only arguments that should be matrices are Dmat and Amat.
> > (2) I'm having some difficulty following your quadratic program, even
> > after rendering it.
> > Perhaps you could rewrite your expressions, in a form that is
> > consistent with the input to solve.QP. That's a math problem, not an R
> > programming problem, as such.
> > (3) If that fails, then you'll need to produce a minimal reproducible
> example.
> > I strongly recommend that the R code matches the quadratic program, as
> > closely as possible.
> >
> >
> > On Mon, Sep 21, 2020 at 9:28 PM Maija Sirkjärvi
> >  wrote:
> > >
> > > Hi!
> > >
> > > I was wondering if someone could help me out. I'm minimizing a
> following
> > > function:
> > >
> > > \begin{equation}
> > > $$\sum_{j=1}^{J}(m_{j} -\hat{m_{j}})^2,$$
> > > \text{subject to}
> > > $$m_{j-1}\leq m_{j}-\delta_{1}$$
> > > $$\frac{1}{Q_{j-1}-Q_{j-2}} (m_{j-2}-m_{j-1}) \leq
> \frac{1}{Q_{j}-Q_{j-1}}
> > > (m_{j-1}-m_{j})-\delta_{2} $$
> > > \end{equation}
> > >
> > > I have tried quadratic programming, but something is off. Does anyone
> have
> > > an idea how to approach this?
> > >
> > > Thanks in advance!
> > >
> > > Q <- rep(0,J)
> > > for(j in 1:(length(Price))){
> > >   Q[j] <- exp((-0.1) * (Beta *Price[j]^(Eta + 1) - 1) / (1 + Eta))
> > > }
> > >
> > > Dmat <- matrix(0,nrow= J, ncol=J)
> > > diag(Dmat) <- 1
> > > dvec <- -hs
> > > Aeq <- 0
> > > beq <- 0
> > > Amat <- matrix(0,J,2*J-3)
> > > bvec <- matrix(0,2*J-3,1)
> > >
> > > for(j in 2:nrow(Amat)){
> > >   Amat[j-1,j-1] = -1
> > >   Amat[j,j-1] = 1
> > > }
> > > for(j in 3:nrow(Amat)){
> > >   Amat[j,J+j-3] = -1/(Q[j]-Q[j-1])
> > >   Amat[j-1,J+j-3] = 1/(Q[j]-Q[j-1])
> > >   Amat[j-2,J+j-3] = -1/(Q[j-1]-Q[j-2])
> > > }
> > > for(j in 2:ncol(bvec)) {
> > >   bvec[j-1] = Delta1
> > > }
> > > for(j in 3:ncol(bvec)) {
> > >   bvec[J-1+j-2] = Delta2
> > > }
> > > solution <- solve.QP(Dmat,dvec,Amat,bvec=bvec)
> > >
> > > [[alternative HTML version deleted]]
> > >
> > > __
> > > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > > https://stat.ethz.ch/mailman/listinfo/r-help
> > > > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > > and provide commented, minimal, self-contained, reproducible code.
>
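
For reference, here is a minimal sketch of how the program quoted above could be put into the form solve.QP expects. It is not the original poster's code: the grid Q, the estimates m_hat and the values of delta1 and delta2 are made-up illustration data, and the names m_hat, m_fit, a0, a1 and k are hypothetical. solve.QP minimizes (1/2) b'Db - d'b subject to t(Amat) %*% b >= bvec, so each inequality is rearranged to ">=" form and stored as one column of Amat, and bvec is a plain vector.

```r
library(quadprog)

# Hypothetical inputs (not from the thread): an increasing grid Q and
# noisy estimates m_hat of an increasing, concave curve.
set.seed(42)
J      <- 8
Q      <- seq(1, 8, length.out = J)
m_hat  <- log(Q) + rnorm(J, sd = 0.1)
delta1 <- 0.01
delta2 <- 0.001

# Objective: sum((m - m_hat)^2) = m'm - 2 * m_hat'm + const.
# solve.QP minimizes (1/2) b'Db - d'b, hence D = 2 * I and d = 2 * m_hat.
Dmat <- 2 * diag(J)
dvec <- 2 * m_hat

# One column of Amat per constraint, 2J - 3 constraints in total.
Amat <- matrix(0, nrow = J, ncol = 2 * J - 3)
bvec <- c(rep(delta1, J - 1), rep(delta2, J - 2))   # a vector, not a matrix

# First constraint, j = 2..J:  m[j] - m[j-1] >= delta1
for (j in 2:J) {
  Amat[j - 1, j - 1] <- -1
  Amat[j,     j - 1] <-  1
}

# Second constraint, j = 3..J:
# (m[j-1] - m[j]) / (Q[j] - Q[j-1]) - (m[j-2] - m[j-1]) / (Q[j-1] - Q[j-2]) >= delta2
for (j in 3:J) {
  k  <- (J - 1) + (j - 2)            # column index of this constraint
  a1 <- 1 / (Q[j]     - Q[j - 1])
  a0 <- 1 / (Q[j - 1] - Q[j - 2])
  Amat[j,     k] <- -a1
  Amat[j - 1, k] <-  a1 + a0
  Amat[j - 2, k] <- -a0
}

fit   <- solve.QP(Dmat, dvec, Amat, bvec)
m_fit <- fit$solution
```

Note that, with this convention, dvec carries the same sign as the data: a negative dvec would pull the solution towards -m_hat. Also, m[j-1] appears in both fractions of the second constraint, so its coefficient is the sum a1 + a0.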


Re: [R] aggregate semi-hourly data not 00-24 but 9-9

2020-09-22 Thread Eric Berger
Thanks Jeff.
Stefano, per Jeff's comment, you can replace the line

df1$data_POSIXminus9 <- df1$data_POSIX - lubridate::hours(9)

by

df1$data_POSIXminus9 <- df1$data_POSIX - as.difftime(9,units="hours")
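
Putting the pieces together, a self-contained base R sketch of the 9-to-9 aggregation could look like the following. The data are simulated as in the example quoted below; summing only the positive increments is an assumption based on Stefano's description, and set.seed() is added only to make the run repeatable.

```r
# Simulated semi-hourly data, as in the quoted example
day_1 <- as.POSIXct("2020-02-19-00-00", format = "%Y-%m-%d-%H-%M", tz = "Etc/GMT-1")
day_2 <- as.POSIXct("2020-02-24-12-00", format = "%Y-%m-%d-%H-%M", tz = "Etc/GMT-1")
df1 <- data.frame(data_POSIX = seq(day_1, day_2, by = "30 min"))
set.seed(1)
df1$hs <- rnorm(nrow(df1), 40, 10)
df1$diff <- c(NA, diff(df1$hs))      # increment relative to the previous reading

# Shift every timestamp back by 9 hours, so that the calendar day of the
# shifted time runs from 09:00 to 09:00 of the original time.
df1$data_POSIXminus9 <- df1$data_POSIX - as.difftime(9, units = "hours")
df1$dayX <- format(df1$data_POSIXminus9, "%y-%m-%d")

# Sum only the positive increments within each shifted day.
# The formula interface drops the leading NA row by default.
df2X <- aggregate(diff ~ dayX, data = df1, FUN = function(d) sum(d[d > 0]))
df2X
```

With this shift, a reading stamped exactly 09:00 is counted in the day that starts at 09:00, and each label in dayX is the date on which that 9-to-9 window begins.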

On Mon, Sep 21, 2020 at 8:06 PM Jeff Newmiller  wrote:
>
> The base R as.difftime function is perfectly usable to create this offset 
> without pulling in lubridate.
>
> On September 21, 2020 8:06:51 AM PDT, Eric Berger  
> wrote:
> >Hi Stefano,
> >If you mean from 9am on one day to 9am on the following day, you can
> >do a trick. Simply subtract 9hrs from each timestamp and then you want
> >midnight to midnight for these adjusted times, which you can get using
> >the method you followed.
> >
> >I googled and found that lubridate::hours() can be used to add or
> >subtract hours from a POSIXct.
> >
> >library(lubridate)
> >
> >day_1 <- as.POSIXct("2020-02-19-00-00", format="%Y-%m-%d-%H-%M",
> >tz="Etc/GMT-1")
> >day_2 <- as.POSIXct("2020-02-24-12-00", format="%Y-%m-%d-%H-%M",
> >tz="Etc/GMT-1")
> >df1 <- data.frame(data_POSIX=seq(day_1, day_2, by="30 min"))
> >df1$hs <- rnorm(nrow(df1), 40, 10)
> >df1$diff[2:nrow(df1)] <- diff(df1$hs)
> >
> >df1$data_POSIXminus9 <- df1$data_POSIX - lubridate::hours(9)
> >df1$dayX <- format(df1$data_POSIXminus9,"%y-%m-%d")
> >df2X <- aggregate(diff ~ dayX, df1, sum)
> >df2X
> >
> >HTH,
> >Eric
> >
> >On Mon, Sep 21, 2020 at 5:30 PM Stefano Sofia
> > wrote:
> >>
> >> Dear R-list members,
> >> I have semi-hourly snowfall data.
> >> I need to sum the semi-hourly increments (only the positive ones, but
> >> this is not shown in my example) day by day, not from 00 to 24 but
> >> from 9 to 9.
> >>
> >> I am able to use the diff function, create a list of days and use the
> >> aggregate function, but it works only from 0 to 24. Any suggestion for
> >> an efficient way to do it?
> >> Here is my code:
> >> day_1 <- as.POSIXct("2020-02-19-00-00", format="%Y-%m-%d-%H-%M",
> >>                     tz="Etc/GMT-1")
> >> day_2 <- as.POSIXct("2020-02-24-12-00", format="%Y-%m-%d-%H-%M",
> >>                     tz="Etc/GMT-1")
> >> df1 <- data.frame(data_POSIX=seq(day_1, day_2, by="30 min"))
> >> df1$hs <- rnorm(nrow(df1), 40, 10)
> >> df1$diff[2:nrow(df1)] <- diff(df1$hs)
> >> df1$day <- format(df1$data_POSIX,"%y-%m-%d")
> >> df2 <- aggregate(diff ~ day, df1, sum)
> >>
> >> Thank you for your help
> >> Stefano
> >>
> >>  (oo)
> >> --oOO--( )--OOo
> >> Stefano Sofia PhD
> >> Civil Protection - Marche Region
> >> Meteo Section
> >> Snow Section
> >> Via del Colle Ameno 5
> >> 60126 Torrette di Ancona, Ancona
> >> Uff: 071 806 7743
> >> E-mail: stefano.so...@regione.marche.it
> >> ---Oo-oO
> >>
> >
>
> --
> Sent from my phone. Please excuse my brevity.


Re: [R] text on curve

2020-09-22 Thread Jim Lemon
Hi Jinsong,
This is similar to the "arctext" function in plotrix. I don't want to
do all the trig right now, but I would suggest placing the characters
on the curve and then offsetting them a constant amount at right
angles to the slope of the curve at each letter. I would first try
having a "minspace" argument to deal with crowding at small radii and
you would probably have to start at the middle and work out to each
end. A tough problem and you have made a good start on it.  Check the
fragment below for a suggestion on how to avoid calling "substr"
repeatedly.

# get a vector of the characters in str
# rather than call substr all the time
strbits <- unlist(strsplit(str, ""))

for(i in 1:l) {
   w <- strwidth(strbits[i])
   h <- strheight(strbits[i])
   # ... rest of the loop body as before, using strbits[i]
}

Jim
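
The "offset at right angles to the slope" idea can be sketched roughly as below. This is only an illustration, not plotrix code: the function name offset_on_curve and its arguments d and eps are hypothetical, and it assumes an aspect ratio of 1 (otherwise the slope would need the same correction that getCurrentAspect() provides in the quoted code).

```r
# Offset a point on the curve by a distance d along the unit normal.
# Assumes asp = 1; fun is an interpolating function such as approxfun(x, y).
offset_on_curve <- function(fun, x0, d, eps = 1e-4) {
  slope <- (fun(x0 + eps) - fun(x0 - eps)) / (2 * eps)  # central difference
  theta <- atan(slope)                                   # angle of the tangent
  # The tangent direction is (cos(theta), sin(theta)), so the normal
  # rotated 90 degrees counterclockwise is (-sin(theta), cos(theta)).
  c(x   = x0 - d * sin(theta),
    y   = fun(x0) + d * cos(theta),
    srt = theta * 180 / pi)                              # rotation for text()
}

x   <- seq(-5, 5, length.out = 100)
y   <- x^2
fun <- approxfun(x, y, rule = 2)
p   <- offset_on_curve(fun, 0, 0.5)
```

The returned values could then be passed to text(p["x"], p["y"], "a", srt = p["srt"]) for each character in turn. The normal direction used here is the same one that r.xy(0, h, theta) produces in the quoted code.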

On Tue, Sep 22, 2020 at 6:11 PM Jinsong Zhao  wrote:
>
> Hi there,
>
> I wrote a simple function that places text along a curve. Since I am not
> familiar with rotating graphical elements (text, rectangles, etc.), I would
> appreciate suggestions or hints on how to improve it. Thanks in advance.
>
> # Here is the code:
>
> getCurrentAspect <- function() {
>uy <- diff(grconvertY(1:2,"user","inches"))
>ux <- diff(grconvertX(1:2,"user","inches"))
>uy/ux
> }
>
> r.xy <- function(o.x, o.y, theta) {
>r.x <- o.x * cos(theta) - o.y * sin(theta)
>r.y <- o.x * sin(theta) + o.y * cos(theta)
>c(r.x, r.y)
> }
>
> text.on.curve <- function(x, y, x.s, str, ...) {
>
>l <- nchar(str)
>
>fun <- approxfun(x, y, rule = 2)
>
>for(i in 1:l) {
>   w <- strwidth(substr(str, i, i))
>   h <- strheight(substr(str, i, i))
>
>   x.l <- x.s
>   x.r <- x.s + w
>   y.l <- fun(x.l)
>   y.r <- fun(x.r)
>   theta <- atan((y.r - y.l)/(x.r - x.l) * getCurrentAspect())
>
>   lb.xy <- c(x.s, fun(x.s))
>   rb.xy <- lb.xy + r.xy(w, 0, theta)
>   lt.xy <- lb.xy + r.xy(0, h, theta)
>   rt.xy <- lb.xy + r.xy(w, h, theta)
>   c.xy <- lb.xy + r.xy(w/2, h/2, theta)
>
>   while(i > 1 && lt.xy[1] < rt.xy.old[1]) {
>  x.s <- x.s + 0.05 * w
>  x.l <- x.s
>  x.r <- x.s + w
>  y.l <- fun(x.l)
>  y.r <- fun(x.r)
>  theta <- atan((y.r - y.l)/(x.r - x.l) * getCurrentAspect())
>
>  lb.xy <- c(x.s, fun(x.s))
>  rb.xy <- lb.xy + r.xy(w, 0, theta)
>  lt.xy <- lb.xy + r.xy(0, h, theta)
>  rt.xy <- lb.xy + r.xy(w, h, theta)
>  c.xy <- lb.xy + r.xy(w/2, h/2, theta)
>   }
>
>   x.s <- rb.xy[1]
>   rt.xy.old <- rt.xy
>
>   text(c.xy[1], c.xy[2], substr(str, i, i), srt = theta * 180 / pi, ...)
>}
> }
>
> # A simple demo:
>
> x <- seq(-5, 5, length.out = 100)
> y <- x^2
> plot(x,y, type = "l")
> text.on.curve(x, y, -2 ,"a demo of text on curve", col = "red")
>
> Best,
> Jinsong
>



[R] text on curve

2020-09-22 Thread Jinsong Zhao
Hi there,

I wrote a simple function that places text along a curve. Since I am not
familiar with rotating graphical elements (text, rectangles, etc.), I would
appreciate suggestions or hints on how to improve it. Thanks in advance.

# Here is the code:

getCurrentAspect <- function() {
   uy <- diff(grconvertY(1:2,"user","inches"))
   ux <- diff(grconvertX(1:2,"user","inches"))
   uy/ux
}

r.xy <- function(o.x, o.y, theta) {
   r.x <- o.x * cos(theta) - o.y * sin(theta)
   r.y <- o.x * sin(theta) + o.y * cos(theta)
   c(r.x, r.y)
}

text.on.curve <- function(x, y, x.s, str, ...) {

   l <- nchar(str)

   fun <- approxfun(x, y, rule = 2)

   for(i in 1:l) {
      w <- strwidth(substr(str, i, i))
      h <- strheight(substr(str, i, i))

      x.l <- x.s
      x.r <- x.s + w
      y.l <- fun(x.l)
      y.r <- fun(x.r)
      theta <- atan((y.r - y.l)/(x.r - x.l) * getCurrentAspect())

      lb.xy <- c(x.s, fun(x.s))
      rb.xy <- lb.xy + r.xy(w, 0, theta)
      lt.xy <- lb.xy + r.xy(0, h, theta)
      rt.xy <- lb.xy + r.xy(w, h, theta)
      c.xy <- lb.xy + r.xy(w/2, h/2, theta)

      while(i > 1 && lt.xy[1] < rt.xy.old[1]) {
         x.s <- x.s + 0.05 * w
         x.l <- x.s
         x.r <- x.s + w
         y.l <- fun(x.l)
         y.r <- fun(x.r)
         theta <- atan((y.r - y.l)/(x.r - x.l) * getCurrentAspect())

         lb.xy <- c(x.s, fun(x.s))
         rb.xy <- lb.xy + r.xy(w, 0, theta)
         lt.xy <- lb.xy + r.xy(0, h, theta)
         rt.xy <- lb.xy + r.xy(w, h, theta)
         c.xy <- lb.xy + r.xy(w/2, h/2, theta)
      }

      x.s <- rb.xy[1]
      rt.xy.old <- rt.xy

      text(c.xy[1], c.xy[2], substr(str, i, i), srt = theta * 180 / pi, ...)
   }
}

# A simple demo:

x <- seq(-5, 5, length.out = 100)
y <- x^2
plot(x,y, type = "l")
text.on.curve(x, y, -2 ,"a demo of text on curve", col = "red")

Best,
Jinsong
 