date:20160420

[R] Mailing List

2016-04-20 Thread Ogbos Okike

Dear All,
I am using R to do my work and thank you very much for developing,
maintaining and making such excellent software available to anyone
that is interested enough to ask for it.

I have registered at Nabble, and I was wondering which forum is the right
one for my help requests. I have tried sending to R-help@r-project.org.
However, I receive a warning email stating that my message awaits
approval from the moderator, since I am a non-member posting to a
members-only list.

Can anyone please direct me to the right forum? My problems range from
plotting graphs in R to statistics in R, etc. You may have seen some of
my requests over the past few days.

Thank you for your time.
Ogbos

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Data reshaping with conditions

2016-04-20 Thread Jim Lemon

Hi sri,
I think that I see what you mean. Your statements:

x = Count_A_less_than_max of (Count type B)
y = Count_A_higher_than_max of (Count type B).

I took to mean that you wanted a logical value for x and y. Looking
more closely at your initial message, I see that you wanted _all_
values of A with respect to maxB in x and y. The error with maximum
values was due to a typo. Perhaps this will do what you want:

svdat<-read.table(text="Count id name type
117 335 sally A
19 335 sally A
167 335 sally B
18 340 susan A
56 340 susan A
22 340 susan B
53 340 susan B
135 351 lee A
114 351 lee A
84 351 lee A
80 351 lee A
19 351 lee A
8 351 lee A
21 351 lee A
88 351 lee B
111 351 lee B
46 351 lee B
108 351 lee B",header=TRUE)
# you can also do this with other reshape functions
library(prettyR)
svdatstr<-stretch_df(svdat,"id",c("Count","type"))
count_ind<-grep("Count",names(svdatstr))
type_ind<-grep("type",names(svdatstr))
svdatstr$maxA<-NA
svdatstr$maxB<-NA
svdatstr$x<-NA
svdatstr$y<-NA
for(row in 1:nrow(svdatstr)) {
 indicesA<-count_ind[as.logical(match(svdatstr[row,type_ind],"A",0))]
 svdatstr[row,"maxA"]<-max(svdatstr[row,indicesA])
 indicesB<-count_ind[as.logical(match(svdatstr[row,type_ind],"B",0))]
 svdatstr[row,"maxB"]<-max(svdatstr[row,indicesB])
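 AltB<-svdatstr[row,indicesA][svdatstr[row,indicesA]<svdatstr[row,"maxB"]]
 svdatstr[row,"x"]<-paste(AltB,collapse=",")
 AgeB<-svdatstr[row,indicesA][svdatstr[row,indicesA]>=svdatstr[row,"maxB"]]
 svdatstr[row,"y"]<-paste(AgeB,collapse=",")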
}
svdatstr[,c("id","name","maxB","x","y")]

Jim
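
For comparison, the same result can be sketched in base R without reshaping to wide format first. This is only an illustration under stated assumptions: the helper names summarise_id and join_or_na are made up for this example, every id is assumed to have at least one type B row, and ties with the B maximum are placed in y, as in the loop above.

```r
svdat <- read.table(text = "Count id name type
117 335 sally A
19 335 sally A
167 335 sally B
18 340 susan A
56 340 susan A
22 340 susan B
53 340 susan B
135 351 lee A
114 351 lee A
84 351 lee A
80 351 lee A
19 351 lee A
8 351 lee A
21 351 lee A
88 351 lee B
111 351 lee B
46 351 lee B
108 351 lee B", header = TRUE, stringsAsFactors = FALSE)

# Hypothetical helper: comma-separated string, or NA when nothing qualifies
join_or_na <- function(v) {
  if (length(v) > 0) paste(v, collapse = ",") else NA_character_
}

# Hypothetical helper: one summary row per id
summarise_id <- function(d) {
  a <- d$Count[d$type == "A"]
  b <- d$Count[d$type == "B"]
  max_b <- max(b)  # assumes at least one B row per id
  data.frame(id = d$id[1], name = d$name[1],
             count_of_B = join_or_na(b), max_B = max_b,
             x = join_or_na(a[a < max_b]),   # A counts below the B maximum
             y = join_or_na(a[a >= max_b]),  # A counts at or above it
             stringsAsFactors = FALSE)
}

result <- do.call(rbind, lapply(split(svdat, svdat$id), summarise_id))
rownames(result) <- NULL
result
```

Because x and y are built with paste(), they are character columns, which matches the stated requirement that the final output be a character variable.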


On Thu, Apr 21, 2016 at 2:23 PM, sri vathsan  wrote:
> Hi Jim,
>
> Thanks for your time. But somehow this code did not help me to achieve my
> expected output.
> Problems: 1) x and y are coming out as logical values rather than the
> values I mentioned in my post
>           2) The values that I get for Max A and Max B are not correct
>           3) It looks like pretty big data, but I just need to
> concatenate the values with a comma; the final output will be a character
> variable.
>
> Regards,
> Sri

Re: [R] overlay two facet_grid

2016-04-20 Thread Ulrik Stervbo

It sounds like you want to use grid.arrange() from gridExtra:
https://cran.r-project.org/web/packages/gridExtra/vignettes/arrangeGrob.html

Hope this helps,
Ulrik


Re: [R] Data reshaping with conditions

2016-04-20 Thread Jim Lemon

Hi sri,
As your problem involves a few logical steps, I found it easier to
approach it in a stepwise way. Perhaps there are more elegant ways to
accomplish this.

svdat<-read.table(text="Count id name type
117 335 sally A
19 335 sally A
167 335 sally B
18 340 susan A
56 340 susan A
22 340 susan B
53 340 susan B
135 351 lee A
114 351 lee A
84 351 lee A
80 351 lee A
19 351 lee A
8 351 lee A
21 351 lee A
88 351 lee B
111 351 lee B
46 351 lee B
108 351 lee B",header=TRUE)
# you can also do this with other reshape functions
library(prettyR)
svdatstr<-stretch_df(svdat,"id",c("Count","type"))
count_ind<-grep("Count",names(svdatstr))
type_ind<-grep("type",names(svdatstr))
svdatstr$maxA<-NA
svdatstr$maxB<-NA
svdatstr$x<-NA
svdatstr$y<-NA
for(row in 1:nrow(svdatstr)) {
 svdatstr[row,"maxA"]<-
  max(svdatstr[row,count_ind[as.logical(match(svdatstr[1,type_ind],"A",0))]])
 svdatstr[row,"maxB"]<-
  max(svdatstr[row,count_ind[as.logical(match(svdatstr[1,type_ind],"B",0))]])
 svdatstr[row,"x"]<-svdatstr[row,"maxA"] < svdatstr[row,"maxB"]
 svdatstr[row,"y"]<-!svdatstr[row,"x"]
}
svdatstr

You can then just extract the columns that you need.

Jim


On Wed, Apr 20, 2016 at 3:03 PM, sri vathsan  wrote:
> Dear All,
>
> I am trying to reshape the data with some conditions. A small part of the
> data looks like below. Like this there will be more data with repeating ID.
>
> Count id name type
> 117 335 sally A
> 19 335 sally A
> 167 335 sally B
> 18 340 susan A
> 56 340 susan A
> 22 340 susan B
> 53 340 susan B
> 135 351 lee A
> 114 351 lee A
> 84 351 lee A
> 80 351 lee A
> 19 351 lee A
> 8 351 lee A
> 21 351 lee A
> 88 351 lee B
> 111 351 lee B
> 46 351 lee B
> 108 351 lee B
>
> From the above data I am expecting an output like below.
>
> id   name   type  count_of_B     Max of count B  x              y
> 335  sally  B     167            167             117,19         NA
> 340  susan  B     22,53          53              18             56
> 351  lee    B     88,111,46,108  111             84,80,19,8,21  135,114
>
> Where, the column x and column y are:
>
> x = Count_A_less_than_max of (Count type B)
> y = Count_A_higher_than_max of (Count type B).
>
> *1)* I tried dplyr with the following code for the initial step, to get
> the values for each column.
> *2)* I then planned to transpose the columns so that each unique ID has
> a single row.
>
> I tried the following code and I am stuck at the initial step itself.
> The code runs, but the higher and lower values of A do not come out.
>
> Expected_output= data %>%
>   group_by(id, Type) %>%
>   mutate(Count_of_B = paste(unlist(count[Type=="B"]), collapse = ",")) %>%
>   mutate(Max_of_count_B = ifelse(Type == "B", max(count[Type == "B"]),
>                                  max(count[Type == "A"]))) %>%
>   mutate(count_type_A_lesser = ifelse(Type=="B",
>     (paste(unlist(count[Type=="A"]) < Max_of_count_B[Type=="B"],
>            collapse = ",")), "NA")) %>%
>   mutate(count_type_A_higher = ifelse(Type=="B",
>     (paste(unlist(count[Type=="A"]) > Max_of_count_B[Type=="B"],
>            collapse = ",")), "NA"))
>
> I hope I have made my point clear. Please bear with the code, as I am new
> to this.
>
> Regards,
> sri

Re: [R] installation problem on Ubuntu

2016-04-20 Thread Jeff Newmiller

Have you read the CRAN  instructions for installing on Ubuntu?  Have you read 
the Posting Guide that mentions the R-sig-debian mailing list and that if you 
need help compiling R this is not the right list?
-- 
Sent from my phone. Please excuse my brevity.

On April 20, 2016 9:36:51 AM PDT, Paul Tremblay  wrote:
>I needed to update R so I could install ggplot. I am running Ubuntu
>12.04.
>I cannot upgrade Ubuntu because I am using a work computer.
>
>I tried upgrading the normal way:
>
>sudo apt-get update
> sudo apt-get install r-base r-base-dev
>
>But this only installed an earlier version. Finally I tried installing
>from
>source (./configure, Make install). This worked. However, when I try to
>install packages, I get this error:
>
>Error in download.file(url, destfile = f, quiet = TRUE) :
>  internet routines cannot be loaded
>In addition: Warning message:
>In download.file(url, destfile = f, quiet = TRUE) :
>  unable to load shared object '/usr/local/lib/R/modules//internet.so':
>/usr/local/lib/R/modules//internet.so: undefined symbol:
>curl_multi_wait
>
>
>>> ls /usr/local/lib/R/modules/
>>> R_X11.so  R_de.so  internet.so  lapack.so
>
>Thanks!
>
>P

Re: [R] overlay two facet_grid

2016-04-20 Thread Jeff Newmiller

http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example

Overlaying aesthetics is possible. Overlaying graphs is not. Without sample 
data, concrete examples will be unlikely to  appear, so read the above link and 
pay attention to the dput function. 
-- 
Sent from my phone. Please excuse my brevity.
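
As an illustration of overlaying aesthetics rather than graphs, the two plots from the question can be sketched as two point layers in a single ggplot call, since they share the same data and facets. The data frame below is invented for the example (the original df was not posted), and the larger point size of the orange layer is an assumption made only so both layers remain visible.

```r
library(ggplot2)

# Made-up stand-in for the poster's df; only the column names come from the post
set.seed(42)
df <- data.frame(
  TE  = runif(40, 0, 200),
  TR  = runif(40, 0, 1),
  TST = sample(c("a", "b"), 40, replace = TRUE),
  FS  = sample(c("1.5", "3"), 40, replace = TRUE),
  TRJ = sample(c("cart", "rad"), 40, replace = TRUE),
  OR  = sample(c("sag", "tra"), 40, replace = TRUE),
  INV = sample(c("no", "yes"), 40, replace = TRUE)
)

p <- ggplot(df, aes(x = TE, y = TR)) +
  geom_point(color = "orange", size = 3) +  # fixed colour: set outside aes()
  geom_point(aes(color = TST)) +            # mapped colour: set inside aes()
  facet_grid(FS + TRJ ~ OR + INV, labeller = label_both) +
  xlim(0, 200) + ylim(0, 1)

print(p)
```

Note that color="orange" inside aes(), as in the question, maps a constant to the colour scale instead of drawing orange points; a fixed colour has to be given outside aes().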


[R] overlay two facet_grid

2016-04-20 Thread ch.elahe via R-help

Hi all,
Does anyone know how to overlay two facet_grids? I have two facet grids as
follows:


ggplot(data=df,aes(x=TE,y=TR,color="orange"))+geom_point()+facet_grid(FS+TRJ~OR+INV,labeller=label_both)+xlim(0,200)+ylim(0,1)
ggplot(data=df,aes(x=TE,y=TR))+geom_point(aes(color=TST))+facet_grid(FS+TRJ~OR+INV,labeller=label_both)+xlim(0,200)+ylim(0,1)

Thanks for any help!
Elahe


Re: [R] Parsing and counting expressions in .txt-files

2016-04-20 Thread Bert Gunter

also check out this CRAN task view:

https://cran.r-project.org/web/views/NaturalLanguageProcessing.html

Cheers,
Bert


Bert Gunter

"The trouble with having an open mind is that people keep coming along
and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Wed, Apr 20, 2016 at 9:07 AM, Alexander Nikles <24...@novasbe.pt> wrote:
> Dear Community,
>
> I hope that I have selected the right category, because I am relatively
> new to the "R" world. I come with a relatively challenging problem. I
> would like R to read text files (there are several hundred in my
> folder) sequentially and screen them for specific terms. If a term is
> found, the program should write a 1, if not a 0. Another task is to
> scrape a ten-digit number from the file after a particular keyword, so
> that I can map the results. Ideally, the program should create a .txt
> file.
>
> A brief example:
>
> Keywords: "surpassed", "achieved", "very motivated"
>
> Text1:
>
> "Personnel number: 0123456789
>
> The employee has exceeded the set targets and was also otherwise always
> motivated (...) "
>
> So for this case I would like my program to produce the following (in
> lines and columns):
>
> Personnel number;surpassed;achieved;very motivated (do not write)
> 0123456789;1;0;1
>
> For the following files, it should continue analogously in line 2, 3, 4
> and so on.
>
> Could you give a brief assessment of how to realize such a thing? How do
> I best start, and have you possibly come across something similar in R
> before? I am grateful for any suggestions.
>
> Thank you in advance,
>
> Alex

Re: [R] Parsing and counting expressions in .txt-files

2016-04-20 Thread Bert Gunter

I suggest you go through some R tutorials to learn about R's
capabilities.  Some recommendations can be found here:
https://www.rstudio.com/online-learning/#R

To answer your specific query:

?scan  ## Because you do not specify file format.

?grep  ?regexp ## to use regular expressions to find text.

R may not be the best tool for this task, however. Or certain R
packages may be better than the basic R tools. Try searching on the
rseek.org site to see what might be available if you do not receive
suggestions here.

Cheers,
Bert

Bert Gunter

"The trouble with having an open mind is that people keep coming along
and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
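
A minimal sketch of the grep/regexp route with base R only, under several assumptions: the files are plain text, the ten-digit number always follows the literal text "Personnel number: ", and keywords are matched literally. The function name scan_file, the temporary folder, and the two demo files are invented for this illustration.

```r
keywords <- c("surpassed", "achieved", "very motivated")

# Hypothetical helper: one result row per file
scan_file <- function(path) {
  txt <- paste(readLines(path, warn = FALSE), collapse = " ")
  # Ten digits directly after the (assumed) label "Personnel number: "
  m <- regmatches(txt, regexpr("(?<=Personnel number: )[0-9]{10}", txt,
                               perl = TRUE))
  id <- if (length(m) > 0) m else NA_character_
  # 1 if the keyword occurs literally in the file, 0 otherwise
  hits <- vapply(keywords,
                 function(k) as.integer(grepl(k, txt, fixed = TRUE)),
                 integer(1))
  out <- data.frame(id, t(hits), stringsAsFactors = FALSE)
  names(out) <- c("Personnel number", keywords)
  out
}

# Demo input: two made-up files in a temporary folder
in_dir <- tempfile("reports")
dir.create(in_dir)
writeLines(c("Personnel number: 0123456789", "",
             "The employee has surpassed the set targets and was very motivated."),
           file.path(in_dir, "a.txt"))
writeLines(c("Personnel number: 9876543210", "Targets were achieved."),
           file.path(in_dir, "b.txt"))

files <- list.files(in_dir, pattern = "\\.txt$", full.names = TRUE)
result <- do.call(rbind, lapply(files, scan_file))

# Semicolon-separated text file, one line per input file
out_file <- tempfile(fileext = ".txt")
write.table(result, out_file, sep = ";", row.names = FALSE, quote = FALSE)
result
```

Literal matching would not flag the sample text in the question, which says "exceeded" and "always motivated" rather than "surpassed" and "very motivated"; catching such variants would need a list of synonyms or patterns per keyword.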



Re: [R] Solving sparse, singular systems of equations

2016-04-20 Thread Jeff Newmiller

The usual culprit in messy code is posting in HTML format. That usually leads 
to stripping of the formatting by the mailing list and a notice that that 
occurred, but I don't see that warning here. I still think posting plain text 
format would fix the problem. 
-- 
Sent from my phone. Please excuse my brevity.

On April 20, 2016 11:51:40 AM PDT, A A via R-help  wrote:
>Thanks for the help. Sorry, I am not sure why it looks like that in the
>mailing list - it looks much more neat on my end (see attached file). 
>
>On Wednesday, April 20, 2016 2:01 PM, Berend Hasselman 
>wrote:
> 
>
> 
>> On 20 Apr 2016, at 13:22, A A via R-help 
>wrote:
>> 
>> 
>> 
>> 
>> I have a situation in R where I would like to find any x (if one
>exists) that solves the linear system of equations Ax = b, where A is
>square, sparse, and singular, and b is a vector. Here is some code that
>mimics my issue with a relatively simple A and b, along with three
>other methods of solving this system that I found online, two of which
>give me an error and one of which succeeds on the simplified problem,
>but fails on my data set(attached). Is there a solver in R that I can
>use in order to get x without any errors given the structure of A?
>Thanks for your time.
>> #CODE STARTS HEREA =
>as(matrix(c(1.5,-1.5,0,-1.5,2.5,-1,0,-1,1),nrow=3,ncol=3),"sparseMatrix")b
>= matrix(c(-30,40,-10),nrow=3,ncol=1)
>> #solve for x, Error in LU.dgC(a) : cs_lu(A) failed: near-singular A
>(or out of memory)solve(A,b,sparse=TRUE,tol=.Machine$double.eps)
>> #one x that happens to solve Ax = bx =
>matrix(c(-10,10,0),nrow=3,ncol=1)A %*% x
>> #Error in lsfit(A, b) : only 3 cases, but 4
>variableslsfit(A,b)#solves the system, but fails belowsolve(qr(A,
>LAPACK=TRUE),b)#Error in qr.solve(A, b) : singular matrix 'a' in
>solveqr.solve(A,b)
>> #matrices used in my actual problem (see attached files)A =
>readMM("A.txt")b = readMM("b.txt")
>> #Error in as(x, "matrix")[i, , drop = drop] : subscript out of
>boundssolve(qr(A, LAPACK=TRUE),b)
>
>Your code is a mess. 
>
>A singular square system of linear equations has an infinity of
>solutions if a solution exists at all.
>How that works you can find here:
>https://en.wikipedia.org/wiki/System_of_linear_equations
>in the section "Matrix solutions".
>
>For your simple example you can do it like this:
>
>library(MASS)
>Ag <- ginv(A)    # pseudoinverse
>
>xb <- Ag %*% b # minimum norm solution
>
>Aw <- diag(nrow=nrow(Ag)) - Ag %*% A  # see the Wikipedia page
>w <- runif(3)
>z <- xb + Aw %*% w
>A %*% z - b
>
>N <- Null(t(A))    # null space of A;  see the help for Null in package
>MASS
>A %*% N
>A %*% (xb + 2 * N) - b
>
>For sparse systems you will have to approach this differently; I have
>no experience with that.
>
>Berend
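
The pseudoinverse approach described above can be run as a self-contained script on the small example from the question. This is a sketch only: A is built here as an ordinary dense matrix, because ginv() from MASS works on dense input, so it does not address the large sparse case.

```r
library(MASS)  # provides ginv() and Null()

# The 3 x 3 singular system from the question, as a dense matrix
A <- matrix(c(1.5, -1.5, 0, -1.5, 2.5, -1, 0, -1, 1), nrow = 3, ncol = 3)
b <- matrix(c(-30, 40, -10), nrow = 3, ncol = 1)

Ag <- ginv(A)      # Moore-Penrose pseudoinverse
xb <- Ag %*% b     # minimum-norm solution
residual <- max(abs(A %*% xb - b))

N <- Null(t(A))    # basis of the null space of A
z <- xb + 2 * N    # another solution: xb shifted along the null space
residual_z <- max(abs(A %*% z - b))

xb
c(residual = residual, residual_z = residual_z)
```

Both residuals should be zero up to rounding error, which confirms that b lies in the column space of A and that adding any multiple of the null-space vector gives another solution.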

Re: [R] Matrix: How create a _row-oriented_ sparse Matrix (=dgRMatrix)?

2016-04-20 Thread Henrik Bengtsson

On Wed, Apr 20, 2016 at 1:25 AM, Martin Maechler
 wrote:
>> Henrik Bengtsson 
>> on Tue, 19 Apr 2016 14:04:11 -0700 writes:
>
> > Using the Matrix package, how can I create a row-oriented sparse
> > Matrix from scratch populated with some data?  By default a
> > column-oriented one is created and I'm aware of the note that the
> > package is optimized for column-oriented ones, but I'm only interested
> > in using it for holding my sparse row-oriented data and doing basic
> > subsetting by rows (even using drop=FALSE).
>
> > Here is what I get when I set up a column-oriented sparse Matrix:
>
> >> Cc <- Matrix(0, nrow=5, ncol=5, sparse=TRUE)
> >> Cc[1:3,1] <- 1
>
> A general ("teaching") remark :
> The above use of Matrix() is seen in many places, and is fine
> for small matrices and the case where you only use the `[<-`
> method very few times (as above).
> Also using  Matrix()  is nice when being introduced to using the
> Matrix package.
>
> However, for efficiency in non-small cases, do use
>
>    sparseMatrix()
>
> directly to construct sparse matrices.
>
>
> >> Cc
> > 5 x 5 sparse Matrix of class "dgCMatrix"
>
> > [1,] 1 . . . .
> > [2,] 1 . . . .
> > [3,] 1 . . . .
> > [4,] . . . . .
> > [5,] . . . . .
> >> str(Cc)
> > Formal class 'dgCMatrix' [package "Matrix"] with 6 slots
> > ..@ i   : int [1:3] 0 1 2
> > ..@ p   : int [1:6] 0 3 3 3 3 3
> > ..@ Dim : int [1:2] 5 5
> > ..@ Dimnames:List of 2
> > .. ..$ : NULL
> > .. ..$ : NULL
> > ..@ x   : num [1:3] 1 1 1
> > ..@ factors : list()
>
> > When I try to do the analogue for a row-oriented matrix, I get a
> > "dgTMatrix", whereas I would expect a "dgRMatrix":
>
> >> Cr <- Matrix(0, nrow=5, ncol=5, sparse=TRUE)
> >> Cr <- as(Cr, "dsRMatrix")
> >> Cr[1,1:3] <- 1
> >> Cr
> > 5 x 5 sparse Matrix of class "dgTMatrix"
>
> > [1,] 1 1 1 . .
> > [2,] . . . . .
> > [3,] . . . . .
> > [4,] . . . . .
> > [5,] . . . . .
>
> The reason for the above behavior has been
>
> a) efficiency.  All the subassignment ( `[<-` ) methods for
>"RsparseMatrix" objects (of which "dsRMatrix" is a special case)
>are implemented via  TsparseMatrix.
> b) because of the general attitude that Csparse (and Tsparse to
>some extent) are well supported in Matrix,
>and e.g. further operations on Rsparse matrices would *again*
>go via T* or C* sparse ones, I had decided to keep things Tsparse.

Thanks, understanding these design decisions is helpful.
Particularly, since I consider myself a rookie when it comes to the
Matrix package.

>
> [...]
>
> > Trying with explicit coercion does not work:
>
> >> as(Cc, "dgRMatrix")
> > Error in as(Cc, "dgRMatrix") :
> > no method or default for coercing "dgCMatrix" to "dgRMatrix"
>
> >> as(Cr, "dgRMatrix")
> > Error in as(Cr, "dgRMatrix") :
> > no method or default for coercing "dgTMatrix" to "dgRMatrix"
>
> The general philosophy in 'Matrix' with all the class
> hierarchies and the many specific classes has been to allow and
> foster coercing to abstract super classes,
> i.e, to  "dMatrix"  or "generalMatrix", "triangularMatrix", or
> then "denseMatrix", "sparseMatrix", "CsparseMatrix" or
> "RsparseMatrix", etc
>
> So in the above  as(*, "RsparseMatrix")   should work always.

Thanks for pointing this out (and confirming as I since discovered the
virtual RsparseMatrix class in the help).

>
>
> As a summary, in other words,  for what you want,
>
>as(sparseMatrix(.), "RsparseMatrix")
>
> should give you what you want reliably and efficiently.

Perfect.
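
A short sketch of that recipe, assuming the Matrix package is installed; the index vectors and dimensions simply reproduce the 5 x 5 example from the start of this thread.

```r
library(Matrix)

# Build the sparse matrix directly from (row, column, value) triplets ...
Cc <- sparseMatrix(i = c(1, 1, 1), j = 1:3, x = c(1, 1, 1), dims = c(5, 5))

# ... then coerce to the virtual row-oriented class
Cr <- as(Cc, "RsparseMatrix")

class(Cr)
Cr[1, , drop = FALSE]  # row subsetting, keeping the matrix structure
```

Coercing to the virtual class "RsparseMatrix" lets the package choose the concrete row-oriented class, instead of requesting a specific one such as "dgRMatrix".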

>
>
> > Am I doing something wrong here?  Or is this what is meant by the package
> > being optimized for the column-oriented representation, and I shouldn't
> > really work with row-oriented ones?  I'm really only interested in
> > access to efficient Cr[row,,drop=FALSE] subsetting (and a small memory
> > footprint).
>
> { though you could equivalently use   Cc[,row, drop=FALSE]
>   with a CsparseMatrix Cc := t(Cr),
>   couldn't you ?
> }

Yes, I actually went ahead and did that, but since the code I'm writing
supports both plain matrix:es and sparse Matrix:es, and the underlying
model operates row-by-row, I figured the code would be more consistent
if I could use row-orientation everywhere.  Not a big deal.

Thanks Martin

Henrik

>
>
> Martin Maechler  (maintainer of 'Matrix')
> ETH Zurich
>


Re: [R] Splitting Numerical Vector Into Chunks

2016-04-20 Thread William Dunlap via R-help

> i <- seq_len(length(x)-1)
> split(x, cumsum(c(TRUE, (x[i]==0) != (x[i+1]==0))))
$`1`
[1] 0.144872972504 0.850797178400

$`2`
[1] 0 0 0

$`3`
[1] 0.199304859380 2.063609410700 0.939393760782 0.838781367540

$`4`
[1] 0 0 0 0 0

$`5`
[1] 0.374688091264 0.488423999452 0.783034615362 0.626990428900
0.138188255307 2.324635712186

$`6`
[1] 0 0 0 0 0 0 0


Bill Dunlap
TIBCO Software
wdunlap tibco.com


Re: [R] Splitting Numerical Vector Into Chunks

2016-04-20 Thread Ista Zahn

Perhaps

x <- split(x, x == 0)

Best,
Ista
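
Note that split(x, x == 0) returns only two groups, all zeros and all nonzeros. If each consecutive run should be kept as its own chunk, one possible sketch uses rle() to number the runs; the seed and the names runs, run_id, and chunks are choices made for this illustration.

```r
set.seed(1)
x <- abs(c(rnorm(2), rep(0, 3), rnorm(4), rep(0, 5), rnorm(6), rep(0, 7)))

runs   <- rle(x == 0)                                 # runs of pause / movement
run_id <- rep(seq_along(runs$lengths), runs$lengths)  # run number per element
chunks <- split(x, run_id)                            # one list entry per run

lengths(chunks)  # sizes of the consecutive chunks
runs$values      # TRUE where the corresponding chunk is a pause
```

The list keeps the original order, so the speeds preceding a pause are in the chunk just before the one where runs$values is TRUE.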

On Wed, Apr 20, 2016 at 9:40 AM, Sidoti, Salvatore A.
 wrote:
> Greetings!
>
> I have several large data sets of animal movements. Their pauses (zero 
> magnitude vectors) are of particular interest in addition to the speed 
> distributions that precede the periods of rest. Here is an example of the 
> kind of data I am interested in analyzing:
>
> x <- 
> abs(c(rnorm(2),replicate(3,0),rnorm(4),replicate(5,0),rnorm(6),replicate(7,0)))
> length(x)
>
> This example has 27 elements with strings of zeroes (pauses) situated among 
> the speed values.
> Is there a way to split the vector into zero and nonzero chunks and store 
> them in a form where they can be analyzed? I have tried various forms of 
> split() to no avail.
>
> Thank you!
> Salvatore A. Sidoti

Re: [R] Merging Data Sets with Full Outer Join

2016-04-20 Thread Ista Zahn

Kunden <- Kunden_2011
Kunden <- merge(Kunden, Kunden_2012,
by = "Debitor", all = TRUE)

etc.

See ?merge for details.

Best,
Ista
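
The "etc." above can be written as a single chained call. A minimal sketch, assuming small made-up yearly tables; only the key column "Debitor" and the Kunden_20xx naming come from the thread, the Umsatz_* columns and values are hypothetical:

```r
# Hypothetical yearly customer tables sharing the key column "Debitor"
Kunden_2011 <- data.frame(Debitor = c(1, 2, 3), Umsatz_2011 = c(10, 20, 30))
Kunden_2012 <- data.frame(Debitor = c(2, 3, 4), Umsatz_2012 = c(21, 31, 41))
Kunden_2013 <- data.frame(Debitor = c(3, 5),    Umsatz_2013 = c(32, 52))

# Fold merge() over the list; all = TRUE keeps unmatched rows from both sides
Kunden <- Reduce(
  function(x, y) merge(x, y, by = "Debitor", all = TRUE),
  list(Kunden_2011, Kunden_2012, Kunden_2013)
)

Kunden   # one row per Debitor seen in any year, NA where a year has no entry
```

Adding further years only means extending the list, which avoids repeating the merge() call for each table.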

On Wed, Apr 20, 2016 at 2:23 AM,   wrote:
> Hi All,
>
> I would like to match some datasets. Both deliver variables AND cases
> which might or might not be present in all datasets:
>
> This sequence
>
> Kunden <- Kunden_2011
> Kunden <- merge(Kunden, Kunden_2012,
> by.x = "Debitor", by.y = "Debitor")
>
> Kunden <- merge(Kunden, Kunden_2013,
> by.x = "Debitor", by.y = "Debitor")
>
> Kunden <- merge(Kunden, Kunden_2014,
> by.x = "Debitor", by.y = "Debitor")
>
> Kunden <- merge(Kunden, Kunden_2015,
> by.x = "Debitor", by.y = "Debitor")
>
> delivers too few cases. So I guess it does an equi-join.
>
> How can I join the datasets and keep the variables as well as the cases?
>
> I am looking forward to your reply.
>
> Kind regards
>
> Georg
>

Re: [R] Merging Data Sets with Full Outer Join

2016-04-20 Thread David Winsemius


> On Apr 19, 2016, at 11:23 PM, g.maub...@weinwolf.de wrote:
> 
> Hi All,
> 
> I would like to match some datasets. Both deliver variables AND cases 
> which might or might not be present in all datasets:
> 
> This sequence
> 
> Kunden <- Kunden_2011 
> Kunden <- merge(Kunden, Kunden_2012,
>by.x = "Debitor", by.y = "Debitor")
> 
> Kunden <- merge(Kunden, Kunden_2013,
>by.x = "Debitor", by.y = "Debitor")
> 
> Kunden <- merge(Kunden, Kunden_2014,
>by.x = "Debitor", by.y = "Debitor")
> 
> Kunden <- merge(Kunden, Kunden_2015,
>by.x = "Debitor", by.y = "Debitor")
> 
> delivers too few cases. So I guess it does an equi-join.

You should not be guessing. Read the help page. It calls the default setting a 
natural join.

> 
> How can I join the datasets and keep the variables as well as the cases?
> 

If you want a full outer join use all=TRUE. This, too, should have been in the 
?merge help page.


> I am looking forward to your reply.
> 
> Kind regards
> 
> Georg
> 

David Winsemius
Alameda, CA, USA
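
To make the difference between the default and all=TRUE concrete, here is a small sketch on toy data. The data frames a and b and their columns x and y are hypothetical; only the key name "Debitor" is taken from the question:

```r
# Hypothetical toy data; keys 2 and 3 appear in both tables
a <- data.frame(Debitor = c(1, 2, 3), x = c("a1", "a2", "a3"))
b <- data.frame(Debitor = c(2, 3, 4), y = c("b2", "b3", "b4"))

inner <- merge(a, b, by = "Debitor")                # default: matching keys only
left  <- merge(a, b, by = "Debitor", all.x = TRUE)  # keep every row of a
full  <- merge(a, b, by = "Debitor", all = TRUE)    # keep every row of both

c(inner = nrow(inner), left = nrow(left), full = nrow(full))
# inner = 2, left = 3, full = 4
```

The shrinking row count described in the question is what the first call produces when it is applied year after year.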


[R-es] Script without results

2016-04-20 Thread Manuel Máquez

Carlos:
Thank you once again for your reply, and even more for how quickly it came.
I will put your suggestion into practice and will let you know the
results afterwards.
See you later and many thanks.
Sincerely,
*MANOLO MÁRQUEZ P.*

___
R-help-es mailing list
R-help-es@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-help-es

Re: [R] Solving sparse, singular systems of equations

2016-04-20 Thread A A via R-help

Thanks for the help. Sorry, I am not sure why it looks like that in the mailing 
list - it looks much more neat on my end (see attached file). 

On Wednesday, April 20, 2016 2:01 PM, Berend Hasselman  
wrote:

> On 20 Apr 2016, at 13:22, A A via R-help  wrote:
> 
> 
> 
> 
> I have a situation in R where I would like to find any x (if one exists) that 
> solves the linear system of equations Ax = b, where A is square, sparse, and 
> singular, and b is a vector. Here is some code that mimics my issue with a 
> relatively simple A and b, along with three other methods of solving this 
> system that I found online, two of which give me an error and one of which 
> succeeds on the simplified problem, but fails on my data set(attached). Is 
> there a solver in R that I can use in order to get x without any errors given 
> the structure of A? Thanks for your time.
> #CODE STARTS HERE
> A = as(matrix(c(1.5,-1.5,0,-1.5,2.5,-1,0,-1,1),nrow=3,ncol=3),"sparseMatrix")
> b = matrix(c(-30,40,-10),nrow=3,ncol=1)
> #solve for x, Error in LU.dgC(a) : cs_lu(A) failed: near-singular A (or out of memory)
> solve(A,b,sparse=TRUE,tol=.Machine$double.eps)
> #one x that happens to solve Ax = b
> x = matrix(c(-10,10,0),nrow=3,ncol=1)
> A %*% x
> #Error in lsfit(A, b) : only 3 cases, but 4 variables
> lsfit(A,b)
> #solves the system, but fails below
> solve(qr(A, LAPACK=TRUE),b)
> #Error in qr.solve(A, b) : singular matrix 'a' in solve
> qr.solve(A,b)
> #matrices used in my actual problem (see attached files)
> A = readMM("A.txt")
> b = readMM("b.txt")
> #Error in as(x, "matrix")[i, , drop = drop] : subscript out of bounds
> solve(qr(A, LAPACK=TRUE),b)

Your code is a mess. 

A singular square system of linear equations has an infinity of solutions if a 
solution exists at all.
How that works you can find here: 
https://en.wikipedia.org/wiki/System_of_linear_equations
in the section "Matrix solutions".

For your simple example you can do it like this:

library(MASS)
Ag <- ginv(A)    # pseudoinverse

xb <- Ag %*% b # minimum norm solution

Aw <- diag(nrow=nrow(Ag)) - Ag %*% A  # see the Wikipedia page
w <- runif(3)
z <- xb + Aw %*% w
A %*% z - b

N <- Null(t(A))    # null space of A;  see the help for Null in package MASS
A %*% N
A %*% (xb + 2 * N) - b

For sparse systems you will have to approach this differently; I have no 
experience with that.

Berend


Re: [R] Solving sparse, singular systems of equations

2016-04-20 Thread A A via R-help

Thanks for the response. Yes, in that situation a solution of x = 1 would be 
just as good as x = 1000 or any other value of x for me (but in my problem the 
matrix has nonzero rank, so I can't just randomly choose a vector and have it 
be a solution). If it helps, what I'm interested in is the R equivalent of 
x = A\b
in MATLAB, for these particular kinds of A matrices. I looked into irlba, and 
it seems to be able to calculate some of the singular values/vectors for the 
large dataset without taking too much time. I'll look more into seeing how I 
can solve the system with it. 

On Wednesday, April 20, 2016 11:01 AM, Jeff Newmiller 
 wrote:
 

 This is kind of like asking for a solution to x+1=x+1. Go back to linear 
algebra and look up Singular Value Decomposition, and decide if you really want 
to proceed. See also ?svd and package irlba.
-- 
Sent from my phone. Please excuse my brevity.

On April 20, 2016 4:22:34 AM PDT, A A via R-help  wrote:



 I have a situation in R where I would like to find any x (if one exists) that 
solves the linear system of equations Ax = b, where A is square, sparse, and 
singular, and b is a vector. Here is some code that mimics my issue with a 
relatively simple A and b, along with three other methods of solving this 
system that I found online, two of which give me an error and one of which 
succeeds on the simplified problem, but fails on my data set(attached). Is 
there a solver in R that I can use in order to get x without any errors given 
the structure of A? Thanks for your time.
#CODE STARTS HERE
A = as(matrix(c(1.5,-1.5,0,-1.5,2.5,-1,0,-1,1),nrow=3,ncol=3),"sparseMatrix")
b = matrix(c(-30,40,-10),nrow=3,ncol=1)
#solve for x, Error in LU.dgC(a) : cs_lu(A) failed: near-singular A (or out of memory)
solve(A,b,sparse=TRUE,tol=.Machine$double.eps)
#one x that happens to solve Ax = b
x = matrix(c(-10,10,0),nrow=3,ncol=1)
A %*% x
#Error in lsfit(A, b) : only 3 cases, but 4 variables
lsfit(A,b)
#solves the system, but fails below
solve(qr(A, LAPACK=TRUE),b)
#Error in qr.solve(A, b) : singular matrix 'a' in solve
qr.solve(A,b)
#matrices used in my actual problem (see attached files)
A = readMM("A.txt")
b = readMM("b.txt")
#Error in as(x, "matrix")[i, , drop = drop] : subscript out of bounds
solve(qr(A, LAPACK=TRUE),b)



[R] Parsing and counting expressions in .txt-files

2016-04-20 Thread Alexander Nikles

Dear Community,

I hope I have selected the right category, because I am relatively new
to the "R" world. I come with a relatively challenging problem in my
luggage. I would like "R" to read text files (there are
several hundred in my folder) sequentially and screen them for specific
terms. If a term is found, the program should write a 1, if not a 0.
Another task is to scrape a ten-digit number from the file after a
particular keyword, so that I can map the results. Ideally, the program
should create a .txt file.

A brief example:

Keywords: "surpassed", "achieved", "very motivated"

Text1:

"Personnel number: 0123456789

The employee has exceeded the set targets and was also otherwise always
motivated (...) "

So for this case I would like my program to produce the following (in
lines and columns):

Personnel number;surpassed;achieved;very motivated (do not write)
0123456789;1;0;1

For the following files, it should continue analogously in lines 2, 3, 4
and so on.

Could you give a brief assessment of how to realize such a thing? How do I
best start, and have you possibly "stumbled" across
something similar in R before? I am grateful for any suggestions/proposals.

Thank you in advance,

Alex
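
One possible starting point, sketched in base R. Everything here is an assumption made for illustration: the demo files are created in a temporary folder, the label "Personnel number: " is taken literally from the example, the helper scan_file() and the output name keyword_hits.txt are made up, and matching is literal (a text saying "exceeded" would not count as "surpassed"):

```r
keywords <- c("surpassed", "achieved", "very motivated")

# Two small demo files standing in for the real folder of reports
in_dir <- file.path(tempdir(), "reports_demo")
dir.create(in_dir, showWarnings = FALSE)
writeLines(c("Personnel number: 0123456789", "",
             "The employee has surpassed the set targets and was very motivated."),
           file.path(in_dir, "text1.txt"))
writeLines(c("Personnel number: 9876543210",
             "The employee achieved the targets."),
           file.path(in_dir, "text2.txt"))

# Return one row per file: the ten-digit number plus a 0/1 flag per keyword
scan_file <- function(path, keywords) {
  txt <- paste(readLines(path, warn = FALSE), collapse = " ")
  # Ten digits directly after the label; NA if the label is missing
  m  <- regmatches(txt, regexpr("(?<=Personnel number: )[0-9]{10}", txt, perl = TRUE))
  id <- if (length(m) == 1) m else NA_character_
  hits <- vapply(keywords,
                 function(k) as.integer(grepl(k, txt, fixed = TRUE)),
                 integer(1))
  data.frame(personnel_number = id, t(hits),
             check.names = FALSE, stringsAsFactors = FALSE)
}

files  <- sort(list.files(in_dir, pattern = "\\.txt$", full.names = TRUE))
result <- do.call(rbind, lapply(files, scan_file, keywords = keywords))

# Semicolon-separated text file, one line per input file
out_file <- file.path(tempdir(), "keyword_hits.txt")
write.table(result, out_file, sep = ";", quote = FALSE, row.names = FALSE)
```

For the real data, in_dir would point at the existing folder and the two writeLines() calls would be dropped. Reading the number as character keeps the leading zero.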


[R] Splitting Numerical Vector Into Chunks

2016-04-20 Thread Sidoti, Salvatore A.

Greetings!

I have several large data sets of animal movements. Their pauses (zero 
magnitude vectors) are of particular interest in addition to the speed 
distributions that precede the periods of rest. Here is an example of the kind 
of data I am interested in analyzing:

x <- 
abs(c(rnorm(2),replicate(3,0),rnorm(4),replicate(5,0),rnorm(6),replicate(7,0)))
length(x)

This example has 27 elements with strings of zeroes (pauses) situated among the 
speed values.
Is there a way to split the vector into zero and nonzero chunks and store them 
in a form where they can be analyzed? I have tried various forms of split() to 
no avail.

Thank you!
Salvatore A. Sidoti


Re: [R] project test data into principal components of training dataset

2016-04-20 Thread olsen

For the records, a slightly hacky answer, by modifying the ggbiplot
function, is provided now here:
http://stackoverflow.com/questions/36603268/how-to-plot-training-and-test-validation-data-in-r-using-ggbiplot

On 18/04/16 17:20, olsen wrote:
> Hi there,
> 
> I've a training dataset and a test dataset. My aim is to visually
> allocate the test data within the calibrated space reassembled by the
> PC's of the training data set, furthermore to keep the training data set
> coordinates fixed, so they can serve as ruler for measurement for
> additional test datasets coming up.
> 
> Please find a minimum working example using the wine dataset below.
> Ideally I would like to use ggbiplot as it comes with the elegant
> features but it only accepts objects of class prcomp, princomp, PCA, or
> lda, which is not fulfilled by the predicted test data.
> 
> I'm still slightly wet behind my R ears and the only solution I can
> think of is to plot the calibrated space in ggbiplot and the training
> data in ggplot and then join them, in the worst case by exporting them
> as svg and importing them in inkscape. Which is slightly complicated
> plus the scaling is different.
> 
> Any indication how this mission can be accomplished very welcome!
> 
> Thanks and greets
> Olsen
> 
> I started a thread on Stack Overflow on that issue but there have been no
> relevant indications so far.
> http://stackoverflow.com/questions/36603268/how-to-plot-training-and-test-validation-data-in-r-using-ggbiplot
> 
> ##MWE
> library(ggbiplot)
> data(wine)
> 
> ##pca on the wine dataset used as training data
> wine.pca <- prcomp(wine, center = TRUE, scale. = TRUE)
> 
> wine$class <- wine.class
> 
> ##simulate test data by generating three new wine classes
> wine.new.1 <- wine[c(sample(1:nrow(wine), 25)),]
> wine.new.2 <- wine[c(sample(1:nrow(wine), 43)),]
> wine.new.3 <- wine[c(sample(1:nrow(wine), 36)),]
> 
> ##Predict PCs for the new classes by transforming
> #them using the predict.prcomp function
> pred.new.1 <- predict(wine.pca, newdata = wine.new.1)
> pred.new.2 <- predict(wine.pca, newdata = wine.new.2)
> pred.new.3 <- predict(wine.pca, newdata = wine.new.3)
> 
> #simulate the classes for the new sorts
> wine.new.1$class <- rep("new.wine.1", nrow(wine.new.1))
> wine.new.2$class <- rep("new.wine.2", nrow(wine.new.2))
> wine.new.3$class <- rep("new.wine.3", nrow(wine.new.3))
> wine.new.bind <- rbind(wine.new.1, wine.new.2, wine.new.3)
> 
> ##compose the plot by joining the PCA ggbiplot training data with the
> ##testing data from ggplot
> #plot the calibrated space resulting from the training data
> g.train <- ggbiplot(wine.pca, obs.scale = 1, var.scale = 1, groups =
> wine$class, ellipse = TRUE, circle = TRUE)
> g.train
> #plot the test data resulting from the prediction
> #(use the predicted scores, not the raw wine columns)
> pred.new.bind <- rbind(pred.new.1, pred.new.2, pred.new.3)
> df.pred = data.frame(PC1 = pred.new.bind[,1], PC2 = pred.new.bind[,2],
> PC3 = pred.new.bind[,3], PC4 = pred.new.bind[,4],
> classes = wine.new.bind$class)
> g.test <- ggplot(df.pred, aes(PC1, PC2, color = classes, shape =
> classes)) +  geom_point() +  stat_ellipse()
> g.test
> 
> 
> 
> 
> 
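
For readers without ggbiplot, the projection step itself can be sketched in base R. This is only an illustration under stated assumptions: the built-in iris data stands in for the wine data, the 100/50 split and the seed are arbitrary, and the names train, test and scores are made up:

```r
# Assumption: iris replaces the wine data so that no extra package is needed
set.seed(42)
num       <- iris[, 1:4]
train_idx <- sample(nrow(num), 100)
train     <- num[train_idx, ]
test      <- num[-train_idx, ]

# Fit the PCA on the training rows only; centre and scale are stored in the fit
pca <- prcomp(train, center = TRUE, scale. = TRUE)

# predict() applies the training centre, scale and rotation to the new rows
test_scores <- predict(pca, newdata = test)

# The same projection written out by hand, as a cross-check
manual <- scale(test, center = pca$center, scale = pca$scale) %*% pca$rotation
max(abs(test_scores - manual))   # effectively zero

# Both sets in the fixed training coordinates, ready for plotting
scores <- rbind(
  data.frame(pca$x[, 1:2],       set = "train"),
  data.frame(test_scores[, 1:2], set = "test")
)
```

Because the training fit is never recomputed, further test sets can be projected with the same predict() call and land in the same coordinate system.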

-- 
Our solar system is the cream of the crop
http://hasa-labs.org


[R] Merging Data Sets with Full Outer Join

2016-04-20 Thread G . Maubach

Hi All,

I would like to match some datasets. Both deliver variables AND cases 
which might or might not be present in all datasets:

This sequence

Kunden <- Kunden_2011 
Kunden <- merge(Kunden, Kunden_2012,
by.x = "Debitor", by.y = "Debitor")

Kunden <- merge(Kunden, Kunden_2013,
by.x = "Debitor", by.y = "Debitor")

Kunden <- merge(Kunden, Kunden_2014,
by.x = "Debitor", by.y = "Debitor")

Kunden <- merge(Kunden, Kunden_2015,
by.x = "Debitor", by.y = "Debitor")

delivers too few cases. So I guess it does an equi-join.

How can I join the datasets and keep the variables as well as the cases?

I am looking forward to your reply.

Kind regards

Georg


[R] installation problem on Ubuntu

2016-04-20 Thread Paul Tremblay

I needed to update R so I could install ggplot. I am running Ubuntu 12.04.
I cannot upgrade Ubuntu because I am using a work computer.

I tried upgrading the normal way:

sudo apt-get update
 sudo apt-get install r-base r-base-dev

But this only installed an earlier version. Finally I tried installing from
source (./configure, make install). This worked. However, when I try to
install packages, I get this error:

Error in download.file(url, destfile = f, quiet = TRUE) :
  internet routines cannot be loaded
In addition: Warning message:
In download.file(url, destfile = f, quiet = TRUE) :
  unable to load shared object '/usr/local/lib/R/modules//internet.so':
  /usr/local/lib/R/modules//internet.so: undefined symbol: curl_multi_wait


>> ls /usr/local/lib/R/modules/
>> R_X11.so  R_de.so  internet.so  lapack.so

Thanks!

P


Re: [R] Solving sparse, singular systems of equations

2016-04-20 Thread A A via R-help

Thanks for the advice. I fixed the function and ran it on my systems just to 
see if it would work; for the first set of A and b, I got a valid solution, but 
for the second set, I got the error "Error in complete.cases(x, y, wt) : not 
all arguments have the same length".  

On Wednesday, April 20, 2016 10:59 AM, William Dunlap  
wrote:
 

 This is not a solution but your lsfit attempt
   #Error in lsfit(A, b) : only 3 cases, but 4 variables
   lsfit(A,b)
gave that error because lsfit adds a column of 1 to its first argument
unless you use intercept=FALSE. Then it will give you an answer (but I
think it converts your sparse matrix into a dense one before doing any
linear algebra).


Bill Dunlap
TIBCO Software
wdunlap tibco.com
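
A minimal sketch of that suggestion on the small example, using a plain dense matrix rather than the sparse class (an assumption made here so that the snippet needs only base R):

```r
# Dense copy of the small example system from the thread
A <- matrix(c(1.5, -1.5, 0, -1.5, 2.5, -1, 0, -1, 1), nrow = 3, ncol = 3)
b <- c(-30, 40, -10)

# intercept = FALSE stops lsfit() from prepending a column of ones,
# so the design matrix stays 3 x 3 instead of becoming 3 x 4.
# A is singular, so a collinearity warning may be issued; it is silenced here.
fit <- suppressWarnings(lsfit(A, b, intercept = FALSE))

fit$coefficients          # one least-squares solution
max(abs(fit$residuals))   # near zero when Ax = b is consistent
```

Checking the residuals is the important part: a least-squares fit always returns something, but it only solves Ax = b when the residuals vanish.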
On Wed, Apr 20, 2016 at 4:22 AM, A A via R-help  wrote:




 I have a situation in R where I would like to find any x (if one exists) that 
solves the linear system of equations Ax = b, where A is square, sparse, and 
singular, and b is a vector. Here is some code that mimics my issue with a 
relatively simple A and b, along with three other methods of solving this 
system that I found online, two of which give me an error and one of which 
succeeds on the simplified problem, but fails on my data set(attached). Is 
there a solver in R that I can use in order to get x without any errors given 
the structure of A? Thanks for your time.
#CODE STARTS HERE
A = as(matrix(c(1.5,-1.5,0,-1.5,2.5,-1,0,-1,1),nrow=3,ncol=3),"sparseMatrix")
b = matrix(c(-30,40,-10),nrow=3,ncol=1)
#solve for x, Error in LU.dgC(a) : cs_lu(A) failed: near-singular A (or out of memory)
solve(A,b,sparse=TRUE,tol=.Machine$double.eps)
#one x that happens to solve Ax = b
x = matrix(c(-10,10,0),nrow=3,ncol=1)
A %*% x
#Error in lsfit(A, b) : only 3 cases, but 4 variables
lsfit(A,b)
#solves the system, but fails below
solve(qr(A, LAPACK=TRUE),b)
#Error in qr.solve(A, b) : singular matrix 'a' in solve
qr.solve(A,b)
#matrices used in my actual problem (see attached files)
A = readMM("A.txt")
b = readMM("b.txt")
#Error in as(x, "matrix")[i, , drop = drop] : subscript out of bounds
solve(qr(A, LAPACK=TRUE),b)

   

Re: [R-es] Is "R" advisable as a language for someone who wants to start programming?....

2016-04-20 Thread rubenfcasal


Hello everyone,

I take it for granted that the person is not especially interested in 
statistics...


While I was writing this email, Antonio and Carlos Gil got in ahead 
of me; I would agree with them...


I have seen graphical tools and courses for learning to program 
(in general) that use Python. Some are even used with small 
children (e.g. one comes installed in 'Quimo', an OS for children; 
I suppose in 'Picaros' too...).


Regards, Rubén.


On 20/04/2016 at 19:29, Javier Marcuzzi wrote:

Dear Carlos,

I think not. In my case, as a child I had a computer, and the games 
came on cassette or we bought magazines with the code listings. But by the 
time I actually learned I was somewhat older, and that was with databases; 
programming came at university, with Fortran in genetics, and I used to 
tell my professor about R.

Over time I learned that neither the operating system nor the language matters; 
what matters is being able to work with a database, whichever one, something to 
store and retrieve information. As for a language, today I would say C#. It is 
relatively easy (although loops, lists and some matrix handling are almost the 
same in every language) because of the assistance Visual Studio gives to 
whoever is writing the code (it could also be F#).

Once the basics are learned it is relatively easy to move on to specific 
languages. R was not really designed around objects, so if one wants to 
sidestep that part it could be a candidate; the important thing about R is 
being able to use it for statistics, and perhaps out of necessity the user 
will mix R with what was learned in C#.

What I do not think is a good idea is to start out with, for example, the 
documentation of F# and R together, or Sweave, pandoc, …; one should learn 
each on its own first and mix them afterwards.

Javier Rubén Marcuzzi

From: Carlos Ortega
Sent: Wednesday, 20 April 2016 13:21
To: Lista R
Subject: [R-es] Is "R" advisable as a language for someone who wants to 
start programming?

Hello,

I wanted to ask for your opinion, even though this discussion may be somewhat
"off-topic".
For someone who wants to start programming, would you recommend R?

Given that "R", despite being a language born with a very specific
orientation, is already used as a general-purpose one, perhaps it could be
suitable. Regardless of the use, I think R can serve for
learning the fundamentals of programming, and on top of that you come away
with the whole data analysis and data processing side.

What would your recommendation be?

Thanks,
Carlos Ortega
www.qualityexcellence.es


Re: [R] Solving sparse, singular systems of equations

2016-04-20 Thread Berend Hasselman


> On 20 Apr 2016, at 13:22, A A via R-help  wrote:
> 
> 
> 
> 
> I have a situation in R where I would like to find any x (if one exists) that 
> solves the linear system of equations Ax = b, where A is square, sparse, and 
> singular, and b is a vector. Here is some code that mimics my issue with a 
> relatively simple A and b, along with three other methods of solving this 
> system that I found online, two of which give me an error and one of which 
> succeeds on the simplified problem, but fails on my data set(attached). Is 
> there a solver in R that I can use in order to get x without any errors given 
> the structure of A? Thanks for your time.
> #CODE STARTS HERE
> A = as(matrix(c(1.5,-1.5,0,-1.5,2.5,-1,0,-1,1),nrow=3,ncol=3),"sparseMatrix")
> b = matrix(c(-30,40,-10),nrow=3,ncol=1)
> #solve for x, Error in LU.dgC(a) : cs_lu(A) failed: near-singular A (or out of memory)
> solve(A,b,sparse=TRUE,tol=.Machine$double.eps)
> #one x that happens to solve Ax = b
> x = matrix(c(-10,10,0),nrow=3,ncol=1)
> A %*% x
> #Error in lsfit(A, b) : only 3 cases, but 4 variables
> lsfit(A,b)
> #solves the system, but fails below
> solve(qr(A, LAPACK=TRUE),b)
> #Error in qr.solve(A, b) : singular matrix 'a' in solve
> qr.solve(A,b)
> #matrices used in my actual problem (see attached files)
> A = readMM("A.txt")
> b = readMM("b.txt")
> #Error in as(x, "matrix")[i, , drop = drop] : subscript out of bounds
> solve(qr(A, LAPACK=TRUE),b)

Your code is a mess. 

A singular square system of linear equations has an infinity of solutions if a 
solution exists at all.
How that works you can find here: 
https://en.wikipedia.org/wiki/System_of_linear_equations
in the section "Matrix solutions".

For your simple example you can do it like this:

library(MASS)
Ag <- ginv(A)   # pseudoinverse

xb <- Ag %*% b # minimum norm solution

Aw <- diag(nrow=nrow(Ag)) - Ag %*% A  # see the Wikipedia page
w <- runif(3)
z <- xb + Aw %*% w
A %*% z - b

N <- Null(t(A))  # null space of A;  see the help for Null in package MASS
A %*% N
A %*% (xb + 2 * N) - b

For sparse systems you will have to approach this differently; I have no 
experience with that.

Berend
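
The same idea can be sketched without MASS, using only base R's svd() on a dense copy of the small example. This is an illustration rather than a recipe for the large sparse case: the tolerance mirrors the default used by MASS::ginv (sqrt(.Machine$double.eps) relative to the largest singular value), and the names x_min, null_basis and x_other are made up here:

```r
# Dense copy of the small example system from the thread
A <- matrix(c(1.5, -1.5, 0, -1.5, 2.5, -1, 0, -1, 1), nrow = 3, ncol = 3)
b <- c(-30, 40, -10)

s    <- svd(A)
keep <- s$d > sqrt(.Machine$double.eps) * s$d[1]   # singular values treated as nonzero

# Minimum-norm least-squares solution: V_k diag(1/d_k) t(U_k) b
U <- s$u[, keep, drop = FALSE]
V <- s$v[, keep, drop = FALSE]
x_min <- V %*% (crossprod(U, b) / s$d[keep])

# A solution exists only if the residual of x_min is (numerically) zero
consistent <- max(abs(A %*% x_min - b)) < 1e-8 * max(1, max(abs(b)))

# Columns of V belonging to zero singular values span the null space of A;
# adding any multiple of them to x_min gives another solution
null_basis <- s$v[, !keep, drop = FALSE]
x_other    <- x_min + 5 * null_basis
max(abs(A %*% x_other - b))   # still effectively zero
```

If consistent comes out FALSE, the system has no exact solution and x_min is only the best fit in the least-squares sense.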


Re: [R-es] Is "R" advisable as a language for someone who wants to start programming?....

2016-04-20 Thread Carlos J. Gil Bellosta

Hello, how are you?

Something that disconcerts people learning R is that there is no
obvious (and preferably only one) way to do things. In Python, for example,
that is the aim (see https://www.python.org/dev/peps/pep-0020/).

R is chaotic, untidy and riddled with sublanguages of every kind. One
can live with that, but it is better to keep programming novices
away from that sort of problem. Python is more suitable.

Regards,

Carlos J. Gil Bellosta
http://www.datanalytics.com


Re: [R-es] Is "R" advisable as a language for someone who wants to start programming?....

2016-04-20 Thread Antonio Maurandi López


Whenever someone asks me that question I tell them about Python.
I find it simple and clear, very powerful, and there are many 
introductory courses...; it is very popular, and moving on to R is easy too.


On 20/04/16 at 18:21, Carlos Ortega wrote:

Hello,

I wanted to ask for your opinion, even though this discussion may be somewhat
"off-topic".
For someone who wants to start programming, would you recommend R?

Given that "R", despite being a language born with a very specific
orientation, is already used as a general-purpose one, perhaps it could be
suitable. Regardless of the use, I think R can serve for
learning the fundamentals of programming, and on top of that you come away
with the whole data analysis and data processing side.

What would your recommendation be?

Thanks,
Carlos Ortega
www.qualityexcellence.es



--
Antonio Maurandi López
Sec. Apoyo Estadístico (SAE)
Servicio de Apoyo a la Investigación (SAI)
Vicerrectorado de Investigación
Universidad de Murcia

Edif. SACE. Campus de Espinardo.
30100 Murcia
@. amaura...@um.es
T. 868 88 7315 F. 868 88 7302
www.um.es/sai www.um.es/ae
Blog: www.sae.saiblogs.inf.um.es
---
Audentes fortuna iuva


Re: [R-es] Is "R" advisable as a language for someone who wants to start programming?....

2016-04-20 Thread Javier Marcuzzi

Dear Carlos,

I think not. In my case, as a child I had a computer, and the games 
came on cassette or we bought magazines with the code listings. But by the 
time I actually learned I was somewhat older, and that was with databases; 
programming came at university, with Fortran in genetics, and I used to 
tell my professor about R.

Over time I learned that neither the operating system nor the language matters; 
what matters is being able to work with a database, whichever one, something to 
store and retrieve information. As for a language, today I would say C#. It is 
relatively easy (although loops, lists and some matrix handling are almost the 
same in every language) because of the assistance Visual Studio gives to 
whoever is writing the code (it could also be F#).

Once the basics are learned it is relatively easy to move on to specific 
languages. R was not really designed around objects, so if one wants to 
sidestep that part it could be a candidate; the important thing about R is 
being able to use it for statistics, and perhaps out of necessity the user 
will mix R with what was learned in C#.

What I do not think is a good idea is to start out with, for example, the 
documentation of F# and R together, or Sweave, pandoc, …; one should learn 
each on its own first and mix them afterwards.

Javier Rubén Marcuzzi

From: Carlos Ortega
Sent: Wednesday, 20 April 2016 13:21
To: Lista R
Subject: [R-es] Is "R" advisable as a language for someone who wants to 
start programming?

Hello,

I wanted to ask for your opinion, even though this discussion may be somewhat
"off-topic".
For someone who wants to start programming, would you recommend R?

Given that "R", despite being a language born with a very specific
orientation, is already used as a general-purpose one, perhaps it could be
suitable. Regardless of the use, I think R can serve for
learning the fundamentals of programming, and on top of that you come away
with the whole data analysis and data processing side.

What would your recommendation be?

Thanks,
Carlos Ortega
www.qualityexcellence.es


[R-es] Is "R" advisable as a language for someone who wants to start programming?....

2016-04-20 Thread Carlos Ortega

Hello,

I wanted to ask for your opinion, even though this discussion may be somewhat
"off-topic".
For someone who wants to start programming, would you recommend R?

Given that "R", despite being a language born with a very specific
orientation, is already used as a general-purpose one, perhaps it could be
suitable. Regardless of the use, I think R can serve for
learning the fundamentals of programming, and on top of that you come away
with the whole data analysis and data processing side.

What would your recommendation be?

Thanks,
Carlos Ortega
www.qualityexcellence.es


Re: [R] installation of dplyr

2016-04-20 Thread Ben Tupper

Increasing memory resolved the issue for me.

Thanks again,
Ben

> On Apr 19, 2016, at 4:10 PM, Hadley Wickham  wrote:
> 
> You normally see these errors when compiling on a vm that has very
> little memory.
> Hadley
> 
> On Tue, Apr 19, 2016 at 2:47 PM, Ben Tupper  wrote:
>> Hello,
>> 
>> I am getting a fresh CentOS 6.7 machine set up with all of the goodies for R 
>> 3.2.3, including dplyr package. I am unable to successfully install it.  
>> Below I show the failed installation using utils::install.packages() and 
>> then again using devtools::install_github().  Each yields an error similar 
>> to the other but not quite exactly the same - the error messages sail right 
>> over my head.
>> 
>> I can contact the package author if that would be better, but thought it 
>> best to start here.
>> 
>> Thanks!
>> Ben
>> 
>> Ben Tupper
>> Bigelow Laboratory for Ocean Sciences
>> 60 Bigelow Drive, P.O. Box 380
>> East Boothbay, Maine 04544
>> http://www.bigelow.org
>> 
>>> sessionInfo()
>> R version 3.2.3 (2015-12-10)
>> Platform: x86_64-redhat-linux-gnu (64-bit)
>> Running under: CentOS release 6.7 (Final)
>> 
>> locale:
>> [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
>> [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
>> [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
>> [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
>> [9] LC_ADDRESS=C               LC_TELEPHONE=C
>> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
>> 
>> attached base packages:
>> [1] stats graphics  grDevices utils datasets  methods   base
>> 
>> 
>> 
>> 
>> #   utils::install.packages()
>> 
>> 
>>> install.packages("dplyr", repo = "http://cran.r-project.org")
>> Installing package into ‘/usr/lib64/R/library’
>> (as ‘lib’ is unspecified)
>> trying URL 'http://cran.r-project.org/src/contrib/dplyr_0.4.3.tar.gz'
>> Content type 'application/x-gzip' length 655997 bytes (640 KB)
>> ==
>> downloaded 640 KB
>> 
>> * installing *source* package ‘dplyr’ ...
>> ** package ‘dplyr’ successfully unpacked and MD5 sums checked
>> ** libs
>> g++ -m64 -I/usr/include/R -DNDEBUG -I../inst/include -DCOMPILING_DPLYR 
>> -I/usr/local/include -I"/usr/lib64/R/library/Rcpp/include" 
>> -I"/usr/lib64/R/library/BH/include"   -fpic  -O2 -g -pipe -Wall 
>> -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector 
>> --param=ssp-buffer-size=4 -m64 -mtune=generic  -c RcppExports.cpp -o 
>> RcppExports.o
>> g++ -m64 -I/usr/include/R -DNDEBUG -I../inst/include -DCOMPILING_DPLYR 
>> -I/usr/local/include -I"/usr/lib64/R/library/Rcpp/include" 
>> -I"/usr/lib64/R/library/BH/include"   -fpic  -O2 -g -pipe -Wall 
>> -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector 
>> --param=ssp-buffer-size=4 -m64 -mtune=generic  -c address.cpp -o address.o
>> g++ -m64 -I/usr/include/R -DNDEBUG -I../inst/include -DCOMPILING_DPLYR 
>> -I/usr/local/include -I"/usr/lib64/R/library/Rcpp/include" 
>> -I"/usr/lib64/R/library/BH/include"   -fpic  -O2 -g -pipe -Wall 
>> -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector 
>> --param=ssp-buffer-size=4 -m64 -mtune=generic  -c api.cpp -o api.o
>> g++ -m64 -I/usr/include/R -DNDEBUG -I../inst/include -DCOMPILING_DPLYR 
>> -I/usr/local/include -I"/usr/lib64/R/library/Rcpp/include" 
>> -I"/usr/lib64/R/library/BH/include"   -fpic  -O2 -g -pipe -Wall 
>> -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector 
>> --param=ssp-buffer-size=4 -m64 -mtune=generic  -c arrange.cpp -o arrange.o
>> In file included from ../inst/include/dplyr.h:131,
>> from arrange.cpp:1:
>> ../inst/include/dplyr/DataFrameSubsetVisitors.h: In constructor 
>> ‘dplyr::DataFrameSubsetVisitors::DataFrameSubsetVisitors(const 
>> Rcpp::DataFrame&, const Rcpp::CharacterVector&)’:
>> ../inst/include/dplyr/DataFrameSubsetVisitors.h:40: warning: ‘column’ may be 
>> used uninitialized in this function
>> g++ -m64 -I/usr/include/R -DNDEBUG -I../inst/include -DCOMPILING_DPLYR 
>> -I/usr/local/include -I"/usr/lib64/R/library/Rcpp/include" 
>> -I"/usr/lib64/R/library/BH/include"   -fpic  -O2 -g -pipe -Wall 
>> -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector 
>> --param=ssp-buffer-size=4 -m64 -mtune=generic  -c between.cpp -o between.o
>> g++ -m64 -I/usr/include/R -DNDEBUG -I../inst/include -DCOMPILING_DPLYR 
>> -I/usr/local/include -I"/usr/lib64/R/library/Rcpp/include" 
>> -I"/usr/lib64/R/library/BH/include"   -fpic  -O2 -g -pipe -Wall 
>> -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector 
>> --param=ssp-buffer-size=4 -m64 -mtune=generic  -c bind.cpp -o bind.o
>> g++ -m64 -I/usr/include/R -DNDEBUG -I../inst/include -DCOMPILING_DPLYR 
>> -I/usr/local/include -I"/usr/lib64/R/library/Rcpp/include" 
>> -I"/usr/lib64/R/library/BH/include"   -fpic  -O2 -g -pipe -Wall 
>> -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector 
>> --param=ssp-buffer-size=4 -m64 -mtune=generic  -c combine_variables.cpp -o 
>> combine_variables.o
>> g++ -m64 -I/usr/include/R

Re: [R] Reading Multiple Output Variables

2016-04-20 Thread Jeff Newmiller

The word "analysis" is too vague. 

If you are referring to lm regression, you can specify Y as a matrix instead of 
a vector. 
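
A minimal sketch of the matrix-response idea, assuming simulated data in
place of the poster's real values. The names inputs and E, and the way the
outputs are generated, are made up for illustration; only the layout (40
rows, inputs V1..V4, outputs E1..E12) follows the question.

```r
set.seed(42)
n <- 40

# Hypothetical inputs: 40 rows, 4 input variables
inputs <- data.frame(V1 = runif(n), V2 = runif(n),
                     V3 = runif(n), V4 = runif(n))

# Hypothetical outputs: a 40 x 12 matrix with columns E1..E12
E <- sapply(1:12, function(k) k * inputs$V1 - inputs$V2 + rnorm(n, sd = 0.1))
colnames(E) <- paste0("E", 1:12)

# A matrix on the left-hand side fits one regression per column of E
fit <- lm(E ~ V1 + V2 + V3 + V4, data = inputs)

# One column of coefficients per output variable
coefs <- coef(fit)
dim(coefs)
```

Each column of coefs belongs to one output, so the twelve fits can be
compared side by side instead of being run one at a time.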

http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example

Also, please disable HTML in your email when sending to this list, since it 
will usually come through to us in damaged form. 
-- 
Sent from my phone. Please excuse my brevity.

On April 20, 2016 8:19:48 AM PDT, "jody.kelly"  
wrote:
>
>Hi all,
>
>I am trying to read multiple output variables for a sensitivity analysis.
>
>Currently using one output value as follows:
>
>Y<-(E1)
>
>However I need to run analysis against 12 values of Y. So E1-E12.
>
>My matrix will be: Inputs are Column=4, Rows = 40 i.e. 40 rows of 4
>input variables in different combinations. These will be analysed
>against 40 rows of output variables for 12 columns.
>
>e.g.
>
>     V1 V2 V3 V4 E1 E2 E3 E4 ... E12
>1
>2
>...
>40
>
>Can someone provide guidance on how I can plot against all 12 months?
>
>Thanks
>
>Jody


Re: [R] Use multiple cores on Linux

2016-04-20 Thread Jeff Newmiller

The answer to your question is yes. You might consider using the parallel
package, and I would suggest starting with a simpler test case to learn how
it works, then incrementally adding the complexity of packages and data
handling.
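
A small test case of the kind suggested above might look like the
following sketch. The function run_scenario() and its toy computation are
hypothetical stand-ins for the real per-scenario work (reading the files,
predicting, writing the CSV); mclapply() forks worker processes, so this
assumes Linux or another Unix-like system.

```r
library(parallel)

scenarios <- c("4.5_2021_2050", "4.5_2081_2100")

# Hypothetical stand-in for the work done for one scenario
run_scenario <- function(i) {
  set.seed(i)                       # reproducible toy data per scenario
  x <- rnorm(1000)
  data.frame(scenario = scenarios[i], mean_x = mean(x),
             stringsAsFactors = FALSE)
}

# Run the scenarios on two cores; raise mc.cores to match the allocation
res_list <- mclapply(seq_along(scenarios), run_scenario, mc.cores = 2)
res <- do.call(rbind, res_list)

# The serial equivalent, useful for checking the parallel result
res_serial <- do.call(rbind, lapply(seq_along(scenarios), run_scenario))
```

Once the toy version gives the same answer in parallel and in serial, the
body of run_scenario() can be replaced step by step with the real code.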
-- 
Sent from my phone. Please excuse my brevity.

On April 20, 2016 7:37:07 AM PDT, Miluji Sb  wrote:
>I am trying to run the following code in R on a Linux cluster. I would
>like
>to use the full processing power (specifying cores/nodes/memory). The
>code
>essentially runs predictions based on a GAM regression and saves the
>results as a CSV file for multiple sets of data (here I only show two).
>
>Is it possible to run this code using HPC packages such as
>Rmpi/snow/doParallel? Thank you!
>

[R] Reading Multiple Output Variables

2016-04-20 Thread jody.kelly


Hi all,

I am trying to read multiple output variables for a sensitivity analysis.

Currently using one output value as follows:

Y<-(E1)

However I need to run analysis against 12 values of Y. So E1-E12.

My matrix will be: Inputs are Column=4, Rows = 40 i.e. 40 rows of 4 input
variables in different combinations. These will be analysed against 40 rows of
output variables for 12 columns.

e.g.

     V1 V2 V3 V4 E1 E2 E3 E4 ... E12
1
2
...
40

Can someone provide guidance on how I can plot against all 12 months?

Thanks

Jody


This message is intended solely for the addressee and may contain confidential 
and/or legally privileged information. Any use, disclosure or reproduction 
without the sender's explicit consent is unauthorised and may be unlawful. If 
you have received this message in error, please notify Northumbria University 
immediately and permanently delete it. Any views or opinions expressed in this 
message are solely those of the author and do not necessarily represent those 
of the University. The University cannot guarantee that this message or any 
attachment is virus free or has not been intercepted and/or amended.


Re: [R] Solving sparse, singular systems of equations

2016-04-20 Thread Jeff Newmiller

This is kind of like asking for a solution to x+1=x+1. Go back to linear 
algebra and look up Singular Value Decomposition, and decide if you really want 
to proceed.  See also ?svd and package irlba.
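
A minimal sketch of the SVD route, using the small A and b from the quoted
post as a dense base-R matrix. The relative tolerance of 1e-10 is an
assumption, and for a large sparse A a truncated SVD (for example from
irlba) would be the thing to try in place of svd().

```r
A <- matrix(c(1.5, -1.5, 0, -1.5, 2.5, -1, 0, -1, 1), nrow = 3, ncol = 3)
b <- c(-30, 40, -10)

s <- svd(A)

# Treat singular values below a relative tolerance as zero (assumed cutoff)
keep <- s$d > 1e-10 * max(s$d)

# Pseudo-inverse solution: x = V_k diag(1/d_k) t(U_k) b
U <- s$u[, keep, drop = FALSE]
V <- s$v[, keep, drop = FALSE]
x <- V %*% ((t(U) %*% b) / s$d[keep])

# The residual is (numerically) zero only if b lies in the column space of A
residual <- max(abs(A %*% x - b))
```

When a solution exists this gives the one with the smallest norm; when
none exists it gives a least-squares fit, so the residual has to be
checked before x is trusted.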
-- 
Sent from my phone. Please excuse my brevity.

On April 20, 2016 4:22:34 AM PDT, A A via R-help  wrote:
>
>
>
>I have a situation in R where I would like to find any x (if one
>exists) that solves the linear system of equations Ax = b, where A is
>square, sparse, and singular, and b is a vector. Here is some code that
>mimics my issue with a relatively simple A and b, along with three
>other methods of solving this system that I found online, two of which
>give me an error and one of which succeeds on the simplified problem,
>but fails on my data set(attached). Is there a solver in R that I can
>use in order to get x without any errors given the structure of A?
>Thanks for your time.
>#CODE STARTS HERE
>A = as(matrix(c(1.5,-1.5,0,-1.5,2.5,-1,0,-1,1),nrow=3,ncol=3),"sparseMatrix")
>b = matrix(c(-30,40,-10),nrow=3,ncol=1)
>#solve for x, Error in LU.dgC(a) : cs_lu(A) failed: near-singular A (or
>#out of memory)
>solve(A,b,sparse=TRUE,tol=.Machine$double.eps)
>#one x that happens to solve Ax = b
>x = matrix(c(-10,10,0),nrow=3,ncol=1)
>A %*% x
>#Error in lsfit(A, b) : only 3 cases, but 4 variables
>lsfit(A,b)
>#solves the system, but fails below
>solve(qr(A, LAPACK=TRUE),b)
>#Error in qr.solve(A, b) : singular matrix 'a' in solve
>qr.solve(A,b)
>#matrices used in my actual problem (see attached files)
>A = readMM("A.txt")
>b = readMM("b.txt")
>#Error in as(x, "matrix")[i, , drop = drop] : subscript out of bounds
>solve(qr(A, LAPACK=TRUE),b)
>
>   
>
>
>

Re: [R] Solving sparse, singular systems of equations

2016-04-20 Thread William Dunlap via R-help

This is not a solution but your lsfit attempt
   #Error in lsfit(A, b) : only 3 cases, but 4 variables
   lsfit(A,b)
gave that error because lsfit adds a column of 1 to
its first argument unless you use intercept=FALSE.
Then it will give you an answer (but I think it converts
your sparse matrix into a dense one before doing
any linear algebra).
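
As a small illustration of the point above, here is the same call with
intercept=FALSE on a dense copy of the example matrix. This is only a
sketch: because A is singular, lsfit() is expected to warn about
collinearity, and which of the many solutions it reports is not something
to rely on.

```r
A <- matrix(c(1.5, -1.5, 0, -1.5, 2.5, -1, 0, -1, 1), nrow = 3, ncol = 3)
b <- c(-30, 40, -10)

# No column of 1s is added, so there are 3 cases and 3 variables
fit <- lsfit(A, b, intercept = FALSE)

# Residuals near zero mean the system Ax = b was solved exactly
max_resid <- max(abs(fit$residuals))
```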



Bill Dunlap
TIBCO Software
wdunlap tibco.com


Re: [R] Add a vertical arrow to a time series graph using ggplot and xts

2016-04-20 Thread Giorgio Garziano

Please see updates to df2 assignment as shown below.

library(xts)  # primary
#library(tseries)   # Unit root tests
library(ggplot2)
library(vars)
library(grid)
dt_xts<-xts(x = 1:10, order.by = seq(as.Date("2016-01-01"),
 as.Date("2016-01-10"), by = "1 day"))
colnames(dt_xts)<-"gdp"
xmin<-min(index(dt_xts))
xmax<-max(index(dt_xts))
df1<-data.frame(x = index(dt_xts), coredata(dt_xts))
p<-ggplot(data = df1, mapping= aes(x=x, y=gdp))+geom_line()
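# y-range of the panel. This path is the one used by the ggplot2 release
# current in April 2016; later releases (3.0.0 and up) keep it at
# ggplot_build(p)$layout$panel_params[[1]]$y.range
rg<-ggplot_build(p)$panel$ranges[[1]]$y.range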
y1<-rg[1]
y2<-rg[2]


# x = as.Date(..) in place of x = "2016-01-05"
df2<-data.frame(x = as.Date("2016-01-05"), y1=y1, y2=y2 )

p1<-p+geom_segment(mapping=aes(x=x, y=y1, xend=x, yend=y2), data=df2,
   arrow=arrow())
--

Best,

GG





[R] Use multiple cores on Linux

2016-04-20 Thread Miluji Sb

I am trying to run the following code in R on a Linux cluster. I would like
to use the full processing power (specifying cores/nodes/memory). The code
essentially runs predictions based on a GAM regression and saves the
results as a CSV file for multiple sets of data (here I only show two).

Is it possible to run this code using HPC packages such as
Rmpi/snow/doParallel? Thank you!

#
library(data.table)
library(mgcv)
library(reshape2)
library(dplyr)
library(tidyr)
library(lubridate)
library(DataCombine)
#
gam_max_count_wk <- gam(count_pop ~ factor(citycode) + factor(year) +
factor(week) + s(lnincome) + s(tmax) +
s(hmax),data=cont,na.action="na.omit", method="ML")

#
# Historic
temp_hist <- read.csv("/work/sd00815/giss_historic/giss_temp_hist.csv")
humid_hist <- read.csv("/work/sd00815/giss_historic/giss_hum_hist.csv")
#
temp_hist <- as.data.table(temp_hist)
humid_hist <- as.data.table(humid_hist)
#
# Merge
mykey<- c("FIPS", "year","month", "week")
setkeyv(temp_hist, mykey)
setkeyv(humid_hist, mykey)
#
hist<- merge(temp_hist, humid_hist, by=mykey)
#
hist$X.x <- NULL
hist$X.y <- NULL
#
# Max
hist_max <- hist
hist_max$FIPS <- hist_max$year <- hist_max$month <- hist_max$tmin <-
hist_max$tmean <- hist_max$hmin <- hist_max$hmean <- NULL
#
# Adding Factors
hist_max$citycode <- rep(101,nrow(hist_max))
hist_max$year <- rep(2010,nrow(hist_max))
hist_max$lnincome <- rep(10.262,nrow(hist_max))
#
# Predictions
pred_hist_max <- predict.gam(gam_max_count_wk,hist_max)
#
pred_hist_max <- as.data.table(pred_hist_max)
pred_hist_max <- cbind(hist, pred_hist_max)
pred_hist_max$tmax <- pred_hist_max$tmean <- pred_hist_max$tmin <-
pred_hist_max$hmean <- pred_hist_max$hmax <- pred_hist_max$hmin <- NULL
#
# Aggregate by FIPS
max_hist <- pred_hist_max %>%
  group_by(FIPS) %>%
  summarise(pred_hist = mean(pred_hist_max))
#
### Future
## 4.5
# 4.5_2021_2050
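temp_sim <-
read.csv("/work/sd00815/giss_future/giss_4.5_2021_2050_temp.csv")
# NOTE: this reads the same *_temp.csv file as temp_sim above; a humidity
# file was presumably intended (compare giss_hum_hist.csv in the historic block).
humid_sim <-
read.csv("/work/sd00815/giss_future/giss_4.5_2021_2050_temp.csv")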
#
# Max
temp_sim <- as.data.table(temp_sim)
setnames(temp_sim, "max", "tmax")
setnames(temp_sim, "min", "tmin")
setnames(temp_sim, "avg", "tmean")
#
humid_sim <- as.data.table(humid_sim)
setnames(humid_sim, "max", "hmax")
setnames(humid_sim, "min", "hmin")
setnames(humid_sim, "avg", "hmean")
#
temp_sim$X <- NULL
humid_sim$X <- NULL
#
# Merge
mykey<- c("FIPS", "year","month", "week")
setkeyv(temp_sim, mykey)
setkeyv(humid_sim, mykey)
#
sim <- merge(temp_sim, humid_sim, by=mykey)
#
sim_max <- sim
#
sim_max$FIPS <- sim_max$year <- sim_max$month <- sim_max$tmin <-
sim_max$tmean <- sim_max$hmin <- sim_max$hmean <- NULL
#
# Adding Factors
sim_max$citycode <- rep(101,nrow(sim_max))
sim_max$year <- rep(2010,nrow(sim_max))
sim_max$week <- rep(1,nrow(sim_max))
sim_max$lnincome <- rep(10.262,nrow(sim_max))
#
# Predictions
pred_sim_max <- predict.gam(gam_max_count_wk,sim_max)
#
pred_sim_max <- as.data.table(pred_sim_max)
pred_sim_max <- cbind(sim, pred_sim_max)
pred_sim_max$tmax <- pred_sim_max$tmean <- pred_sim_max$tmin <-
pred_sim_max$hmean <- pred_sim_max$hmax <- pred_sim_max$hmin <- NULL
#
# Aggregate by FIPS
max_sim <- pred_sim_max %>%
  group_by(FIPS) %>%
  summarise(pred_sim = mean(pred_sim_max))
#
# Merge with Historical Data
max_hist$FIPS <- as.factor(max_hist$FIPS)
max_sim$FIPS <- as.factor(max_sim$FIPS)
#
mykey1<- c("FIPS")
setkeyv(max_hist, mykey1)
setkeyv(max_sim, mykey1)
max_change <- merge(max_hist, max_sim, by=mykey1)
max_change$change <-
((max_change$pred_sim-max_change$pred_hist)/max_change$pred_hist)*100
#
write.csv(max_change, file =
"/work/sd00815/projections_data/year_wk_fe/giss/max/giss_4.5_2021_2050.csv")



# 4.5_2081_2100
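temp_sim <-
read.csv("/work/sd00815/giss_future/giss_4.5_2081_2100_temp.csv")
# NOTE: this reads the same *_temp.csv file as temp_sim above; a humidity
# file was presumably intended (compare giss_hum_hist.csv in the historic block).
humid_sim <-
read.csv("/work/sd00815/giss_future/giss_4.5_2081_2100_temp.csv")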
#
# Max
temp_sim <- as.data.table(temp_sim)
setnames(temp_sim, "max", "tmax")
setnames(temp_sim, "min", "tmin")
setnames(temp_sim, "avg", "tmean")
#
humid_sim <- as.data.table(humid_sim)
setnames(humid_sim, "max", "hmax")
setnames(humid_sim, "min", "hmin")
setnames(humid_sim, "avg", "hmean")
#
temp_sim$X <- NULL
humid_sim$X <- NULL
#
# Merge
mykey<- c("FIPS", "year","month", "week")
setkeyv(temp_sim, mykey)
setkeyv(humid_sim, mykey)
#
sim <- merge(temp_sim, humid_sim, by=mykey)
#
sim_max <- sim
#
sim_max$FIPS <- sim_max$year <- sim_max$month <- sim_max$tmin <-
sim_max$tmean <- sim_max$hmin <- sim_max$hmean <- NULL
#
# Adding Factors
sim_max$citycode <- rep(101,nrow(sim_max))
sim_max$year <- rep(2010,nrow(sim_max))
sim_max$week <- rep(1,nrow(sim_max))
sim_max$lnincome <- rep(10.262,nrow(sim_max))
#
# Predictions
pred_sim_max <- predict.gam(gam_max_count_wk,sim_max)
#
pred_sim_max <- as.data.table(pred_sim_max)
pred_sim_max <- cbind(sim, pred_sim_max)
pred_sim_max$tmax <- pred_sim_max$tmean <- pred_sim_max$tmin <-
pred_sim_max$hmean <- pred_sim_max$hmax <- pred_sim_max$hmin <- NULL
#
# Aggregate by FIPS
max_sim <-

[R] Data reshaping with conditions

2016-04-20 Thread sri vathsan

Dear All,

I am trying to reshape the data with some conditions. A small part of the
data looks like below. Like this there will be more data with repeating ID.

Count id name type
117 335 sally A
19 335 sally A
167 335 sally B
18 340 susan A
56 340 susan A
22 340 susan B
53 340 susan B
135 351 lee A
114 351 lee A
84 351 lee A
80 351 lee A
19 351 lee A
8 351 lee A
21 351 lee A
88 351 lee B
111 351 lee B
46 351 lee B
108 351 lee B

From the above data I am expecting an output like below.

id   name   type  count_of_B     Max_of_count_B  x              y
335  sally  B     167            167             117,19         NA
340  susan  B     22,53          53              18             56
351  lee    B     88,111,46,108  111             84,80,19,8,21  135,114

Where, the column x and column y are:

x = Count_A_less_than_max of (Count type B)
y = Count_A_higher_than_max of (Count type B).

*1)* As a first step, I tried dplyr with the code below to get the values
for each column.
*2)* After that, I planned to transpose the columns so that each unique ID
has a single row.

I am stuck at the initial step itself. The code runs, but the higher and
lower values of A do not come out.

Expected_output= data %>%
  group_by(id, Type) %>%
  mutate(Count_of_B = paste(unlist(count[Type=="B"]), collapse = ","))%>%
  mutate(Max_of_count_B = ifelse(Type == "B", max(count[Type ==
"B"]),max(count[Type == "A"]))) %>%
  mutate(count_type_A_lesser = ifelse
(Type=="B",(paste(unlist(count[Type=="A"]) < Max_of_count_B[Type=="B"],
collapse = ",")), "NA"))%>%
  mutate(count_type_A_higher =
ifelse(Type=="B",(paste(unlist(count[Type=="A"]) >
Max_of_count_B[Type=="B"], collapse = ",")), "NA"))

I hope I have made my point clear. Please bear with the code, as I am new
to this.
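
One possible way to get the table described above, sketched in base R. The
helper names summarise_id() and collapse() are made up for this
illustration, and counts of A that are exactly equal to the maximum of B
are assumed to belong to neither x nor y, since the post does not say.

```r
dat <- read.table(text = "Count id name type
117 335 sally A
19 335 sally A
167 335 sally B
18 340 susan A
56 340 susan A
22 340 susan B
53 340 susan B
135 351 lee A
114 351 lee A
84 351 lee A
80 351 lee A
19 351 lee A
8 351 lee A
21 351 lee A
88 351 lee B
111 351 lee B
46 351 lee B
108 351 lee B", header = TRUE, stringsAsFactors = FALSE)

# Paste values into one comma-separated string, or NA if there are none
collapse <- function(v) {
  if (length(v) > 0) paste(v, collapse = ",") else NA_character_
}

# Build one output row for the rows belonging to a single id
summarise_id <- function(d) {
  a <- d$Count[d$type == "A"]
  b <- d$Count[d$type == "B"]
  max_b <- max(b)
  data.frame(id = d$id[1], name = d$name[1],
             count_of_B = collapse(b),
             max_of_count_B = max_b,
             x = collapse(a[a < max_b]),   # A counts below max of B
             y = collapse(a[a > max_b]),   # A counts above max of B
             stringsAsFactors = FALSE)
}

res <- do.call(rbind, lapply(split(dat, dat$id), summarise_id))
```

Splitting by id first keeps the comparison against the maximum of B inside
each group, which is the step the dplyr attempt above does not reach.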

Regards,
sri


Re: [R] Merge sort

2016-04-20 Thread Duncan Murdoch


On 20/04/2016 7:38 AM, Gaston wrote:
> I indeed used is.na() to check length, as I was not sure whether
> length() was a simple query or would go through the whole vector to
> count the elements.

length() is a simple query, and is very fast.  The other problem in your 
approach (which may not be a problem with your current data) is that NA 
is commonly used as an element of a vector to represent a missing value.

> So to sum up, function calls are expensive, therefore recursion should
> be avoided, and growing the size of a vector (which is probably
> reassigning and copying?) is also expensive.

"Avoided" may be too strong:  speed isn't always a concern, sometimes 
clarity is more important.  Growing vectors is definitely expensive.

Duncan Murdoch

> Thank you for your help!



> On 04/19/2016 11:51 PM, Duncan Murdoch wrote:
>> On 19/04/2016 3:39 PM, Gaston wrote:
>>> Hello everyone,
>>>
>>> I started learning R recently, and as a small exercise I wanted to
>>> write a recursive mergesort. I was extremely surprised to discover that
>>> my sorting, although operational, is deeply inefficient in time. Here is
>>> my code:
>>>
>>> merge <- function(x,y){
>>>   if (is.na(x[1])) return(y)
>>>   else if (is.na(y[1])) return(x)
>>>   else if (x[1] <
>>
>> [... text lost in the archive ...]
>>
>>   [...] > 0 && length(x2) > 0) {
>>     # compare the first values
>>     if (x1[1] < x2[1]) {
>>       result[i + 1] <- x1[1]
>>       x1 <- x1[-1]
>>     } else {
>>       result[i + 1] <- x2[1]
>>       x2 <- x2[-1]
>>     }
>>     i <- i + 1
>>   }
>>
>>   # put the smaller one into the result
>>   # delete it from whichever vector it came from
>>   # repeat until one of x1 or x2 is empty
>>   # copy both vectors (one is empty!) onto the end of the results
>>   result <- c(result, x1, x2)
>>   result
>> }
>>
>> If I were going for speed, I wouldn't modify the x1 and x2 vectors,
>> and I'd pre-allocate result to the appropriate length, rather than
>> growing it in the while loop.  But that was a different class!
>>
>> Duncan Murdoch
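
The two speed suggestions in the quoted message (leave x1 and x2
untouched, pre-allocate result) could be sketched as follows. This is one
possible reading of that advice, not code from the thread; the names
merge_sorted() and merge_sort() are made up, and the input is assumed to
be a numeric vector without NA values.

```r
# Merge two sorted numeric vectors without modifying either of them
merge_sorted <- function(x1, x2) {
  n1 <- length(x1)
  n2 <- length(x2)
  result <- numeric(n1 + n2)        # allocated once, never grown
  i1 <- 1L
  i2 <- 1L
  k <- 1L
  while (i1 <= n1 && i2 <= n2) {
    if (x1[i1] <= x2[i2]) {
      result[k] <- x1[i1]
      i1 <- i1 + 1L
    } else {
      result[k] <- x2[i2]
      i2 <- i2 + 1L
    }
    k <- k + 1L
  }
  # Copy the leftovers; at most one of the two inputs has any
  if (i1 <= n1) result[k:(n1 + n2)] <- x1[i1:n1]
  if (i2 <= n2) result[k:(n1 + n2)] <- x2[i2:n2]
  result
}

# Recursive merge sort built on merge_sorted()
merge_sort <- function(x) {
  n <- length(x)
  if (n <= 1L) return(x)
  mid <- n %/% 2L
  merge_sorted(merge_sort(x[1:mid]), merge_sort(x[(mid + 1L):n]))
}

set.seed(1)
v <- runif(200)
sorted_v <- merge_sort(v)
```

Index counters take the place of x1[-1] and x2[-1], each of which copies
the remaining vector on every step. The recursion is kept here for
clarity, in line with the remark that "avoided" may be too strong.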





Re: [R] Merge sort

2016-04-20 Thread Gaston

I indeed used is.na() to check length, as I was not sure whether 
length() was a simple query or would go through the whole vector to 
count the elements.


So to sum up, function calls are expensive, therefore recursion should 
be avoided, and growing the size of a vector (which is probably 
reassigning and copying?) is also expensive.


Thank you for your help!




Re: [R-es] Script with no results

2016-04-20 Thread Carlos Ortega

Hi,

If I have understood what you want to do, namely to get how many 0s, 1s,
2s, ... there are per column, this alternative comes to mind.

#---
col_ref <- 0:max(bas)
res_df <- 0
for(i in 1:ncol(bas)) {
  print(i)
  tmp_val <- as.data.frame(table(bas[,i]))
  tmp_df <- merge(col_ref, tmp_val, by.x = 1, by.y=1, all = TRUE)
  res_df <- cbind.data.frame(res_df, tmp_df$Freq)
}
res_df <- res_df[, 2:ncol(res_df)]
names(res_df) <- paste("V",1:ncol(res_df), sep="")
res_end <- cbind.data.frame(col_ref, res_df)
res_end[is.na(res_end)] <- 0
#---

For each column, this script counts how many zeros, ones, ... there are,
up to the maximum value found in "bas".
The first column, "col_ref", is the reference column; it runs from 0 to the
maximum value of bas.
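
A more compact variant of the same count, sketched with table() on a
factor whose levels are fixed to col_ref so that missing values show up as
0. The small matrix bas below is made-up data for illustration only; the
real "bas" is not shown in the thread.

```r
# Hypothetical data: 4 rows, 3 columns of small non-negative integers
bas <- matrix(c(0, 1, 1, 2,
                2, 2, 0, 0,
                1, 1, 1, 3), ncol = 3)

col_ref <- 0:max(bas)

# One column of counts per column of bas; fixed levels give 0 for absent values
counts <- sapply(seq_len(ncol(bas)), function(j) {
  table(factor(bas[, j], levels = col_ref))
})

res_end <- data.frame(col_ref, counts)
names(res_end)[-1] <- paste0("V", seq_len(ncol(bas)))
```

Fixing the factor levels removes the need for the merge() step and for
replacing NA with 0 afterwards.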

Thanks,
Carlos.

On 20 April 2016 at 2:13, Manuel Máquez  wrote:

> Carlos:
>
> Thank you very much for such a quick reply.
>
> The aim is to obtain the incidences in each of the 39 groups, those that
> occur most frequently. That is, for group 'n', how many times 1, 2, ...
> up to 10 occur. This summary will keep changing, and the plan is to apply
> loess to it.
>
> As for the values I mention, I obtained them by physically counting
> from the TAB data.
>
> Once again, my sincerest thanks.
>
> *MANOLO MÁRQUEZ P.*
>
>



-- 
Regards,
Carlos Ortega
www.qualityexcellence.es


Re: [R] Matrix: How create a _row-oriented_ sparse Matrix (=dgRMatrix)?

2016-04-20 Thread Martin Maechler

> Henrik Bengtsson 
> on Tue, 19 Apr 2016 14:04:11 -0700 writes:

> Using the Matrix package, how can I create a row-oriented sparse
> Matrix from scratch populated with some data?  By default a
> column-oriented one is created and I'm aware of the note that the
> package is optimized for column-oriented ones, but I'm only interested
> in using it for holding my sparse row-oriented data and doing basic
> subsetting by rows (even using drop=FALSE).

> Here is what I get when I set up a column-oriented sparse Matrix:

>> Cc <- Matrix(0, nrow=5, ncol=5, sparse=TRUE)
>> Cc[1:3,1] <- 1

A general ("teaching") remark :
The above use of Matrix() is seen in many places, and is fine
for small matrices and the case where you only use the `[<-`
method very few times (as above).
Also using  Matrix()  is nice when being introduced to using the
Matrix package.

However, for efficiency in non-small cases, do use

   sparseMatrix()

directly to construct sparse matrices.
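
As a minimal sketch of that advice (the index vectors below are chosen only
to reproduce the small Cc example from the question, and the name Cc2 is
made up for illustration), the matrix can be built in one call from its
row indices, column indices and values:

```r
library(Matrix)

# Entries (1,1), (2,1) and (3,1) are set to 1; everything else stays zero.
# i = row indices, j = column indices, x = values, dims = matrix dimensions.
Cc2 <- sparseMatrix(i = 1:3, j = c(1, 1, 1), x = c(1, 1, 1), dims = c(5, 5))

class(Cc2)   # reports the (column-oriented) class that was created
Cc2
```

This avoids the repeated `[<-` calls, each of which has to rebuild the
sparse structure.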

>> Cc
> 5 x 5 sparse Matrix of class "dgCMatrix"

> [1,] 1 . . . .
> [2,] 1 . . . .
> [3,] 1 . . . .
> [4,] . . . . .
> [5,] . . . . .
>> str(Cc)
> Formal class 'dgCMatrix' [package "Matrix"] with 6 slots
> ..@ i   : int [1:3] 0 1 2
> ..@ p   : int [1:6] 0 3 3 3 3 3
> ..@ Dim : int [1:2] 5 5
> ..@ Dimnames:List of 2
> .. ..$ : NULL
> .. ..$ : NULL
> ..@ x   : num [1:3] 1 1 1
> ..@ factors : list()

> When I try to do the analogue for a row-oriented matrix, I get a
> "dgTMatrix", whereas I would expect a "dgRMatrix":

>> Cr <- Matrix(0, nrow=5, ncol=5, sparse=TRUE)
>> Cr <- as(Cr, "dsRMatrix")
>> Cr[1,1:3] <- 1
>> Cr
> 5 x 5 sparse Matrix of class "dgTMatrix"

> [1,] 1 1 1 . .
> [2,] . . . . .
> [3,] . . . . .
> [4,] . . . . .
> [5,] . . . . .

The reason for the above behavior has been

a) efficiency.  All the subassignment ( `[<-` ) methods for
   "RsparseMatrix" objects (of which "dsRMatrix" is a special case)
   are implemented via  TsparseMatrix.
b) because of the general attitude that Csparse (and Tsparse to
   some extent) are well supported in Matrix,
   and e.g. further operations on Rsparse matrices would *again*
   go via T* or C* sparse ones, I had decided to keep things Tsparse.

[...]

> Trying with explicit coercion does not work:

>> as(Cc, "dgRMatrix")
> Error in as(Cc, "dgRMatrix") :
> no method or default for coercing "dgCMatrix" to "dgRMatrix"

>> as(Cr, "dgRMatrix")
> Error in as(Cr, "dgRMatrix") :
> no method or default for coercing "dgTMatrix" to "dgRMatrix"

The general philosophy in 'Matrix' with all the class
hierarchies and the many specific classes has been to allow and
foster coercing to abstract super classes,
i.e, to  "dMatrix"  or "generalMatrix", "triangularMatrix", or
then "denseMatrix", "sparseMatrix", "CsparseMatrix" or
"RsparseMatrix", etc

So in the above,  as(*, "RsparseMatrix")   should always work.

In summary, for what you want,

   as(sparseMatrix(.), "RsparseMatrix")

should give you what you want reliably and efficiently.
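
A small sketch of this recipe, using the 5 x 5 example from the question
(the index vectors are illustrative assumptions, not part of the original
code):

```r
library(Matrix)

# Build the matrix column-oriented first, then coerce to the abstract
# row-oriented class rather than to a specific class such as "dgRMatrix".
Cr <- as(sparseMatrix(i = c(1, 1, 1), j = 1:3, x = c(1, 1, 1),
                      dims = c(5, 5)),
         "RsparseMatrix")

is(Cr, "RsparseMatrix")   # check that the result is row-oriented
str(Cr)                   # row-oriented storage uses slots j and p
Cr[1, , drop = FALSE]     # the row subsetting asked about in the question
```

Note that, as explained above, later subassignment with `[<-` may again
return a Tsparse matrix, so it seems safest to coerce after all entries
have been set.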

> Am I doing something wrong here?  Or is this what it means that the package is
> optimized for the column-oriented representation and I shouldn't
> really work with row-oriented ones?  I'm really only interested in
> access to efficient Cr[row,,drop=FALSE] subsetting (and a small memory
> footprint).

{ though you could equivalently use   Cc[,row, drop=FALSE]
  with a CsparseMatrix Cc := t(Cr),
  couldn't you ?
}
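
The transposed alternative can be sketched as follows. The names M and
"row" are made up for illustration; the idea is simply to store the
transpose as a column-oriented matrix and select columns instead of rows:

```r
library(Matrix)

# M is the matrix whose rows are of interest (same example as above)
M <- sparseMatrix(i = c(1, 1, 1), j = 1:3, x = c(1, 1, 1), dims = c(5, 5))

# Store the transpose: the columns of Cc are the rows of M
Cc <- t(M)

row <- 1
sub <- Cc[, row, drop = FALSE]   # 5 x 1 column holding row 1 of M
t(sub)                           # back to a 1 x 5 row if that shape is needed
```

Column selection on a CsparseMatrix is the access pattern the package is
optimized for, which is why this detour may be worthwhile.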

Martin Maechler  (maintainer of 'Matrix')
ETH Zurich


Re: [R] Problem with X11

2016-04-20 Thread Lorenzo Isella


Hello!
Today on Debian testing, R 3.2.5 was delivered among the updates.
The X11 problem is no longer there.
Cheers

Lorenzo

On Tue, Apr 19, 2016 at 02:28:44PM -0400, Tom Wright wrote:

I don't have my Debian box available, so I can't confirm, but I would try
$apt-get install libpng
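
Before or after changing system packages, it may help to ask R itself what
it can load. The following is only a diagnostic sketch, not a fix; it
assumes a reasonably recent R (grSoftVersion() is available from R 3.2.0):

```r
# Which graphics capabilities does this R build report?
# FALSE for "X11" would match the "X11 module cannot be loaded" error.
caps <- capabilities(c("X11", "png"))
print(caps)

# Versions of the graphics libraries R is using, including libpng
gr <- grSoftVersion()
print(gr["libpng"])
```

If the libpng version reported here differs from the one the R_X11.so
module was built against, a mismatch between R and the system libraries
is a plausible cause.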

On Tue, Apr 19, 2016 at 11:23 AM, Lorenzo Isella 
wrote:


Dear All,
I have never had this problem before. I run Debian testing on my box
and I have recently updated my R environment.
Now, see what happens when I try the most trivial of all plots:

plot(seq(22))



Error in (function (display = "", width, height, pointsize, gamma, bg,  :
  X11 module cannot be loaded
In addition: Warning message:
In (function (display = "", width, height, pointsize, gamma, bg,  :
  unable to load shared object '/usr/lib/R/modules//R_X11.so':
  /usr/lib/x86_64-linux-gnu/libpng12.so.0: version `PNG12_0' not
  found (required by /usr/lib/R/modules//R_X11.so)

and this is my sessionInfo()

sessionInfo()



R version 3.2.4 Revised (2016-03-16 r70336)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Debian GNU/Linux stretch/sid

locale:
 [1] LC_CTYPE=en_GB.utf8       LC_NUMERIC=C
 [3] LC_TIME=en_GB.utf8        LC_COLLATE=en_GB.utf8
 [5] LC_MONETARY=en_GB.utf8    LC_MESSAGES=en_GB.utf8
 [7] LC_PAPER=en_GB.utf8       LC_NAME=C
 [9] LC_ADDRESS=C              LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_GB.utf8 LC_IDENTIFICATION=C

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base


Does anybody understand what is going on here?
Regards

Lorenzo

