Re: [R] table and unique seems to behave differently
Thanks a lot, it answers my question.

Alain

From: Jeff Newmiller
Sent: Tuesday, 10 December 2019 16:31
To: r-help@r-project.org; Duncan Murdoch; Alain Guillet
Subject: Re: [R] table and unique seems to behave differently

I think the question was about table vs unique. Table groups by character representation, unique groups by the underlying representation.

On December 10, 2019 7:03:34 AM PST, Duncan Murdoch wrote:
>On 10/12/2019 3:53 a.m., Alain Guillet wrote:
>> Hi,
>>
>> I have a vector (see below the dput) and I use unique on it to get
>> unique values. If I then sort the result of the vector obtained by
>> unique, I see some elements that look like identical. I suspect it
>> could be a matter of rounded values but table gives a different result:
>> unlike unique output which contains "3.4 3.4", table has only one cell
>> for 3.4.
>>
>> Can anybody know why I get results that look like incoherent between
>> the two functions?
>
>dput() does some rounding, so it doesn't necessarily reproduce values
>exactly. For example,
>
>x <- c(3.4, 3.4 + 1e-15)
>unique(x)
>#> [1] 3.4 3.4
>dput(x)
>#> c(3.4, 3.4)
>identical(x, c(3.4, 3.4))
>#> [1] FALSE
>
>If you really want to see exact values, you can use the "hexNumeric"
>option to dput():
>
>dput(x, control = "hexNumeric")
>#> c(0x1.b333333333333p+1, 0x1.b333333333335p+1)
>identical(x, c(0x1.b333333333333p+1, 0x1.b333333333335p+1))
>#> [1] TRUE
>
>Duncan Murdoch

--
Sent from my phone. Please excuse my brevity.

______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
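The difference Jeff describes can be reproduced on a small case. The following is a minimal sketch in R that reuses the two-element vector from Duncan's example; the rounding step at the end is only one possible remedy and assumes the data are meant to carry a single decimal place.

```r
# Two doubles that print identically but differ in their last bits
x <- c(3.4, 3.4 + 1e-15)

# unique() compares the stored doubles, so both values survive
n_unique <- length(unique(x))

# table() goes through factor(), which matches values by their character
# representation (15 significant digits), so the two values share one cell
n_cells <- length(table(x))

# Printing 17 significant digits exposes the hidden difference
shown <- sprintf("%.17g", x)

# Assumption: the values are meant to have one decimal place
x_clean <- round(x, 1)
n_unique_clean <- length(unique(x_clean))

print(c(n_unique = n_unique, n_cells = n_cells, n_unique_clean = n_unique_clean))
print(shown)
```

If the extra digits are measurement noise, rounding before calling unique() makes the two functions agree; if they are meaningful, table() is the one hiding information.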
Re: [R] table and unique seems to behave differently
I learned something else today: dput() does not write out exactly the vector that causes the problem. I could send an RData file instead, but I think that is not allowed on this mailing list...

Alain

From: Chris Evans
Sent: Tuesday, 10 December 2019 15:41
To: Alain Guillet
Cc: r-help@r-project.org
Subject: Re: [R] table and unique seems to behave differently

This doesn't answer your question, but I get exactly the same vector of length 210 with unique(toto) and names(table(toto)), using the same version of R that you are, and I can't see any obvious reason why you wouldn't. When I hit things like that, it tends to be that one version is a string with leading or trailing spaces, or a character set issue. I can't see that those apply here, but it's all I could imagine without racking my poor old brains much more.

Good luck finding the answer!

Chris

----- Original Message -----
> From: "Alain Guillet"
> To: r-help@r-project.org
> Sent: Tuesday, 10 December, 2019 09:53:29
> Subject: [R] table and unique seems to behave differently

> Hi,
>
> I have a vector (see below the dput) and I use unique on it to get unique
> values. If I then sort the result of the vector obtained by unique, I see some
> elements that look like identical. I suspect it could be a matter of rounded
> values but table gives a different result: unlike unique output which contains
> "3.4 3.4", table has only one cell for 3.4.
>
> Can anybody know why I get results that look like incoherent between the two
> functions?
[R] table and unique seems to behave differently
Hi, I have a vector (see below the dput) and I use unique on it to get unique values. If I then sort the result of the vector obtained by unique, I see some elements that look like identical. I suspect it could be a matter of rounded values but table gives a different result: unlike unique output which contains "3.4 3.4", table has only one cell for 3.4. Can anybody know why I get results that look like incoherent between the two functions? Best regards, Alain Guillet -- platform x86_64-pc-linux-gnu arch x86_64 os linux-gnu system x86_64, linux-gnu status major 3 minor 6.1 year 2019 month 07 day05 svn rev76782 language R version.string R version 3.6.1 (2019-07-05) nickname Action of the Toes -- > dput(toto) c(2.5, 2.6, 2.6, 2.6, 2.6, 2.7, 2.7, 2.7, 2.7, 2.7, 2.7, 2.8, 2.8, 2.8, 2.8, 2.8, 2.8, 2.8, 2.8, 2.8, 2.9, 2.9, 2.9, 2.9, 2.9, 2.9, 2.9, 2.9, 3, 3, 3, 3, 3, 3, 3, 3, 3.1, 3.1, 3.1, 3.1, 3.1, 3.1, 3.1, 3.1, 3.1, 3.1, 3.1, 3.1, 3.2, 3.2, 3.2, 3.2, 3.2, 3.2, 3.2, 3.2, 3.2, 3.2, 3.2, 3.2, 3.2, 3.3, 3.3, 3.3, 3.3, 3.3, 3.3, 3.3, 3.3, 3.3, 3.3, 3.3, 3.3, 3.3, 3.4, 3.4, 3.4, 3.4, 3.4, 3.4, 3.4, 3.4, 3.4, 3.4, 3.4, 3.4, 3.4, 3.4, 3.5, 3.5, 3.5, 3.5, 3.5, 3.5, 3.5, 3.5, 3.5, 3.5, 3.5, 3.5, 3.5, 3.5, 3.5, 3.6, 3.6, 3.6, 3.6, 3.6, 3.6, 3.6, 3.6, 3.6, 3.6, 3.6, 3.6, 3.6, 3.6, 3.6, 3.6, 3.6, 3.6, 3.6, 3.6, 3.6, 3.6, 3.6, 3.7, 3.7, 3.7, 3.7, 3.7, 3.7, 3.7, 3.7, 3.7, 3.7, 3.7, 3.7, 3.7, 3.7, 3.7, 3.7, 3.7, 3.7, 3.7, 3.7, 3.7, 3.7, 3.7, 3.7, 3.8, 3.8, 3.8, 3.8, 3.8, 3.8, 3.8, 3.8, 3.8, 3.8, 3.8, 3.8, 3.8, 3.8, 3.8, 3.8, 3.8, 3.8, 3.8, 3.8, 3.8, 3.8, 3.8, 3.8, 3.8, 3.9, 3.9, 3.9, 3.9, 3.9, 3.9, 3.9, 3.9, 3.9, 3.9, 3.9, 3.9, 3.9, 3.9, 3.9, 3.9, 3.9, 3.9, 3.9, 3.9, 3.9, 3.9, 3.9, 3.9, 3.9, 3.9, 3.9, 3.9, 3.9, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4.1, 4.1, 4.1, 4.1, 4.1, 4.1, 4.1, 4.1, 4.1, 4.1, 4.1, 4.1, 4.1, 4.1, 4.1, 4.1, 4.1, 4.1, 4.1, 4.1, 4.1, 4.1, 4.1, 4.1, 4.1, 4.1, 4.1, 4.1, 4.1, 4.1, 4.1, 4.1, 
4.1, 4.1, 4.1, 4.2, 4.2, 4.2, 4.2, 4.2, 4.2, 4.2, 4.2, 4.2, 4.2, 4.2, 4.2, 4.2, 4.2, 4.2, 4.2, 4.2, 4.2, 4.2, 4.2, 4.2, 4.2, 4.2, 4.2, 4.2, 4.2, 4.2, 4.2, 4.2, 4.3, 4.3, 4.3, 4.3, 4.3, 4.3, 4.3, 4.3, 4.3, 4.3, 4.3, 4.3, 4.3, 4.3, 4.3, 4.3, 4.3, 4.3, 4.3, 4.3, 4.3, 4.3, 4.3, 4.3, 4.3, 4.3, 4.3, 4.3, 4.3, 4.3, 4.4, 4.4, 4.4, 4.4, 4.4, 4.4, 4.4, 4.4, 4.4, 4.4, 4.4, 4.4, 4.4, 4.4, 4.4, 4.4, 4.4, 4.4, 4.4, 4.4, 4.4, 4.4, 4.4, 4.4, 4.4, 4.4, 4.4, 4.4, 4.4, 4.4, 4.4, 4.4, 4.4, 4.5, 4.5, 4.5, 4.5, 4.5, 4.5, 4.5, 4.5, 4.5, 4.5, 4.5, 4.5, 4.5, 4.5, 4.5, 4.5, 4.5, 4.5, 4.5, 4.5, 4.5, 4.5, 4.5, 4.5, 4.5, 4.5, 4.5, 4.5, 4.5, 4.5, 4.5, 4.5, 4.5, 4.5, 4.5, 4.5, 4.5, 4.5, 4.5, 4.5, 4.5, 4.5, 4.5, 4.5, 4.5, 4.5, 4.5, 4.5, 4.6, 4.6, 4.6, 4.6, 4.6, 4.6, 4.6, 4.6, 4.6, 4.6, 4.6, 4.6, 4.6, 4.6, 4.6, 4.6, 4.6, 4.6, 4.6, 4.6, 4.6, 4.6, 4.6, 4.6, 4.6, 4.6, 4.6, 4.6, 4.6, 4.6, 4.6, 4.6, 4.6, 4.6, 4.6, 4.6, 4.6, 4.6, 4.6, 4.6, 4.6, 4.6, 4.7, 4.7, 4.7, 4.7, 4.7, 4.7, 4.7, 4.7, 4.7, 4.7, 4.7, 4.7, 4.7, 4.7, 4.7, 4.7, 4.7, 4.7, 4.7, 4.7, 4.7, 4.7, 4.7, 4.7, 4.7, 4.7, 4.7, 4.7, 4.7, 4.7, 4.7, 4.7, 4.7, 4.7, 4.7, 4.7, 4.7, 4.7, 4.7, 4.7, 4.7, 4.7, 4.7, 4.7, 4.7, 4.8, 4.8, 4.8, 4.8, 4.8, 4.8, 4.8, 4.8, 4.8, 4.8, 4.8, 4.8, 4.8, 4.8, 4.8, 4.8, 4.8, 4.8, 4.8, 4.8, 4.8, 4.8, 4.8, 4.8, 4.8, 4.8, 4.8, 4.8, 4.8, 4.8, 4.8, 4.8, 4.8, 4.8, 4.8, 4.8, 4.8, 4.8, 4.8, 4.9, 4.9, 4.9, 4.9, 4.9, 4.9, 4.9, 4.9, 4.9, 4.9, 4.9, 4.9, 4.9, 4.9, 4.9, 4.9, 4.9, 4.9, 4.9, 4.9, 4.9, 4.9, 4.9, 4.9, 4.9, 4.9, 4.9, 4.9, 4.9, 4.9, 4.9, 4.9, 4.9, 4.9, 4.9, 4.9, 4.9, 4.9, 4.9, 4.9, 4.9, 4.9, 4.9, 4.9, 4.9, 4.9, 4.9, 4.9, 4.9, 4.9, 4.9, 4.9, 4.9, 4.9, 4.9, 4.9, 4.9, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5.1, 5.1, 5.1, 5.1, 5.1, 5.1, 5.1, 5.1, 5.1, 5.1, 5.1, 5.1, 5.1, 5.1, 5.1, 5.1, 5.1, 5.1, 5.1, 5.1, 5.1, 5.1, 5.1, 5.1, 5.1, 5.1, 5.1, 5.1, 5.1, 5.1, 5.1, 5.1, 5.1, 5.1, 5.1, 5.1, 5.1, 5.1, 5.1, 5.1, 5.1, 5.1, 5.2, 5.2, 5.2, 5.2, 5.2, 5.2, 5.2, 5.2, 5.2, 
5.2, 5.2, 5.2, 5.2, 5.2, 5.2, 5.2, 5.2, 5.2, 5.2, 5.2, 5.2, 5.2, 5.2, 5.2, 5.2, 5.2, 5.2, 5.2, 5.2, 5.2, 5.2, 5.2, 5.2, 5.2, 5.3, 5.3, 5.3, 5.3, 5.3, 5.3, 5.3, 5.3, 5.3, 5.3, 5.3, 5.3, 5.3, 5.3, 5.3, 5.3, 5.3, 5.3, 5.3, 5.3, 5.3, 5.3, 5.3, 5.3, 5.3, 5.3, 5.3, 5.3, 5.3, 5.3, 5.3, 5.3, 5.3, 5.3, 5.3, 5.3, 5.3, 5.3, 5.3, 5.3, 5.3, 5.3, 5.3, 5.3, 5.3, 5.3, 5.3, 5.3, 5.4, 5.4, 5.4, 5.4, 5.4, 5.4, 5.4, 5.4, 5.4, 5.4, 5.4, 5.4, 5.4, 5.4, 5.4, 5.4, 5.4, 5.4, 5.4, 5.4, 5.4, 5.4, 5.4, 5.4, 5.4, 5.4, 5.4, 5.4, 5.4, 5.4, 5.4, 5.4, 5.4, 5.4, 5.4, 5.4, 5.4, 5.4, 5.4, 5.4, 5.4, 5.4, 5.5, 5.5, 5.5, 5.5, 5.5, 5.5, 5.5, 5.5, 5.5, 5.5, 5.5, 5.5, 5.5, 5.5, 5.5, 5.5, 5.5, 5.5, 5.5, 5.5, 5.5, 5.5, 5.5, 5.5, 5.5, 5.5, 5.5, 5.5, 5.5, 5.5, 5.5, 5.5, 5.5, 5.5, 5.5, 5.5, 5.6, 5.6, 5.6, 5.6, 5.6, 5.6,
Re: [R] Aggregate behaviour inconsistent (?) when FUN=table
Thank you for your response. Note that with R 3.4.3, I get the same result with simplify=TRUE or simplify=FALSE.

My problem was that the behaviour differed depending on whether I defined my columns as character or as numeric, but a few minutes ago I discovered that the data.frame function also has a stringsAsFactors option. So yes, it was a stupid question and I apologize for it.

On 06/02/2018 18:07, William Dunlap wrote:
> Don't use aggregate's simplify=TRUE when FUN() produces return values of
> various dimensions. In your case, the shape of table(subset)'s return
> value depends on the number of levels in the factor 'subset'. If you make
> B a factor before splitting it by C, each split will have the same number
> of levels (2). If you split it and then let table convert each split to a
> factor, one split will have 1 level and the other 2.
>
> To see the details of the output, use str() instead of print().
>
> Bill Dunlap
> TIBCO Software
> wdunlap tibco.com
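Bill Dunlap's advice can be sketched as follows, using the data frame from the original post. Converting A and B to factors before calling aggregate() is the step he describes; the name df_f is only an illustrative choice.

```r
df <- data.frame(A = c(1, 1, 1, 1, 0, 0, 0, 0),
                 B = c(1, 0, 1, 0, 0, 0, 1, 0),
                 C = c(1, 0, 1, 0, 0, 1, 1, 1))

# Convert the tabulated columns to factors *before* aggregate() splits them,
# so that every subset carries both levels ("0" and "1")
df_f <- df
df_f[c("A", "B")] <- lapply(df_f[c("A", "B")], factor)

res <- aggregate(df_f[c("A", "B")], by = list(df_f$C), FUN = table)

# str() shows that A and B are now two-column matrices of counts
str(res)
```

Because table() now returns a vector of length 2 for every subset, simplify=TRUE can bind the results into a matrix instead of falling back to a list column.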
[R] Aggregate behaviour inconsistent (?) when FUN=table
Dear R users,

When I use aggregate with table as FUN, I get what I would call strange behaviour if it involves numerical vectors and one "level" of a vector is not present for every level of the "by" variable:

    > df <- data.frame(A=c(1,1,1,1,0,0,0,0),B=c(1,0,1,0,0,0,1,0),C=c(1,0,1,0,0,1,1,1))
    > aggregate(df[1:2],list(df$C),table,simplify = TRUE,drop=TRUE)
      Group.1 A.0 A.1    B
    1       0   1   2    3
    2       1   3   2 2, 3
    > table(df$C,df$B)

        0 1
      0 3 0
      1 2 3

As you can see, a comma appears in the B column of the aggregate output, whereas when I call table directly I obtain the same result as if B were defined as a factor (I suppose this is because "non-factor arguments a are coerced via factor", according to the details of the table help). I find this completely normal if I remember that aggregate first splits the data into subsets and then computes the table on each of them.

But then I don't understand why it works differently with character vectors. Indeed, if I use character vectors, I get the same result as with factors:

    > df <- data.frame(A=factor(c("1","1","1","1","0","0","0","0")),B=factor(c("1","0","1","0","0","0","1","0")),C=factor(c("1","0","1","0","0","1","1","1")))
    > aggregate(df[1:2],list(df$C),table,simplify = TRUE,drop=TRUE)
      Group.1 A.0 A.1 B.0 B.1
    1       0   1   2   3   0
    2       1   3   2   2   3
    > df <- data.frame(A=factor(c(1,1,1,1,0,0,0,0)),B=factor(c(1,0,1,0,0,0,1,0)),C=factor(c(1,0,1,0,0,1,1,1)))
    > aggregate(df[1:2],list(df$C),table,simplify = TRUE,drop=TRUE)
      Group.1 A.0 A.1 B.0 B.1
    1       0   1   2   3   0
    2       1   3   2   2   3

Would it be possible to document this behaviour in the aggregate help, since the result is not entirely consistent with what the table help leads one to expect? Or would it be possible to get the same results regardless of the vector type?

This post was rejected on the R-devel mailing list so I ask my question here as suggested.

Best regards,
Alain Guillet

--
Alain Guillet
Statistician and Computer Scientist

SMCS - IMMAQ - Université catholique de Louvain
http://www.uclouvain.be/smcs

Bureau c.316
Voie du Roman Pays, 20 (bte L1.04.01)
B-1348 Louvain-la-Neuve
Belgium

Tel: +32 10 47 30 50

Access: http://www.uclouvain.be/323631.html
Re: [R] using read.csv2()
Hello,

The defaults in read.csv2 are ";" as the separator and "," as the decimal symbol. Your file mixes two conventions: it uses ";" as the separator but "." as the decimal symbol, so with the defaults var3 is not read as a number. You can solve the problem by setting the dec argument to ".":

    don <- read.csv2("test.csv", dec = ".")

Alain

On 29/09/16 10:59, Voirin Pascale wrote:
> Hello,
>
> I have a problem with the variable type defined by reading a csv file
> with read.csv2. Here is a test file saved as "test.csv":
>
>     var1;var2;var3
>     TI;1995;4.5
>     VD;1990;4.8
>     FR;1994;3.9
>     VS;1993;5.1
>     FR;1995;4.7
>     FR;1992;5.8
>
> That I read in R with:
>
>     read.csv2("test.csv")->don;don
>     don$var3
>     ## [1] 4.5 4.8 3.9 5.1 4.7 5.8
>     ## Levels: 3.9 4.5 4.7 4.8 5.1 5.8
>     as.double(don$var3)
>     ## [1] 2 4 1 5 3 6
>
> Why is it by default a factor type? And how can I get the decimal value
> for var3?
>
> Thanks a lot for your answer.
>
> With my best regards,
> Pascale Voirin
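The fix above can be tried without creating a file. This is a small sketch that feeds the sample data from the question to read.csv2() through its text argument; the second half shows, on a made-up three-element factor, why as.double() returned 2 4 1 5 3 6 in the question.

```r
# The sample file from the question, supplied inline via `text=`
csv <- "var1;var2;var3
TI;1995;4.5
VD;1990;4.8
FR;1994;3.9
VS;1993;5.1
FR;1995;4.7
FR;1992;5.8"

# ";" separator (the read.csv2 default) combined with "." as decimal mark
don <- read.csv2(text = csv, dec = ".")
str(don)

# If a column has already been read as a factor, convert it via character:
# as.double() on the factor itself returns the integer level codes
f <- factor(c("4.5", "4.8", "3.9"))
codes  <- as.double(f)
values <- as.double(as.character(f))
print(codes)
print(values)
```

Note that since R 4.0.0 the default is stringsAsFactors = FALSE, so the same file read with the default dec = "," would give a character column rather than a factor; the remedy (dec = ".") is the same.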
Re: [R] Upgrade R 3.2 to 3.3 using tar.gz file on Ubuntu 16.04
Dear Luigi,

You have to edit the /etc/apt/sources.list file to add a repository that provides a newer R version. Everything is explained on the page https://cran.r-project.org/bin/linux/ubuntu/ .

Alain

On 13/09/16 15:00, Luigi Marongiu wrote:
> Dear all,
>
> I am working on Linux Ubuntu 16.04 and I have installed R 3.2. I need to
> upgrade to R 3.3 and I tried several options available online with no
> success. I downloaded the tar.gz file for R 3.3 and I would like to ask
> how can I use this file in order to accomplish the upgrade.
>
> Many thanks,
> Luigi
[R] ggplot2 - unexpected behaviour with facet_grid
Hello,

I use ggplot2 to plot the same data over 3 periods, so I call facet_grid to get one subgraph per period. But when I do so, I get different results between the call on the whole data and the call on only one period (I expect one of the subgraphs to be identical to the graph obtained when using only one period).

I have added the code and my session info below. Could you explain to me what I am doing wrong, or whether there is a bug?

Thank you.

Kind regards,
Alain

--

    library(ggplot2)

    # data
    tmp <- data.frame(x=rnorm(9000), y=rnorm(9000),
                      color=factor(rep(1:3,each=3000)),
                      period=factor(rep(1:3,3000)),
                      ligne=factor(rep(1:2,4500)))

    # plot with the three periods
    ggplot(tmp, aes(x=x,y=y,col=color,linetype=ligne)) +
      geom_smooth() +
      scale_colour_manual(values=c("black","blue","yellow")) +
      guides(linetype=FALSE,col=FALSE) +
      facet_grid(period~.)

    # plot with only the first period
    ggplot(tmp[tmp$period=="1",], aes(x=x,y=y,col=color,linetype=ligne)) +
      geom_smooth() +
      scale_colour_manual(values=c("black","blue","yellow")) +
      guides(linetype=FALSE,col=FALSE) +
      facet_grid(period~.)

--

    R version 3.3.1 (2016-06-21)
    Platform: x86_64-pc-linux-gnu (64-bit)
    Running under: Debian GNU/Linux 8 (jessie)

    locale:
     [1] LC_CTYPE=en_GB.UTF-8 LC_NUMERIC=C LC_TIME=en_GB.UTF-8
     [4] LC_COLLATE=en_GB.UTF-8 LC_MONETARY=en_GB.UTF-8 LC_MESSAGES=en_GB.UTF-8
     [7] LC_PAPER=en_GB.UTF-8 LC_NAME=C LC_ADDRESS=C
    [10] LC_TELEPHONE=C LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C

    attached base packages:
    [1] stats graphics grDevices utils datasets methods base

    other attached packages:
    [1] doBy_4.5-15 ggplot2_2.1.0

    loaded via a namespace (and not attached):
     [1] Rcpp_0.12.5 lattice_0.20-33 digest_0.6.9 MASS_7.3-45 grid_3.3.1
     [6] plyr_1.8.4 nlme_3.1-128 gtable_0.2.0 magrittr_1.5 scales_0.4.0
    [11] stringi_1.1.1 reshape2_1.4.1 Matrix_1.2-6 labeling_0.3 tools_3.3.1
    [16] stringr_1.0.0 munsell_0.4.3 colorspace_1.2-6 mgcv_1.8-12
Re: [R] rename columns with pattern
Dear Raj,

    names(dff)[1:6] <- paste("bp", 1:6, sep = "_")

Alain

On 2015-01-12 15:17, Kuma Raj wrote:
> I want to rename columns 1 to 6 in the sample data set as bp_1 to bp_6.
> How could I do that in R?
>
> Thanks
>
>     dput(dff)
>     structure(list(
>       one = c(1.00027378507871, 0.982313483915127, 1.1531279945243,
>               1.07400410677618, 1.22710472279261, 1.19762271047046,
>               1.10904859685147, 1.32060232717317),
>       two = c(1.04707392197125, 1.00998288843258, 1.17598904859685,
>               1.09595482546201, 1.28599589322382, 1.26632675564591,
>               1.12986995208761, 1.30704654346338),
>       three = c(1.06301619895049, 1.02743782797171, 1.1977093315081,
>                 1.11466803559206, 1.2949441022131, 1.28365657768591,
>                 1.1305452886151, 1.32089436459046),
>       four = c(1.06994010951403, 1.03489904175222, 1.19799452429843,
>                1.1172022587269, 1.28742984257358, 1.27650013346977,
>                1.12265058179329, 1.30723134839151),
>       five = c(1.07019712525667, 1.03722792607803, 1.19174811772758,
>                1.11514168377823, 1.26594387405886, 1.25720010677582,
>                1.11339630390144, 1.29178507871321),
>       six = c(1.1909650924, 1.08407027150354, 1.24785877253023,
>               1.16373032169747, 1.31150581793292, 1.31042514031455,
>               1.16205338809035, 1.37122975131189),
>       idd = 1:8),
>       .Names = c("one", "two", "three", "four", "five", "six", "idd"),
>       row.names = c(NA, -8L), class = c("tbl_df", "data.frame"))
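For readers who want to try the renaming without the full data set, here is a minimal sketch. The data frame below is a hypothetical stand-in for dff with made-up values; only its column names match the question.

```r
# Hypothetical stand-in for dff: same column names, made-up values
dff <- data.frame(one = 1:2, two = 3:4, three = 5:6,
                  four = 7:8, five = 9:10, six = 11:12, idd = 1:2)

# Replace the first six names with bp_1 ... bp_6, leaving idd untouched
names(dff)[1:6] <- paste("bp", 1:6, sep = "_")
print(names(dff))
```

Assigning into names(dff)[1:6] changes only the selected names, so any further columns keep theirs.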
Re: [R] Remove levels
Hi,

Without more information, I guess your problem is that the level name still exists in the factor's set of levels even though no observation takes that value anymore. If so, try droplevels.

Alain Guillet

On 13/06/13 14:02, Shane Carey wrote:
> I have a dataframe consisting of factors in one column. I'm trying to
> remove certain levels using the following code:
>
>     toBeRemoved1 <- which(DATA$UnitName_1 == "lake")
>     DATA <- DATA[-toBeRemoved1, ]
>
> However it will not remove the level "lake".
>
> In the past this worked for me, but it's not working now. Any help
> appreciated.
>
> Thanks
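A short sketch of what droplevels does in this situation. The data frame is invented for illustration (only the column name UnitName_1 and the level "lake" come from the question), and the logical subsetting is my substitution for the -which() idiom.

```r
# Invented example data; only UnitName_1 and "lake" come from the question
DATA <- data.frame(UnitName_1 = factor(c("lake", "river", "lake", "bog")),
                   value = 1:4)

# Remove the rows; logical indexing also behaves well when nothing matches
DATA <- DATA[DATA$UnitName_1 != "lake", ]

# The rows are gone, but the unused level is still recorded in the factor
lev_before <- levels(DATA$UnitName_1)

# droplevels() discards levels that no longer occur in the data
DATA <- droplevels(DATA)
lev_after <- levels(DATA$UnitName_1)

print(lev_before)
print(lev_after)
```

One reason to prefer the logical condition: if no row matches, DATA[-which(...), ] indexes with an empty vector and returns zero rows, whereas the logical form leaves the data unchanged.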
Re: [R] Introduction to R. Any such documentation in Vietnamese?
There is a "Contributed" documentation section on http://cran.r-project.org/. If you go there, you will find a document in Vietnamese that introduces R.

Alain Guillet

On 20/03/2013 02:06, Peter Alspach wrote:
> Dear fellow users
>
> Are there any Vietnamese language resources for beginners of R? If so, I
> would be interested in hearing from people who have had experience with
> them and which are better (if there is more than one).
>
> I am involved with an aid project in Vietnam, and would like to move the
> scientists involved from using Excel for 'analysis' to R.
>
> Thanks
>
> Peter Alspach
Re: [R] adding an ellipse to a PCA plot
Hi,

I think the easiest way is to use the plotellipses function of the FactoMineR package (but you have to run your PCA with the PCA function included in that package).

Alain

On 06-Jun-11 16:32, Lukas Baitsch wrote:
> Hi,
>
> I created a principal component plot using the first two principal
> components. I used the function princomp() to calculate the scores.
>
> Now, I would like to superimpose an ellipse representing the center and
> the 95% confidence interval of a series of points in my plot (so as to
> illustrate the grouping of my samples).
>
> I looked at the ellipse() function in the ellipse package but can't get
> it to work. The princomp() function gives me the scores of each point, so
> I can calculate the mean and the 95% CI, but I can't integrate this into
> the ellipse() function.
>
> Is there a better way of doing this or can someone help me figure out
> this function?
>
> best regards,
>
> Lukas
Re: [R] How to arrange the data
Hi,

You can use the melt function from the reshape package:

    melt(data, id = "date")

Alain

On 17-Dec-10 10:40, Amy Milano wrote:
> Dear R helpers
>
> I have one data as given below.
>
>     date         value1  value2  value3
>     30-Nov-2010     100      40      61
>     25-Nov-2010     108      31      88
>     14-Sep-2010      11     180      56
>
> I want the following output
>
>     date         name    amount
>     30-Nov-2010  value1     100
>     30-Nov-2010  value2      40
>     30-Nov-2010  value3      61
>     25-Nov-2010  value1     108
>     25-Nov-2010  value2      31
>     25-Nov-2010  value3      88
>     14-Sep-2010  value1      11
>     14-Sep-2010  value2     180
>     14-Sep-2010  value3      56
>
> I have presented here a small part of large data. I tried to convert the
> data into matrix, then transpose etc. but things are not working for me.
>
> Please guide
>
> Thanking in advance
>
> Amy Milano
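For readers without the reshape package installed, the same wide-to-long rearrangement can be sketched in base R. This is an alternative to the melt() call suggested above, not a reproduction of it; the object names dat, vals and long are illustrative, and the column names name and amount follow the output requested in the question.

```r
# The sample data from the question
dat <- data.frame(date   = c("30-Nov-2010", "25-Nov-2010", "14-Sep-2010"),
                  value1 = c(100, 108, 11),
                  value2 = c(40, 31, 180),
                  value3 = c(61, 88, 56))

vals <- c("value1", "value2", "value3")

# Repeat each date once per value column, cycle the column names, and read
# the values row by row (t() makes each original row a column, and
# as.vector() then unrolls column by column)
long <- data.frame(date   = rep(dat$date, each = length(vals)),
                   name   = rep(vals, times = nrow(dat)),
                   amount = as.vector(t(as.matrix(dat[vals]))))
print(long)
```

The result is ordered by date first, as in the requested output; melt() from reshape orders its result by variable instead, so a sort may be needed there to get the same row order.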
Re: [R] Increase R precision
Hi,

It is not a problem of precision but a problem of display.

    options(digits=15)
    (18-46)/(45-93)
    [1] 0.583333333333333

Alain

On 27-Oct-10 13:49, Alaios wrote:
> Hello everyone.
> When I execute the following in R
>
>     (18-46)/(45-93)
>     [1] 0.5833333
>
> I get small precision for what I am trying to deal with. Is it possible
> to increase the precision for this and for other operations?
>
> For example openoffice calc for this operation returns 0.5830
>
> I would like to thank you for your help
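The distinction between stored precision and displayed digits can be checked directly. This is a small sketch using base R only; the variable names are illustrative.

```r
x <- (18 - 46) / (45 - 93)

# Default printing shows 7 significant digits; the stored value is unchanged
print(x)

# Ask for more digits for one call only
print(x, digits = 15)

# Or format the number as text with a fixed number of decimals
s <- sprintf("%.10f", x)
print(s)

# Changing the global option affects all later printing; restore it afterwards
old <- options(digits = 15)
print(x)
options(old)

# The stored double agrees with 28/48 exactly, whatever the display setting
same <- identical(x, 28 / 48)
```

options(digits = ) accepts values up to 22, but a double carries only about 15 to 17 significant decimal digits, so anything printed beyond that is not meaningful.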
Re: [R] Increase R precision
As everybody told you: use options() with digits... That is exactly what I did in the code I sent.

Alain

On 27-Oct-10 14:21, Alaios wrote:
> So? Do you imply that I do not need to change the precision? And if so, how do I change the default display settings?
>
> Best regards
> Alex
>
> From: Alain Guillet alain.guil...@uclouvain.be
> To: Alaios ala...@yahoo.com
> Cc: Rhelp r-help@r-project.org
> Sent: Wed, October 27, 2010 1:58:46 PM
> Subject: Re: [R] Increase R precision
>
>> Hi,
>> It is not a problem of precision but a problem of display.
>>
>>     options(digits=15)
>>     (18-46)/(45-93)
>>     [1] 0.583333333333333
>>
>> Alain
Re: [R] MATLAB vrs. R
Hi,

The first argument of myquadrature in the call that computes result should not be val but f, I guess. At least it works for me:

    result=myquadrature(f,0,2000)
    print(result)
    [1] 30000

Regards,
Alain

On 11-Oct-10 09:37, Craig O'Connell wrote:
> Thank you Peter. That is very helpful. If you don't mind, I continued running the code to try to get my answer and I keep getting inf inf inf... (printed around 100 times). Any assistance with this issue would be appreciated. Here is my code (including your corrections):
>
> myquadrature<-function(f,a,b){
> npts=length(f)
> nint=npts-1
> if(npts<=1)
> error('need at least two points to integrate')
> end;
> if(b<=a)
> error('something wrong with the interval, b should be greater than a')
> else
> dx=b/real(nint)
> end;
> npts=length(f)
> int=0
> int<- sum(f[-npts]+f[-1])/2*dx
> }
>
> #Call my quadrature
> x=seq(0,2000,10)
> h = 10.*(cos(((2*pi)/2000)*(x-mean(x)))+1)
> u = 1.*(cos(((2*pi)/2000)*(x-mean(x)))+1)
> a = x[1]
> b = x[length(x)]
> plot(x,-h)
> a = x[1];
> b = x[length(x)];
> #call your quadrature function. Hint, the answer should be 30000.
> f=u*h;
> val = myquadrature(f,a,b);
> # This is where the issue arises.
> result=myquadrature(val,0,2000)
> print(result)
>
> Thanks again,
> Phil
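The quoted function still carries some MATLAB idioms (error, end). A minimal R sketch of the same trapezoidal rule is given below; it assumes equally spaced samples, uses stop() for the checks, and takes the spacing as (b - a) divided by the number of intervals. It is an illustration, not the code that was actually run in the thread.

```r
# Trapezoidal rule for equally spaced samples f taken on [a, b]
myquadrature <- function(f, a, b) {
  npts <- length(f)
  if (npts <= 1) stop("need at least two points to integrate")
  if (b <= a) stop("something wrong with the interval, b should be greater than a")
  dx <- (b - a) / (npts - 1)        # width of one interval (assumed uniform)
  sum(f[-npts] + f[-1]) / 2 * dx    # sum of the trapezoid areas
}

x <- seq(0, 2000, 10)
h <- 10 * (cos(((2 * pi) / 2000) * (x - mean(x))) + 1)
u <- 1 * (cos(((2 * pi) / 2000) * (x - mean(x))) + 1)
f <- u * h

# Pass the sampled function values, not a previous result
result <- myquadrature(f, x[1], x[length(x)])
print(result)
```

Calling the function a second time on its own output, as in the question, integrates a single number and therefore cannot give the expected answer.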
Re: [R] Kolmogorov Smirnov p-values
Hi,

Are you sure you don't want to do

    ks.test(y, punif, min=0, max=1, alternative="greater")

instead of what you tried?

Alain

On 02-Sep-10 15:52, Samsiddhi Bhattacharjee wrote:
> ks.test(y, runif, min=0, max=1, alternative="greater")
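The difference between the two calls is that punif is the cumulative distribution function of the uniform distribution, which is what a one-sample ks.test() compares the data against, whereas runif generates random numbers. A short sketch with simulated data (the sample y and the seed are arbitrary):

```r
set.seed(42)
y <- runif(50)   # simulated sample, for illustration only

# One-sample test against the Uniform(0, 1) distribution function
res <- ks.test(y, "punif", min = 0, max = 1, alternative = "greater")
print(res$statistic)
print(res$p.value)
```

The distribution can be given either as the function punif or as the character string "punif".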
Re: [R] Determining the length of unique items in a vector
Hi,

You can try

    sapply(levels(as.factor(dat1)),nchar)

Alain

On 20-Aug-10 12:01, Ron Michael wrote:
> Dear all, let us suppose I have the following vector:
>
> dat1<- c(rep("asd", 5), rep("xyz", 12), rep("erd", 17))
> dat1<- dat1[sample(1:length(dat1), length(dat1), replace=F)]
> dat1
>  [1] "erd" "xyz" "erd" "asd" "asd" "erd" "xyz" "asd" "erd" "erd" "asd" "xyz" "erd" "asd" "xyz" "xyz" "erd" "xyz" "erd"
> [20] "erd" "erd" "xyz" "xyz" "erd" "erd" "erd" "erd" "xyz" "xyz" "xyz" "erd" "xyz" "erd" "erd"
>
> Here I want to know the number of replications of each unique item, viz. "asd", "xyz", and "erd". Is there any R function available to directly implement that?
>
> Thanks,
Re: [R] Determining the length of unique items in a vector
Hi Ivan,

Now that I have read the other answers, I also think I misunderstood the question... The good thing is that one of us certainly gave the right answer to the question ;-)

Alain

On 20-Aug-10 12:37, Ivan Calandra wrote:
> Hi,
>
> I thought you were looking for table(), but the other answers gave you something really different; I might have misunderstood your question.
>
> HTH,
> Ivan
>
> On 8/20/2010 12:32, Alain Guillet wrote:
>> Hi,
>> You can try
>>     sapply(levels(as.factor(dat1)),nchar)
>> Alain
>>
>> On 20-Aug-10 12:01, Ron Michael wrote:
>>> Here I want to know the number of replications of each unique item, viz. "asd", "xyz", and "erd". Is there any R function available to directly implement that?
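The two readings of the question discussed in this thread can be compared side by side. The sketch below uses the vector from the question (without the shuffling step): table() counts how often each item occurs, while the sapply() call returns the number of characters in each distinct item.

```r
dat1 <- c(rep("asd", 5), rep("xyz", 12), rep("erd", 17))

# Number of occurrences of each unique item
counts <- table(dat1)
print(counts)
counts[["xyz"]]   # count for a single item

# Number of characters in each unique item (the other reading)
name_lengths <- sapply(levels(as.factor(dat1)), nchar)
print(name_lengths)
```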
Re: [R] Where the data file is stored?
Hi,

You can find your current working directory with the getwd() function.

Alain

On 12-Aug-10 11:22, Stephen Liu wrote:
> ----- Original Message -----
> From: Philipp Pagel p.pa...@wzw.tum.de
> To: r-help@r-project.org
> Sent: Thu, August 12, 2010 3:54:53 PM
> Subject: Re: [R] Where the data file is stored?
>
>> You don't tell us what you did to create a datafile - to me it sounds like you created an object (probably a data frame) in your R workspace. If that's the case, it is stored in a file called .RData in your current working directory (together with the other variables in your workspace). If that is not what you did, please give us more information.
>
> Hi Philipp,
>
> Yes, it is a data frame. I have run the step write.csv ...
>
> Other advice noted. Thanks
>
> B.R.
> Stephen L
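A short sketch of how the working directory relates to files written with write.csv(): a relative file name is resolved against getwd(), while a full path goes wherever it points. The data frame df and the file name example.csv are made up for the illustration, and the file is written to a temporary directory so that nothing is left behind in the working directory.

```r
df <- data.frame(id = 1:3, value = c(2.5, 3.1, 4.8))

# Directory that relative file names are resolved against
print(getwd())

# Writing with an explicit path makes the location unambiguous
out_file <- file.path(tempdir(), "example.csv")
write.csv(df, out_file, row.names = FALSE)

print(out_file)
print(file.exists(out_file))
```

Calling write.csv(df, "example.csv") instead would create the file inside the directory reported by getwd().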
Re: [R] Difference Between R: wilcox.test and STATA: signrank
Hi,

Look at the output of the test made in R and you can see it is a Wilcoxon rank sum test and not a Wilcoxon signed rank test.

If there are ties, I prefer wilcox.exact from the exactRankTests package.

Alain

On 09-Aug-10 12:43, Capasia wrote:
> This is my first post to the mailing list and I guess it's a pretty stupid question, but I can't figure it out. I hope this is the right forum for this kind of question.
>
> Before I started using R I was using STATA to run a Wilcoxon signed-rank test on two variables. See data below:
>
> https://spreadsheets.google.com/pub?key=0ApodAA2GAEP_dDZkdzZHSFBqX1JHOWJBX1dMQUZCVkEhl=enoutput=html
>
> STATA Output:
>
> . signrank x=y
>
> Wilcoxon signed-rank test
>
>         sign |      obs   sum ranks    expected
> -------------+---------------------------------
>     positive |       41        3101      2330.5
>     negative |       18        1560      2330.5
>         zero |       49        1225        1225
> -------------+---------------------------------
>          all |      108        5886        5886
>
> unadjusted variance   106438.50
> adjustment for ties     -282.38
> adjustment for zeros  -10106.25
>                       ---------
> adjusted variance      96049.88
>
> Ho: transfer_2_a = transfer_2_b
>              z =   2.486
>     Prob > |z| =   *0.0129*
>
> When running a Wilcoxon signed-rank test in R:
>
> wilcox.test(datablatt$x, datablatt$y)
>
>         Wilcoxon rank sum test with continuity correction
>
> data:  datablatt$x and datablatt$y
> W = 7059.5, p-value = *0.09197*
> alternative hypothesis: true location shift is not equal to 0
>
> As you can see, the p-values are different (one with H0 rejected and the other one not). I tested whether it could be that the STATA one isn't paired, but this doesn't seem to be the problem. I'm dumbfounded as to what could lead to such a difference. I couldn't find any settings I have missed, but I guess I'm somehow using the function in the wrong way... Any ideas?
>
> Thanks a lot in advance!
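The point of the reply can be checked with the paired argument of wilcox.test(): without it, R runs the two-sample rank sum test; with paired = TRUE it runs the signed rank test on the differences, which corresponds to what signrank does. The data below are simulated for illustration only and are not the poster's data.

```r
set.seed(123)
x <- rnorm(30, mean = 0.3)
y <- x + rnorm(30, mean = -0.2, sd = 0.5)   # paired with x by construction

unpaired <- wilcox.test(x, y)                # rank sum test (two independent samples)
paired   <- wilcox.test(x, y, paired = TRUE) # signed rank test on x - y

print(unpaired$method)
print(paired$method)
print(c(unpaired = unpaired$p.value, paired = paired$p.value))
```

The method line of the printed result is the quickest way to see which of the two tests was actually run.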
Re: [R] Substring of a character column
Hi,

    a <- c("ID=NM_182905.1;Name=NM_182905;Alias=FLJ00038;Note=hypothetical protein LOC375690",
           "ID=NM_001005484;Alias=OR4F5;Note=olfactory receptor%2C family 4%2C subfamily F",
           "ID=NM_001005224.1;Name=NM_001005224;Alias=OR4F3;Note=olfactory receptor%2C family 4%2C subfamily F")

    fonction <- function(data,string) {
      # split each record on ";" and keep the fields that contain the requested key
      liste <- strsplit(data,";")
      return(lapply(liste,function(x) grep(string,x,value=TRUE)))
    }

    fonction(a,"ID=")
    fonction(a,"Alias=")

HTH,
Alain

On 04-Aug-10 12:00, LogLord wrote:
> Hi,
>
> I have a dataframe with a rather complicated descriptive column (V9):
>
> test3[(1:3),]
>    V1     V4     V5
> 10  1   4559   7173
> 17  1  58954  59871
> 19  1 357522 358458
>    V9
> 10 ID=NM_182905.1;Name=NM_182905;Alias=FLJ00038;Note=hypothetical protein LOC375690
> 17 ID=NM_001005484;Alias=OR4F5;Note=olfactory receptor%2C family 4%2C subfamily F
> 19 ID=NM_001005224.1;Name=NM_001005224;Alias=OR4F3;Note=olfactory receptor%2C family 4%2C subfamily F
>
> I have problems extracting two strings from this column (V9). First I need the ID=... and second I need the Alias=..., both in separate columns. I tried it with substr(), but because of the different lengths and the lack of wildcards it did not work.
>
> Would be glad for any help!
>
> Thanks in advance.
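Since the question asks for the ID and Alias values in separate columns, a variant that returns one value per record may be convenient. The helper extract_field below is hypothetical (it is not from the thread); it splits each record on ";" and keeps the text after "key=" for the first field that starts with that key.

```r
a <- c("ID=NM_182905.1;Name=NM_182905;Alias=FLJ00038;Note=hypothetical protein LOC375690",
       "ID=NM_001005484;Alias=OR4F5;Note=olfactory receptor%2C family 4%2C subfamily F",
       "ID=NM_001005224.1;Name=NM_001005224;Alias=OR4F3;Note=olfactory receptor%2C family 4%2C subfamily F")

# Return the value of "key=..." for each record, or NA when the key is absent
extract_field <- function(x, key) {
  prefix <- paste0(key, "=")
  vapply(strsplit(x, ";", fixed = TRUE), function(parts) {
    hit <- parts[startsWith(parts, prefix)]
    if (length(hit) == 0) NA_character_ else substring(hit[1], nchar(prefix) + 1)
  }, character(1))
}

result <- data.frame(ID    = extract_field(a, "ID"),
                     Alias = extract_field(a, "Alias"),
                     stringsAsFactors = FALSE)
print(result)
```

With the poster's data frame this would be applied to the V9 column, for example extract_field(as.character(test3$V9), "ID").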
Re: [R] save plot
?jpeg

--
Alain Guillet

On 04-Aug-10 14:28, linda.s wrote:
> Can I make a group of jpeg instead of pdfs?
> Thanks.
> Linda
>
> On Wed, Aug 4, 2010 at 6:47 AM, John Kane jrkrid...@yahoo.ca wrote:
>> Yes, ?jpeg
>>
>> --- On Tue, 8/3/10, linda.s samrobertsm...@gmail.com wrote:
>>> From: linda.s samrobertsm...@gmail.com
>>> Subject: Re: [R] save plot
>>> To: gavin.simp...@ucl.ac.uk
>>> Cc: r-help@r-project.org
>>> Received: Tuesday, August 3, 2010, 5:36 PM
>>>
>>>> [I presume you addressed this to Duncan Murdoch for a good reason???]
>>>>
>>>> Open a new device before plotting, do your plotting, close the device. For example, using a PDF device via pdf():
>>>>
>>>> pdf("my_plots.pdf", height = 8, width = 8, pointsize = 10, version = "1.4", onefile = TRUE)
>>>> for(i in 1:10) {
>>>>     y <- rnorm(100)
>>>>     x <- rnorm(100)
>>>>     plot(y ~ x)
>>>> }
>>>> dev.off()
>>>
>>> Can I make a group of jpg instead of pdfs?
>>> Thanks.
>>> Linda
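A sketch of the jpeg() counterpart of the quoted pdf() loop. A JPEG file holds a single image, so one device is opened and closed per plot. The file names and the temporary output directory are assumptions made for the example, and the loop is skipped when the R build has no JPEG support.

```r
out_dir <- tempdir()
files <- file.path(out_dir, sprintf("my_plot_%02d.jpg", 1:3))

if (capabilities("jpeg")) {
  for (f in files) {
    jpeg(f, width = 480, height = 480)   # open one device per file
    y <- rnorm(100)
    x <- rnorm(100)
    plot(y ~ x)
    dev.off()                            # close it so the file is written
  }
}

print(files)
```

According to ?jpeg, a file name containing a pattern such as "my_plot_%02d.jpg" can also be passed to a single jpeg() call, in which case each new plot goes to the next numbered file.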
Re: [R] Plot of a subset of a data.frame()
Hello,

This is completely normal. I advise you to read the manual "An Introduction to R" on the CRAN website. For example, you can find (section 12.1.1):

12.1.1 The plot() function

One of the most frequently used plotting functions in R is the plot() function. This is a generic function: the type of plot produced is dependent on the type or class of the first argument.

plot(x, y)
plot(xy)
    If x and y are vectors, plot(x, y) produces a scatterplot of y against x. The same effect can be produced by supplying one argument (second form) as either a list containing two elements x and y or a two-column matrix.

plot(x)
    If x is a time series, this produces a time-series plot. If x is a numeric vector, it produces a plot of the values in the vector against their index in the vector. If x is a complex vector, it produces a plot of imaginary versus real parts of the vector elements.

plot(f)
plot(f, y)
    f is a factor object, y is a numeric vector. The first form generates a bar plot of f; the second form produces boxplots of y for each level of f.

plot(df)
plot(~ expr)
plot(y ~ expr)
    df is a data frame, y is any object, expr is a list of object names separated by `+' (e.g., a + b + c). The first two forms produce distributional plots of the variables in a data frame (first form) or of a number of named objects (second form). The third form plots y against every object named in expr.

Alain

On 26-Jul-10 13:38, Steffen Uhlig wrote:
> Hello,
>
> my data.frame is sort of a collection of process values, i.e. a huge run-chart. It consists of a time-stamp in the first column (date as string), factors in the following columns (used for subset-filtering), and some process-data columns.
>
> Hereafter, two examples are listed, showing the problems that occur during plotting. First the example that works fine:
>
> a = c(1:10)                    # create a vector of integers
> b = rep(c("a","b"),5)          # create a vector of chars, used as factor-levels
> d = rnorm(10)                  # some random numbers
> e = data.frame(a,b,d)          # connect to a data.frame
> e.1 = subset(e, b=="a")        # create two subsets
> e.2 = subset(e, b=="b")
> plot(d~a, e.1, pch=3, col=2)   # plot first data-subset
> points(d~a, e.2, pch=4, col=3) # plot the 2nd one
>
> All looks fine in these plots. However, when the content of vector a is changed to a set of strings, the following happens:
>
> a = c("a","b","c","d","e","f","g","h","i","j")
> e = data.frame(a,b,d)          # re-build data.frame
> e.1 = subset(e, b=="a")        # create two subsets
> e.2 = subset(e, b=="b")
> plot(d~a, e.1, pch=3, col=2)
> points(d~a, e.2, pch=4, col=3)
>
> The plot command produces horizontal lines instead of dots. This seems to happen when the x-axis contains strings rather than numbers. Is there a way out?
>
> Best regards,
> /Steffen
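One possible way out, sketched under the assumption that points are wanted rather than boxplots: plot against the integer codes of the factor and label the axis with the factor levels. The column pos is introduced only for this example.

```r
set.seed(1)
e <- data.frame(a = letters[1:10],
                b = rep(c("a", "b"), 5),
                d = rnorm(10),
                stringsAsFactors = TRUE)

# Integer position of each level; plotting a number gives points, not boxplots
e$pos <- as.integer(e$a)

e.1 <- subset(e, b == "a")
e.2 <- subset(e, b == "b")

plot(d ~ pos, e.1, pch = 3, col = 2,
     xlim = range(e$pos), ylim = range(e$d),
     xaxt = "n", xlab = "a")               # suppress the numeric axis
points(d ~ pos, e.2, pch = 4, col = 3)
axis(1, at = seq_along(levels(e$a)), labels = levels(e$a))
```

The horizontal lines in the original plot are boxplots drawn from a single observation per level, as described in the quoted manual section for plot(f, y).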
Re: [R] Concatenate a mix of numbers and letters to create a vector name
Hi,

    assign(paste(c("tmax.", 1950, 12), collapse=""), 1:10)

does what you want.

Alain

On 26-Jul-10 16:23, Panos Hadjinicolaou wrote:
> Thanks for the reply. Indeed the paste function results in concatenation:
>
> paste(c("tmax.", 1950, 12), collapse="")
> [1] "tmax.195012"
>
> but I am looking for a way to subsequently get rid of the " " in order to use tmax.195012 as an object (e.g. to define a vector with that name). Any ideas?
>
> Thanks,
> Panos
>
> From: Dimitris Rizopoulos [mailto:d.rizopou...@erasmusmc.nl]
> To: Panos Hadjinicolaou [mailto:p.hadjinicol...@cyi.ac.cy]
> Cc: r-help@r-project.org
> Sent: Mon, 26 Jul 2010 16:48:31 +0300
> Subject: Re: [R] Concatenate a mix of numbers and letters to create a vector name
>
>> have a look at function paste(), i.e., ?paste
>>
>> I hope it helps.
>>
>> Best,
>> Dimitris
>>
>> On 7/26/2010 3:44 PM, Panos Hadjinicolaou wrote:
>>> Dear all,
>>>
>>> I am trying to create a vector name, for example tmax.195012, from "tmax.", 1950 and 12. Obviously I don't wish to simply type it, because the 3 name components change in each iteration within a loop. Is there any way of concatenating those 3 components (which are a mixture of numbers and letters)?
>>>
>>> Thanks for reading,
>>> Panos
>>>
>>> Dr Panos Hadjinicolaou
>>> Energy Environment Water Research Center (EEWRC)
>>> The Cyprus Institute
>>
>> --
>> Dimitris Rizopoulos
>> Assistant Professor
>> Department of Biostatistics
>> Erasmus University Medical Center
>> Address: PO Box 2040, 3000 CA Rotterdam, the Netherlands
>> Tel: +31/(0)10/7043478
>> Fax: +31/(0)10/7043014
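A short sketch of how assign() is used inside a loop, together with get() to read the object back. The named-list variant at the end is an alternative that is not from the thread; it keeps all the generated vectors in one object instead of creating many variables. The loop range and the values are made up.

```r
year <- 1950

# Create one object per month, named tmax.<year><month>
for (month in 10:12) {
  obj_name <- paste0("tmax.", year, month)
  assign(obj_name, 1:10)
}

# Read an object back from its constructed name
x <- get(paste0("tmax.", 1950, 12))
print(x)

# Alternative: a named list avoids creating variables dynamically
tmax <- list()
for (month in 10:12) {
  tmax[[paste0("tmax.", year, month)]] <- 1:10
}
print(names(tmax))
```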
Re: [R] how to write legend of a plot
Hi,

There is no way to obtain an automatic legend with plot(). The choice you have is to draw the legend yourself or to use another graphics package, but that is not necessarily easier... For example, the code for the ggplot2 package:

    library(ggplot2)
    library(reshape)   # provides melt()
    donnees <- data.frame(x=x,y=y,y1=y1)
    melt(donnees,id="x") -> donnees.m
    qplot(x,value,data=donnees.m,colour=variable,geom=c("point","line"))

Alain

On 22-Jul-10 12:17, Yogesh Tiwari wrote:
> Dear R Users,
>
> If we issue a simple plot command in R, we don't get a legend for the plot automatically.
>
> For example, the following lines plot two curves, but there is no simple command to write a legend for these two curves. I checked ?legend but it seems a bit complicated to me. Does anyone know how to get a legend in a simple way for the following R plot?
>
> Thanks,
> Yogesh
>
> plot (x,y, type='n', ann=FALSE)
> lines(x,y,col=1,lty="solid")
> points(x,y,pch=16)
> points (x,y1, type='n', ann=FALSE)
> lines(x,y1,col=1,lty="solid")
> points(x,y1,pch=21)
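Drawing the legend by hand in base graphics takes a single legend() call in which the labels, point symbols and line types repeat what was used for the curves. The data below are made up; the plotting calls follow the ones in the question.

```r
x  <- 1:10
y  <- x^2
y1 <- 10 * x

plot(x, y, type = "n", ann = FALSE)
lines(x, y, col = 1, lty = "solid")
points(x, y, pch = 16)
lines(x, y1, col = 1, lty = "solid")
points(x, y1, pch = 21)

# One entry per curve: label, symbol and line type must match the calls above
lg <- legend("topleft",
             legend = c("y", "y1"),
             pch = c(16, 21),
             lty = "solid",
             col = 1)
```

The first argument can be a keyword such as "topleft" or "bottomright", or explicit x and y coordinates.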
Re: [R] par(uin) ?
My question is probably stupid, but why don't you use the text() function?

    plot(1:10,type="n")
    text(4,4,"{")
    text(6,6,"{",cex=3) # if you want it bigger

Alain

On 19-Jul-10 17:20, Michael Friendly wrote:
> I inherited a function written either for an older version of R or SPlus to draw a brace, {, in a graph. It uses par("uin") to determine the scaling of the quarter circles that make up segments of the brace, but that setting doesn't exist in current R.
> I'm guessing that, in the function below, ux, uy can be defined from par("usr") and par("pin"), but maybe someone remembers what par("uin") was supposed to refer to.
>
> brace <- function (x1 = 0, y1 = 0, x2 = 0, y2 = 1, right = TRUE, rad = 0.2)
> {
>     uin <- par("uin")
>     ux <- uin[1]
>     uy <- uin[2]
>     dx <- x2 - x1
>     dy <- y2 - y1
>     alpha <- atan(ux * dx, uy * dy)
>     scale <- sqrt((ux * dx)^2 + (uy * dy)^2)
>     if (scale > 5 * rad)
>         rad <- rad/scale
>     qcirc <- cbind(cos((0:10) * pi/20), sin((0:10) * pi/20))
>     qcircr <- cbind(cos((10:0) * pi/20), sin((10:0) * pi/20))
>     rot <- function(theta) t(cbind(c(cos(theta), sin(theta)),
>         c(-sin(theta), cos(theta))))
>     seg1 <- t(t(rad * qcirc %*% rot(-pi/2)) + c(0, rad))
>     seg4 <- t(t(rad * qcirc) + c(0, 1 - rad))
>     seg3 <- t(t((rad * qcircr) %*% rot(pi)) + c(2 * rad, 0.5 + rad))
>     seg2 <- t(t((rad * qcircr) %*% rot(pi/2)) + c(2 * rad, 0.5 - rad))
>     bra <- rbind(seg1, seg2, seg3, seg4)
>     if (!right)
>         bra <- bra %*% diag(c(-1, 1))
>     bra <- scale * bra %*% rot(-alpha)
>     bra <- bra %*% diag(c(1/ux, 1/uy))
>     bra <- t(t(bra) + c(x1, y1))
>     bra
> }
Re: [R] in continuation with the earlier R puzzle
I don't know what is wrong with your code, but I believe you should use ifelse instead of a for loop:

    s <- ifelse(news1o > s2o, 1, -1)

Alain

On 12-Jul-10 16:09, Raghu wrote:
> When I just run a for loop it works. But if I am going to run a for loop every time for large vectors, I might as well use C or any other language. Isn't the reason R is powerful that it can handle large vectors without each element being manipulated? Please let me know where I am wrong.
>
> for(i in 1:length(news1o)){
>     if(news1o[i]>s2o[i])
>         s[i]<-1
>     else
>         s[i]<--1
> }
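A small sketch of the vectorised form, using the first few values shown by str() in the related thread. ifelse() evaluates the comparison for every element and picks 1 or -1 accordingly, so no explicit loop or preallocated s is needed.

```r
news1o <- c(891, 890, 890, 888, 886)
s2o    <- c(895, 892, 890, 888, 885)

# The comparison itself is already vectorised
print(news1o > s2o)

# Element-wise choice between the two values
s <- ifelse(news1o > s2o, 1, -1)
print(s)
```

Elements where the two vectors are equal get -1 here, exactly as in the for loop from the question.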
Re: [R] a small puzzle?
In an if statement, the condition must be a single element. In your example, news1o and s2o are vectors, so there is a warning saying that the condition has a length greater than one.

If you didn't send two messages about the same problem within two minutes, you could see what people answer you... For example, I advised you to use ifelse, which works on vectors.

Alain

On 12-Jul-10 16:02, Raghu wrote:
> I know the following may sound too basic, but I thought the mailing list is for the benefit of people of all levels. I ran a simple if statement on two numeric vectors (news1o and s2o) which are of equal length. I have done an str on both of them for your kind perusal below. I am trying to compare the numbers in both and initialise a new vector s as 1 or -1 depending on whether the elements in the arrays are greater or lesser than each other.
> When I do a simple s=(news1o>s2o) I get the values of s as a string of TRUEs and FALSEs, but when I try to override using the if statement, it complains. I get only one element in s and that is a puzzle. Any ideas on this please? Many thanks.
>
> if(news1o>s2o)(s<-1) else
>   (s<--1)
> [1] -1
> Warning message:
> In if (news1o > s2o) (s <- 1) else (s <- -1) :
>   the condition has length > 1 and only the first element will be used
> s
> [1] -1
> length(s)
> [1] 1
> str(news1o)
>  num [1:3588] 891 890 890 888 886 ...
> str(s2o)
>  num [1:3588] 895 892 890 888 885 ...
Re: [R] Problem with the recode function
Dear John,

Thanks a lot for the time you spent on my problem. I don't believe you can do anything to avoid this kind of problem. I don't know if it is technically possible, but I wonder whether, when Rcmdr plug-ins are loaded from the Rcmdr menu, Rcmdr could detach the packages it uses during its restart and load them again after the packages used by the plug-ins. That would at least avoid breaking Rcmdr (by breaking I mean preventing the use of functions Rcmdr relies on, like recode from car in my example).

Regards,
Alain

On 15-Jun-10 21:37, John Fox wrote:
> Dear Alain,
>
>> -----Original Message-----
>> From: Alain Guillet [mailto:alain.guil...@uclouvain.be]
>> Sent: June-15-10 12:25 PM
>> To: John Fox
>> Cc: r-help@r-project.org
>> Subject: Re: [R] Problem with the recode function
>>
>> I found out what the problem is: when I start R Commander, some plug-ins are automatically loaded and it seems that the problem comes from RcmdrPlugin.Export, more precisely from the Hmisc package (the plug-in depends on it), which contains a recode() function too, with the following documentation:
>
> That makes sense of the problem, but I'm not sure what I can do about it -- that is, there's always the possibility that someone will load a package that shadows a function in another package. I'll think some more about the problem.
>
> Best,
> John
>
>> Hmisc-internal          package:Hmisc          R Documentation
>>
>> Internal Hmisc functions
>>
>> Description:
>>      Internal Hmisc functions.
>>
>> Details:
>>      These are not to be called by the user or are undocumented.
>>
>> Alain
>>
>> On 15-Jun-10 17:53, John Fox wrote:
>>> Dear Alain,
>>>
>>> I'm afraid that I can't duplicate your problem. First, there is no recode function in the Rcmdr package; it uses recode from car. Here's a record of my Rcmdr session, using the recode dialog to generate the recode command:
>>>
>>> test$variable <- recode(test$x, '1:5=0; else=1; ', as.factor.result=TRUE)
>>> test # entered in script window
>>>     x variable
>>> 1   1        0
>>> 2   2        0
>>> 3   3        0
>>> 4   4        0
>>> 5   5        0
>>> 6   6        1
>>> 7   7        1
>>> 8   8        1
>>> 9   9        1
>>> 10 10        1
>>>
>>> I noticed that you set as.factor.result=TRUE for one command and FALSE for the other, but both work for me. It occurred to me that you may have entered the recode command in the script window and executed it from there, but that works for me too.
>>>
>>> Best,
>>> John
>>>
>>> John Fox
>>> Senator William McMaster Professor of Social Statistics
>>> Department of Sociology
>>> McMaster University
>>> Hamilton, Ontario, Canada
>>> web: socserv.mcmaster.ca/jfox
[R] Problem with the recode function
Hello,

I am using the recode() function in Rcmdr and the result is not what I expect, so I am almost sure I did something wrong, but what...

    test <- data.frame(x=1:10)
    library(car)
    recode(test$x,'1:5=0 ; else=1', as.factor.result=TRUE)
     [1] 0 0 0 0 0 1 1 1 1 1
    Levels: 0 1

BUT

    library(Rcmdr) # recode from the car package is now masked

Now I recode test$x through the Rcmdr interface and I get the following code:

    test$variable <- recode(test$x, '1:5 = 0; else = 1; ', as.factor.result=FALSE)

and a vector of NA as the result:

    test$variable
     [1] NA NA NA NA NA NA NA NA NA NA

I am using R 2.11.1 with Rcmdr 1.5-5 on Windows Vista.

Regards,
Alain
Re: [R] Problem with the recode function
I found out what the problem is: when I start R Commander, some plug-ins are loaded automatically, and the problem seems to come from RcmdrPlugin.Export, more precisely from the Hmisc package (on which the plug-in depends). Hmisc also contains a recode() function, documented as follows:

    Hmisc-internal          package:Hmisc          R Documentation

    Internal Hmisc functions

    Description:
         Internal Hmisc functions.

    Details:
         These are not to be called by the user or are undocumented.

Alain

On 15-Jun-10 17:53, John Fox wrote:
> Dear Alain,
>
> I'm afraid that I can't duplicate your problem. First, there is no recode function in the Rcmdr package; it uses recode from car.
>
> Here's a record of my Rcmdr session, using the recode dialog to generate the recode command:
>
>     test$variable <- recode(test$x, '1:5=0; else=1; ', as.factor.result=TRUE)
>     test  # entered in script window
>         x variable
>     1   1        0
>     2   2        0
>     3   3        0
>     4   4        0
>     5   5        0
>     6   6        1
>     7   7        1
>     8   8        1
>     9   9        1
>     10 10        1
>
> I noticed that you set as.factor.result=TRUE for one command and FALSE for the other, but both work for me. It occurred to me that you may have entered the recode command in the script window and executed it from there, but that works for me too.
>
> Best,
> John
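The masking described in this message can be reproduced in miniature with base R alone. The sketch below is only an analogy: a hypothetical global function named filter() stands in for the second recode(), and stats::filter() stands in for the function that was meant to be called. Nothing here uses car, Hmisc or Rcmdr.

```r
# Hypothetical stand-in: a global function named filter() masks stats::filter(),
# much as another package's recode() can mask the recode() from car.
filter <- function(x, ...) rep(NA, length(x))

x <- 1:5
masked   <- filter(x, rep(1/3, 3))          # picks up the masking definition
explicit <- stats::filter(x, rep(1/3, 3))   # pkg::name bypasses the search path

# find() lists every attached location that defines the name
where <- find("filter")
```

By the same mechanism, writing the call as car::recode(...) should select the car version regardless of what else is attached, assuming car is installed.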
Re: [R] NAs are not allowed in subscripted assignments
Maybe you can drop the [i] in your code:

    for (i in 1:6) {new[new[i]>5.5] <- NA}
    new
    [1]  5  5  5  5 NA

Alain

On 09-Apr-10 11:23, Paul Chatfield wrote:
> I'm trying to assign NAs to values that satisfy certain conditions (more complex than shown below) and it gives the right result, but breaks the loop having done the first one, viz:
>
>     new <- c(rep(5,4),6)
>     for (i in 1:6) {new[new[i]>5.5][i] <- NA}
>
> gives the correct result, though an error message appears which causes a break if it's in a loop. If I can get rid of the error message and get the loop to continue, this should work fine. I'm sure I'm missing a simple solution, but can't seem to see it.
>
> Any help, as always, greatly appreciated,
> Paul
Re: [R] terminating function
Hi,

Look at the function stop(), which does what you want:

    ?stop

Alain

On 09-Apr-10 11:27, Covelli Paolo wrote:
> Hi everyone,
>
> I'm building a function that, partway through, checks the sign of a variable x. If x < 0, the function writes a warning (Error: negative value!). At this point I want the function to stop without executing the remaining code. How can I terminate the function before it reaches its end?
>
> Thanks in advance.
> Paolo
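As a minimal sketch of the suggestion above: the function name safe_sqrt and its computation are hypothetical, chosen only to show where stop() sits; the message text comes from the question.

```r
# Hypothetical function: the name and the computation are illustrative only.
safe_sqrt <- function(x) {
  if (x < 0) {
    # stop() signals an error and leaves the function immediately
    stop("negative value!")
  }
  sqrt(x)  # reached only when x >= 0
}

ok <- safe_sqrt(9)

# tryCatch() lets the caller intercept the error instead of halting the script
msg <- tryCatch(safe_sqrt(-1), error = function(e) conditionMessage(e))
```

If the function should give up quietly rather than raise an error, return() at the same place would be the usual alternative.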
Re: [R] NAs are not allowed in subscripted assignments
Sorry, I forgot to add that you don't need the for loop:

    new[new>5.5] <- NA
    new
    [1]  5  5  5  5 NA

Alain

On 09-Apr-10 11:23, Paul Chatfield wrote:
>     new <- c(rep(5,4),6)
>     for (i in 1:6) {new[new[i]>5.5][i] <- NA}
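The loop-free form works because a logical vector can be used directly as an index. A short sketch, using the vector from the thread; the second vector, other, is a hypothetical extra case added to show what happens when NA is already present.

```r
new <- c(rep(5, 4), 6)
too_big <- new > 5.5      # logical vector: FALSE FALSE FALSE FALSE TRUE
new[too_big] <- NA        # assign NA wherever the condition holds

# which() drops NA comparisons from the index, which avoids the
# "NAs are not allowed in subscripted assignments" error that can arise
# when the replacement has more than one value.
other <- c(5, NA, 7)
other[which(other > 5.5)] <- 0
```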
Re: [R] Summing with NA
Hi,

See help(sum) for more information about the na.rm option:

    sum(c(z,e,k), na.rm=TRUE)
    [1] -23

Alain

On 24-Mar-10 17:21, Muhammad Rahiz wrote:
> Slightly longer method, but works as well.
>
>     z <- c(-12,-9)
>     e <- c(-2,0)
>     k <- c(NA,NA)
>     x <- c(z,e,k)
>     x1 <- which(x!="NA", arr.ind=TRUE)  # get elements which are not NA
>     x2 <- x[x1]
>     sum(x2)
>     [1] -23
>
> Muhammad
>
> tj wrote:
>> Hi all,
>> May I request your help, if you have time and an idea of how to do this? I want to add three vectors, and my goal is to obtain the sum of the vectors, ignoring the vector of NA. Here is what I did in R. I'm adding the three vectors e, z and k, and my objective is to get an answer of -23. I tried using na.omit but it did not work. Thanks.
>>
>>     z
>>     [1] -12  -9
>>     e
>>     [1] -2  0
>>     k
>>     [1] NA NA
>>     sum(z+e+k)
>>     [1] NA
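The difference between the two calls in this thread comes down to when the NA values enter the calculation. A short sketch with the vectors from the question:

```r
z <- c(-12, -9)
e <- c(-2, 0)
k <- c(NA, NA)

elementwise <- sum(z + e + k)           # NA: every z + e + k term is already NA
total <- sum(c(z, e, k), na.rm = TRUE)  # NA entries dropped before summing
same  <- sum(z, e, k, na.rm = TRUE)     # sum() also accepts several vectors
```

Adding the vectors first propagates NA into every element, so na.rm can no longer recover anything; concatenating them keeps the valid numbers separate from the missing ones.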
Re: [R] How to sum a list of matrices ?
Hi,

Look at the R Help Desk column in R News 8/1.

Alain

On 10-Mar-10 16:34, Carlos Petti wrote:
> Dear list,
>
> I have a list of three matrices:
>
>     i = list(matrix(1:4,2,2), matrix(3:6,2,2), matrix(9:12,2,2))
>
> I would like to sum the matrices, as follows:
>
>          [,1] [,2]
>     [1,]   13   19
>     [2,]   16   22
>
> I used this code:
>
>     k <- i[[1]]
>     for (j in (2:length(i))) { k <- k + i[[j]] }
>
> But is it possible to sum without a loop?
>
> Thanks in advance,
> Carlos
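One loop-free possibility in base R is Reduce(), which folds a binary function over a list. This is offered as a sketch of one option, not as a summary of what the cited article recommends.

```r
i <- list(matrix(1:4, 2, 2), matrix(3:6, 2, 2), matrix(9:12, 2, 2))

# Reduce() applies `+` pairwise: (i[[1]] + i[[2]]) + i[[3]]
total <- Reduce(`+`, i)
```

Reduce() still iterates internally, so the gain is mainly in brevity rather than speed.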
Re: [R] two questions for R beginners
partly from teaching)

The fact that this xapply stuff was not idempotent (worse: not always), and that you need a monster like do.call() to straighten this out. Nowadays, plyr comes close.

The concept of environment. With S it was worse, though.

That you cannot change values passed by reference. I noted that the latter is no problem for students who have not worked with C(++/#) before.

That there is only one return result in functions.

"[" and the like as an operator.

10 years ago, when I started, the message was: S4 is the future, S3 is legacy. So I learned S4, only to never use it in self-written code later. Might be different for BioConductor people.

That sometimes you can use vectors not in data= (lattice), and sometimes not (ggplot2). Still a VERY confusing inconsistency.

The why-does-this-not-print FAQ.

Why does par(oma..) not work with lattice?

Dieter
Re: [R] How to extract one of four plots in a linear regression model
Hi,

You can extract a single plot by using the which argument to give the number of the plot (from 1 to 6). For example:

    plot(lm.D9, which=1)

Regards,
Alain Guillet

On 25-Feb-10 16:50, FMH wrote:
> Dear All,
>
> A linear regression model can be fitted with the lm function, and the plot function can be used to check the assumptions of the model. The help page shows a few examples of suitable code for fitting such a linear model. In addition, four different plots can be produced at once with a single call to plot, as follows:
>
>     require(graphics)
>     ## Annette Dobson (1990) "An Introduction to Generalized Linear Models".
>     ## Page 9: Plant Weight Data.
>     ctl <- c(4.17,5.58,5.18,6.11,4.50,4.61,5.17,4.53,5.33,5.14)
>     trt <- c(4.81,4.17,4.41,3.59,5.87,3.83,6.03,4.89,4.32,4.69)
>     group <- gl(2,10,20, labels=c("Ctl","Trt"))
>     weight <- c(ctl, trt)
>     anova(lm.D9 <- lm(weight ~ group))
>     opar <- par(mfrow = c(2,2), oma = c(0, 0, 1.1, 0))
>     plot(lm.D9, las = 1)
>
> The plot function gives four different plots simultaneously, but I need only some of them, for instance the normality plot. Could someone suggest a way to extract this single plot? I need to copy only this plot and paste it into a Word document.
>
> Thanks
> Fir
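For the normality plot mentioned in the question, the relevant value is which = 2 (the normal Q-Q plot of the residuals). The sketch below writes that single plot to a file; the temporary PDF path is a placeholder, and pdf() is used only because it is available in every R installation. A device such as png() would be used the same way.

```r
ctl <- c(4.17, 5.58, 5.18, 6.11, 4.50, 4.61, 5.17, 4.53, 5.33, 5.14)
trt <- c(4.81, 4.17, 4.41, 3.59, 5.87, 3.83, 6.03, 4.89, 4.32, 4.69)
group <- gl(2, 10, 20, labels = c("Ctl", "Trt"))
weight <- c(ctl, trt)
lm.D9 <- lm(weight ~ group)

out_file <- tempfile(fileext = ".pdf")  # placeholder output location
pdf(out_file)
plot(lm.D9, which = 2)                  # 2 = normal Q-Q plot of the residuals
dev.off()
```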
Re: [R] how to label individuals with FactoMiner ?
Hi,

The labels you want to see on the factorial map should be the row names, so first change the row names and then run your analysis with FactoMineR. Then apply the code below, replacing res with your PCA object:

    plot.PCA(res, axes=c(1, 2), choix="ind", habillage="ind", col.ind="black", col.ind.sup="blue", col.quali="magenta", label=c("ind", "ind.sup", "quali"), title="")

Alain

On 24-Feb-10 12:33, Robert U wrote:
> Dear all,
>
> I'm trying to label specific individuals (supplementary ones) after a PCA with the FactoMineR package. There are not many details (possibilities?) in the R help of the plot.PCA function. There is indeed a label parameter, but I could only manage to label the supplementary individuals with their row.names (i.e. label="ind.sup") and not with the specific names I would like them to display (gathered in a data-frame column, for example, characters or numeric...).
>
> I saw that I might resolve my problem with the ade4 package (the s.plot function or something like that), but I would like to stick to FactoMineR if there is a way to manage this label issue...
>
> Did anyone deal with that before?
>
> Thanks for your help, with regards.
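The first step of this advice, moving the wanted labels into the row names, needs only base R. The data frame below is hypothetical (column names label, v1 and v2 are invented for illustration), and the FactoMineR call itself is left out.

```r
# Hypothetical data: two measurements plus a column holding the wanted labels.
dat <- data.frame(label = c("site A", "site B", "site C"),
                  v1 = c(1.2, 3.4, 2.2),
                  v2 = c(0.5, 0.1, 0.9))

rownames(dat) <- dat$label   # labels must be unique to serve as row names
dat$label <- NULL            # keep only the numeric variables for the analysis
```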
Re: [R] Simple Function doesn't work?
Hi,

The following code works. However, if I were you I wouldn't use grid as the name of a vector, because this name is already used by R (check help(grid)); that also explains why you have to define it inside the function.

    ReturnsGrid = function(x,y,m){
      grid <- numeric(m)
      for (i in 1:m){
        grid[i] <- x + (i-1)*(y-x)/m
      }
      grid
    }
    xx=ReturnsGrid(0,9,3)

Regards,
Alain

Anastasia wrote:
> Hello,
>
> I am new to R, so I am sorry if this is a really stupid question. I wrote a simple function and for some reason it doesn't work:
>
>     ReturnsGrid = function(x,y,m){
>       for (i in 1:m){
>         grid[i] <- x + (i-1)*(y-x)/m
>       }
>       grid
>     }
>     xx=ReturnsGrid(0,9,3)
>
> Thanks a lot!
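Since arithmetic in R is vectorised, the same grid can also be computed without a loop or a pre-allocated vector. A sketch under that assumption; the name returns_grid is hypothetical and chosen to avoid reusing grid.

```r
# Hypothetical loop-free variant of the function in the thread.
returns_grid <- function(x, y, m) {
  # seq_len(m) - 1 is 0, 1, ..., m - 1, matching (i - 1) for i in 1:m
  x + (seq_len(m) - 1) * (y - x) / m
}

xx <- returns_grid(0, 9, 3)
```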
Re: [R] Switch Help
I believe this is what you want:

    aar <- function(command) {
      for (i in command) {
        cat(i, ":", switch(EXPR=i, scrn = "Screening", dx = "Diagnosis", df = "Don't Forget"), "\n")
      }
    }
    aar(c("dx","df"))
    dx : Diagnosis
    df : Don't Forget

Alain

oscar linares wrote:
> Dear Rexperts,
>
> Given,
>
>     aar <- function(command) {
>       switch(command,
>         {scrn = cat("scrn :Screening","\n")}
>         {dx = cat("dx:Diagnosis","\n")}
>         {df = cat("df:Don't Forget","\n")}
>       )
>     }
>
> I want to be able to do:
>
>     aar("dx")           # function does cat("dx:Diagnosis","\n")
>     aar(c("dx","df"))   # function does cat("dx:Diagnosis","\n")
>                         # function does df = cat("df:Don't Forget","\n")
>
> BUT IT IS NOT WORKING FOR ME. Please help :-)
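switch() handles one value at a time, which is why the answer loops over the vector. A variant that returns the labels instead of printing them could look like the sketch below; the helper name aar_labels and the NA default for unknown codes are assumptions, while the codes and labels come from the thread.

```r
# Hypothetical helper name; the codes and labels are taken from the thread.
aar_labels <- function(command) {
  vapply(command, function(code) {
    switch(code,
           scrn = "Screening",
           dx   = "Diagnosis",
           df   = "Don't Forget",
           NA_character_)        # unnamed last argument acts as the default
  }, character(1))
}

res <- aar_labels(c("dx", "df", "zz"))
```

Returning a character vector keeps the lookup separate from the printing, so the result can be passed to cat() or reused elsewhere.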
Re: [R] Odp: ^ operator
Hi,

You forgot to put the parentheses where Petr told you to: (-6.108576e-05)^(1/3), and the result is NaN. What do you want to preserve?

Alain

carol white wrote:
> but with complex, I get complex numbers for the first and last elements:
>
>     (as.complex(tmp))^(1/3)
>     [1] 0.01969170+0.03410703i 0.03478442+0.00000000i 0.03285672+0.00000000i
>     [4] 0.08950802+0.00000000i 0.05848363+0.10129661i
>
> whereas for the first element, we get the following. Moreover,
>
>     -6.108576e-05^(1/3)
>     [1] -0.03938341
>
> and
>
>     -(6.108576e-05^(1/3))
>     [1] -0.03938341
>
> and
>
>     -((6.108576e-05)^(1/3))
>     [1] -0.03938341
>
> give the same results, so using () doesn't preserve anything.
>
> --- On Mon, 11/16/09, Petr PIKAL petr.pi...@precheza.cz wrote:
>> Hi,
>>
>> AFAIK, this is an issue of operator precedence.
>>
>> r-help-boun...@r-project.org wrote on 16.11.2009 11:24:59:
>>> Hi,
>>> I want to apply the ^ operator to a vector, but it is applied correctly to some of the elements and generates NaN for others. Why is it not able to calculate -6.108576e-05^(1/3) even though it exists?
>>>
>>>     tmp
>>>     [1] -6.108576e-05  4.208762e-05  3.547092e-05  7.171101e-04 -1.600269e-03
>>>     tmp^(1/3)
>>>     [1]        NaN 0.03478442 0.03285672 0.08950802        NaN
>>
>> This computes (-a)^(1/3), which is not possible in real numbers. You have to use as.complex(tmp)^(1/3) to get a result.
>>
>>>     -6.108576e-05^(1/3)
>>>     [1] -0.03938341
>>
>> this is actually -(6.108576e-05^(1/3))
>>
>> Regards
>> Petr
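The precedence point in this thread can be checked on a rounder number, and if the aim is a real cube root with the sign of the input kept, one common workaround is to take the root of the absolute value and restore the sign. The helper cbrt below is hypothetical; whether this is what the poster wanted to "preserve" is an assumption.

```r
minus_first <- -8^(1/3)     # parsed as -(8^(1/3)): ^ binds tighter than unary minus
paren_first <- (-8)^(1/3)   # NaN: fractional power of a negative real number

# Hypothetical helper: real cube root, keeping the sign of the input.
cbrt <- function(x) sign(x) * abs(x)^(1/3)

roots <- cbrt(c(-8, 27, -6.108576e-05))
```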
Re: [R] Discriminant plot
Hello Alejo,

Please keep sending your posts to the R-help mailing list so that other people can also answer.

The class of lda_analysis is "lda", which is normal, and it is also perfectly normal to find a different type for predict(lda_analysis)$x. Moreover, the iris example for the lda() function gives me exactly the same types for the object z (of the example) and for predict(z).

When you plot lda_analysis, you use the function plot.lda, whereas you use the function plot for the predict object. As I told you in my previous e-mail, the predicted classes are not the classes in X$G3, so it is normal that the two plots are not exactly the same.

    which(predict(lda_analysis)$class != X$G3)

gives you all the observations that are predicted to be in a category different from X$G3. Look at these points and you will see that they are the only points that differ between the two plots (the coordinates are the same).

Alain

Alejo C.S. wrote:
> Hi Alain,
>
> I thought (wrongly, I see) that the predict function applied to an object of class lda returned the coordinates on the discriminant axes. When doing the same with the iris data, the original classes are the same as those returned by predict. That is not the case with my data: if you compare the original classes with those returned by predict(), they are different. I'm really confused now...
>
> Regards,
> Alejo
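The comparison described in this message can be tried on the built-in iris data, assuming the MASS package is available (it normally ships with R). This is a sketch on different data from the poster's, so the numbers will not match the thread.

```r
library(MASS)  # provides lda(); a recommended package normally shipped with R

fit  <- lda(Species ~ ., data = iris)
pred <- predict(fit, iris)   # pred$x: discriminant scores, pred$class: predictions

# Rows whose predicted class differs from the recorded class
mismatch <- which(pred$class != iris$Species)

# Cross-tabulation of recorded vs predicted classes
confusion <- table(recorded = iris$Species, predicted = pred$class)
```

Colouring a scatter plot of pred$x by iris$Species and then by pred$class should differ only at the rows listed in mismatch, since the coordinates are identical in both cases.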
Re: [R] Discriminant plot
Hi Alejo,

As far as I know, the two plots differ because in the first one a point belongs to a group according to its group in the data, whereas in the second plot a point belongs to the group predicted by the linear discriminant analysis. I hope somebody will correct me if I am wrong.

Alain

Alejo C.S. wrote:
> Hi Alain, this is the code:
>
>     library(MASS)
>     library(mda)
>     # data attached, first column G3 group membership
>     X <- read.table("data", header=T)
>     lda_analysis <- lda(formula(X), data=X)
>     plot(lda_analysis, col=palette()[X$G3])
>     # the above plot is completely different to:
>     plot(predict(lda_analysis)$x, type="n")
>     text(predict(lda_analysis)$x, labels=predict(lda_analysis)$class, col=palette()[predict(lda_analysis)$class])
>
> The above code only reproduces the first plot, using predict to obtain the coordinates and classes for the first two discriminant axes.
>
> Thanks,
> Alejo
Re: [R] plot discriminant analysis
Hi,

I did it with

    Iris <- data.frame(rbind(iris3[,,1], iris3[,,2], iris3[,,3]), Sp = rep(c("s","c","v"), rep(50,3)))
    train <- sample(1:150, 75)
    table(Iris$Sp[train])
    z <- lda(Sp ~ ., Iris, prior = c(1,1,1)/3, subset = train)

Then I did

    plot(z, xlim=c(-10,10), ylim=c(-10,10))

before drawing

    points(predict(z)$x, col=palette()[predict(z)$class], xlim=c(-10,10), ylim=c(-10,10))

and all the points are superimposed. The only difference I found was the different x- and y-axes when I drew them separately, i.e.

    plot(z)
    plot(predict(z)$x, col=palette()[predict(z)$class])

Alain

Alejo C.S. wrote:
> I'm confused about the right way to plot a discriminant analysis made by the lda function (MASS package). (I have attached my data for reproduction.) When I plot an lda object:
>
>     X <- read.table("data", header=T)
>     lda_analysis <- lda(formula(X), data=X)
>     plot(lda_analysis)
>     # the above plot is completely different to:
>     plot(predict(lda_analysis)$x, col=palette()[predict(lda_analysis)$class])
>
> Shouldn't that be the same graph as the first? In the second case, I use the predict function to obtain the LD1 and LD2 coordinates of lda_analysis (predict(lda_analysis)$x) and their respective classes (predict(lda_analysis)$class), but it seems that the classes are different:
>
>     table(X$G3, predict(lda_analysis)$class)
>          B  G  M
>       B 29  0  3
>       G  0 26  2
>       M  4  0 46
>
> Any clues?
>
> Regards,
Re: [R] principal component analysis for class variables
PCA doesn't work with class (categorical) variables, so the error is normal. You should try a discriminant factorial analysis instead (see discrimin.coa in ade4).

Alain

andreiabb wrote:
> Dear Forum,
>
> I have a class variable 1 (populations A-E), and two other class variables, variable 2 and variable 3. What I want is to see whether the combination of var 2 and var 3 gives me a pattern that allows me to distinguish the populations. I found several packages, like ade4 with its pcaiv function, and FactoMineR, but they are not working. Using the ade4 package, when I try to build the PCA:
>
>     pca1 <- dudi.pca(D, scan = FALSE, nf = 2)
>     Error in v * row.w : non-numeric argument to binary operator
>
> Does someone have suggestions?
>
> Thanks,
> Andy
[R] Batch problem
Hi,

I want to run my R program in batch mode under Windows XP. To do so, I created a bat file with the command

    RCMD BATCH --vanilla program.R program.out

and I run the bat file as a Windows XP scheduled task. Then I log off. With R-2.9.1 it works until another user logs off from the same computer, but this problem doesn't appear with R-1.9.1 on the same machine.

Is anything wrong with the syntax of my bat file?

Thanks.

Regards,
Alain
Re: [R] tapply changing order of factor levels?
Hi,

I don't believe the problem is related to tapply. I would say it is because of the factor. In fact, the order of a factor is given by the alphanumerical order of its levels. You can see it with levels(myfactor). If you want to change the order, redefine the levels of myfactor in the expected order or use the function ordered.

Alain

Chirantan Kundu wrote:
> Hi,
>
> Does tapply change the order when applied on a factor? Below is the code I tried.
>
>     mylevels <- c("IN0020020155","IN0019800021","IN0020020064")
>     mydata <- c("IN0020020155","IN0019800021","IN0020020064","IN0020020155","IN0019800021","IN0019800021","IN0020020064","IN0020020064","IN0019800021")
>     myfactor <- factor(mydata, levels=mylevels)
>     myfactor
>     [1] IN0020020155 IN0019800021 IN0020020064 IN0020020155 IN0019800021
>     [6] IN0019800021 IN0020020064 IN0020020064 IN0019800021
>     Levels: IN0020020155 IN0019800021 IN0020020064
>     summary(myfactor)
>     IN0020020155 IN0019800021 IN0020020064
>                2            4            3
>
> # Everything is fine up to this point. The order of levels is maintained as it is.
>
>     mysummary <- tapply(myfactor, mydata, length)
>     mysummary
>     IN0019800021 IN0020020064 IN0020020155
>                4            3            2
>
> # Now the order has changed.
>
> Is this the expected behavior? Any idea on how to avoid the change in order?
>
> Regards,
> Chirantan
Re: [R] tapply changing order of factor levels?
Hi,

I meant that your problem occurred because the levels in mylevels are not sorted, whereas tapply uses the sorted levels for printing. If you sort them (see below), you can see that the result of tapply has the same order as the levels of myfactor:

    mydata <- c("IN0020020155","IN0019800021","IN0020020064","IN0020020155","IN0019800021","IN0019800021","IN0020020064","IN0020020064","IN0019800021")
    mylevels <- c("IN0020020155","IN0019800021","IN0020020064")
    myfactor <- factor(mydata, levels=mylevels)
    myfactor
    [1] IN0020020155 IN0019800021 IN0020020064 IN0020020155 IN0019800021
    [6] IN0019800021 IN0020020064 IN0020020064 IN0019800021
    Levels: IN0020020155 IN0019800021 IN0020020064
    levels(myfactor) <- sort(mylevels)
    myfactor
    [1] IN0019800021 IN0020020064 IN0020020155 IN0019800021 IN0020020064
    [6] IN0020020064 IN0020020155 IN0020020155 IN0020020064
    Levels: IN0019800021 IN0020020064 IN0020020155
    tapply(myfactor, mydata, length)
    IN0019800021 IN0020020064 IN0020020155
               4            3            2

Chirantan Kundu wrote:
> Hi Alain,
>
> I tried levels(myfactor) as you suggested.
>
>     levels(myfactor)
>     [1] "IN0020020155" "IN0019800021" "IN0020020064"
>
> The order is preserved, no alphanumerical sorting done here.
>
> Regards.
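A small sketch that may help separate the two effects discussed in this thread: the order of the groups in the tapply() result follows the levels of the grouping argument, so passing the character vector and passing the factor give different orders. The data are those of the question; the variable names by_character and by_factor are hypothetical.

```r
mylevels <- c("IN0020020155", "IN0019800021", "IN0020020064")
mydata <- c("IN0020020155", "IN0019800021", "IN0020020064",
            "IN0020020155", "IN0019800021", "IN0019800021",
            "IN0020020064", "IN0020020064", "IN0019800021")
myfactor <- factor(mydata, levels = mylevels)

# INDEX is a character vector: tapply() converts it to a factor with sorted levels
by_character <- tapply(myfactor, mydata, length)

# INDEX is the factor itself: groups follow the order of levels(myfactor)
by_factor <- tapply(myfactor, myfactor, length)
```

Note that assigning to levels() relabels the existing values rather than reordering them; factor(mydata, levels = ...) is the form that changes the order while leaving the data as they are.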
Re: [R] creating a vector of sums
Look at cumsum()

Alain

Rachel Taylor wrote:

Hi,

I am trying to create a function for a goodness-of-fit test for the Pareto distribution for some loss data that I have. So far I have the following:

    function(X=OTOL) {
      n <- length(X)-1              # number of values (one extra as 0 is included)
      i <- 2:640                    # values of i
      j <- 1:639                    # values of i-1
      Y <- (n-j+1)*((X[i])-(X[j]))  # first part of GoF model
      Y
    }

where OTOL is the ordered loss data (decreasing), and Y is a vector of length 639.

What I need to do next is create another vector TY (of the same length) that is the running sum of the Y vector, so

    TY[1]=Y[1]
    TY[2]=Y[1]+Y[2]
    TY[3]=Y[1]+Y[2]+Y[3]

and so on. I have tried sum(Y[j]), but it just returns a single value.

Any help is greatly appreciated, thank you.

Rachel
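To make the cumsum() suggestion concrete, here is a minimal sketch. The values in Y are made up for illustration and merely stand in for the vector computed in the question.

```r
# Illustrative values standing in for the Y vector from the question.
Y <- c(2, 5, 1, 4)

# cumsum() returns the running totals:
# TY[1] = Y[1], TY[2] = Y[1] + Y[2], and so on.
TY <- cumsum(Y)
print(TY)
```

The result has the same length as Y, so it can be returned from the function in place of (or alongside) Y.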
Re: [R] Comparing two regression line slopes
Hello Benedikt,

You conclude that the slopes differ significantly if the p-value is below a chosen threshold, most often 0.05. Please note that fitting a linear regression through three points makes little sense...

Regards,
Alain

Benedikt Niesterok wrote:

Hello R users,

I've used the following help on comparing two regression line slopes. I knew the method based on the following statement:

    t = (b1 - b2) / s_{b1,b2}

where b1 and b2 are the two slope coefficients and s_{b1,b2} is the pooled standard error of the slope (b), which can be calculated in R this way:

    df1 <- data.frame(x=1:3, y=1:3+rnorm(3))
    df2 <- data.frame(x=1:3, y=1:3+rnorm(3))
    fit1 <- lm(y~x, df1)
    s1 <- summary(fit1)$coefficients
    fit2 <- lm(y~x, df2)
    s2 <- summary(fit2)$coefficients
    db <- (s2[2,1]-s1[2,1])
    sd <- sqrt(s2[2,2]^2+s1[2,2]^2)
    df <- (fit1$df.residual+fit2$df.residual)
    td <- db/sd
    2*pt(-abs(td), df)

Using my data I finally get the value of the test, which is 2.245e-7. Do my slopes differ significantly now?

Thanks for help,
Benedikt
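The decision rule described in the reply can be sketched as follows. This simply wraps the calculation quoted in the question in a function and compares the resulting p-value with a threshold. The function name compare_slopes, the simulated data with 20 points per group, and the threshold alpha = 0.05 are assumptions made for illustration, not part of the original exchange.

```r
# Two-sided p-value for the difference between two fitted slopes,
# following the calculation quoted in the question.
compare_slopes <- function(df1, df2) {
  fit1 <- lm(y ~ x, data = df1)
  fit2 <- lm(y ~ x, data = df2)
  s1 <- summary(fit1)$coefficients
  s2 <- summary(fit2)$coefficients
  db <- s2[2, 1] - s1[2, 1]                 # difference of the slopes
  se <- sqrt(s1[2, 2]^2 + s2[2, 2]^2)       # combined standard error
  dof <- fit1$df.residual + fit2$df.residual
  2 * pt(-abs(db / se), dof)
}

# Simulated example data (hypothetical): true slopes 1.0 and 1.5.
set.seed(1)
x <- 1:20
df1 <- data.frame(x = x, y = 1.0 * x + rnorm(20))
df2 <- data.frame(x = x, y = 1.5 * x + rnorm(20))

p <- compare_slopes(df1, df2)
alpha <- 0.05
if (p < alpha) {
  message("p = ", signif(p, 3), ": the slopes differ at the ", alpha, " level")
} else {
  message("p = ", signif(p, 3), ": no significant difference at the ", alpha, " level")
}
```

With a p-value as small as the 2.245e-7 reported in the question, the same rule would reject equal slopes at the 0.05 level, provided the fits are based on enough points to be meaningful.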
Re: [R] turning list into vector/dataframe
Hi Melissa,

    L <- list(min=rnorm(5),mean=rnorm(5),max=rnorm(5))
    matrix(unlist(L),ncol=3)

gives what you want.

Alain

Melissa2k9 wrote:

Hi,

I have used these commands:

    resamples <- lapply(1:1000,function(i) sample(lambs,replace=F))
    resamples2 <- lapply(resamples,Cusum)

to get a list of 1000 samples of my data. The function Cusum is defined as follows:

    Cusum <- function(x){
      SUM <- cumsum(x)-(1:length(x))*mean(x)
      min <- min(cumsum(x)-(1:length(x))*mean(x))
      max <- max(cumsum(x)-(1:length(x))*mean(x))
      diff <- max-min
      ans <- c(min,max,diff)
      ans
    }

where lambs is a vector of temperatures. An example of part of my list is:

    [[998]]
    [1] -5.233176  6.903034 12.136210

    [[999]]
    [1] -9.296690  1.516233 10.812922

    [[1000]]
    [1] -1.502066e+01 -4.547474e-13  1.502066e+01

Now I want to convert this list into a data frame, so for example 1000 rows with column names Min, Max and Diff. My supervisor said I first had to turn this into a vector, but I don't seem to be able to do that! Any ideas on how to turn this list into a data frame would be really appreciated :)

Thanks in advance,
Melissa
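One point worth spelling out: in the list from the question each element is one row (min, max, diff), so the values have to be filled row by row, either with byrow = TRUE in matrix() or by row-binding the elements. A minimal sketch follows; the simulated lambs vector is a hypothetical stand-in for the temperature data, and Cusum is restated compactly with the same output as the version in the question.

```r
# Same output as the Cusum function in the question: c(min, max, max - min).
Cusum <- function(x) {
  s <- cumsum(x) - seq_along(x) * mean(x)
  c(min(s), max(s), max(s) - min(s))
}

set.seed(42)
lambs <- rnorm(30, mean = 10)   # hypothetical stand-in for the temperatures

resamples  <- lapply(1:1000, function(i) sample(lambs, replace = FALSE))
resamples2 <- lapply(resamples, Cusum)

# Stack the 1000 length-3 vectors as rows, then convert to a data frame.
result <- as.data.frame(do.call(rbind, resamples2))
names(result) <- c("Min", "Max", "Diff")

head(result)
```

matrix(unlist(resamples2), ncol = 3, byrow = TRUE) should give the same numbers; without byrow = TRUE the values would be spread down the columns and the rows would no longer correspond to samples.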
[R] Is a point into an ellipse
Hi,

I drew an ellipse with the package ellipse. Now I would like to know whether a point is inside the ellipse. Is there an R function that does this without computing the equation of the ellipse manually?

For example, if I do plot(ellipse(0.8), type = 'l'), I would like to know whether (0,1) lies inside the drawn ellipse.

Thanks.

Regards,
Alain
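One possible approach, offered as a sketch rather than a definitive answer: as far as I understand the defaults of the ellipse package, ellipse(0.8) traces the 95% confidence region of a bivariate normal distribution with correlation 0.8, unit scales and centre (0, 0). Under that assumption, a point lies inside the curve when its squared Mahalanobis distance from the centre does not exceed qchisq(0.95, 2). The helper inside_ellipse and its arguments are hypothetical names; only base R functions are used.

```r
# TRUE if `point` lies inside the region assumed to be drawn by ellipse(r):
# correlation r, unit scales, centre (0, 0), confidence level `level`.
# These defaults are an assumption about the ellipse package.
inside_ellipse <- function(point, r, centre = c(0, 0), level = 0.95) {
  sigma <- matrix(c(1, r, r, 1), nrow = 2)     # correlation matrix
  d2 <- mahalanobis(point, center = centre, cov = sigma)  # squared distance
  d2 <= qchisq(level, df = 2)
}

inside_ellipse(c(0, 1), r = 0.8)
inside_ellipse(c(3, -3), r = 0.8)
```

If the ellipse was drawn with a different scale, centre or level, the same values would need to be passed here (scales by rescaling sigma into a covariance matrix).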
Re: [R] R and SPSS
Hi,

There is an R plug-in for SPSS. You can find it on the SPSS website.

Hope it helps.

Alain

Liviu Andronic wrote:

Hello,

On Wed, Nov 26, 2008 at 9:25 PM, Applejus wrote:

I have a code in R. Could anyone give me the best possible way (or just ways!) to integrate it in SPSS?

I would doubt you could do this, but at the least provide commented, minimal, self-contained, reproducible code. It would help if you were more specific.

Liviu
Re: [R] Basic question on concatenating factors
Hi,

I have a solution to concatenate two factors into one, but I don't believe it is the best one:

    factor(c(as.character(f1),as.character(f2)))
    [1] a a b b b a
    Levels: a b

You can always add a level by assigning a new vector to the levels:

    levels(f1) <- c("a","b","c")
    f1
    [1] a a b
    Levels: a b c

udi cohen wrote:

Hi all,

I hope it's not too trivial for the list. I'm trying to concatenate two factor arrays, and obtain the following:

    f1 <- factor(c("a","a","b"))
    f1
    [1] a a b
    Levels: a b
    f2 <- factor(c("b","b","a"))
    f2
    [1] b b a
    Levels: a b
    c(f1,f2)
    [1] 1 1 2 2 2 1

instead of getting:

    [1] a a b b b a
    Levels: a b

A related question is: how do I add a level which does not exist yet in a factor, so that I'll be able to add these values later without getting:

    In `[<-.factor`(`*tmp*`, 2, value = "c") : invalid factor level, NAs generated

Thanks,
EC
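Both parts of the answer can be put together in a short sketch. The names all_levels and f12 are only illustrative. Passing the union of the two level sets explicitly is an addition here, meant to keep a predictable level order when the two factors do not share the same levels.

```r
f1 <- factor(c("a", "a", "b"))
f2 <- factor(c("b", "b", "a"))

# Concatenate via the character representation and rebuild the factor,
# keeping every level that occurs in either input.
all_levels <- union(levels(f1), levels(f2))
f12 <- factor(c(as.character(f1), as.character(f2)), levels = all_levels)
print(f12)

# Add a level first, then assign it: this avoids the
# "invalid factor level, NAs generated" warning.
levels(f1) <- c(levels(f1), "c")
f1[2] <- "c"
print(f1)
```

Appending to levels(f1) instead of retyping the full vector avoids accidentally renaming existing levels.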
Re: [R] Incorrect order
Hi,

I believe Bart answered your question. What is the solution you are expecting? If you don't give us more explanation, we cannot understand what is wrong for you.

From the help page (help(sort)):

    order returns a permutation which rearranges its first argument into ascending or descending order, breaking ties by further arguments. sort.list is the same, using only one argument. See the examples for how to use these functions to sort data frames, etc.

In the "See Also" section of that help page you will find the two functions sort and rank!

    a <- c(20,30,15,40)
    sort(a)
    [1] 15 20 30 40
    order(a)
    [1] 3 1 2 4
    rank(a)
    [1] 2 3 1 4

Alain

lll73 wrote:

I am using the order function and the result seems to be incorrect:

    a <- c(20,30,15,40)
    order(a)
    [1] 3 1 2 4

Any suggestions?

Thanks,
Laura
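To illustrate the difference described in the help text: order() returns the positions to pick, in sequence, to obtain the sorted vector, whereas rank() returns the position each element would occupy after sorting. A minimal sketch with the vector from the thread (the name o is illustrative):

```r
a <- c(20, 30, 15, 40)

# order(a) says: take element 3 first, then 1, then 2, then 4.
o <- order(a)
a[o]        # same values as sort(a)

# rank(a) says: 20 is the 2nd smallest, 30 the 3rd, 15 the 1st, 40 the 4th.
rank(a)
```

So the result 3 1 2 4 is correct for order(); it would only look wrong if ranks were expected.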
Re: [R] add labelled contour lines to filled.contour plot
Look at contourplot in the lattice package. There is an example doing what you want.

Alain Guillet

Mark wrote:

Is it possible to add labelled contour lines to a filled.contour plot?
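A minimal sketch of what such a call might look like, assuming the lattice package is installed and using the built-in volcano matrix as example data; this is not the example from the lattice help page. The argument region = TRUE asks for filled regions, and labels = TRUE for labelled contour lines.

```r
library(lattice)

# Put the volcano matrix into long format: one row per grid cell.
# expand.grid varies x fastest, matching the column-major order of the matrix.
grid <- expand.grid(x = seq_len(nrow(volcano)),
                    y = seq_len(ncol(volcano)))
grid$z <- as.vector(volcano)

# Filled regions with labelled contour lines drawn on top.
p <- contourplot(z ~ x * y, data = grid,
                 region = TRUE, labels = TRUE, cuts = 10)
print(p)
```

In base graphics, a similar effect can usually be obtained by calling contour(..., add = TRUE) inside the plot.axes argument of filled.contour.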
Re: [R] how to set rownames / colnames for matrices in a list
Hi,

If all your matrices have the same size, you should work with an array and not with a list. Then you can use dimnames to set the names of the rows, columns, and so on.

Alain

Antje wrote:

Hello,

I have another stupid question. I hope you can give me a hint how to solve this: I have a list, and one element is again a list containing matrices, all of the same dimensions. Now I'd like to set the dimnames for all matrices.

Example code:

    m1 <- matrix(1:25, nrow=5)
    m2 <- matrix(26:50, nrow=5)
    # ... there can be many more than two matrices
    l <- list()
    l[[1]] <- list(m1,m2)
    r_names <- LETTERS[1:5]
    c_names <- LETTERS[6:10]

How can I apply these names to any number of matrices within this list of lists?

Ciao,
Antje
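Both routes can be sketched with the objects from the question. The lapply() variant keeps the nested list and is an addition to the answer above; the array variant follows the suggestion in the reply. The name arr and the third dimension's labels "m1" and "m2" are illustrative choices.

```r
m1 <- matrix(1:25, nrow = 5)
m2 <- matrix(26:50, nrow = 5)
l <- list()
l[[1]] <- list(m1, m2)
r_names <- LETTERS[1:5]
c_names <- LETTERS[6:10]

# Option 1: keep the list and set the dimnames of every matrix in it.
l[[1]] <- lapply(l[[1]], function(m) {
  dimnames(m) <- list(r_names, c_names)
  m
})

# Option 2: stack the matrices into a 3-d array and name all dimensions once.
arr <- array(c(m1, m2), dim = c(5, 5, 2),
             dimnames = list(r_names, c_names, c("m1", "m2")))

l[[1]][[2]]["A", "F"]
arr["A", "F", "m2"]
```

The lapply() version works for any number of matrices in the inner list; the array version requires that they all have the same dimensions.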