Re: [R] matrix manipulation question
On 27 Mar 2015, at 09:58, Stéphane Adamowicz stephane.adamow...@avignon.inra.fr wrote:

> data_no_NA <- data[, complete.cases(t(data))==T]

Ouch! logical == TRUE is bad, logical == T is worse:

    data[, complete.cases(t(data))]

--
Peter Dalgaard, Professor, Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501 Office: A 4.23
Email: pd@cbs.dk Priv: pda...@gmail.com

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Re: [R] matrix manipulation question
Very, very, very bad solution. as.matrix can silently change your data to an unwanted format, and complete.cases()==T is silly, as Peter already pointed out.

If I want to get rid of columns with NA, I use

    head(airquality[, colSums(is.na(airquality))==0])
      Wind Temp Month Day
    1  7.4   67     5   1
    2  8.0   72     5   2
    3 12.6   74     5   3
    4 11.5   62     5   4
    5 14.3   56     5   5
    6 14.9   66     5   6

Cheers
Petr

From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Stéphane Adamowicz
Sent: Friday, March 27, 2015 11:42 AM
To: peter dalgaard
Cc: r-help@r-project.org
Subject: Re: [R] matrix manipulation question

> Well, it seems to work with me.
>
>     Y <- as.matrix(airquality)
>     Z <- Y[,complete.cases(t(Y))==T]
>
> The columns that contained NA were deleted.
> [...]
Re: [R] matrix manipulation question
Well, it seems to work with me.

    Y <- as.matrix(airquality)
    head(Y, n=8)
         Ozone Solar.R Wind Temp Month Day
    [1,]    41     190  7.4   67     5   1
    [2,]    36     118  8.0   72     5   2
    [3,]    12     149 12.6   74     5   3
    [4,]    18     313 11.5   62     5   4
    [5,]    NA      NA 14.3   56     5   5
    [6,]    28      NA 14.9   66     5   6
    [7,]    23     299  8.6   65     5   7
    [8,]    19      99 13.8   59     5   8

    Z <- Y[,complete.cases(t(Y))==T]
    head(Z, n=8)
         Wind Temp Month Day
    [1,]  7.4   67     5   1
    [2,]  8.0   72     5   2
    [3,] 12.6   74     5   3
    [4,] 11.5   62     5   4
    [5,] 14.3   56     5   5
    [6,] 14.9   66     5   6
    [7,]  8.6   65     5   7
    [8,] 13.8   59     5   8

The columns that contained NA were deleted.

On 27 March 2015, at 10:38, peter dalgaard pda...@gmail.com wrote:

> On 27 Mar 2015, at 09:58, Stéphane Adamowicz stephane.adamow...@avignon.inra.fr wrote:
>
> > data_no_NA <- data[, complete.cases(t(data))==T]
>
> Ouch! logical == TRUE is bad, logical == T is worse:
>
>     data[, complete.cases(t(data))]
>
> --
> Peter Dalgaard, Professor, Center for Statistics, Copenhagen Business School

_
Stéphane Adamowicz
Inra, centre de recherche Paca, unité PSH
228, route de l'aérodrome
CS 40509
domaine St Paul, site Agroparc
84914 Avignon, cedex 9
France
stephane.adamow...@avignon.inra.fr
tel. +33 (0)4 32 72 24 35
fax. +33 (0)4 32 72 24 32
do not dial 0 when out of France
web PSH : https://www6.paca.inra.fr/psh
web Inra : http://www.inra.fr/
Re: [R] matrix manipulation question
Why not use complete.cases()?

    data_no_NA <- data[, complete.cases(t(data))==T]

On 27 March 2015, at 06:13, Jatin Kala jatin.kala...@gmail.com wrote:

> Hi,
> I've got a rather large matrix of about 800 rows and 60 columns. Each column is a time series of length 800. Out of these 60 time series, some have missing values (NA). I want to strip out all columns that have one or more NA values, i.e., I only want full time series.
> This should do the trick:
>
>     data_no_NA <- data[,!apply(is.na(data), 2, any)]
>
> I now use data_no_NA as input to a function, which returns output as a matrix of the same size as data_no_NA.
> The trick is that I now need to put these columns back into a new 800 by 60 empty matrix, at their original locations.
> Any suggestions on how to do that? Hopefully without having to use loops.
> I'm using R/3.0.3.
>
> Cheers,
> Jatin.
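One way to put the processed columns back at their original positions, sketched here under assumptions: the small `data` matrix and the function `my_fun` are made up for illustration and only stand in for the 800 by 60 matrix and the real function. The idea is to keep the logical column index and reuse it on the left-hand side of an assignment, so no loop is needed.

```r
# Toy stand-in for the 800 x 60 matrix: 5 rows, 4 columns, NAs in columns 2 and 4.
data <- matrix(as.numeric(1:20), nrow = 5, ncol = 4)
data[2, 2] <- NA
data[5, 4] <- NA

# Logical index of the columns that contain no NA.
keep <- colSums(is.na(data)) == 0
data_no_NA <- data[, keep, drop = FALSE]

# Hypothetical placeholder for the real function; it only has to return
# a matrix with the same dimensions as its input.
my_fun <- function(m) m * 10

result <- my_fun(data_no_NA)

# "Empty" (all-NA) matrix of the original size; fill the kept columns in place.
full <- matrix(NA_real_, nrow = nrow(data), ncol = ncol(data),
               dimnames = dimnames(data))
full[, keep] <- result
```

Columns that were dropped stay NA in `full`; `drop = FALSE` keeps the result a matrix even if only one column survives.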
Re: [R] Why can't I access this type?
On Thu, 26-Mar-2015 at 04:58PM -0400, yoursurrogate...@gmail.com wrote:

[...]

| I agree with you on the indexing approach. But even after using
| within, I still get the same error.

You leave us to guess just what you tried, but if you did this:

    all.states <- within(as.data.frame(state.x77), state <- rownames(state.x77))

and then again did this:

    cold.states <- all.states[all.states$Frost > 150, c("Name", "Frost")]

of course it will give the same error, because you haven't addressed the problem as you've been told.

On Sun, 22-Mar-2015 at 08:06AM -0800, John Kane wrote:

| Well, first off, you have no variable called Name. You have lost
| the state names as they are rownames in the matrix state.x77 and
| not a variable.

If you did this:

    all.states <- within(as.data.frame(state.x77), Name <- rownames(state.x77))

instead of

    all.states <- within(as.data.frame(state.x77), state <- rownames(state.x77))

then this would work:

    cold.states <- all.states[all.states$Frost > 150, c("Name", "Frost")]

Modify the above to match where my guess at what you tried is in error.

HTH

--
Patrick Connolly
Great minds discuss ideas
Average minds discuss events
Small minds discuss people
. Eleanor Roosevelt
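For reference, a minimal runnable version of the fix described above, using the state.x77 matrix that ships with R. The comparison operator was lost from the archived message, so ">" is an assumption here, chosen because the result is called cold.states.

```r
# state.x77 is a matrix; the state names exist only as row names,
# so they are copied into a proper column called Name.
all.states <- within(as.data.frame(state.x77), Name <- rownames(state.x77))

# Column names must be quoted strings inside c().
# Assumption: the original comparison was "> 150" (days of frost).
cold.states <- all.states[all.states$Frost > 150, c("Name", "Frost")]
```

If the column had been created as `state` instead, selecting "Name" would fail with an "undefined columns selected" error, which matches the situation described in the thread.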
Re: [R] Vennerable Plots for Publications
Hi Dario,
Have you tried creating a larger PNG image and then shrinking the result with an image manipulation program (e.g. GIMP)?

Jim

On Fri, Mar 27, 2015 at 4:00 PM, Dario Strbenac dstr7...@uni.sydney.edu.au wrote:

> Does anyone make Venn diagrams for publication using Vennerable? I found that the font size is too big when the plot is created at 300 DPI, and there's no option to change it, even when the point size argument to the device is changed.
>
>     aVenn <- Venn(Sets = list(A = 1:5, B = 3:6))
>     png("forPublication.png", units = "in", h = 2.55, w = 2.4, res = 300) # Changing pointsize to a smaller number has no effect on size of the text.
>     plot(aVenn)
>     dev.off()
>
> --
> Dario Strbenac
> PhD Student
> University of Sydney
> Camperdown NSW 2050
> Australia
Re: [R] matrix manipulation question
On 27 March 2015, at 12:34, PIKAL Petr petr.pi...@precheza.cz wrote:

> Very, very, very bad solution. as.matrix can change silently your data to unwanted format, complete.cases()==T is silly as Peter already pointed out.

Perhaps, but it happens that in the original message the question dealt with a matrix, not a data frame, and thus I needed a matrix example. Furthermore, in my example no unwanted format occurred. You can easily check that the final matrix contains only "numeric" data, as in the original data frame.

Stéphane
Re: [R] matrix manipulation question
Hi

-----Original Message-----
From: Stéphane Adamowicz [mailto:stephane.adamow...@avignon.inra.fr]
Sent: Friday, March 27, 2015 1:26 PM
To: PIKAL Petr
Cc: peter dalgaard; r-help@r-project.org
Subject: Re: [R] matrix manipulation question

> On 27 March 2015, at 12:34, PIKAL Petr petr.pi...@precheza.cz wrote:
>
> > Very, very, very bad solution. as.matrix can change silently your data to unwanted format, complete.cases()==T is silly as Peter already pointed out.
>
> Perhaps, but it happens that in the original message, the question

I do not have the original message.

> dealt with a matrix not a dataframe, and thus I needed a matrix

But you made the matrix from a data frame. If one column were not numeric, the whole resulting matrix would change to a non-numeric format.

> example. Furthermore in my example no unwanted format occurred. You can

Yes, because the data frame was (luckily) numeric.

> check easily that the final matrix contains only « numeric » data as in the original data frame.

You want a matrix? Here it is.

    head(as.matrix(airquality)[, colSums(is.na(airquality))==0])
         Wind Temp Month Day
    [1,]  7.4   67     5   1
    [2,]  8.0   72     5   2
    [3,] 12.6   74     5   3
    [4,] 11.5   62     5   4
    [5,] 14.3   56     5   5
    [6,] 14.9   66     5   6

It works the same with a matrix as with a data frame, without any need to transform it.

Cheers
Petr
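The coercion Petr warns about can be seen on a small made-up example. The data frames below are invented for illustration; the point is only that as.matrix() returns a character matrix as soon as one column is not numeric, while selecting NA-free columns with colSums(is.na(...)) works on the data frame directly and leaves column types alone.

```r
# All-numeric data frame: as.matrix() yields a numeric matrix.
num_df <- data.frame(a = c(1, NA, 3), b = c(4, 5, 6))
num_is_numeric <- is.numeric(as.matrix(num_df))

# One character column is enough to turn the whole matrix into character.
mixed_df <- data.frame(a = c(1, NA, 3), b = c(4, 5, 6), id = c("x", "y", "z"))
mixed_is_character <- is.character(as.matrix(mixed_df))

# Dropping columns with NA on the data frame itself keeps each column's type.
keep <- colSums(is.na(mixed_df)) == 0
clean_df <- mixed_df[, keep, drop = FALSE]
```

Here `clean_df` keeps the columns b and id, and b is still numeric.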
Re: [R] matrix manipulation question
> > example. Furthermore in my example no unwanted format occurred. You can
>
> Yes because data.frame was (luckily) numeric.

Luck has nothing to do with this. I chose this example on purpose…

Stéphane
[R] Calculating area of polygons created from a spatial intersect
Hello all,

I am attempting to automate an analysis that I developed with ArcInfo, using R and the gdal and geos packages (or any other) if possible.

Here is the basic process.

I have a shape file (lines) that defines the limits of all of the projects, with each project having a unique identifier. I have another shape file (polys) that contains total population and low-income population and represents Census block groups. This shape file has an area field which holds the acreage of the whole block group.

Process

Step 1. I buffer the project lines to create a second shape file that represents the 'footprint' of each project. (Creates polys.)
Step 2. In ArcInfo, I perform an intersection of the two shape files (footprint and census blocks), which creates a third shape file with a unique polygon for every project/census block intersection.
Step 3. I then perform an area calculation (acres) on this new poly shape file and use this calculated area, divided by the original area of the associated census block group, to apportion the two population figures to the new polygon.
Step 4. Finally, I sum the two population figures for each of the projects from the attribute table of this final shape file.

When I try to replicate the above procedure in R, I run into a problem with Step 2 when I use what I think is the appropriate command:

    gIntersects(buffered_projects, census_blocks, byid=TRUE)

This command produces a matrix of every project/census block combination and gives me only a true/false indication. Is there any way to replicate the process from ArcInfo that I outlined above within R?

Walter Anderson
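The arithmetic in Steps 3 and 4 does not depend on any spatial package once the intersection polygons and their areas exist. As a sketch only, with an invented table standing in for the attribute table of the intersected shape file (every name and number below is hypothetical), the apportioning and per-project sums could look like this:

```r
# Hypothetical intersection table: one row per project/block-group overlap.
# All values are made up for illustration.
pieces <- data.frame(
  project     = c("P1", "P1", "P2"),
  block_group = c("A", "B", "B"),
  piece_acres = c(10, 5, 20),       # area of the intersection polygon
  block_acres = c(100, 50, 50),     # area of the whole block group
  total_pop   = c(1000, 400, 400),  # block-group total population
  low_inc_pop = c(200, 100, 100)    # block-group low-income population
)

# Step 3: apportion each population by the share of the block group covered.
share <- pieces$piece_acres / pieces$block_acres
pieces$total_pop_est   <- pieces$total_pop * share
pieces$low_inc_pop_est <- pieces$low_inc_pop * share

# Step 4: sum the apportioned populations for each project.
by_project <- aggregate(cbind(total_pop_est, low_inc_pop_est) ~ project,
                        data = pieces, FUN = sum)
```

With these numbers, project P1 gets 140 people in total (30 low income) and P2 gets 160 (40 low income). The approach assumes population is spread evenly over each block group, which is the same assumption the ArcInfo workflow makes.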
[R] Problem with download.file ?
# download.file() Seems to put the xlsx file onto hard drive.

    download.file("http://www.udel.edu/johnmack/data_library/zipcode_centroids.xlsx", "zipcode_centroids.xlsx")
    trying URL 'http://www.udel.edu/johnmack/data_library/zipcode_centroids.xlsx'
    Content type 'application/vnd.openxmlformats-officedocument.spreadsheetml.sheet' length 2785832 bytes (2.7 Mb)
    opened URL
    downloaded 2.7 Mb

# Trouble reading file with xlsx.

    library(xlsx)
    Loading required package: rJava
    Loading required package: xlsxjars
    Warning messages:
    1: package ‘xlsx’ was built under R version 3.1.3
    2: package ‘rJava’ was built under R version 3.1.3
    df <- read.xlsx2("zipcode_centroids.xlsx", sheetIndex=1)
    Error in .jcall("RJavaTools", "Ljava/lang/Object;", "invokeMethod", cl, :
      java.util.zip.ZipException: invalid entry size (expected 1168 but got 1173 bytes)

# I downloaded the file manually (same name) from the web page and tried again.
# Then I read the file into R with xlsx successfully.

    df <- read.xlsx2("/zipdist/zipcode_centroids.xlsx", sheetIndex=1)
    str(df)
    'data.frame': 42961 obs. of 8 variables:
     $ ZIPCODE  : Factor w/ 42961 levels "01001","01002",..: 1 2 3 4 5 6 7 8 9 10 ...
     $ TOWN.    : Factor w/ 18955 levels "Aaronsburg","Abbeville",..: 85 333 333 333 898 1089 1459 1620 1899 2929 ...
     $ STATE    : Factor w/ 52 levels "AK","AL","AR",..: 21 21 21 21 21 21 21 21 21 21 ...
     $ LATITUDE : Factor w/ 37352 levels "-7.209975","19.101978",..: 28020 28948 28916 28971 29047 28624 28326 28418 28197 28603 ...
     $ LONGITUDE: Factor w/ 37241 levels "-100.00991","-100.02632",..: 8799 8706 8811 8715 8470 8639 9019 8608 8531 9065 ...
     $ STFIPS   : Factor w/ 51 levels "01","02","04",..: 22 22 22 22 22 22 22 22 22 22 ...
     $ CD       : Factor w/ 55 levels "00","01","02",..: 3 2 2 2 2 2 2 3 3 2 ...
     $ CONG_DIST: Factor w/ 436 levels "01_01","01_02",..: 191 190 190 190 190 190 190 191 191 190 ...

# Is there a problem with download.file() when file is an Excel file or this particular Excel file?

--
Giles L Crane, MPH, ASA, NJPHA
Statistical Consultant and R Instructor
621 Lake Drive
Princeton, NJ 08540
Phone: 609 924-0971
Email: gilescr...@verizon.net
Re: [R] Problem with download.file ?
Add the argument mode="wb" to your call to download.file(). On Windows this means to use 'binary' format, i.e. do not change line endings.

Bill Dunlap
TIBCO Software
wdunlap tibco.com

On Fri, Mar 27, 2015 at 7:25 AM, Giles Crane gilescr...@verizon.net wrote:

> # download.file() Seems to put the xlsx file onto hard drive.
>
>     download.file("http://www.udel.edu/johnmack/data_library/zipcode_centroids.xlsx", "zipcode_centroids.xlsx")
>
> # Trouble reading file with xlsx.
>
>     df <- read.xlsx2("zipcode_centroids.xlsx", sheetIndex=1)
>     Error in .jcall("RJavaTools", "Ljava/lang/Object;", "invokeMethod", cl, :
>       java.util.zip.ZipException: invalid entry size (expected 1168 but got 1173 bytes)
>
> [...]
>
> # Is there a problem with download.file() when file is an Excel file or this particular Excel file?
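For the call in the question, the change amounts to download.file(url, destfile, mode = "wb"). The sketch below shows the same call without needing the network: a small temporary file of arbitrary bytes stands in for the .xlsx, and a file:// URL stands in for the web address (both are assumptions made for illustration). An .xlsx file is a zip archive, which is why altered bytes surface as a ZipException.

```r
# Small binary file containing bytes (LF, CR, Ctrl-Z, NUL) that a
# text-mode transfer on Windows could alter or truncate.
src <- tempfile(fileext = ".bin")
writeBin(as.raw(c(0x50, 0x4b, 0x03, 0x04, 0x0a, 0x0d, 0x0a, 0x1a, 0x00)), src)

# A file:// URL stands in for the remote .xlsx so no network is required.
src_url <- paste0("file://",
                  if (.Platform$OS.type == "windows") "/" else "",
                  normalizePath(src, winslash = "/"))

# mode = "wb" asks for a binary transfer: bytes are written unchanged.
dest <- tempfile(fileext = ".bin")
download.file(src_url, dest, mode = "wb", quiet = TRUE)

# The copy should match the source byte for byte.
same_bytes <- identical(readBin(src, "raw", n = 100),
                        readBin(dest, "raw", n = 100))
```

On Unix-like systems the mode makes no difference, so spelling out mode = "wb" for binary files is harmless there and keeps scripts portable.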
Re: [R] matrix manipulation question
On Mar 27, 2015, at 3:41 AM, Stéphane Adamowicz wrote:

> Well, it seems to work with me.

No one is doubting that it worked for you in this instance. What Peter D. was criticizing was the construction:

    complete.cases(t(Y))==T

... and it is wrong on two counts. The first is that `T` is not guaranteed to be TRUE. The second is that the test ==T (or similarly ==TRUE) is completely unnecessary, because `complete.cases` returns a logical vector, so that expression is a waste of time.

(The issue of matrix versus dataframe was raised by someone else.)

--
David.

>     Y <- as.matrix(airquality)
>     Z <- Y[,complete.cases(t(Y))==T]
>
> The columns that contained NA were deleted.
>
> On 27 March 2015, at 10:38, peter dalgaard pda...@gmail.com wrote:
>
> > On 27 Mar 2015, at 09:58, Stéphane Adamowicz stephane.adamow...@avignon.inra.fr wrote:
> >
> > > data_no_NA <- data[, complete.cases(t(data))==T]
> >
> > Ouch! logical == TRUE is bad, logical == T is worse:
> >
> >     data[, complete.cases(t(data))]

David Winsemius
Alameda, CA, USA
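Both objections can be checked in a few lines. This is a small illustration with a made-up logical vector, not code from the thread: TRUE is a reserved word, whereas T is an ordinary variable that can be reassigned, and comparing a logical vector with TRUE returns the vector unchanged.

```r
ok <- c(TRUE, FALSE, TRUE)

# Comparing a logical vector with TRUE gives back the same vector,
# so the comparison adds nothing.
same <- identical(ok == TRUE, ok)

# T is just a variable whose default value is TRUE; once someone assigns
# to it, `== T` silently means something else.
T <- 0
flipped <- ok == T   # now compares against 0, which inverts the selection

rm(T)                # remove the local T so that base R's T is visible again
```

After `T <- 0`, `flipped` selects exactly the elements that `ok` rejects, which in the column-selection example would keep only the columns that contain NA.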
Re: [R] Calculating area of polygons created from a spatial intersect
Suggest (strongly) that you move this question to r-sig-geo. Much more appropriate there, and more people there are more familiar with this kind of work.

But ... I suspect you want gIntersection(), not gIntersects().

-Don

--
Don MacQueen
Lawrence Livermore National Laboratory
7000 East Ave., L-627
Livermore, CA 94550
925-423-1062

On 3/27/15, 7:16 AM, Walter Anderson wandrso...@gmail.com wrote:

> Hello all,
>
> I am attempting to automate an analysis that I developed with ArcInfo using R and the gdal and geos packages (or any other) if possible.
>
> Here is the basic process
>
> I have a shape file (lines) that defines the limits of all of the projects with each project having a unique identifier.
>
> I have another shape file (polys) that contains total population and low income population and represent Census block groups. This shape file has an area field which has the acreage of the total block group.
>
> Process
>
> Step 1. I then buffer these project lines to create a second shape file that represents the 'footprint' of the project. (Creates polys).
>
> Step 2. In ArcInfo, I perform an intersection of the two shape files (footprint and census blocks) and this creates a third shape file which has a unique polygon for every project/census block intersection.
>
> Step 3. I then perform an area calculation (acres) on this new poly shape file and use this calculated area divided by the original area of the associated census block group to apportion the two population datum to this new polygon.
>
> Step 4. Finally, I sum the two population datums for each of the projects from the attribute table of this final shape file.
>
> When I try to replicate the above procedure I run into a problem with Step 2 when I use what I think is the appropriate command:
>
> gIntersects(buffered_projects, census_blocks, byID=TRUE)
>
> This command is producing a matrix of each project/census block combination and only providing me a true/false indication.
>
> Is there any way to replicate the process from ArcInfo that I outlined above within R?
>
> Walter Anderson
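Steps 3 and 4 of the process above are plain arithmetic once the intersection polygons and their areas exist. The following is only a sketch in base R, assuming the attribute table of the intersected layer has already been obtained; the geometric step itself is not shown, and every column name and value below is hypothetical.

```r
# Hypothetical attribute table: one row per project/block-group intersection.
# All names and numbers are invented for illustration.
pieces <- data.frame(
  project_id  = c("P1", "P1", "P2"),
  block_id    = c("B1", "B2", "B2"),
  piece_acres = c(10, 5, 20),     # area of each intersection polygon
  block_acres = c(100, 50, 50),   # original area of the whole block group
  total_pop   = c(1000, 400, 400),
  low_inc_pop = c(200, 100, 100)
)

# Step 3: fraction of each block group covered by the piece,
# applied to both population counts
share <- pieces$piece_acres / pieces$block_acres
pieces$total_pop_part   <- pieces$total_pop   * share
pieces$low_inc_pop_part <- pieces$low_inc_pop * share

# Step 4: sum the apportioned populations for each project
by_project <- aggregate(cbind(total_pop_part, low_inc_pop_part) ~ project_id,
                        data = pieces, FUN = sum)
print(by_project)
```

The area weighting assumes population is spread evenly within each block group, which is the same assumption the ArcInfo workflow makes.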
Re: [R] Having trouble with gdata read in
Anthony,

XLSX won’t read an XLS file. Additionally, the legacy Java that is required for the xlsx package really effs up my computer. Have to reinstall my OS to fix it.

On Wed, Mar 25, 2015 at 3:51 PM, Anthony Damico ajdam...@gmail.com wrote:

> maybe
>
> library(xlsx)
> tf <- tempfile()
> ami <- "http://www.ferc.gov/industries/electric/indus-act/demand-response/2008/survey/ami_survey_responses.xls"
> download.file( ami , tf , mode = 'wb' )
> ami.data2008 <- read.xlsx( tf , sheetIndex = 1 )
>
> On Wed, Mar 25, 2015 at 5:01 PM, Benjamin Baker bba...@reed.edu wrote:
>
>> Trying to read and clean up the FERC data on Advanced Metering infrastructure. Of course it is in XLS for the first two survey years and then converts to XLSX for the final two. Bad enough that it is all in excel, they had to change the survey design and data format as well. Still, I’m sorting through it. However, when I try and read in the 2008 data, I’m getting this error:
>>
>> ###
>> Wide character in print at /Library/Frameworks/R.framework/Versions/3.1/Resources/library/gdata/perl/xls2csv.pl line 270.
>> Warning message:
>> In scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, :
>>   EOF within quoted string
>> ###
>>
>> Here is the code I’m running to get the data:
>>
>> ###
>> install.packages("gdata")
>> library(gdata)
>> fileUrl <- "http://www.ferc.gov/industries/electric/indus-act/demand-response/2008/survey/ami_survey_responses.xls"
>> download.file(fileUrl, destfile="./ami.data/ami-data2008.xls")
>> list.files("ami.data")
>> dateDown.2008 <- date()
>> ami.data2008 <- read.xls("./ami.data/ami-data2008.xls", sheet=1, header=TRUE)
>> ###
>>
>> Reviewed the data in the XLS file, and both “” and # are present within it. Don’t know how to get the read.xls to ignore them so I can read all the data into my data frame. Tried :
>>
>> ###
>> ami.data2008 <- read.xls("./ami.data/ami-data2008.xls", sheet=1, quote="", header=TRUE)
>> ###
>>
>> And it spits out “More columns than column names” output.
>>
>> Been searching this, and I can find some “solutions” for read.table, but nothing specific to read.xls
>>
>> Many thanks,
>> Benjamin Baker
[R] Categorizing by month
Hi everyone,

I'm trying to categorize by month in R. How can I do this if my dates are in date/month/year form?

Thanks,
Lois

--
View this message in context: http://r.789695.n4.nabble.com/Categorizing-by-month-tp4705173.html
Sent from the R help mailing list archive at Nabble.com.
[R] vif in package car: there are aliased coefficients in the model
Hello. I'm trying to use the function vif from package car on an lm fit. However, it returns the following error:

Error in vif.default(lm(MDescores.sitescores ~ hidroperiodo + localizacao + :
  there are aliased coefficients in the model

When I exclude any predictor from the model, it returns this warning message:

Warning message:
In cov2cor(v) : diag(.) had 0 or NA entries; non-finite result is doubtful

When I exclude any other predictor from the model vif finally works. I can't figure out what the problem is. These are the results that R returns:

vif(lm(MDescores.sitescores ~ hidroperiodo + localizacao + area + profundidade + NTVM + NTVI + PCs...c.1.., data = MDVIF))
Error in vif.default(lm(MDescores.sitescores ~ hidroperiodo + localizacao + :
  there are aliased coefficients in the model

vif(lm(MDescores.sitescores ~ localizacao + area + profundidade + NTVM + NTVI + PCs...c.1.., data = MDVIF))
             GVIF Df GVIF^(1/(2*Df))
localizacao   NaN  2             NaN
area          NaN  1             NaN
profundidade  NaN  1             NaN
NTVM          NaN  1             NaN
NTVI          NaN  1             NaN
PCs...c.1..   NaN  1             NaN
Warning message:
In cov2cor(v) : diag(.) had 0 or NA entries; non-finite result is doubtful

Thanks.

--
Rodolfo Mei Pelinson.
Re: [R] matrix manipulation question
On 2015-03-27 11:41, Stéphane Adamowicz wrote:

> Well, it seems to work with me.
>
> Y <- as.matrix(airquality)
> head(Y, n=8)
>      Ozone Solar.R Wind Temp Month Day
> [1,]    41     190  7.4   67     5   1
> [2,]    36     118  8.0   72     5   2
> [3,]    12     149 12.6   74     5   3
> [4,]    18     313 11.5   62     5   4
> [5,]    NA      NA 14.3   56     5   5
> [6,]    28      NA 14.9   66     5   6
> [7,]    23     299  8.6   65     5   7
> [8,]    19      99 13.8   59     5   8
>
> Z <- Y[,complete.cases(t(Y))==T]

Peter's point, I guess, is that

1. complete.cases(t(Y)) is already a vector of logicals
2. T (and F) can be redefined, so what if T <- FALSE?

Henric Winell

> head(Z, n=8)
>      Wind Temp Month Day
> [1,]  7.4   67     5   1
> [2,]  8.0   72     5   2
> [3,] 12.6   74     5   3
> [4,] 11.5   62     5   4
> [5,] 14.3   56     5   5
> [6,] 14.9   66     5   6
> [7,]  8.6   65     5   7
> [8,] 13.8   59     5   8
>
> The columns that contained NA were deleted.
>
> On 27 March 2015 at 10:38, peter dalgaard pda...@gmail.com wrote:
>
>> On 27 Mar 2015, at 09:58 , Stéphane Adamowicz stephane.adamow...@avignon.inra.fr wrote:
>>
>>> data_no_NA <- data[, complete.cases(t(data))==T]
>>
>> Ouch! logical == TRUE is bad, logical == T is worse:
>>
>> data[, complete.cases(t(data))]
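The second point can be shown with a small sketch. The matrix below is made up for illustration; the only behaviour relied on is that T is an ordinary variable in R, while TRUE is a reserved word.

```r
# Toy matrix: column "b" contains an NA, columns "a" and "c" are complete
m <- cbind(a = c(1, 2, 3), b = c(NA, 5, 6), c = c(7, 8, 9))

keep <- complete.cases(t(m))        # logical vector, one entry per column of m
safe <- m[, keep, drop = FALSE]     # no comparison needed

T <- FALSE                          # legal: T is just a variable
unsafe <- m[, keep == T, drop = FALSE]  # now picks the columns that DO contain NA
rm(T)                               # remove the override; T is TRUE again

colnames(safe)
colnames(unsafe)
```

With the override in place, the `== T` version silently returns the opposite selection, which is why indexing with the logical vector directly is preferred.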
Re: [R] Having trouble with gdata read in
Jim,

Thanks, XLConnect with proper syntax works great for both types of files.

On Thu, Mar 26, 2015 at 5:15 AM, jim holtman jholt...@gmail.com wrote:

> My suggestion is to use XLConnect to read the file:
>
> x <- "C:\\Users\\jh52822\\AppData\\Local\\Temp\\Rtmp6nVgFC\\file385c632aba3.xls"
> require(XLConnect)
> Loading required package: XLConnect
> Loading required package: XLConnectJars
> XLConnect 0.2-10 by Mirai Solutions GmbH [aut], Martin Studer [cre], The Apache Software Foundation [ctb, cph] (Apache POI, Apache Commons Codec), Stephen Colebourne [ctb, cph] (Joda-Time Java library)
> http://www.mirai-solutions.com , http://miraisolutions.wordpress.com
> input <- f.readXLSheet(x, 1)
> str(input)
> 'data.frame': 2266 obs. of 51 variables:
>  $ EIA              : num 34 59 87 97 108 118 123 149 150 157 ...
>  $ Entity.Name      : chr "City of Abbeville" "City of Abbeville" "City of Ada" "Adams Electric Cooperative" ...
>  $ State            : chr "SC" "LA" "MN" "IL" ...
>  $ NERC.Region      : chr "SERC" "SPP" "MRO" "SERC" ...
>  $ Filing.Order     : num 12 11 1237 392 252 ...
>  $ Q5.MultRegion    : chr ...
>  $ Q6.OwnMeters.    : chr "Yes" "Yes" "Yes" "Yes" ...
>  $ Q7.ResMeters     : num 3051 4253 857 8154 33670 ...
>  $ Q7.ComMeters     : num 531 972 132 155 1719 ...
>  $ Q7.IntMeters     : num 0 19 32 NA 626 NA 29 0 2 NA ...
>  $ Q7.TransMeters   : num 0 NA NA NA NA NA NA 0 0 NA ...
>  $ Q7.OtherMeters   : num 0 NA NA 57 NA NA NA 0 0 NA ...
>  $ Q7...total.meters: num 3582 5244 1021 8366 36015 ...
>  $ Q8.15Min.ResAMI  : num 0 NA NA NA NA NA NA NA NA NA ...
>  $ Q8.15Min.ComAMI  : num 0 NA NA 155 NA NA NA NA NA NA ...
>  $ Q8.15Min.IndAMI  : num 0 NA NA NA NA NA NA NA NA NA ...
>  $ Q8.15Min.TransAMI: num 0 NA NA NA NA NA NA NA NA NA ...
>  $ Q8.15Min.OtherAMI: num 0 NA NA NA NA NA NA NA NA NA ...
>  $ Q8.15Min.TotalAMI: num 0 0 0 155 0 0 0 0 0 0 ...
>  $ Q8.Hourly.ResAMI : num 0 NA NA NA 16100 NA NA NA NA NA ...
>  $ Q8.Hourly.ComAMI : num 0 NA NA NA 1600 NA NA NA NA NA ...
>
> Jim Holtman
> Data Munger Guru
>
> What is the problem that you are trying to solve?
> Tell me what you want to do, not how you want to do it.
Re: [R] matrix manipulation question
Thanks Richard,

This works, rather obvious now that I think of it! =)

On 27/03/2015 4:30 pm, Richard M. Heiberger wrote:

> just reverse what you did before.
>
> newdata <- data
> newdata[] <- NA
> newdata[,!apply(is.na(data), 2, any)] <- myfunction(data_no_NA)
>
> On Fri, Mar 27, 2015 at 1:13 AM, Jatin Kala jatin.kala...@gmail.com wrote:
>
>> Hi,
>>
>> I've got a rather large matrix of about 800 rows and 60 columns. Each column is a time-series 800 long.
>>
>> Out of these 60 time series, some have missing values (NA). I want to strip out all columns that have one or more NA values, i.e., only want full time series.
>>
>> This should do the trick:
>>
>> data_no_NA <- data[,!apply(is.na(data), 2, any)]
>>
>> I now use data_no_NA as input to a function, which returns output as a matrix of the same size as data_no_NA
>>
>> The trick is that I now need to put these columns back into a new 800 by 60 empty matrix, at their original locations. Any suggestions on how to do that? hopefully without having to use loops.
>>
>> I'm using R/3.0.3
>>
>> Cheers,
>> Jatin.
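The round trip described above can be tried on a small matrix. This is a sketch only: the 3 by 4 matrix stands in for the 800 by 60 one, and myfunction is a hypothetical placeholder that returns a matrix of the same size as its input.

```r
# Toy stand-in for the 800 x 60 matrix; columns 2 and 4 contain an NA
data <- matrix(as.numeric(1:12), nrow = 3)
data[2, 2] <- NA
data[1, 4] <- NA

# Hypothetical placeholder for the real processing function
myfunction <- function(x) x * 10

keep <- !apply(is.na(data), 2, any)        # TRUE for columns without any NA
data_no_NA <- data[, keep, drop = FALSE]   # drop = FALSE keeps a matrix even for one column

newdata <- data
newdata[] <- NA                            # same shape as data, all NA
newdata[, keep] <- myfunction(data_no_NA)  # results go back to their original columns
print(newdata)
```

Storing the column mask in keep means the same selection is used for stripping and for restoring, so the two steps cannot drift apart.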
Re: [R] Why can't I access this type?
On 2015-03-27 09:19, Patrick Connolly wrote:

> [...]
>
> On Sun, 22-Mar-2015 at 08:06AM -0800, John Kane wrote:
>
> |> Well, first off, you have no variable called Name. You have lost
> |> the state names as they are rownames in the matrix state.x77 and
> |> not a variable.
>
> If you did this:
>
> all.states <- within(as.data.frame(state.x77), Name <- rownames(state.x77))
>
> instead of
>
> all.states <- within(as.data.frame(state.x77), state <- rownames(state.x77))

Alternatively, since 'data.frame()' coerces internally, one could do

all.states <- data.frame(state.x77, Name = rownames(state.x77))

Henric Winell

> then this would work:
>
> cold.states <- all.states[all.states$Frost > 150, c("Name", "Frost")]
>
> Modify the above to match where my guess at what you tried is in error.
>
> HTH
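Put together, the suggestions in this message can be run as below. This sketch uses only the built-in state.x77 dataset and the names that appear in the thread.

```r
# state.x77 is a matrix; the state names live in its rownames, not in a column
all.states <- data.frame(state.x77, Name = rownames(state.x77))

# Rows with more than 150 frost days, keeping only the two columns of interest
cold.states <- all.states[all.states$Frost > 150, c("Name", "Frost")]
head(cold.states)
```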
Re: [R] Using and abusing %>% (was Re: Why can't I access this type?)
On 2015-03-26 07:48, Patrick Connolly wrote:

> On Wed, 25-Mar-2015 at 03:14PM +0100, Henric Winell wrote:
>
> ...
>
> |> Well... Opinions may perhaps differ, but apart from '%>%' being
> |> butt-ugly it's also fairly slow:
>
> Beauty, it is said, is in the eye of the beholder. I'm impressed by the way using %>% reduces or eliminates complicated nested brackets.

I didn't dispute whether '%>%' may be useful -- I just pointed out that it is slow. However, it is only part of the problem: 'filter()' and 'select()', although aesthetically pleasing, also seem to be slow:

# packages needed to run this (not shown in the original post)
library(dplyr)
library(microbenchmark)

all.states <- data.frame(state.x77, Name = rownames(state.x77))

f1 <- function()
    all.states[all.states$Frost > 150, c("Name", "Frost")]

f2 <- function()
    subset(all.states, Frost > 150, select = c(Name, Frost))

f3 <- function() {
    filt <- subset(all.states, Frost > 150)
    subset(filt, select = c(Name, Frost))
}

f4 <- function()
    all.states %>% subset(Frost > 150) %>%
        subset(select = c(Name, Frost))

f5 <- function()
    select(filter(all.states, Frost > 150), Name, Frost)

f6 <- function()
    all.states %>% filter(Frost > 150) %>% select(Name, Frost)

mb <- microbenchmark(
    f1(), f2(), f3(), f4(), f5(), f6(),
    times = 1000L
)
print(mb, signif = 3L)
Unit: microseconds
 expr min   lq      mean median   uq  max neval   cld
 f1() 115  124  134.8812    129  134 1500  1000 a
 f2() 128  141  147.4694    145  151 1520  1000 a
 f3() 303  328  344.3175    338  348 1740  1000  b
 f4() 458  494  518.0830    510  523 1890  1000   c
 f5() 806  848  887.7270    875  894 3510  1000    d
 f6() 971 1010 1056.5659   1040 1060 3110  1000     e

So, using '%>%', but leaving 'filter()' and 'select()' out of the equation, as in 'f4()' is only half as bad as the full 'dplyr' idiom in 'f6()'. In this case, since we're talking microseconds, the speed-up is negligible but that *is* beside the point.

> In this tiny example it's not obvious but it's very clear if the objective is to sort the dataframe by three or four columns and various lots of aggregation then returning a largish number of consecutive columns, omitting the rest. It's very easy to see what's going on without the need for intermediate objects.

Why are you opposed to using intermediate objects? In this case, as can be seen from 'f3()', it will also have the benefit of being faster than either '%>%' or the full 'dplyr' idiom.

> |> [...]
>
> It's no surprise that instructing a computer in something closer to human language is an order of magnitude slower.

Certainly not true, at least for compiled languages.

In any case, judging from off-list correspondence, it definitely came as a surprise to some R users...

Given that '%>%' is so heavily marketed through 'dplyr', where the latter is said to provide "blazing fast performance for in-memory data by writing key pieces in C++" and "a fast, consistent tool for working with data frame like objects, both in memory and out of memory", I don't think it's far-fetched to expect that it should be more performant than base R.

> I'm sure you'd get something even quicker using machine code.

Don't be ridiculous. We're mainly discussing

all.states[all.states$Frost > 150, c("state", "Frost")]

vs.

all.states %>% filter(Frost > 150) %>% select(state, Frost)

i.e., pure R code.

> I spend 3 or 4 orders of magnitude more time writing code than running it.

You and me both. But that doesn't mean speed is of no or little importance.

> It's much more important to me to be able to read and modify than it is to have it run at optimum speed.

Good for you. But surely, if this is your goal, nothing beats intermediate objects. And like I said, it may still be faster than the 'dplyr' idiom.

> |> Of course, this doesn't matter for interactive one-off use. But
> |> lately I've seen examples of the '%>%' operator creeping into
> |> functions in packages.
>
> That could indicate that %>% is seductively easy to use. It's probably true that there are places where it should be done the hard way.

We all know how easy it is to write ugly and sluggish code in R. But 'foo[i,j]' is neither ugly nor sluggish and certainly not the hard way.

> |> However, it would be nice to see a fast pipe operator as part of
> |> base R.

Heck, it doesn't even have to be fast as long as it's a bit more elegant than '%>%'.

Henric Winell
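For readers coming to this thread later: base R did gain a native pipe, `|>`, in version 4.1.0, several years after this exchange. The sketch below assumes R 4.1.0 or newer and is not part of the original discussion; it makes no claim about relative speed, it only checks that the piped and the indexed forms select the same rows and columns.

```r
# Requires R >= 4.1.0 for the native pipe |>
all.states <- data.frame(state.x77, Name = rownames(state.x77))

# Base-R pipeline using subset(), no extra packages
cold.piped <- all.states |>
  subset(Frost > 150) |>
  subset(select = c(Name, Frost))

# Plain indexing, as in f1() from the benchmark above
cold.indexed <- all.states[all.states$Frost > 150, c("Name", "Frost")]

# Both approaches should describe the same data
all.equal(cold.piped, cold.indexed)
```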
Re: [R] Having trouble with gdata read in
Jim,

I’m not seeing the command f.readXLSheet in the documentation, nor is it executing in my code.
[R] hash - extract key values
Hi,

I was trying to use hash, but can't seem to get the keys from the hash. According to the hash documentation ('hash' package pdf), the following should work:

hx <- hash( c('a','b','c'), 1:3 )
class(hx)
[1] "hash"
attr(,"package")
[1] "hash"
hx$a
[1] 1
keys(hx)
Error in (function (classes, fdef, mtable) :
  unable to find an inherited method for function ‘keys’ for signature ‘hash’

How can I get the keys for my hash?

thanks!
Re: [R] Categorizing by month
Hi,

On Mar 27, 2015, at 12:06 PM, lychang lych...@emory.edu wrote:

> Hi everyone,
>
> I'm trying to categorize by month in R. How can I do this if my dates are in date/month/year form?

I'm not sure about the date form you describe, but if you have the dates as POSIXct you can extract the month as character and categorize with that.

x <- seq(from = as.POSIXct("2000/1/1", format="%Y/%m/%d"), to = as.POSIXct("2009/12/1", format="%Y/%m/%d"), by = 'month')
mon <- format(x, '%m')
xx <- split(x, mon)
str(xx)
List of 12
 $ 01: POSIXct[1:10], format: "2000-01-01" "2001-01-01" "2002-01-01" "2003-01-01" ...
 $ 02: POSIXct[1:10], format: "2000-02-01" "2001-02-01" "2002-02-01" "2003-02-01" ...
 $ 03: POSIXct[1:10], format: "2000-03-01" "2001-03-01" "2002-03-01" "2003-03-01" ...
 $ 04: POSIXct[1:10], format: "2000-04-01" "2001-04-01" "2002-04-01" "2003-04-01" ...
 $ 05: POSIXct[1:10], format: "2000-05-01" "2001-05-01" "2002-05-01" "2003-05-01" ...
 $ 06: POSIXct[1:10], format: "2000-06-01" "2001-06-01" "2002-06-01" "2003-06-01" ...
 $ 07: POSIXct[1:10], format: "2000-07-01" "2001-07-01" "2002-07-01" "2003-07-01" ...
 $ 08: POSIXct[1:10], format: "2000-08-01" "2001-08-01" "2002-08-01" "2003-08-01" ...
 $ 09: POSIXct[1:10], format: "2000-09-01" "2001-09-01" "2002-09-01" "2003-09-01" ...
 $ 10: POSIXct[1:10], format: "2000-10-01" "2001-10-01" "2002-10-01" "2003-10-01" ...
 $ 11: POSIXct[1:10], format: "2000-11-01" "2001-11-01" "2002-11-01" "2003-11-01" ...
 $ 12: POSIXct[1:10], format: "2000-12-01" "2001-12-01" "2002-12-01" "2003-12-01" ...

Does that help?

Ben

> Thanks,
> Lois

Ben Tupper
Bigelow Laboratory for Ocean Sciences
60 Bigelow Drive, P.O. Box 380
East Boothbay, Maine 04544
http://www.bigelow.org
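If the dates arrive as text in day/month/year order, the same idea works after parsing them with an explicit format. This is a sketch assuming strings such as "15/01/2014"; the sample values are invented, and the format string would need adjusting if the real data uses a different separator or two-digit years.

```r
# Made-up dates in day/month/year form
raw <- c("15/01/2014", "03/02/2014", "28/01/2015", "09/02/2015", "21/03/2015")

d <- as.Date(raw, format = "%d/%m/%Y")  # parse using the day/month/year layout
mon <- format(d, "%m")                  # two-digit month as character, e.g. "01"

by_month <- split(d, mon)               # list of dates, one element per month
counts <- table(mon)                    # number of dates falling in each month
print(counts)
```

Using format(d, "%Y-%m") instead would keep January 2014 and January 2015 in separate groups.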
Re: [R] Having trouble with gdata read in
pardon me it was my function which is just a call to readWorksheetFromFile

Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.

On Fri, Mar 27, 2015 at 3:52 PM, Benjamin Baker bba...@reed.edu wrote:

> Jim,
>
> I’m not seeing the command f.readXLSheet in the documentation, nor is it executing in my code.
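Jim's own helper is not shown in the thread, so the following is only a guess at what such a wrapper might look like, based on his description of it as "just a call to readWorksheetFromFile". The function body and argument names are assumptions; readWorksheetFromFile() itself is part of XLConnect, which needs a working Java installation.

```r
# Hypothetical reconstruction of a thin wrapper; NOT the original f.readXLSheet
f.readXLSheet <- function(file, sheet = 1) {
  # Fail early with a clear message if XLConnect is not available
  if (!requireNamespace("XLConnect", quietly = TRUE)) {
    stop("the XLConnect package (and Java) must be installed")
  }
  # Read one worksheet, by index or name, into a data frame
  XLConnect::readWorksheetFromFile(file, sheet = sheet)
}

# Usage with the file path from the thread (not run here):
# input <- f.readXLSheet("./ami.data/ami-data2008.xls", 1)
```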
Re: [R] hash - extract key values
Works for me :

library(hash)
hash-2.2.6 provided by Decision Patterns

hx <- hash( c('a','b','c'), 1:3 )
class(hx)
[1] "hash"
attr(,"package")
[1] "hash"
hx$a
[1] 1
keys(hx)
[1] "a" "b" "c"

Maybe restart your session? Clear your workspace? Upgrade?

B.

On Mar 27, 2015, at 7:39 PM, Brian Smith bsmith030...@gmail.com wrote:

> How can I get the keys for my hash?
Re: [R] vif in package car: there are aliased coefficients in the model
Dear Rodolfo,

It's apparently the case that at least one of the columns of the model matrix for your model is perfectly collinear with others. There's not nearly enough information here to figure out exactly what the problem is, and the information that you provided certainly falls short of allowing me or anyone else to reproduce your problem and diagnose it properly. It's not even clear from your message exactly what the structure of the model is, although localizacao is apparently a factor with 3 levels.

If you look at the summary() output for your model or just print it, you should at least see which coefficients are aliased, and that might help you understand what went wrong.

I hope this helps,
John

---
John Fox, Professor
McMaster University
Hamilton, Ontario, Canada
http://socserv.mcmaster.ca/jfox/

-----Original Message-----
From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Rodolfo Pelinson
Sent: March-27-15 3:07 PM
To: r-help@r-project.org
Subject: [R] vif in package car: there are aliased coefficients in the model

Hello. I'm trying to use the function vif from package car on an lm. However, it returns the following error:

Error in vif.default(lm(MDescores.sitescores ~ hidroperiodo + localizacao + :
  there are aliased coefficients in the model

When I exclude any predictor from the model, it returns this warning message:

Warning message:
In cov2cor(v) : diag(.) had 0 or NA entries; non-finite result is doubtful

When I exclude any other predictor from the model vif finally works. I can't figure out what the problem is.
These are the results that R returns:

> vif(lm(MDescores.sitescores ~ hidroperiodo + localizacao + area + profundidade + NTVM + NTVI + PCs...c.1.., data = MDVIF))
Error in vif.default(lm(MDescores.sitescores ~ hidroperiodo + localizacao + :
  there are aliased coefficients in the model

> vif(lm(MDescores.sitescores ~ localizacao + area + profundidade + NTVM + NTVI + PCs...c.1.., data = MDVIF))
             GVIF Df GVIF^(1/(2*Df))
localizacao   NaN  2             NaN
area          NaN  1             NaN
profundidade  NaN  1             NaN
NTVM          NaN  1             NaN
NTVI          NaN  1             NaN
PCs...c.1..   NaN  1             NaN
Warning message:
In cov2cor(v) : diag(.) had 0 or NA entries; non-finite result is doubtful

Thanks.

--
Rodolfo Mei Pelinson.
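To illustrate John's suggestion of looking for the aliased coefficients, here is a small sketch using simulated data (not Rodolfo's; the variable names x1, x2, x3 and y are invented). It builds a predictor that is an exact linear combination of two others, then uses coef() and alias() from base R to show which term is aliased and how.

```r
# Simulated data (not the poster's): x3 is an exact linear combination of
# x1 and x2, so its coefficient cannot be estimated separately.
set.seed(1)
d <- data.frame(x1 = rnorm(20), x2 = rnorm(20))
d$x3 <- d$x1 + d$x2
d$y  <- 1 + d$x1 - d$x2 + rnorm(20)

fit <- lm(y ~ x1 + x2 + x3, data = d)

# Aliased coefficients are reported as NA by coef() and summary().
aliased <- names(coef(fit))[is.na(coef(fit))]
print(aliased)

# alias() shows how each aliased term depends on the other columns.
print(alias(fit)$Complete)
```

In a real model the same two calls on the fitted lm object should point at the predictor (or factor level) that duplicates information already carried by the others.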
Re: [R] Using and abusing %>% (was Re: Why can't I access this type?)
I didn't dispute whether '%>%' may be useful -- I just pointed out that it is slow. However, it is only part of the problem: 'filter()' and 'select()', although aesthetically pleasing, also seem to be slow:

> all.states <- data.frame(state.x77, Name = rownames(state.x77))
>
> f1 <- function()
+ all.states[all.states$Frost > 150, c("Name", "Frost")]
>
> f2 <- function()
+ subset(all.states, Frost > 150, select = c(Name, Frost))
>
> f3 <- function() {
+ filt <- subset(all.states, Frost > 150)
+ subset(filt, select = c(Name, Frost))
+ }
>
> f4 <- function()
+ all.states %>% subset(Frost > 150) %>%
+ subset(select = c(Name, Frost))
>
> f5 <- function()
+ select(filter(all.states, Frost > 150), Name, Frost)
>
> f6 <- function()
+ all.states %>% filter(Frost > 150) %>% select(Name, Frost)
>
> mb <- microbenchmark(
+ f1(), f2(), f3(), f4(), f5(), f6(),
+ times = 1000L
+ )
> print(mb, signif = 3L)
Unit: microseconds
 expr min   lq      mean median   uq  max neval    cld
 f1() 115  124  134.8812    129  134 1500  1000 a
 f2() 128  141  147.4694    145  151 1520  1000 a
 f3() 303  328  344.3175    338  348 1740  1000  b
 f4() 458  494  518.0830    510  523 1890  1000   c
 f5() 806  848  887.7270    875  894 3510  1000    d
 f6() 971 1010 1056.5659   1040 1060 3110  1000     e

So, using '%>%', but leaving 'filter()' and 'select()' out of the equation, as in 'f4()', is only half as bad as the full 'dplyr' idiom in 'f6()'. In this case, since we're talking microseconds, the speed-up is negligible but that *is* beside the point.

When benchmarking it's important to consider both the relative and absolute difference and to think about how the cost scales as the data grows - the cost of using %>% is fixed, and 500 µs doesn't seem like a huge performance penalty to me.

Hadley

--
http://had.co.nz/
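Before comparing timings, it may be worth confirming that the variants return the same rows and columns. Below is a base-R sketch covering f1() to f3() only; the pipe and dplyr variants are left out so that nothing beyond base R is needed. The condition Frost > 150 is an assumption here, and the final loop only prints elapsed time for repeated calls, which will vary by machine and R version.

```r
all.states <- data.frame(state.x77, Name = rownames(state.x77))

f1 <- function()
  all.states[all.states$Frost > 150, c("Name", "Frost")]
f2 <- function()
  subset(all.states, Frost > 150, select = c(Name, Frost))
f3 <- function() {
  filt <- subset(all.states, Frost > 150)
  subset(filt, select = c(Name, Frost))
}

# The three variants should agree before their speed is compared.
same_result <- isTRUE(all.equal(f1(), f2())) && isTRUE(all.equal(f2(), f3()))
print(same_result)

# Rough timing of 1000 calls each; numbers depend on the machine and R version.
for (f in list(f1, f2, f3)) {
  print(system.time(for (i in 1:1000) f())["elapsed"])
}
```

To look at how the cost scales, the same functions could be rerun on a larger data frame (for example, all.states stacked many times); a fixed per-call overhead should then shrink relative to the total.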