[R] tidyquant error downloading symbols for Index
Hi R Helpers,

I recently tried to take advantage of the ability to download all the tickers in the S&P 500 using the functionality of tidyquant, but it threw an error. In summary, the set of commands that I ran was

library(tidyquant)
tq_index_options()
tq_index("SP500")
sessionInfo()

R feedback including the error message and sessionInfo are provided below. Guidance would be appreciated.

--John J. Sparks, Ph.D.

> library(tidyquant)
Loading required package: lubridate
Attaching package: lubridate
The following object is masked from package:base:
    date
Loading required package: PerformanceAnalytics
Loading required package: xts
Loading required package: zoo
Attaching package: zoo
The following objects are masked from package:base:
    as.Date, as.Date.numeric
Package PerformanceAnalytics (1.4.3541) loaded.
Copyright (c) 2004-2014 Peter Carl and Brian G. Peterson, GPL-2 | GPL-3
http://r-forge.r-project.org/projects/returnanalytics/
Attaching package: PerformanceAnalytics
The following object is masked from package:graphics:
    legend
Loading required package: quantmod
Loading required package: TTR
Version 0.4-0 included new data defaults. See ?getSymbols.
Learn from a quantmod author: https://www.datacamp.com/courses/importing-and-managing-financial-data-in-r
Loading required package: tidyverse
Loading tidyverse: ggplot2
Loading tidyverse: tibble
Loading tidyverse: tidyr
Loading tidyverse: readr
Loading tidyverse: purrr
Loading tidyverse: dplyr
Conflicts with tidy packages
as.difftime(): lubridate, base
date():        lubridate, base
filter():      dplyr, stats
first():       dplyr, xts
intersect():   lubridate, base
lag():         dplyr, stats
last():        dplyr, xts
setdiff():     lubridate, base
union():       lubridate, base
Attaching package: tidyquant
The following object is masked from package:dplyr:
    as_tibble
The following object is masked from package:tibble:
    as_tibble
There were 14 warnings (use warnings() to see them)

> tq_index_options()
[1] "RUSSELL1000" "RUSSELL2000" "RUSSELL3000" "DOW" "DOWGLOBAL"
[6] "SP400" "SP500" "SP600" "SP1000"

> tq_index("SP500")
Getting holdings for SP500
# A tibble: 0 x 0
Warning message:
In tq_index("SP500") : Error at SP500 during download.
Error: .onLoad failed in loadNamespace() for 'rJava', details:
  call: fun(libname, pkgname)
  error: No CurrentVersion entry in Software/JavaSoft registry! Try re-installing Java and make sure R and Java have matching architectures.

> sessionInfo()
R version 3.3.2 (2016-10-31)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 7 x64 (build 7601) Service Pack 1

locale:
[1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C
[5] LC_TIME=English_United States.1252

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
 [1] tidyquant_0.5.3               dplyr_0.7.2
 [3] purrr_0.2.3                   readr_1.1.1
 [5] tidyr_0.6.3                   tibble_1.3.3
 [7] ggplot2_2.2.1                 tidyverse_1.1.1
 [9] quantmod_0.4-10               TTR_0.23-2
[11] PerformanceAnalytics_1.4.3541 xts_0.10-0
[13] zoo_1.8-0                     lubridate_1.6.0

loaded via a namespace (and not attached):
 [1] Rcpp_0.12.12     cellranger_1.1.0 plyr_1.8.4       bindr_0.1
 [5] forcats_0.2.0    tools_3.3.2      jsonlite_1.5     nlme_3.1-131
 [9] gtable_0.2.0     lattice_0.20-35  pkgconfig_2.0.1  rlang_0.1.1
[13] psych_1.7.5      curl_2.8.1       parallel_3.3.2   haven_1.1.0
[17] bindrcpp_0.2     xml2_1.1.1       httr_1.2.1       stringr_1.2.0
[21] hms_0.3          grid_3.3.2       glue_1.1.1       R6_2.2.2
[25] Quandl_2.8.0     readxl_1.0.0     foreign_0.8-69   modelr_0.1.1
[29] reshape2_1.4.2   magrittr_1.5     scales_0.4.1     rvest_0.3.2
[33] assertthat_0.2.0 mnormt_1.5-5     colorspace_1.3-2 stringi_1.1.5
[37] lazyeval_0.2.0   munsell_0.4.3    broom_0.4.2

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
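The error text points at the Java installation that rJava tries to load rather than at tidyquant itself. A minimal diagnostic sketch in base R, assuming nothing beyond a standard R install, shows which architecture the R session is running as and whether a JAVA_HOME setting is visible to it (the variable names here are just illustrative):

```r
# Architecture of the running R session; "x86_64" indicates 64-bit R.
r_arch <- R.version$arch

# Pointer size in bytes: 8 for 64-bit R, 4 for 32-bit R.
ptr_bytes <- .Machine$sizeof.pointer

# JAVA_HOME as seen by R; an empty string means the variable is not set.
java_home <- Sys.getenv("JAVA_HOME")

cat("R architecture:", r_arch, "\n")
cat("Pointer size  :", ptr_bytes, "bytes\n")
cat("JAVA_HOME     :", if (nzchar(java_home)) java_home else "(not set)", "\n")
```

If R reports 64-bit, the error message suggests that a 64-bit Java is the one that needs to be installed and found.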
[R] Looping Through QuantMod Objects
Dear R Helpers,

I have run into a problem trying to perform a number of actions on a set of quantmod data objects through a loop and I am hoping that this is an easy problem for someone else as opposed to very difficult for me.

The example task is to get the first three objects of the quarterly balance sheet for a number of companies from the getFinancials object and put them together into a single file. I can do this one by one, but if I try to build a loop and use the get function then the results are not anticipated and leave me baffled.

If I do it one at a time all is good.

require(quantmod)
getFinancials("AAPL")
getFinancials("IBM")
getFinancials("MSFT")

items=c("Cash & Equivalents","Short Term Investments","Cash and Short Term Investments")

HoldQuart<-AAPL.f$BS$Q
CashHold<-subset(HoldQuart,rownames(HoldQuart) %in% items)
CashT<-t(CashHold)
Cashdf<-data.frame(CashT)
Cashdf$tic<-"AAPL"
AAPL.c<-Cashdf

HoldQuart<-IBM.f$BS$Q
CashHold<-subset(HoldQuart,rownames(HoldQuart) %in% items)
CashT<-t(CashHold)
Cashdf<-data.frame(CashT)
Cashdf$tic<-"IBM"
IBM.c<-Cashdf

HoldQuart<-MSFT.f$BS$Q
CashHold<-subset(HoldQuart,rownames(HoldQuart) %in% items)
CashT<-t(CashHold)
Cashdf<-data.frame(CashT)
Cashdf$tic<-"MSFT"
MSFT.c<-Cashdf

BigCash<-rbind(AAPL.c, IBM.c, MSFT.c)
#setwd<-("C:/Users/HP USER/Documents")
#write.csv(BigCash,file="CashList.csv")

When I try to process through this using a loop, however, things go south pretty quickly.

tickerlist<-ls(pattern="^[A-Z]+\\.f")
for( i in 1:1)
{
test<-get(paste0(tickerlist[i],"$BS$Q"))
}
Error in get(paste0(tickerlist[i], "$BS$Q")) : object 'AAPL.f$BS$Q' not found

So I tried to break it up into smaller steps, but the resulting matrix seems to have lost its structure (see below).

If someone could help me out, I sure would appreciate it. Thanks.
--John Sparks tickerlist<-ls(pattern="^[A-Z]+\\.f") for( i in 1:1) { HoldFin<-get(tickerlist[i]) BSQ<-as.matrix(paste0(HoldFin,"$BS$Q")) } BSQ [1,] "list(Q = c(52896, NA, 52896, 32305, 20591, 3718, 2776, NA, NA, NA, NA, 38799, 14097, NA, NA, -165, 14684, 11029, NA, NA, 11029, NA, NA, NA, 11029, NA, 11029, 11029, NA, NA, NA, NA, 5261.69, 2.1, NA, 0.57, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 2.1, 78351, NA, 78351, 48175, 30176, 3946, 2871, NA, NA, NA, NA, 54992, 23359, NA, NA, 122, 24180, 17891, NA, NA, 17891, NA, NA, NA, 17891, NA, 17891, 17891, NA, NA, NA, NA, 5327.99, 3.36, NA, 0.57, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 3.36, \n46852, NA, 46852, 29039, 17813, 3482, 2570, NA, NA, NA, NA, 35091, 11761, NA, NA, -159, 12188, 9014, NA, NA, 9014, NA, NA, NA, 9014, NA, 9014, 9014, NA, NA, NA, NA, 5393.33, 1.67, NA, 0.57, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 1.67, 42358, NA, 42358, 26252, 16106, 3441, 2560, NA, NA, NA, NA, 32253, 10105, NA, NA, -263, 10469, 7796, NA, NA, 7796, NA, NA, NA, 7796, NA, 7796, 7796, NA, NA, NA, NA, 5472.78, 1.42, NA, 0.57, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 1.42, 50557, NA, 50557, \n30636, 19921, 3423, 2511, NA, NA, NA, NA, 36570, 13987, NA, NA, -510, 14142, 10516, NA, NA, 10516, NA, NA, NA, 10516, NA, 10516, 10516, NA, NA, NA, NA, 5540.89, 1.9, NA, 0.52, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 1.9), A = c(215639, NA, 215639, 131376, 84263, 14194, 10045, NA, NA, NA, NA, 155615, 60024, NA, NA, -1195, 61372, 45687, NA, NA, 45687, NA, NA, NA, 45687, NA, 45687, 45687, NA, NA, NA, NA, 5500.28, 8.31, NA, 2.18, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 8.31, 233715, NA, \n233715, 140089, 93626, 14329, 8067, NA, NA, NA, NA, 162485, 71230, NA, NA, -903, 72515, 53394, NA, NA, 53394, NA, NA, NA, 53394, NA, 53394, 53394, NA, NA, NA, NA, 5793.07, 9.22, NA, 1.98, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 9.22, 182795, NA, 182795, 112258, 70537, 11993, 6041, NA, NA, NA, NA, 130292, 52503, NA, NA, 
-311, 53483, 39510, NA, NA, 39510, NA, NA, NA, 39510, NA, 39510, 39510, NA, NA, NA, 0, 6122.66, 6.45, NA, 1.81, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 6.45, 170910, \nNA, 170910, 106606, 64304, 10830, 4475, NA, NA, NA, NA, 121911, 48999, NA, NA, -24, 50155, 37037, NA, NA, 37037, NA, NA, NA, 37037, NA, 37037, 37037, NA, NA, NA, 0, 6521.5, 5.68, NA, 1.63, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 5.68))$BS$Q" [2,] "list(Q = c(NA, 59501, 67101, 11579, NA, 20612, 2910, NA, 11367, 101990, 65124, -37961, 5473, 2617, 189740, 7549, 334532, 28573, 21665, 9992, 3999, 9113, 73342, 84531, NA, 84531, 98522, 28226, NA, 14351, 200450, NA, NA, 33579, NA, 100925, NA, -902, 134082, 334532, NA, 5205.81, NA, 51093, 60452, 14057, NA, 27977, 2712, NA, 12191, 103332, 62759, -36249, 5423, 2848, 185638, 7390, 331141, 38510, 21895, 10493, 3499, 9733, 84130, 73557, NA, 73557, 87549, 26948, NA, 14116, 198751, NA, NA, 32144, NA, 11, \nNA, -1567, 132390, 331141, NA, 5255.42, NA, 58554, 67155, 15754, NA, 29299, 2132, NA, 8283, 106869, 61245, -34235, 5414, 3206, 170430, 8757, 321686, 37294, 20951, 8105, 3500, 9156, 79006, 75427, NA, 75427, 87032, 26019, NA, 12985,
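For what it is worth, get() looks up exactly one object name, so the string "AAPL.f$BS$Q" is searched for as a single literal name, and paste0() on the fetched object only deparses it into text. Fetching the object first and then indexing into it avoids both problems. A minimal sketch, assuming small mock objects in place of what getFinancials() created (make_fin and all numbers are made up for illustration):

```r
# Mock objects standing in for the *.f objects from the post;
# make_fin() and every number here are hypothetical.
make_fin <- function(cash, sti) {
  q <- matrix(c(cash, sti, cash + sti, 999), nrow = 4, ncol = 1,
              dimnames = list(c("Cash & Equivalents",
                                "Short Term Investments",
                                "Cash and Short Term Investments",
                                "Total Assets"),
                              "2017-06-30"))
  list(BS = list(Q = q))
}
AAPL.f <- make_fin(10, 20)
IBM.f  <- make_fin(30, 40)

items <- c("Cash & Equivalents", "Short Term Investments",
           "Cash and Short Term Investments")
tickerlist <- c("AAPL.f", "IBM.f")  # in the post: ls(pattern = "^[A-Z]+\\.f")

pieces <- lapply(tickerlist, function(nm) {
  bsq  <- get(nm)$BS$Q                        # fetch the object, then index it
  cash <- bsq[rownames(bsq) %in% items, , drop = FALSE]
  df   <- data.frame(t(cash))                 # one row per reporting date
  df$tic <- sub("\\.f$", "", nm)              # ticker taken from the object name
  df
})
BigCash <- do.call(rbind, pieces)
print(BigCash)
```

Collecting the per-ticker pieces in a list and binding them once at the end also removes the need for the separate AAPL.c, IBM.c and MSFT.c objects.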
Re: [R] Setting .Rprofile for RStudio on a Windows 7 x64bit
Bruce,

Do you think that you could post the final solution to the problem? That way it would be stored with this thread and the next person who has the same problem would be able to locate the FINAL solution.

--JJS

On Mon, April 17, 2017 12:47 pm, BR_email wrote:
> TO _ALL_:
> THANK YOU. THANK YOU. THANK YOU.
> After hours, and hours, and hours, and ... , and hours: Success.
> To all who helped, thanks.
> My quest was minor, but major for me, as I learn from the path of one,
> whether big or small begets another.
>
> I never look down at anyone, except to help him/her up.
>
> With gratitude,
> Bruce
>
> Bruce Ratner, Ph.D.
> The Significant Statistician™
> (516) 791-3544
> Statistical Predictive Analytics -- www.DMSTAT1.com
> Machine-Learning Data Mining and Modeling -- www.GenIQ.net
>
>
> Peter Dalgaard wrote:
>>> On 17 Apr 2017, at 19:01 , BR_email wrote:
>>>
>>> Berend: Something looks good, but RStudio still Rprofile still does
>>> not affect the launch.
>>> source(echo=TRUE, "C:/Users/BruceRatner/Documents/.Rprofile.site")
>>> options(prompt="R> ")
>>> set.seed(12345)
>>> rm(list=ls())
>>> R>
>>>
>>>
>>> Bruce Ratner, Ph.D.
>>> The Significant Statistician™
>>> (516) 791-3544
>>> Statistical Predictive Analytics -- www.DMSTAT1.com
>>> Machine-Learning Data Mining and Modeling -- www.GenIQ.net
>>>
>>> Berend Hasselman wrote:
>>>> source(echo=TRUE, "C:/Users/BruceRatner/Documents/.Rprofile.site")
>> According to the gospel of St.Henrik, that filename is wrong, and
>> possibly the directory too.
>>
>> So try his suggestions. What is the output (show us!) of
>>
>> normalizePath("./.Rprofile")
>> normalizePath("~/.Rprofile")
>>
>> Assuming that the former is
>>
>> "C:/Users/BruceRatner/Documents/.Rprofile"
>>
>> you could try renaming the .Rprofile.site file to that. If need be, use
>> file.rename, as in
>>
>> file.rename(from="C:/Users/BruceRatner/Documents/.Rprofile.site",
>> to="C:/Users/BruceRatner/Documents/.Rprofile")
>>
>> (and restart, obviously).
>>
>> [I wouldn't set the seed in a .Rprofile file, nor would I use rm()
>> there, but that is a different kettle of fish.]
>>
[R] cforest Single Tree Output for Categorical Variable
Hello R Helpers,

I am building a random forest using the cforest method in the party package. I then want to have a look at the characteristics of a few of the trees. I get the output for one of the trees by executing

pt <- party:::prettytree(cforest@ensemble[[3]], names(cforest@data@get("input")))
pt

The first splitting variable is a categorical variable (here named cat, which is a factor containing the values 0 through 9), but the output does not specify which values went into which part of the tree:

1) cat == {}; criterion = 1, statistic = 32.792

Can anyone help me to get the detail on this splitting variable to appear in the output?

I regret that I cannot send a reproducible example because the data is proprietary. I will try to work up an example with a public data set that has the same problem.

Any help would be much appreciated.

Best wishes,
--John J. Sparks, Ph.D.
[R] Pull Stock Symbol Out of String
Dear R Helpers,

My regex skills are beginner to intermediate, and banging around the web has not resulted in a solution to the problem below, so I hope that one of you who has mad skills can help me out.

I want to extract the stock ticker (AMT) out of the string

American Tower Corporation (REIT) (AMT)

The presence of the other parenthetical text (REIT) makes this difficult. Please note that the string may or may not have an interfering set of characters such as the (REIT), so the solution needs to generalize to the last set of characters contained in parentheses in the larger string.

An example of a string without the interfering (REIT) would be

Aetna Inc. (AET)

Your assistance would be very much appreciated.

--John Sparks
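One possible approach, sketched here in base R, is to anchor the pattern at the end of the string so that only the last parenthesized group can match. The helper name tickers_from is hypothetical, and the sketch assumes the ticker is always the final parenthesized group, optionally followed by whitespace:

```r
# Return the text inside the last pair of parentheses in each string.
# ".*" is greedy, and the "$" anchor forces the captured group to be the
# final "(...)" in the string; "[^()]+" keeps the capture free of parentheses.
tickers_from <- function(x) {
  sub(".*\\(([^()]+)\\)[[:space:]]*$", "\\1", x)
}

x <- c("American Tower Corporation (REIT) (AMT)", "Aetna Inc. (AET)")
print(tickers_from(x))
```

Strings that do not end in a parenthesized group do not match the pattern and come back unchanged, so a check for that case may be worth adding before using the result.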
Re: [R] Grap Element from Web Page
Thanks, the second approach worked fine on Windows.

--JJS

On Thu, August 15, 2013 8:38 am, Jeffrey Dick wrote:

Sorry, I can't generate an error when running those commands in R on Linux 64-bit. But if I move to Windows (R version 3.0.1, XML_3.98-1.1), I get a different error ...

require(XML)
Loading required package: XML
doc <- htmlTreeParse("http://www.sec.gov/cgi-bin/browse-edgar?CIK=MSFT&Find=Search&owner=exclude&action=getcompany")
node <- getNodeSet(doc[[1]], "//link[@rel='alternate']")
Input is not proper UTF-8, indicate encoding !
Bytes: 0xC2 0x0A 0x20 0x20
Error: 1: Input is not proper UTF-8, indicate encoding !
Bytes: 0xC2 0x0A 0x20 0x20
node <- getNodeSet(doc, "//link[@rel='alternate']")
Error in UseMethod("xpathApply") : no applicable method for 'xpathApply' applied to an object of class "XMLDocumentContent"

... note that I've tried both doc[[1]] and doc in the function call. Also, only the XML library is required. I'm not sure what's going on with the character encoding error, might be my system settings.

Reading the help page (?htmlTreeParse) provides a clue to use the htmlParse function instead, equivalent to setting the useInternalNodes parameter to TRUE ... "These can then be searched using XPath expressions via 'xpathApply' and 'getNodeSet'." That seems to be relevant to this case.

doc <- htmlParse("http://www.sec.gov/cgi-bin/browse-edgar?CIK=MSFT&Find=Search&owner=exclude&action=getcompany")
node <- xpathSApply(doc, "//link[@rel='alternate']", xmlAttrs)
node
      [,1]
rel   "alternate"
type  "application/atom+xml"
title "ATOM"
href  "/cgi-bin/browse-edgar?action=getcompany&CIK=789019&type=&dateb=&owner=exclude&count=40&output=atom"
strsplit(strsplit(node[[4]], "CIK=")[[1]][2], "&type")[[1]][1]
[1] "789019"

Perhaps that approach is less prone to error.

On Thu, Aug 15, 2013 at 12:48 PM, Sparks, John James jspa...@uic.edu wrote:

Thanks so much for looking into this for me. Unfortunately, I get an error when I execute your code. Is there a library that you loaded that I haven't?

require(scrapeR)
require(XML)
require(RCurl)
doc <- htmlTreeParse("http://www.sec.gov/cgi-bin/browse-edgar?CIK=MSFT&Find=Search&owner=exclude&action=getcompany")
node <- getNodeSet(doc[[1]], "//link[@rel='alternate']")
Error in UseMethod("xpathApply") : no applicable method for 'xpathApply' applied to an object of class "character"

Guidance would be much appreciated.

--JJS

On Wed, August 14, 2013 4:19 am, Jeffrey Dick wrote:

Hi,

There are many occurrences of the CIK number in the page source. This pulls out the first node containing it:

node <- getNodeSet(doc[[1]], "//link[@rel='alternate']")

From there you can extract the number. Here's one way to do it.

strsplit(strsplit(unlist(node)[[5]], "CIK=")[[1]][2], "&type")[[1]][1]

Jeff

On Wed, Aug 14, 2013 at 1:34 PM, Sparks, John James jspa...@uic.edu wrote:

Dear R Helpers,

I would like to pull the CIK number from the web page

http://www.sec.gov/cgi-bin/browse-edgar?CIK=MSFT&Find=Search&owner=exclude&action=getcompany

If you put this web page into your browser you will see the CIK number in red on the left side of the page near the top.

When I try the basic

require(scrapeR)
require(XML)
require(RCurl)
doc <- htmlTreeParse("http://www.sec.gov/cgi-bin/browse-edgar?CIK=MSFT&Find=Search&owner=exclude&action=getcompany")
str(doc)

I get a large number of items in the data frame that I don't know how to interpret.

Both

tables <- readHTMLTable(doc)

and

list <- xmlToList(doc)

result in errors.

Any (positive) guidance would be much appreciated.

--John J. Sparks, Ph.D.
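As an alternative to the nested strsplit() calls, the CIK can be pulled out of the href attribute with a single regular expression. This is only a sketch in base R: it works on a hard-coded copy of the href value shown in the thread (with the ampersands assumed to be as in a normal query string) and does not fetch or parse the page itself.

```r
# href value as reported by xpathSApply() in the thread; hard-coded here so
# the example runs without network access or the XML package.
href <- paste0("/cgi-bin/browse-edgar?action=getcompany&CIK=789019",
               "&type=&dateb=&owner=exclude&count=40&output=atom")

# Capture the run of digits that follows "CIK=" when it appears as a
# query-string parameter (preceded by "?" or "&").
cik <- sub(".*[?&]CIK=([0-9]+).*", "\\1", href)
print(cik)
```

The result is a character string; as.integer(cik) would drop any leading zeros, which may or may not be wanted for CIK numbers.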
Re: [R] Grap Element from Web Page
Thanks so much for looking into this for me. Unfortunately, I get an error when I execute your code. Is there a library that you loaded that I haven't?

require(scrapeR)
require(XML)
require(RCurl)
doc <- htmlTreeParse("http://www.sec.gov/cgi-bin/browse-edgar?CIK=MSFT&Find=Search&owner=exclude&action=getcompany")
node <- getNodeSet(doc[[1]], "//link[@rel='alternate']")
Error in UseMethod("xpathApply") : no applicable method for 'xpathApply' applied to an object of class "character"

Guidance would be much appreciated.

--JJS

On Wed, August 14, 2013 4:19 am, Jeffrey Dick wrote:

Hi,

There are many occurrences of the CIK number in the page source. This pulls out the first node containing it:

node <- getNodeSet(doc[[1]], "//link[@rel='alternate']")

From there you can extract the number. Here's one way to do it.

strsplit(strsplit(unlist(node)[[5]], "CIK=")[[1]][2], "&type")[[1]][1]

Jeff

On Wed, Aug 14, 2013 at 1:34 PM, Sparks, John James jspa...@uic.edu wrote:

Dear R Helpers,

I would like to pull the CIK number from the web page

http://www.sec.gov/cgi-bin/browse-edgar?CIK=MSFT&Find=Search&owner=exclude&action=getcompany

If you put this web page into your browser you will see the CIK number in red on the left side of the page near the top.

When I try the basic

require(scrapeR)
require(XML)
require(RCurl)
doc <- htmlTreeParse("http://www.sec.gov/cgi-bin/browse-edgar?CIK=MSFT&Find=Search&owner=exclude&action=getcompany")
str(doc)

I get a large number of items in the data frame that I don't know how to interpret.

Both

tables <- readHTMLTable(doc)

and

list <- xmlToList(doc)

result in errors.

Any (positive) guidance would be much appreciated.

--John J. Sparks, Ph.D.
[R] Grap Element from Web Page
Dear R Helpers,

I would like to pull the CIK number from the web page

http://www.sec.gov/cgi-bin/browse-edgar?CIK=MSFT&Find=Search&owner=exclude&action=getcompany

If you put this web page into your browser you will see the CIK number in red on the left side of the page near the top.

When I try the basic

require(scrapeR)
require(XML)
require(RCurl)
doc <- htmlTreeParse("http://www.sec.gov/cgi-bin/browse-edgar?CIK=MSFT&Find=Search&owner=exclude&action=getcompany")
str(doc)

I get a large number of items in the data frame that I don't know how to interpret.

Both

tables <- readHTMLTable(doc)

and

list <- xmlToList(doc)

result in errors.

Any (positive) guidance would be much appreciated.

--John J. Sparks, Ph.D.
[R] T test for Single Mean
Dear R Helpers,

I am stuck on some syntax, and I thought that I was following one of the examples that I found out there quite faithfully.

I just want to know how to do a t test on a single mean for whether or not it is greater than a specific value. So I am using the data set sleep and I want to know if the mean of extra is greater than zero.

I was under the impression that the syntax is

t.test(sleep$extra,mu=0,"greater")

but I get the error message

Error in t.test.default(sleep$extra, mu = 0, "greater") :
  not enough 'y' observations

I have tried this on a few other data sets that have more than 20 observations and I get the same error.

I looked in the documentation, but the examples are for the comparison of two groups, not a single group mean.

Any help would be most appreciated.

--John J. Sparks, Ph.D.
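The error arises because the second positional argument of t.test() is y, the second sample, so an unnamed "greater" is taken as a one-element second sample rather than as the direction of the test. Naming the argument gives the one-sample, one-sided test. A minimal sketch using the built-in sleep data:

```r
# One-sample t test of H0: mean(extra) = 0 against H1: mean(extra) > 0.
# 'alternative' must be named; an unnamed second argument is matched to 'y'.
res <- t.test(sleep$extra, mu = 0, alternative = "greater")
print(res)

# Individual pieces of the result are available by name.
res$statistic   # t statistic
res$parameter   # degrees of freedom
res$p.value     # one-sided p-value
```

With alternative = "greater" the reported confidence interval is one-sided as well, with an upper bound of Inf.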
[R] Refer to Data Frame Name Inside a List
Dear R Helpers,

I have a fairly complicated list of data frames. To give you an idea of the structure, the top of the str output is shown below.

How do I refer to the data.frame name for each data.frame in the list? That is, how can I pull the terms Advertising2007, AirFreightDelivery2007, Apparel2007 etc. out of the list? I need them to keep track of correlations that I am doing inside each data frame of the list.

Apologies for not sending a reproducible example. I am hoping that someone knows this off the top of their head.

--John Sparks

str(ResList)
List of 60
 $ Advertising2007 :'data.frame': 21 obs. of 10 variables:
  ..$ RFPred : num [1:21] -0.01749 -0.00801 -0.01155 -0.01494 -0.03715 ...
  ..$ marsPred : num [1:21] 0.0901 0.0127 0.0616 0.0618 -0.0559 ...
  ..$ GainRepAft3 : num [1:21] -0.0673 -0.0183 -0.2353 0.0294 -0.059 ...
  ..$ Industry : chr [1:21] "Advertising2007" "Advertising2007" "Advertising2007" "Advertising2007" ...
  ..$ dateavail: Factor w/ 346 levels "2008-02-01","2008-02-13",..: 18 4 14 12 13 19 1 15 17 8 ...
  ..$ FinYearEnd : Factor w/ 12 levels "2007-12-01","2007-03-01",..: 1 1 1 1 1 1 1 1 1 1 ...
  ..$ GainAft1Aft30: num [1:21] -0.2376 -0.1384 -0.1176 0.0145 0.0527 ...
  ..$ GainAft1Aft60: num [1:21] -0.36212 -0.17801 -0.23529 -0.00501 -0.27414 ...
  ..$ GainAft1Aft90: num [1:21] -0.516 -0.203 -0.176 0.024 -0.241 ...
  ..$ groups : Factor w/ 40 levels "-0.04013239",..: 4 11 8 6 1 1 10 13 2 5 ...
 $ AirFreightDelivery2007 :'data.frame': 20 obs. of 10 variables:
  ..$ RFPred : num [1:20] 0.00322 -0.00351 0.034 0.01095 0.02237 ...
  ..$ marsPred : num [1:20] -0.013 -0.109 0.0662 0.0353 0.0662 ...
  ..$ GainRepAft3 : num [1:20] 0.0344 -0.0659 0.054 0.045 0.0266 ...
  ..$ Industry : chr [1:20] "AirFreightDelivery2007" "AirFreightDelivery2007" "AirFreightDelivery2007" "AirFreightDelivery2007" ...
  ..$ dateavail: Factor w/ 346 levels "2008-02-01","2008-02-13",..: 22 10 26 33 35 32 25 23 31 10 ...
  ..$ FinYearEnd : Factor w/ 12 levels "2007-12-01","2007-03-01",..: 2 1 1 1 1 1 1 3 1 1 ...
  ..$ GainAft1Aft30: num [1:20] -0.0656 -0.1539 -0.1002 -0.0694 -0.4101 ...
  ..$ GainAft1Aft60: num [1:20] -0.133 -0.141 -0.242 -0.691 -0.212 ...
  ..$ GainAft1Aft90: num [1:20] -0.0523 -0.0673 -0.1793 -0.6875 -0.187 ...
  ..$ groups : Factor w/ 40 levels "-0.04013239",..: 24 16 39 32 37 21 17 30 35 37 ...
 $ Apparel2007 :'data.frame': 28 obs. of 10 variables:
  ..$ RFPred : num [1:28] 0.011439 0.021311 0.014564 0.018168 -0.000892 ...
  ..$ marsPred : num [1:28] -0.001463 0.0345 0.027227 -0.000129 -0.006483 ...
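The labels after each `$` in the str output are the names of the list elements, so names() returns them, and sapply() carries them through to its result. A minimal sketch with a small mock list in place of the real ResList (the numbers are made up; only the column names RFPred and GainRepAft3 come from the post):

```r
# Mock stand-in for ResList with two of the sixty data frames.
ResList <- list(
  Advertising2007 = data.frame(RFPred      = c(-0.017, -0.008, -0.012),
                               GainRepAft3 = c(-0.067, -0.018, -0.235)),
  Apparel2007     = data.frame(RFPred      = c(0.011, 0.021, 0.015),
                               GainRepAft3 = c(0.020, 0.050, 0.010))
)

# The data frame names are simply the names of the list elements.
print(names(ResList))

# sapply() keeps those names, so each correlation stays labelled by industry.
cors <- sapply(ResList, function(df) cor(df$RFPred, df$GainRepAft3))
print(cors)
```

When the name is needed inside the loop body itself, iterating over names(ResList) and indexing with ResList[[nm]] is one common pattern.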
[R] Selecting A List of Columns
Dear R Helpers,

I need help with a slightly unusual situation in which I am trying to select some columns from a data frame. I know how to use the subset statement with column names as in:

x=as.data.frame(matrix(c(1,2,3, 1,2,3, 1,2,2, 1,2,2, 1,1,1),ncol=3,byrow=T))
all.cols<-colnames(x)
to.keep<-all.cols[1:2]
Kept<-subset(x,select=to.keep)
Kept

However, if I want to select some columns based on a selection of the most important variables from a random forest then I find myself stuck. The example below demonstrates the problem.

library(randomForest)
data(mtcars)
mtcars.rf <- randomForest(mpg ~ ., data=mtcars,importance=TRUE)
Importance<-data.frame(mtcars.rf$importance)
Importance
MSEImportance<-head(Importance[order(Importance$X.IncMSE, decreasing=TRUE),],3)
MSEVars<-row.names(MSEImportance)
MSEVars<-data.frame(MSEVars,stringsAsFactors = FALSE)
colnames(MSEVars)<-"Vars"
NodeImportance<-head(Importance[order(Importance$IncNodePurity,decreasing=TRUE),], 3)
NodeVars<-row.names(NodeImportance)
NodeVars<-data.frame(NodeVars,stringsAsFactors = FALSE)
colnames(NodeVars)<-"Vars"
ImportantVars<-rbind(MSEVars,NodeVars)
ImportantVars<-unique(ImportantVars)
nrow(ImportantVars)
ImportantVars<-as.character(ImportantVars)
ImportantVars
CarsVarsKept<-subset(mtcars,select=ImportantVars)
Error in `[.data.frame`(x, r, vars, drop = drop) : undefined columns selected

Any help on how to select these columns from the data frame would be most appreciated.

--John J. Sparks, Ph.D.
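The likely culprit is as.character() applied to a whole data frame: it deparses each column into a single string such as c("wt", "disp"), which is not a column name. Keeping the variable names as a plain character vector avoids the problem. A minimal sketch, assuming a small mock importance matrix in place of mtcars.rf$importance so that it runs without randomForest (the importance values are made up):

```r
# Mock importance matrix; in the post this is mtcars.rf$importance.
imp <- matrix(c(10,  8,  2,  6,     # "%IncMSE" column
                50, 90, 70, 10),    # "IncNodePurity" column
              ncol = 2,
              dimnames = list(c("wt", "disp", "hp", "cyl"),
                              c("%IncMSE", "IncNodePurity")))
Importance <- data.frame(imp)       # "%IncMSE" is renamed to "X.IncMSE"

# Top variables by each measure, kept as character vectors throughout.
top_mse  <- row.names(head(Importance[order(Importance$X.IncMSE,
                                            decreasing = TRUE), ], 2))
top_node <- row.names(head(Importance[order(Importance$IncNodePurity,
                                            decreasing = TRUE), ], 2))
ImportantVars <- unique(c(top_mse, top_node))

# A character vector of names can be used directly for column selection.
CarsVarsKept <- mtcars[, ImportantVars, drop = FALSE]
print(head(CarsVarsKept))
```

In the original code, replacing as.character(ImportantVars) with ImportantVars$Vars should have the same effect, since that column already holds the names as a character vector.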
[R] To List or Not To List
Dear R Helpers,

A few weeks ago I asked for some help on how to accomplish modifications to data in a set of data frames. As part of that request I mentioned that I realized that one way to accomplish my goal was to put the data frames together in a list, but that I was looking for a way to do it with data frames and a loop because I believe the better thing is to work df by df for my particular situation. A couple of posters asked me to provide more detail as to what it is about my situation that made data frame alterations in a loop more appropriate than a list. Life and the scoring of many exams intervened in the last several days, but with grades filed I am now able to return to this issue.

First, let me provide some particulars regarding my situation. I am working with 5,863 data frames, each with 7 columns and between 21 and 5,686 rows of data. Each data frame contains the daily stock price history for an equity traded on one of the U.S. markets. I wanted to get an historical price change for each of the days on the file. If one were working with a single data frame for IBM then the command is

if(nrow(IBM)>129){IBM$Mo129<-ROC(IBM[,"Close"],n=129)}

to get the Rate Of Change of the stock price relative to 129 trading days ago. This function is in the TTR library, which is loaded by quantmod.

So it strikes me that in one sense this is a simple fixed costs vs. variable costs question: Is it worth it to assemble the data frames into a list and then process them, putatively more quickly than going data frame by data frame, which does not require the up-front assembly?

A look at the empirical results shows that executing this set of functions df by df consumes 44.15 seconds of elapsed time.

> ptm <- proc.time()
> ROCFunc<-function(DF){
+ if(nrow(DF)>129){DF$Mo129<-ROC(DF[,"Close"],n=129)}
+ if(nrow(DF)> 65){DF$Mo65 <-ROC(DF[,"Close"],n= 65)}
+ if(nrow(DF)> 21){DF$Mo21 <-ROC(DF[,"Close"],n= 21)}
+ if(nrow(DF)> 10){DF$Mo10 <-ROC(DF[,"Close"],n= 10)}
+ if(nrow(DF)> 5){DF$Mo5 <-ROC(DF[,"Close"],n= 5)}
+ return(DF)
+ }
> for(i in symbols) assign( i, ROCFunc(get(i)))
> time<-proc.time() - ptm
> time
   user  system elapsed
  43.52    0.58   44.15

Using a list approach, the assembly of the list requires 8.44 seconds and then the processing requires 39.20 seconds, totaling 47.64 seconds. So a slight win for the data frame approach.

> ptm <- proc.time()
> list.object <- quote(list())
> list.object[ symbols ] <- lapply( symbols, as.name )
> biglist<-eval(list.object)
> for (i in seq_along(biglist))
+ {
+ biglist[[i]]<-subset(biglist[[i]],select=-c(Open,High,Low))
+ #biglist[[i]]<-biglist[[i]][as.character(biglist[[i]]$Index) > "2007-01-01", ]
+ #biglist[[i]]$Index<- as.Date(biglist[[i]]$Index,format="%Y-%m-%d")
+ #biglist[[i]]<-xts(biglist[[i]][,-1],biglist[[i]][,1])
+ #biglist[[i]]<-biglist[[i]]['2005-01-01/']
+ }
> proc.time() - ptm
   user  system elapsed
   8.03    0.40    8.44

> ptm <- proc.time()
> rm(list=ls(pattern="^[A-Z]"))
> for (i in seq_along(biglist))
+ {
+ if(nrow(biglist[[i]])>180)
+ {
+ biglist[[i]][["Mo180"]]<-ROC(biglist[[i]][["Close"]],n=129)
+ }
+ if(nrow(biglist[[i]])>90)
+ {
+ biglist[[i]][["Mo90"]] <-ROC(biglist[[i]][["Close"]],n=65)
+ }
+ if(nrow(biglist[[i]])>30)
+ {
+ biglist[[i]][["Mo30"]] <-ROC(biglist[[i]][["Close"]],n=21)
+ }
+ if(nrow(biglist[[i]])>10)
+ {
+ biglist[[i]][["Mo10"]] <-ROC(biglist[[i]][["Close"]],n=10)
+ }
+ if(nrow(biglist[[i]])>5)
+ {
+ biglist[[i]][["Mo5"]] <-ROC(biglist[[i]][["Close"]],n=5)
+ }
+ }
> proc.time() - ptm
   user  system elapsed
  39.19    0.00   39.20

The larger issue for me, however, is recovering the set of data frames with the new calculations completed inside each one. For this I used the following syntax that I gleaned from the web:

data.frame(lapply(data.frame(t(sapply(biglist, `[`))), unlist))

But this results in

Error in FUN(X[[2003L]], ...) :
  promise already under evaluation: recursive default argument reference or earlier problems?
Calls: data.frame -> lapply -> FUN
Execution halted

In previous executions I have seen the all too familiar error message 'unable to allocate a vector of size...', indicating to me that I have run out of usable RAM at this last step. I have 8 GB on my machine, so RAM constraints are rarely a problem.

This is the main reason that I said that I believed that a list approach was not the best for my situation: going that route will not result in a finished job.

I hope that this demonstration answers the questions of the posters who posed the question and can potentially serve to provide an example to those who, like me recently, are beginning to explore how to execute on multiple data frames. I hope that this outweighs the fact that I have not asked a specific question nor provided re-producible
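For the step of getting from the processed list back to individual data frames, one option that avoids combining everything into a single object is mget() to build the named list and list2env() to write the elements back out by name. The sketch below is an illustration under assumptions: roc_simple is a hypothetical stand-in for TTR::ROC (log rate of change over n rows), the two tiny data frames and the environment `work` are made up, and a dedicated environment is used instead of the global workspace.

```r
# Hypothetical stand-in for TTR::ROC: log change relative to n rows earlier,
# padded with NA so the result has the same length as the input.
roc_simple <- function(x, n) c(rep(NA_real_, n), diff(log(x), lag = n))

# A dedicated environment holding one data frame per symbol (mock data).
work <- new.env()
assign("AAA", data.frame(Close = c(10, 11, 12, 13, 14, 15, 16)), envir = work)
assign("BBB", data.frame(Close = c(5, 6, 7)), envir = work)
symbols <- c("AAA", "BBB")

# Add the column only when there are enough rows, as in the post.
add_roc <- function(DF) {
  if (nrow(DF) > 5) DF$Mo5 <- roc_simple(DF$Close, 5)
  DF
}

biglist <- mget(symbols, envir = work)      # named list of data frames
biglist <- lapply(biglist, add_roc)         # process every element
invisible(list2env(biglist, envir = work))  # write each one back by name

print(work$AAA)
print(work$BBB)
```

Because list2env() assigns each element under its own name, no intermediate combined data frame is built, which is where the memory pressure in the original attempt most plausibly came from.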
[R] Adding Column to Data Frames Using a Loop
Dear R Helpers,

I am trying to do calculations on multiple data frames and do not want to create a list of them to go through each one. I know that lists have many wonderful advantages, but I believe the better thing is to work df by df for my particular situation. For background, I have already received some wonderful help on how to handle some situations, such as removing columns:

x=as.data.frame(matrix(c(1,2,3, 1,2,3, 1,2,2, 1,2,2, 1,1,1),ncol=3,byrow=T))
y=as.data.frame(matrix(c(1,2,3, 1,2,3, 1,2,2, 1,2,2, 1,1,1),ncol=3,byrow=T))
z=as.data.frame(matrix(c(1,2,3, 1,2,3, 1,2,2, 1,2,2, 1,1,1),ncol=3,byrow=T))
for(i in letters[24:26] ) assign( i, subset(get(i), select=-c(V1)) )
x
y
z

And I figured out how to do further processing using functions:

myfunc<-function(DF){
DF$V4<-DF$V2+DF$V3
return(DF)
}
for(i in letters[24:26] ) assign( i, myfunc(get(i)))

But if I want to do a rather simple calculation and store it as a new column in each data frame such as

x$V4<-x$V2+x$V3
y$V4<-y$V2+y$V3
z$V4<-z$V2+z$V3

is there a simpler way to do this than building a function as shown above? I tried a few variations of

i<-24
assign(paste(i,"$V4",sep=""),paste(get(i),"$V2+",get(i),"$V3",sep=""))

but keep getting syntax errors.

If anyone could help with the syntax as to how to accomplish the calculation above without building a function, I would really appreciate it.

--John Sparks
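The paste() attempt builds character strings rather than expressions, and assign() with a name like "x$V4" would create a new object literally called `x$V4` instead of a column of x. One way to stay with the assign()/get() pattern without writing a separate function is transform(), which evaluates the expression inside the data frame. A minimal sketch:

```r
# Three identical example data frames, as in the post.
x <- as.data.frame(matrix(c(1,2,3, 1,2,3, 1,2,2, 1,2,2, 1,1,1),
                          ncol = 3, byrow = TRUE))
y <- x
z <- x

# For each name: fetch the data frame, add V4 = V2 + V3, store it back
# under the same name.
for (nm in c("x", "y", "z")) {
  assign(nm, transform(get(nm), V4 = V2 + V3))
}

print(x)
```

within(get(nm), V4 <- V2 + V3) would be an equivalent alternative if several columns need to be added in one step.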
[R] Function for Data Frame
Dear R Helpers, I have about 20 data frames that I need to apply a series of data scrubbing steps to. I have the data frames in a list so that I can use lapply. I am trying to build a function that will do the data scrubbing that I need. However, I am new to functions and there is something fundamental that I am not understanding. I use the return function at the end of the function and this completes the data processing specified in the function, but leaves the data frame that I want changed unaffected. How do I get my function to apply its results to the data frame in question instead of simply displaying the results to the screen? Any helpful guidance would be most appreciated.

--John Sparks

    x=as.data.frame(matrix(c(1,2,3, 1,2,3, 1,2,2, 1,2,2, 1,1,1),ncol=3,byrow=T))
    myfunc<-function(DF){
      DF<-subset(DF,select=-c(V1))
      return(DF)
    }
    myfunc(x)
    #How to get this change to data frame x?
    #And preferably not send the results to the screen?
    x
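A short sketch of the usual pattern, for what it is worth: R functions work on a copy of their argument, so the result has to be assigned back to a name. The list name `dfs` and the second data frame `y` are made up for the example.

```r
x <- as.data.frame(matrix(c(1,2,3, 1,2,3, 1,2,2, 1,2,2, 1,1,1), ncol = 3, byrow = TRUE))
y <- x

myfunc <- function(DF) {
  DF <- subset(DF, select = -c(V1))
  DF                      # the last value is what the function returns
}

# With the data frames in a list, lapply() returns a new list of results;
# assigning it over the old list keeps the changes.
dfs <- list(x = x, y = y)
dfs <- lapply(dfs, myfunc)

# For a single data frame, reassign in the same way; nothing is printed.
x <- myfunc(x)
```

Calling `myfunc(x)` on its own line prints the result because the value is not captured; the assignment both stores it and suppresses the printing.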
[R] Remove Rows Based on Factor
Dear R Helpers, I did a search for deleting rows based on conditions but wasn't able to find an example that addressed the error that I am getting. I am hoping that this is a simple syntax phenomenon that somebody else knows off the top of their head. My apologies for not providing a reproducible example, but I think that the information given will allow someone to give me a hint. I want to delete the rows of the data frame ZZ where Index is earlier than Jan 1 of 2007. That Index column is a factor. When I tried a couple of different methods, I got the error shown below. Can anybody tell me what I am doing wrong? I would really appreciate it.

--John Sparks

    str(ZZ)
    'data.frame': 1584 obs. of 7 variables:
     $ Index   : Factor w/ 1583 levels "2006-04-07","2006-04-10",..: 1 2 3 4 5 6 7 8 9 10 ...
     $ Open    : num 17.5 17.6 16.8 17.2 17 ...
     $ High    : num 18.2 17.6 17.2 17.2 17.1 ...
     $ Low     : num 17.3 16.8 16.8 16.8 16.6 ...
     $ Close   : num 17.5 16.8 17.1 16.8 16.7 ...
     $ Volume  : num 23834500 2916000 1453700 991400 967400 ...
     $ Adjusted: num 16.8 16.2 16.4 16.2 16 ...

    test<-ZZ[ZZ$Index>"2007-01-01",]
    Warning message:
    In Ops.factor(ZZ$Index, "2007-01-01") : '>' not meaningful for factors

    test<-subset(ZZ,Index>2007-01-01)
    Warning message:
    In Ops.factor(Index, 2007 - 1 - 1) : '>' not meaningful for factors
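A minimal sketch of one fix, assuming the factor levels are ISO-formatted dates as the str() output suggests: convert the factor to Date first, then compare against a Date. The four-row ZZ here is invented so that the example runs on its own.

```r
# Small stand-in for the poster's ZZ (illustrative values only).
ZZ <- data.frame(
  Index = factor(c("2006-04-07", "2006-12-29", "2007-01-03", "2007-06-15")),
  Close = c(17.5, 16.8, 17.1, 16.7)
)

# Go through as.character() so the level labels, not the integer codes, are converted.
ZZ$Index <- as.Date(as.character(ZZ$Index))

# Keep only rows on or after 1 January 2007.
test <- ZZ[ZZ$Index >= as.Date("2007-01-01"), ]
```

Comparison operators are not defined for unordered factors, and in the subset() attempt the unquoted 2007-01-01 is plain arithmetic (2007 minus 1 minus 1), which is why the warning shows `2007 - 1 - 1`.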
[R] Create New Column Inside Data Frame for Many Data Frames
Dear R Helpers, I have a large number of data frames and I need to create a new column inside each data frame. Because there is a large number, I need to loop through this, but I don't know the syntax of assigning a new column name dynamically. Below is a simple example of what I need to do. Assume that I have to do this for all 26 letters and you should see the form of the problem. Any help would be much appreciated. If more information is needed, please let me know. Many thanks.

--John Sparks

    library(quantmod)
    A <- data.frame(population=c(100, 300, 5000, 2000, 900, 2500))
    A$Rate<-ROC(A["population"])
    B <- data.frame(population=c(200, 300, 4000, 3000, 2000, 500))
    B$Rate<-ROC(B["population"])
    letters<-c("A","B")
    length(letters)
    #for (i in letters){
    # HELP!
    #}
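One way the loop could be filled in is sketched here. To keep the example free of package dependencies, `log_change()` is a hypothetical stand-in for quantmod's ROC() (a one-period log change, which is assumed to be close to ROC's default), and `df_names` replaces the poster's `letters` so the built-in constant is not overwritten. The column name is held in a string and used with `[[`, which is the dynamic counterpart of `$`.

```r
A <- data.frame(population = c(100, 300, 5000, 2000, 900, 2500))
B <- data.frame(population = c(200, 300, 4000, 3000, 2000, 500))

# Hypothetical stand-in for ROC(): one-period log change, NA for the first row.
log_change <- function(x) c(NA, diff(log(x)))

df_names <- c("A", "B")
new_col  <- "Rate"        # the column name is just a string

for (nm in df_names) {
  df <- get(nm)                                     # data frame named by nm
  df[[new_col]] <- log_change(df[["population"]])   # [[ ]] accepts a string
  assign(nm, df)                                    # write the result back
}
```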
[R] Try Giving Invalid Argument Type Error
Dear R Helpers, I am getting an error message from the try function that I don't understand so I am hoping that someone can help. I am scraping from web pages, but sometimes they disappear. When that happens I need to control for it with some sort of function. This web page is parsed without a problem.

    exh<-"NASDAQ"
    tic<-"EGHT"
    URL<-paste("http://www.advfn.com/p.php?pid=financials&btn=istart_date&mode=quarterly_reports&symbol=", exh,"%3A",tic,"&istart_date=0", sep = "")
    doc <- htmlParse(URL)

However, when I change the value of tic it will not.

    tic<-"AACOU"
    URL<-paste("http://www.advfn.com/p.php?pid=financials&btn=istart_date&mode=quarterly_reports&symbol=", exh,"%3A",tic,"&istart_date=0", sep = "")
    doc <- htmlParse(URL)
    Error in htmlParse(URL) : error in creating parser for http://www.advfn.com/p.php?pid=financials&btn=istart_date&mode=quarterly_reports&symbol=NASDAQ0X1.CP-1072AACOU&istart_date=0

I tried to account for this using the try function but I get the error below that I don't understand.

    options(error = expression(NULL))
    URL<-paste("http://www.advfn.com/p.php?pid=financials&btn=istart_date&mode=quarterly_reports&symbol=", exh,"%3A",tic,"&istart_date=0", sep = "")
    if( !is( try( doc <- htmlParse(URL) ,"try-error") ) ) { qtrstop <- xpathApply(doc, "count(//select/option)")-5 }
    Error in !silent : invalid argument type

Any help would be most appreciated.

--John Sparks
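The "invalid argument type" message is consistent with a misplaced parenthesis: in the if() line, "try-error" ends up as the second argument of try(), which is its logical `silent` flag, so `!silent` is applied to a character string. A sketch of the more usual pattern follows; `fake_parse()` is a hypothetical stand-in for htmlParse() so the example runs without a network connection.

```r
# Hypothetical stand-in for htmlParse(): fails for one ticker, succeeds otherwise.
fake_parse <- function(tic) {
  if (tic == "AACOU") stop("error in creating parser")
  paste("parsed page for", tic)
}

results <- list()
for (tic in c("EGHT", "AACOU")) {
  doc <- try(fake_parse(tic), silent = TRUE)   # silent must be TRUE or FALSE
  if (inherits(doc, "try-error")) {
    results[[tic]] <- NA      # page missing: record NA and carry on
  } else {
    results[[tic]] <- doc     # page parsed: keep the result
  }
}
```

Testing the value returned by try() with inherits(..., "try-error") keeps the class check outside the try() call, which is where the original line went wrong.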
Re: [R] quantmod getOptionChain Not Work
Michael, I have not had time to look at this for a while but still wanted to say thanks for looking into it and sending this solution. By the way, Jeff mentioned that the version of quantmod on the SVN (0.3.18) works for this. I tried to figure out how to download that version, but found the documentation on SVN's quite confusing. Is there anyway that you could make that version available? Much appreciated. --John Sparks On Fri, March 23, 2012 5:55 pm, R. Michael Weylandt wrote: Sorry about that: two small mistakes and I imagine there are a few more I've missed. This should actually work: ### library(XML) readYahooOptions - function(Symbols, Exp, ...){ parse.expiry - function(x) { if(is.null(x)) return(NULL) if(inherits(x, Date) || inherits(x, POSIXt)) return(format(x, %Y-%m)) if (nchar(x) == 5L) { x - sprintf(substring(x, 4, 5), match(substring(x, 1, 3), month.abb), fmt = 20%s-%02i) } else if (nchar(x) == 6L) { x - paste(substring(x, 1, 4), substring(x, 5, 6), sep = -) } return(x) } clean.opt.table - function(tableIn){ tableOut - sapply(tableIn[,-2], function(x) as.numeric(gsub(,,,x))) rownames(tableOut) - tableIn[,2] tableOut } if(missing(Exp)) optURL - paste(paste(http://finance.yahoo.com/q/op?s,Symbols,sep==;),Options,sep=+) else optURL - paste(paste(http://finance.yahoo.com/q/op?s=,Symbols,m=,parse.expiry(Exp),sep=),Options,sep=+) if(!missing(Exp) is.null(Exp)) { optPage - readLines(optURL) optPage - optPage[grep(View By Expiration, optPage)] allExp - gregexpr(m=, optPage)[[1]][-1] + 2 allExp - substring(optPage, allExp, allExp + 6) allExp - allExp[seq_len(length(allExp)-1)] # Last one seems useless ? return(structure(lapply(allExp, readYahooOptions, Symbols=Symbols), .Names=format(as.yearmon(allExp } stopifnot(require(XML)) optURL - readHTMLTable(optURL) # Not smart to hard code these but it's a 'good-enough' hack for now # Also, what is table 9 on this page? 
list(calls = clean.opt.table(optURL[[10]]), puts = clean.opt.table(optURL[[14]]), symbol = Symbols) } On Fri, Mar 23, 2012 at 6:44 PM, R. Michael Weylandt michael.weyla...@gmail.com wrote: I just got around to taking a look at this, but below is a fix. It seems like yahoo finance redesigned the page and rather than reparsing all their HTML, I'll use Duncan TL's XML package to make life happier. (I loathe HTML parsing) This isn't thoroughly tested and it'll break if yahoo redesigns things again (I hardcode the table numbers for now) but it seems to work well enough. Let me know if you have any errors with it. If Jeff likes it, it should be a drop-in replacement for the getOptionChain.yahoo for quantmod with a name change. Feedback welcome, Michael # library(XML) readYahooOptions - function(Symbols, Exp, ...){ parse.expiry - function(x) { if(is.null(x)) return(NULL) if(inherits(x, Date) || inherits(x, POSIXt)) return(format(x, %Y-%m)) if (nchar(x) == 5L) { x - sprintf(substring(x, 4, 5), match(substring(x, 1, 3), month.abb), fmt = 20%s-%02i) } else if (nchar(x) == 6L) { x - paste(substring(x, 1, 4), substring(x, 5, 6), sep = -) } return(x) } clean.opt.table - function(tableIn){ tableOut - lapply(tableIn[,-2], function(x) as.numeric(gsub(,,,x))) rownames(tableOut) - tableIn[,2] } if(missing(Exp)) optURL - paste(paste(http://finance.yahoo.com/q/op?s,Symbols,sep==;),Options,sep=+) else optURL - paste(paste(http://finance.yahoo.com/q/op?s=,Symbols,m=,parse.expiry(Exp),sep=),Options,sep=+) if(!missing(Exp) is.null(Exp)) { optPage - readLines(optURL) optPage - optPage[grep(View By Expiration, optPage)] allExp - gregexpr(m=, optPage)[[1]][-1] + 2 allExp - substring(optPage, allExp, allExp + 6) allExp - allExp[seq_len(length(allExp)-1)] # Last one seems useless ? Always true? 
return(structure(lapply(allExp, readYahooOptions, Symbols=Symbols), .Names=format(as.yearmon(allExp } stopifnot(require(XML)) optURL - readHTMLTable(optURL) # Not smart to hard code these but it's a 'good-enough' hack for now # Also, what is table 9 on this page? CALLS - optURL[[10]] PUTS - optURL[[14]] list(calls = CALLS, puts = PUTS, symbol = Symbols) } ### On Sun, Mar 4, 2012 at 2:18 PM, Sparks, John James jspa...@uic.edu wrote: Dear R Helpers, I am still having trouble with the getOptionChain command in quantmod. I have the latest version of quantmod, etc. so I was under the impression that the problem
[R] quantmod getOptionChain Not Work
Dear R Helpers, I am still having trouble with the getOptionChain command in quantmod. I have the latest version of quantmod, etc. so I was under the impression that the problem was solved with updates to the package. If someone could let me know what I need to install in order to make this work, I would really appreciate it. My error message as session info are shown below. Thanks a bunch. --John Sparks R version 2.14.2 (2012-02-29) Platform: i386-pc-mingw32/i386 (32-bit) locale: [1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252LC_MONETARY=English_United States.1252 LC_NUMERIC=C [5] LC_TIME=English_United States.1252 attached base packages: [1] stats graphics grDevices utils datasets methods base other attached packages: [1] pomp_0.40-2 deSolve_1.10-3 subplex_1.1-3mvtnorm_0.9-9992 quantmod_0.3-17 TTR_0.21-0 xts_0.8-2zoo_1.7-7 Defaults_1.1-1 loaded via a namespace (and not attached): [1] grid_2.14.2lattice_0.20-0 tools_2.14.2 AAPL.OPT-getOptionChain(AAPL) Error in puts[, 2] : incorrect number of dimensions AAPL.OPT-getOptionChain(AAPL,NULL) Error in puts[, 2] : incorrect number of dimensions __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Trouble with Paste and Quotes and List Objects
Dear R Helpers, I have a list that contains a number of objects, each of them financial statement data from quantmod (although I don't think that knowledge of quantmod is necessary to help with this problem). str(listfinobj) chr [1:4815] A.f AA.f AACC.f AAME.f AAN.f AAON.f AAP.f AAPL.f AAT.f AATI.f AAU.f ... I can easily pick out the 3rd object in this list. listfinobj[[3]] [1] AACC.f Each of the .f objects has a mildly complicated structure (partial results shown below). str(AACC.f) List of 3 $ IS:List of 2 ..$ Q: num [1:49, 1:5] 50.4 NA 50.4 NA 50.4 ... .. ..- attr(*, dimnames)=List of 2 .. .. ..$ : chr [1:49] Revenue Other Revenue, Total Total Revenue Cost of Revenue, Total ... .. .. ..$ : chr [1:5] 2011-03-31 2010-12-31 2010-09-30 2010-06-30 ... .. ..- attr(*, col_desc)= chr [1:5] 3 months ending 2011-03-31 3 months ending 2010-12-31 3 months ending 2010-09-30 3 months ending 2010-06-30 ... ..$ A: num [1:49, 1:4] 198 NA 198 NA 198 ... .. ..- attr(*, dimnames)=List of 2 .. .. ..$ : chr [1:49] Revenue Other Revenue, Total Total Revenue Cost of Revenue, Total ... .. .. ..$ : chr [1:4] 2010-12-31 2009-12-31 2008-12-31 2007-12-31 .. ..- attr(*, col_desc)= chr [1:4] 12 months ending 2010-12-31 12 months ending 2009-12-31 12 months ending 2008-12-31 12 months ending 2007-12-31 $ BS:List of 2 ..$ Q: num [1:42, 1:5] NA NA 6.53 326.25 NA ... I can get the column names for one of the sub-objects of this object. colnames(AACC.f$IS$A) [1] 2010-12-31 2009-12-31 2008-12-31 2007-12-31 Thanks for your patience so far; here's the question. I want to get the column names from all the sub objects in each of the .f objects, so I want to build a loop, but I need to be able to refer to the column names of the sub object dynamically. My many attempts with paste and get have not worked, I believe because of the quotes and the $'s. 
For example temp-colnames(paste(listfinobj[[3]],$BS$A)[1],sep=,) Error: unexpected '$' in temp-colnames(paste(listfinobj[[3]],$ as.name(paste(as.name(listfinobj[[3]]),as.name($BS$A),sep=)) `AACC.f$BS$A` colnames(as.name(paste(as.name(listfinobj[[3]]),as.name($BS$A),sep=))) NULL as.factor(paste(as.name(listfinobj[[3]]),as.name($BS$A),sep=)) [1] AACC.f$BS$A Levels: AACC.f$BS$A colnames(as.factor(paste(as.name(listfinobj[[3]]),as.name($BS$A),sep=))) NULL Please help me to understand how to refer to the column names in the sub-objects of the objects in the list dynamically so that I can build a loop to get at each of them. Your help would be much appreciated. --John J. Sparks, Ph.D. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
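A sketch of one approach, under the assumption that listfinobj holds the object names as character strings: get() (or mget() for several names at once) turns a name into the object, and `[[` with string keys then replaces the `$` chain, so nothing has to be pasted into code. The mock objects built by the hypothetical helper `make_fin()` only imitate the shape of the quantmod financials.

```r
# Mock objects shaped like the .f objects: statements (IS, BS), each with Q and A matrices.
make_fin <- function(years) {
  m <- matrix(0, nrow = 2, ncol = length(years),
              dimnames = list(c("Revenue", "Total Revenue"), years))
  list(IS = list(Q = m, A = m), BS = list(Q = m, A = m))
}
AACC.f <- make_fin(c("2010-12-31", "2009-12-31"))
AAME.f <- make_fin(c("2010-12-31", "2009-12-31", "2008-12-31"))
listfinobj <- c("AACC.f", "AAME.f")

# One object, one sub-object: name -> object -> string keys.
obj <- get(listfinobj[[1]])
bs_annual_cols <- colnames(obj[["BS"]][["A"]])

# Every statement and period of every object, as a nested list of column names.
all_cols <- lapply(mget(listfinobj), function(fin) {
  lapply(fin, function(stmt) lapply(stmt, colnames))
})
```

With this layout, `all_cols[["AAME.f"]][["BS"]][["A"]]` gives the annual balance-sheet column names for that object.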
[R] Trouble Combining With Paste
Dear R Helpers, I am having trouble combining some pieces of programming that work fine individually, but fall down when I try to get them to work together. The end goal is to take a data frame, and if any of the variables has more than 10 values, then use cut2 to reduce the number of (effective) values to 10. I want to do this in automated fashion, which is where the combining comes in. For example, all of these pieces work as I would expect:

    tables<-lapply(infert,table)
    lengths<-lapply(tables,length)
    toolong<-which(lengths>10)
    require(Hmisc)
    foo<-as.numeric(cut2(infert$age,g=10,levels.mean=TRUE))
    str(foo)
    #num [1:248] 2 10 9 7 7 8 1 6 1 3 ...
    bar<-paste("infert$",attr(toolong[1],"names"),sep="")
    bar
    #[1] "infert$age"

But the following gives an error:

    foobar<-as.numeric(cut2(paste("infert$",attr(toolong[1],"names"),sep=""),g=10,levels.mean=TRUE))
    Error in min(diff(x.unique))/2 : non-numeric argument to binary operator
    In addition: Warning message:
    In min(diff(x.unique)) : no non-missing arguments, returning NA

Your guidance would be much appreciated.

--John J. Sparks, Ph.D.
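The error arises because paste() hands cut2 the text "infert$age" rather than the column itself. A sketch of the automated version follows, using `infert[[nm]]` to look a column up by name. To stay within base R, the hypothetical helper `to_deciles()` stands in for Hmisc's cut2(x, g = 10); it is a swapped-in technique, not the same function, and bin edges may differ from cut2's.

```r
data(infert, package = "datasets")

# Number of distinct values per column, and the names of columns with more than 10.
n_values <- sapply(infert, function(col) length(unique(col)))
toolong  <- names(n_values)[n_values > 10]

# Stand-in for cut2(x, g = 10): bin a numeric vector into at most 10 quantile groups.
to_deciles <- function(x) {
  breaks <- unique(quantile(x, probs = seq(0, 1, by = 0.1), na.rm = TRUE))
  as.numeric(cut(x, breaks = breaks, include.lowest = TRUE))
}

binned <- infert
for (nm in toolong) {
  binned[[nm]] <- to_deciles(infert[[nm]])   # [[nm]] fetches the column by string
}
```

With Hmisc loaded, the body of the loop could presumably call `cut2(infert[[nm]], g = 10, levels.mean = TRUE)` in place of the stand-in.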
[R] Theil's Uncertainty Coefficient
Dear R Helpers, I was looking through the email help threads to find a calculation in R of Theil's uncertainty coefficient. One of the writers offered to send the function in custom code to the inquirer. Can I get a copy of that code, or does anyone know if the calculation is now available in an R package? Please advise. Many thanks.

--John J. Sparks, Ph.D.
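For reference, the coefficient can be computed directly from its usual definition, U(X|Y) = (H(X) - H(X|Y)) / H(X), where H is Shannon entropy. The function below is only a sketch written from that definition; it is not the custom code mentioned in the earlier thread, and the name `theil_u` is made up here.

```r
# Theil's uncertainty coefficient U(X|Y): the share of the entropy of x
# that is explained by knowing y. Ranges from 0 (no information) to 1.
theil_u <- function(x, y) {
  joint <- prop.table(table(x, y))           # joint proportions p(x, y)
  entropy <- function(p) {
    p <- p[p > 0]                            # treat 0 * log(0) as 0
    -sum(p * log(p))
  }
  h_x <- entropy(rowSums(joint))             # H(X)
  h_y <- entropy(colSums(joint))             # H(Y)
  h_x_given_y <- entropy(joint) - h_y        # H(X|Y) = H(X,Y) - H(Y)
  (h_x - h_x_given_y) / h_x
}

u_same  <- theil_u(c("a", "a", "b", "b"), c("a", "a", "b", "b"))
u_indep <- theil_u(c("a", "a", "b", "b"), c("u", "v", "u", "v"))
```

The measure is asymmetric, so theil_u(x, y) and theil_u(y, x) generally differ, and the result is NaN when x takes a single value (H(X) = 0).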
[R] Apply or Tapply to Build Set of Tables
Dear R Helpers, First, I apologize for asking for help on the first of my topics. I have been looking at the posts and pages for apply, tapply etc., and I know that the solution to this must be ridiculously easy, but I just can't seem to get my brain around it. If I want to produce a set of tables for all the variables in my data, how can I do that without having to type them into the table command one by one? So, I would like to use (t? s? r?)apply to use one command instead of the following set of table commands:

    data(infert, package = "datasets")
    attach(infert)
    table.education<-table(education)
    table.age<-table(age)
    table.parity<-table(parity)
    etc.

To make matters worse, what I subsequently need is the chi-square for each and all of the pairs of variables. Such as:

    chi.education.age<-chisq.test(table(education,age))
    chi.education.parity<-chisq.test(table(education,parity))
    chi.age.parity<-chisq.test(table(age,parity))
    etc.

Your guidance would be much appreciated.

--John J. Sparks, Ph.D.
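A sketch of how both steps might be done in base R: lapply() over the columns gives one table per variable, and combn() enumerates the pairs for the chi-square tests. The names `tabs`, `var_pairs` and `chis` are made up for the example, and suppressWarnings() only hides chisq.test's notice about small expected counts.

```r
data(infert, package = "datasets")
vars <- c("education", "age", "parity")   # or names(infert) for every column

# One frequency table per variable, in a named list.
tabs <- lapply(infert[vars], table)

# Every unordered pair of variables, then one chi-square test per pair.
var_pairs <- combn(vars, 2, simplify = FALSE)
names(var_pairs) <- sapply(var_pairs, paste, collapse = ".")
chis <- lapply(var_pairs, function(p) {
  suppressWarnings(chisq.test(table(infert[[p[1]]], infert[[p[2]]])))
})
```

Individual results are then available by name, for example `tabs$age` or `chis[["education.age"]]$p.value`, without attach().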
Re: [R] Find String Between Characters
Hi Jim, Thanks for your note. Unfortunately, when I attempt your solution in my exact setting, I get a weird and slightly different answer. First, let me be more clear. What I am attempting to do is pull the CIK number out of the information from the web page itself after it has loaded to R (this may not be optimal, but I am new at this), not from the web page reference (as you have done). So, when I execute the following as per your suggestion: require(scrapeR) mmm-scrape(url=http://www.sec.gov/cgi-bin/browse-edgar?action=getcompanyCIK=320193owner=excludecount=40;) num - sub(^.*CIK=([0-9]+).*, \\1, mmm) I get [1] pointer: 0x001265c0 Is this just a hex representation of the same number, or is something else going on here? Comments from any and all would be much appreciated. --John J. Sparks, Ph.D. On Sat, May 14, 2011 7:57 pm, jim holtman wrote: Is this what you want: mmm-http://www.sec.gov/cgi-bin/browse-edgar?action=getcompanyCIK=320193owner=excludecount=40; num - sub(^.*CIK=([0-9]+).*, \\1, mmm) num [1] 320193 On Sat, May 14, 2011 at 8:20 PM, Sparks, John James jspa...@uic.edu wrote: Dear R Helpers, I am trying to isolate a set of characters between two other characters in a long string file. I tried some of the examples on the R help pages and elsewhere, but I am not able to get it. Your help would be much appreciated. require(scrapeR) mmm-scrape(url=http://www.sec.gov/cgi-bin/browse-edgar?action=getcompanyCIK=320193owner=excludecount=40;) str(mmm) I want to get the number 320193 that is between the CIK= and the . I have tried g - grep( CIK=|, mmm ) and temp-grep(mmm,\CIK=\) and variations on these themes, but all won't run or come bask as an empty object. How can I grab this number? Best wishes, --John J. Sparks, Ph.D. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. 
-- Jim Holtman Data Munger Guru What is the problem that you are trying to solve?
[R] Changing Attribute With Paste
Dear R Helpers, I am trying to adjust the attribute of an R object pulled from quantmod. Since I want to do this for many such objects, I was trying to make the adjustment programmatic. Unfortunately, I am having a huge amount of trouble using attr in combination with paste (and perhaps get, and perhaps assign, none of which seem to help). When I hard-code the change it works fine. Your help would be much appreciated.

    require(quantmod)
    getFin("NYSE:A")
    attr(NYSE.A.f,"symbol")<-"A" #works fine
    ticker<-"A"
    attr(paste("NYSE.",ticker,".f",sep=""),"symbol")<-"A" #doesn't work
    attr(get(paste("NYSE.",ticker,".f",sep="")),"symbol")<-"A" #nor does this, nor the hundred other combinations I have tried

Best wishes,
--John J. Sparks, Ph.D.
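A replacement function such as `attr<-` needs a plain variable on its left-hand side, so neither a pasted string nor a get() call will do. One workaround, sketched here with a mock object in place of the real getFin() result, is to fetch, modify and store back in three steps.

```r
# Mock object standing in for what getFin("NYSE:A") would create.
NYSE.A.f <- structure(list(IS = NULL), symbol = "NYSE:A")

ticker   <- "A"
obj_name <- paste("NYSE.", ticker, ".f", sep = "")

tmp <- get(obj_name)            # 1. fetch the object by name
attr(tmp, "symbol") <- ticker   # 2. change the attribute on the local copy
assign(obj_name, tmp)           # 3. store the copy back under the same name
```

Wrapped in a loop over tickers, the same three lines would cover many objects.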
[R] Identify Objects that end with .f (and all caps)
Dear R Helpers, I am trying to find a way to identify all the objects in my environment that are all caps and then end with .f. I can do the all caps part pretty easily, but I have tried a number of variations on the \ and can't get a recognition of that operator. As a simple example A.f-foo1 AA.f-foo2 aa.f-foo3 A.a-foo4 ls() [1] A.a A.f aa.f AA.f temp1-ls(pattern=[A-Z]) temp1 [1] A.a A.f AA.f temp2-ls(pattern=\f) Error: unexpected input in temp2-ls(pattern=\ The end goal is to isolate A.f and AA.f and not the others. In terms of just getting the 'ending with .f' portion, I have tried a number of variations in the pattern=\f, but can't get R to recognize what I want. Your guidance would be much appreciated. --John J. Sparks, Ph.D. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
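A sketch of a pattern that does both jobs at once. In an R string the backslash itself has to be escaped, so a literal dot is written "\\."; the anchors ^ and $ make the pattern cover the whole name. The vector `obj_names` simply mirrors the ls() output from the post.

```r
obj_names <- c("A.a", "A.f", "aa.f", "AA.f")

# ^[A-Z]+  one or more capital letters from the start of the name
# \\.f$    a literal dot followed by f at the very end
caps_f <- grep("^[A-Z]+\\.f$", obj_names, value = TRUE)
```

The same pattern can be passed straight to ls(), as in `ls(pattern = "^[A-Z]+\\.f$")`.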
[R] Find String Between Characters
Dear R Helpers, I am trying to isolate a set of characters between two other characters in a long string file. I tried some of the examples on the R help pages and elsewhere, but I am not able to get it. Your help would be much appreciated. require(scrapeR) mmm-scrape(url=http://www.sec.gov/cgi-bin/browse-edgar?action=getcompanyCIK=320193owner=excludecount=40;) str(mmm) I want to get the number 320193 that is between the CIK= and the . I have tried g - grep( CIK=|, mmm ) and temp-grep(mmm,\CIK=\) and variations on these themes, but all won't run or come bask as an empty object. How can I grab this number? Best wishes, --John J. Sparks, Ph.D. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
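A sketch of the extraction step on its own, assuming the page content is available as ordinary character data (for instance lines read with readLines()); the regular expression functions do not work on a parsed document object. The string `page_text` is a made-up fragment containing the same CIK as the URL in the post.

```r
# Stand-in for one line of downloaded page text.
page_text <- "browse-edgar?action=getcompany&CIK=320193&owner=exclude&count=40"

# Match "CIK=" plus the digits that follow; the match stops at the first non-digit.
m   <- regmatches(page_text, regexpr("CIK=[0-9]+", page_text))

# Strip the literal prefix to leave only the number.
cik <- sub("CIK=", "", m, fixed = TRUE)
```

Note also that grep() takes the pattern first and the text second, so `grep(mmm, ...)` has its arguments reversed.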
[R] If Then Trouble
Dear R Helpers, I have another one of those problems involving a very simple step, but due to my inexperience I can't find a way to solve it. I had a look at a number of on-line references, but they don't speak to this problem. I have a variable with 20 values:

    table(testY2$redgroups)
        1     2     3     4     5     6     7     8     9    10
       69   734  6079 18578 13693  6412  3548  1646   659   323
       11    12    13    14    15    16    17    18    19    20
      129    88    90    40    57    33    36    17     6    13

Values 18, 19 and 20 have small counts. So, I want to set the value of redgroups for these rows to 17 in order to combine groups. I would think that it would be as easy as

    if(testY2$redgroups>17) testY2$redgroups<-17

following the syntax that I have seen in the manuals. However, I get the error message

    Warning message:
    In if (testY2$redgroups > 17) testY2$redgroups <- 17 :
      the condition has length > 1 and only the first element will be used

Can someone please tell me the correct syntax for this? I would really appreciate it.

Appreciatively yours,
--John J. Sparks, Ph.D.
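if() expects a single TRUE or FALSE, whereas the comparison here yields one value per row. A sketch of the vectorised form, using a small made-up testY2, is to index with the logical vector and assign only to the selected rows.

```r
# Small stand-in for the poster's data (illustrative values only).
testY2 <- data.frame(redgroups = c(1, 5, 17, 18, 19, 20))

# Rows where the condition is TRUE are overwritten; all others are untouched.
testY2$redgroups[testY2$redgroups > 17] <- 17

# Equivalent alternatives:
# testY2$redgroups <- pmin(testY2$redgroups, 17)
# testY2$redgroups <- ifelse(testY2$redgroups > 17, 17, testY2$redgroups)
```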
[R] Assign Character Value to Data Frame
Dear R Helpers, I am trying to write a character value to the row of a data frame and am running into a problem that I don't have when I do this for numeric arguments. For example, the following works just fine:

    test<-data.frame(number=numeric(1))
    test[1,]<-.5
    test
      number
    1    0.5

But the following bombs out:

    hold<-data.frame(symbol=character(1))
    hold[1,]<-"NYSE:MMM"
    Warning message:
    In `[<-.factor`(`*tmp*`, iseq, value = "NYSE:MMM") :
      invalid factor level, NAs generated

Could someone please guide me as to what adjustment I need to make to assign this character value to this row of the data frame? Your help would be very much appreciated.

--John Sparks
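The warning comes from data.frame() converting the character column to a factor, whose only level is the empty string. A minimal sketch of the adjustment is to ask for plain character columns when the data frame is created.

```r
# Keep the column as character instead of converting it to a factor.
hold <- data.frame(symbol = character(1), stringsAsFactors = FALSE)
hold[1, "symbol"] <- "NYSE:MMM"
```

From R 4.0.0 onward stringsAsFactors defaults to FALSE, so the original code runs without the warning there; spelling the argument out keeps the behaviour the same on older versions.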
[R] Assign with Paste Problem
Dear R Helpers, I am trying to change the name of an object using the assign function. When I use paste for the new object but not the old, everything is fine: the new object is a direct copy of the old object. When I use paste for both the new and the old object, however, the new object is simply the character representation of the old object name, not the old object itself. The example below uses quantmod, but you don't need that to see the nature of the problem. How can I get the new object to be a complete copy of the old object when I use paste on both sides of the assign? Your help would be most appreciated.

--John Sparks

    #Careful! Removes everything in the workspace!
    rm(list = ls())
    ticker<-"F"
    require(quantmod)
    getFin(paste("NYSE:",ticker,sep=""))
    [1] "NYSE.F.f"
    NYSE.F.f
    Financial Statement for NYSE:F
    Retrieved from google at 2011-04-12 20:03:20
    Use "viewFinancials" or "viewFin" to view

    F.f<-NYSE.F.f
    F.f
    Financial Statement for NYSE:F
    Retrieved from google at 2011-04-12 20:03:20
    Use "viewFinancials" or "viewFin" to view

    rm(F.f)
    assign(paste(ticker,".f",sep=""),NYSE.F.f)
    F.f
    Financial Statement for NYSE:F
    Retrieved from google at 2011-04-12 20:03:20
    Use "viewFinancials" or "viewFin" to view

    rm(F.f)
    assign(paste(ticker,".f",sep=""),paste("NYSE.",ticker,".f",sep=""))
    F.f
    [1] "NYSE.F.f"
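assign() stores whatever value it is given, and paste() always produces a string, so the last call stores the name rather than the object. A sketch of the fix is to wrap the old name in get(); the mock NYSE.F.f below stands in for the real getFin() result.

```r
# Mock object standing in for what getFin("NYSE:F") would create.
NYSE.F.f <- list(symbol = "NYSE:F", IS = NULL)

ticker   <- "F"
old_name <- paste("NYSE.", ticker, ".f", sep = "")
new_name <- paste(ticker, ".f", sep = "")

# get() looks the old object up by name; assign() stores it under the new name.
assign(new_name, get(old_name))
```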
[R] Replacing Period in String
Dear R Users, I am working with gsub for the first time. I am trying to remove some characters from a string. I have hit the problem where the period is the shorthand for 'everything' in the R language when what I want to remove is the actual periods. In the example below, I simply want to remove the periods as I have removed the comma, but instead the complete string is wiped out. I would appreciate it if someone could let me know how I communicate that I want to remove the period verbatim to R. Many thanks.

--John Sparks

    txt="This is a test. However, it is only a test."
    txt2<-gsub(",","",txt)
    txt2
    [1] "This is a test. However it is only a test."
    txt3<-gsub(".","",txt)
    txt3
    [1] ""
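In a regular expression the period matches any single character, which is why the whole string disappears. Two common ways of asking for a literal period are sketched below: switching off regular expressions with fixed = TRUE, or escaping the period.

```r
txt <- "This is a test. However, it is only a test."

# Treat the pattern as plain text rather than a regular expression.
no_periods_fixed <- gsub(".", "", txt, fixed = TRUE)

# Or escape the period; the backslash is doubled inside an R string.
no_periods_escaped <- gsub("\\.", "", txt)

# A character class removes periods and commas in one pass.
no_punct <- gsub("[.,]", "", txt)
```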
[R] quantmod Some Single Letter Tickers Not getFin
Hi, I have been learning the quantmod package over the last several days. I went to check some of my data pulls against other sources and was surprised to find that a few tickers that have single characters do not successfully scrape from Google Finance using getFin(). Particularly require(quantmod) getFin(A) getFin(E) getFin(F) getFin(G) getFin(M) all result in a file not found error. I show the last one below. getFin(M) Error in download.file(paste(google.fin, Symbol, sep = ), quiet = TRUE, : cannot open URL 'http://finance.google.com/finance?fstype=iiq=M' In addition: Warning message: In download.file(paste(google.fin, Symbol, sep = ), quiet = TRUE, : cannot open: HTTP status was '400 Bad Request' I checked out the financial statement pages for all of these and they exist and are as expected: 5 quarters worth of quarterly figures (except for cash-flow which has 4 quarters) and 4 years of annual figures. All the rows are also present by comparing a scrape to excel with the figures for Y, which does getFin(Y) without a problem. I was hoping that someone who knows a lot more about scraping then I do could look into this. Best wishes to all, --John Sparks __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Set a Numeric Field To Blank
Hi, I have one of those questions that I suspect is very simple, but hard to classify, so I have been searching for quite some time and am not able to find it. If I have a data frame and I want to change all the values of one of the columns to blanks, what is the syntax? I tried a few different spellings of Null, etc., but can't get it. Can someone please send me what I suspect is a one line solution to this? Many thanks,

--John J. Sparks, Ph.D.
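A sketch of the one-liner, on a made-up data frame: for a numeric column, "blank" in R is the missing value NA. NULL means something different, since assigning it removes the column altogether.

```r
df <- data.frame(id = 1:3, amount = c(10.5, 20, 30.25))

# Blank out a numeric column while keeping it numeric.
df$amount <- NA_real_

# A character column can hold empty strings instead.
df$note <- ""

# By contrast, this would delete the column:
# df$amount <- NULL
```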
[R] Picking Part of Large R Object
Dear All, I have imported an HTML document to R (called tables) and wish to select certain pieces of it for processing. The first few lines of the object appear as follows: tables [[1]] table id=fs-table class=gf-table rgt thead trth class=lm lft nwp In Millions of USD (except for per share items) /th th class=rgt 3 months ending 2010-06-30 /th th class=rgt 3 months ending 2010-03-31 /th th class=rgt 3 months ending 2009-12-31 /th th class=rgt 3 months ending 2009-09-30 /th th class=rgt rm 3 months ending 2009-06-30 /th /tr /thead tbody !-- 1 row for one coaitem -- trtd class=lft lmRevenue /td td class=r16,039.00/td td class=r14,503.00/td td class=r19,022.00/td td class=r12,920.00/td td class=r rm13,099.00/td /tr The next major partition of the object is: [[2]] table id=fs-table class=gf-table rgt thead trth class=lm lft nwp In Millions of USD (except for per share items) /th th class=rgt 12 months ending 2010-06-30 /th th class=rgt 12 months ending 2009-06-30 /th th class=rgt 12 months ending 2008-06-30 /th th class=rgt rm 12 months ending 2007-06-30 /th /tr /thead tbody !-- 1 row for one coaitem -- trtd class=lft lmRevenue /td td class=r62,484.00/td td class=r58,437.00/td td class=r60,420.00/td td class=r rm51,122.00/td /tr trtd class=lft lmOther Revenue, Total /td td class=r-/td td class=r-/td td class=r-/td td class=r rm-/td /tr tr class=hilitetd class=lft lm bldTotal Revenue /td td class=r bld62,484.00/td td class=r bld58,437.00/td td class=r bld60,420.00/td td class=r bld rm51,122.00/td /tr trtd class=lft lmCost of Revenue, Total /td td class=r12,395.00/td td class=r12,155.00/td td class=r11,598.00/td td class=r rm10,693.00/td How can I specify the part of the R object denoted by [[1]] and put it into a new object for processing. As in table1-... I have tried many variations of [[1]], c[1], etc. but haven't had any luck. Guidance would be much appreciated. --John Sparks, Ph.D. 
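The printed [[1]] and [[2]] suggest that tables is a list, in which case double square brackets applied to the object's name pull out one element. A sketch with a made-up two-element list:

```r
# Stand-in for the poster's object: a list with one entry per HTML table.
tables <- list("<table>quarterly figures</table>",
               "<table>annual figures</table>")

table1 <- tables[[1]]   # the first element itself
first  <- tables[1]     # single brackets: a list of length one containing it
```

The difference matters for later processing: `table1` can be passed to functions that expect the table, whereas `first` is still a list.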
[R] ScrapeR Unanticipated XML objects
Dear All,

I have come across a very surprising result as I have started to learn how to use R to pull data from the web for analysis. I am trying to isolate the table headers for the quarterly income statement (qtrinc) that I pulled from Google Finance. I executed the following commands after installing the scrapeR package.

require(scrapeR)
htmlfile <- scrape(url="http://www.google.com/finance?q=NASDAQ:MSFT&fstype=ii", headers=TRUE, parse=TRUE)
tables <- xpathSApply(htmlfile[[1]], "//table")
qtrinc <- tables[[1]]
xpathSApply(qtrinc, "//thead", xmlValue)

I receive the result:

[1] "\nIn Millions of USD (except for per share items)\n\n\n3 months ending 2010-06-30\n\n\n3 months ending 2010-03-31\n\n\n3 months ending 2009-12-31\n\n\n3 months ending 2009-09-30\n\n\n3 months ending 2009-06-30\n\n"
[2] "\nIn Millions of USD (except for per share items)\n\n\n12 months ending 2010-06-30\n\n\n12 months ending 2009-06-30\n\n\n12 months ending 2008-06-30\n\n\n12 months ending 2007-06-30\n\n"
[3] "\nIn Millions of USD (except for per share items)\n\n\nAs of 2010-06-30\n\n\nAs of 2010-03-31\n\n\nAs of 2009-12-31\n\n\nAs of 2009-09-30\n\n\nAs of 2009-06-30\n\n"
[4] "\nIn Millions of USD (except for per share items)\n\n\nAs of 2010-06-30\n\n\nAs of 2009-06-30\n\n\nAs of 2008-06-30\n\n\nAs of 2007-06-30\n\n"
[5] "\nIn Millions of USD (except for per share items)\n\n\n12 months ending 2010-06-30\n\n\n9 months ending 2010-03-31\n\n\n6 months ending 2009-12-31\n\n\n3 months ending 2009-09-30\n\n"
[6] "\nIn Millions of USD (except for per share items)\n\n\n12 months ending 2010-06-30\n\n\n12 months ending 2009-06-30\n\n\n12 months ending 2008-06-30\n\n\n12 months ending 2007-06-30\n\n"

Interestingly, only the first of these table headers exists in the list qtrinc (if you run list(qtrinc) you will see what I mean). The six results are actually the table headers for all the tables in the object htmlfile. Can someone please help me isolate the table headers for only the object qtrinc?
As long as I am at it, I also don't know how to remove the \n characters when calling the data. Help would be much appreciated.

--John Sparks, Ph.D.
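A likely explanation, sketched under assumptions: in XPath, a path beginning with `//` searches from the document root even when the query is run on a single node, whereas `.//` is relative to that node. The small HTML string below is a hypothetical stand-in for the Google Finance page, and `htmlParse` and `getNodeSet` from the XML package (which scrapeR builds on) are used here in place of `scrape` so the example runs without a network connection.

```r
library(XML)

# Hypothetical two-table document standing in for the scraped page.
html <- "<html><body>
<table><thead><tr><th>\nQ1\n</th></tr></thead></table>
<table><thead><tr><th>\nFY\n</th></tr></thead></table>
</body></html>"

doc <- htmlParse(html, asText = TRUE)
tables <- getNodeSet(doc, "//table")
first <- tables[[1]]

# "//thead" is an absolute path: it matches every thead in the whole
# document, even though the query starts from one table node.
all_heads <- xpathSApply(first, "//thead", xmlValue)

# ".//thead" is relative: it matches only thead nodes inside this table.
own_heads <- xpathSApply(first, ".//thead", xmlValue)

# Collapse runs of whitespace (including "\n") to one space, then trim the ends.
clean <- trimws(gsub("\\s+", " ", own_heads))
```

Applied to the commands in the post, this would mean trying `xpathSApply(qtrinc, ".//thead", xmlValue)`. Using `".//th"` instead of `".//thead"` should return one string per header cell rather than one long string per table.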