Dear R-helpers,

Considering that a substantial part of any analysis is related to data manipulation, I am wondering whether I should do the basic data handling in a database server (currently I have the data in a .txt file). For this purpose, I am planning to use MySQL. Is MySQL a good way to go? Are there any problems I should be aware of?
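To make the question concrete, here is a minimal sketch of the kind of workflow I am considering, using the DBI package with RSQLite (an in-memory database for illustration; as I understand it, the same DBI calls should work against a MySQL server via a MySQL driver package). The table and column names are just made-up examples:

```r
library(DBI)

# In-memory SQLite database for illustration; a file path instead of
# ":memory:" would persist the data between sessions.
con <- dbConnect(RSQLite::SQLite(), ":memory:")

# Stand-in for data that would otherwise come from the .txt file,
# e.g. via read.table().
prices <- data.frame(ticker = c("AAA", "BBB", "CCC"),
                     close  = c(10.2, 57.1, 3.4),
                     volume = c(1500, 800, 2200))
dbWriteTable(con, "prices", prices)

# Query only the relevant portion instead of loading everything into R.
liquid <- dbGetQuery(con,
                     "SELECT ticker, close FROM prices WHERE volume > 1000")

dbDisconnect(con)
```

Is this roughly the pattern people use, or do you structure it differently?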
Since many users here work with large datasets: do you typically store the data in a database and query the relevant portions for your analysis? Does it speed up the entire process? Is it neater to do things in a database? For example, errors could be corrected at the data-import stage, by constraints defined on the database tables themselves, as opposed to discovering the problem only when the analysis output in R looks wrong. How does this compare with using the SQLite, indexing, etc. capabilities available from within R itself? Does performance improve with a database backend, especially for simple but large datasets?

The financial applications I have in mind are not exactly real-time, but quick response and fast performance would definitely help. As an aside, I want to move things to a cloud environment at some point, simply because it will be easier and cheaper to deliver.

This is somewhat of an open question, but any input will help.

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.