Re: [R] grofit package problem inputting dataset
Your CSV output doesn't have any commas in it. Your email is in HTML format, so we cannot trust it to show what is really there (read the Posting Guide). The sink() function redirects output that would have been printed to the console into a file, but that isn't a particularly good way to exchange data with other software. I use

    write.table(foo, "data.csv", sep=",", row.names=FALSE)

to export CSV data.

---
Jeff Newmiller, DCN: jdnew...@dcn.davis.ca.us
Research Engineer (Solar/Batteries/Software/Embedded Controllers)
---
Sent from my phone. Please excuse my brevity.

On August 8, 2014 10:19:48 PM PDT, Fethe, Michael mfet...@vols.utk.edu wrote:

I've recently wanted to analyze some data sets of growth curves, so I decided to try out the grofit package, but inputting the dataset gave me some issues. I've been trying to replicate the example from the grofit package with R:

    foo <- ran.data(100, 25)
    time <- foo$time
    data <- foo$data
    write.csv(file="data.csv", foo)
    sink(file="sink.txt")
    foo
    sink()

To see if there is another problem I'm missing, I've also used sink() to see the actual output of this data. The sink() call shows a data frame object; however, I'm trying to use my own data. Is there a way to create the data frame object in Excel? (I've tried following the layout of the write.csv() output.) I know there is a problem with the input when I follow the .csv output, and I could follow the sink() output to create my data frame object; however, I'm not sure how I would do this with a large dataset of, let's say, 1000 data points. Has anyone run into this issue, and is there a quick workaround?
the sink() output

$data
X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14 X15 X16 X17
1 Test I A 0.06195419 0.314959772 0.398263408 0.091132317 1.012083039 0.35122189 1.78719671 2.47453614 3.25305005 3.9829927 5.2863775 5.8576975 7.2154323 8.501599 8.405278
2 Test I B 0.09543701 0.441797809 -0.345308045 0.462532336 1.324153423 1.20722268 1.71422886 2.40135394 2.79558398 3.8981917 5.3614344 5.8423570 6.8511538 7.192188 8.344946
3 Test I C 0.16386938 -0.178975999 0.464790443 0.325264753 0.033088580 0.79395919 0.63525411 1.71685521 2.62738577 2.5209004 4.4535178 4.9102966 6.8867905 6.996433 7.553863
4 Test I D 0.14198175 0.100778235 -0.164231759 -0.322266709 0.571067561 1.24234632 1.54056165 1.96933568 2.97097469 3.7711348 3.8414402 5.2768758 5.6688960 7.041779 7.651093
5 Test I E 0.25390711 0.039312093 -0.463351713 0.628339527 0.418403984 0.56460811 1.26348242 1.56823878 1.93588943 3.3141874 3.0360158 3.7567956 5.6896075 5.873556 6.524754
6 Test I F 0.22319304 -0.076464713 0.074501305 -0.160924707 0.384392150 0.76412340 1.36118116 1.50356468 2.53322106 3.4924520 4.5054475 4.6222326 5.5347148 6.349000 7.482548
7 Test I G 0.28095487 -0.728248588 0.479450323 0.542078371 0.757460716 0.10292177 1.01113655 1.14448036 2.19976257 3.3030023 3.0848547 4.1877661 5.2832997 5.485825 6.375234
8 Test I A 0.32135093 -0.036980401 -0.068313544 -0.059620107 0.440995143 0.48424749 0.65644521 1.38482000 2.05964880 2.2548116 2.6813025 3.5085200 4.7988415 5.017753 5.405950
9 Test I B 0.31930456 0.169104577 0.100637447 0.070003632 0.304209263 1.72301034 ...
$time
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] [,13] [,14] [,15] [,16] [,17] [,18] [,19] [,20]
[1,] 1.105562 2.429351 3.303547 4.941900 5.370762 6.939251 7.546835 8.400435 9.602104 10.85058 11.27732 12.12658 13.80886 14.03953 15.88684 16.63256 17.85349 18.29452 19.94482 20.74473
[2,] 1.842357 2.183849 3.528340 4.761010 5.610834 6.473241 7.771216 8.575116 9.544966 10.79126 11.61007 12.91479 13.21070 14.27869 15.30520 16.32062 17.71822 18.94936 19.58821 20.46276
[3,] 1.787510 2.734994 3.063325 4.788528 5.676988 6.096337 7.373838 8.047332 9.042756 10.35715 11.47604 12.84095 13.14183 14.77351 15.81132 16.63433 17.37387 18.51372 19.83332 20.74899
[4,] 1.074662 2.386418 3.734951 4.889867 5.994899 6.483399 7.603956 8.826142 9.536391 10.01653 11.03079 12.60530 13.00965 14.70169 15.17157 16.75678 17.85311 18.22975 19.28407 20.73184
[5,] 1.748113 2.185892 3.236870 4.294671 5.093055 6.910312 7.881226 8.067719 9.632505 10.26807 11.03523 12.75277 13.66110 14.30814 15.61313 16.62628 17.98222 18.95378 19.46946 20.17275

the write.csv() output

data.X1 data.X2 data.X3 data.X4 data.X5 data.X6 data.X7
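The write.table() advice above can be sketched as a round trip: export each component to its own CSV file, then read it back. This is only a minimal sketch with made-up values. It assumes, based solely on the sink() output shown above, that the object is a list holding a data frame `$data` and a matrix `$time`; the names `toy`, `data_file`, and `time_file` are illustrative and are not part of grofit.

```r
# Toy stand-in for the list printed above (values are invented)
toy <- list(
  data = data.frame(X1 = c("Test I", "Test I"),
                    X2 = c("A", "B"),
                    X3 = c(0.06, 0.10),
                    X4 = c(0.31, 0.44),
                    stringsAsFactors = FALSE),
  time = matrix(c(1.1, 1.8, 2.4, 2.2), nrow = 2)
)

data_file <- tempfile(fileext = ".csv")
time_file <- tempfile(fileext = ".csv")

# One component per file, comma-separated, no row names
write.table(toy$data, data_file, sep = ",", row.names = FALSE)
write.table(toy$time, time_file, sep = ",", row.names = FALSE,
            col.names = FALSE)

# Read both files back into the same shapes
data_in <- read.csv(data_file, stringsAsFactors = FALSE)
time_in <- unname(as.matrix(read.csv(time_file, header = FALSE)))
```

Files written this way open cleanly in Excel, and a sheet saved from Excel as CSV in the same layout can be read back with the same two read.csv() calls.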
Re: [R] Logical operators and named arguments
On 09/08/2014 01:10, Joshua Wiley wrote:

On Sat, Aug 9, 2014 at 9:56 AM, Patrick Burns pbu...@pburns.seanet.com wrote:

On 07/08/2014 07:21, Joshua Wiley wrote:

Hi Ryan,

It does work, but the *apply family of functions always pass to the first argument, so you can specify e2 = , but not e1 =. For example:

    sapply(1:3, `>`, e2 = 2)
    [1] FALSE FALSE TRUE

That is not true:

But it is passed as the first argument, not by name, but positionally. The reason it works with your gt() is because R with regular functions is flexible:

    f <- function(x, y) x > y
    f(1:3, x = 2)
    [1] TRUE FALSE FALSE

but primitives ARE positionally matched

That's not true either. Almost all primitives intended to be called as functions do have standard argument-matching semantics. (Once upon a time they did not, but I added the requisite code years ago.) There are six exceptions plus binary operators and other language elements. See http://cran.r-project.org/doc/manuals/r-release/R-ints.html#g_t_002eInternal-vs-_002ePrimitive and the comments about primitive functions in ?lapply.

    `>`(1:3, 2)
    [1] FALSE FALSE TRUE
    `>`(1:3, e1 = 2)
    [1] FALSE FALSE TRUE

    gt <- function(x, y) x > y
    sapply(1:3, gt, y=2)
    [1] FALSE FALSE TRUE
    sapply(1:3, gt, x=2)
    [1] TRUE FALSE FALSE

Specifying the first argument(s) in an apply call is a standard way of getting flexibility. I'd hazard to guess that the reason the original version doesn't work is because `>` is Primitive. There's speed at the expense of not behaving quite the same as typical functions.

Pat

From ?sapply

'lapply' returns a list of the same length as 'X', each element of which is the result of applying 'FUN' to the corresponding element of 'X'.

so `>` is applied to each element of 1:3

    `>`(1, ...)
    `>`(2, ...)
    `>`(3, ...)

and if e2 is specified then that is passed

    `>`(1, 2)
    `>`(2, 2)
    `>`(3, 2)

Further, see ?Ops

If the members of this group are called as functions, any argument names are removed to ensure that positional matching is always used.

and you can see this at work:

    `>`(e1 = 1, e2 = 2)
    [1] FALSE
    `>`(e2 = 1, e1 = 2)
    [1] FALSE

If you want the flexibility to specify which argument the elements of X should be *applied to, use a wrapper:

    sapply(1:3, function(x) `>`(x, 2))
    [1] FALSE FALSE TRUE
    sapply(1:3, function(x) `>`(2, x))
    [1] TRUE FALSE FALSE

HTH,
Josh

On Thu, Aug 7, 2014 at 2:20 PM, Ryan rec...@bwh.harvard.edu wrote:

Hi,

I'm wondering why calling `>` with named arguments doesn't work as expected:

    args(`>`)
    function (e1, e2)
    NULL
    sapply(c(1,2,3), `>`, e2=0)
    [1] TRUE TRUE TRUE
    sapply(c(1,2,3), `>`, e1=0)
    [1] TRUE TRUE TRUE

Shouldn't the latter be FALSE?

Thanks for any help,
Ryan

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

--
Patrick Burns
pbu...@pburns.seanet.com
twitter: @burnsstat @portfolioprobe
http://www.portfolioprobe.com/blog
http://www.burns-stat.com
(home of: 'Impatient R' 'The R Inferno' 'Tao Te Programming')

--
Brian D. Ripley, rip...@stats.ox.ac.uk
Emeritus Professor of Applied Statistics, University of Oxford
1 South Parks Road, Oxford OX1 3TG, UK
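The behaviour discussed in this thread can be reproduced in a few lines. This is a small sketch, not taken from any of the posters' sessions: `gt` is the ordinary closure used above, while `by_name`, `op_named`, and `wrapped` are illustrative variable names. It relies on the ?Ops statement quoted above that argument names are dropped when an operator is called as a function.

```r
# An ordinary closure matches named arguments by name
gt <- function(x, y) x > y

# x is fixed at 2, so each element of 1:3 becomes y: 2 > 1, 2 > 2, 2 > 3
by_name <- sapply(1:3, gt, x = 2)

# For the `>` operator the name e1 is removed and matching is positional,
# so this still computes 1 > 2, 2 > 2, 3 > 2
op_named <- sapply(1:3, `>`, e1 = 2)

# A wrapper makes the intended argument order explicit
wrapped <- sapply(1:3, function(x) 2 > x)

print(rbind(by_name, op_named, wrapped))
```

The wrapper form is the one least likely to surprise a reader, because it does not depend on how a particular function matches its arguments.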
[R] Possible pair of 2 binary vectors
Hi,

Let's say I have two binary vectors of length 'd', so every element of both vectors can take only the values 0 and 1. Now I want to simulate all possible pairs of them. Theoretically there will be 4^d possible pairs. Is there any R function to directly simulate them?

Thanks for your help.
[R] R Package for Text Manipulation
Hi all,

I want to know where I can find a package that provides the equivalent of the "Search and Replace" and "Find words that contain ... and replace them with ..." functions that we can use in Excel. I've looked in other places and they suggest reshape2 by Hadley Wickham. However, I've investigated it and it's not exactly what I'm looking for (its main functions are cast and melt, as I'm sure you know). Could you help me, please?

I want to download data from Google Analytics and clean it. What is the best approach?
Re: [R] Possible pair of 2 binary vectors
Dear Ron,

What about this?

    set.seed(123)
    d <- 4
    x1 <- sample(0:1, d, TRUE)
    x2 <- sample(0:1, d, TRUE)
    x1
    x2
    expand.grid(x1 = x1, x2 = x2)

See ?expand.grid for more information.

Best,
Jorge.-

On Sat, Aug 9, 2014 at 7:46 PM, Ron Michael ron_michae...@yahoo.com wrote:

Hi,

Let's say I have two binary vectors of length 'd', so every element of both vectors can take only the values 0 and 1. Now I want to simulate all possible pairs of them. Theoretically there will be 4^d possible pairs. Is there any R function to directly simulate them?

Thanks for your help.
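If "all possible pairs" is read as every combination of two length-d 0/1 vectors, which is what the 4^d count in the question suggests, the pairs can be enumerated rather than sampled. The following is a sketch under that assumption; `vecs`, `idx`, and `pairs` are illustrative names, and d is kept small because the result has 4^d rows.

```r
d <- 3

# Every binary vector of length d, one per row: 2^d rows
vecs <- as.matrix(expand.grid(rep(list(0:1), d)))

# Every ordered pair of row indices: 2^d * 2^d = 4^d combinations
idx <- expand.grid(i = seq_len(nrow(vecs)), j = seq_len(nrow(vecs)))

# Each row holds the first vector followed by the second vector
pairs <- cbind(vecs[idx$i, , drop = FALSE], vecs[idx$j, , drop = FALSE])
colnames(pairs) <- c(paste0("x1_", seq_len(d)), paste0("x2_", seq_len(d)))

nrow(pairs) == 4^d
```

By contrast, calling expand.grid() on two already sampled vectors combines the elements of those two particular vectors and yields d^2 rows, so it does not enumerate the 4^d pairs of vectors.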
Re: [R] R Package for Text Manipulation
On Sat, Aug 9, 2014 at 8:15 AM, Omar André Gonzáles Díaz oma.gonza...@gmail.com wrote:

Hi all,

I want to know where I can find a package that provides the equivalent of the "Search and Replace" and "Find words that contain ... and replace them with ..." functions that we can use in Excel. I've looked in other places and they suggest reshape2 by Hadley Wickham. However, I've investigated it and it's not exactly what I'm looking for (its main functions are cast and melt, as I'm sure you know). Could you help me, please?

I want to download data from Google Analytics and clean it. What is the best approach?

1. The gsubfn function in the gsubfn package can do that. These commands extract the words and then apply the function, written in formula notation as the second argument, to each of them:

    library(gsubfn) # home page at http://gsubfn.googlecode.com
    s <- "The quick brown fox" # test data

    # replace the word quick with QUICK
    gsubfn("\\S+", ~ if (x == "quick") "QUICK" else x, s)
    ## [1] "The QUICK brown fox"

    # replace words containing o with ?
    gsubfn("\\S+", ~ if (grepl("o", x)) "?" else x, s)
    ## [1] "The quick ? ?"

2. It can also be done without packages:

    # replace quick with QUICK
    gsub("\\bquick\\b", "QUICK", s)
    ## [1] "The QUICK brown fox"

    # or the following, which first splits s into a vector of words and
    # operates on that, pasting it back into a single string at the end
    words <- strsplit(s, "\\s+")[[1]]
    paste(replace(words, words == "quick", "QUICK"), collapse = " ")
    ## [1] "The QUICK brown fox"

    # replace words containing o with ?. Use `words` from above.
    paste(replace(words, grepl("o", words), "?"), collapse = " ")
    ## [1] "The quick ? ?"

--
Statistics Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com
Re: [R] Installing manual package problem
If you just want to install the package from GitHub, the easy way is to first install the devtools package and use the install_github function.

Best,
Ista

On Aug 8, 2014 4:21 PM, James Holland holland.ag...@gmail.com wrote:

Running R 3.03 on Windows 7

I am trying to install a package from a GitHub repository: https://github.com/google/glassbox

I downloaded the repository as a zip file, extracted it to get the glassbox folder, and re-zipped it with 7-zip. I then ran

    #-Start code---#
    install.packages("C:/Users/jholland/Downloads/glassbox.zip", repos=NULL, type="source")
    #-#

The output message said

    Installing package into ‘C:/Users/jholland/Documents/R/win-library/3.0’
    (as ‘lib’ is unspecified)

    library(glassbox)
    Error in library(glassbox) : ‘glassbox’ is not a valid installed package

I'm not sure what I'm doing wrong. When I look in the R library folder (...R/win-library/3.0) I see the glassbox folder there.

I'm new to using packages that are not from the CRAN list, so I'm trying to learn fast. I tried some searching and this seems to be what I'm supposed to do, but perhaps I need to use dev mode?

Thank you for the help.
~James
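The devtools route suggested above can be sketched as a small helper. This is only a sketch: `install_from_github` is a hypothetical wrapper name, and the commented call assumes that the package sits at the root of the google/glassbox repository, which has not been checked here. The install itself is left commented out because it needs network access.

```r
# Hypothetical helper: install devtools if it is missing, then install
# a package directly from a GitHub repository given as "user/repo"
install_from_github <- function(repo) {
  if (!requireNamespace("devtools", quietly = TRUE)) {
    install.packages("devtools")
  }
  devtools::install_github(repo)
}

# Not run here (requires network access):
# install_from_github("google/glassbox")
# library(glassbox)
```

install_github() downloads the source and builds it, so there is no need to re-zip the repository by hand.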
Re: [R] Installing manual package problem
Thank you all, I didn't know about the install_github function. Sorry, forgot to switch to plain text.

On Sat, Aug 9, 2014 at 10:11 AM, Ista Zahn istaz...@gmail.com wrote:

If you just want to install the package from github, the easy way is to first install the devtools package and use the install_github function.

Best,
Ista
Re: [R] Installing manual package problem
On 09.08.2014 17:40, James Holland wrote:

Thank you all, I didn't know about the install_github function. Sorry, forgot to switch to plain text

On Sat, Aug 9, 2014 at 10:11 AM, Ista Zahn istaz...@gmail.com wrote:

If you just want to install the package from github, the easy way is to first install the devtools package and use the install_github function.

Reason why your former approach did not work: This is a source package, you need to install source packages via

    install.packages(..., type="source")

or from the command line via

    R CMD INSTALL package_version.tar.gz

See the R Installation and Administration manual for details. To build a proper .tar.gz file, use

    R CMD build directory_name

from the command line.

Best,
Uwe Ligges
Re: [R] R Package for Text Manipulation
On Aug 9, 2014, at 5:15 AM, Omar André Gonzáles Díaz wrote:

Hi all,

I want to know where I can find a package that provides the equivalent of the "Search and Replace" and "Find words that contain ... and replace them with ..." functions that we can use in Excel. I've looked in other places and they suggest reshape2 by Hadley Wickham. However, I've investigated it and it's not exactly what I'm looking for (its main functions are cast and melt, as I'm sure you know). Could you help me, please?

I want to download data from Google Analytics and clean it. What is the best approach?

That request is on the vague side. You are advised in the Posting Guide to include code that begins an analysis and then to request assistance with specific difficulties. (You are also asked to do this in a plain text message, since HTML tends to scramble messages.) The base package offers the `grep`, `sub`, and `gsub` functions, which bring the power of regular expressions to the R user. They are much more flexible than anything that Excel offers. Please look at:

    ?grep
    ?regex

[[alternative HTML version deleted]]

And do: PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.

--
David Winsemius
Alameda, CA, USA
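As a rough illustration of the base functions mentioned above, here is how a search-and-replace and a "contains" filter might look on a vector of page paths. The data are invented and only loosely resemble an analytics export; `pages`, `clean`, and `is_catalog` are illustrative names.

```r
# Invented page paths standing in for an exported column
pages <- c("/home?utm=abc", "/Products/widget", "/products/gadget?ref=x")

# "Search and replace": drop everything from the first "?" onwards
clean <- sub("\\?.*$", "", pages)

# Replace a word regardless of case
clean <- gsub("products", "catalog", clean, ignore.case = TRUE)

# "Find entries that contain": logical flag per element
is_catalog <- grepl("catalog", clean, fixed = TRUE)

print(clean)
print(clean[is_catalog])
```

sub() changes only the first match in each string and gsub() changes all of them; grepl() returns TRUE or FALSE per element, which makes it convenient for subsetting rows of a data frame.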
[R] loops with assign() and get()
Dear all,

I was able to create 102 distinct dataframes (DFs1, DFs2, DFs3, etc) using the assign() in a loop.

Now, I would like to perform the following transformation for each one of these dataframes:

    df1=DFs1[1,]
    df1=df1[,1:3]
    names(df1)=names(DFs1[c(1,4,5)])
    df1=rbind(df1,DFs1[c(1,4,5)])
    names(df1)=c("UID","Date","Location")

something like this:

    for (i in 1 : nrow(unique)){
      dfi=DFsi[1,]
      dfi=dfi[,1:3]
      names(dfi)=names(DFsi[c(1,4,5)])
      dfi=rbind(dfi,DFsi[c(1,4,5)])
      names(dfi)=c("UID","Date","Location")
    }

I thought it would be straightforward, but it has proven the opposite.

Many thanks
Laura
[R] Reading chunks of data from a file more efficiently
Hi,

I have some very large (~1.1 GB) output files from a groundwater model called STOMP that I want to read as efficiently as possible. For each variable there are over 1 million values to read. Variables are not organized in columns; instead they are written out in sections in the file, like this:

    X-Direction Node Positions, m
    5.93145E+05 5.93155E+05 5.93165E+05 5.93175E+05
    5.93245E+05 5.93255E+05 5.93265E+05 5.93275E+05
    . . .
    5.94695E+05 5.94705E+05 5.94715E+05 5.94725E+05
    5.94795E+05 5.94805E+05 5.94815E+05 5.94825E+05

    Y-Direction Node Positions, m
    1.14805E+05 1.14805E+05 1.14805E+05 1.14805E+05
    1.14805E+05 1.14805E+05 1.14805E+05 1.14805E+05
    . . .
    1.17195E+05 1.17195E+05 1.17195E+05 1.17195E+05
    1.17195E+05 1.17195E+05 1.17195E+05 1.17195E+05

    Z-Direction Node Positions, m
    9.55000E+01 9.55000E+01 9.55000E+01 9.55000E+01
    9.55000E+01 9.55000E+01 9.55000E+01 9.55000E+01
    . . .

I want to read and use only a subset of the variables. I wrote the function below to find the line where each target variable begins and then scan the values, but it still seems rather slow, perhaps because I am opening and closing the file for each variable. Can anyone suggest a faster way?

    # Reads original STOMP plot file (plot.*) directly. Should be useful when the plot files are
    # very large with lots of variables, and you just want to retrieve a few of them.
    # Arguments: 1) plot filename, 2) number of nodes,
    # 3) character vector of names of target variables you want to return.
    # Returns a list with the selected plot output.
    READ.PLOT.OUTPUT6 <- function(plt.file, num.nodes, var.names) {
      lines <- readLines(plt.file)
      num.vars <- length(var.names)
      tmp <- list()
      for(i in 1:num.vars) {
        ind <- grep(var.names[i], lines, fixed=T, useBytes=T)
        if(length(ind) != 1) stop("Not one line in the plot file with matching variable name.\n")
        tmp[[i]] <- scan(plt.file, skip=ind, nmax=num.nodes, quiet=T)
      }
      return(tmp)
    } # end READ.PLOT.OUTPUT6()

Regards,
Scott Waichler
Pacific Northwest National Laboratory
Richland, WA, USA
scott.waich...@pnnl.gov
Re: [R] loops with assign() and get()
I was able to create 102 distinct dataframes (DFs1, DFs2, DFs3, etc) using the assign() in a loop.

The first step to making things easier to do is to put those data.frames into a list. I'll call it DFs and your data.frames will now be DFs[[1]], DFs[[2]], ..., DFs[[length(DFs)]].

    DFs <- lapply(paste0("DFs", 1:102), get)

In the future, I think it would be easier if you skipped the 'assign()' and just put the data into a list from the start.

Now use lapply to process that list, creating a new list called 'df', where df[[i]] is the result of processing DFs[[i]]:

    df <- lapply(DFs, FUN=function(DFsi) {
      # your code from the for loop you supplied
      dfi=DFsi[1,]
      dfi=dfi[,1:3]
      names(dfi)=names(DFsi[c(1,4,5)])
      dfi=rbind(dfi,DFsi[c(1,4,5)])
      names(dfi)=c("UID","Date","Location")
      dfi # return this to put in list that lapply is making
    })

(You didn't supply sample data so I did not run this - there may be typos.)

Bill Dunlap
TIBCO Software
wdunlap tibco.com

On Sat, Aug 9, 2014 at 1:39 PM, Laura Villegas Ortiz lvil...@ncsu.edu wrote:

Dear all,

I was able to create 102 distinct dataframes (DFs1, DFs2, DFs3, etc) using the assign() in a loop.

Now, I would like to perform the following transformation for each one of these dataframes:

    df1=DFs1[1,]
    df1=df1[,1:3]
    names(df1)=names(DFs1[c(1,4,5)])
    df1=rbind(df1,DFs1[c(1,4,5)])
    names(df1)=c("UID","Date","Location")

something like this:

    for (i in 1 : nrow(unique)){
      dfi=DFsi[1,]
      dfi=dfi[,1:3]
      names(dfi)=names(DFsi[c(1,4,5)])
      dfi=rbind(dfi,DFsi[c(1,4,5)])
      names(dfi)=c("UID","Date","Location")
    }

I thought it would be straightforward, but it has proven the opposite.

Many thanks
Laura
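The advice to keep the data frames in a list from the start can be illustrated with toy data. This is a sketch only: the data frames are invented, and the environment `e` is a stand-in for a workspace in which DFs1, DFs2, ... were already created with assign().

```r
# Build the data frames directly into a named list instead of using assign()
DFs <- lapply(1:3, function(i) data.frame(id = i, value = i * 10))
names(DFs) <- paste0("DFs", 1:3)

# Any per-data-frame step is then a single lapply()/vapply() call
n_rows <- vapply(DFs, nrow, integer(1))

# If the objects already exist under numbered names, mget() collects
# them into a named list in one call
e <- new.env()
assign("DFs1", DFs[[1]], envir = e)
assign("DFs2", DFs[[2]], envir = e)
collected <- mget(paste0("DFs", 1:2), envir = e)

names(collected)
```

mget() returns a named list, so the original object names remain available through names(collected) after the individual variables are no longer needed.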
Re: [R] Reading chunks of data from a file more efficiently
Informally abbreviating data is not recommended... I faked some, but would appreciate it if you would make your example reproducible next time.

All I really did for performance was use the data you had already read in rather than re-scanning the file.

    # generated by using dput()
    lines <- c("X-Direction Node Positions, m",
      "5.93145E+05 5.93155E+05 5.93165E+05 5.93175E+05",
      "5.93245E+05 5.93255E+05 5.93265E+05 5.93275E+05",
      "5.94695E+05 5.94705E+05 5.94715E+05 5.94725E+05",
      "5.94795E+05 5.94805E+05 5.94815E+05 5.94825E+05",
      "", "Y-Direction Node Positions, m",
      "1.14805E+05 1.14805E+05 1.14805E+05 1.14805E+05",
      "1.14805E+05 1.14805E+05 1.14805E+05 1.14805E+05",
      "1.17195E+05 1.17195E+05 1.17195E+05 1.17195E+05",
      "1.17195E+05 1.17195E+05 1.17195E+05 1.17195E+05",
      "", "Z-Direction Node Positions, m",
      "9.55000E+01 9.55000E+01 9.55000E+01 9.55000E+01",
      "9.55000E+01 9.55000E+01 9.55000E+01 9.55000E+01",
      "9.55000E+01 9.55000E+01 9.55000E+01 9.55000E+01",
      "9.55000E+01 9.55000E+01 9.55000E+01 9.55000E+01",
      "", "X-Direction Node Positions, n",
      "5.93145E+05 5.93155E+05 5.93165E+05 5.93175E+05",
      "5.93245E+05 5.93255E+05 5.93265E+05 5.93275E+05",
      "5.94695E+05 5.94705E+05 5.94715E+05 5.94725E+05",
      "5.94795E+05 5.94805E+05 5.94815E+05 5.94825E+05",
      "", "Y-Direction Node Positions, n",
      "1.14805E+05 1.14805E+05 1.14805E+05 1.14805E+05",
      "1.14805E+05 1.14805E+05 1.14805E+05 1.14805E+05",
      "1.17195E+05 1.17195E+05 1.17195E+05 1.17195E+05",
      "1.17195E+05 1.17195E+05 1.17195E+05 1.17195E+05",
      "", "Z-Direction Node Positions, n",
      "9.55000E+01 9.55000E+01 9.55000E+01 9.55000E+01",
      "9.55000E+01 9.55000E+01 9.55000E+01 9.55000E+01",
      "9.55000E+01 9.55000E+01 9.55000E+01 9.55000E+01",
      "9.55000E+01 9.55000E+01 9.55000E+01 9.55000E+01",
      "", "" )

    getDimVar <- function( lines, Dim, specifiedvar, starts ) {
      # header line of the requested section
      vstart <- grep( paste0( "^", Dim, "-Direction Node Positions, ", specifiedvar, "$" ), lines )
      startv <- match( vstart, starts )
      if ( 0 == length( startv ) ) {
        stop( "Variable ", specifiedvar, " not found" )
      }
      # the section ends just before the next header, or at the end of the data
      if ( length( starts ) == startv ) {
        vend <- length( lines )
      } else {
        vend <- starts[ startv + 1 ] - 1
      }
      tcon <- textConnection( lines[ seq( vstart + 1, vend ) ] )
      result <- scan( tcon )
      close( tcon )
      result
    }

    starts <- grep( "^[XYZ]-Direction Node Positions, ", lines )

    specifiedvar <- "n"
    n <- data.frame( X=getDimVar( lines, "X", specifiedvar, starts )
                   , Y=getDimVar( lines, "Y", specifiedvar, starts )
                   , Z=getDimVar( lines, "Z", specifiedvar, starts )
                   )

    # test a variable that doesn't exist
    specifiedvar <- "o"
    o <- data.frame( X=getDimVar( lines, "X", specifiedvar, starts )
                   , Y=getDimVar( lines, "Y", specifiedvar, starts )
                   , Z=getDimVar( lines, "Z", specifiedvar, starts )
                   )

On Sat, 9 Aug 2014, Waichler, Scott R wrote:

Hi,

I have some very large (~1.1 GB) output files from a groundwater model called STOMP that I want to read as efficiently as possible. For each variable there are over 1 million values to read. Variables are not organized in columns; instead they are written out in sections in the file, like this:

    X-Direction Node Positions, m
    5.93145E+05 5.93155E+05 5.93165E+05 5.93175E+05
    5.93245E+05 5.93255E+05 5.93265E+05 5.93275E+05
    . . .
    5.94695E+05 5.94705E+05 5.94715E+05 5.94725E+05
    5.94795E+05 5.94805E+05 5.94815E+05 5.94825E+05

    Y-Direction Node Positions, m
    1.14805E+05 1.14805E+05 1.14805E+05 1.14805E+05
    1.14805E+05 1.14805E+05 1.14805E+05 1.14805E+05
    . . .
    1.17195E+05 1.17195E+05 1.17195E+05 1.17195E+05
    1.17195E+05 1.17195E+05 1.17195E+05 1.17195E+05

    Z-Direction Node Positions, m
    9.55000E+01 9.55000E+01 9.55000E+01 9.55000E+01
    9.55000E+01 9.55000E+01 9.55000E+01 9.55000E+01
    . . .

I want to read and use only a subset of the variables. I wrote the function below to find the line where each target variable begins and then scan the values, but it still seems rather slow, perhaps because I am opening and closing the file for each variable. Can anyone suggest a faster way?

    # Reads original STOMP plot file (plot.*) directly. Should be useful when the plot files are
    # very large with lots of variables, and you just want to retrieve a few of them.
    # Arguments: 1) plot filename, 2) number of nodes,
    # 3) character vector of names of target variables you want to return.
    # Returns a list with the selected plot output.
    READ.PLOT.OUTPUT6 <- function(plt.file,
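The idea in the reply above, reading the file once and parsing sections from memory, can be sketched end to end on a small temporary file. This is a sketch under stated assumptions rather than a tested STOMP reader: `read_sections` is a hypothetical helper, and it assumes that header lines are the only lines starting with a letter and that each requested header occurs exactly once.

```r
# Hypothetical helper: read the file once, then parse only the wanted sections
read_sections <- function(path, var.names) {
  lines <- readLines(path)
  # Assumption: only section headers begin with a letter
  starts <- grep("^[A-Za-z]", lines)
  # Each section runs up to the line before the next header
  ends <- c(starts[-1] - 1L, length(lines))
  out <- lapply(var.names, function(v) {
    k <- which(lines[starts] == v)
    if (length(k) != 1L) stop("Expected exactly one section named: ", v)
    if (ends[k] <= starts[k]) return(numeric(0))  # header without values
    scan(text = lines[(starts[k] + 1L):ends[k]], quiet = TRUE)
  })
  names(out) <- var.names
  out
}

# Small invented file in the layout described in the question
tmp <- tempfile()
writeLines(c("X-Direction Node Positions, m",
             "5.93145E+05 5.93155E+05",
             "5.93165E+05 5.93175E+05",
             "",
             "Y-Direction Node Positions, m",
             "1.14805E+05 1.14805E+05",
             "",
             "Z-Direction Node Positions, m",
             "9.55000E+01 9.55000E+01"), tmp)

res <- read_sections(tmp, c("X-Direction Node Positions, m",
                            "Z-Direction Node Positions, m"))
str(res)
```

Like the reply above, this still holds the whole file in memory as a character vector. The saving comes from not re-opening and re-skipping through the file for every variable.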
[R] Time series analysis for a large number of series
I have over 8000 time series that I need to analyze and forecast. Running 1500 takes over 2 hours using just ETS, let alone Holt-Winters and ARIMA, so I am looking for ways to shrink the time needed to generate a 2-year forecast. The code I am using successfully to run through the time series sequentially is below. In essence, the code reads data from multiple CSV files, one per data set, that contain up to 5 years of historical sales by item. I parse each file out by item, generate a time series for each item, fit the ETS model by item, generate a 24-month forecast by item, add the item number to the forecast, and write the forecast to an Excel file.

I'm looking for guidance in two areas:

* Reading the raw data in from Excel, which is in the form:

              d1   d2   d3   d4 ...
    series 1  v11  v12  v13  v14
    series 2  v21  v22  v23  v24
    . . .

* Using parallel processing to analyze the data more quickly using several cores. I have tried to use doParallel at the item level, but without success. I have annotated the code to show where I tried to insert the %dopar% aspects.

    # store the current directory
    initial.dir <- getwd()
    # change to the new directory
    setwd("~/R")
    # load the necessary libraries
    require(TTR)
    require(forecast)
    require(xlsx)
    #require(doParallel)
    #cl <- makeCluster(3)
    #registerDoSNOW(cl)
    #chunks <- getDoParWorkers()
    # output plots to a file
    pdf("R Plots.pdf")
    # set the output file
    sink(file = "R Output.out", type = c("output"))
    # load the dataset
    files <- c("3MH", "6MH", "12MH")
    for (j in 1:3)
    {
      title <- paste("\n\n\n Evaluation of", files[j], "- Started at", date(), "\n\n\n")
      cat(title)
      History <- read.csv(paste(files[j], "csv", sep="."))
      # output forecast to XLSX
      outwb <- createWorkbook()
      sheet <- createSheet(outwb, sheetName = paste(files[j], "- ETS"))
      Item <- unique(unlist(History$Item))
      for (i in 1:length(Item)) # I tried using r <- foreach(i=1:length(Item), .combine='rbind') %dopar% at this level
      {
        title <- paste("Evaluation of item", Item[i], "-", i, "of", length(Item), "\n")
        cat(title)
        # index explicitly: inside subset(), Item would refer to the column History$Item
        data <- History[History$Item == Item[i], ]
        dates <- unique(unlist(data$Date))
        d <- as.Date(dates, format("%d/%m/%Y"))
        data.ts <- ts(data$Volume, frequency=12, start=c(as.numeric(format(d[1],"%Y")), as.numeric(format(d[1],"%m"))))
        #try(plot(decompose(data.ts)))
        #acf(data.ts)
        try(data.ets <- ets(data.ts))
        try(forecast.ets <- forecast.ets(data.ets, h=24))
        IL <- rep(Item[i], 24)
        ets.df <- data.frame(forecast.ets)
        ets.df$Item <- IL
        r <- 24*(i-1)+2
        addDataFrame(ets.df, sheet, col.names=FALSE, startRow=r)
      }
      title <- paste("\n\n\n Evaluation of", files[j], "- Completed at", date(), "\n\n\n")
      cat(title)
      saveWorkbook(outwb, paste(files[j], "xlsx", sep='.'))
    }
    # close the output file
    sink()
    dev.off()
    #stopCluster(cl)
    # change back to the original directory
    setwd(initial.dir)

Trevor Miles
Vice President, Thought Leadership
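One possible shape for the parallel step is sketched below using the base `parallel` package rather than doParallel. Several things here are assumptions made only for illustration: `forecast_one` and `forecast_all` are hypothetical names, simple exponential smoothing from `stats::HoltWinters()` stands in for `ets()` so the example runs without the forecast package, and the toy `history` data frame only mimics the Item and Volume columns used in the code above.

```r
library(parallel)

# Hypothetical stand-in for the ets() fit: simple exponential smoothing
forecast_one <- function(volume, h = 24) {
  fit <- stats::HoltWinters(stats::ts(volume, frequency = 12),
                            beta = FALSE, gamma = FALSE)
  as.numeric(stats::predict(fit, n.ahead = h))
}

# Split the history by item, forecast each item, and return one long table
forecast_all <- function(history, h = 24, cores = 1L) {
  by_item <- split(history$Volume, history$Item)
  if (cores > 1L) {
    cl <- makeCluster(cores)
    on.exit(stopCluster(cl))
    fc <- parLapply(cl, by_item, forecast_one, h = h)
  } else {
    fc <- lapply(by_item, forecast_one, h = h)
  }
  data.frame(Item = rep(names(fc), each = h),
             Step = rep(seq_len(h), times = length(fc)),
             Forecast = unlist(fc, use.names = FALSE))
}

# Invented data: three items with 36 months of history each
set.seed(1)
history <- data.frame(Item = rep(c("A", "B", "C"), each = 36),
                      Volume = 100 + rnorm(108, sd = 5))

result <- forecast_all(history, cores = 1L)
head(result)
```

Collecting every forecast into a single data frame and writing the workbook once at the end keeps file output out of the workers, which matters because several processes cannot safely add rows to the same sheet. Passing a value such as `cores = 3L` switches to `parLapply()` on a local cluster.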