Re: [R] grofit package problem inputting dataset

2014-08-09 Thread Jeff Newmiller
Your csv output doesn't have any commas in it. Your email is in HTML format 
so we cannot trust it to show what is really there (read the Posting Guide). 
The sink function forwards stuff that would have been printed to a file, but 
that isn't a particularly good way to exchange data with other software.

I use

write.table(foo, "data.csv", sep=",", row.names=FALSE)

to export csv data.
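
As a rough sketch of the round trip (the toy data frame `foo` and the temporary file path are made up for illustration, not taken from the grofit example), a comma-separated file written this way can be read straight back into a data frame, and a file saved from Excel as CSV can be read the same way:

```r
# Hypothetical toy data; a real data set would replace this.
foo <- data.frame(id = 1:3,
                  label = c("A", "B", "C"),
                  value = c(0.5, 1.25, 2))

# Write to a temporary location so the sketch does not touch the working directory.
csv_path <- file.path(tempdir(), "data.csv")
write.table(foo, csv_path, sep = ",", row.names = FALSE)

# Read the file back; read.csv() expects the header row that write.table() wrote.
bar <- read.csv(csv_path, stringsAsFactors = FALSE)
str(bar)
```

Opening the resulting file in a text editor, rather than trusting an HTML email, shows whether the commas are really there.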
---
Jeff Newmiller
DCN: jdnew...@dcn.davis.ca.us
Research Engineer (Solar/Batteries/Software/Embedded Controllers)
---
Sent from my phone. Please excuse my brevity.

On August 8, 2014 10:19:48 PM PDT, Fethe, Michael mfet...@vols.utk.edu 
wrote:
I've recently wanted to analyze some data sets of growth curves, so I
decided to try out the grofit package, but inputting the dataset gave me
some issues.


I've been trying to replicate the example from the grofit package with


R


 foo <- ran.data(100, 25)

 time <- foo$time
 data <- foo$data


 write.csv(file="data.csv", foo)

 sink(file="sink.txt")

 foo

 sink()


Also, to see whether there is another problem I'm missing, I've used sink() to
see the actual output of this data.


The sink call provides a dataframe object; however, I'm trying to use my
own data in this. Is there a way to create the dataframe object in
Excel? (I've tried following the example from the write.csv() output.)


I know there is a problem with the input method when following the .csv
output, and I could follow the sink() output to create my data frame
object; however, I'm not sure how I would do this with a large
dataset with, let's say, 1000 data points. Has anyone run into this issue,
and is there a quick workaround?



the sink() output


$data
X1 X2 X3   X4   X5   X6   X7   
X8  X9 X10 X11   X12   X13   X14   
   X15  X16  X17
1 Test I  A 0.06195419  0.314959772  0.398263408  0.091132317 
1.012083039  0.35122189  1.78719671  2.47453614  3.25305005 3.9829927
5.2863775 5.8576975 7.2154323 8.501599 8.405278
2 Test I  B 0.09543701  0.441797809 -0.345308045  0.462532336 
1.324153423  1.20722268  1.71422886  2.40135394  2.79558398 3.8981917
5.3614344 5.8423570 6.8511538 7.192188 8.344946
3 Test I  C 0.16386938 -0.178975999  0.464790443  0.325264753 
0.033088580  0.79395919  0.63525411  1.71685521  2.62738577 2.5209004
4.4535178 4.9102966 6.8867905 6.996433 7.553863
4 Test I  D 0.14198175  0.100778235 -0.164231759 -0.322266709 
0.571067561  1.24234632  1.54056165  1.96933568  2.97097469 3.7711348
3.8414402 5.2768758 5.6688960 7.041779 7.651093
5 Test I  E 0.25390711  0.039312093 -0.463351713  0.628339527 
0.418403984  0.56460811  1.26348242  1.56823878  1.93588943 3.3141874
3.0360158 3.7567956 5.6896075 5.873556 6.524754
6 Test I  F 0.22319304 -0.076464713  0.074501305 -0.160924707 
0.384392150  0.76412340  1.36118116  1.50356468  2.53322106 3.4924520
4.5054475 4.6222326 5.5347148 6.349000 7.482548
7 Test I  G 0.28095487 -0.728248588  0.479450323  0.542078371 
0.757460716  0.10292177  1.01113655  1.14448036  2.19976257 3.3030023
3.0848547 4.1877661 5.2832997 5.485825 6.375234
8 Test I  A 0.32135093 -0.036980401 -0.068313544 -0.059620107 
0.440995143  0.48424749  0.65644521  1.38482000  2.05964880 2.2548116
2.6813025 3.5085200 4.7988415 5.017753 5.405950
9 Test I  B 0.31930456  0.169104577  0.100637447  0.070003632 
0.304209263  1.72301034


...


$time
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
[,9][,10][,11][,12][,13][,14][,15][,16]   
[,17][,18][,19][,20]
[1,] 1.105562 2.429351 3.303547 4.941900 5.370762 6.939251 7.546835
8.400435 9.602104 10.85058 11.27732 12.12658 13.80886 14.03953 15.88684
16.63256 17.85349 18.29452 19.94482 20.74473
[2,] 1.842357 2.183849 3.528340 4.761010 5.610834 6.473241 7.771216
8.575116 9.544966 10.79126 11.61007 12.91479 13.21070 14.27869 15.30520
16.32062 17.71822 18.94936 19.58821 20.46276
[3,] 1.787510 2.734994 3.063325 4.788528 5.676988 6.096337 7.373838
8.047332 9.042756 10.35715 11.47604 12.84095 13.14183 14.77351 15.81132
16.63433 17.37387 18.51372 19.83332 20.74899
[4,] 1.074662 2.386418 3.734951 4.889867 5.994899 6.483399 7.603956
8.826142 9.536391 10.01653 11.03079 12.60530 13.00965 14.70169 15.17157
16.75678 17.85311 18.22975 19.28407 20.73184
[5,] 1.748113 2.185892 3.236870 4.294671 5.093055 6.910312 7.881226
8.067719 9.632505 10.26807 11.03523 12.75277 13.66110 14.30814 15.61313
16.62628 17.98222 18.95378 19.46946 20.17275



the write.csv() output

data.X1 data.X2 data.X3 data.X4 data.X5 data.X6 data.X7 

Re: [R] Logical operators and named arguments

2014-08-09 Thread Prof Brian Ripley

On 09/08/2014 01:10, Joshua Wiley wrote:

On Sat, Aug 9, 2014 at 9:56 AM, Patrick Burns pbu...@pburns.seanet.com
wrote:


On 07/08/2014 07:21, Joshua Wiley wrote:


Hi Ryan,

It does work, but the *apply family of functions always pass to the first
argument, so you can specify e2 = , but not e1 =.  For example:

  sapply(1:3, `>`, e2 = 2)



[1] FALSE FALSE  TRUE



That is not true:



But it is passed as the first argument, not by name, but positionally.  The
reason it works with your gt() is because R with regular functions is
flexible:


f <- function(x, y) x > y
f(1:3, x = 2)

[1]  TRUE FALSE FALSE

but primitives ARE positionally matched


That's not true either.  Almost all primitives intended to be called as 
functions do have standard argument-matching semantics.  (Once upon a 
time they did not, but I added the requisite code years ago.)  There are 
six exceptions plus binary operators and other language elements.


See 
http://cran.r-project.org/doc/manuals/r-release/R-ints.html#g_t_002eInternal-vs-_002ePrimitive 
 and the comments about primitive functions in ?lapply.





`>`(1:3, 2)

[1] FALSE FALSE  TRUE

`>`(1:3, e1 = 2)

[1] FALSE FALSE  TRUE





gt <- function(x, y) x > y


sapply(1:3, gt, y=2)

[1] FALSE FALSE  TRUE

sapply(1:3, gt, x=2)

[1]  TRUE FALSE FALSE

Specifying the first argument(s) in an apply
call is a standard way of getting flexibility.

I'd hazard to guess that the reason the original
version doesn't work is because `>` is Primitive.
There's speed at the expense of not behaving quite
the same as typical functions.

Pat



  From ?sapply




   'lapply' returns a list of the same length as 'X', each element of
   which is the result of applying 'FUN' to the corresponding element
   of 'X'.

so `>` is applied to each element of 1:3

`>`(1, ...)
`>`(2, ...)
`>`(3, ...)

and if e2 is specified then that is passed

`>`(1, 2)
`>`(2, 2)
`>`(3, 2)

Further, see ?Ops

 If the members of this group are called as functions, any
argument names are removed to ensure that positional matching
is always used.

and you can see this at work:

  `>`(e1 = 1, e2 = 2)



[1] FALSE


`>`(e2 = 1, e1 = 2)


[1] FALSE

If you want the flexibility to specify which argument the elements of X
should be *applied to, use a wrapper:

  sapply(1:3, function(x) `>`(x, 2))



[1] FALSE FALSE  TRUE


sapply(1:3, function(x) `>`(2, x))


[1]  TRUE FALSE FALSE
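
To put the behaviours discussed in this thread side by side, here is a small self-contained sketch. The names `r1` to `r4` are only for illustration, and `gt` is redefined here with arguments named `e1` and `e2` so that it can be compared directly with the operator:

```r
x <- 1:3

# Called as a function, `>` drops argument names and matches by position,
# so both of these evaluate x[i] > 2.
r1 <- sapply(x, `>`, e2 = 2)
r2 <- sapply(x, `>`, e1 = 2)

# An ordinary closure matches by name, so naming e1 pushes x[i] into e2:
# this evaluates 2 > x[i].
gt <- function(e1, e2) e1 > e2
r3 <- sapply(x, gt, e1 = 2)

# A wrapper makes the intended argument position explicit.
r4 <- sapply(x, function(v) 2 > v)

rbind(r1, r2, r3, r4)
```

The first two rows agree with each other, and the last two rows agree with each other, which is the difference the original question ran into.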


HTH,

Josh



On Thu, Aug 7, 2014 at 2:20 PM, Ryan rec...@bwh.harvard.edu wrote:

  Hi,


I'm wondering why calling `>` with named arguments doesn't work as
expected:

  args(`>`)



function (e1, e2)
NULL

  sapply(c(1,2,3), `>`, e2=0)



[1] TRUE TRUE TRUE

  sapply(c(1,2,3), `>`, e1=0)



[1] TRUE TRUE TRUE

Shouldn't the latter be FALSE?

Thanks for any help,
Ryan





__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



--
Patrick Burns
pbu...@pburns.seanet.com
twitter: @burnsstat @portfolioprobe
http://www.portfolioprobe.com/blog
http://www.burns-stat.com
(home of:
  'Impatient R'
  'The R Inferno'
  'Tao Te Programming')








--
Brian D. Ripley,  rip...@stats.ox.ac.uk
Emeritus Professor of Applied Statistics, University of Oxford
1 South Parks Road, Oxford OX1 3TG, UK



[R] Possible pair of 2 binary vectors

2014-08-09 Thread Ron Michael
Hi,

Let's say I have 2 binary vectors of length 'd', so each element of both vectors
can take only the values 0 or 1. Now I want to simulate all possible pairs of them.
Theoretically there will be 4^d possible pairs.

Is there any R function to directly simulate them?

Thanks for your help.



[R] R Package for Text Manipulation

2014-08-09 Thread Omar André Gonzáles Díaz
Hi all,

I want to know where I can find a package to simulate the functions
Search and Replace and Find Words that contain - replace them with...,
that we can use in EXCEL.

I've looked in other places and they say: Reshape2 by Hadley Wickham. However,
I've investigated it and it's not exactly what I'm looking for (its main
functions are cast and melt, as I'm sure you know).

Could you help me, please? I want to download data from Google Analytics and
clean it; what is the best approach?



Re: [R] Possible pair of 2 binary vectors

2014-08-09 Thread Jorge I Velez
Dear Ron,

What about this?

set.seed(123)
d <- 4
x1 <- sample(0:1, d, TRUE)
x2 <- sample(0:1, d, TRUE)
x1
x2
expand.grid(x1 = x1, x2 = x2)

See ?expand.grid for more information.
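
If the goal is instead to enumerate every one of the 4^d pairs of length-d binary vectors, one possible reading of the question, a sketch along these lines may help. The column names `x1_*` and `x2_*` are made up for illustration; each row is one pair, with the first d columns holding the first vector and the last d columns holding the second:

```r
d <- 3

# Cross 2*d copies of 0:1, giving every combination of 2*d binary digits.
grid <- expand.grid(rep(list(0:1), 2 * d))
names(grid) <- c(paste0("x1_", seq_len(d)), paste0("x2_", seq_len(d)))

nrow(grid) == 4^d   # one row per pair
head(grid)
```

The table grows as 4^d, so this is only practical for small d.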

Best,
Jorge.-





Re: [R] R Package for Text Manipulation

2014-08-09 Thread Gabor Grothendieck
On Sat, Aug 9, 2014 at 8:15 AM, Omar André Gonzáles Díaz
oma.gonza...@gmail.com wrote:
 Hi all,

 I want to know, where i can find a package to simulate the functions
 Search and Replace  and Find Words that contain - replace them with...,
 that we can use in EXCEL.

 I've look in other places and they say: Reshape2 by Hadley Wickham. How
 ever, i've investigated it and its not exactly what i'm looking (it's main
 functions are cast and melt, sure you know them).

 May you help me please? I want to download data from Google Analytics and
 clean it, what is the best approach?

 [[alternative HTML version deleted]]


1. The gsubfn function in the gsubfn package can do that.  These
commands extract the words and then apply the function represented in
formula notation in the second argument to them:

library(gsubfn) # home page at http://gsubfn.googlecode.com
s <- "The quick brown fox" # test data

# replace the word quick with QUICK

gsubfn("\\S+", ~ if (x == "quick") "QUICK" else x, s)
## [1] "The QUICK brown fox"

# replace words containing o with ?

gsubfn("\\S+", ~ if (grepl("o", x)) "?" else x, s)
## [1] "The quick ? ?"

2. It can also be done without packages:

# replace quick with QUICK

gsub("\\bquick\\b", "QUICK", s)
## [1] "The QUICK brown fox"

# or the following which first split s into a vector of words and
# operate on that pasting it back into a single string at the end

words <- strsplit(s, "\\s+")[[1]]
paste(replace(words, words == "quick", "QUICK"), collapse = " ")
## [1] "The QUICK brown fox"

# replace words containing o with ?.  Use `words` from above.

paste(replace(words, grepl("o", words), "?"), collapse = " ")
## [1] "The quick ? ?"

-- 
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com



Re: [R] Installing manual package problem

2014-08-09 Thread Ista Zahn
If you just want to install the package from github, the easy way is to
first install the devtools package and use the install_github function.

Best,
Ista
 On Aug 8, 2014 4:21 PM, James Holland holland.ag...@gmail.com wrote:

 Running R 3.0.3 on Windows 7

 I am trying to install a package from a github repository.

 https://github.com/google/glassbox

 I downloaded the repository as a zip file, extracted it to get the glassbox
 folder and re-zipped it with 7-zip.

 I then ran

 #-Start code---#

 install.packages("C:/Users/jholland/Downloads/glassbox.zip", repos=NULL,
 type="source")

 #-#

 The output message said

 Installing package into ‘C:/Users/jholland/Documents/R/win-library/3.0’
 (as ‘lib’ is unspecified)

  library(glassbox)
 Error in library(glassbox) : ‘glassbox’ is not a valid installed package

 I'm not sure what I'm doing wrong.  When I look in the R library folder
 (...R/win-library/3.0) I see the glassbox folder there.

 I'm new to using packages not from the CRAN list so I'm trying to learn
 fast.  I tried some searching and this seems to be what I'm supposed to
 do, but perhaps I need to use dev mode?

 Thank you for the help.

 ~James



Re: [R] Installing manual package problem

2014-08-09 Thread James Holland
Thank you all, I didn't know about the install_github function.

Sorry, forgot to switch to plain text



Re: [R] Installing manual package problem

2014-08-09 Thread Uwe Ligges



On 09.08.2014 17:40, James Holland wrote:

Thank you all, I didn't know about the install_github function.

Sorry, forgot to switch to plain text

On Sat, Aug 9, 2014 at 10:11 AM, Ista Zahn istaz...@gmail.com wrote:

If you just want to install the package from github, the easy way is to
first install the devtools package and use the install_github function.



Reason why your former approach did not work:
This is a source package, you need to install source packages via

install.packages(..., type="source")

or from the command line via

R CMD INSTALL package_version.tar.gz

See the R Installation and Administration manual for details.
To build a proper .tar.gz file, do use

R CMD build directory_name

from the command line.

Best,
Uwe Ligges






Re: [R] R Package for Text Manipulation

2014-08-09 Thread David Winsemius

On Aug 9, 2014, at 5:15 AM, Omar André Gonzáles Díaz wrote:

 Hi all,
 
 I want to know, where i can find a package to simulate the functions
 Search and Replace  and Find Words that contain - replace them with...,
 that we can use in EXCEL.
 
 I've look in other places and they say: Reshape2 by Hadley Wickham. How
 ever, i've investigated it and its not exactly what i'm looking (it's main
 functions are cast and melt, sure you know them).
 
 May you help me please? I want to download data from Google Analytics and
 clean it, what is the best approach?
 

That request is on the vague side. You are advised in the Posting Guide to 
include code that begins an analysis and then requests assistance with specific 
difficulties. (You are also asked to do this in a plain text message since HTML 
tends to scramble messages.) The base package offers the `grep`, `sub`, and 
`gsub` functions, which bring the power of regular expressions to the R user. 
They are much more flexible than anything that Excel offers. Please look at:

?grep
?regex
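
As a small illustration of those base functions (the `pages` vector is made-up sample data, not real Google Analytics output), a search-and-replace pass might look like this:

```r
# Hypothetical page paths, standing in for a column exported from an analytics tool.
pages <- c("/home?utm=1", "/blog/post-1", "/Blog/post-2", "/about")

# "Find words that contain": which entries mention "blog", ignoring case?
has_blog <- grepl("blog", pages, ignore.case = TRUE)

# "Search and replace": drop everything from the first "?" onward.
no_query <- sub("\\?.*$", "", pages)

# gsub() replaces every match in each string; fixed = TRUE treats the
# pattern as a literal string rather than a regular expression.
lower_blog <- gsub("Blog", "blog", no_query, fixed = TRUE)

data.frame(pages, has_blog, cleaned = lower_blog)
```

`sub` changes only the first match in each string, while `gsub` changes all of them.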


   [[alternative HTML version deleted]]

And do :

 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


-- 
David Winsemius
Alameda, CA, USA



[R] loops with assign() and get()

2014-08-09 Thread Laura Villegas Ortiz
Dear all,

I was able to create 102 distinct dataframes (DFs1, DFs2, DFs3, etc) using
the assign() in a loop.

Now, I would like to perform the following transformation for each one of
these dataframes:

df1=DFs1[1,]
df1=df1[,1:3]
names(df1)=names(DFs1[c(1,4,5)])
df1=rbind(df1,DFs1[c(1,4,5)])
names(df1)=c("UID","Date","Location")

something like this:

for (i in 1 : nrow(unique)){

dfi=DFsi[1,]
dfi=dfi[,1:3]
names(dfi)=names(DFsi[c(1,4,5)])
dfi=rbind(dfi,DFsi[c(1,4,5)])
names(dfi)=c("UID","Date","Location")

}

I thought it would be straightforward, but it has proven to be the opposite.

Many thanks

Laura



[R] Reading chunks of data from a file more efficiently

2014-08-09 Thread Waichler, Scott R
Hi,

I have some very large (~1.1 GB) output files from a groundwater model called 
STOMP that I want to read as efficiently as possible.  For each variable there 
are over 1 million values to read.  Variables are not organized in columns; 
instead they are written out in sections in the file, like this:

X-Direction Node Positions, m
 5.93145E+05  5.93155E+05  5.93165E+05  5.93175E+05
 5.93245E+05  5.93255E+05  5.93265E+05  5.93275E+05
. . . 
 5.94695E+05  5.94705E+05  5.94715E+05  5.94725E+05
 5.94795E+05  5.94805E+05  5.94815E+05  5.94825E+05

Y-Direction Node Positions, m
 1.14805E+05  1.14805E+05  1.14805E+05  1.14805E+05
 1.14805E+05  1.14805E+05  1.14805E+05  1.14805E+05
. . . 
 1.17195E+05  1.17195E+05  1.17195E+05  1.17195E+05
 1.17195E+05  1.17195E+05  1.17195E+05  1.17195E+05

Z-Direction Node Positions, m
 9.55000E+01  9.55000E+01  9.55000E+01  9.55000E+01
 9.55000E+01  9.55000E+01  9.55000E+01  9.55000E+01
. . .

I want to read and use only a subset of the variables.  I wrote the function 
below to find the line where each target variable begins and then scan the 
values, but it still seems rather slow, perhaps because I am opening and 
closing the file for each variable.  Can anyone suggest a faster way?

# Reads original STOMP plot file (plot.*) directly.  Should be useful when the
# plot files are very large with lots of variables, and you just want to
# retrieve a few of them.
# Arguments:  1) plot filename, 2) number of nodes, 
# 3) character vector of names of target variables you want to return.
# Returns a list with the selected plot output.
READ.PLOT.OUTPUT6 <- function(plt.file, num.nodes, var.names) {
  lines <- readLines(plt.file)
  num.vars <- length(var.names)
  tmp <- list()
  for(i in seq_len(num.vars)) {
    # line number of the header for this variable
    ind <- grep(var.names[i], lines, fixed=TRUE, useBytes=TRUE)
    if(length(ind) != 1) stop("Not one line in the plot file with matching variable name.\n")
    # re-opens the file and skips down to the values that follow the header
    tmp[[i]] <- scan(plt.file, skip=ind, nmax=num.nodes, quiet=TRUE)
  }
  return(tmp)
}  # end READ.PLOT.OUTPUT6()

Regards,
Scott Waichler
Pacific Northwest National Laboratory
Richland, WA, USA
scott.waich...@pnnl.gov



Re: [R] loops with assign() and get()

2014-08-09 Thread William Dunlap
 I was able to create 102 distinct dataframes (DFs1, DFs2, DFs3, etc) using
 the assign() in a loop.

The first step to making things easier to do is to put those data.frames
into a list.  I'll call it DFs and your data.frames will now be DFs[[1]],
DFs[[2]], ..., DFs[[length(DFs)]].
DFs <- lapply(paste0("DFs", 1:102), get)
In the future, I think it would be easier if you skipped the 'assign()'
and just put the data into a list from the start.

Now use lapply to process that list, creating a new list called 'df', where
df[[i]] is the result of processing DFs[[i]]:

df - lapply(DFs, FUN=function(DFsi) {
  # your code from the for loop you supplied
  dfi=DFsi[1,]
  dfi=dfi[,1:3]
  names(dfi)=names(DFsi[c(1,4,5)])
  dfi=rbind(dfi,DFsi[c(1,4,5)])
  names(dfi)=c("UID","Date","Location")
  dfi # return this to put in list that lapply is making
  })

(You didn't supply sample data so I did not run this - there may be typos.)
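
Since no sample data was posted, here is a minimal sketch of the same list-based pattern on made-up data. The three toy data frames and their columns `id`, `a`, `b` are hypothetical stand-ins, and the per-element step is deliberately simpler than the transformation asked about:

```r
# Build the data frames straight into a list instead of assign()-ing
# DFs1, DFs2, DFs3 into the workspace.
DFs <- lapply(1:3, function(i) {
  data.frame(id = i, a = i * 10, b = i * 100)
})
names(DFs) <- paste0("DFs", seq_along(DFs))

# Apply the same step to every element; here, keep columns 1 and 3.
processed <- lapply(DFs, function(d) d[, c(1, 3)])

# Stack the results into one data frame if a single table is wanted.
combined <- do.call(rbind, processed)
combined
```

Keeping the pieces in a named list means the element names play the role that the numbered variable names played before, without any need for get().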

Bill Dunlap
TIBCO Software
wdunlap tibco.com




Re: [R] Reading chunks of data from a file more efficiently

2014-08-09 Thread Jeff Newmiller
Informally abbreviating data is not recommended... I faked some, but would 
appreciate if you would make your example reproducible next time.


All I really did for performance was use the data you read in rather than 
re-scanning the file.
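
A stripped-down sketch of that idea, reading the file once and parsing each section from the lines already in memory, might look like the following. The helper name `read_sections` and the tiny temporary file are made up for illustration and are not STOMP output; on a 1.1 GB file the slicing would need more care than shown here:

```r
# Write a tiny stand-in file with the same layout as the post.
tmp <- tempfile()
writeLines(c("X-Direction Node Positions, m",
             " 1.0E+00  2.0E+00",
             " 3.0E+00  4.0E+00",
             "",
             "Y-Direction Node Positions, m",
             " 5.0E+00  6.0E+00",
             " 7.0E+00  8.0E+00"), tmp)

read_sections <- function(path, var.names, num.nodes) {
  lines <- readLines(path)   # the only pass over the file
  out <- lapply(var.names, function(v) {
    ind <- grep(v, lines, fixed = TRUE)
    if (length(ind) != 1) stop("Expected exactly one header line for: ", v)
    # Parse from the in-memory lines; nmax stops the scan after num.nodes
    # values, before the next header is reached.
    scan(text = lines[seq(ind + 1, length(lines))],
         nmax = num.nodes, quiet = TRUE)
  })
  names(out) <- var.names
  out
}

res <- read_sections(tmp,
                     c("X-Direction Node Positions, m",
                       "Y-Direction Node Positions, m"),
                     num.nodes = 4)
str(res)
```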


# generated by using dput()
lines <- c("X-Direction Node Positions, m",
" 5.93145E+05  5.93155E+05  5.93165E+05  5.93175E+05",
" 5.93245E+05  5.93255E+05  5.93265E+05  5.93275E+05",
" 5.94695E+05  5.94705E+05  5.94715E+05  5.94725E+05",
" 5.94795E+05  5.94805E+05  5.94815E+05  5.94825E+05",
"",
"Y-Direction Node Positions, m",
" 1.14805E+05  1.14805E+05  1.14805E+05  1.14805E+05",
" 1.14805E+05  1.14805E+05  1.14805E+05  1.14805E+05",
" 1.17195E+05  1.17195E+05  1.17195E+05  1.17195E+05",
" 1.17195E+05  1.17195E+05  1.17195E+05  1.17195E+05",
"",
"Z-Direction Node Positions, m",
" 9.55000E+01  9.55000E+01  9.55000E+01  9.55000E+01",
" 9.55000E+01  9.55000E+01  9.55000E+01  9.55000E+01",
" 9.55000E+01  9.55000E+01  9.55000E+01  9.55000E+01",
" 9.55000E+01  9.55000E+01  9.55000E+01  9.55000E+01",
"",
"X-Direction Node Positions, n",
" 5.93145E+05  5.93155E+05  5.93165E+05  5.93175E+05",
" 5.93245E+05  5.93255E+05  5.93265E+05  5.93275E+05",
" 5.94695E+05  5.94705E+05  5.94715E+05  5.94725E+05",
" 5.94795E+05  5.94805E+05  5.94815E+05  5.94825E+05",
"",
"Y-Direction Node Positions, n",
" 1.14805E+05  1.14805E+05  1.14805E+05  1.14805E+05",
" 1.14805E+05  1.14805E+05  1.14805E+05  1.14805E+05",
" 1.17195E+05  1.17195E+05  1.17195E+05  1.17195E+05",
" 1.17195E+05  1.17195E+05  1.17195E+05  1.17195E+05",
"",
"Z-Direction Node Positions, n",
" 9.55000E+01  9.55000E+01  9.55000E+01  9.55000E+01",
" 9.55000E+01  9.55000E+01  9.55000E+01  9.55000E+01",
" 9.55000E+01  9.55000E+01  9.55000E+01  9.55000E+01",
" 9.55000E+01  9.55000E+01  9.55000E+01  9.55000E+01",
"", "")

getDimVar <- function( lines, Dim, specifiedvar, starts ) {
  vstart <- grep( paste0( "^", Dim, "-Direction Node Positions, "
                        , specifiedvar, "$" ), lines )
  startv <- match( vstart, starts )
  if ( 0 == length( startv ) ) {
    stop( "Variable ", specifiedvar, " not found" )
  }
  if ( length( starts ) == startv ) {
    vend <- length( lines )
  } else {
    vend <- starts[ startv + 1 ] - 1
  }
  tcon <- textConnection( lines[ seq( vstart + 1, vend ) ] )
  result <- scan( tcon )
  close( tcon )
  result
}

starts <- grep( "^[XYZ]-Direction Node Positions, ", lines )

specifiedvar <- "n"
n <- data.frame( X=getDimVar( lines, "X", specifiedvar, starts )
               , Y=getDimVar( lines, "Y", specifiedvar, starts )
               , Z=getDimVar( lines, "Z", specifiedvar, starts ) )

# test a variable that doesn't exist
specifiedvar <- "o"
o <- data.frame( X=getDimVar( lines, "X", specifiedvar, starts )
               , Y=getDimVar( lines, "Y", specifiedvar, starts )
               , Z=getDimVar( lines, "Z", specifiedvar, starts ) )



[R] Time series analysis for a large number of series

2014-08-09 Thread Trevor Miles
I have over 8000 time series that I need to analyze and forecast. Running 1500 
takes over 2 hours using just ETS, let alone Holt-Winters and ARIMA. So I am 
looking at ways of shrinking the time to generate a 2-year forecast.

The code I am using successfully to run through the time series sequentially is 
below. In essence, the code reads data from multiple CSV files, 1 per data 
set, that contain up to 5 years of historical sales by item. I parse each file 
out by item, generate a time series for each item, fit the ETS model by item, 
generate a 24-month forecast by item, add the item number to the forecast, 
and write the forecast to an Excel file.

I'm looking for guidance in two areas:

* Reading the raw data in from Excel which is in the form:
           d1    d2   d3   d4 ...
series 1  v11   v12  v13  v14
series 2  v21   v22  v23  v24
.
.

* Using parallel processing to analyze the data more quickly using 
several cores.

I have tried to use doParallel at the item level, but without success. I have 
annotated the code to show where I tried to insert the %dopar% aspects.

# store the current directory
initial.dir <- getwd()
# change to the new directory
setwd("~/R")
# load the necessary libraries
require(TTR)
require(forecast)
require(xlsx)

#require(doParallel)
#cl <- makeCluster(3)
#registerDoSNOW(cl)
#chunks <- getDoParWorkers()

# output plots to a file
pdf("R Plots.pdf")
# set the output file
sink(file = "R Output.out", type = c("output"))

# load the dataset
files <- c("3MH", "6MH", "12MH")
for (j in 1:3)
{
  title <- paste("\n\n\n Evaluation of", files[j], " - Started at", date(), "\n\n\n")
  cat(title)

  History <- read.csv(paste(files[j], "csv", sep="."))

  # output forecast to XLSX
  outwb <- createWorkbook()
  sheet <- createSheet(outwb, sheetName = paste(files[j], " - ETS"))
  Item <- unique(unlist(History$Item))

  for (i in 1:length(Item))  # I tried using r <- foreach(i=1:length(Item), .combine='rbind') %dopar% at this level
  {
    title <- paste("Evaluation of item ", Item[i], "-", i, "of", length(Item), "\n")
    cat(title)
    # index explicitly: inside subset(), Item[i] would refer to the
    # History$Item column rather than the vector of unique items
    data <- History[History$Item == Item[i], ]
    dates <- unique(unlist(data$Date))
    d <- as.Date(dates, format = "%d/%m/%Y")
    data.ts <- ts(data$Volume, frequency=12,
                  start=c(as.numeric(format(d[1], "%Y")), as.numeric(format(d[1], "%m"))))
    #try(plot(decompose(data.ts)))
    #acf(data.ts)
    try(data.ets <- ets(data.ts))
    # forecast() dispatches to the ets method
    try(forecast.ets <- forecast(data.ets, h=24))
    # one item label per forecast month
    IL <- rep(Item[i], 24)
    ets.df <- data.frame(forecast.ets)
    ets.df$Item <- IL
    r <- 24*(i-1)+2
    addDataFrame(ets.df, sheet, col.names=FALSE, startRow=r)
  }

  title <- paste("\n\n\n Evaluation of", files[j], " - Completed at", date(), "\n\n\n")
  cat(title)
  saveWorkbook(outwb, paste(files[j], "xlsx", sep='.'))
}

# close the output file
sink()
dev.off()
#stopCluster(cl)
# change back to the original directory
setwd(initial.dir)
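
One way the per-item work might be restructured for parallel execution is sketched below using only the base `parallel` package. Everything here is an assumption for illustration: the toy `History` data, the placeholder `forecast_one` (which simply repeats the series mean where ets() and forecast() would go), and the `cores` setting. Writing to the workbook is left out of the parallel step, since the per-item results are collected first and can be written afterwards in one place:

```r
library(parallel)

# Hypothetical toy history: 3 items x 36 monthly volumes (stand-in for one CSV).
set.seed(1)
History <- data.frame(Item = rep(c("A", "B", "C"), each = 36),
                      Volume = runif(108, 50, 100))

# One monthly time series per item, prepared up front so that each
# worker receives an independent piece of the data.
series <- lapply(split(History$Volume, History$Item), ts, frequency = 12)

# Placeholder forecaster: repeats the series mean for h periods.
# A real run would fit and forecast a model here instead.
forecast_one <- function(x, h = 24) rep(mean(x), h)

# Assumption: raise this on Unix-alikes; on Windows mclapply() only
# supports mc.cores = 1, where it runs serially.
cores <- 1L
fc <- mclapply(series, forecast_one, h = 24, mc.cores = cores)

# Bind the per-item forecasts into one table with the item label attached.
result <- do.call(rbind, lapply(names(fc), function(id) {
  data.frame(Item = id, Month = seq_along(fc[[id]]), Forecast = fc[[id]])
}))
head(result)
```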


Trevor Miles
Vice President, Thought Leadership