date:20090622

[R] Putting R Based open source analytics for collobrative spreadsheet working on the Cloud

2009-06-22 Thread Ajay ohri



 Dear All,

I just posted an interview with Karim Chine of http://www.biocep.net/ who
has successfully built a latform for on demand data mining enabled by the
cloud through R.

Here is an except

BIOCEP is built on top of R and Scilab and anything that you can do within
those environments is accessible through BIOCEP. Here is what you have
uniquely with this new R/Scilab-based e-platform:

- *High productivity* via the most advanced cross-platform workbench
available *for the R environment*.

- *Advanced Graphics: with BIOCEP, a graphic transducer allows the rendering
on client side of graphics produced on server s*ide and enables advanced
capabilities like zooming/unzooming/scrolling for R graphics. a client side
mouse tracker allows to display dynamically information related to the
graphics and depending on coordinates. Several virtual R Devices showing
different data can be coupled in zooming/scrolling and this helps comparing
visually complex graphics.

- *Extensibility with plug-ins:* new views (IDE-like views, analytical
interfaces...) can be created very easily either programmatically or via
drag-and-drop GUI designers.

- *Extensibility with server-side extensions: any java code can be packaged
and used on server side.* The code can interact seamlessly with R and Scilab
or provide generic bridges to other software. For example, I provide an
extension that allows you to use openoffice as a universal converter between
various files formats on server side.

- *Seamless High Performance Computing:* working with an R or Scilab on
clusters/grids/clouds becomes as simple as working with them locally.
Distributed computing becomes seamless, creating a large number R and Scilab
remote engines and using them to solve large scale problems becomes easier
than ever. From the R console the user can create logical links to existing
R engines or to newly created ones and use those logical links to pilot the
remote workers from within his R session. R functions enable using the
logical links to import/export variables from the R session to the different
workers and vice versa. R commands/scripts can be executed by the R workers
synchronously or asynchronously. Many logical R links can be aggregated into
one logical cluster variable that can be used to pilot the R workers in a
coordinated way. A cluster.apply function allows the usage of the logical
cluster to apply a function to a big data structure by slicing it and
sending elementary execution commands to the workers. The workers apply the
user's function to the slices in parallel. The elementary results are
aggregated to compose the final result that becomes available within the R
session.

- *Collaboration:* your R/scilab server running in the cloud can be accessed
simultaneously by you and your collaborators. Everything gets broadcasted
including Graphics. A spreadsheet enables to view and edit data
collaboratively. Anyone can write plug-ins to take advantage of the
collaborative capabilities of the frameworks. If your IP address is public,
you can provide a URL to anyone and get him connect to your locally running
R.

*- Powerful frameworks for Java developers:* BIOCEP provides Frameworks and
tools to use R as if it was an Object Oriented Java Toolkit or a Web Toolkit
for R-based dynamic application.

- *Webservices for C#, Perl, Python users/developers:* Most of the
capabilities of BIOCEP including piloting of R/Scilab engines on the cloud
for distributed computing or for building scalable analytical web
application are accessible from most of the programming languages thanks to
the SOAP front-end.

- *RESTful API:* simple URLs can perform computing using R/Scilab engines
and return the result as an XML or as graphics in any format. This works
like google charts and has all the power of R since the graphic is described
with an R script provided as a parameter of the URL. The same API can be
exposed on demand by the workbench. This allow for example to integrate a
Cloud-R with Excel or OpenOffice. The workbench works as a bridge between
the cloud and those applications.

While a screenshot is attached- you can read the rest of the interview at

 http://tr.im/Rcloud

or http://www.decisionstats.com/



Thanks,

Ajay Ohri
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] problem with checking wether file is present or not

2009-06-22 Thread venkata kirankumar

Hi all,
I have a problem with checking File is present in the directory or not
like
I have a sequence of files in one folder I have to take each file in order
and have to caliculate on those files data but in order some files are
missing for that I have to check and load those files for that I am using

condition like

if(file.exists(findings)==TRUE){}



its giving results for files which are present in that folder
but for files which are not present in that folder its giving error
but it have to skip and run for remaining files

can any one suggest any other way to make it work
for all those files

thanks in advance
kiran

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] [Rd] Floating point precision / guard digits? (PR#13771)

2009-06-22 Thread Liviu Andronic

(re-posting to the appropriate list)

On 6/20/09, Dr. D. P. Kreil dpkr...@gmail.com wrote:
   you can suggest an online resource to help me use the right vocabulary
   and better understand the fundamental concepts, I am of course
 
There is in R the accuracy [1] package. It has a vignette (and paper)
dealing with various computational errors (in R).
Liviu

[1] http://cran.r-project.org/web/packages/accuracy/index.html

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Error when using step

2009-06-22 Thread Chris Friedl


I have two questions about the built-in function step. Ultimately I want to
apply a lm fitting and subsequent step procedure to thousands of data sets
groups by a factor defined as a unique ID.

Q1. The code below creates a data.frame comprising three marginally noisy
surfaces. The code below fails when I use step in a function but summary
seems to show the model fits are legitimate. Any ideas on what I'm doing
wrong?

# y =  x1 +  x2  for grp A
# y = 2 + 2x1 + 4x2  for grp B
# y = 3 + 2x1 + 4x2 + 6x1x2  for grp C
ind - matrix(runif(200), ncol=2, dimnames=list(c(), c(x1,x2)))
d1 - data.frame(ind, y=ind[,x1]+ind[,x2]+rnorm(100,0,0.05),
grp=rep(A,100))
d2 - data.frame(ind,
y=2+2*ind[,x1]+4*ind[,x2]+rnorm(100,0,0.05),grp=rep(B, 100))
d3 - data.frame(ind,
y=3+2*ind[,x1]+4*ind[,x2]+6*ind[,x1]*ind[,x2]+rnorm(100,0,0.05),grp=rep(C,100))
data2 - rbind(d1,d2,d3)
# Fit each surface by grp
model - y ~ x1*x2
fits - lapply(split(data2, list(data2$grp), drop=TRUE),
function(x){lm(model, data = x)})
lapply(fits, function(x){summary(x)})

# FAIL
lapply(fits, function(x){step(x)})

# Output

Start:  AIC=-601.41
y ~ x1 * x2

Df Sum of Sq RSS AIC
- x1:x2  1   0.001590.23 -602.71
none  0.23 -601.41
Error in as.data.frame.default(data) : 
  cannot coerce class lm into a data.frame



I get a different error if I use step directly on the first fit in the list.

step(fits[[1]])

# Output

Start:  AIC=-601.41
y ~ x1 * x2

Df Sum of Sq RSS AIC
- x1:x2  1   0.001590.23 -602.71
none  0.23 -601.41
Error in eval(predvars, data, env) : 
  numeric 'envir' arg not of length one


However step works if I do the fit outside of a function.

fit - lm(model, data=data2[which(data2$grp==A),])
step(fit)

# Output

Start:  AIC=-601.41
y ~ x1 * x2

Df Sum of Sq RSS AIC
- x1:x2  1   0.001590.23 -602.71
none  0.23 -601.41

Step:  AIC=-602.71
y ~ x1 + x2

   Df Sum of Sq RSS AIC
none 0.23 -602.71
- x21  8.368.58 -241.54
- x11  9.559.78 -228.47

Call:
lm(formula = y ~ x1 + x2, data = data2[which(data2$grp == A), ])

Coefficients:
(Intercept)   x1   x2  
   0.001457 0.999313 1.010880  


Q2. How can I setup step to exclude the intercept. In this example above
both the intercept and interaction terms are insignificant. I have tried
setting direction to both and forward but that didn't help.

Thanks for your help.


-- 
View this message in context: 
http://www.nabble.com/Error-when-using-step-tp24142487p24142487.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] problem with checking wether file is present or not

2009-06-22 Thread David Winsemius



On Jun 22, 2009, at 2:21 AM, venkata kirankumar wrote:


Hi all,
I have a problem with checking File is present in the directory or not
like
I have a sequence of files in one folder I have to take each file in  
order

and have to caliculate on those files data but in order some files are
missing for that I have to check and load those files for that I am  
using


condition like

if(file.exists(findings)==TRUE){}


What is findings? Have you defined it elsewhere? What was the code?  
Is findings a properly constructed file name for R's interaction with  
whatever filesystem you are using?


  file.exists( file=/Users/davidwinsemius/Documents/R_folder/ 
actuar.pdf)
[1] TRUE  #would have been FALSE if only the file name was used, since  
that is not my working directory.




its giving results for files which are present in that folder
but for files which are not present in that folder its giving error
but it have to skip and run for remaining files


And am unable to reproduce the problem with a similar call.
  file.exists(xyz)==TRUE
[1] FALSE

You should note that file.exists returns a logical vector, so the  
comparison with TRUE is superfluous.




can any one suggest any other way to make it work
for all those files


Without more specifics it's not possible to reproduce your problem.  
Include full code and the results of sessionInfo.


?sessionInfo




thanks in advance
kiran

[[alternative HTML version deleted]]


***

PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

!



David Winsemius, MD
Heritage Laboratories
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] how to customize the CV function in Boosting?

2009-06-22 Thread Michael

Hi all,

Is there a way to write my own error criteria function in the
cross-validation part of the Boosting algorithm?

I am talking about the GBM package. Of course, if you know other good
Boosting implementation in R, please give me some pointers! I am
looking for a good classifier (not neccessarily Boosting) in R. Thank
you very much!

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] rbind

2009-06-22 Thread Patrick Burns


I'm guessing that your 'data' and 'data1'
are just vectors so your 'rbind' command
returns a 2 by 3 matrix.

Jim showed you already that:

rbind(as.matrix(data), as.matrix(data1))

will probably get you what you are looking
for.

However, I'm suspicious that just:

c(data, data1)

will serve you just as well.  What are you
planning on doing with a one-column matrix?



Patrick Burns
patr...@burns-stat.com
+44 (0)20 8525 0696
http://www.burns-stat.com
(home of The R Inferno and A Guide for the Unwilling S User)

Xiaogang Yang wrote:

Hi,
I have a array like this
data:
1  5
2  2342
3 33
and another
data1:
1 6
2 5
3 7
 when I do rbind(data,data1)
I get not what I want
 they become
1 5
2 2342
3 33
101 6
102 5
103 7




but I want to make the index as increasing one by one.
like
1 ..
2 ..
3 ..
4 ..
5 ..
6 ..

So what command I should use

thank you.

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Error when using step

2009-06-22 Thread Dieter Menne

Chris Friedl cfriedalek at gmail.com writes:

 I have two questions about the built-in function step. Ultimately I want to
 apply a lm fitting and subsequent step procedure to thousands of data sets
 groups by a factor defined as a unique ID.
 
 Q1. The code below creates a data.frame comprising three marginally noisy
 surfaces. The code below fails when I use step in a function but summary
 seems to show the model fits are legitimate. Any ideas on what I'm doing
 wrong?

... Well designed example code omitted
 
 function(x){lm(model, data = x)})
 lapply(fits, function(x){summary(x)})
 
 # FAIL
 lapply(fits, function(x){step(x)})

 Error in as.data.frame.default(data) : 
   cannot coerce class lm into a data.frame

Looks like an environment problem. I could not find a workaround quickly,
but you might have a look at 

http://finzi.psych.upenn.edu/R/Rhelp02a/archive/16599.html

We call it Ripley's Game here, because variants of it can help you quite 
often.

Dieter

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] rbind

2009-06-22 Thread Patrick Burns


Now that I've actually read the question,
I'm in a better position to answer it.

I have no idea how you are getting the
results that you show, but you can use
'rownames' to set whatever row names you
like.  As in:

rownames(result) - 1:6

Pat

Patrick Burns wrote:

I'm guessing that your 'data' and 'data1'
are just vectors so your 'rbind' command
returns a 2 by 3 matrix.

Jim showed you already that:

rbind(as.matrix(data), as.matrix(data1))

will probably get you what you are looking
for.

However, I'm suspicious that just:

c(data, data1)

will serve you just as well.  What are you
planning on doing with a one-column matrix?



Patrick Burns
patr...@burns-stat.com
+44 (0)20 8525 0696
http://www.burns-stat.com
(home of The R Inferno and A Guide for the Unwilling S User)

Xiaogang Yang wrote:

Hi,
I have a array like this
data:
1  5
2  2342
3 33
and another
data1:
1 6
2 5
3 7
 when I do rbind(data,data1)
I get not what I want
 they become
1 5
2 2342
3 33
101 6
102 5
103 7




but I want to make the index as increasing one by one.
like
1 ..
2 ..
3 ..
4 ..
5 ..
6 ..

So what command I should use

thank you.

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide 
http://www.R-project.org/posting-guide.html

and provide commented, minimal, self-contained, reproducible code.



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide 
http://www.R-project.org/posting-guide.html

and provide commented, minimal, self-contained, reproducible code.



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Convert NA to '.'

2009-06-22 Thread David_D


Dear R-users,

For reporting purpose (using Sweave and LaTeX), I am creating complex tables
with the cat function such as

 x-c(A, B, C, NA)
 cat(x, '\n')
A B C NA

For convenience, I would like to change all my NA value to something else
like '.' (as in SAS for example). Is there a global option which allows this
change? Or should I change all my code to work with the print function and
the na.print argument?

Best regards,
David
-- 
View this message in context: 
http://www.nabble.com/Convert-NA-to-%27.%27-tp24142187p24142187.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] help with installation of local gzip-ped packages

2009-06-22 Thread Uwe Ligges




mau...@alice.it wrote:

I was suggested to install two tar-red and gzip-ped packages that are not part 
of CRAN or BioConductors yet.
I read the R manual about Administration and could only find a good description of how to install packages 
not canonically included in CRAN repository, on UNIX systems.


Which part of the section Installing packages, including the 
subsection Windows, is not clear?


If you need some example, the R Help Desk article called Package 
Management in R News 3 (3), pp. 37-39, might be of interest.


Best,
Uwe Ligges




I work on Linux/SuSE and  Windows ... so I am stuck with such an installation.
Any suggestion in very welcome.

Thank you.
Maura


tutti i telefonini TIM!


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] coloring agnes plots

2009-06-22 Thread Raymond Wan



Hi all,

I am creating dendrograms using agnes and was wondering if it is 
possible to add color to the leaves (and just the leaves).


For example, in the documentation, they have an example using the 
votes.repub data set.  If I wanted to make the word Washington green 
(and only Washington), is it possible and if so how?


I can apply summary to the object created by agnes and even see the 
names of the objects:


Order of objects:
 [1] AlabamaGeorgiaArkansas   Louisiana 
Mississippi

...

but I don't know how to designate colors to the objects...

Thank you!

Ray

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] filtering number of values in a data frame

2009-06-22 Thread René Schönemann

This does exactly what is needed.

Thanks guys!

-Original Message-
From: markle...@verizon.net [mailto:markle...@verizon.net] 
Sent: Thursday, June 18, 2009 5:30 PM
To: rene.schoenem...@tu-berlin.de
Subject: Re: [R] filtering number of values in a data frame

do.call(rbind,lapply(unname(split(DF,list(DF$X1,DF$START),drop=TRUE)),
function(.df) {
data.frame(.df[1,1:4],count=nrow(.df))
}))

On Jun 18, 2009, René Schönemann rene.schoenem...@tu-berlin.de wrote: 

Dear list,

given is the following data frame df():

Number Place Start End
1 218024740787 HHO 5 263 2008-01-02 00:21:14 2008-01-03 15:25:16
2 218024740787 HHO 5 263 2008-01-02 00:21:14 2008-01-02 00:21:14
3 318039091794 HHO 5 263 2008-01-02 00:21:14 2008-01-02 13:22:54
4 318039091794 HHO 5 263 2008-01-02 00:21:14 2008-01-02 00:21:14
5 318039379900 HHO 1 104 2008-01-02 06:45:01 2008-01-02 09:15:23

Now, I want to count the number of equal values of column Start but I also
want the other columns to be preserved.

Using:

rle(as.character(df$Start)) - m
n - data.frame(m$values, m$lengths)

produces a list of items according to their frequency of the Start point:

m.values m.lengths
1 2008-01-02 00:21:14 4
2 2008-01-02 06:45:01 1

I want now also other columns to be in this new data frame. It should look
like that:

Number Place m.values m.lengths
1 218024740787 HHO 5 263 2008-01-02 00:21:14 4
2 318039379900 HHO 1 104 2008-01-02 06:45:01 1

Does anybody can help me with this?

Thanking you in advance!

René Schönemann

-- 
__

Technische Universität Berlin
Institut für Land- und Seeverkehr
Fachgebiet Schienenfahrwege und Bahnbetrieb 
Prof. Dr.-Ing. habil. Jürgen Siegmann

Post Sekretariat SG 18
Salzufer 17-19
D-10587 Berlin

Telefon +49 (0)30 314 - 23 314 

Internet http://www.railways.tu-berlin.de
__

Dipl.-Verk.wirtsch. René Schönemann
- Wissenschaftlicher Mitarbeiter -

Telefon +49 (0)30 314 - 22 710
Telefax +49 (0)30 314 - 25 530

E-Mail rschoenem...@railways.tu-berlin.de
__

Technische Universität Berlin
Körperschaft öffentlichen Rechts

Präsident Prof. Dr. Kurt Kutzler 

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Need help to optimize a piece of code involving zoo objects

2009-06-22 Thread Sergey Goriatchev

Hi, everybody

OK, I got it working with recursive. Don't know why this argument
slipped my mind, as I use filter() so often!
Now it is 44 times faster, which is good enough for me. :-)

Thank you, Gabor and Jim.

Best,
Sergey

On Fri, Jun 19, 2009 at 15:23, jim holtmanjholt...@gmail.com wrote:
 check out 'filter' to see if it does what you want with the 'recursive'
 option.

 On Fri, Jun 19, 2009 at 3:33 AM, Sergey Goriatchev serg...@gmail.com
 wrote:

 Hello, everyone

 I have a long script that uses zoo objects. In this script I used
 simple moving averages and these I can very efficiently calculate with
 filter() functions.
 Now, I have to use special exponential moving averages, and the only
 way I could write the code was with a for-loop, which makes everything
 extremely slow.
 I don't know how to optimize the code, but I need to find a solution.
 I hope someone can help me.

 The special moving average is calculated in the following way:

 EMA = ( K x ( C - P ) ) + P

 where,

 C = Current Value
 P = Previous periods EMA    (A SMA is used for the first period's
 calculation)
 K = Exponential smoothing constant

 K = 2 / ( 1 + Periods )

 Below is the code with the for-loop.

 -temp contains C
 -Periods is variable j in the for loop (so K varies)
 - I first produce a vector of simple equally weighted moving average,
 and use the first non-NA value to initiate the second for-loop

 x.Date - as.Date(2003-02-01) + seq(1,1100) - 1
 temp - zoo(rnorm(1100, 0, 10)+100, x.Date)

 start.time - proc.time()

 for(j in seq(5,100,by=5)){

        #PRODUCE FAST MOVING AVERAGE
        #Create equally weighted MA vector (we need only the first value)
        smafast - zoo(coredata(filter(coredata(temp[,1]), filter=rep(1/j,
 j), sides=1)), order.by=time(temp))

        #index of first non-NA value, which is the first SMA needed
        #which(is.na(smafast))[length(which(is.na(smafast)))]+1

        #Calculate decay factor K
        #number of periods is j
        K - 2/(1+j)

        #Calculate recursively the EMA for the fast index (starting with
 second non-NA value)
        for (k in
 (which(is.na(smafast))[length(which(is.na(smafast)))]+2):length(smafast))
 {
                smafast[k] -
 coredata(smafast[k-1])+K*(coredata(temp[k,1])-coredata(smafast[k-1]))
        }

        #PRODUCE SLOW MOVING AVERAGE
        #Create equally weighted MA vector (we need only the first value)
        smaslow - zoo(coredata(filter(coredata(temp[,1]),
 filter=rep(1/(j*4), (j*4)), sides=1)), order.by=time(temp))
        K - 2/(1+j*4)
 #Calculate EMA
        for (k in
 (which(is.na(smaslow))[length(which(is.na(smaslow)))]+2):length(smaslow))
 {
                smaslow[k] -
 coredata(smaslow[k-1])+K*(coredata(temp[k,1])-coredata(smaslow[k-1]))
        }

        #COMBINE DIFFERENCES OF FAST AND SLOW
        temp -         merge(temp, ma=smafast-smaslow)
 }

 proc.time()-start.time

 --
 I'm not young enough to know everything. /Oscar Wilde
 Experience is one thing you can't get for nothing. /Oscar Wilde
 When you are finished changing, you're finished. /Benjamin Franklin
 Tell me and I forget, teach me and I remember, involve me and I learn.
 /Benjamin Franklin
 Luck is where preparation meets opportunity. /George Patten

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.



 --
 Jim Holtman
 Cincinnati, OH
 +1 513 646 9390

 What is the problem that you are trying to solve?




-- 
I'm not young enough to know everything. /Oscar Wilde
Experience is one thing you can't get for nothing. /Oscar Wilde
When you are finished changing, you're finished. /Benjamin Franklin
Tell me and I forget, teach me and I remember, involve me and I learn.
/Benjamin Franklin
Luck is where preparation meets opportunity. /George Patten

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] png() resolution problem {was Silhouette ...}

2009-06-22 Thread Uwe Ligges




Martin Maechler wrote:

Hallo Sebastian,


SP == Sebastian Pölsterl s...@k-d-w.org
on Sun, 14 Jun 2009 14:04:52 +0200 writes:


SP Hello Martin,
SP I plotting the silhouette of a clustering and storing it as png. When I 
SP try to store the image as png the bars are missing. The bars are plotted 
SP when I use x11 or postscript as device. In addition, it seems to work 
SP when I use a smaller matrix (e.g. ruspini).


SP Would be great if you have look at this issue.

Hmm, I've been at a conference in Italy...
The silhouette plot only uses standard R plotting functions,
so any problem with it exposes problems in standard R
graphics.

-- Such a message should really go to R-help.
to which I CC now.

--

 library(cluster)
 nmat - matrix(rnorm(2500*300), ncol=300, nrow=2500)
 rmat - matrix(rchisq(1000, 300, 50), ncol=300, nrow=1000)
 mat - rbind(nmat, rmat)
 pr - pam(mat, 2)
 sil - silhouette(pr)
 png(sil.png)
 #postscript(sil.ps)
 plot(sil)
 dev.off()

--

Anyway, I can confirm the problem,
but of course, it has not much to do with the silhouette
function, but rather with the png() device which produces a
bitmap, and the lines you draw are too fine (in the bitmap
resolution) and so are rounded to invisible.

You can reproduce the problem much more simply:

set.seed(1); x - rlnorm(5000)

png(bar.png);barplot(x,col=gray,border=0,horiz=TRUE);dev.off()
system(eog bar.png  )

## which is also empty, and the completely analogue, replacing 
## png [bitmap] with  pdf [vector graphic]


pdf(bar.pdf);barplot(x,col=gray,border=0,horiz=TRUE);dev.off()
system(evince bar.pdf )

## gives a very nice plot, into which you can zoom and see all details.



Now in principle you should be able to use  png() with a much
higher resolution than the default one,
but replacing the above
png(bar.bng)
with
png(bar.bng, res = 1200)

did not help, as we now get the infamous
Error in plot.new() : figure margins too large

Other R-help readers will be able to make the png() example work
for such cases, where you need so many lines.
{but let's stick with barplot(*, border=0, *)}



Well, it's pretty hard, but one of the few cases where conversion from 
some vector image to the bitmap produces fair results, hence I'd use 
bitmap() with ghostscript to generate the png as in:


  bitmap(sil.png, type=pnggray, res=300)
  plot(sil)
  dev.off()

Best,
Uwe






Regards,
Martin Maechler, ETH Zurich

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Help using substitute and expression functions

2009-06-22 Thread Patrick Connolly

I'm stopped at a browser in a loop where the following objects look
like this:

Browse[1] jk
[1] 1
Browse[1] leg.ab[jk]
[1] Snails Rep1
Browse[1] top.k
[1] LT95=7.5; LT99=8.8

I can join them and a few other characters together like this easily
enough:

Browse[1] paste(jk, : , leg.ab[jk],  [, top.k, ], sep = )
[1] 1: Snails Rep1 [LT95=7.5; LT99=8.8]
Browse[1] 


Now, suppose that instead of a simple character string for top.k, I
had an expression like this:

Browse[1] leg.k.exp
expression(paste(LT[95] == 7.5, ; , LT[99] == 8.8))
Browse[1] 

which was created in a slightly complicated loop joining the 95 and 99
bits together.  That code is designed to have a variable number of
those bits.

I tried using substitute(), bquote and expression() to join the
leg.k.exp object in with the other characters, but it always falls
over trying to parse the : character string instead of using it just
as a string.  I can see in the help for plot.math lots of ways of
adding mathematical symbols and Greeek letters, but nothing for what I
wish to do.  There's a trick or two I'm missing.

Apologies for the lack of specifics of what I've tried.  That's on
another computer inaccessible from this one, and that one doesn't have
a usable email client.

TIA


-- 
~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.   
   ___Patrick Connolly   
 {~._.~}   Great minds discuss ideas
 _( Y )_ Average minds discuss events 
(:_~*~_:)  Small minds discuss people  
 (_)-(_)  . Eleanor Roosevelt
  
~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Help on creating a sequence of vectors

2009-06-22 Thread megh


I want to create a number of  vectors like :

vec1 - rnorm(1)
vec2 - rnorm(2)
vec3 - rnorm(3)

and so on...

Here I tried following :

for (i in 1:10) paste(vec, i, sep=) - rnorm(i)

However obviously that is not working. Here vectors I need to be seperated
i.e I do not want to create a list. How to modify above code?
-- 
View this message in context: 
http://www.nabble.com/Help-on-creating-a-sequence-of-vectors-tp24144347p24144347.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Help on creating a sequence of vectors

2009-06-22 Thread Nikos Alexandris


megh:
 I want to create a number of  vectors like :
 
 vec1 - rnorm(1)
 vec2 - rnorm(2)
 vec3 - rnorm(3)
 
 and so on...


Maybe try the assign() function. Something like:

for (i in 1:10) assign ( paste ( vec , i , sep =  ) , rnorm(i) )

Kind regards, Nikos

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Customize axis labels in xyplot

2009-06-22 Thread nmset


I have attached 2 files sample.csv and sample.r to illustrate the problem.

Thank you for considering.

-- 
View this message in context: 
http://www.nabble.com/Customize-axis-labels-in-xyplot-tp24126788p24143742.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] xyplot: subscripts, groups and subset

2009-06-22 Thread Auty, Dave

Hi,

 

I'm running the following code to produce lattice plots of microfibril
angle versus ring number in Scots pine. There are 12 trees and 5 sample
positions (Position) in each tree:

 

xyplot(MFA ~ RN | Tree, data = MFA.data, 

   groups = Position, subscripts=TRUE,

   auto.key=list(space = top, points = FALSE, lines = TRUE,
reverse.rows=TRUE, title=Disc Position (1=1.3m), cex=0.5,

   text=paste(Disc:, levels(MFA.data$Position))),

   xlab = Ring Number, ylab = MFA,

   panel = function(x, y, subscripts, groups) {

   panel.grid(h=-1, v= 2)

   panel.xyplot(x, y, subscripts=subscripts, groups=groups,
type=a) 

   panel.superpose(x=x, y=y, groups=Position,
subscripts=subscripts,lty=8, cex=0.25)



   

   })

 

 

But it is giving me the error message:

 

 })

Error: unexpected '}' in}

 

 

This code has worked for me previously but now will not output the
plots. Does anyone know what's going on here?

 

Thanks in advance,

 

Dave

 

 

Dave Auty MSc
PhD Student
Forest Research NRS
Roslin
Midlothian
EH25 9SY

Tel: +44(0)131 445 6936
Fax: +44(0)131 445 5124
Mob: +44(0)7896 685895
www.forestresearch.gov.uk/timberproperties  

 



+ The Forestry Commission's computer systems may be monitored and 
communications carried out on them recorded, to secure the effective operation 
of the system and for other lawful purposes. +

The original of this email was scanned for viruses by the Government Secure 
Intranet (GSi) virus scanning service supplied exclusively by Cable  Wireless 
in partnership with MessageLabs.

On leaving the GSi this email was certified virus-free
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] New line operator in mtext

2009-06-22 Thread Steve Murray


Dear R Users,

I'm finding that when I execute the following bit of code, that the new line 
argument is actually displayed as text in the graphics device. How do I avoid 
this happening?

mtext(side=2, line=5.5, expression(paste(Monthly Summed Runoff (mm/month), 
/n, and Summed Monthly Precipitation (mm x  ,10^2,/month


I suspect that I've done, or omitted, something fairly obvious, but as yet 
cannot see it!

Thanks for your help,

Steve

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] QQplots of probability vector data

2009-06-22 Thread bioinformatics_guy


I'm trying to determine if a set of data is normal from a qq plot but seem to
be having a bit of difficulty.

I have a file of the following form

  9 36
  3 37
  6 38
  7 39
  .

where the left column is the frequency of the number in the right column. 
I've found the probabilities of each number and put it in a file of the form

 36 .0009
 37 .0003
 38 .0006
 39 .0007

where the first column is the number and the second is the probability of
that number.

Now what I've done so far is as follows: 

null=read.table(probfile.txt,header=FALSE)#Prob file is the 2nd file
where $V1 is the number and $V2 isthe probability

and I want to do 
x=qnorm(null$V2, ... ) but I don't know how to get the mean and sd from
those 2 files.  When I looked up mean() and sd(), I had to give a vector of
numbers.  As I only have the number of occurances and their probability, I
can't really get a vector of this data unless I make another file that has
the frequency of the numbers written out -- which is intractable given I
have 1000 data points.

I mean I could write a quick program to get the mean (which I did in perl)
but I'd rather not do that for the sd as I am sure there is an easier way to
do this.

Either way, once I have the qnorm stored in x, all I have to do is qqnorm(x)
right? 
  
-- 
View this message in context: 
http://www.nabble.com/QQplots-of-probability-vector-data-tp24145321p24145321.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] New line operator in mtext

2009-06-22 Thread Uwe Ligges




Steve Murray wrote:

Dear R Users,

I'm finding that when I execute the following bit of code, that the new line 
argument is actually displayed as text in the graphics device. How do I avoid 
this happening?

mtext(side=2, line=5.5, expression(paste(Monthly Summed Runoff (mm/month), /n, and Summed 
Monthly Precipitation (mm x  ,10^2,/month



use \n rather than /n

Uwe Ligges




I suspect that I've done, or omitted, something fairly obvious, but as yet 
cannot see it!

Thanks for your help,

Steve

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Testing if all elements are equal in a vector/matrix

2009-06-22 Thread Petr PIKAL

Hi

jim holtman jholt...@gmail.com napsal dne 19.06.2009 15:06:55:

 I have wondered about this way of testing for equality:
  
  x - c(1,0,3,0)
  x[1] * length(x) == sum(x)
 [1] TRUE
  x - rep(1,4)
  x[1] * length(x) == sum(x)
 [1] TRUE
 This would seem to indicate that both vectors contain the same values, 
but not
 necessarily true.

My solution has some flaws however if user is be reasonably sure that such 
condition is improbable it could be used.

The problem was stated as this

  I just want to check whether all the elements of a vector are same. My
  vector has one million elements and it is highly likely that there are
  distinct elements in the first few itself. For example:
 
   x = c(1,2,rep(1,10))

and for that kind of vectors it could be quite safe. Although I would 
prefer something like this.

 fff2-function(x) length(unique(x))==1
 system.time(print(fff2(x)))
[1] FALSE
   user  system elapsed 
   0.390.080.47 

Regards
Petr

 On Fri, Jun 19, 2009 at 8:18 AM, Petr PIKAL petr.pi...@precheza.cz 
wrote:
 Hi
 
 utkarshsinghal utkarsh.sing...@global-analytics.com napsal dne
 17.06.2009 15:29:34:
 
  I will wait for the next version-2.9.1 and presently using Petr's
 suggestion, i.e.,
  (x[1]*length(x))==sum(x)
  which significantly reduced the run time.
 
  The problem is now there might be only small differences ,say, of the
 order of
  10^-10 which I want to ignore.
 
  So I used:
  isTRUE(all.equal((x[1]*length(x)),sum(x)))
  as suggested in the documentation of all.equal.
 
  But this again increased the run time to five times.
 
  1) Is there any faster way of doing the same?
 
 Maybe (not tested)
 
 (x[1]*length(x))==round(sum(x),10)
 
 Petr
 
  2) Will the function anyDuplicated treat almost equal values as
 duplicated
  or not? Actually I need both the options.
 
 
  Regards
  Utkarsh
 
 
 
  Prof Brian Ripley wrote:
  On Tue, 16 Jun 2009, Prof Brian Ripley wrote:
 
  On Tue, 16 Jun 2009, jim holtman wrote:
 
  I think the only way that you are going to get it to stop on the first
  mismatch is to write your own function in C if you are concerned about
 the
  time.  Matching on character vectors will be even more costly since it
 is
  having to loop to check the equality of each character in each 
element.
  This is one of the places it might pay to convert to factors and then
 the
  comparison only uses the integer values assigned to the factors.
 
  Not so in a recent R: comparison of character vectors is now done by
 comparing
  pointers in the first instance so (at least on a 32-bit platform) is 
as
 fast
  as comparing integers.  And on x86_64 Linux:
 
  x - as.character(c(1,2,rep(1,1000)))
  system.time(print(all(x[1] == x)))
  [1] FALSE
user  system elapsed
   0.123   0.019   0.142
 
  system.time(xx - as.factor(x))
user  system elapsed
   9.874   0.284  10.159
  system.time(print(all(xx[1] == xx)))
  [1] FALSE
user  system elapsed
   0.511   0.145   0.656
 
  Recent pre-release versions of R (e.g. 2.9.1 beta) allow
 
  system.time(anyDuplicated(x))
user  system elapsed
   0.034   0.078   0.113
  system.time(anyDuplicated(xx))
user  system elapsed
   0.037   0.076   0.113
 
  I'm sorry, a line got reverted here: I had edited this to say
 
  'which is a C-level speedup of the sort the original poster seemed to 
be
 looking for'
 
 
 
  On Tue, Jun 16, 2009 at 8:31 AM, utkarshsinghal 
  utkarsh.sing...@global-analytics.com wrote:
 
  Hi Jim,
 
  What you are saying is correct. Although, my computer might not have
 same
  speed and I am getting the following for 10M entries:
 
 user  system elapsed
0.559   0.038   0.607
 
  Moreover, in the case of character vectors, it gets more than double.
 
  In my modeling, which is already highly time consuming,  I need to do
 check
  this for few thousand vectors and the entries can easily be 10M in 
each
  vector. So I am just looking for any possibilities of time saving.  I 
am
 
  pretty sure that whenever elements are not all equal, it can be
 concluded
  from any few entries (most of the times). It will be worth if I can 
find
 a
  way which stops checking further the moment it find two distinct
 elements.
 
  Regards
  Utkarsh
 
 
 
  jim holtman wrote:
 
  Just check that the first (or any other element) is equal to all the
 rest:
 
  x = c(1,2,rep(1,1000)) # 10,000,000
  system.time(print(all(x[1] == x)))
  [1] FALSE
 user  system elapsed
 0.180.000.19
 
 
  This was for 10M entries.
 
  On Tue, Jun 16, 2009 at 7:42 AM, utkarshsinghal 
  utkarsh.sing...@global-analytics.com wrote:
 
 
  Hi All,
 
  There are several replies to the question below, but I think there 
must
  exist a  better way of doing so.
  I just want to check whether all the elements of a vector are same. My
  vector has one million elements and it is highly likely that there are
  distinct elements in the first few itself. For example:
 
   x = c(1,2,rep(1,10))
 
  I want the answer as FALSE, which

Re: [R] New line operator in mtext

2009-06-22 Thread Steve Murray


Thanks for the response, however, whilst this eliminates the 'new line' 
character from appearing, it doesn't actually cause a new line to be printed! 
Instead, I have the last few characters of the first character string 
overlapping with the first few characters of the next.

How best should I change the code to execute a new line?

Thanks again!

_

[[elided Hotmail spam]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Help on creating a sequence of vectors

2009-06-22 Thread jim holtman

Consider the use of a 'list' so you don't clutter up the global enviroment
with a lot of objects that you might forget about:

 vec - lapply(1:10, rnorm)
 vec
[[1]]
[1] -0.6264538
[[2]]
[1]  0.1836433 -0.8356286
[[3]]
[1]  1.5952808  0.3295078 -0.8204684
[[4]]
[1]  0.4874291  0.7383247  0.5757814 -0.3053884
[[5]]
[1]  1.5117812  0.3898432 -0.6212406 -2.2146999  1.1249309
[[6]]
[1] -0.04493361 -0.01619026  0.94383621  0.82122120  0.59390132  0.91897737
[[7]]
[1]  0.78213630  0.07456498 -1.98935170  0.61982575 -0.05612874 -0.15579551
-1.47075238
[[8]]
[1] -0.47815006  0.41794156  1.35867955 -0.10278773  0.38767161 -0.05380504
-1.37705956 -0.41499456
[[9]]
[1] -0.3942900 -0.0593134  1.1000254  0.7631757 -0.1645236 -0.2533617
0.6969634  0.5566632 -0.6887557
[[10]]
 [1] -0.7074952  0.3645820  0.7685329 -0.1123462  0.8811077  0.3981059
-0.6120264  0.3411197 -1.1293631  1.4330237


On Mon, Jun 22, 2009 at 5:38 AM, megh megh700...@yahoo.com wrote:


 I want to create a number of  vectors like :

 vec1 - rnorm(1)
 vec2 - rnorm(2)
 vec3 - rnorm(3)

 and so on...

 Here I tried following :

 for (i in 1:10) paste(vec, i, sep=) - rnorm(i)

 However obviously that is not working. Here vectors I need to be seperated
 i.e I do not want to create a list. How to modify above code?
 --
 View this message in context:
 http://www.nabble.com/Help-on-creating-a-sequence-of-vectors-tp24144347p24144347.html
 Sent from the R help mailing list archive at Nabble.com.

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.htmlhttp://www.r-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Roxygen vs Sweave for S4 documentation

2009-06-22 Thread Martin Maechler

 TobiasV == Tobias Verbeke tobias.verb...@openanalytics.be
 on Sun, 21 Jun 2009 08:25:07 +0200 writes:

TobiasV Hi Ken,
 I have been using R for a while.  Recently, I have begun converting my
 package into S4 classes.  I was previously using Rdoc for documentation. 
 Now, I am looking to use the best tool for S4 documentation.  It seems 
that
 the best choices for me are Roxygen and Sweave (I am fine with tex).
 
 Are there any users of Roxygen or Sweave who can comment on the 
strengths or
 weaknesses of one or othe other?  Thanks in advance.

TobiasV For the moment proper documentation of S4 classes (with a @slot 
tag 
TobiasV e.g.) is not implemented yet,

how did you define proper here?

I know that the result of  promptClass()  may not always be
perfect. As most things are not perfect,
I would not quickly call this improper ...

Or is using *.Rd  not proper for you,
as it does not have all of code + docs in one file?

Regards,
Martin

TobiasV but my secret hope is that this will
TobiasV be implemented before Peter and Manuel (in cc) will present Roxygen
TobiasV at DSC2009. Maybe they have further comments ?

TobiasV Kind regards,
TobiasV Tobias

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Roxygen vs Sweave for S4 documentation

2009-06-22 Thread Douglas Bates

On Mon, Jun 22, 2009 at 7:12 AM, Martin
Maechlermaech...@stat.math.ethz.ch wrote:
 TobiasV == Tobias Verbeke tobias.verb...@openanalytics.be
     on Sun, 21 Jun 2009 08:25:07 +0200 writes:

    TobiasV Hi Ken,
     I have been using R for a while.  Recently, I have begun converting my
     package into S4 classes.  I was previously using Rdoc for documentation.
     Now, I am looking to use the best tool for S4 documentation.  It seems 
 that
     the best choices for me are Roxygen and Sweave (I am fine with tex).
    
     Are there any users of Roxygen or Sweave who can comment on the 
 strengths or
     weaknesses of one or othe other?  Thanks in advance.

    TobiasV For the moment proper documentation of S4 classes (with a @slot 
 tag
    TobiasV e.g.) is not implemented yet,

 how did you define proper here?

 I know that the result of  promptClass()  may not always be
 perfect. As most things are not perfect,
 I would not quickly call this improper ...

 Or is using *.Rd  not proper for you,
 as it does not have all of code + docs in one file?

I think Tobias was referring to Roxygen support for S4 classes, not
the existence of .Rd tags.


 Regards,
 Martin

    TobiasV but my secret hope is that this will
    TobiasV be implemented before Peter and Manuel (in cc) will present 
 Roxygen
    TobiasV at DSC2009. Maybe they have further comments ?

    TobiasV Kind regards,
    TobiasV Tobias

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] What has happened to the R-Help Google Groups Archive? Alternative?

2009-06-22 Thread Tony Breyal

Greetings,

I usually read this mailing list through google groups
(http://groups.google.com/group/r-help-archive), but when I opened the
webpage this morning it said: The group named r-help-archive has been
removed because it violated Google's Terms Of Service.

Is there an alternative website which uses a similar structure to
google groups? I had a quick browse on the R Wiki
(http://wiki.r-project.org/rwiki/doku.php?id=links:links) but didn't
see a page with this sort of info.

Thank you kindly,
Tony Breyal

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] New line operator in mtext

2009-06-22 Thread Duncan Murdoch


On 6/22/2009 7:58 AM, Steve Murray wrote:

Thanks for the response, however, whilst this eliminates the 'new line' 
character from appearing, it doesn't actually cause a new line to be printed! 
Instead, I have the last few characters of the first character string 
overlapping with the first few characters of the next.

How best should I change the code to execute a new line?


This works for me.

mtext(side=2, line=5.5, expression(paste(Monthly Summed Runoff 
(mm/month)\nand Summed Monthly Precipitation (mm x  ,10^2,/month


Note that the newline is embedded in the string that you want to split, 
it's not a separate element of the label.


Duncan Murdoch

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] What has happened to the R-Help Google Groups Archive? Alternative?

2009-06-22 Thread Duncan Murdoch


On 6/22/2009 7:23 AM, Tony Breyal wrote:

Greetings,

I usually read this mailing list through google groups
(http://groups.google.com/group/r-help-archive), but when I opened the
webpage this morning it said: The group named r-help-archive has been
removed because it violated Google's Terms Of Service.

Is there an alternative website which uses a similar structure to
google groups? I had a quick browse on the R Wiki
(http://wiki.r-project.org/rwiki/doku.php?id=links:links) but didn't
see a page with this sort of info.


gmane.org has an archive at 
http://dir.gmane.org/gmane.comp.lang.r.general, but I don't know if it 
is similar enough for you.  You might want to contact Google to ask what 
terms of service were violated.


Duncan Murdoch



Thank you kindly,
Tony Breyal

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Convert NA to '.'

2009-06-22 Thread Jorge Ivan Velez

Dear David,
Try this:

x - c(A, B, C, NA)
x[is.na(x)] - .
x

HTH,

Jorge

On Mon, Jun 22, 2009 at 2:11 AM, David_D david.da...@gmail.com wrote:


 Dear R-users,

 For reporting purpose (using Sweave and LaTeX), I am creating complex
 tables
 with the cat function such as

  x-c(A, B, C, NA)
  cat(x, '\n')
 A B C NA

 For convenience, I would like to change all my NA value to something else
 like '.' (as in SAS for example). Is there a global option which allows
 this
 change? Or should I change all my code to work with the print function and
 the na.print argument?

 Best regards,
 David
 --
 View this message in context:
 http://www.nabble.com/Convert-NA-to-%27.%27-tp24142187p24142187.html
 Sent from the R help mailing list archive at Nabble.com.

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] How to make try to catch warnings in logistic glm

2009-06-22 Thread Fredrik Nilsson

Dear list,

From an earlier post I got the impression that one could promote
warnings from a glm to errors (presumably by putting
options(warn=1)?), then try() would flag them as errors. I’ve spent
half the day trying to do this, but no luck. Do you have an explicit
solution?

My problems is that I am trying to figure out during what conditions
one may find 5 significant parameters in a logistic regression with
only 21 bad and 28 outcomes (this appears in a published article) and
I would like to have  my.glm-try(glm(outcome~ald + fat+ hstop + btun
+ time, data=DF, family=binomial)) to indicate if warnings
(convergence / probabilities equal to 0/1) occurs (see attempt to code
below if my explanation is terse).

Best regards,

Fredrik Nilsson

--
Fredrik Nilsson, PhD
Competence Centre for Clinical Research
Lund University Hospital


Warnings I'd like to catch:
Warning in glm.fit(x = X, y = Y, weights = weights, start = start,
etastart = etastart,  :
 algorithm did not converge
Warning in glm.fit(x = X, y = Y, weights = weights, start = start,
etastart = etastart,  :
 fitted probabilities numerically 0 or 1 occurred

script
ngood-28
nbad-21
resmat-DFr
j-0
nsim-100
res-matrix(NA, nrow=nsim, ncol=6)
ow - options(warn)
options(warn= 1)
outcome-rep(c(G,B), times=c(ngood, nbad))
outcome-as.factor(outcome)
agood-rep(c(0,1), times=c(27,1))
abad-rep(c(0,1), times=c(14,7))
thgood-rep(c(0,1), times=c(24,4))
thbad-rep(c(0,1), times=c(12,9))

for (i in 1:nsim)
{
 tgood-sample(agood, ngood)
 tbad-sample(abad, nbad)
 hstop-c(tgood, tbad)
 hstop-as.factor(hstop)
 aldgood-rnorm(ngood, mean=54, sd=8/0.675)
 aldbad-rnorm(nbad, mean=64, sd=8/0.675)
 ald-c(aldgood, aldbad)
 fatgood-exp(rnorm(ngood, mean=log(23.9)-0.06^2/2,sd=0.06))
 fatbad-exp(rnorm(nbad, mean=log(27.4)-0.09^2/2, sd=0.09))
 fat-c(fatgood, fatbad)
 thgood-sample(thgood, ngood)
 thbad-sample(thbad, nbad)
 btun-c(thgood, thbad)
 btun-as.factor(btun)
 timegood-exp(rnorm(ngood, mean=log(443)-0.5^2/2, sd=0.5))
 timebad- exp(rnorm(nbad, mean=log(555)-0.6^2/2, sd=0.6))
 time-c(timegood, timebad)

 DF-data.frame(outcome, ald, fat, hstop, btun, time)
 my.glm-try(glm(outcome~ald + fat+ hstop + btun + time, data=DF,
family=binomial))
 test-!is.null(attr(my.glm, try-error))
 if(test | !my.glm$converged)
 {
   res[i, 1:6]-NA
 } else
 {
   W-summary(my.glm)$coefficients
   lres-as.numeric(W[,4]=0.05)
   if (sum(lres)==5)
   {
     j-j+1
     assign(paste(resmat, j, sep=., collapse=), DF)
   }
   res[i,1:6]-lres
 }
}

options(ow)



From: Dieter Menne dieter.menne_at_menne-biomed.de
Date: Fri, 14 Dec 2007 20:49:20 + (UTC) Caroline Paulsen cpaulsen
at u.washington.edu writes:

 I'm attempting to run 250 permutations of a negative binomial GLM
 model for data on fish counts. Many of the models are fit
 appropriately, but some issue warnings such as convergence not
 reached or step size truncated due to divergence.

 I'd like to figure out a way to flag the models that have warnings and
 output them into either a table or into R console. Then I could
 evaluate the problems associated with these models and know not choose
 them as the best models for fitting the data.
You could promote the options to errors (?warning) and use try() Dieter

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] xyplot: subscripts, groups and subset

2009-06-22 Thread Dieter Menne

Auty, Dave dave.auty at forestry.gsi.gov.uk writes:

 
 I'm running the following code to produce lattice plots of microfibril
 angle versus ring number in Scots pine. There are 12 trees and 5 sample
 positions (Position) in each tree:
 
 xyplot(MFA ~ RN | Tree, data = MFA.data, 
 
groups = Position, subscripts=TRUE,

 subscripts=subscripts,lty=8, cex=0.25)
 
})
 
 But it is giving me the error message:
 
  })
 
 Error: unexpected '}' in}

The code syntactically works for me, but you have posted it out of
context, so problem might be BEFORE the posted sample. It is also possible
that some unprinting code might have interfered, which is difficult to find.

So if there is really no problem above the xyplot(), try to explicitly 
retype the  whole function call.

Dieter

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Roxygen vs Sweave for S4 documentation

2009-06-22 Thread Tobias Verbeke

On Mon, Jun 22, 2009 at 2:18 PM, Douglas Batesba...@stat.wisc.edu wrote:
 On Mon, Jun 22, 2009 at 7:12 AM, Martin
 Maechlermaech...@stat.math.ethz.ch wrote:
 TobiasV == Tobias Verbeke tobias.verb...@openanalytics.be
     on Sun, 21 Jun 2009 08:25:07 +0200 writes:

    TobiasV Hi Ken,
     I have been using R for a while.  Recently, I have begun converting my
     package into S4 classes.  I was previously using Rdoc for 
 documentation.
     Now, I am looking to use the best tool for S4 documentation.  It seems 
 that
     the best choices for me are Roxygen and Sweave (I am fine with tex).
    
     Are there any users of Roxygen or Sweave who can comment on the 
 strengths or
     weaknesses of one or othe other?  Thanks in advance.

    TobiasV For the moment proper documentation of S4 classes (with a @slot 
 tag
    TobiasV e.g.) is not implemented yet,

 how did you define proper here?

 I know that the result of  promptClass()  may not always be
 perfect. As most things are not perfect,
 I would not quickly call this improper ...

 Or is using *.Rd  not proper for you,
 as it does not have all of code + docs in one file?

 I think Tobias was referring to Roxygen support for S4 classes, not
 the existence of .Rd tags.

Yes, I was just referring to the Roxygen support and as a matter of
fact I recommended Roxygen to mimick the nice job done by promptClass
a while ago

https://lists.r-forge.r-project.org/pipermail/roxygen-devel/2009-February/23.html

No intent to criticize nor current Rd tags nor Rd as a documentation system,
although having code + docs in one file (as in Roxygen) seems to be convenient
to some.

Best,
Tobias

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] How to make try to catch warnings in logistic glm

2009-06-22 Thread Dieter Menne




Fredrik Nilsson-5 wrote:
 
From an earlier post I got the impression that one could promote
 warnings from a glm to errors (presumably by putting
 options(warn=1)?), then try() would flag them as errors. I’ve spent
 half the day trying to do this, but no luck. Do you have an explicit
 solution?
 
 

You were rather close... 

1) use warn=2
2) make sure to get the | and || right (this is nasty, I never learn it)

Dieter


 options(warn=2) #
 my.glm-try(glm(outcome~ald + fat+ hstop + btun + time, data=DF,
family=binomial),FALSE)
 failed - inherits(my.glm,try-error)
 if(failed || !my.glm$converged)
 {
   res[i, 1:6]-NA
 } else
 {


-- 
View this message in context: 
http://www.nabble.com/How-to-make-try-to-catch-warnings-in-logistic-glm-tp24146726p24147228.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Lattice group colors?

2009-06-22 Thread Fredrik Karlsson

Dear list,

I have been struggling to find how I would go about changing the
bakground colors of groups in a lattice barchart in a way so that the
auto.key generated also does the right thing and pick it up for the
key.
I have used the col argument (which I guess is sent to par()) in a
way so that the colors are like I would want them, but now I guess I
need to know which part part of the theme I need to change in order to
make this change permanent for all the plots, and all the keys.

I am of course thankful for all the help I can get.

(Also, how does one know these things about the theme variables? Is
there some documentation somewhere?)

/Fredrik

-- 
Life is like a trumpet - if you don't put anything into it, you don't
get anything out of it.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Lattice group colors?

2009-06-22 Thread Sundar Dorai-Raj

Look at show.settings() and str(trellis.par.get()). This will show you
what the default settings are. The group colors are set by the
superpose.* elements (e.g. superpose.line is for group lines). To set
them, I usually create a list and pass it to par.settings. For
example,

my.theme - list(superpose.line = list(col = c(darkred, darkblue),
lty = 1:2))
xyplot(y ~ x, data = mydata, groups = g, par.settings = my.theme,
  auto.key = list(points = FALSE, lines = TRUE))

HTH,

--sundar

On Mon, Jun 22, 2009 at 6:30 AM, Fredrik Karlssondargo...@gmail.com wrote:
 Dear list,

 I have been struggling to find how I would go about changing the
 bakground colors of groups in a lattice barchart in a way so that the
 auto.key generated also does the right thing and pick it up for the
 key.
 I have used the col argument (which I guess is sent to par()) in a
 way so that the colors are like I would want them, but now I guess I
 need to know which part part of the theme I need to change in order to
 make this change permanent for all the plots, and all the keys.

 I am of course thankful for all the help I can get.

 (Also, how does one know these things about the theme variables? Is
 there some documentation somewhere?)

 /Fredrik

 --
 Life is like a trumpet - if you don't put anything into it, you don't
 get anything out of it.

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] The gradient of a multivariate normal density with respect to its parameters

2009-06-22 Thread Karl Ove Hufthammer

Does anybody know of a function that implements the derivative (gradient) of 
the multivariate normal density with respect to the *parameters*?

It’s easy enough to implement myself, but I’d like to avoid reinventing the 
wheel (with some bugs) if possible. Here’s a simple example of the result 
I’d like, using numerical differentiation:

library(mvtnorm)
library(numDeriv)
f=function(pars, xx, yy)
{
  mu=pars[1:2]
  sig1=pars[3]
  sig2=pars[4]
  rho=pars[5]
  sig=matrix(c(sig1^2,rho*sig1*sig2,rho*sig1*sig2,sig2^2),2)
  dmvnorm(cbind(x,y),mu,sig)
}

mu1=1
mu2=2
sig1=3
sig2=4
rho=.5
x=2 # or a x vector
y=3 # or a y vector
jacobian(f,c(mu1,mu2,sig1,sig2,rho),xx=x,yy=y)
# (Can replace ‘jacobian’ with ‘grad’ if x and y have length 1.)

-- 
Karl Ove Hufthammer

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Eigen value calculation

2009-06-22 Thread RON70


Hi all, Eigen vectors obtained from the function eigen() are ortho-normal? I
see the documentation however there is no formal mention on that. If no,
then is there any direct function to do the same?
-- 
View this message in context: 
http://www.nabble.com/Eigen-value-calculation-tp24147807p24147807.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] problem with checking wether file is present or not

2009-06-22 Thread Don MacQueen


What error is it giving? Please include the exact error.

What happens if you do this:

  if (file.exists(findings)) cat('File',findings,'exists\n') else 
cat('File',findings,not found\n')


Your description suggests that you are using 'if' expression inside a 
loop. If that is the case, try


  if ( !file.exists(findings) ) next

inside your loop, and near the beginning of the loop.
This should immediately skip to the next file, if the file is missing 
(and if your loop is written they way I assume it is)


-Don

At 11:51 AM +0530 6/22/09, venkata kirankumar wrote:

Hi all,
I have a problem with checking File is present in the directory or not
like
I have a sequence of files in one folder I have to take each file in order
and have to caliculate on those files data but in order some files are
missing for that I have to check and load those files for that I am using

condition like

if(file.exists(findings)==TRUE){}



its giving results for files which are present in that folder
but for files which are not present in that folder its giving error
but it have to skip and run for remaining files

can any one suggest any other way to make it work
for all those files

thanks in advance
kiran

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https:// stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http:// www. R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



--
--
Don MacQueen
Environmental Protection Department
Lawrence Livermore National Laboratory
Livermore, CA, USA
925-423-1062

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] New line operator in mtext

2009-06-22 Thread Steve Murray


Thanks again for a very useful comment. That seems to have separated the text 
and put it onto separate lines.

However, whilst this results in the text being centralised in relation to the 
axis, it means that the lower line is left-justified in relation to the upper 
line, rather than being centralised.

How do I go about centralising the lower line in relation to the upper line, 
whilst keeping it central to the axis?

Many thanks again,

Steve

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] The gradient of a multivariate normal density with respect to its parameters

2009-06-22 Thread Ravi Varadhan

Hi Karl,

I am not aware of any.  May I ask what your purpose is?  You don't really
need this if you are going to use it in optimization, since most optimizers
use a simple finite-difference approximation if you don't provide the
gradient.  Using the numerical approximation from numDeriv will be quite
time-consuming in an optimization routine, since numDeriv uses a high-order
Richardosn extrapolation to compute an accurate approximation of the
gradient.

Regardless of your purpose, there is a small bug in your function.  You
should change   `dmvnorm(cbind(x,y),mu,sig)'  to
`dmvnorm(cbind(xx,yy),mu,sig)'.  Also, taking the derivative of log-density
might be better.

Ravi.



---

Ravi Varadhan, Ph.D.

Assistant Professor, The Center on Aging and Health

Division of Geriatric Medicine and Gerontology 

Johns Hopkins University

Ph: (410) 502-2619

Fax: (410) 614-9625

Email: rvarad...@jhmi.edu

Webpage:
http://www.jhsph.edu/agingandhealth/People/Faculty_personal_pages/Varadhan.h
tml

 





-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On
Behalf Of Karl Ove Hufthammer
Sent: Monday, June 22, 2009 10:10 AM
To: r-h...@stat.math.ethz.ch
Subject: [R] The gradient of a multivariate normal density with respect to
its parameters

Does anybody know of a function that implements the derivative (gradient) of
the multivariate normal density with respect to the *parameters*?

It's easy enough to implement myself, but I'd like to avoid reinventing the
wheel (with some bugs) if possible. Here's a simple example of the result
I'd like, using numerical differentiation:

library(mvtnorm)
library(numDeriv)
f=function(pars, xx, yy)
{
  mu=pars[1:2]
  sig1=pars[3]
  sig2=pars[4]
  rho=pars[5]
  sig=matrix(c(sig1^2,rho*sig1*sig2,rho*sig1*sig2,sig2^2),2)
  dmvnorm(cbind(x,y),mu,sig)
}

mu1=1
mu2=2
sig1=3
sig2=4
rho=.5
x=2 # or a x vector
y=3 # or a y vector
jacobian(f,c(mu1,mu2,sig1,sig2,rho),xx=x,yy=y)
# (Can replace 'jacobian' with 'grad' if x and y have length 1.)

--
Karl Ove Hufthammer

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Error when using step

2009-06-22 Thread David Winsemius



On Jun 22, 2009, at 4:08 AM, Dieter Menne wrote:


Chris Friedl cfriedalek at gmail.com writes:

I have two questions about the built-in function step. Ultimately I  
want to
apply a lm fitting and subsequent step procedure to thousands of  
data sets

groups by a factor defined as a unique ID.

Q1. The code below creates a data.frame comprising three marginally  
noisy
surfaces. The code below fails when I use step in a function but  
summary
seems to show the model fits are legitimate. Any ideas on what I'm  
doing

wrong?


... Well designed example code omitted


function(x){lm(model, data = x)})
lapply(fits, function(x){summary(x)})

# FAIL
lapply(fits, function(x){step(x)})



Error in as.data.frame.default(data) :
 cannot coerce class lm into a data.frame


Looks like an environment problem. I could not find a workaround  
quickly,


Perhaps:

by(data2, data2$grp, function(x) step(lm(model, data=x)))



but you might have a look at

http://finzi.psych.upenn.edu/R/Rhelp02a/archive/16599.html

We call it Ripley's Game here, because variants of it can help you  
quite often.


Dieter




David Winsemius, MD
Heritage Laboratories
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] New line operator in mtext

2009-06-22 Thread Duncan Murdoch


On 6/22/2009 10:30 AM, Steve Murray wrote:

Thanks again for a very useful comment. That seems to have separated the text 
and put it onto separate lines.

However, whilst this results in the text being centralised in relation to the 
axis, it means that the lower line is left-justified in relation to the upper 
line, rather than being centralised.

How do I go about centralising the lower line in relation to the upper line, 
whilst keeping it central to the axis?


I think two calls to mtext, one for each line, should do it.

Duncan Murdoch

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] GAM function with interaction

2009-06-22 Thread Paul Simonin


Hello R Users,
I have a question regarding fitting a model with GAM{mgcv}. I have data 
from several predictor (X) variables I wish to use to develop a model to 
predict one Y variable. I am working with ecological data, so have data 
collected many times (about 20) over the course of two years. Plotting 
data independently for each date there appears to be relationships 
between Y (fish density) and at least several X variables (temperature 
and light). However, the actual value of X variables (e.g., temperature) 
changes with date/season. In other words, fish distribution is likely 
related to temperature, but available temperatures change through the 
season. Thus, when data from all dates are combined to create a model 
from the entire dataset, I think I need to include some type of 
metric/variable/interaction term to account for this date relationship. 
I have written the following code using a by term:


Distribution.s.temp.logwm2.deltaT-gam(yoyras~s(temp,by=datecode)+s(logwm2,by=datecode)+s(DeltaT,by=datecode),data=AllData) 



However, I am not convinced this is the correct way to account for this 
relationship. What do you think? Is there another way to include this in 
my model? Maybe I should simply include date (datecode) as another 
term in the model?


I also believe there may be an interaction between temperature and 
light (logwm2), and based on what I have read the by method may be the 
best way to include this. Correct?


Thank you for any input, tips, or advice you may be able to offer. I am 
new to R, so especially grateful!


Thanks again,
Paul Simonin
(PhD student)

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] what's the R command to make the following?

2009-06-22 Thread Jack Luo

Hi,

I have a simple question, suppose I have the date 05/16/2008, what would
be the command to get the month, day and year?

Thanks,

-Jack

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] what's the R command to make the following?

2009-06-22 Thread jim holtman

Couple of choices depending on what you want to do with the data:

 x - as.POSIXct(05/16/2008, format=%m/%d/%Y)  # if you want to use the
date
 format(x, %d) # day
[1] 16
 format(x, %m) # month
[1] 05
 format(x, %Y) # year
[1] 2008
 # or
 y - strsplit(05/16/2008, '/')  # to just get the characters
 y
[[1]]
[1] 05   16   2008



On Mon, Jun 22, 2009 at 11:36 AM, Jack Luo jluo.rh...@gmail.com wrote:

 Hi,

 I have a simple question, suppose I have the date 05/16/2008, what would
 be the command to get the month, day and year?

 Thanks,

 -Jack

[[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.htmlhttp://www.r-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] standard error and p-value for the estimated parameter in AR model

2009-06-22 Thread FMH

Dear All,

I used an  AR(1) model to explain the process of the stationary residual and 
have used an 'ar' command in R. From the results, i tried to extract 
the standard error and p-value for the estimated parameter, but unfortunately, 
i never find any way to extract  it from the output. 

What i did was, i assigned the residuals into the 'residual' object in R and 
used an 'ar' function as below. 

 residual - residuals
 ar(residual, aic = TRUE,  method = mle, order.max = 1) 

Could someone help me to extract the stadard error and the p-value for the 
estimated parameter, please?

Thank you

Fir


  
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Fwd: question about using _apply and/or aggregate functions

2009-06-22 Thread Clifford Long

Resending, as am not sure about the original To: address.  Sorry for
any redundancy.
- Cliff

-- Forwarded message --
From: Clifford Long gnolff...@gmail.com
Date: Mon, Jun 22, 2009 at 11:04 AM
Subject: question about using _apply and/or aggregate functions
To: r-h...@lists.r-project.org

Hi R-list,

I'll apologize in advance for (1) the wordiness of my note (not sure
how to avoid it) and (2) any deficiencies on my part that lead to my
difficulties.

I have an application with several stages that is meant to simulate
and explore different scenarios with respect to product sales (in
units sold per month).  My session info is at the bottom of this note.

The steps include (1) an initial linear regression model, (2) building
an ARIMA model, (3) creating 12 months of simulated 'real' data - for
various values of projected changes in slope from the linear
regression - from the ARIMA model using arima.sim with subsequent
undifferencing and appending to historical data, (3) one-step-ahead
forecasting for 12 months using the 'fixed' arima model and simulated
data, (4) taking the residuals from the forecasting (simulated minus
forecast for each of the 12 months) and applying QC charting, and (5)
counting the number (if any) of runs-rules violations detected.

The simulation-aspect calculates 100 simulations for each of 50 values of slope.

All of this seems to work fine (although I'm sure that the coding
could be improved through functions, vectorization, etc. ... ).
However, the problem I'm having is at the end where I had hoped that
things would be easier.  I want to summarize and graph the probability
of detecting a runs-rule violation vs. the amount of the shift in
slope (of logunit sales).

The output data array passed from the qcc section at the end includes:
 - the adjustment made to the slope (a multiplier)
 - the actual value of the slope
 - the iteration number of the simulation loop (within each value of slope)
 - the count of QC charting limits violations
 - the count of QC charting runs rules violations

My code is in the attached file (generic_code.R) and my initial sales
data needed to prime the simulation is in the other attached file
(generic_data.csv).  The relevant section of my code is at the
bottom of the .R file after the end of the outer loop.  I've tried to
provide meaningful comments.

I've read the help files for _apply, aggregate, arrays and data types,
and have also consulted with several texts (Maindonald and Braun;
Spector; Venebles and Ripley for S-plus).  Somehow I still don't get
it.  My attempts usually result in a message like the following:

 agg.result = aggregate(simarray.part[,3], by=list4, FUN = mean)
Error in FUN(X[[1L]], ...) : arguments must have same length

But when I check the length of the arguments, they appear to match. (??)

 length(simarray.part[,3])
[1] 5000
 length(simarray.part[,4])
[1] 5000
 length(list4)
[1] 5000

I would have rather passed along a subset of the simulation/loop
output dataset, but was unsure how to save it to preserve whatever
properties that I may have missed that are causing my difficulties.
If anyone still wants to help at this point, I believe that you'll
need to load the .csv data and run the simulation (maybe reducing the
number of iterations).

Many thanks to anyone who can shed some light on my difficulties
(whether self-induced or otherwise).

Cliff

I'm using a pre-compiled binary version of R for Windows.

Session info:

 sessionInfo()
R version 2.9.0 (2009-04-17)
i386-pc-mingw32

locale:
LC_COLLATE=English_United States.1252;LC_CTYPE=English_United
States.1252;LC_MONETARY=English_United
States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] qcc_1.3         forecast_1.24   tseries_0.10-18 zoo_1.5-5
[5] quadprog_1.4-11

loaded via a namespace (and not attached):
[1] grid_2.9.0      lattice_0.17-22

 Sys.getlocale()
[1] LC_COLLATE=English_United States.1252;LC_CTYPE=English_United
States.1252;LC_MONETARY=English_United
States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] question about using _apply and/or aggregate functions

2009-06-22 Thread David Winsemius



On Jun 22, 2009, at 12:04 PM, Clifford Long wrote:


Hi R-list,

I'll apologize in advance for (1) the wordiness of my note (not sure
how to avoid it) and (2) any deficiencies on my part that lead to my
difficulties.

I have an application with several stages that is meant to simulate
and explore different scenarios with respect to product sales (in
units sold per month).  My session info is at the bottom of this note.

The steps include (1) an initial linear regression model, (2) building
an ARIMA model, (3) creating 12 months of simulated 'real' data - for
various values of projected changes in slope from the linear
regression - from the ARIMA model using arima.sim with subsequent
undifferencing and appending to historical data, (3) one-step-ahead
forecasting for 12 months using the 'fixed' arima model and simulated
data, (4) taking the residuals from the forecasting (simulated minus
forecast for each of the 12 months) and applying QC charting, and (5)
counting the number (if any) of runs-rules violations detected.

The simulation-aspect calculates 100 simulations for each of 50  
values of slope.


All of this seems to work fine (although I'm sure that the coding
could be improved through functions, vectorization, etc. ... ).
However, the problem I'm having is at the end where I had hoped that
things would be easier.  I want to summarize and graph the probability
of detecting a runs-rule violation vs. the amount of the shift in
slope (of logunit sales).

The output data array passed from the qcc section at the end includes:
 - the adjustment made to the slope (a multiplier)
 - the actual value of the slope
 - the iteration number of the simulation loop (within each value of  
slope)

 - the count of QC charting limits violations
 - the count of QC charting runs rules violations


My code is in the attached file (generic_code.R) and my initial sales
data needed to prime the simulation is in the other attached file
(generic_data.csv).  The relevant section of my code is at the
bottom of the .R file after the end of the outer loop.  I've tried to
provide meaningful comments.

I've read the help files for _apply, aggregate, arrays and data types,
and have also consulted with several texts (Maindonald and Braun;
Spector; Venebles and Ripley for S-plus).  Somehow I still don't get
it.  My attempts usually result in a message like the following:


agg.result = aggregate(simarray.part[,3], by=list4, FUN = mean)

Error in FUN(X[[1L]], ...) : arguments must have same length


I cannot comment on the overall strategy, but wondered if this minor  
mod to the code might succeed;



agg.result = aggregate(simarray.part[,3], by=list(list4), FUN = mean)


My personal experience with aggregate() is not positive. I generally  
end up turning to tapply()  (which is at the heart of aggregate()  
anyway) probably because I forget to wrap the second argument in a  
list. Slow learner, I guess.





But when I check the length of the arguments, they appear to match.  
(??)



length(simarray.part[,3])

[1] 5000

length(simarray.part[,4])

[1] 5000

length(list4)

[1] 5000


I would have rather passed along a subset of the simulation/loop
output dataset, but was unsure how to save it to preserve whatever
properties that I may have missed that are causing my difficulties.
If anyone still wants to help at this point, I believe that you'll
need to load the .csv data and run the simulation (maybe reducing the
number of iterations).

Many thanks to anyone who can shed some light on my difficulties
(whether self-induced or otherwise).

Cliff



I'm using a pre-compiled binary version of R for Windows.

Session info:


sessionInfo()

R version 2.9.0 (2009-04-17)
i386-pc-mingw32

locale:
LC_COLLATE=English_United States.1252;LC_CTYPE=English_United
States.1252;LC_MONETARY=English_United
States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base

other attached packages:
[1] qcc_1.3 forecast_1.24   tseries_0.10-18 zoo_1.5-5
[5] quadprog_1.4-11

loaded via a namespace (and not attached):
[1] grid_2.9.0  lattice_0.17-22



Sys.getlocale()

[1] LC_COLLATE=English_United States.1252;LC_CTYPE=English_United
States.1252;LC_MONETARY=English_United
States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


David Winsemius, MD
Heritage Laboratories
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Roxygen vs Sweave for S4 documentation

2009-06-22 Thread Duncan Murdoch


On 6/22/2009 9:23 AM, Tobias Verbeke wrote:

On Mon, Jun 22, 2009 at 2:18 PM, Douglas Batesba...@stat.wisc.edu wrote:

On Mon, Jun 22, 2009 at 7:12 AM, Martin
Maechlermaech...@stat.math.ethz.ch wrote:

TobiasV == Tobias Verbeke tobias.verb...@openanalytics.be
on Sun, 21 Jun 2009 08:25:07 +0200 writes:


   TobiasV Hi Ken,
I have been using R for a while.  Recently, I have begun converting my
package into S4 classes.  I was previously using Rdoc for documentation.
Now, I am looking to use the best tool for S4 documentation.  It seems 
that
the best choices for me are Roxygen and Sweave (I am fine with tex).
   
Are there any users of Roxygen or Sweave who can comment on the strengths 
or
weaknesses of one or othe other?  Thanks in advance.

   TobiasV For the moment proper documentation of S4 classes (with a @slot tag
   TobiasV e.g.) is not implemented yet,

how did you define proper here?

I know that the result of  promptClass()  may not always be
perfect. As most things are not perfect,
I would not quickly call this improper ...

Or is using *.Rd  not proper for you,
as it does not have all of code + docs in one file?


I think Tobias was referring to Roxygen support for S4 classes, not
the existence of .Rd tags.


Yes, I was just referring to the Roxygen support and as a matter of
fact I recommended Roxygen to mimick the nice job done by promptClass
a while ago

https://lists.r-forge.r-project.org/pipermail/roxygen-devel/2009-February/23.html

No intent to criticize nor current Rd tags nor Rd as a documentation system,
although having code + docs in one file (as in Roxygen) seems to be convenient
to some.


Hopefully the Rd changes coming in R 2.10.0 will enhance either way of 
generating an Rd file.  There are some things (e.g. all the methods for 
a generic, or all the generics with methods for a class, all the classes 
extending a given one) that aren't known at the time you run promptClass 
or Roxygen, but which are known at display time; the new system should 
allow those to be computed just before displaying the page.


Duncan Murdoch

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Convert NA to '.'

2009-06-22 Thread David_D


Jorge,

Thanks a lot for your reply. It's indeed working in this case. However, with
data frames mixing numeric and character is not working anymore. Morever I
am working with many variables and I don't want to modify them. What I would
really appreciate is a global option (in the Rprofile?) that allow to change
the display nd printing of NA by any symbol?

Does it exist?

Best regards,
David


Jorge Ivan Velez wrote:
 
 Dear David,
 Try this:
 
 x - c(A, B, C, NA)
 x[is.na(x)] - .
 x
 
 HTH,
 
 Jorge
 
 On Mon, Jun 22, 2009 at 2:11 AM, David_D david.da...@gmail.com wrote:
 

 Dear R-users,

 For reporting purpose (using Sweave and LaTeX), I am creating complex
 tables
 with the cat function such as

  x-c(A, B, C, NA)
  cat(x, '\n')
 A B C NA

 For convenience, I would like to change all my NA value to something else
 like '.' (as in SAS for example). Is there a global option which allows
 this
 change? Or should I change all my code to work with the print function
 and
 the na.print argument?

 Best regards,
 David
 --
 View this message in context:
 http://www.nabble.com/Convert-NA-to-%27.%27-tp24142187p24142187.html
 Sent from the R help mailing list archive at Nabble.com.

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

 
   [[alternative HTML version deleted]]
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
 
 

-- 
View this message in context: 
http://www.nabble.com/Convert-NA-to-%27.%27-tp24142187p24147157.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] The gradient of a multivariate normal density with respect to its parameters

2009-06-22 Thread Karl Ove Hufthammer


Ravi Varadhan skreiv:

I am not aware of any.  May I ask what your purpose is?  You don't really
need this if you are going to use it in optimization, since most optimizers
use a simple finite-difference approximation if you don't provide the
gradient.  Using the numerical approximation from numDeriv will be quite
time-consuming in an optimization routine, since numDeriv uses a high-order
Richardosn extrapolation to compute an accurate approximation of the
gradient.
  


No, I don’t use it in an optimisation. The expression is part of a more 
complicated formula used for calculating some estimates in a special 
nonparametric model.


I won’t use the numerical approximation; the alternative would be to 
calculate the analytical expressions myself. It’s not too difficult, but 
tedious, and the expressions I end up with may not be the fastest or 
most numerically accurate, so if there was a package implementing them 
in a good way, it would be nice. :)



Regardless of your purpose, there is a small bug in your function.  You
should change   `dmvnorm(cbind(x,y),mu,sig)'  to
`dmvnorm(cbind(xx,yy),mu,sig)'.


Yes, of course. I originally used x and y when creating the example, but 
then discovered that the jacobian() function already used x as an 
argument for something else, so I renamed them to xx and yy (though 
obviously not everywhere!). I really should have tested it in a 
completely clean environment before posting.


--
Karl Ove Hufthammer

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Help needed: Fraction for Histogram 1 ???

2009-06-22 Thread charles78


I have been trying to draw histogram for my manscript and found some strange
things that I could not figure out why.

Using the same code listed below I have successfully draw histograms for a
few figures with fraction labeled on Y axis less than 1 (acturally between 0
to 0.1).  But one dataset gives the Y axis label 0 to 5 as fraction.  This
is not true, as fraction are less than 1, although the value distribution on
the figure seems to me is right.

The only difference between the first few datasets and last dataset is:

All values for the first few data sets  1.

The values for the last data sets between 0 and 1.

Any idea why this happens.

Your help is highly appreciated.  

Charles

===
postscript(Figure.eps, paper=letter, horizontal=FALSE)
par(mfrow=c(3,3))
par(omi=c(2,0.2,1.8,0.2), mai=c(0.4, 0.4, 0.5, 0.1) )

my.input - read.table(input.data, header=FALSE, sep=\t)
my.input.exp - read.table(exp.data, header=FALSE, sep=\t)

hist(my.input.exp[,2], breaks=40,freq = FALSE,xlab=, border = grey30,
ylab=, main=)
hist(my.input[,2], breaks=40,freq = FALSE,xlab=, ylab=, main=, border
= red,add =TRUE)
mtext(Fraction, side=2, line=2, cex=0.7)
box()
==
-- 
View this message in context: 
http://www.nabble.com/Help-needed%3A-Fraction-for-Histogram-%3E-1-tp24150345p24150345.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Roxygen vs Sweave for S4 documentation

2009-06-22 Thread Martin Maechler

 DM == Duncan Murdoch murd...@stats.uwo.ca
 on Mon, 22 Jun 2009 09:41:12 -0400 writes:

DM On 6/22/2009 9:23 AM, Tobias Verbeke wrote:
 On Mon, Jun 22, 2009 at 2:18 PM, Douglas Batesba...@stat.wisc.edu 
wrote:
 On Mon, Jun 22, 2009 at 7:12 AM, Martin
 Maechlermaech...@stat.math.ethz.ch wrote:
 TobiasV == Tobias Verbeke tobias.verb...@openanalytics.be
 on Sun, 21 Jun 2009 08:25:07 +0200 writes:
 
TobiasV Hi Ken,
  I have been using R for a while.  Recently, I have begun converting 
my
  package into S4 classes.  I was previously using Rdoc for 
documentation.
  Now, I am looking to use the best tool for S4 documentation.  It 
seems that
  the best choices for me are Roxygen and Sweave (I am fine with tex).
 
  Are there any users of Roxygen or Sweave who can comment on the 
strengths or
  weaknesses of one or othe other?  Thanks in advance.
 
TobiasV For the moment proper documentation of S4 classes (with a @slot tag
TobiasV e.g.) is not implemented yet,
 
 how did you define proper here?
 
 I know that the result of  promptClass()  may not always be
 perfect. As most things are not perfect,
 I would not quickly call this improper ...
 
 Or is using *.Rd  not proper for you,
 as it does not have all of code + docs in one file?
 
 I think Tobias was referring to Roxygen support for S4 classes, not
 the existence of .Rd tags.
 
 Yes, I was just referring to the Roxygen support and as a matter of
 fact I recommended Roxygen to mimick the nice job done by promptClass
 a while ago
 
 
https://lists.r-forge.r-project.org/pipermail/roxygen-devel/2009-February/23.html
 
 No intent to criticize nor current Rd tags nor Rd as a documentation 
system,
 although having code + docs in one file (as in Roxygen) seems to be 
convenient
 to some.

DM Hopefully the Rd changes coming in R 2.10.0 will enhance either way of 
DM generating an Rd file.  There are some things (e.g. all the methods for 
DM a generic, or all the generics with methods for a class, all the 
classes 
DM extending a given one) that aren't known at the time you run 
promptClass 
DM or Roxygen, but which are known at display time; the new system should 
DM allow those to be computed just before displaying the page.

Yes, indeed!
S4 classes and methods are indeed a very good example why a
dynamic help system as you are envisaging is very desirable.
Thanks a lot, Duncan, for your work on this!

Martin

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] what's the R command to make the following?

2009-06-22 Thread Gabor Grothendieck

There are many ways to do this in R.  See R News 4/1 for info on dates.
Here is one method:

 library(chron)
 mdy - month.day.year(05/16/2008)
 mdy
$month
[1] 5

$day
[1] 16

$year
[1] 2008

 mdy$month
[1] 5


On Mon, Jun 22, 2009 at 11:36 AM, Jack Luojluo.rh...@gmail.com wrote:
 Hi,

 I have a simple question, suppose I have the date 05/16/2008, what would
 be the command to get the month, day and year?

 Thanks,

 -Jack

        [[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] The gradient of a multivariate normal density with respect toits parameters

2009-06-22 Thread Ravi Varadhan

Karl,

You may want to look at the paper by Dwyer on Some applications of matrix
derivatives in multivariate analysis (JASA 1967), especially the Table 2 on
p. 617. 

Ravi.


---

Ravi Varadhan, Ph.D.

Assistant Professor, The Center on Aging and Health

Division of Geriatric Medicine and Gerontology 

Johns Hopkins University

Ph: (410) 502-2619

Fax: (410) 614-9625

Email: rvarad...@jhmi.edu

Webpage:
http://www.jhsph.edu/agingandhealth/People/Faculty_personal_pages/Varadhan.h
tml

 





-Original Message-
From: Karl Ove Hufthammer [mailto:karl.huftham...@math.uib.no] 
Sent: Monday, June 22, 2009 12:12 PM
To: r-h...@stat.math.ethz.ch
Cc: Ravi Varadhan
Subject: Re: [R] The gradient of a multivariate normal density with respect
toits parameters

Ravi Varadhan skreiv:
 I am not aware of any.  May I ask what your purpose is?  You don't 
 really need this if you are going to use it in optimization, since 
 most optimizers use a simple finite-difference approximation if you 
 don't provide the gradient.  Using the numerical approximation from 
 numDeriv will be quite time-consuming in an optimization routine, 
 since numDeriv uses a high-order Richardosn extrapolation to compute 
 an accurate approximation of the gradient.
   

No, I don't use it in an optimisation. The expression is part of a more
complicated formula used for calculating some estimates in a special
nonparametric model.

I won't use the numerical approximation; the alternative would be to
calculate the analytical expressions myself. It's not too difficult, but
tedious, and the expressions I end up with may not be the fastest or most
numerically accurate, so if there was a package implementing them in a good
way, it would be nice. :)

 Regardless of your purpose, there is a small bug in your function.  You
 should change   `dmvnorm(cbind(x,y),mu,sig)'  to
 `dmvnorm(cbind(xx,yy),mu,sig)'.

Yes, of course. I originally used x and y when creating the example, but
then discovered that the jacobian() function already used x as an argument
for something else, so I renamed them to xx and yy (though obviously not
everywhere!). I really should have tested it in a completely clean
environment before posting.

-- 
Karl Ove Hufthammer

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Problems trying to run R2jags on a Linux Cluster

2009-06-22 Thread Luwis Tapiwa Diya

Dear R users,

I am having problems in trying to run R2jags on a Linux Cluster. When I
tried to run the model given in the R2jags manual I get the following error
error message:

Error in FUN(X[[3L]], ...) : object 'J' not found

Can any one help me with this problem?

Looking to hear from you all.

Regards,


-- 
Luwis Diya,
Leuven Biostatistics and Statistical Bioinformatics Centre (L-BioStat),
Kapucijnenvoer 35 blok d - bus 7001,
3000 Leuven,
Belgium

Tel: +32 16 336886 or +32 16 336892
Fax: +32 16 337015

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] User defined GLM?

2009-06-22 Thread francogrex


Hello, I have this generalized linear formula:
log(x[i]/n[i])=log(sum(x)/sum(n)) + beta[i]
where the the x[i] and the n[i] are known.
Is there a way to program the GLM procedure to input the formula above and
get the beta[i] estimates? If not the GLM is there another procedure to do
that? The aim also afterwards is to estimate the profile-likelihood CIs for
the beta parameters.
-- 
View this message in context: 
http://www.nabble.com/User-defined-GLM--tp24150859p24150859.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Help needed: Fraction for Histogram 1 ???

2009-06-22 Thread Greg Snow

When freq=FALSE then the y axis is not the proportion in each group (what I am 
assuming you mean by fraction), but rather is scaled so that the total area of 
the histogram is 1 (making comparing to theoretical densities easier).  If all 
the data values are between 0 and 1, then the height of at least one bar needs 
to be = 1 for the total area to equal 1.

If you want the y-axis to show relative frequency (proportion, fraction, etc.), 
then you either need to plot the y-axis yourself or use a different function 
than 'hist'.

-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111


 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-
 project.org] On Behalf Of charles78
 Sent: Monday, June 22, 2009 10:08 AM
 To: r-help@r-project.org
 Subject: [R] Help needed: Fraction for Histogram  1 ???
 
 
 I have been trying to draw histogram for my manscript and found some
 strange
 things that I could not figure out why.
 
 Using the same code listed below I have successfully draw histograms
 for a
 few figures with fraction labeled on Y axis less than 1 (acturally
 between 0
 to 0.1).  But one dataset gives the Y axis label 0 to 5 as fraction.
 This
 is not true, as fraction are less than 1, although the value
 distribution on
 the figure seems to me is right.
 
 The only difference between the first few datasets and last dataset is:
 
 All values for the first few data sets  1.
 
 The values for the last data sets between 0 and 1.
 
 Any idea why this happens.
 
 Your help is highly appreciated.
 
 Charles
 
 ===
 postscript(Figure.eps, paper=letter, horizontal=FALSE)
 par(mfrow=c(3,3))
 par(omi=c(2,0.2,1.8,0.2), mai=c(0.4, 0.4, 0.5, 0.1) )
 
 my.input - read.table(input.data, header=FALSE, sep=\t)
 my.input.exp - read.table(exp.data, header=FALSE, sep=\t)
 
 hist(my.input.exp[,2], breaks=40,freq = FALSE,xlab=, border =
 grey30,
 ylab=, main=)
 hist(my.input[,2], breaks=40,freq = FALSE,xlab=, ylab=, main=,
 border
 = red,add =TRUE)
 mtext(Fraction, side=2, line=2, cex=0.7)
 box()
 ==
 --
 View this message in context: http://www.nabble.com/Help-needed%3A-
 Fraction-for-Histogram-%3E-1-tp24150345p24150345.html
 Sent from the R help mailing list archive at Nabble.com.
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-
 guide.html
 and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Convert NA to '.'

2009-06-22 Thread Jorge Ivan Velez

Hi David,
It works for me when handling data frames mixing characters and numeric by
using either print() or a function called foo:

# Some data
x - c(A, B, C, 3.5, 1, 0, NA, a character,NA, another character)
mydata - data.frame(x = x, y = sample(x))
mydata

# Option 1: print()ing
print(mydata, na.print = .)

# Option 2: using a function foo
foo - function(x){
y - as.character(x)
y[is.na(y)] - .
y
}

mydata[,] - apply(mydata, 2, foo)
mydata

See ?print for more information.

HTH,

Jorge


On Mon, Jun 22, 2009 at 9:22 AM, David_D david.da...@gmail.com wrote:


 Jorge,

 Thanks a lot for your reply. It's indeed working in this case. However,
 with
 data frames mixing numeric and character is not working anymore. Morever I
 am working with many variables and I don't want to modify them. What I
 would
 really appreciate is a global option (in the Rprofile?) that allow to
 change
 the display nd printing of NA by any symbol?

 Does it exist?

 Best regards,
 David


 Jorge Ivan Velez wrote:
 
  Dear David,
  Try this:
 
  x - c(A, B, C, NA)
  x[is.na(x)] - .
  x
 
  HTH,
 
  Jorge
 
  On Mon, Jun 22, 2009 at 2:11 AM, David_D david.da...@gmail.com wrote:
 
 
  Dear R-users,
 
  For reporting purpose (using Sweave and LaTeX), I am creating complex
  tables
  with the cat function such as
 
   x-c(A, B, C, NA)
   cat(x, '\n')
  A B C NA
 
  For convenience, I would like to change all my NA value to something
 else
  like '.' (as in SAS for example). Is there a global option which allows
  this
  change? Or should I change all my code to work with the print function
  and
  the na.print argument?
 
  Best regards,
  David
  --
  View this message in context:
  http://www.nabble.com/Convert-NA-to-%27.%27-tp24142187p24142187.html
  Sent from the R help mailing list archive at Nabble.com.
 
  __
  R-help@r-project.org mailing list
  https://stat.ethz.ch/mailman/listinfo/r-help
  PLEASE do read the posting guide
  http://www.R-project.org/posting-guide.html
  and provide commented, minimal, self-contained, reproducible code.
 
 
[[alternative HTML version deleted]]
 
  __
  R-help@r-project.org mailing list
  https://stat.ethz.ch/mailman/listinfo/r-help
  PLEASE do read the posting guide
  http://www.R-project.org/posting-guide.html
  and provide commented, minimal, self-contained, reproducible code.
 
 

 --
 View this message in context:
 http://www.nabble.com/Convert-NA-to-%27.%27-tp24142187p24147157.html
 Sent from the R help mailing list archive at Nabble.com.

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Convert NA to '.'

2009-06-22 Thread David Winsemius



On Jun 22, 2009, at 1:21 PM, Jorge Ivan Velez wrote:


Hi David,
It works for me when handling data frames mixing characters and  
numeric by

using either print() or a function called foo:

# Some data
x - c(A, B, C, 3.5, 1, 0, NA, a character,NA, another  
character)

mydata - data.frame(x = x, y = sample(x))
mydata


Those are not numeric. They are factors.


# Option 1: print()ing
print(mydata, na.print = .)


Does not work as expected with numeric NA's.

 mydata - data.frame(x = x, y = sample(x),  
c=sample(c(1:2,NA),length(x),replace=TRUE))

 print(mydata, na.print = .)
   x y  c
1  A . NA
2  B . NA
3  C another character NA
43.5 0  1
5  1 B  1
6  0 C  1
7  . 1  1
8a character   a character  1
9  .   3.5  2
10 another character A  1


?format.data.frame




# Option 2: using a function foo
foo - function(x){
   y - as.character(x)
   y[is.na(y)] - .
   y
   }

mydata[,] - apply(mydata, 2, foo)
mydata



But this does work, even when when given numeric NA's

 mydata[,] - apply(mydata, 2, foo)
 mydata
   x y  c
1  A .  .
2  B .  .
3  C another character  .
43.5 0  1
5  1 B  1
6  0 C  1
7  . 1  1
8a character   a character  1
9  .   3.5  2
10 another character A  1




On Mon, Jun 22, 2009 at 9:22 AM, David_D david.da...@gmail.com  
wrote:




Jorge,

Thanks a lot for your reply. It's indeed working in this case.  
However,

with
data frames mixing numeric and character is not working anymore.  
Morever I
am working with many variables and I don't want to modify them.  
What I

would
really appreciate is a global option (in the Rprofile?) that allow to
change
the display nd printing of NA by any symbol?

Does it exist?

Best regards,
David


Jorge Ivan Velez wrote:


Dear David,
Try this:

x - c(A, B, C, NA)
x[is.na(x)] - .
x

HTH,

Jorge

On Mon, Jun 22, 2009 at 2:11 AM, David_D david.da...@gmail.com  
wrote:




Dear R-users,

For reporting purpose (using Sweave and LaTeX), I am creating  
complex

tables
with the cat function such as


x-c(A, B, C, NA)
cat(x, '\n')

A B C NA

For convenience, I would like to change all my NA value to  
something

else
like '.' (as in SAS for example). Is there a global option which  
allows

this
change? Or should I change all my code to work with the print  
function

and
the na.print argument?

Best regards,
David
--
View this message in context:
http://www.nabble.com/Convert-NA-to-%27.%27- 
tp24142187p24142187.html

Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



 [[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.




--
View this message in context:
http://www.nabble.com/Convert-NA-to-%27.%27-tp24142187p24147157.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


David Winsemius, MD
Heritage Laboratories
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] User defined GLM?

2009-06-22 Thread David Winsemius



On Jun 22, 2009, at 12:35 PM, francogrex wrote:



Hello, I have this generalized linear formula:
log(x[i]/n[i])=log(sum(x)/sum(n)) + beta[i]
where the the x[i] and the n[i] are known.
Is there a way to program the GLM procedure to input the formula  
above and
get the beta[i] estimates? If not the GLM is there another procedure  
to do
that? The aim also afterwards is to estimate the profile-likelihood  
CIs for

the beta parameters.


Not as stated. That would be expecting R to have symbolic algebraic  
capabilities. Maybe next year^Wdecade?


That model is equivalent to:


log(x[i]/n[i])=log(sum(x)) - log(sum(n)) + beta[i]


So you could use a no-intercept model with log link in glm() and an  
offset term for the first two terms of the r.h.s.


probi - (x[i]/n[i])
glm(probi ~ 0 + offset(log(sum(x)) - log(sum(n) ), family=poisson)
--

David Winsemius, MD
Heritage Laboratories
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] SAS-like method of recoding variables?

2009-06-22 Thread Mark Na

Dear R-helpers,

I am helping a SAS user run some analyses in R that she cannot do in
SAS and she is complaining about R's peculiar (to her!) way of
recoding variables. In particular, she is wondering if there is an R
package that allows this kind of SAS recoding:

IF TYPE='TRUCK' and count=12 THEN VEHICLES=TRUCK+((CAR+BIKE)/2.2);

Thanks for any help or suggestions you might be able to provide!

Mark Na

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Problem with storing a sequence of lmer() model fit into a list

2009-06-22 Thread Sean Zhang

Dear R-helpers:
May I ask a question related to storing a number of lmer model fit into a
list.

Basically, I have a for-loop (see towards the bottom of this email)
in the loop, I am very sure that the i-th model fit (i.e.,fit_i) is
successfully generated and the character string (i.e., tmp_i) is created
correctly.
The problem stems from the following line in the for-loop
#trouble making line below
fit.list[[tmp_i]] - fit_i


I tried the following example which stores glm() model fit without a
problem.
#the following code can store glm() model fit into a list
---
x1-runif(200)
x2-rnorm(200)
y-x1+x2
testdf-data.frame(y=y, x1=x1, x2=x2)
indepvec-c(x1,x2)
fit.list-NULL
fit_1-glm(y~x1,data=testdf)
fit_2-glm(y~x2,data=testdf)
fit.list[[paste('fit_',indepvec[1],sep='')]]-fit_1
fit.list[[paste('fit_',indepvec[12],sep='')]]-fit_2

so why cannot I store lmer() model fit in a list?
Would someone kindly explain to me what the R error message(last line of
this email) really means?
Your kind help will be highly appreciated!

-Sean

#the following for-loop intends to store lmer() random poisson model output
into list (fit.list), it does not work
---
fit.list-NULL
for (i in seq_along(depvar_vec))
{
  #I found that s_sex, ses1 and race are not useful
  fit_i - lmer(as.formula(gen.ranpoisson.fml.jh(depvar_vec[i],
offsetvar ,factorindepvars,  nonfactorindepvars ,ranintvar )),
family=quasipoisson(link=log),verbose=F, data=indf)
  tmp_i-paste('ranpoi_', depvar_vec[i], sep='')
  fit.list[[tmp_i]] - fit_i
  #assign also does not work
  #assign(fit.list$parse(text = tmp_i), fit_i)
 }
---


#R gives the following error message.

Error in fit.list[[tmp_i]] - fit_i : invalid type/length (S4/0) in vector
allocation

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] SAS-like method of recoding variables?

2009-06-22 Thread David Winsemius



On Jun 22, 2009, at 2:27 PM, Mark Na wrote:


Dear R-helpers,

I am helping a SAS user run some analyses in R that she cannot do in
SAS and she is complaining about R's peculiar (to her!) way of
recoding variables. In particular, she is wondering if there is an R
package that allows this kind of SAS recoding:

IF TYPE='TRUCK' and count=12 THEN VEHICLES=TRUCK+((CAR+BIKE)/2.2);


?ifelse




Thanks for any help or suggestions you might be able to provide!

Mark Na

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


David Winsemius, MD
Heritage Laboratories
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] SAS-like method of recoding variables?

2009-06-22 Thread Chuck Cleland

On 6/22/2009 2:27 PM, Mark Na wrote:
 Dear R-helpers,
 
 I am helping a SAS user run some analyses in R that she cannot do in
 SAS and she is complaining about R's peculiar (to her!) way of
 recoding variables. In particular, she is wondering if there is an R
 package that allows this kind of SAS recoding:
 
 IF TYPE='TRUCK' and count=12 THEN VEHICLES=TRUCK+((CAR+BIKE)/2.2);
 
 Thanks for any help or suggestions you might be able to provide!

  If the variables are in a data frame called mydf, she might do
something like this:

mydf$VEHICLE - with(mydf, ifelse(TYPE=='TRUCK'  count==12,
  TRUCK+((CAR+BIKE)/2.2),
  NA))

or

mydf - transform(mydf, VEHICLE = ifelse(TYPE=='TRUCK'  count==12,
 TRUCK+((CAR+BIKE)/2.2),
 NA))

 Mark Na
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code. 

-- 
Chuck Cleland, Ph.D.
NDRI, Inc. (www.ndri.org)
71 West 23rd Street, 8th floor
New York, NY 10010
tel: (212) 845-4495 (Tu, Th)
tel: (732) 512-0171 (M, W, F)
fax: (917) 438-0894

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] SAS-like method of recoding variables?

2009-06-22 Thread Peter Dalgaard


Mark Na wrote:

Dear R-helpers,

I am helping a SAS user run some analyses in R that she cannot do in
SAS and she is complaining about R's peculiar (to her!) way of
recoding variables. In particular, she is wondering if there is an R
package that allows this kind of SAS recoding:

IF TYPE='TRUCK' and count=12 THEN VEHICLES=TRUCK+((CAR+BIKE)/2.2);


vehicles - ifelse(TYPE=='TRUCK'  count=12, TRUCK+((CAR+BIKE)/2.2), NA)

or maybe

newdata - within(mydata,
   vehicles - ifelse(TYPE=='TRUCK'  count=12,
TRUCK+((CAR+BIKE)/2.2), NA)
)

or transform(mydata, vehicles=)

or, if you insist on only having the calculation done for the subgroup,

sub - with(mydata, TYPE=='TRUCK'  count=12)
sub - sub  !is.na(sub)
mydata$vehicles - NA
mydata$vehicles[sub] - with(mydata[sub,],  TRUCK+((CAR+BIKE)/2.2) )

(all assuming that there's no vehicles in the data to begin with).


--
   O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark  Ph:  (+45) 35327918
~~ - (p.dalga...@biostat.ku.dk)  FAX: (+45) 35327907

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] User defined GLM?

2009-06-22 Thread Emmanuel Charpentier

Le lundi 22 juin 2009 à 09:35 -0700, francogrex a écrit :
 Hello, I have this generalized linear formula:
 log(x[i]/n[i])=log(sum(x)/sum(n)) + beta[i]

I do not understand the problem as stated. if x[i] and n[i] are known,
and unless sum(n)=0, your dataset reduces to a set of nrow(dataset)
independent linear equations with nrow(dataset) unknowns (the beta[i]),
whose solution is trivially beta[i]=log(x[i]/n[i])-log(sum(x)/sum(n),
except when n[i]=0 in which case your equation has no solution.

Could you try to re-express your problem ?

Emmanuel Charpentier

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] What has happened to the R-Help Google Groups Archive? Alternative?

2009-06-22 Thread David M Smith

Gmane has already been mentioned, and you might also want to consider
Nabble -- judging from my referrer logs many people use it to read
r-help. If you use Gmail, you might also want to consider subscribing
to the list and using a simple filter. Details and links at
blog.revolution-computing.com, here:

http://bit.ly/FE2s9

# David Smith

On Mon, Jun 22, 2009 at 4:23 AM, Tony Breyaltony.bre...@googlemail.com wrote:
 Greetings,

 I usually read this mailing list through google groups
 (http://groups.google.com/group/r-help-archive), but when I opened the
 webpage this morning it said: The group named r-help-archive has been
 removed because it violated Google's Terms Of Service.

 Is there an alternative website which uses a similar structure to
 google groups? I had a quick browse on the R Wiki
 (http://wiki.r-project.org/rwiki/doku.php?id=links:links) but didn't
 see a page with this sort of info.

 Thank you kindly,
 Tony Breyal

-- 
David M Smith da...@revolution-computing.com
Director of Community, REvolution Computing www.revolution-computing.com
Tel: +1 (206) 577-4778 x3203 (San Francisco, USA)

Check out our upcoming events schedule at www.revolution-computing.com/events

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Shapiro.test on data frame

2009-06-22 Thread Gonzalo Quiroga

Hi, I need help to perform a Shapiro.test on a data frame, I know that
this test works only with vector but I guess there most be a way to
permor it on a data frame instead of vactor by vector (i.e. I've got 40
variables to analyze and its kinda annoying to do it one by one)

Thanks to anyone that can help me.

 Gonzalo Quiroga

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] maps maptools

2009-06-22 Thread John Lipkins

Hey,

A basic question. Is there anyone that knows a good text on using the
R packages map and maptools for R? I have read the references for both
but having trouble getting started.

What I want to do is to load a map (.shp file) and display statistics
on this maps using colours.

Thanks in advance.
Kind regards,

John

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] User defined GLM?

2009-06-22 Thread francogrex



Emmanuel Charpentier wrote:
 
 I do not understand the problem as stated. if x[i] and n[i] are known,
 and unless sum(n)=0, your dataset reduces to a set of nrow(dataset)
 independent linear equations with nrow(dataset) unknowns (the beta[i]),
 whose solution is trivially beta[i]=log(x[i]/n[i])-log(sum(x)/sum(n),
 except when n[i]=0 in which case your equation has no solution.
 Could you try to re-express your problem ?
 

This is taken from this article: Methods for confidence interval estimation
of a ratio parameter with application to location quotients. The authors
argue that it's not the solution to find beta that they're after but the
side effect of having the profile-likelihood confidence intervals in the
estimation process. Those guys say they used the SAS Genmod procedure. I
thought I could use glm in R and find the profile-likelihood CIs using
confint on the glm results.

-- 
View this message in context: 
http://www.nabble.com/User-defined-GLM--tp24150859p24154447.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Help needed: Fraction for Histogram 1 ???

2009-06-22 Thread charles78


It is not the case as you described.  In any case, the total area should be 1
and labeled fraction on y axis should be far less than 1, since I have more
than 1 data points.  I also test differerent bin size by change the
break.

I draw the graph using only 1 group, the same result was obtained.

Any othe suggestion?

Charles.  

Greg Snow-2 wrote:
 
 When freq=FALSE then the y axis is not the proportion in each group (what
 I am assuming you mean by fraction), but rather is scaled so that the
 total area of the histogram is 1 (making comparing to theoretical
 densities easier).  If all the data values are between 0 and 1, then the
 height of at least one bar needs to be = 1 for the total area to equal 1.
 
 If you want the y-axis to show relative frequency (proportion, fraction,
 etc.), then you either need to plot the y-axis yourself or use a different
 function than 'hist'.
 
 -- 
 Gregory (Greg) L. Snow Ph.D.
 Statistical Data Center
 Intermountain Healthcare
 greg.s...@imail.org
 801.408.8111
 
 
 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-
 project.org] On Behalf Of charles78
 Sent: Monday, June 22, 2009 10:08 AM
 To: r-help@r-project.org
 Subject: [R] Help needed: Fraction for Histogram  1 ???
 
 
 I have been trying to draw histogram for my manscript and found some
 strange
 things that I could not figure out why.
 
 Using the same code listed below I have successfully draw histograms
 for a
 few figures with fraction labeled on Y axis less than 1 (acturally
 between 0
 to 0.1).  But one dataset gives the Y axis label 0 to 5 as fraction.
 This
 is not true, as fraction are less than 1, although the value
 distribution on
 the figure seems to me is right.
 
 The only difference between the first few datasets and last dataset is:
 
 All values for the first few data sets  1.
 
 The values for the last data sets between 0 and 1.
 
 Any idea why this happens.
 
 Your help is highly appreciated.
 
 Charles
 
 ===
 postscript(Figure.eps, paper=letter, horizontal=FALSE)
 par(mfrow=c(3,3))
 par(omi=c(2,0.2,1.8,0.2), mai=c(0.4, 0.4, 0.5, 0.1) )
 
 my.input - read.table(input.data, header=FALSE, sep=\t)
 my.input.exp - read.table(exp.data, header=FALSE, sep=\t)
 
 hist(my.input.exp[,2], breaks=40,freq = FALSE,xlab=, border =
 grey30,
 ylab=, main=)
 hist(my.input[,2], breaks=40,freq = FALSE,xlab=, ylab=, main=,
 border
 = red,add =TRUE)
 mtext(Fraction, side=2, line=2, cex=0.7)
 box()
 ==
 --
 View this message in context: http://www.nabble.com/Help-needed%3A-
 Fraction-for-Histogram-%3E-1-tp24150345p24150345.html
 Sent from the R help mailing list archive at Nabble.com.
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-
 guide.html
 and provide commented, minimal, self-contained, reproducible code.
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
 
 

-- 
View this message in context: 
http://www.nabble.com/Help-needed%3A-Fraction-for-Histogram-%3E-1-tp24150345p24154095.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Shapiro.test on data frame

2009-06-22 Thread Henrique Dallazuanna

Try this:

x - data.frame(A = runif(10), B = rnorm(10))
lapply(x, shapiro.test)

On Mon, Jun 22, 2009 at 3:15 PM, Gonzalo Quiroga
quirogagonz...@gmail.comwrote:

 Hi, I need help to perform a Shapiro.test on a data frame, I know that
 this test works only with vector but I guess there most be a way to
 permor it on a data frame instead of vactor by vector (i.e. I've got 40
 variables to analyze and its kinda annoying to do it one by one)

 Thanks to anyone that can help me.

 Gonzalo Quiroga

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




-- 
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40 S 49° 16' 22 O

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Problem with storing a sequence of lmer() model fit into a list

2009-06-22 Thread Rolf Turner



(a) Your code is unnecessarily convoluted.

(b) The example of things *not* working is not reproducible.  (Read  
the posting guide!!!)


(c) Nonetheless the phenomenon you describe is weird/interesting.

On my system, the following runs without error:

fit.list - NULL
a - factor(rep(1:10,each=20))
set.seed(42)
for(i in 1:10) {
x1-runif(200)
x2-rnorm(200)
y - x1+x2
testdf-data.frame(y=y, x1=x1)
fit - lm(y~x1+a,data=testdf)
fit.list[[i]] - fit
}


However if the call to lm() is replace by a call to lmer()

library(lme4)
fit.list - NULL
a - factor(rep(1:10,each=20))
set.seed(42)
for(i in 1:10) {
x1-runif(200)
x2-rnorm(200)
y - x1+x2
testdf-data.frame(y=y, x1=x1)
fit - lmer(y~x1 + (1|a),data=testdf)
fit.list[[i]] - fit
}

I get the same error that you reported, namely

Error in fit.list[[i]] - fit :
  invalid type/length (S4/0) in vector allocation

Finally if fit.list is initialized to an empty list (which is the way  
it *should*
be initialized anyway) rather than to NULL, then things appear to  
work smoothly even

with a call to lmer():

library(lme4)
fit.list - list()
a - factor(rep(1:10,each=20))
set.seed(42)
for(i in 1:10) {
x1-runif(200)
x2-rnorm(200)
y - x1+x2
testdf-data.frame(y=y, x1=x1)
fit - lmer(y~x1 + (1|a),data=testdf)
fit.list[[i]] - fit
}

So a ``fix'' for the problem would seem to be to start with fit.list  
being an empty list,

rather than NULL.

But why the problem shows up only with calls to lmer() is completely  
mysterious to me.

Perhaps others will be able to shed light.

HTH

cheers,

Rolf Turner

On 23/06/2009, at 6:34 AM, Sean Zhang wrote:


Dear R-helpers:
May I ask a question related to storing a number of lmer model fit  
into a

list.

Basically, I have a for-loop (see towards the bottom of this email)
in the loop, I am very sure that the i-th model fit (i.e.,fit_i) is
successfully generated and the character string (i.e., tmp_i) is  
created

correctly.
The problem stems from the following line in the for-loop
#trouble making line below
fit.list[[tmp_i]] - fit_i


I tried the following example which stores glm() model fit without a
problem.
#the following code can store glm() model fit into a list
---
x1-runif(200)
x2-rnorm(200)
y-x1+x2
testdf-data.frame(y=y, x1=x1, x2=x2)
indepvec-c(x1,x2)
fit.list-NULL
fit_1-glm(y~x1,data=testdf)
fit_2-glm(y~x2,data=testdf)
fit.list[[paste('fit_',indepvec[1],sep='')]]-fit_1
fit.list[[paste('fit_',indepvec[12],sep='')]]-fit_2

so why cannot I store lmer() model fit in a list?
Would someone kindly explain to me what the R error message(last  
line of

this email) really means?
Your kind help will be highly appreciated!

-Sean

#the following for-loop intends to store lmer() random poisson  
model output

into list (fit.list), it does not work
-- 
-

fit.list-NULL
for (i in seq_along(depvar_vec))
{
  #I found that s_sex, ses1 and race are not useful
  fit_i - lmer(as.formula(gen.ranpoisson.fml.jh(depvar_vec[i],
offsetvar ,factorindepvars,  nonfactorindepvars ,ranintvar )),
family=quasipoisson(link=log),verbose=F, data=indf)
  tmp_i-paste('ranpoi_', depvar_vec[i], sep='')
  fit.list[[tmp_i]] - fit_i
  #assign also does not work
  #assign(fit.list$parse(text = tmp_i), fit_i)
 }
---


#R gives the following error message.

Error in fit.list[[tmp_i]] - fit_i : invalid type/length (S4/0) in  
vector

allocation


##
Attention:\ This e-mail message is privileged and confid...{{dropped:9}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Shapiro.test on data frame

2009-06-22 Thread Dylan Beaudette

On Monday 22 June 2009, Gonzalo Quiroga wrote:
 Hi, I need help to perform a Shapiro.test on a data frame, I know that
 this test works only with vector but I guess there most be a way to
 permor it on a data frame instead of vactor by vector (i.e. I've got 40
 variables to analyze and its kinda annoying to do it one by one)

 Thanks to anyone that can help me.

  Gonzalo Quiroga

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html and provide commented, minimal,
 self-contained, reproducible code.

are you looking to perform this column-wise or row-wise?

see ?apply for ideas

cheers,
Dylan

-- 
Dylan Beaudette
Soil Resource Laboratory
http://casoilresource.lawr.ucdavis.edu/
University of California at Davis
530.754.7341

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Help needed: Fraction for Histogram 1 ???

2009-06-22 Thread Thomas Lumley


On Mon, 22 Jun 2009, charles78 wrote:



It is not the case as you described.  In any case, the total area should be 1
and labeled fraction on y axis should be far less than 1, since I have more
than 1 data points.  I also test differerent bin size by change the
break.


Try reading Greg's response again.  Then calculate the area of one of the bars 
in your histogram, remembering that the area of a rectangle is not the same as 
its height, but is the width times height.  If the area of the bar is greater than 
1, then you can report a problem.


  -thomas



I draw the graph using only 1 group, the same result was obtained.

Any othe suggestion?

Charles.

Greg Snow-2 wrote:


When freq=FALSE then the y axis is not the proportion in each group (what
I am assuming you mean by fraction), but rather is scaled so that the
total area of the histogram is 1 (making comparing to theoretical
densities easier).  If all the data values are between 0 and 1, then the
height of at least one bar needs to be = 1 for the total area to equal 1.

If you want the y-axis to show relative frequency (proportion, fraction,
etc.), then you either need to plot the y-axis yourself or use a different
function than 'hist'.

--
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111



-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-
project.org] On Behalf Of charles78
Sent: Monday, June 22, 2009 10:08 AM
To: r-help@r-project.org
Subject: [R] Help needed: Fraction for Histogram  1 ???


I have been trying to draw histogram for my manscript and found some
strange
things that I could not figure out why.

Using the same code listed below I have successfully draw histograms
for a
few figures with fraction labeled on Y axis less than 1 (acturally
between 0
to 0.1).  But one dataset gives the Y axis label 0 to 5 as fraction.
This
is not true, as fraction are less than 1, although the value
distribution on
the figure seems to me is right.

The only difference between the first few datasets and last dataset is:

All values for the first few data sets  1.

The values for the last data sets between 0 and 1.

Any idea why this happens.

Your help is highly appreciated.

Charles



=
==

postscript(Figure.eps, paper=letter, horizontal=FALSE)
par(mfrow=c(3,3))
par(omi=c(2,0.2,1.8,0.2), mai=c(0.4, 0.4, 0.5, 0.1) )

my.input - read.table(input.data, header=FALSE, sep=\t)
my.input.exp - read.table(exp.data, header=FALSE, sep=\t)

hist(my.input.exp[,2], breaks=40,freq = FALSE,xlab=, border =
grey30,
ylab=, main=)
hist(my.input[,2], breaks=40,freq = FALSE,xlab=, ylab=, main=,
border
= red,add =TRUE)
mtext(Fraction, side=2, line=2, cex=0.7)
box()


=
=

--
View this message in context: http://www.nabble.com/Help-needed%3A-
Fraction-for-Histogram-%3E-1-tp24150345p24150345.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-
guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.




--
View this message in context: http://www.nabble.com/Help-needed%3A-

Fraction-for-Histogram-%3E-1-tp24150345p24154095.html

Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-

guide.html

and provide commented, minimal, self-contained, reproducible code.



Thomas Lumley   Assoc. Professor, Biostatistics
tlum...@u.washington.eduUniversity of Washington, Seattle

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] nls vs nlme: parameter constraints

2009-06-22 Thread Manuel Morales

Hello list,

I'm trying to fit a model like beta[trt]/(1+alpha*x) where the data
include some grouping factor. The problem is that the estimate for alpha
is undefined for some of the treatments - any value greater than 20 is
equally good and a step function would suffice. Ignoring the grouping
structure, I can fit this using nls with the port algorithm by
restricting the upper value of alpha to 20. Is there a way to do
something similar in nlme?

Thanks!

Manuel

-- 
http://mutualism.williams.edu

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] maps maptools

2009-06-22 Thread Kingsford Jones

Hi John,

The SIDS examples in the spatial graphics gallery may be helpful:

http://r-spatial.sourceforge.net/gallery/

For a general reference, see Applied Spatial Data Analysis with R.
Note that the book's webpage includes figures with code

http://www.asdar-book.org/

There's also the spatial Task View

http://cran.r-project.org/web/views/Spatial.html

and an active spatial mailing list

https://www.stat.math.ethz.ch/mailman/listinfo/R-SIG-Geo/


hth,

Kingsford Jones

On Mon, Jun 22, 2009 at 1:01 PM, John
Lipkinsjohn.lipk...@googlemail.com wrote:
 Hey,

 A basic question. Is there anyone that knows a good text on using the
 R packages map and maptools for R? I have read the references for both
 but having trouble getting started.

 What I want to do is to load a map (.shp file) and display statistics
 on this maps using colours.

 Thanks in advance.
 Kind regards,

 John

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] question about using _apply and/or aggregate functions

2009-06-22 Thread Clifford Long

Hi R-list,

I'll apologize in advance for (1) the wordiness of my note (not sure
how to avoid it) and (2) any deficiencies on my part that lead to my
difficulties.

I have an application with several stages that is meant to simulate
and explore different scenarios with respect to product sales (in
units sold per month).  My session info is at the bottom of this note.

The steps include (1) an initial linear regression model, (2) building
an ARIMA model, (3) creating 12 months of simulated 'real' data - for
various values of projected changes in slope from the linear
regression - from the ARIMA model using arima.sim with subsequent
undifferencing and appending to historical data, (3) one-step-ahead
forecasting for 12 months using the 'fixed' arima model and simulated
data, (4) taking the residuals from the forecasting (simulated minus
forecast for each of the 12 months) and applying QC charting, and (5)
counting the number (if any) of runs-rules violations detected.

The simulation-aspect calculates 100 simulations for each of 50 values of slope.

All of this seems to work fine (although I'm sure that the coding
could be improved through functions, vectorization, etc. ... ).
However, the problem I'm having is at the end where I had hoped that
things would be easier.  I want to summarize and graph the probability
of detecting a runs-rule violation vs. the amount of the shift in
slope (of logunit sales).

The output data array passed from the qcc section at the end includes:
  - the adjustment made to the slope (a multiplier)
  - the actual value of the slope
  - the iteration number of the simulation loop (within each value of slope)
  - the count of QC charting limits violations
  - the count of QC charting runs rules violations


My code is in the attached file (generic_code.R) and my initial sales
data needed to prime the simulation is in the other attached file
(generic_data.csv).  The relevant section of my code is at the
bottom of the .R file after the end of the outer loop.  I've tried to
provide meaningful comments.

I've read the help files for _apply, aggregate, arrays and data types,
and have also consulted with several texts (Maindonald and Braun;
Spector; Venebles and Ripley for S-plus).  Somehow I still don't get
it.  My attempts usually result in a message like the following:

 agg.result = aggregate(simarray.part[,3], by=list4, FUN = mean)
Error in FUN(X[[1L]], ...) : arguments must have same length

But when I check the length of the arguments, they appear to match. (??)

 length(simarray.part[,3])
[1] 5000
 length(simarray.part[,4])
[1] 5000
 length(list4)
[1] 5000


I would have rather passed along a subset of the simulation/loop
output dataset, but was unsure how to save it to preserve whatever
properties that I may have missed that are causing my difficulties.
If anyone still wants to help at this point, I believe that you'll
need to load the .csv data and run the simulation (maybe reducing the
number of iterations).

Many thanks to anyone who can shed some light on my difficulties
(whether self-induced or otherwise).

Cliff



I'm using a pre-compiled binary version of R for Windows.

Session info:

 sessionInfo()
R version 2.9.0 (2009-04-17)
i386-pc-mingw32

locale:
LC_COLLATE=English_United States.1252;LC_CTYPE=English_United
States.1252;LC_MONETARY=English_United
States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base

other attached packages:
[1] qcc_1.3 forecast_1.24   tseries_0.10-18 zoo_1.5-5
[5] quadprog_1.4-11

loaded via a namespace (and not attached):
[1] grid_2.9.0  lattice_0.17-22


 Sys.getlocale()
[1] LC_COLLATE=English_United States.1252;LC_CTYPE=English_United
States.1252;LC_MONETARY=English_United
States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] question about using _apply and/or aggregate functions

2009-06-22 Thread Clifford Long

Hi David,

I appreciate the advice.  I had coerced 'list4' to as.list, but forgot
to specify list=() in the call to aggregate.  I made the correction,
and now get the following:

 slope.mult = simarray[,1]
 adj.slope.value = simarray[,2]
 adj.slope.level = simarray[,2]
 qc.run.violation = simarray[,5]
 simarray.part = cbind(slope.mult, adj.slope.value, qc.run.violation, 
 adj.slope.level)
 list4 = as.list(simarray.part[,4])
 agg.result = aggregate(simarray.part[,3], by=list(list4), FUN = mean)
Error in sort.list(unique.default(x), na.last = TRUE) :
  'x' must be atomic for 'sort.list'
Have you called 'sort' on a list?

... I'm not sure what this means that I've done wrong.  I did check
'list4' using is.list, and get TRUE back as an answer, so feel
that my mistake is some other fundamental aspect of R that I'm failing
to grasp.

To your note on 'tapply' ... I did try this as well (actually, tried
it first) with no initial success.  On your recommendation, I gave
tapply another go, and get something recognizable:

vtt = tapply(simarray.part[,3], simarray.part[,2], mean)

 dim(vtt)
[1] 50
 length(vtt)
[1] 50
 vtt[1:5]
0.003132 0.006264 0.009396 0.012528  0.01566
0.29 0.24 0.23 0.16 0.22
 vtt[1]
0.003132
0.29
 vtt[[1]][1]
[1] 0.29


I see that the output stored in vtt has one dimension with
length=50.  But each place in vtt appears to hold two values.  I'll
admit that I'm not sure how to access/reference the entirety of all 50
values =  0.29  0.24  0.23  0.16  0.22 (and so on).  I don't appear to
be able to access/reference only what appears to be an embedded index
= 0.003132   0.006264   0.009396  etc.   What am I missing?  Is there
a reference that I need to re-read?  I would like to be able to plot
one against the other.

Thanks again for taking the time outside of your day job for your
earlier reply!

Cliff


On Mon, Jun 22, 2009 at 11:28 AM, David Winsemiusdwinsem...@comcast.net wrote:

 On Jun 22, 2009, at 12:04 PM, Clifford Long wrote:

 Hi R-list,

 I'll apologize in advance for (1) the wordiness of my note (not sure
 how to avoid it) and (2) any deficiencies on my part that lead to my
 difficulties.

 I have an application with several stages that is meant to simulate
 and explore different scenarios with respect to product sales (in
 units sold per month).  My session info is at the bottom of this note.

 The steps include (1) an initial linear regression model, (2) building
 an ARIMA model, (3) creating 12 months of simulated 'real' data - for
 various values of projected changes in slope from the linear
 regression - from the ARIMA model using arima.sim with subsequent
 undifferencing and appending to historical data, (3) one-step-ahead
 forecasting for 12 months using the 'fixed' arima model and simulated
 data, (4) taking the residuals from the forecasting (simulated minus
 forecast for each of the 12 months) and applying QC charting, and (5)
 counting the number (if any) of runs-rules violations detected.

 The simulation-aspect calculates 100 simulations for each of 50 values of
 slope.

 All of this seems to work fine (although I'm sure that the coding
 could be improved through functions, vectorization, etc. ... ).
 However, the problem I'm having is at the end where I had hoped that
 things would be easier.  I want to summarize and graph the probability
 of detecting a runs-rule violation vs. the amount of the shift in
 slope (of logunit sales).

 The output data array passed from the qcc section at the end includes:
  - the adjustment made to the slope (a multiplier)
  - the actual value of the slope
  - the iteration number of the simulation loop (within each value of
 slope)
  - the count of QC charting limits violations
  - the count of QC charting runs rules violations


 My code is in the attached file (generic_code.R) and my initial sales
 data needed to prime the simulation is in the other attached file
 (generic_data.csv).  The relevant section of my code is at the
 bottom of the .R file after the end of the outer loop.  I've tried to
 provide meaningful comments.

 I've read the help files for _apply, aggregate, arrays and data types,
 and have also consulted with several texts (Maindonald and Braun;
 Spector; Venebles and Ripley for S-plus).  Somehow I still don't get
 it.  My attempts usually result in a message like the following:

 agg.result = aggregate(simarray.part[,3], by=list4, FUN = mean)

 Error in FUN(X[[1L]], ...) : arguments must have same length

 I cannot comment on the overall strategy, but wondered if this minor mod to
 the code might succeed;

 agg.result = aggregate(simarray.part[,3], by=list(list4), FUN = mean)

 My personal experience with aggregate() is not positive. I generally end up
 turning to tapply()  (which is at the heart of aggregate() anyway) probably
 because I forget to wrap the second argument in a list. Slow learner, I
 guess.



 But when I check the length of the arguments, they appear to match. (??)

[R] Calculating row standard deviations

2009-06-22 Thread Mark Na

Hi R-helpers,

I have been struggling with calculating row and column statistics,
e.g. standard deviation.

I know that
 datac$Mean-rowMeans(datac,na.rm=TRUE)
will give me row means.

I have tried to replicate those row means with the apply function:
 datac$Mean2-apply(datac,2,mean)

so that I can replace the function argument with sd (instead of
mean) to get standard deviations.

But, I'm running into this error:

 dim(datac)
[1]  17 271
 datac$Mean2-apply(datac,2,mean)
Error in dimnames(x) - dn :
  length of 'dimnames' [2] not equal to array extent


Can anyone see what I'm doing wrong?

Thanks!

Mark Na

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] lattice logaritmic scale (basis e ), rewriting labels using xscale.component

2009-06-22 Thread Katharina May

thanks Deepayan, that works great!


2009/6/19 Deepayan Sarkar deepayan.sar...@gmail.com:
 On 6/18/09, Katharina May may.kathar...@googlemail.com wrote:
 Hi there,

  sorry for troubling everybody once again, I've got a problem rewriting
  Sarkar's function for
  rewriting the tick locations in a logaritmic way (s.
  http://lmdvr.r-forge.r-project.org/code/Chapter08.R):

  His example works for log 2 but I need log e (natural logarithm). My
  problem is that if I replace
  2 with e (using paste()), I get the error message that the location
  isn't a numeric value.

 R doesn't have a constant that represents e, but

  tick.at - logTicks(exp(lim), loc = c(1, 3))

 instead of

  tick.at - logTicks(paste(e^,lim,sep=), loc = c(1, 3))

 should give you e^lim. Or, if it makes it easier,

  e - exp(1)
  tick.at - logTicks(e^lim, loc = c(1, 3))

 I don't think you need to change the logTicks function.

 -Deepayan

  Is there any way to get this working somehow or do I have to take a
  different approach?

  Thanks, Katharina

  Here my failing approach:

  require(lattice)
  data(Earthquake, package = MEMSS)

  xscale.components.log - function(lim, ...) {
     ans - xscale.components.default(lim = lim, ...)
     tick.at - logTicks(paste(e^,lim,sep=), loc = c(1, 3))
     ans$bottom$ticks$at - log(tick.at, 2)
     ans$bottom$labels$at - log(tick.at, 2)
     ans$bottom$labels$labels - as.character(tick.at)
     ans
  }

  logTicks - function (lim, loc = c(1, 5)) {
     ii - floor(log(range(lim))) + c(-1, 2)
     main - paste(e^,(ii[1]:ii[2]),sep=)
     r - as.numeric(outer(loc, main, *))
     r[lim[1] = r  r = lim[2]]
  }
  xyplot(accel ~ distance, data=Earthquake, scales = list(log = e),
  xscale.components = xscale.components.log,




  Here is the original  code of Sarkar:

  logTicks - function (lim, loc = c(1, 5)) {
     ii - floor(log10(range(lim))) + c(-1, 2)
     main - 10^(ii[1]:ii[2])
     r - as.numeric(outer(loc, main, *))
     r[lim[1] = r  r = lim[2]]
  }
  xscale.components.log2 - function(lim, ...) {
     ans - xscale.components.default(lim = lim, ...)
     tick.at - logTicks(2^lim, loc = c(1, 3))
     ans$bottom$ticks$at - log(tick.at, 2)
     ans$bottom$labels$at - log(tick.at, 2)
     ans$bottom$labels$labels - as.character(tick.at)
     ans
  }

  __
  r-h...@r-project.org mailing list
  https://stat.ethz.ch/mailman/listinfo/r-help
  PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
  and provide commented, minimal, self-contained, reproducible code.





-- 
Time flies like an arrow, fruit flies like bananas.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Calculating row standard deviations

2009-06-22 Thread Peter Alspach

Tena koe Mark

I think you might want

apply(datac, 1, mean) 

i.e., apply the function to the first dimension (rows) rather than the
second (columns).

HTH ...

Peter Alspach

 -Original Message-
 From: r-help-boun...@r-project.org 
 [mailto:r-help-boun...@r-project.org] On Behalf Of Mark Na
 Sent: Tuesday, 23 June 2009 10:20 a.m.
 To: r-help@r-project.org
 Subject: [R] Calculating row standard deviations
 
 Hi R-helpers,
 
 I have been struggling with calculating row and column 
 statistics, e.g. standard deviation.
 
 I know that
  datac$Mean-rowMeans(datac,na.rm=TRUE)
 will give me row means.
 
 I have tried to replicate those row means with the apply function:
  datac$Mean2-apply(datac,2,mean)
 
 so that I can replace the function argument with sd (instead of
 mean) to get standard deviations.
 
 But, I'm running into this error:
 
  dim(datac)
 [1]  17 271
  datac$Mean2-apply(datac,2,mean)
 Error in dimnames(x) - dn :
   length of 'dimnames' [2] not equal to array extent
 
 
 Can anyone see what I'm doing wrong?
 
 Thanks!
 
 Mark Na
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide 
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
 

The contents of this e-mail are confidential and may be subject to legal 
privilege.
 If you are not the intended recipient you must not use, disseminate, 
distribute or
 reproduce all or any part of this e-mail or attachments.  If you have received 
this
 e-mail in error, please notify the sender and delete all material pertaining 
to this
 e-mail.  Any opinion or views expressed in this e-mail are those of the 
individual
 sender and may not represent those of The New Zealand Institute for Plant and
 Food Research Limited.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Calculating row standard deviations

2009-06-22 Thread Peter Dalgaard


Mark Na wrote:

Hi R-helpers,

I have been struggling with calculating row and column statistics,
e.g. standard deviation.

I know that

datac$Mean-rowMeans(datac,na.rm=TRUE)

will give me row means.

I have tried to replicate those row means with the apply function:

datac$Mean2-apply(datac,2,mean)


so that I can replace the function argument with sd (instead of
mean) to get standard deviations.

But, I'm running into this error:


dim(datac)

[1]  17 271

datac$Mean2-apply(datac,2,mean)

Error in dimnames(x) - dn :
  length of 'dimnames' [2] not equal to array extent


Can anyone see what I'm doing wrong?


Not really, but you need apply(datac,1,mean) for row means:

 apply(airquality,2,mean)
Ozone   Solar.R  Wind  Temp Month   Day
   NANA  9.957516 77.882353  6.993464 15.803922

which are obviously column means.
However, I see a different error if I try to assign that back

 airquality$m - apply(airquality,2,mean)
Error in `$-.data.frame`(`*tmp*`, m, value = c(NA, NA, 
9.95751633986928,  :

  replacement has 6 rows, data has 153

Can't get much closer without a REPRODUCIBLE example.

--
   O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark  Ph:  (+45) 35327918
~~ - (p.dalga...@biostat.ku.dk)  FAX: (+45) 35327907

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] easiest way to extend and recompile a package?

2009-06-22 Thread Michael

Hi all,

I am thinking of extending a package by directly adding stuff to its
C++ code. And then I have to recompile the package.

Do I have to download the whole R source repository, in order to do
the recompilation? What is the minimal setup requirement for such a
recompilation?

I am using MSVC Express 2008 on Windows, is that okay?

Please give me some guidance. Thank you!

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Recursive partitioning algorithms in R vs. alia

2009-06-22 Thread jude.ryan

I have used all 3 packages for decision trees (SAS/EM, CART and R). As
another user on the list commented, the algorithms CART uses are
proprietary. I also know that since the algorithms are proprietary, the
decision tree that you get from SAS is based on a slightly different
algorithm so as to not violate copyright laws. When I first started
using R (rpart) I benchmarked it (in terms of results obtained) for my
particular problem at the time against Salford Systems CART. R gave me
an identical tree with the splitting value being different in the 2nd or
3rd decimal place from what I recall. I did not have SAS/EM at that
particular company and so could not benchmark it. Salford Systems CART
does have additional types of splitting criteria such as towing etc.,
but again, these may be of value in certain types of problems. The
splitting criteria found in R are good enough. 

 

I do have SAS/EM right now but prefer R to SAS/EM since R can be
programmed and SAS/EM cannot. This may not be relevant for decision
trees but for neural networks, for example, if I want to build hundreds
of neural networks (since there are no variable selection methods for
neural networks) with different predictors and different number of
neurons, I can do this easily in R but cannot do this in SAS/EM. SAS/EM
does have a variable selection node but that is independent of the
neural network node, so, from what I understand, you have to select the
variables and then pass them to the neural network node.

 

In general, you get prettier output with CART and SAS/EM for trees.
However, there are packages in R that can give you prettier output than
rpart does. One GUI that you may want to explore, that works with R, is
Rattle. This builds trees, neural network, boosting, etc. and you can
see the generated R code as well.

 

In terms of handling large volumes of data, SAS/EM is probably the best.
However, if you have a 64 bit operating system with lots of RAM, and use
random sampling, R should suffice. It is debatable whether the extra
features like pretty output and variable importance is worth the huge
costs you have to pay for those products, unless you really need these
features. With R you can do what you want, and that is build a good
tree. From what I have read, variable importance measures can be biased
as they are affected by factors such as multicollinearity, variables
with many categories, etc., so their usefulness is questionable
(however, end-users may love them).

 

SAS/EM is by far the most expensive product, and Salford Systems CART is
pretty expensive as well. So depending on your needs, R may be good
enough or the best, because you can program it, and the latest
methodologies will always be implemented in R first. For comparisons of
the programming capabilities of SAS (macros) versus R you may want to
look at what Frank Harrell and Terry Thearneu (who wrote rpart) have to
say. Both are experts in SAS and R.

 

Hope this helps.

 

Jude

 

 

Carlos wrote:

 

Dear R-helpers,

 

I had a conversation with a guy working in a business intelligence

department at a major Spanish bank. They rely on recursive partitioning

methods to rank customers according to certain criteria. 

 

They use both SAS EM and Salford Systems' CART. I have used package R

part in the past, but I could not provide any kind of feature comparison

or the like as I have no access to any installation of the first two

proprietary products.

 

Has anybody experience with them? Is there any public benchmark

available? Is there any very good --although solely technical-- reason

to pay hefty software licences? How would the algorithms implemented in

rpart compare to those in SAS and/or CART?

 

Best regards,

 

Carlos J. Gil Bellosta

http://www.datanalytics.com http://www.datanalytics.com/ 

 

 

___
Jude Ryan
Director, Client Analytical Services
Strategy  Business Development
UBS Financial Services Inc.
1200 Harbor Boulevard, 4th Floor
Weehawken, NJ 07086-6791
Tel. 201-352-1935
Fax 201-272-2914
Email: jude.r...@ubs.com



Please do not transmit orders or instructions regarding a UBS 
account electronically, including but not limited to e-mail, 
fax, text or instant messaging. The information provided in 
this e-mail or any attachments is not an official transaction 
confirmation or account statement. For your protection, do not 
include account numbers, Social Security numbers, credit card 
numbers, passwords or other non-public information in your e-mail. 
Because the information contained in this message may be privileged, 
confidential, proprietary or otherwise protected from disclosure, 
please notify us immediately by replying to this message and 
deleting it from your computer if you have received this 
communication in error. Thank you. 

UBS Financial Services Inc. 
UBS International Inc. 
UBS Financial Services Incorporated of Puerto Rico 
UBS AG

 
UBS reserves the right to retain all messages.

[R] reading data in rjags

2009-06-22 Thread David Cross

I am trying to run jags.model and need the data read in a suitable 
format.  The package (rjags) documentation describes read.data, while 
the classic-bugs examples use read.jagsdata.  I am running R (R 
2.8.1) on a PowerBook G4 (Mac OS X 10.5.7), and my installation 
recognizes neither read command.  Am I missing something?


TIA,
David Cross
www.davidcross.us

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Convert ragged list to matrix

2009-06-22 Thread Kenneth Takagi

Hi,

 

I have a list made up of character strings with each item a different
length (each item is between 1and 6 strings long).  Is there a way to
convert a ragged list to a matrix such that each item is its own row?
Here is a simple example:

 

a=list();

a[[1]] = c(a, b, c);

a[[2]] = c(d, e);

a[[3]] = c(f, g, h, i);

 

I would like to convert the list to a matrix (or data frame) in this
form, with the missing entries NA or something similar:

 

 [,1] [,2] [,3] [,4]

[1,] a  b  c  NA

[2,] d  e  NA  NA

[3,] f  g  h  i

 

 

Any suggestions? 

 

Thanks!

 

 

Ken

kat...@psu.edu

 


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Convert ragged list to matrix

2009-06-22 Thread jim holtman

try this:

 a
[[1]]
[1] a b c
[[2]]
[1] d e
[[3]]
[1] f g h i
 # find max row length
 rowMax - max(sapply(a, length))
 # now create output matrix by lengthening rows
 do.call(rbind, lapply(a, function(x){
+ length(x) - rowMax
+ x
+ }))
 [,1] [,2] [,3] [,4]
[1,] a  b  c  NA
[2,] d  e  NA   NA
[3,] f  g  h  i



On Mon, Jun 22, 2009 at 7:15 PM, Kenneth Takagi kat...@psu.edu wrote:

 Hi,



 I have a list made up of character strings with each item a different
 length (each item is between 1and 6 strings long).  Is there a way to
 convert a ragged list to a matrix such that each item is its own row?
 Here is a simple example:



 a=list();

 a[[1]] = c(a, b, c);

 a[[2]] = c(d, e);

 a[[3]] = c(f, g, h, i);



 I would like to convert the list to a matrix (or data frame) in this
 form, with the missing entries NA or something similar:



 [,1] [,2] [,3] [,4]

 [1,] a  b  c  NA

 [2,] d  e  NA  NA

 [3,] f  g  h  i





 Any suggestions?



 Thanks!





 Ken

 kat...@psu.edu




[[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.htmlhttp://www.r-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Convert ragged list to matrix

2009-06-22 Thread Henrique Dallazuanna

Try this:

matrix(unlist(lapply(a, '[', 1:max(sapply(a, length, ncol = 4, byrow =
TRUE)

or

 do.call(rbind, lapply(a, '[', 1:max(sapply(a, length


On Mon, Jun 22, 2009 at 8:15 PM, Kenneth Takagi kat...@psu.edu wrote:

 Hi,



 I have a list made up of character strings with each item a different
 length (each item is between 1and 6 strings long).  Is there a way to
 convert a ragged list to a matrix such that each item is its own row?
 Here is a simple example:



 a=list();

 a[[1]] = c(a, b, c);

 a[[2]] = c(d, e);

 a[[3]] = c(f, g, h, i);



 I would like to convert the list to a matrix (or data frame) in this
 form, with the missing entries NA or something similar:



 [,1] [,2] [,3] [,4]

 [1,] a  b  c  NA

 [2,] d  e  NA  NA

 [3,] f  g  h  i





 Any suggestions?



 Thanks!





 Ken

 kat...@psu.edu




[[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




-- 
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40 S 49° 16' 22 O

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Convert ragged list to matrix

2009-06-22 Thread Kenneth Takagi

Thanks for all the feedback; I found a previous post covering this
question:

 

https://stat.ethz.ch/pipermail/r-help/2009-February/189232.html

 

Problem solved!

 

 

Kenneth Takagi

kat...@psu.edu

 



From: jim holtman [mailto:jholt...@gmail.com] 
Sent: Monday, June 22, 2009 7:24 PM
To: Kenneth Takagi
Cc: r-help@r-project.org
Subject: Re: [R] Convert ragged list to matrix



try this:



 a
[[1]]
[1] a b c

[[2]]
[1] d e

[[3]]
[1] f g h i

 # find max row length
 rowMax - max(sapply(a, length))
 # now create output matrix by lengthening rows
 do.call(rbind, lapply(a, function(x){
+ length(x) - rowMax
+ x
+ }))
 [,1] [,2] [,3] [,4]
[1,] a  b  c  NA  
[2,] d  e  NA   NA  
[3,] f  g  h  i 
 



On Mon, Jun 22, 2009 at 7:15 PM, Kenneth Takagi kat...@psu.edu wrote:

Hi,



I have a list made up of character strings with each item a different
length (each item is between 1and 6 strings long).  Is there a way to
convert a ragged list to a matrix such that each item is its own row?
Here is a simple example:



a=list();

a[[1]] = c(a, b, c);

a[[2]] = c(d, e);

a[[3]] = c(f, g, h, i);



I would like to convert the list to a matrix (or data frame) in this
form, with the missing entries NA or something similar:



[,1] [,2] [,3] [,4]

[1,] a  b  c  NA

[2,] d  e  NA  NA

[3,] f  g  h  i





Any suggestions?



Thanks!





Ken

kat...@psu.edu




   [[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.






-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Convert ragged list to matrix

2009-06-22 Thread Gabor Grothendieck

Try this:

 unname(t(do.call(cbind, lapply(a, ts
 [,1] [,2] [,3] [,4]
[1,] a  b  c  NA
[2,] d  e  NA   NA
[3,] f  g  h  i


On Mon, Jun 22, 2009 at 7:15 PM, Kenneth Takagikat...@psu.edu wrote:
 Hi,



 I have a list made up of character strings with each item a different
 length (each item is between 1and 6 strings long).  Is there a way to
 convert a ragged list to a matrix such that each item is its own row?
 Here is a simple example:



 a=list();

 a[[1]] = c(a, b, c);

 a[[2]] = c(d, e);

 a[[3]] = c(f, g, h, i);



 I would like to convert the list to a matrix (or data frame) in this
 form, with the missing entries NA or something similar:



     [,1] [,2] [,3] [,4]

 [1,] a  b  c  NA

 [2,] d  e  NA  NA

 [3,] f  g  h  i





 Any suggestions?



 Thanks!





 Ken

 kat...@psu.edu




        [[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] question about using _apply and/or aggregate functions

2009-06-22 Thread David Winsemius



On Jun 22, 2009, at 6:16 PM, Clifford Long wrote:


Hi David,

I appreciate the advice.  I had coerced 'list4' to as.list, but forgot
to specify list=() in the call to aggregate.  I made the correction,
and now get the following:


slope.mult = simarray[,1]
adj.slope.value = simarray[,2]
adj.slope.level = simarray[,2]
qc.run.violation = simarray[,5]
simarray.part = cbind(slope.mult, adj.slope.value,  
qc.run.violation, adj.slope.level)

list4 = as.list(simarray.part[,4])
agg.result = aggregate(simarray.part[,3], by=list(list4), FUN = mean)

Error in sort.list(unique.default(x), na.last = TRUE) :
 'x' must be atomic for 'sort.list'
Have you called 'sort' on a list?

... I'm not sure what this means that I've done wrong.  I did check
'list4' using is.list, and get TRUE back as an answer, so feel
that my mistake is some other fundamental aspect of R that I'm failing
to grasp.

To your note on 'tapply' ... I did try this as well (actually, tried
it first) with no initial success.  On your recommendation, I gave
tapply another go, and get something recognizable:

vtt = tapply(simarray.part[,3], simarray.part[,2], mean)


dim(vtt)

[1] 50

length(vtt)

[1] 50

vtt[1:5]

0.003132 0.006264 0.009396 0.012528  0.01566
   0.29 0.24 0.23 0.16 0.22

vtt[1]

0.003132
   0.29

vtt[[1]][1]

[1] 0.29


I see that the output stored in vtt has one dimension with
length=50.  But each place in vtt appears to hold two values.


Nope, that's just the output from an implicit print(vtt). vtt is an  
array with one row and an associated group of labels. If you doubt me  
(and I had some trouble with this myself) at that point, try  
is.matrix(vtt) and I predict will you get TRUE.



 I'll
admit that I'm not sure how to access/reference the entirety of all 50
values =  0.29  0.24  0.23  0.16  0.22 (and so on).  I don't appear to
be able to access/reference only what appears to be an embedded index
= 0.003132   0.006264   0.009396  etc.   What am I missing?


names(vtt)


 Is there
a reference that I need to re-read?


?tapply

Value
When FUN is present, tapply calls FUN for each cell that has any data  
in it. If FUN returns a single atomic value for each such cell (e.g.,  
functions mean or var) and when simplify is TRUE, tapplyreturns a  
multi-way array containing the values, and NA for the empty cells. 




I would like to be able to plot
one against the other.


plot(names(vtt), vtt)



Thanks again for taking the time outside of your day job for your
earlier reply!

Cliff


On Mon, Jun 22, 2009 at 11:28 AM, David Winsemiusdwinsem...@comcast.net 
 wrote:


On Jun 22, 2009, at 12:04 PM, Clifford Long wrote:


Hi R-list,

I'll apologize in advance for (1) the wordiness of my note (not sure
how to avoid it) and (2) any deficiencies on my part that lead to my
difficulties.

I have an application with several stages that is meant to simulate
and explore different scenarios with respect to product sales (in
units sold per month).  My session info is at the bottom of this  
note.


The steps include (1) an initial linear regression model, (2)  
building
an ARIMA model, (3) creating 12 months of simulated 'real' data -  
for

various values of projected changes in slope from the linear
regression - from the ARIMA model using arima.sim with subsequent
undifferencing and appending to historical data, (3) one-step-ahead
forecasting for 12 months using the 'fixed' arima model and  
simulated

data, (4) taking the residuals from the forecasting (simulated minus
forecast for each of the 12 months) and applying QC charting, and  
(5)

counting the number (if any) of runs-rules violations detected.

The simulation-aspect calculates 100 simulations for each of 50  
values of

slope.

All of this seems to work fine (although I'm sure that the coding
could be improved through functions, vectorization, etc. ... ).
However, the problem I'm having is at the end where I had hoped that
things would be easier.  I want to summarize and graph the  
probability

of detecting a runs-rule violation vs. the amount of the shift in
slope (of logunit sales).

The output data array passed from the qcc section at the end  
includes:

 - the adjustment made to the slope (a multiplier)
 - the actual value of the slope
 - the iteration number of the simulation loop (within each value of
slope)
 - the count of QC charting limits violations
 - the count of QC charting runs rules violations


My code is in the attached file (generic_code.R) and my initial  
sales

data needed to prime the simulation is in the other attached file
(generic_data.csv).  The relevant section of my code is at the
bottom of the .R file after the end of the outer loop.  I've tried  
to

provide meaningful comments.

I've read the help files for _apply, aggregate, arrays and data  
types,

and have also consulted with several texts (Maindonald and Braun;
Spector; Venebles and Ripley for S-plus).  Somehow I still don't get
it.  My attempts usually result in a message like the

Re: [R] Calculating row standard deviations

2009-06-22 Thread David Winsemius



On Jun 22, 2009, at 6:19 PM, Mark Na wrote:


Hi R-helpers,

I have been struggling with calculating row and column statistics,
e.g. standard deviation.

I know that

datac$Mean-rowMeans(datac,na.rm=TRUE)

will give me row means.

I have tried to replicate those row means with the apply function:

datac$Mean2-apply(datac,2,mean)


so that I can replace the function argument with sd (instead of
mean) to get standard deviations.

But, I'm running into this error:


dim(datac)

[1]  17 271

datac$Mean2-apply(datac,2,mean)

Error in dimnames(x) - dn :
 length of 'dimnames' [2] not equal to array extent


If you are trying to create a group means value for each element in an  
array or data.frame then the function to use is ave with its default  
function is mean. Other functions can be used but that is not  
necessary here. You could try:


datac$Mean2-apply(datac,2,ave)





Can anyone see what I'm doing wrong?

Thanks!

Mark Na


David Winsemius, MD
Heritage Laboratories
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Automatically Adding Categories

2009-06-22 Thread Jeffrey Edgington


Greetings,

I have two files which contain responses to a series of multiple  
choice questions.  One
file contains responses before an intervention and the other  
contains the responses afterward.


There were three possible responses to each question: D, F, T (for  
Don't Know, False, and True).


I would like to try McNemar's test to determine if there was any  
significant difference between before

and after.

I read the files.  I create tables using:

firstQuestion - table( PreSurveyData$q1, PostSurveyData$q2)

for example.

The problem is that for several of the questions not all of the  
possible responses appear.  So I get

a table like this:

T
D   6
F   2
T   12

Which cannot be used in mcnemar.test because it is not a square table  
and does not have enough rows and columns.


Is there some way to specify that R should count the occurrences of  
D, F, and T even though they do not appear

in the Data?  Or some easy way to add the missing columns?

Thank you,

Jeffrey Edgington

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] question about using _apply and/or aggregate functions

2009-06-22 Thread David Winsemius



On Jun 22, 2009, at 7:55 PM, David Winsemius wrote:



On Jun 22, 2009, at 6:16 PM, Clifford Long wrote:


Hi David,

I appreciate the advice.  I had coerced 'list4' to as.list, but  
forgot
to specify list=() in the call to aggregate.  I made the  
correction,

and now get the following:


slope.mult = simarray[,1]
adj.slope.value = simarray[,2]
adj.slope.level = simarray[,2]
qc.run.violation = simarray[,5]
simarray.part = cbind(slope.mult, adj.slope.value,  
qc.run.violation, adj.slope.level)

list4 = as.list(simarray.part[,4])
agg.result = aggregate(simarray.part[,3], by=list(list4), FUN =  
mean)

Error in sort.list(unique.default(x), na.last = TRUE) :
'x' must be atomic for 'sort.list'
Have you called 'sort' on a list?

... I'm not sure what this means that I've done wrong.  I did check
'list4' using is.list, and get TRUE back as an answer, so feel
that my mistake is some other fundamental aspect of R that I'm  
failing

to grasp.

To your note on 'tapply' ... I did try this as well (actually, tried
it first) with no initial success.  On your recommendation, I gave
tapply another go, and get something recognizable:

vtt = tapply(simarray.part[,3], simarray.part[,2], mean)



snipped other stuff...





I would like to be able to plot
one against the other.


plot(names(vtt), vtt)


Or perhaps:

plot(as.numeric(names(vtt)), vtt)

--

David Winsemius, MD
Heritage Laboratories
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Automatically Adding Categories

2009-06-22 Thread Henrique Dallazuanna

Try this:

table(factor(pre, levels = c(D, F, T)),
factor(post, levels = c(D, F, T)))

On Mon, Jun 22, 2009 at 8:56 PM, Jeffrey Edgington jedgi...@du.edu wrote:

 Greetings,

 I have two files which contain responses to a series of multiple choice
 questions.  One
 file contains responses before an intervention and the other contains the
 responses afterward.

 There were three possible responses to each question: D, F, T (for Don't
 Know, False, and True).

 I would like to try McNemar's test to determine if there was any
 significant difference between before
 and after.

 I read the files.  I create tables using:

 firstQuestion - table( PreSurveyData$q1, PostSurveyData$q2)

 for example.

 The problem is that for several of the questions not all of the possible
 responses appear.  So I get
 a table like this:

T
 D   6
 F   2
 T   12

 Which cannot be used in mcnemar.test because it is not a square table and
 does not have enough rows and columns.

 Is there some way to specify that R should count the occurrences of D, F,
 and T even though they do not appear
 in the Data?  Or some easy way to add the missing columns?

 Thank you,

 Jeffrey Edgington

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




-- 
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40 S 49° 16' 22 O

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

1 2 >

1 - 100 of 125 matches

Mail list logo