[R] Logistic and Linear Regression Libraries

2009-10-30 Thread tdm

Hi all,

I'm trying to discover the options available to me for logistic and linear
regression. I'm doing some tests on a dataset and want to see how different
flavours of the algorithms cope. 

So far for logistic regression I've tried glm (in the stats package) and
lrm (in the Design package) and found there is a big difference. Is there a
list anywhere of the available options that details the specific algorithm
each one uses?
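
For reference, a minimal comparison on simulated data (made up here, since
the original dataset isn't shown); both functions fit the logistic model by
maximum likelihood, so their coefficients should agree to numerical
tolerance:

set.seed(1)                                    # toy data, 100 observations
d <- data.frame(x = rnorm(100))
d$y <- rbinom(100, 1, plogis(0.5 + d$x))
coef(glm(y ~ x, family = binomial, data = d))  # IRLS, in stats
library(Design)                                # lrm(); later renamed rms
coef(lrm(y ~ x, data = d))                     # lrm's own MLE routine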

Thanks in advance,

Phil





[R] a publication quality table for summary.zeroinfl()

2009-10-30 Thread Chris Fowler
I will swallow my pride and post to this list for the second time in 24 
hours--I have a paper due to a reviewer and I am desperate.


Has anybody written code to move the output from "summary()" called on 
the results of a zeroinfl() model (from the pscl package) into a form 
suitable for publication?


 When I hit send on this message I am going to begin hand typing stars 
into a spreadsheet.


The problem is that the zero-inflated model has two parts, a count portion 
and a zero portion: their coefficients are stored in separate arrays, and 
there is a Log(theta) term, kept in a completely separate place in the 
summary structure, that also needs to be included. As a result, the 
functions I have found for outputting summaries of other linear models all 
give error messages when attempted on this summary object.


My ideal would be to gather the information onto the clipboard so I 
could paste it into Excel and do the formatting there, but any approach 
would be better than what I have now.
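
A minimal sketch of one route (assuming, as in current pscl, that the
summary keeps its two coefficient tables under $coefficients$count and
$coefficients$zero; with dist = "negbin" the Log(theta) row is already part
of the count table):

library(pscl)
data("bioChemists", package = "pscl")   # stand-in for the real data
fit <- zeroinfl(art ~ fem + mar + kid5 | 1, data = bioChemists,
                dist = "negbin")
s <- summary(fit)
tab <- rbind(cbind(part = "count", as.data.frame(s$coefficients$count)),
             cbind(part = "zero",  as.data.frame(s$coefficients$zero)))
# "clipboard" works on Windows; Mac users can write to pipe("pbcopy", "w")
write.table(tab, "clipboard", sep = "\t", col.names = NA)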


Thanks,

Chris



Re: [R] Multiple linear regression with constraint (imposition)

2009-10-30 Thread Ravi Varadhan

Simply do this:

lm(y ~ I(X1 - X2))


Ravi.


Ravi Varadhan, Ph.D.
Assistant Professor,
Division of Geriatric Medicine and Gerontology
School of Medicine
Johns Hopkins University

Ph. (410) 502-2619
email: rvarad...@jhmi.edu


- Original Message -
From: "CE.KA" 
Date: Friday, October 30, 2009 7:48 pm
Subject: Re: [R] Multiple linear regression with constraint (imposition)
To: r-help@r-project.org


> Sorry there was a mistake:
> 
> Hi R users
> 
> I want to do a multiple linear regression with R
> 
> With a normal model (Y=Cste+A1*X1+A2*X2) the program would be
> lm(Y~X1+X2)
> 
> My model is Y=Cste+A1*X1+A2*X2 with the constraint A1=-A2
> What is the program for such a model?
> 
> Best regards
> 
> 


[R] Plots with k-means

2009-10-30 Thread Iuri Gavronski
Hi,

I'm doing a k-means cluster with 6 clusters and 15 variables. Any
suggestions on how to plot the results?

I've tried the standard xy plot, but couldn't get much of it.
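
One common approach, sketched here on made-up data since the real data
isn't shown, is to project the observations onto the first two principal
components and colour the points by cluster:

set.seed(1)
x <- matrix(rnorm(600 * 15), ncol = 15)   # stand-in for the 15 variables
km <- kmeans(x, centers = 6)
pc <- prcomp(x, scale. = TRUE)
plot(pc$x[, 1:2], col = km$cluster, pch = 20,
     xlab = "PC1", ylab = "PC2", main = "k-means clusters")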

Thanks in advance,

Iuri.



[R] Formula with in memory data.frame

2009-10-30 Thread rkevinburton
I have an array of data.frame(s) that I would like to smooth with loess one at 
a time. The array is master, and the two variables I am interested in are 
Period and Quantity. So my first attempt at calling loess is:

loess(Quantity ~ Period, master[[i]])

But I get the following error:

Error: NA/NaN/Inf in foreign function call (arg 2)
In addition: Warning message:
NAs introduced by coercion 

I did a str() on the array element, str(master[[i]]):

'data.frame':   58 obs. of  3 variables:
 Factor w/ 41 levels "10\" Plates",..: 1 1 1 1 1 1 1 1 1 1 ...
 $ Period  : POSIXct, format: "0001-01-20" "0002-01-20" ...
 $ Quantity: int  0 0 0 0 0 0 0 0 0 0 ...

So am I doing something wrong?
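
A hedged guess at a fix (object names follow the post): loess() expects a
numeric predictor, and Period is POSIXct, so convert it first.

df <- master[[i]]
df$Period <- as.numeric(df$Period)    # seconds since the epoch
fit <- loess(Quantity ~ Period, data = df)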

Thank you.

Kevin



[R] Stratified Maximum Likelihood

2009-10-30 Thread Noah Silverman

Hi,

I've searched rseek.org high and low and can't seem to find an answer to this.

I want to maximize likelihood for a set of training data, but the data 
is grouped.  (Think multiple trials.)


It would probably be possible to do this manually with some nested for 
loops, but it would be painfully slow.


The general formula is this...  (Please excuse my notation, but I can't 
write proper math formulas in an email.)


L(a) = prod over trials t of [ exp(a * X_chosen,t) / sum_{j in t} exp(a * X_j) ]

As you can see, a normal logLik function will lose all the group structure.  
This seems like a common enough application that there must be some easy 
function in R.
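
For what it's worth, this is the conditional (grouped) logit likelihood,
which survival::clogit fits directly. A loop-free sketch of the negative
log-likelihood, with hypothetical names: x is the predictor matrix, group
the trial id, and chosen a logical marking the chosen row within each trial.

negll <- function(a, x, group, chosen) {
    u <- as.vector(x %*% a)
    denom <- ave(exp(u), group, FUN = sum)   # per-trial sum of exp(a * X)
    -sum(u[chosen] - log(denom[chosen]))     # minus log of each chosen share
}
# optim(rep(0, ncol(x)), negll, x = x, group = group, chosen = chosen)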


THEN, just to complicate things, I need to run a second logLik with some 
trickier data: there are 14 variables, and I need to adjust them all to 
find the maximum likelihood from a formula (the Kelly criterion).


Any suggestions would be gratefully appreciated.

Thanks!



[R] Johnson-Neyman procedure (ANCOVA)

2009-10-30 Thread Stropharia

Dear R users,

Does anyone know of a package that implements the Johnson-Neyman procedure -
for testing differences among groups when the regression slopes are
heterogeneous (in an ANCOVA model)? I did not get any matches for functions
when searching using R Site Search.

If it has not been implemented in R, does anyone know of another statistical
package that has this procedure? I found very old scripts available for
Mathematica, but have had difficulty getting them to work.

Thanks in advance.

best,

Steve


Re: [R] possible memory leak in predict.gbm(), package gbm ?

2009-10-30 Thread Ridgeway, Greg
All of the memory allocations for predictions use R's allocVector(). See
gbm_pred in gbmentry.cpp. That is, R's memory manager is doing all the
work. gbm_pred is not allocating memory separate from R. It's just
creating R objects within R that can be deleted or garbage collected.

 

Make sure your gbm object does not have a "Terms" component. I don't
think it should if you called gbm.fit() directly. Without a Terms
component it will not run model.frame() and shouldn't really take so
much memory.
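
A quick check along these lines, with fit standing in for the fitted object:

is.null(fit$Terms)   # expect TRUE for a model built with gbm.fit() directly
# pred <- predict(fit, newdata = newX, n.trees = 500)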

 

If you want to experiment with the latest development version you can
find it at https://r-forge.r-project.org/projects/gbm/

 

Greg

 



From: Markus Loecher [mailto:markus.loec...@gmail.com] 
Sent: Friday, October 30, 2009 6:17 AM
To: r-help@r-project.org
Cc: Ridgeway, Greg
Subject: possible memory leak in predict.gbm(), package gbm ?



Dear gbm users,
When running predict.gbm() on a "large" dataset (150,000 rows, 300
columns, 500 trees), I notice that the memory used by R grows beyond
reasonable limits; my 14GB of RAM are often not sufficient. I am
interpreting this as a memory leak, since there should be no reason for
memory needs to expand once the data are loaded and passed to
predict.gbm().


Running R version 2.9.2 on Linux, gbm package 1.6-3.

Thanks,

Markus


__




Re: [R] Multiple linear regression with constraint (imposition)

2009-10-30 Thread David Winsemius


On Oct 30, 2009, at 7:47 PM, CE.KA wrote:



Sorry there was a mistake:


I could not see what was different.



Hi R users

I want to do a multiple linear regression with R

With a normal model (Y=Cste+A1*X1+A2*X2) the program would be
lm(Y~X1+X2)

My model is Y=Cste+A1*X1+A2*X2 with the constraint A1=-A2
What is the program for such a model?



Try calculating X3 = X2 - X1 and estimating the shared parameter with:

lm(Y ~ X3)

or instead:

lm(Y ~ I(X2 - X1))

--

David Winsemius, MD
Heritage Laboratories
West Hartford, CT



Re: [R] exact string match?

2009-10-30 Thread Linlin Yan
How about using operator ==
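
For example (toy vectors; == and match() are both exact, and %in%
vectorizes the lookup across two columns):

c("a", "animal") == "a"        # TRUE FALSE -- no partial matching
match("a", c("animal", "a"))   # 2: found by exact equality only
df1 <- data.frame(s = c("a", "b")); df2 <- data.frame(s = c("b", "c"))
df1$s %in% df2$s               # FALSE TRUE: compare across data frames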

On Sat, Oct 31, 2009 at 5:00 AM, bamsel  wrote:
>
> Dear R users:
> I need to compare character strings stored in 2 separate data frames. I need
> an exact match, so finding "a" in "animal" is no good.
> I've tried regexpr, match, and grepl, but to no avail.
> Anybody know how to accomplish this?
> Ben



Re: [R] Multiple linear regression with constraint (imposition)

2009-10-30 Thread CE.KA

Sorry there was a mistake:

Hi R users

I want to do a multiple linear regression with R

With a normal model (Y=Cste+A1*X1+A2*X2) the program would be
lm(Y~X1+X2)

My model is Y=Cste+A1*X1+A2*X2 with the constraint A1=-A2
What is the program for such a model?

Best regards




Re: [R] Help with RGDAL (SOLVED)

2009-10-30 Thread David Winsemius
I had earlier downloaded what I thought to be the most recent version
of rgdal (rgdal_0.6-18.tgz). After checking that all of the other
dependencies in the configure files were met, I finally got around to
checking the version of rgdal and found my problem: it was different
from (and not as recent as) the target that your code was expecting.
Downloaded version 0.6-20 and ... success. Your help is greatly
appreciated.


-- David


On Oct 30, 2009, at 6:16 PM, cls59 wrote:

[cls59's reply, quoted here in full in the original message, appears
unabridged as "Re: [R] Help with RGDAL" later in this digest.]

[R] Multiple linear regression with constraint (imposition)

2009-10-30 Thread CE.KA


Hi R users

I want to do a multiple linear regression with R

With a normal model (Y=Cste+A1*X1+A2*X2) the program would be
lm(Y ~X2+X2)

My model is Y=Cste+A1*X1+A2*X2 with the constraint A1=-A2
What is the program for such a model?

Best regards



Re: [R] Help with RGDAL

2009-10-30 Thread cls59


Pablo Alvarez-2 wrote:
> 
> 
> Hello, 
> We (two mac users) have been attempting to install rgdal from
> "http://www.kyngchaos.com/software:frameworks";, given that it is not
> available as a binary on the CRAN (binaries) of the "Package Installer". 
> 
> 

The GDAL library contains an impressive collection of utilities and also
leverages several other libraries such as HDF, netCDF and PROJ4, just to
name a few. Building and distributing a binary R package that contains the
GDAL library bloats the size of the package and causes redundancies if a
user already has GDAL installed somewhere else on their system. It also
creates a lot of extra work for the package maintainer.

This is probably done for Windows because a run-of-the-mill Windows system
is utterly incapable of building anything from source. Linux and OS X
systems are perfectly capable, so it is expected that you can build the
package from source and use your installed version of GDAL.



Pablo Alvarez-2 wrote:
> 
> 
> I have also tried to solve this problem by looking on the net for an old
> question, and though I have found it, the answers do not help very much
> because our programming skills are embryonic. 
> 
> 
> Our GIS professor has suggested we ask you guys how to proceed with the
> CRAN (source). Any suggestions or direction to an old document dealing
> with this?
> 
> Thank you very much for your help, 
> Pablo and Margarita
> PS. We have already installed: GDAL (for QGIS) and "sp" (for R)
> 
> 

I've built the GDAL library and installed rgdal a few times-- it's typically
one of the first things I do after wiping or reinstalling my OS X system or
a Linux system. Building software from source can be a tricky business with
a steep learning curve-- but once you figure out some of the patterns it
becomes quite doable (as long as those darn platform-specific issues don't
cut you off at the kneecaps).

The rgdal package contains a Configure script that is run during
installation and attempts to locate the GDAL library. When building software
on Unix/Linux, there are three kinds of folders containing various items of
interest that compilers and linkers will frequently pester you about. These
are:

  * "lib" folders-- these contain libraries that hold code you are trying to
use.

  * "include" folders-- these contain header files that tell the compiler
how to use 
 the code in the libraries.

  * "bin" folders-- these contain scripts and programs that may help the
Configure script,
 or the program you are building do the things they do.

I suspect the Configure script for the rgdal package only checks standard
Unix/Linux folders such as /usr/bin or /usr/local/lib. The Framework package
provided by William Kyngesburye installs GDAL Mac-Style, which means these
folders are located in:

  /Library/Frameworks/GDAL.framework/unix/

And 

  /Library/Frameworks/PROJ.framework/unix/

Fortunately, R allows us to pass hints to configure during package
installation in order to help it find what it is looking for.

So, grab the source tarball of rgdal from:

  http://cran.r-project.org/web/packages/rgdal/index.html

I'm also using the PROJ4 and GDAL frameworks installed by the GDAL Complete
framework, available from:

  http://www.kyngchaos.com/software:frameworks/

Also, make sure you have installed the "Developer Tools" contained on your
OS X System Software CD as these provide C compilers and other tools needed
to build the code in the rgdal package.

Now pop open a Terminal. Assuming you saved the package in your Downloads
folder, execute the following.

  cd ~/Downloads
  R CMD INSTALL rgdal_0.6-20.tar.gz

R will start to install the package, and then wipe out with a message that
includes:

  Error: gdal-config not found
  The gdal-config script distributed with GDAL could not be found.
  If you have not installed the GDAL libraries, you can
  download the source from  http://www.gdal.org/
  If you have installed the GDAL libraries, then make sure that 
  gdal-config is in your path. Try typing gdal-config at a
  shell prompt and see if it runs. If not, use:
  --configure-args='--with-gdal-config=/usr/local/bin/gdal-config'
  with appropriate values for your installation.


gdal-config is a little script whose sole purpose is to tell Configure
scripts whatever they want to know about a GDAL installation. Scripts and
programs that accompany a library are commonly installed in a "bin" folder.
So given what I've noted above about the install location used by the GDAL
framework, we can give rgdal's Configure script a hint:

  R CMD INSTALL \
    --configure-args="--with-gdal-config=/Library/Frameworks/GDAL.framework/unix/bin/gdal-config" \
    rgdal_0.6-20.tar.gz

The R installer now spits out another error:

  Error: proj_api.h not found.
  If the PROJ.4 library is installed in a non-standard location,
  use --configure-args='--with-proj-include=/opt/local/include'
  for example, replacing /opt/local/* with appropriate values
  for your installation.

Re: [R] possible to increase precision

2009-10-30 Thread Prof Brian Ripley

I think he means something like packages Rmpfr and gmp provide.

However, careful numerical analysis (and working with x-1 or log(x) 
for numbers very near one) means that such packages are rarely 
necessary (although they can be very convenient).


It all depends on the calculations: vague questions necessitate vague 
answers.
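
A minimal sketch of the arbitrary-precision route next to the log-scale
trick (the Rmpfr package must be installed from CRAN):

library(Rmpfr)
one <- mpfr(1, precBits = 240)
one - mpfr("1e-50", precBits = 240)   # still distinguishable from 1
log1p(-1e-50)                         # base R, accurate on the log scale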


On Fri, 30 Oct 2009, Ista Zahn wrote:


Hi Greg,
Others on this list are much more knowledgeable than I am, and I'm
sure they  will correct me if I'm wrong.

My understanding is that R uses double precision (see
http://en.wikipedia.org/wiki/Double_precision_floating-point_format),
which has the following implication:


x <- c(1,2,3)/10^308
x*10^308

[1] 1 2 3

x <- c(1,2,3)/10^309
x*10^309

[1] NaN NaN NaN

among others. See

RSiteSearch("floating point")

for other additional implications. I think you're out of luck if you
need to work with values smaller than about 1e-308 (the double-precision
underflow threshold).

-Ista

On Fri, Oct 30, 2009 at 5:04 PM, Greg Michaelson  wrote:

I would like to increase the precision/accuracy of R.  That is, I'm dealing
with numbers exceedingly close to 1, and I would like to increase the number
of significant digits used in computation.  In matlab, you can use vpa() to
accomplish this. I'm wondering if there's an equivalent or a workaround in
R.

Greg


--
Ista Zahn
Graduate student
University of Rochester
Department of Clinical and Social Psychology
http://yourpsyche.org


--
Brian D. Ripley,  rip...@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK            Fax:  +44 1865 272595


Re: [R] possible to increase precision

2009-10-30 Thread Duncan Murdoch

On 30/10/2009 5:04 PM, Greg Michaelson wrote:
I would like to increase the precision/accuracy of R.  That is, I'm  
dealing with numbers exceedingly close to 1, and I would like to  
increase the number of significant digits used in computation.  In  
matlab, you can use vpa() to accomplish this. I'm wondering if there's  
an equivalent or a workaround in R.


There's a package for extra precision, but you'd probably be better off 
just working with log(x).  That will give you about 300+ digits of 
precision near 1, i.e. log(1 + epsilon) is representable accurately down 
to epsilon=10^(-300).
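
A two-line illustration in base R:

eps <- 1e-300
1 + eps == 1    # TRUE: the direct sum rounds to 1 in double precision
log1p(eps)      # 1e-300: the log-scale value is still representable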


Duncan Murdoch



Re: [R] Efficient way to code using optim()

2009-10-30 Thread parkbomee

Thank you.

But I'd prefer using a hand-written likelihood function, which allows me more
flexible model specification; later on, I could have random parameters.
So I want to know if there is a more efficient way to write it so that I can
speed it up.
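
For the current model, one loop-free sketch of the objective (names follow
the original post: data, cum_count, predictor columns 4:35; as in the loop,
the chosen alternative is taken to be the first row of each choice
situation):

simple_logit_v <- function(theta) {
    u <- as.vector(as.matrix(data[, 4:35]) %*% theta)
    cs <- rep(seq_along(cum_count), diff(c(0, cum_count)))  # CS id per row
    denom <- rowsum(exp(u), cs)[, 1]           # one denominator per CS
    chosen <- c(1, head(cum_count, -1) + 1)    # row index of each chosen one
    -sum(u[chosen] - log(denom))
}
# out <- optim(rep(0, 32), simple_logit_v, method = "BFGS", hessian = TRUE)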


> Date: Fri, 30 Oct 2009 16:10:29 -0600
> To: bbom...@hotmail.com
> CC: r-help@r-project.org
> Subject: Re: [R] Efficient way to code using optim()
> From: gpet...@uark.edu
> 
> 
> Unless this is a homework problem, you would be much better off using
> glm(). 
> 
> Giovanni
> 
> > Date: Fri, 30 Oct 2009 12:23:45 -0700
> > From: parkbomee 
> > 
> > 
> > Hi all,
> >  
> > I am trying to estimate a simple logit model.
> > By using MLE, I am maximizing the log likelihood, with optim().
> > The thing is, each observation has different set of choice options, so I 
> > need a loop inside the objective function,
> > which I think slows down the optimization process.
> >  
> > The data is constructed so that each row represent the characteristics for 
> > one alternative,
> > and CS is a variable that represents choice situations. (say, 1 ~ Number of 
> > observations)
> > cum_count is the "cumulative" count of each choice situation, i.e. the 
> > number of available alternatives in each CS.
> > So I am maximizing the sum of [exp(U(chosen)) / sum(exp(U(all 
> > alternatives)))]
> >  
> > When I have 6-7 predictors, the running time is about 10 minutes, and it 
> > slows down exponentially as I have more predictors. (More theta's to 
> > estimate)
> > I want to know if there is a way I can improve the running time.
> > Below is my code..
> >  
> > simple_logit = function(theta){
> > realized_prob = rep(0, max(data$CS))
> > theta_multiple = as.matrix(data[,4:35]) %*% as.matrix(theta)
> > realized_prob[1] = exp(theta_multiple[1]) / 
> > sum(exp(theta_multiple[1:cum_count[1]]))
> > for (i in 2:length(realized_prob)){
> > realized_prob[i] = 
> > exp(theta_multiple[cum_count[(i-1)]+1]) / 
> > sum(exp(theta_multiple[((cum_count[(i-1)]+1):cum_count[i])]))
> > }
> > -sum(log(realized_prob))
> > }
> >  
> > initial = rep(0,32)
> > out33 = optim(initial, simple_logit, method="BFGS", hessian=TRUE)
> >  
> >  
> >  
> > Many thanks in advance!!! 


Re: [R] possible to increase precision

2009-10-30 Thread Ista Zahn
Hi Greg,
Others on this list are much more knowledgeable than I am, and I'm
sure they  will correct me if I'm wrong.

My understanding is that R uses double precision (see
http://en.wikipedia.org/wiki/Double_precision_floating-point_format),
which has the following implication:

> x <- c(1,2,3)/10^308
> x*10^308
[1] 1 2 3
> x <- c(1,2,3)/10^309
> x*10^309
[1] NaN NaN NaN

among others. See

RSiteSearch("floating point")

for other additional implications. I think you're out of luck if you
need to work with values smaller than about 1e-308 (the double-precision
underflow threshold).

-Ista

On Fri, Oct 30, 2009 at 5:04 PM, Greg Michaelson  wrote:
> I would like to increase the precision/accuracy of R.  That is, I'm dealing
> with numbers exceedingly close to 1, and I would like to increase the number
> of significant digits used in computation.  In matlab, you can use vpa() to
> accomplish this. I'm wondering if there's an equivalent or a workaround in
> R.
>
> Greg
>



-- 
Ista Zahn
Graduate student
University of Rochester
Department of Clinical and Social Psychology
http://yourpsyche.org



[R] Obtaining illogical results from posterior LDA-classification because of "too good" data?

2009-10-30 Thread Arne Schulz
Dear list,
my problem seems to be primarily a statistical one, but maybe there is a
misspecification within R (and hopefully a solution).

I have two groups with two measured variables as training data. On these
variables, the groups differ totally. I know that this is a very easy
situation, but the later analysis will use the same principle (aside from
more groups and more possible values). The example should be enough to show
my problem:
library(MASS)   # lda() comes from MASS
matrix <- matrix(rep(c(0,0,0,0,0,1,1,1,1,1),3), ncol = 3, byrow = FALSE)
matrix[,2:3] <- jitter(matrix[,2:3], .001)
lda <- lda(matrix[,2:3], matrix[,1], prior = c(5,5)/10)

I added some jitter to obtain a little within-group variance; the LDA would
fail otherwise. When trying to predict the probability of new values, I get
some strange results:
testmatrix <- matrix(c(0,0,1,1,0,1,1,0), ncol = 2, byrow = TRUE)
predict(lda,testmatrix)$posterior
> predict(lda,testmatrix)$posterior
 0 1
[1,] 1 0
[2,] 0 1
[3,] 0 1
[4,] 1 0

Rows 1 and 2 are about right, although the probabilities should be close to
1 rather than exactly 1. But rows 3 and 4 really bother me: the
probabilities should be .5 for every value. Additionally, the coefficients
seem to be way too high:
> lda[["scaling"]]
  LD1
[1,] 5835.805
[2,] 7000.393

When I insert 1 error per group, the results are quite right (jitter is not
needed in this case):
matrix <- matrix(rep(c(0,0,0,0,0,1,1,1,1,1),3), ncol = 3, byrow = FALSE)
matrix[3,2] <- c(1)
matrix[8,3] <- c(0)
lda <- lda(matrix[,2:3],matrix[,1], prior = c(5,5)/10)
predict(lda,testmatrix)$posterior
> predict(lda,testmatrix)$posterior
                0            1
[1,] 0.9996646499 0.0003353501
[2,] 0.0003353501 0.9996646499
[3,] 0.5000000000 0.5000000000
[4,] 0.5000000000 0.5000000000


My question is now: Is my data "too good" or did I make a mistake in my
code?


Best regards,
Arne Schulz



Re: [R] Efficient way to code using optim()

2009-10-30 Thread Giovanni Petris

Unless this is a homework problem, you would be much better off using
glm(). 

Giovanni

> Date: Fri, 30 Oct 2009 12:23:45 -0700
> From: parkbomee 
> Content-length: 1692
> 
> 
> Hi all,
>  
> I am trying to estimate a simple logit model.
> By using MLE, I am maximizing the log likelihood, with optim().
> The thing is, each observation has different set of choice options, so I need 
> a loop inside the objective function,
> which I think slows down the optimization process.
>  
> The data is constructed so that each row represent the characteristics for 
> one alternative,
> and CS is a variable that represents choice situations. (say, 1 ~ Number of 
> observations)
> cum_count is the "cumulative" count of each choice situation, i.e. the number 
> of available alternatives in each CS.
> So I am maximizing the sum of [exp(U(chosen)) / sum(exp(U(all alternatives)))]
>  
> When I have 6-7 predictors, the running time is about 10 minutes, and it 
> slows down exponentially as I have more predictors. (More theta's to 
> estimate)
> I want to know if there is a way I can improve the running time.
> Below is my code..
>  
> simple_logit = function(theta){
> realized_prob = rep(0, max(data$CS))
> theta_multiple = as.matrix(data[,4:35]) %*% as.matrix(theta)
> realized_prob[1] = exp(theta_multiple[1]) / 
> sum(exp(theta_multiple[1:cum_count[1]]))
> for (i in 2:length(realized_prob)){
> realized_prob[i] = 
> exp(theta_multiple[cum_count[(i-1)]+1]) / 
> sum(exp(theta_multiple[((cum_count[(i-1)]+1):cum_count[i])]))
> }
> -sum(log(realized_prob))
> }
>  
> initial = rep(0,32)
> out33 = optim(initial, simple_logit, method="BFGS", hessian=TRUE)
>  
>  
>  
> Many thanks in advance!!!   



Re: [R] Summing identical IDs

2009-10-30 Thread PDXRugger

David,
  You are correct.  I think the first two assumptions can be thrown out and
only the latter two (c, d) considered.  So how would I combine Acres
for matching Bldgids based on assumptions c and d?
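
One possibility, inferred only from the example tables quoted below (so
treat it as a guess at the intended rule): always sum Acres within Bldgid,
but drop duplicated (Year, ImpValue) pairs before summing ImpValue, so that
identical appraisals are counted once.

agg <- do.call(rbind, lapply(split(DF, DF$Bldgid), function(d) {
    keep <- !duplicated(d[c("Year", "ImpValue")])   # unique appraisals only
    data.frame(Acres = sum(d$Acres), Bldgid = d$Bldgid[1],
               Year = min(d$Year), ImpValue = sum(d$ImpValue[keep]))
}))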



David Winsemius wrote:
> 
> 
> On Oct 29, 2009, at 5:23 PM, PDXRugger wrote:
> 
>>
>> Terrific help thank you.
>> dupbuild<-aggregate(DF$Acres, list(Bldgid), sum)
>> This line worked best.
>>
>> Now im going to challenge everyone (i think?)
>>
>> Consider the following:
>>
>>
>> Acres<-c(100,101,100,130,156,.5,293,300,.09,100,12.5)
>> Bldgid<-c(1,2,3,4,5,5,6,7,7,8,8)
>> Year<-c(1946,1952,1922,1910,1955,1955,1999,1990,1991,2000,2000)
>> ImpValue<-c(1000,1400,1300,900,5000,1200,500,1000,300,1000,1000)
>> DF=cbind(Acres,Bldgid,Year,ImpValue)
>> DF<-as.data.frame(DF)
>>
>> I would like to do the same, except there are some rules I want to  
>> follow.
>> I only want to aggregate the Acres if:
>> a) The Years are not identical
>> b) The ImpValues are not identical
>> c) The Years are identical and the ImpValue are not
>> d)The ImpValues are identical and the Years are not
> 
> As I review your Boolean logic, I run into serious problems.
> 
> c) and d) cannot be true if a and b) are true.
> 
> So no cases satisfy all 4 specs. In particular both of the pairs you  
> say you want aggregated (5+6) and 10+11) violate rule a) and the  
> second pair also violates b).
> 
> -- 
> David
>>
>> but if the Acres and ImpValues are identical I would still like to  
>> add the
>> Acres together and form one case.
>> If the cases are put together I would also like to add the ImpValues
>> together.  So the below
>>
>>    Acres Bldgid Year ImpValue
>> 1  100.00      1 1946     1000
>> 2  101.00      2 1952     1400
>> 3  100.00      3 1922     1300
>> 4  130.00      4 1910      900
>> 5  156.00      5 1955     5000
>> 6    0.50      5 1955     1200
>> 7  293.00      6 1999      500
>> 8  300.00      7 1990     1000
>> 9    0.09      7 1991      300
>> 10 100.00      8 2000     1000
>> 11  12.50      8 2000     1000
>>
>> would become
>>
>>    Acres Bldgid Year ImpValue
>> 1  100.00      1 1946     1000
>> 2  101.00      2 1952     1400
>> 3  100.00      3 1922     1300
>> 4  130.00      4 1910      900
>> 5  156.50      5 1955     6200
>> 7  293.00      6 1999      500
>> 8  300.09      7 1990     1300
>> 10 112.50      8 2000     1000
>>
>> Thanks, i gave it a bunch of shots but nothing worth posting.
>>
>>
>>
>>
>>
>> PDXRugger wrote:
>>>
>>> Hello All,
>>>   I would like to select records with identical IDs then sum an  
>>> attribute
>>> then and return them to the data frame as a single record.  Please
>>> consider
>>>
>>>
>>> Acres<-c(100,101,100,130,156,.5,293,300,.09)
>>> Bldgid<-c(1,2,3,4,5,5,6,7,7)
>>>
>>> DF=cbind(Acres,Bldgid)
>>> DF<-as.data.frame(DF)
>>>
>>> So that:
>>>
>>>  Acres Bldgid
>>> 1 100.00  1
>>> 2 101.00  2
>>> 3 100.00  3
>>> 4 130.00  4
>>> 5 156.00  5
>>> 6   0.50  5
>>> 7 293.00  6
>>> 8 300.00  7
>>> 9   0.09  7
>>>
>>> Becomes
>>>
>>>  Acres Bldgid
>>> 1 100.00  1
>>> 2 101.00  2
>>> 3 100.00  3
>>> 4 130.00  4
>>> 5 156.50  5
>>> 7 293.00  6
>>> 8 300.09  7
>>>
>>> dup<-unique(DF$Bldgid[duplicated(Bldgid)])
>>> dupbuild<-DF[DF$Bldgid %in% dup,]
>>> dupbuild..dupareasum<-sum(dupbuild$Acres[duplicated(dupbuild$Bldgid)])
>>>
>>> This sums the unique IDs of the duplicated records, not what I want.
>>> Thanks ahead of time
>>>
>>> JR
>>>
>>>
>>>
>>
> 
> David Winsemius, MD
> Heritage Laboratories
> West Hartford, CT
> 
> 
> 



[R] possible to increase precision

2009-10-30 Thread Greg Michaelson
I would like to increase the precision/accuracy of R.  That is, I'm  
dealing with numbers exceedingly close to 1, and I would like to  
increase the number of significant digits used in computation.  In  
matlab, you can use vpa() to accomplish this. I'm wondering if there's  
an equivalent or a workaround in R.


Greg



[R] NA values in Standard Error for zeroinfl()

2009-10-30 Thread Chris Fowler
I am fitting a model using zeroinfl() and it runs without errors, 
returning results that are generally consistent with my hypotheses.


One of my variables is percent black (pblack). This variable was highly 
significant in some of the other count models I ran on the way to my 
current formulation, but it is not significant in this model. As such I 
decided to try adding pblack^2 to the model to see whether the effect 
might be captured there (if it matters, I created a new variable in my data 
frame called pblacksquared and added it to the count portion of the 
model only).


zeroinfl() runs just fine and pblack becomes significant and moves in 
the appropriate direction once the new variable is added to the model. 
However, pblacksquared is not significant, and while a coefficient is 
generated, I get NA values for Std. Error, z value and Pr(>|z|). pblack 
and pblacksquared are both positive and continuous with no missing values.


I can't find anything in the documentation about returning an NA value 
here. Any suggestions as to where it comes from and how I might 
interpret/remedy it?


Thanks,

Chris





[R] exact string match?

2009-10-30 Thread bamsel

Dear R users:
I need to compare character strings stored in 2 separate data frames. I need
an exact match, so finding "a" in "animal" is no good. 
I've tried regexpr, match, and grepl, but to no avail. 
Anybody know how to accomplish this?
Ben


Re: [R] opacity under dispersion command under plotrix

2009-10-30 Thread David Winsemius


On Oct 30, 2009, at 3:49 PM, Joe King wrote:


Heres my code:

era1 <- seq(1.5,5,.25)
plot(era1,yhyp1$pe, xlab="ERA", ylab="Probability of Winning Cy
Young",type="l", col="black", lwd=2)


This plots.

library(plotrix)  # should go here.


dispersion(era1,yhyp1$pe,y1upper,y1lower,type="l",
fill="blue",arrow.cap=0.01,intervals=FALSE)#Requres plotrix


This throws an error even with plotrix loaded.


dispersion(era1,yhyp2$pe,y2upper,y2lower,type="l",
fill="lightcyan",arrow.cap=0.01,intervals=FALSE)#Requres plotrix
dispersion(era1,yhyp3$pe,y3upper,y3lower,type="l",
fill="gray97",arrow.cap=0.01,intervals=FALSE)#Requres plotrix


Try using the rgb function to create transparent colors:

>  rgb(1,0,0,0.3)
[1] "#FF4D"


lines(era1, yhyp1$pe, col="#FF00004D", lwd=5)


The above _does_ plot with transparent color, but the below will not  
without data.



lines(era1,yhyp2$pe,col="blue", lwd=5)
lines(era1,yhyp3$pe,col="red", lwd=5)



After a bunch of cutting and pasting of the yhyp1 data, I still get
errors, because y1upper and y1lower are not defined (nor are their values
implied by what you have written).


Learn to use dput and review your code to see what variables are  
undefined.


--
David

my raw data is too much to add, as I have coefficients from a model run
and a lot of code before that, but the matrices for the above variables I
am trying to plot are

yhyp1$pe

[1] 0.91938328 0.88005171 0.82235810 0.74124787 0.63520716 0.51090516
[7] 0.38417025 0.27254070 0.18569395 0.12375682 0.08183003 0.05420338
[13] 0.03619331 0.02446051 0.01677548


yhyp1$upper

[,1]
up 0.98470376
up 0.96729674
up 0.93342185
up 0.87016432
up 0.77356024
up 0.63337489
up 0.49704477
up 0.39523107
up 0.32380653
up 0.26340188
up 0.21370693
up 0.17113476
up 0.13810301
up 0.9516
up 0.08930777


yhyp1$lower

  [,1]
low 0.758338075
low 0.704684752
low 0.636448570
low 0.569633622
low 0.485986148
low 0.390656111
low 0.276924551
low 0.169273771
low 0.092501362
low 0.048176412
low 0.023014862
low 0.010788438
low 0.005082446
low 0.002250877
low 0.001063545

Please forgive my poor posting manners.

Joe King
206-913-2912
j...@joepking.com
"Never throughout history has a man who lived a life of ease left a  
name

worth remembering." --Theodore Roosevelt


-Original Message-
From: David Winsemius [mailto:dwinsem...@comcast.net]
Sent: Friday, October 30, 2009 7:15 AM
To: Joe King
Cc: r-help@r-project.org
Subject: Re: [R] opacity under dispersion command under plotrix


On Oct 30, 2009, at 7:02 AM, Joe King wrote:


Is there any way to make the lines in the dispersion command come forward
in a plot, and to make the fill in the dispersion parameter transparent,
so all of the lines I am using to mark confidence intervals are shown?


Got code?


---

Joe King, M.A.

Ph.D. Student


--

David Winsemius, MD
Heritage Laboratories
West Hartford, CT



David Winsemius, MD
Heritage Laboratories
West Hartford, CT



[R] What should my.data.frame[c(3, 1)] <- list(NULL, NULL) evaluate to?

2009-10-30 Thread Kevin Wright
(R 2.9.2 on Windows XP)

Turns out that

 my.data.frame[c(1,3)] <- list(NULL, NULL)
and
 my.data.frame[c(3,1)] <- list(NULL, NULL)

are not the same.  Is this expected?

Example:

d0 <- structure(list(Year = c(2009L, 2009L, 2009L), Season = c(1L,
1L, 1L), Series = c(1L, 1L, 1L), Mst = structure(c(1L,
1L, 1L), .Label = "", class = "factor"), Yield = structure(c(1L,
1L, 1L), .Label = "", class = "factor")), .Names = c("Year",
"Season", "Series", "Mst", "Yield"), row.names = c(NA,
3L), class = "data.frame")

d1 <- d0
d1[c(3,1)] <- list(1:3, 4:6)
print(d1)

  Year Season Series Mst Yield
1    4      1      1
2    5      1      2
3    6      1      3


d1[c(3,1)] <- list(NULL, NULL)
print(d1)

  Season Series Yield
1      1      1
2      1      2
3      1      3

It appears that the above assignment deletes column 1 from the data frame,
then sequentially deletes column 3 (Mst, which was originally column 4
before column 1 was deleted).

Compare this to

d1 <- d0
d1[c(1,3)] <- list(NULL, NULL)
print(d1)
  Season Mst Yield
1      1
2      1
3      1
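
A sketch of an alternative that sidesteps the index-shifting surprise:
subset away the columns to drop in a single operation instead of assigning
NULLs.

d1 <- d0
d1 <- d1[-c(1, 3)]   # drop columns 1 and 3 at once; order cannot matter
print(d1)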


Kevin Wright



Re: [R] opacity under dispersion command under plotrix

2009-10-30 Thread Joe King
Heres my code:

era1 <- seq(1.5,5,.25)
plot(era1,yhyp1$pe, xlab="ERA", ylab="Probability of Winning Cy
Young",type="l", col="black", lwd=2)
dispersion(era1,yhyp1$pe,y1upper,y1lower,type="l",
fill="blue",arrow.cap=0.01,intervals=FALSE)#Requres plotrix
dispersion(era1,yhyp2$pe,y2upper,y2lower,type="l",
fill="lightcyan",arrow.cap=0.01,intervals=FALSE)#Requres plotrix
dispersion(era1,yhyp3$pe,y3upper,y3lower,type="l",
fill="gray97",arrow.cap=0.01,intervals=FALSE)#Requres plotrix
lines(era1,yhyp1$pe,col="black", lwd=5)
lines(era1,yhyp2$pe,col="blue", lwd=5)
lines(era1,yhyp3$pe,col="red", lwd=5)

my raw data is too much to add, as I have coefficients from a model run and a
lot of code before that, but the matrices for the above variables I am
trying to plot are
> yhyp1$pe
 [1] 0.91938328 0.88005171 0.82235810 0.74124787 0.63520716 0.51090516
 [7] 0.38417025 0.27254070 0.18569395 0.12375682 0.08183003 0.05420338
[13] 0.03619331 0.02446051 0.01677548

> yhyp1$upper
 [,1]
up 0.98470376
up 0.96729674
up 0.93342185
up 0.87016432
up 0.77356024
up 0.63337489
up 0.49704477
up 0.39523107
up 0.32380653
up 0.26340188
up 0.21370693
up 0.17113476
up 0.13810301
up 0.9516
up 0.08930777

> yhyp1$lower
   [,1]
low 0.758338075
low 0.704684752
low 0.636448570
low 0.569633622
low 0.485986148
low 0.390656111
low 0.276924551
low 0.169273771
low 0.092501362
low 0.048176412
low 0.023014862
low 0.010788438
low 0.005082446
low 0.002250877
low 0.001063545

Please forgive my poor posting manners.

Joe King
206-913-2912
j...@joepking.com
"Never throughout history has a man who lived a life of ease left a name
worth remembering." --Theodore Roosevelt


-Original Message-
From: David Winsemius [mailto:dwinsem...@comcast.net] 
Sent: Friday, October 30, 2009 7:15 AM
To: Joe King
Cc: r-help@r-project.org
Subject: Re: [R] opacity under dispersion command under plotrix


On Oct 30, 2009, at 7:02 AM, Joe King wrote:

> Is there any way to make the lines in the dispersion command come forward
> in a plot, and to make the fill in the dispersion parameter transparent,
> so all of the lines I am using to mark confidence intervals are shown?

Got code?

> ---
>
> Joe King, M.A.
>
> Ph.D. Student

-- 

David Winsemius, MD
Heritage Laboratories
West Hartford, CT



Re: [R] identify which commands have generated graphics?

2009-10-30 Thread Duncan Murdoch

On 10/30/2009 3:13 PM, Liviu Andronic wrote:

Dear all
Is it possible to programmatically detect which commands have
generated a graphic to the x11() graphics device? When in front of the
computer, it is easy to see that after a command---say, plot(1:10)---a
graphics window opens/activates and displays a graphic. But is there a
way to detect this from the command line?


You could probably detect changes using some combination of setHook() 
(to detect when plot.new() is called) and recordPlot() (to detect any 
changes to the current plot), but I don't think there's any easy way to 
do what you want.
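
A minimal sketch of the setHook() half of that idea: the "plot.new" hook
fires every time a new plot page is started on the current device.

setHook("plot.new", function() message("new plot page at ", Sys.time()))
plot(1:10)   # emits the message, so you know a plotting command ran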


Duncan Murdoch



Re: [R] Efficient way to code using optim()

2009-10-30 Thread parkbomee

Hi all,
 
I am trying to estimate a simple logit model.
Using MLE, I am maximizing the log likelihood with optim().
The thing is, each observation has a different set of choice options, so I need a
loop inside the objective function, which I think slows down the optimization
process.

The data is constructed so that each row represents the characteristics of one
alternative, and CS is a variable that indexes choice situations (say, 1 ~ number
of observations).
cum_count is the "cumulative" count of each choice situation, i.e. the number of
available alternatives in each CS.
So I am maximizing the sum of [exp(U(chosen)) / sum(exp(U(all alternatives)))]

When I have 6-7 predictors, the running time is about 10 minutes, and it slows
down exponentially as I have more predictors. (More theta's to estimate.)
I want to know if there is a way I can improve the running time.
Below is my code...
 
simple_logit <- function(theta) {
    realized_prob <- rep(0, max(data$CS))
    theta_multiple <- as.matrix(data[, 4:35]) %*% as.matrix(theta)
    realized_prob[1] <- exp(theta_multiple[1]) /
        sum(exp(theta_multiple[1:cum_count[1]]))
    for (i in 2:length(realized_prob)) {
        realized_prob[i] <- exp(theta_multiple[cum_count[i - 1] + 1]) /
            sum(exp(theta_multiple[(cum_count[i - 1] + 1):cum_count[i]]))
    }
    -sum(log(realized_prob))
}
 
initial = rep(0,32)
out33 = optim(initial, simple_logit, method="BFGS", hessian=TRUE)
 
 
 
Many thanks in advance!!! 


Re: [R] problems with re-loading exported data

2009-10-30 Thread jim holtman
Can you at least show us what you were doing?  It sounds like you were
trying to 'load' a text file, not something that was saved with
'save'.  Here is what I get:

> load('trackV3.r')
Error: bad restore file magic number (file may be corrupted) -- no data loaded
In addition: Warning message:
file 'trackV3.r' has magic number '# rea'
   Use of save versions prior to 2 is deprecated

Which is very similar to your message.  If you did a 'write.table',
then use 'read.table' to read it back in.
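
A minimal sketch of the two matched pairs:

x <- matrix(1:6, nrow = 2)
save(x, file = "x.RData")       # binary: read back with load("x.RData")
write.table(x, file = "x.txt")  # text: read back with read.table("x.txt")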

On Fri, Oct 30, 2009 at 2:11 PM, Marion Dumas  wrote:
> Hello,
> I have a data file originally as .mat file from matlab that I imported in R
> using the R.matlab package (readMat). This loaded a list containing amongst
> other things a 4x4 array of rainfall data that is quite heavy (around 40 Mb,
> it took 10 minutes to load on my computer). I tried to export it as R data
> file or simply a text file (using the command write) to share this data with
> other R users. When I open it in the text editor, I get a normal looking
> file with space-delimited numerical values. However, when I try to to load
> the resulting output file, I get the following error message:
>
>
> Error: bad restore file magic number (file may be corrupted) -- no data
> loaded
> In addition: Warning message:
> file 'outrainfall' has magic number '0 0 0'
>   Use of save versions prior to 2 is deprecated
>
> I would really appreciate your help, as this is a rather  important issue to
> clarify.
> Thanks
> Marion Dumas
>



-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?



Re: [R] Help with RGDAL

2009-10-30 Thread David Winsemius


On Oct 30, 2009, at 1:51 PM, Pablo Alvarez wrote:



Hello,
We (two mac users) have been attempting to install rgdal from
"http://www.kyngchaos.com/software:frameworks", given that it is not
available as a binary on the CRAN (binaries) of the "Package Installer".
I have also tried to solve this problem by looking on the net for an  
old question, and though I have found it, the answers do not help  
very much because our programming skills are embryonic.
Our GIS professor has suggested we ask you guys how to proceed with  
the CRAN (source). Any suggestions or direction to an old document  
dealing with this?

Thank you very much for your help,
Pablo and Margarita
PS. We have already installed: GDAL (for QGIS) and "sp" (for R)


There is a Mac-specific list to which this would be more appropriate.

The maintainer of that version of rgdal also suggests in the readme  
that accompanies the software that you can contact him for  
installation questions.


--

David Winsemius, MD
Heritage Laboratories
West Hartford, CT



[R] identify which commands have generated graphics?

2009-10-30 Thread Liviu Andronic
Dear all
Is it possible to programmatically detect which commands have
generated a graphic to the x11() graphics device? When in front of the
computer, it is easy to see that after a command---say, plot(1:10)---a
graphics window opens/activates and displays a graphic. But is there a
way to detect this from the command line?

Thank you
Liviu


-- 
Do you know how to read?
http://www.alienetworks.com/srtest.cfm
Do you know how to write?
http://garbl.home.comcast.net/~garbl/stylemanual/e.htm#e-mail

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] problems with re-loading exported data

2009-10-30 Thread Marion Dumas

Hello,
I have a data file originally as .mat file from matlab that I imported  
in R using the R.matlab package (readMat). This loaded a list  
containing amongst other things a 4x4 array of rainfall data that is  
quite heavy (around 40 Mb, it took 10 minutes to load on my computer).  
I tried to export it as R data file or simply a text file (using the  
command write) to share this data with other R users. When I open it  
in the text editor, I get a normal looking file with space-delimited  
numerical values. However, when I try to load the resulting output
file, I get the following error message:



Error: bad restore file magic number (file may be corrupted) -- no  
data loaded

In addition: Warning message:
file 'outrainfall' has magic number '0 0 0'
   Use of save versions prior to 2 is deprecated

I would really appreciate your help, as this is a rather important
issue to clarify.

Thanks
Marion Dumas

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Help with RGDAL

2009-10-30 Thread Pablo Alvarez

Hello, 
We (two Mac users) have been attempting to install rgdal from
"http://www.kyngchaos.com/software:frameworks", given that it is not available
as a binary on the CRAN (binaries) of the "Package Installer".
I have also tried to solve this problem by looking on the net for an old
question, and though I have found it, the answers do not help very much because
our programming skills are embryonic.
Our GIS professor has suggested we ask you guys how to proceed with the CRAN 
(source). Any suggestions or direction to an old document dealing with this?
Thank you very much for your help, 
Pablo and Margarita
PS. We have already installed: GDAL (for QGIS) and "sp" (for R)
  



[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] R strucchange question: recursive-based CUSUM

2009-10-30 Thread Achim Zeileis

Julia:

I'm now trying to apply the package strucchange to see whether there is a
structural change in a linear regression. I have noted the following problem
that arises in my case with the recursive-based CUSUM: the generic function
recresid() in efp() generates an error, since (probably) it cannot compute
the inverse of the cross-product matrix X_{(i-1)}' X_{(i-1)} at each step
i-1, because that matrix does not have full rank for all i (X consists of
dummy variables). Does a solution to this problem exist (for example,
replacing the ordinary inverse by the generalised inverse, ginv())?


The 1-step-ahead prediction error is well-defined even if there are rank 
deficiencies. For example, using lm.fit() will automatically alias 
coefficients that are not identified. The reason why recresid() doesn't 
use this is that it employs a more efficient updating algorithm.


If you need to investigate the recursive CUSUM test, you could hack 
recresid() and use the slower but more robust implementation based on 
lm.fit().

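For reference, a minimal sketch of that slower but rank-tolerant route
(not the strucchange internals; X is the regressor matrix, y the
response, and MASS::ginv() supplies the generalised inverse for the
scaling term, as the poster suggested):

library(MASS)   # for ginv()

recresid.robust <- function(X, y) {
    n <- nrow(X); k <- ncol(X)
    w <- numeric(n - k)
    for (i in (k + 1):n) {
        Xi  <- X[seq_len(i - 1), , drop = FALSE]
        fit <- lm.fit(Xi, y[seq_len(i - 1)])
        b   <- fit$coefficients
        b[is.na(b)] <- 0                     # aliased coefficients drop out
        fe  <- y[i] - sum(X[i, ] * b)        # 1-step-ahead prediction error
        v   <- 1 + drop(X[i, ] %*% ginv(crossprod(Xi)) %*% X[i, ])
        w[i - k] <- fe / sqrt(v)
    }
    w
}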

Personally, however, I would recommend to use a different test. In most 
situations (unless the break occurs very early in the sample), there are 
more powerful methods than the recursive CUSUM test.


hth,
Z

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Multidimensional scaling on a correlation matrix

2009-10-30 Thread Hollix

Hi there,

when conducting a multidimensional scaling analysis, is it possible to do
this on a correlation matrix?

I calculated a dissimilarity matrix by subtracting the corr.matrix from 1.
Then I calculated the distance matrix with d <- dist(dissimilarity matrix)
and ran the isoMDS function.

Is that correct? 

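For reference, that workflow in code (a minimal sketch, with R.mat
standing in for the correlation matrix; note that as.dist(), rather than
dist(), is the usual way to treat 1 - R.mat as a dissimilarity matrix,
since dist() would compute distances between its rows):

library(MASS)                  # for isoMDS()
d   <- as.dist(1 - R.mat)      # dissimilarity = 1 - correlation
fit <- isoMDS(d, k = 2)
fit$stress                     # isoMDS() reports stress in percent
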
In addition: the stress value was 2.9. I read that this is a percentage. So,
in the usual manner of reporting stress values on a 0-to-1 scale, this
would be 0.029?

Finally, I compared the solution (diagram) with the alscal-algorithm from
SPSS (however computed on the raw data) and there were some slight
differences. Can you tell me the differences between alscal and isoMDS?

As you might see, I am not very experienced in MDS. Thus, any recommendation
(literature, etc.) would be a great help.

Thanks
Holger
-- 
View this message in context: 
http://old.nabble.com/Multidimensional-scaling-on-a-correlation-matrix-tp26126667p26126667.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] R-help Digest, Vol 80, Issue 30

2009-10-30 Thread Greg Snow
Which OS and how do you start R?

-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111


> -Original Message-
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-
> project.org] On Behalf Of Huidong TIAN
> Sent: Friday, October 30, 2009 5:36 AM
> To: r-help@r-project.org
> Subject: Re: [R] R-help Digest, Vol 80, Issue 30
> 
> Dear friends,
> 
> I will be very happy if anyone can tell me how to change the working
> directory permanently.
> I mean not using the function setwd(), which only changes it temporarily;
> when you close the console, it reverts to the old directory.
> Sys.setenv(R_USER = '') also doesn't work.
> 
>   [[alternative HTML version deleted]]
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-
> guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Easy method to set user-mode virtual memory space in Windows Vista and 7

2009-10-30 Thread Kingsford Jones
Thank you for the response, Prof Ripley.  Some comments below

On Fri, Oct 30, 2009 at 9:27 AM, Prof Brian Ripley
 wrote:
> On Wed, 28 Oct 2009, Kingsford Jones wrote:
>
>> I thought I'd share this with the list since it appears to provide a
>> quick fix to some memory problems, and I haven't seen it discussed in
>> relation to R.
>
> Seems you didn't look in the places mentioned in the posting guide, for this
> is an actual FAQ!
>
> See the references in rw-FAQ Q2.9 (and this has been documented there for
> many years, and the R developer made R large-address-aware so users who
> looked at the rw-FAQ could benefit from this).

I did read the rw-FAQ, including the links from 2.9.  I am, on a daily
basis, appreciative of the impressive effort that has gone into making
R what it is, including the documentation;  however, AFAICS the links
currently in 2.9 do not tell Vista or 7 users how to set the /3GB
switch.  I see discussion of changing the boot.ini file to set the
switch, but no mention of using BCDEdit.exe.  My plan was to send the
msg to r-help and if nobody indicated it was a bad idea, make a
suggestion on r-devel that the link I sent be added to rw-FAQ 2.9.


>
> See http://msdn.microsoft.com/en-us/library/bb613473%28VS.85%29.aspx for a
> more rounded view covering all the relevant OSes.

That is a more helpful link than the one sent; perhaps it can be added
to the rw-FAQ?

>
>> To reallocate virtual memory from kernel-mode to user-mode in 32-bit
>> Vista or Windows 7 one can use the increaseuserva boot option value.
>> See
>> http://msdn.microsoft.com/en-us/library/aa906211.aspx
>>
>> On my 4GB Vista machine, R is now able to use 3GB memory (previously
>> at the default value of 2GB).
>>
>> Here's the method:
>>
>> Open cmd.exe as administrator and type
>>
>> BCDEdit /set IncreaseUserVA <megabytes>
>>
>> where <megabytes> is between 2048 and 3072 (ie 2-3GB), then reboot.  Given
>> that you've set the --max-mem-size flag, or used the memory.limit
>> function in an R session to increase R's memory allocation, your R
>> processes should now be allowed to access up to 3GB of virtual memory.
>
> See the rw-FAQ: it is 3GB of address space and 2.5GB of memory by default.

indeed -- I confused 'virtual memory' with 'address space' -- thank you

>
>> I am not a Windows expert, so if anyone knows of disadvantages of
>> using this method *please* post a response.
>
> This is not advised in general by those who are experts (why do you think
> that it is not the default in Windows?).

But my suggestion is essentially the same as that given in the links
from 2.9, isn't it?  The difference being that it is for modern
versions of Windows.

>  As the URL above says
>
>  'For applications that are memory-intensive, such as database
>  management systems (DBMS), the use of a larger virtual address space
>  can provide considerable performance and scalability benefits.
>  However, the file cache, paged pool, and nonpaged pool are smaller,
>  which can adversely affect applications with heavy networking or
>  I/O.  Therefore, you might want to test your application under load,
>  and examine the performance counters to determine whether your
>  application benefits from the larger address space.'
>
> Remember that you are setting this for all uses and users of your machine,
> not just the R process (so the last sentence is somewhat misleading).
>
> The best advice seems to be that if you have 4GB or more of RAM is to run a
> 64-bit version of Windows, which gives a 32-bit process a 4GB user address
> space

but I assume for many it is not an option (due to budget, organization
policy, application compatibility...)

>.  If you have a lot more memory, consider a 64-bit build of R (and an
> OS that handles memory more efficiently: one of my background tasks right
> now is to investigate why R under Windows is running a particular
> memory-intensive multiprocessor task so much more slowly than Linux on the
> same hardware).

and your skills and efforts are very much appreciated.

regards,

Kingsford Jones

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Interpreting gnls() output in comparison to nls()

2009-10-30 Thread Michael A. Gilchrist

Hi,

I've been trying to work with the gnls() function in the "nlme" package.  My 
decision to use gnls() was so that I could fit varPower and such to some of 
the data.  However, in working with a small dataset, I've found that the 
results given by gnls() don't seem to make any sense and they differ 
substantially from those produced by nls().  I suspect that I am just 
misinterpreting the gnls() output and could use a little guidance.



Here's the data I am trying to fit:



myPbmcData

Grouped Data: lnCount ~ Time | Type
   Time Mouse Count   Type  lnCount
1     0  T0-1  37.6  Naive 3.627004
2     0  T0-2  23.6  Naive 3.161247
3     0  T0-3  49.2  Naive 3.895894
4     8  T8-1  20.8  Naive 3.034953
5     8  T8-2  26.9  Naive 3.292126
6     8  T8-3  34.0  Naive 3.526361
7    24 T24-1  36.8  Naive 3.605498
8    24 T24-2  34.0  Naive 3.526361
9    24 T24-3  19.6  Naive 2.975530
10   48 T48-1  55.4  Naive 4.014580
11   48 T48-2  54.2  Naive 3.992681
12   48 T48-3  51.4  Naive 3.939638
13    0  T0-1 342.0 Memory 5.834811
14    0  T0-2 139.0 Memory 4.934474
15    0  T0-3 191.0 Memory 5.252273
16    8  T8-1 104.0 Memory 4.644391
17    8  T8-2 192.0 Memory 5.257495
18    8  T8-3 204.0 Memory 5.318120
19   24 T24-1 161.0 Memory 5.081404
20   24 T24-2 131.0 Memory 4.875197
21   24 T24-3 230.0 Memory 5.438079
22   48 T48-1 458.0 Memory 6.126869
23   48 T48-2 594.0 Memory 6.386879
24   48 T48-3 374.0 Memory 5.924256





Now I fit nls() and gnls() to the data.

--

##Fit nls

nlsOut <- nls(lnCount ~ log(C0[Type] + C1[Type] * Time + C2[Type] * Time^2),
              start = list(C0 = c(exp(3.5), exp(3.5)),
                           C1 = c(-0.01, -0.01),
                           C2 = c(0.01, 0.01)),
              data = myPbmcData)


summary(nlsOut)


Formula: lnCount ~ log(C0[Type] + C1[Type] * Time + C2[Type] * Time^2)

Parameters:
     Estimate Std. Error t value Pr(>|t|)
C01  33.82469    5.08870   6.647 3.08e-06 ***
C02 209.40868   31.01673   6.751 2.51e-06 ***
C11  -0.90447    0.58030  -1.559  0.13649
C12  -8.64779    3.73034  -2.318  0.03241 *
C21   0.02775    0.01261   2.201  0.04102 *
C22   0.29156    0.08975   3.249  0.00446 **
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.3 on 18 degrees of freedom

Number of iterations to convergence: 5
Achieved convergence tolerance: 1.447e-06




##Fit gnls
gnlsOut <- gnls(lnCount ~ log(C0 + C1 * Time + C2 * Time^2),
                start = list(C0 = c(exp(3.5), exp(3.5)),
                             C1 = c(-0.5, -0.5),
                             C2 = c(0.01, 0.01)),
                data = myPbmcData,
                params = list(C0 + C1 + C2 ~ Type),
                verbose = TRUE)


summary(gnlsOut)

Generalized nonlinear least squares fit
  Model: lnCount ~ log(C0 + C1 * Time + C2 * Time^2)
  Data: myPbmcData
       AIC      BIC    logLik
  17.41510 25.66148 -1.707552

Coefficients:
   Value Std.Error   t-value p-value
C0.(Intercept) 121.61674 15.715703  7.738549  0.
C0.Type.L  124.15665 22.225361  5.586260  0.
C1.(Intercept)  -4.77613  1.887602 -2.530265  0.0209
C1.Type.L   -5.47536  2.669472 -2.051104  0.0551
C2.(Intercept)   0.15965  0.045315  3.523169  0.0024
C2.Type.L0.18654  0.064085  2.910877  0.0093

 Correlation:
   C0.(I) C0.T.L C1.(I) C1.T.L C2.(I)
C0.Type.L   0.948
C1.(Intercept) -0.750 -0.713
C1.Type.L  -0.713 -0.750  0.953
C2.(Intercept)  0.554  0.529 -0.924 -0.884
C2.Type.L   0.529  0.554 -0.884 -0.924  0.961

Standardized residuals:
Min  Q1 Med  Q3 Max
-1.41262246 -0.76633639 -0.03330171  0.67865673  1.63503742

Residual standard error: 0.37
Degrees of freedom: 24 total; 18 residual






Examining the results, they don't match up as I read them.
For example, look at C0. nls() gives estimates of
C0 Naive ~33 and C0 Memory ~210.

In contrast, gnls() gives an intercept at C0 121 and an effect of, if I am 
reading the output correctly,

   C0 Naive ~ 121.62 - 124.16 = -2.54 and C0 Memory ~ 121.62 + 124.16 = 245.78

However, if I compare the predictions they match up very well

--


(predict(gnlsOut) - predict(nlsOut))/predict(nlsOut)

            1             2             3             4             5
 3.646201e-07  3.646201e-07  3.646201e-07  7.849412e-07  7.849412e-07
            6             7             8             9            10
 7.849412e-07  3.655158e-07  3.655158e-07  3.655158e-07 -1.298393e-06
           11            12            13            14            15
-1.298393e-06 -1.298393e-06  6.636562e-08  6.636562e-08  6.636562e-08
           16            17            18            19            20
-7.458066e-10 -7.458066e-10 -7.458066e-10 -7.685896e-08 -7.685896e-08
           21            22            23            24
-7.685896e-08  1.456831e-08  1.456831e-08  1.456831e-08
attr(,"la
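
A likely reconciliation, added for reference: the "Type.L" labels
indicate orthogonal-polynomial contrasts, under which the two levels are
coded -1/sqrt(2) and +1/sqrt(2) rather than -1 and +1, so the gnls()
coefficients do reproduce the nls() estimates:

contr.poly(2)                    # the coding behind "Type.L": -0.7071068, 0.7071068
121.61674 - 124.15665/sqrt(2)    # ~  33.82, the nls() C0 for Naive
121.61674 + 124.15665/sqrt(2)    # ~ 209.41, the nls() C0 for Memory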

[R] polar.plot

2009-10-30 Thread Tony Greig
Hi,

Two questions:

1 - Say I have average speed and directions for tide and I would like to
plot them on a polar plot, but with different colors so I can indicate the
two directions. I'm using polar.plot from the plotrix library. How can I add
a second "b" and "dir.b" series to a polar.plot?

library(plotrix)
a = 3
dir.a = 85
b = 4
dir.b = 250
polar.plot(a, dir.a, start = 90, clockwise = T, show.grid.labels = T,
radial.lim=c(0,5), line.col=2, lwd=2)


2 - Which parameter in polar.plot can I use to set the orientation for the
grid labels which seem to default at 90 in my example above?

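On question 1, one possibility, sketched under the assumption that the
installed plotrix passes the add argument through to radial.plot()
(recent versions document it):

library(plotrix)
a <- 3; dir.a <- 85
b <- 4; dir.b <- 250
polar.plot(a, dir.a, start = 90, clockwise = TRUE,
           radial.lim = c(0, 5), line.col = 2, lwd = 2)
polar.plot(b, dir.b, start = 90, clockwise = TRUE,
           radial.lim = c(0, 5), line.col = 4, lwd = 2, add = TRUE)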

Thanks,
Tony

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] MatLab SimBiology

2009-10-30 Thread mauede
Is there any R package that implements the same capabilities as the MATLAB
toolbox called SimBiology?
We are especially interested in protein-protein interactions and network
analysis.
As far as I know, SimBiology implements a system of ODEs reflecting the
kinetics of the chemical reactions.
We would be more interested in stochastic simulations.
Thank you in advance.

Maura




[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] deriv() to take vector of expressions as 1st arg?

2009-10-30 Thread t121
Thanks a lot for the hints. The sapply method may work for me.

But how can I extract just the ".grad" expression from the return value
of the deriv function? (and secondly store in a matrix of expressions?)

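For the first part, the usual idiom (a sketch): rather than pulling the
.grad assignment out of the returned expression, evaluate the whole
expression and read the "gradient" attribute off the result:

g <- deriv(~ x^2 + sin(x), "x")   # deriv() returns an expression
x <- 1:3
val <- eval(g)                    # value and gradient computed together
attr(val, "gradient")             # the .grad matrix, evaluated at x
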
Thanks again.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] R crashes

2009-10-30 Thread cls59


premmad wrote:
> 
> Windows XP 32 bit machine.
> allocated 2gb as memory for R 2.9.2
> data  will be 20mb and was running calculations on the data .
> but the r crashes with out any warning or so.
> 

Ahh, well 20 MB is not huge data for R, so there should be no problem with
processing it. The pertinent questions now are:

  * What does your data look like? A small representational sample will
suffice, preferably something that can be copied out of an email and pasted
into R.

  * What operations are you running when R crashes? In your first post you
mentioned "R crashes frequently"-- does this happen with many different data
sets, but the same operations? 

If you could provide us the data and functions that are being executed just
prior to the crash, then we may be able to help. But without a reproducible
example to test on our own machines, as suggested by the posting guide for
this mailing list, there is really nothing we can do to help with your
problem.


-Charlie

-
Charlie Sharpsteen
Undergraduate
Environmental Resources Engineering
Humboldt State University
-- 
View this message in context: 
http://old.nabble.com/R-crashes-tp26110355p26132909.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] how to loop thru a matrix or data frame , and append calculations to a new data frame?

2009-10-30 Thread David Winsemius


On Oct 30, 2009, at 11:46 AM, Robert Wilkins wrote:


How do you do a double loop through a matrix or data frame , and
within each iteration , do a calculation and append it to a new,
second data frame?
(So, if your original matrix or data frame is 4 x 5 , then 20
calculations are done, and the new data frame, which had 0 rows to
start with, now has 20 rows)


Much vagueness in question, Grasshopper:

mtx <- matrix(1:20, 4)
df1 <- data.frame(x=1:20)
df1$mtxsq <- as.vector(mtx)^2
df1
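
And if a literal double loop is what's wanted, a sketch (with squaring
standing in for "a calculation"; growing a data frame row-by-row like
this is slow, but matches the question as asked):

df2 <- data.frame()
for (i in seq_len(nrow(mtx))) {
    for (j in seq_len(ncol(mtx))) {
        df2 <- rbind(df2, data.frame(row = i, col = j, value = mtx[i, j]^2))
    }
}
nrow(df2)   # 20 -- one row per cell of the 4 x 5 matrix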

--

David Winsemius, MD
Heritage Laboratories
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] : source R-script interactively outside R

2009-10-30 Thread Trevor Davis
At least in the Unix version of the ``Rscript`` front-end to R (but 
oddly enough not ``R CMD BATCH``) you could put something like this in 
test.R:


cat("Save workspace image? [y/n]: ")
answer <- tolower( scan("stdin", "", n=1) )
if (substr(answer, 1, 1) == "y") {
q("yes")
} else {
q("no")
}

Which you would use like this:

Rscript test.R

Angel Spassov wrote:

DeaR list,

let us assume we have stored a couple of functions in 'test.R'. Is it
possible to run

R CMD BATCH "test.R"
or
R --file="test.R"

interactively and if yes how?

A better formulated question sounds like this: how can I source "test.R"
interactively OUTSIDE R?

Another clarification try: I want to fix the
" in non-interactive use: command-line default will be used"
Warning Message.

The last one can be reproduced if you type for example:
q("ask")
in "test.R".

I hope at least one of the questions above was formulated understandably.

Thx in advance,
AS

[[alternative HTML version deleted]]



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Can I write Nothing to a variable?

2009-10-30 Thread Carl Witthoft
OK, here's the first-cut code, which does execute properly.  What I 
would like to do, simply for 'cleanliness', is to assign "nothing" to 
the variable "l" , rather than assigning "seq(1,dim(data)[timedim])". 
Then the result of all the get(unlist(dims)) operations below will be


func[j,k,]

Where right now the code returns func[j,k,seq(1:length(dim(data)[timedim]))]


Linewraps below have broken the definition of jmap() into at least 3 
lines, btw.


What this function, timecalc(), does is to calculate the selected 'func' 
over the specified dimension of the data, returning a 2-dimensional 
array of the calculated values.   Think, for example, of a FITS data 
stack, and calculating the mean intensity in each spatial pixel over the 
full time-sequence of frames.




#generic version, for 'any' function and any axis (3-D data only)
timecalc<-function( data, func=mean, timedim=3) {
# build a "[,j,k]" string where the blank is for
# the timedim selected.  I can't find a way to create a true "blank"
# which is why I load the seq(1:N) in there
#  (put error checkers on length of dim later)
dims<-as.list(dim(data))
dims[-timedim]<-c('j','k')
#sloppy way to extract the time-sequence, but lets me 'unlist' all 3 dims
dims[[timedim]]<-'l'
l<-seq(1,dim(data)[timedim])
alldim<-seq(1,3)  # just create a holder
# define the two dims which are NOT summed over, i.e. we'll return an array
# of size scandim[1]xscandim[2], each element is func-ed over 3rd dim
scandims<-alldim[alldim!=timedim]
funcname<-deparse(substitute(func))
datname<-deparse(substitute(data))
jmap <- function(k) mapply(
    function(j) eval(call(funcname,
        get(datname)[get(unlist(dims[1])), get(unlist(dims[2])), get(unlist(dims[3]))])),
    j = seq(1:dim(data)[scandims[1]]))

bar <- mapply(jmap, k = seq(1:dim(data)[scandims[2]]))
return(invisible(bar))
}
***
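
For comparison, a hedged aside: for the 3-D case the same result can
usually be obtained from apply(), and a logical TRUE can stand in for
the "blank" third subscript:

## equivalent of timecalc() for 3-D data: apply func over the two
## dimensions that are kept (a sketch, not a drop-in replacement)
timecalc2 <- function(data, func = mean, timedim = 3)
    apply(data, setdiff(seq_along(dim(data)), timedim), func)

## and TRUE behaves like an empty index, selecting everything:
arr <- array(1:24, dim = c(2, 3, 4))
identical(arr[1, 2, ], do.call("[", list(arr, 1, 2, TRUE)))   # TRUE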


David Winsemius wrote:
It certainly seems possible that it is more complex, but at the moment I 
don't think it is possible to be any more vague.


The specifics of implementation will depend on the precise meaning 
assigned to the words "a function", "index", "axis", "dimension", "each 
element of which is the sum of all values along the third axis". At the 
moment those appear to be unfortunately imprecisely described.


The quick answer to your first questions is yes, it is possible to 
create structures with nothing in them. Empty vectors, empty arrays, 
sparse matrices, and empty lists are all feasible. Make up your mind 
what you want and then articulate it.


some of the functions that may do what you want are:

apply
integrate
lapply

All of the specifics depend ... on specifics.



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Bayesian regression stepwise function?

2009-10-30 Thread Kjetil Halvorsen
On Fri, Oct 23, 2009 at 6:05 PM, Ravi Varadhan  wrote:
> Frank,
>
> I have heard this (i.e. only head-to-head comparisons are valid) and various
> other folklores about AIC and BIC based model selection before, including
> one that these information criteria are only applicable for comparing two
> nested models.
>
> Where has it been demonstrated that AIC/BIC cannot be used to find the best
> subset, i.e. the subset that is closest to the "true" model (assuming that
> true model is contained in the set of models considered, and that maximum
> likelihood estimation is used for estimating parameters in the models)?
>
> I would greatly appreciate any reference that shows this.

You could look at Claeskens/Hjort: "Model Selection and Model
Averaging", for instance
section 2.3,  AIC and the Kullback-Leibler distance.

AIC corresponds to searching for the model with smallest KL distance
to truth, while
BIC aims at consistent selection of the true model (assuming that beast exists!)

Then they go on to show that these two goals cannot be reconciled.

Kjetil

>
> Best,
> Ravi.
>
> 
> ---
>
> Ravi Varadhan, Ph.D.
>
> Assistant Professor, The Center on Aging and Health
>
> Division of Geriatric Medicine and Gerontology
>
> Johns Hopkins University
>
> Ph: (410) 502-2619
>
> Fax: (410) 614-9625
>
> Email: rvarad...@jhmi.edu
>
> Webpage:
> http://www.jhsph.edu/agingandhealth/People/Faculty_personal_pages/Varadhan.h
> tml
>
>
>
> 
> 
>
>
> -Original Message-
> From: Frank E Harrell Jr [mailto:f.harr...@vanderbilt.edu]
> Sent: Friday, October 23, 2009 4:04 PM
> To: Ravi Varadhan
> Cc: jlu...@ria.buffalo.edu; 'Allan.Y'; r-help@r-project.org;
> r-help-boun...@r-project.org
> Subject: Re: [R] Bayesian regression stepwise function?
>
> Ravi Varadhan wrote:
>> The stepAIC() function in "MASS" can be used, with k = log(n), to
> implement
>> your suggestion of "quasi-Bayesian" stepwise selection using the BIC
>> criterion.
>>
>> Ravi.
>
> Although many statisticians use BIC otherwise, it was only designed to
> compare two pre-specified models.
>
> Frank
>
>>
>>
> 
>> ---
>>
>> Ravi Varadhan, Ph.D.
>>
>> Assistant Professor, The Center on Aging and Health
>>
>> Division of Geriatric Medicine and Gerontology
>>
>> Johns Hopkins University
>>
>> Ph: (410) 502-2619
>>
>> Fax: (410) 614-9625
>>
>> Email: rvarad...@jhmi.edu
>>
>> Webpage:
>>
> http://www.jhsph.edu/agingandhealth/People/Faculty_personal_pages/Varadhan.h
>> tml
>>
>>
>>
>>
> 
>> 
>>
>>
>> -Original Message-
>> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org]
> On
>> Behalf Of jlu...@ria.buffalo.edu
>> Sent: Friday, October 23, 2009 2:31 PM
>> To: Allan.Y
>> Cc: r-help@r-project.org; r-help-boun...@r-project.org
>> Subject: Re: [R] Bayesian regression stepwise function?
>>
>> The BIC (Raftery) can be used for quasi-Bayesian model selection, but it's
>
>> not stepwise.   Ntzoufras shows how to use WinBUGS to conduct Bayesian
>> model selection, but again it's not stepwise
>>
>>
>> Ntzoufras, I. (2002), 'Gibbs variable selection using BUGS', Journal of
>> Statistical Software 7(7), 1--19.
>> Ntzoufras, I. (2009), Bayesian modeling using WinBUGS, Wiley, Hoboken, NJ.
>> Raftery, A. E. (1995), 'Bayesian model selection in social research',
>> Sociological Methodology 25, 111-163.
>>
>>
>>
>>
>>
>>
>>
>> "Allan.Y" 
>> Sent by: r-help-boun...@r-project.org
>> 10/22/2009 01:09 PM
>>
>> To
>> r-help@r-project.org
>> cc
>>
>> Subject
>> [R]  Bayesian regression stepwise function?
>>
>>
>>
>>
>>
>>
>>
>> Hi everyone,
>>
>> I am wondering if there exists a stepwise regression function for the
>> Bayesian regression model.  I tried googling, but I couldn't find
>> anything.
>> I know "step" function exists for regular stepwise regression, but nothing
>> for Bayes.
>>
>>
>> Thanks
>
>
> --
> Frank E Harrell Jr   Professor and Chair           School of Medicine
>                      Department of Biostatistics   Vanderbilt University
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] how to loop thru a matrix or data frame , and append calculations to a new data frame?

2009-10-30 Thread Robert Wilkins
How do you do a double loop through a matrix or data frame , and
within each iteration , do a calculation and append it to a new,
second data frame?
(So, if your original matrix or data frame is 4 x 5 , then 20
calculations are done, and the new data frame, which had 0 rows to
start with, now has 20 rows)

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] .Rprofile replacement function setwd() causing errors

2009-10-30 Thread Duncan Murdoch

On 10/30/2009 9:30 AM, Michael Friendly wrote:
In my .Rprofile  I have the following functions which display the 
current directory in the main R window title bar,
and modify base::setwd() to keep this up to date.  I like this because I 
can always tell where I am in the file system.


cd <- function(dir) {
  base::setwd(dir)
  utils::setWindowTitle( short.path(base::getwd()) )
}

short.path <- function(dir, len=2) {
np <-length(parts <- unlist(strsplit(dir, '/')))
parts <-rev( rev(parts)[1:min(np,len)] )
dots <- ifelse (np>len, '...', '')
paste(dots,paste(parts, '/', sep='', collapse=''))
}

utils::setWindowTitle(short.path(base::getwd()))
utils::assignInNamespace("setwd",
function(dir){
.Internal(setwd(dir))
utils::setWindowTitle( short.path(base::getwd()) )
},
"base")


I would guess this would be fixed if you save the value from the 
.Internal call, and return it as the result of your function, the way 
the standard setwd does.  A safer approach would be something like this:


local({
  oldsetwd <- setwd
  utils::assignInNamespace("setwd",
 function(dir) {
value <- oldsetwd(dir)
utils::setWindowTitle( short.path(base::getwd()) )
value
 }, "base")
})

This way, if the internals of setwd() ever change, your function won't 
break.


Duncan Murdoch

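On the second of the questions quoted below (keeping helpers like cd()
out of the global environment), one common idiom is to attach() them as
a small pseudo-package; a sketch, assuming it runs from .Rprofile and
that short.path() is defined in the same list:

myUtils <- list(
    cd = function(dir) {
        base::setwd(dir)
        utils::setWindowTitle(short.path(base::getwd()))
    }
    ## short.path() and any other helpers would go here too
)
attach(myUtils, name = "myUtils")   # found on the search path, not in ls()
rm(myUtils)                         # drop the temporary list itself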



However, this causes errors in some cases where setwd is used by other 
functions, particularly example():


 > library(HistData)
 > example(Snow)
Error in setwd(olddir) : cannot change working directory
 > traceback()
6: setwd(olddir)
5: open.srcfile(srcfile, first)
4: open(srcfile, first)
3: getSrcLines(srcfile, lastshown + 1, srcref[3L])
2: source(zfile, local, echo = echo, prompt.echo = paste(prompt.prefix,
   getOption("prompt"), sep = ""), continue.echo = paste(prompt.prefix,
   getOption("continue"), sep = ""), verbose = verbose, 
max.deparse.length = Inf,

   encoding = encoding, skip.echo = skips, keep.source = TRUE)
1: example(Snow)
 >

Questions:
*  Is there someway I can re-write these functions to avoid such problems?
* Is there someway I can 'hide' my custom functions like cd(), 
short.path() so they don't appear in the global

environment but are available in my R session?

thanks
-Michael



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] How to properly shade the background panels of an xyplot?

2009-10-30 Thread Ottorino-Luca Pantani

Dear R users,
this is a follow up of this message
http://tolstoy.newcastle.edu.au/R/e6/help/09/05/13897.html
I'm reproducing the core of it for convenience.


data(Oats, package = "MEMSS")
tp1.oats <- xyplot(yield ~ nitro | Variety + Block,
                   data = Oats,
                   panel = function(x, y, subscripts, ...) {
                       ## How to normalize my heatmap metric with
                       ## the value of the panel that has maximum average?
                       ## metric = eval(mean(y)/max())
                       metric = eval(mean(y)/max(Oats$yield))
                       panel.fill(col = gray(metric))
                       panel.lines(x, y)
                   })
print(tp1.oats)

xyplot(yield ~ nitro | Variety + Block,

   data = Oats,
   max.mean = max(with(Oats, tapply(yield, list(Variety, Block), mean))),
   panel = function(x, y, subscripts, max.mean, ...) {
   metric = mean(y)/max.mean
   panel.fill(col = gray(metric))
   panel.lines(x,y)
   })


or

xyplot(yield ~ nitro | Variety + Block,

   data = Oats,
   aux.env = new.env(parent = emptyenv()),
   prepanel = function(x, y, aux.env, ...) {
   aux.env$max.mean.y <- max(aux.env$max.mean.y, mean(y))
   list()
   },
   panel = function(x, y, subscripts, aux.env, ...) {
   metric = mean(y) / aux.env$max.mean.y
   panel.fill(col = gray(metric))
   panel.lines(x,y)
   })

-Deepayan
The result is a trellis object in which the background colour of the 
panels is an outcome of the data contained in the panel itself. After 
all, this is what "panel = function(x, y, ...)" is meant for, right?


But what, if I want to highlight some panels ? Arbitrarily or 
conditioned by another variable.


Say I want to shade in gray only the upper right panels (Block VI, 
Victory and Marvellous varieties )


Given a data frame like this, with a variable intended to set the colour 
of the panel background

data(Oats, package = "MEMSS")
Oats1 <- cbind.data.frame(Oats,
                          Highlight = ifelse(Oats$Block == "VI" &
                                                 Oats$Variety %in% c("Victory", "Marvellous"),
                                             "gray",
                                             "transparent"))

what would be possible code for this?

I (more or less) know how to manage the data in the panel,
but I cannot imagine how to do it with variables external to the panel 
itself.


I suppose that the panel functions are not useful here.
I'm wandering through par.settings, themes and panel.fill, but still 
without success.

Any hint ?
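
One possible approach, sketched: the subscripts argument gives the rows
of the data that fall in the current panel, so the Highlight column
built above can be consulted from inside the panel function:

xyplot(yield ~ nitro | Variety + Block,
       data = Oats1,
       panel = function(x, y, subscripts, ...) {
           ## all rows of a panel share one Highlight value; take the first
           panel.fill(col = as.character(Oats1$Highlight[subscripts])[1])
           panel.lines(x, y)
       })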

--
Ottorino-Luca Pantani, Università di Firenze
Dip. Scienza del Suolo e Nutrizione della Pianta
P.zle Cascine 28 50144 Firenze Italia
Ubuntu 8.04.3 LTS -- GNU Emacs 23.0.60.1 (x86_64-pc-linux-gnu, GTK+ 
Version 2.12.9)

ESS version 5.5 -- R 2.9.2

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Help with creating some loops

2009-10-30 Thread David Winsemius


On Oct 30, 2009, at 11:24 AM, Vadlamani, Satish {FLNA} wrote:


Hi:
In general, how do I cast a character to the class that I am trying  
to change it to?


For example, if I have a data frame df1 and df1 has a column x, suppose  
I want to take a substring of x (the first 3 chars). Then I want to do  
something like


df1$new = substring(of x)

Example
Data frame df1
x
abcd
efgh

Now df1$new should be
ab
ef


Following your written specification (rather than your example):

> df1 <- data.frame(x=c("abcd", "efgh"))
> df1
 x
1 abcd
2 efgh
> df1$new <- substr(df1$x,start=1,stop=3)
> df1
 x new
1 abcd abc
2 efgh efg




Thanks.
Satish



_
From:   Vadlamani, Satish {FLNA}
Sent:   Friday, October 30, 2009 8:40 AM
To: R-help@r-project.org
Subject:Help with creating some loops

Hi All:

I have a data frame called all_corn. This has 31 columns. The first  
column is a character key. The next 15 columns  
(stat1,stat2,...,stat15) are the statistical forecast. The last 15  
columns (sls1,sls2,...,sls5) are actual sales.
I want to calculate textbook tracking signal and cuulative percent  
error.


1) I am showing some of the calculations below. How can I make a  
loop out of this instead of manually doing this 15 times?
2) Once All these calculations are done, how do I put all these  
columns (err1,err2, etc.) into the same data frame?


Thanks.

attach(all_corn)

cum_sls1 <- sls1
err1 <- sls1-stat1
cum_err1 <- sls1-stat1
cum_abs_err1 <- abs(err1)
mad1 <- abs(cum_err1)/1
cum_pct_err1 <- (ifelse(cum_sls1 > 0, cum_err1/cum_sls1, 1))*100
ts1 <- ifelse(mad1 > 0, cum_err1/mad1, 0)

cum_sls2 <- cum_sls1 + sls2
err2 <- sls2-stat2
cum_err2 <- cum_err1 + sls2-stat2
cum_abs_err2 <- cum_abs_err1 + abs(err2)
mad2 <- cum_abs_err2/2
cum_pct_err2 <- (ifelse(cum_sls2 > 0, cum_err2/cum_sls2, 1))*100
ts2 <- ifelse(mad2 > 0, cum_err2/mad2, 0)

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


David Winsemius, MD
Heritage Laboratories
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Easy method to set user-mode virtual memory space in Windows Vista and 7

2009-10-30 Thread Prof Brian Ripley

On Wed, 28 Oct 2009, Kingsford Jones wrote:


I thought I'd share this with the list since it appears to provide a
quick fix to some memory problems, and I haven't seen it discussed in
relation to R.


Seems you didn't look in the places mentioned in the posting guide, for 
this is an actual FAQ!


See the references in rw-FAQ Q2.9 (and this has been documented there 
for many years, and the R developer made R large-address-aware so 
users who looked at the rw-FAQ could benefit from this).


See http://msdn.microsoft.com/en-us/library/bb613473%28VS.85%29.aspx 
for a more rounded view covering all the relevant OSes.



To reallocate virtual memory from kernel-mode to user-mode in 32-bit
Vista or Windows 7 one can use the increaseuserva boot option value.
See
http://msdn.microsoft.com/en-us/library/aa906211.aspx

On my 4GB Vista machine, R is now able to use 3GB memory (previously
at the default value of 2GB).

Here's the method:

Open cmd.exe as administrator and type

BCDEdit /set IncreaseUserVA <megabytes>

where <megabytes> is between 2048 and 3072 (ie 2-3GB), then reboot.  Given
that you've set the --max-mem-size flag, or used the memory.limit
function in an R session to increase R's memory allocation, your R
processes should now be allowed to access up to 3GB of virtual memory.


See the rw-FAQ: it is 3GB of address space and 2.5GB of memory by 
default.



I am not a Windows expert, so if anyone knows of disadvantages of
using this method *please* post a response.


This is not advised in general by those who are experts (why do you 
think that it is not the default in Windows?).  As the URL above says


  'For applications that are memory-intensive, such as database
  management systems (DBMS), the use of a larger virtual address space
  can provide considerable performance and scalability benefits.
  However, the file cache, paged pool, and nonpaged pool are smaller,
  which can adversely affect applications with heavy networking or
  I/O.  Therefore, you might want to test your application under load,
  and examine the performance counters to determine whether your
  application benefits from the larger address space.'

Remember that you are setting this for all uses and users of your 
machine, not just the R process (so the last sentence is somewhat 
misleading).


The best advice seems to be that if you have 4GB or more of RAM is to 
run a 64-bit version of Windows, which gives a 32-bit process a 4GB 
user address space.  If you have a lot more memory, consider a 64-bit 
build of R (and an OS that handles memory more efficiently: one of my 
background tasks right now is to investigate why R under Windows is 
running a particular memory-intensive multiprocessor task so much more 
slowly than Linux on the same hardware).



best,

Kingsford Jones


--
Brian D. Ripley,  rip...@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Help with creating some loops

2009-10-30 Thread Vadlamani, Satish {FLNA}
Hi:
In general, how do I cast a character to the class that I am trying to change it to?

For example, if I have a data frame df1 and df1 has a column x, suppose I want to 
take a substring of x (the first 3 chars). Then I want to do something like

df1$new = substring(of x)

Example
Data frame df1
x
abcd
efgh

Now df1$new should be
ab
ef

Thanks.
Satish



_
From:   Vadlamani, Satish {FLNA}
Sent:   Friday, October 30, 2009 8:40 AM
To: R-help@r-project.org
Subject:Help with creating some loops

Hi All:

I have a data frame called all_corn. This has 31 columns. The first column is a 
character key. The next 15 columns (stat1,stat2,...,stat15) are the statistical 
forecast. The last 15 columns (sls1,sls2,...,sls15) are actual sales.
I want to calculate textbook tracking signal and cumulative percent error.

1) I am showing some of the calculations below. How can I make a loop out of 
this instead of manually doing this 15 times?
2) Once All these calculations are done, how do I put all these columns 
(err1,err2, etc.) into the same data frame?

Thanks.

attach(all_corn)

cum_sls1 <- sls1
err1 <- sls1-stat1
cum_err1 <- sls1-stat1
cum_abs_err1 <- abs(err1)
mad1 <- abs(cum_err1)/1
cum_pct_err1 <- (ifelse(cum_sls1 > 0, cum_err1/cum_sls1, 1))*100
ts1 <- ifelse(mad1 > 0, cum_err1/mad1, 0)

cum_sls2 <- cum_sls1 + sls2
err2 <- sls2-stat2
cum_err2 <- cum_err1 + sls2-stat2
cum_abs_err2 <- cum_abs_err1 + abs(err2)
mad2 <- cum_abs_err2/2
cum_pct_err2 <- (ifelse(cum_sls2 > 0, cum_err2/cum_sls2, 1))*100
ts2 <- ifelse(mad2 > 0, cum_err2/mad2, 0)

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] R_PHP

2009-10-30 Thread Moumita Das
Hi All,
Once again, I would like to ask for your suggestions regarding the
integration of PHP and R. I have read "4. R Web Interfaces" here:

 http://cran.r-project.org/doc/FAQ/R-FAQ.txt

It mentions many tools. My requirement is this:
I have a tool developed in PHP. I want to integrate that tool with these
R scripts and get the results online.
I run the R scripts on Linux.

Can anyone write a few simple steps for me to integrate the R scripts with
that tool (written in PHP)?
I also found R_PHP_ONLINE here: http://steve-chen.net/download_area/r
How do I proceed with the integration? No proper steps are written anywhere.
-- 
Thanks in advance
Moumita

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] opacity under dispersion command under plotrix

2009-10-30 Thread David Winsemius


On Oct 30, 2009, at 7:02 AM, Joe King wrote:

Is there anyway to make the lines in the dispersion command come  
forward in
a plot and allow the fill in the dispersion parameter be transparent  
so all

of the lines I am using to note confidence intervals are shown?


Got code?


---

Joe King, M.A.

Ph.D. Student


--

David Winsemius, MD
Heritage Laboratories
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Back-to-back graph with lattice package

2009-10-30 Thread David Winsemius


On Oct 30, 2009, at 9:11 AM, Erik Svensson wrote:


Hi,
I am an [R] and lattice beginner.
I am doing an epidemiological study and want to present my data in as
condensed and clear a manner as possible.

The example below is fake, but representative of the data I
want to show.


I have a data frame with different categorical data and one column of
numerical data:

Country, Hair, Eyes, Sex, Age
sweden, blond, blue, male, 25
sweden, dark, brown, female, 30
denmark, dark, blue, male, 22
denmark, blond, blue, female 33
norway, dark, blue, female, 21
norway, blond, blue, male, 31
finland, dark, blond, male, 43
island, blond, blue, female, 40
...and so on for 300 rows...

Country <- as.factor(rep(c("Sweden", "Denmark", "Norway", "Finland",
                           "Iceland", "Sweden", "Denmark", "Norway",
                           "Finland", "Sweden"), 50))
Hair <- as.factor(rep(c("blond", "dark", "blond", "dark", "blond"), 100))

Eyes <- as.factor(rep(c("blue", "blue", "blue", "brown"), 125))
Sex  <- as.factor(rep(c("male", "female", "female", "male"), 125))
Age  <- rep(c(1:20, 1:100, 76:80), 4)


Country has 5 levels, Age has many.
Hair, Eyes, and Sex have two levels each

I want to do one lattice graph with 5 columns of panels with these
descriptive data.
I would prefer to use the lattice barchart or the corresponding
dotplot with lines (except for the Age panel)

The  Age panel and probably the Country panel I think I can solve  
myself:

barchart(Country)  # This only gives one panel, though...
densityplot(~ Age|Country,  layout=c(1,5), xlim=c(0,100),scales =
list(y=list (relation="free")))

But, for the dichotomous variables Hair, Eyes, and Sex columns I would
like to have a bihistogram aka back-to-back aka age-sex-pyramid
solution.
I want all bars to be horizontal, the names of the countries to the
left on the y-axis.
I would also like to sort the country names according to a list of  
their names.


For the back-to-back graph  I have only managed to get two panels per
column instead of one:
barchart(table(Country, Hair),groups=FALSE)


My suggestion would be to look for a worked solution.

library(sos)
???pyramid

The two I found upon taking my own advice are pyramid in epicalc and  
pyramid.plot in plotrix. Experimenting shows that the epicalc pyramid  
solution is more flexible and requires less pre-processing of the data.


library(epicalc)
data(Oswego)
use(Oswego)
pyramid(age, sex)

# ... and on your data:
pyramid(Age, Hair)# ... seems to deliver the requested plot.




How can I do all this? Is it possible at all?


library(fortunes)
fortune("Only how")



Erik Svensson


--
David Winsemius, MD
Heritage Laboratories
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] environment or other technique to isolate some private working information

2009-10-30 Thread Prof. John C Nash
In order to instrument functions for optimization or DE solving etc., I
want to set up a private ensemble of working data somewhere in my
workspace that won't collide with the problem-data. I can do this
already with named objects within .GlobalEnv, but I'd like to have
functions like

fnmon.setup<-function( etc. ) {

...
 # create an environment
monenv <- new.env(parent = baseenv())

# then create some names in the monenv for a file of data to be written,
#  some counters, timers etc.

}

and within my function call a function
  fnmon.next<- ...

that gets the appropriate information from the special area and writes lines
to the monitoring file,

and finally

  fnmon.stop<-

to close things out.

I've got it working with objects in .GlobalEnv as indicated, but
attempts to do it as above cause failure to find the environment in the
fnmon.next.

Likely I'm misreading the documentation on 'environment' somehow (it may
be clear to folks who are deep into Lisp/Scheme, but not to mere mortals).
Or perhaps there is a better way to isolate the private collection of
objects that will do the monitoring.

I can provide the files to anyone wanting a tryout, but they are a bit
long, and I suspect there is an "easy" approach that is eluding me.
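
One common pattern, sketched (the names here are illustrative, not from
any package): create the environment once in the setup function, return
it, and pass it to the other functions explicitly; because environments
are modified in place, counters and timers persist across calls:

fnmon.setup <- function(file = "fnmon.log") {
    monenv <- new.env(parent = emptyenv())
    monenv$file  <- file
    monenv$count <- 0
    monenv$t0    <- Sys.time()
    monenv
}

fnmon.next <- function(monenv, value) {
    monenv$count <- monenv$count + 1          # persists: environments are
    cat(monenv$count, value, "\n",            # effectively passed by reference
        file = monenv$file, append = TRUE)
}

menv <- fnmon.setup()      # keep the returned environment ...
fnmon.next(menv, 1.234)    # ... and hand it to each monitoring call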

JN

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Odp: Help with creating some loops

2009-10-30 Thread Petr PIKAL
Hi

r-help-boun...@r-project.org napsal dne 30.10.2009 14:40:06:

> Hi All:
> 
> I have a data frame called all_corn. This has 31 columns. The first 
column is 
> a character key. The next 15 columns (stat1,stat2,...,stat15) are the 
> statistical forecast. The last 15 columns (sls1,sls2,...,sls15) are 
> actual sales.
> I want to calculate textbook tracking signal and cumulative percent 
> error.
> 
> 1) I am showing some of the calculations below. How can I make a loop 
out of 
> this instead of manually doing this 15 times?
> 2) Once All these calculations are done, how do I put all these columns 
> (err1,err2, etc.) into the same data frame?
> 
> Thanks.
> 

do not attach

> attach(all_corn)
> 

work with the whole data frame instead of individual columns

> cum_sls1 <- sls1
> err1 <- sls1-stat1

err <- all_corn[,17:31]-all_corn[,2:16]

err/all_corn[,17:31]*100
you can then handle all Infs at once with is.finite()
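
A fuller sketch along those lines, with the column layout assumed as
described (stat1..stat15 in columns 2:16, sls1..sls15 in columns 17:31):

stat <- all_corn[, 2:16]
sls  <- all_corn[, 17:31]

err         <- sls - stat                      # err1..err15 at once
cum_sls     <- t(apply(sls, 1, cumsum))
cum_err     <- t(apply(err, 1, cumsum))
mads        <- sweep(t(apply(abs(err), 1, cumsum)), 2, 1:15, "/")
cum_pct_err <- ifelse(cum_sls > 0, cum_err / cum_sls, 1) * 100
ts          <- ifelse(mads > 0, cum_err / mads, 0)

## 2) bind the derived columns back onto the data frame
all_corn <- cbind(all_corn, err = err, cum_pct_err = cum_pct_err, ts = ts)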

Regards
Petr

> cum_err1 <- sls1-stat1
> cum_abs_err1 <- abs(err1)
> mad1 <- abs(cum_err1)/1
> cum_pct_err1 <- (ifelse(cum_sls1 > 0, cum_err1/cum_sls1, 1))*100
> ts1 <- ifelse(mad1 > 0, cum_err1/mad1, 0)
> 
> cum_sls2 <- cum_sls1 + sls2
> err2 <- sls2-stat2
> cum_err2 <- cum_err1 + sls2-stat2
> cum_abs_err2 <- cum_abs_err1 + abs(err2)
> mad2 <- cum_abs_err2/2
> cum_pct_err2 <- (ifelse(cum_sls2 > 0, cum_err2/cum_sls2, 1))*100
> ts2 <- ifelse(mad2 > 0, cum_err2/mad2, 0)
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide 
http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] fast optimizer

2009-10-30 Thread Prof. John C Nash
> Date: Fri, 30 Oct 2009 09:29:06 +0100
> From: Christophe Dutang 
> Subject: Re: [R] [R-SIG-Finance]  Fast optimizer
> To: R_help Help 
> Cc: r-help@r-project.org
>> > Ok. I have the following likelihood function.
>> >
>> > L <- p*dpois(x,a)*dpois(y,b+c)+(1-p)*dpois(x,a+c)*dpois(y,b)
>> >
>> > where I have 100 points of (x,y) and parameters c(a,b,c,p) to
>> > estimate. Constraints are:
>> >
>> > 0 < p < 1
>> > a,b,c > 0
>> > c < a
>> > c < b
>> >
>> > I construct a loglikelihood function out of this. First ignoring the
>> > last two constraints, it takes optim with box constraint about 1-2 min
>> > to estimate this. I have to estimate the MLE on about 200 rolling
>> > windows. This will take very long. Is there any faster implementation?
> Take a look at the CRAN task view on optimisation, you may find faster  
> algorithms.
> 

There are several new or revised methods in development as well as a new
wrapper optimx() in the r-forge OptimizeR project
http://r-forge.r-project.org/R/?group_id=395

In particular Rvmmin is an all-R implementation of the algorithm at the
heart of optim's BFGS but with bounds and mask (fixed parameter)
constraints. I'm looking into how it can be made more convenient to
hot-start with suspected "good" parameters, which would likely be
important here.
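
For concreteness, a sketch of the box-constrained setup quoted above
(ignoring the c < a and c < b constraints, as in the original attempt;
x and y are the data vectors):

negll <- function(par, x, y) {
    a <- par[1]; b <- par[2]; cc <- par[3]; p <- par[4]
    L <- p * dpois(x, a) * dpois(y, b + cc) +
        (1 - p) * dpois(x, a + cc) * dpois(y, b)
    -sum(log(L))
}

fit <- optim(c(a = 1, b = 1, c = 0.5, p = 0.5), negll, x = x, y = y,
             method = "L-BFGS-B",
             lower = rep(1e-8, 4),
             upper = c(Inf, Inf, Inf, 1 - 1e-8))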

JN

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] multiple pages with ggplot2 facet_wrap?

2009-10-30 Thread hadley wickham
On Wed, Oct 28, 2009 at 8:19 PM, Bill Gillespie  wrote:
> I currently use lattice functions to produce multiple pages of plots using
> the "layout" argument to specify the number of rows and columns of panels,
> e.g.,
>
> xyplot(price ~ carat | clarity, diamonds, layout = c(2, 2))
>
> This results in 2 pages of 4 panels each. "diamonds" is a data.frame
> distributed with ggplot2.
>
> I would like to do the same with ggplot2 but have been unsuccessful. The
> following sequence of statements seemed like a logical way to do it:
> p <- ggplot(diamonds, aes(carat, price))
> p + geom_point() + facet_wrap(~clarity, ncol = 2, nrow = 2)
> But they result in the error statement: "Error in nrow * ncol : non-numeric
> argument to binary operator".
>
> Is facet_wrap or facet_grid capable of producing multiple pages of plots
> and, if so, how?

Not automatically - it's up to you to figure out how to split the
factor levels into pages and then create the appropriate subsets.
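
A sketch of that manual split, paging four clarity levels at a time:

library(ggplot2)
levs  <- levels(diamonds$clarity)
pages <- split(levs, ceiling(seq_along(levs) / 4))
for (lev in pages) {
    p <- ggplot(subset(diamonds, clarity %in% lev), aes(carat, price)) +
        geom_point() +
        facet_wrap(~ clarity, ncol = 2)
    print(p)   # one page of up to 4 panels per group of levels
}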

Hadley

-- 
http://had.co.nz/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Summing identical IDs

2009-10-30 Thread David Winsemius


On Oct 29, 2009, at 5:23 PM, PDXRugger wrote:



Terrific help, thank you.
dupbuild<-aggregate(DF$Acres, list(Bldgid), sum)
This line worked best.

Now I'm going to challenge everyone (I think?)

Consider the following:


Acres<-c(100,101,100,130,156,.5,293,300,.09,100,12.5)
Bldgid<-c(1,2,3,4,5,5,6,7,7,8,8)
Year<-c(1946,1952,1922,1910,1955,1955,1999,1990,1991,2000,2000)
ImpValue<-c(1000,1400,1300,900,5000,1200,500,1000,300,1000,1000)
DF=cbind(Acres,Bldgid,Year,ImpValue)
DF<-as.data.frame(DF)

I would like to do the same, except there are some rules i want to  
follow.

I only want to aggregate the Acres if :
a) The Years are not identical
b) The ImpValues are not identical
c) The Years are identical and the ImpValue are not
d)The ImpValues are identical and the Years are not


As I review your Boolean logic, I run into serious problems.

c) and d) cannot be true if a) and b) are true.

So no cases satisfy all 4 specs. In particular both of the pairs you  
say you want aggregated (5+6 and 10+11) violate rule a), and the  
second pair also violates b).


--
David


but if the Acres and ImpValues are identical I would still like to
add the Acres together and form one case.
If the cases are put together I would also like to add the ImpValues
together.  So the below

    Acres Bldgid Year ImpValue
1  100.00      1 1946     1000
2  101.00      2 1952     1400
3  100.00      3 1922     1300
4  130.00      4 1910      900
5  156.00      5 1955     5000
6    0.50      5 1955     1200
7  293.00      6 1999      500
8  300.00      7 1990     1000
9    0.09      7 1991      300
10 100.00      8 2000     1000
11  12.50      8 2000     1000

would become

    Acres Bldgid Year ImpValue
1  100.00      1 1946     1000
2  101.00      2 1952     1400
3  100.00      3 1922     1300
4  130.00      4 1910      900
5  156.50      5 1955     6200
7  293.00      6 1999      500
8  300.09      7 1990     1300
10 112.50      8 2000     1000

Thanks, i gave it a bunch of shots but nothing worth posting.





PDXRugger wrote:


Hello All,
  I would like to select records with identical IDs then sum an  
attribute

then and return them to the data frame as a single record.  Please
consider


Acres<-c(100,101,100,130,156,.5,293,300,.09)
Bldgid<-c(1,2,3,4,5,5,6,7,7)

DF=cbind(Acres,Bldgid)
DF<-as.data.frame(DF)

So that:

 Acres Bldgid
1 100.00  1
2 101.00  2
3 100.00  3
4 130.00  4
5 156.00  5
6   0.50  5
7 293.00  6
8 300.00  7
9   0.09  7

Becomes

 Acres Bldgid
1 100.00  1
2 101.00  2
3 100.00  3
4 130.00  4
5 156.50  5
7 293.00  6
8 300.09  7

dup<-unique(DF$Bldgid[duplicated(Bldgid)])
dupbuild<-DF[DF$Bldgid %in% dup,]
dupbuild..dupareasum<-sum(dupbuild$Acres[duplicated(dupbuild$Bldgid)])


This sums the unique IDs of the duplicated records, not what I want.
Thanks ahead of time

JR





--
View this message in context: 
http://www.nabble.com/Summing-identical-IDs-tp26118922p26121056.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


David Winsemius, MD
Heritage Laboratories
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Help with creating some loops

2009-10-30 Thread Vadlamani, Satish {FLNA}
Hi All:

I have a data frame called all_corn. This has 31 columns. The first column is a 
character key. The next 15 columns (stat1,stat2,...,stat15) are the statistical 
forecast. The last 15 columns (sls1,sls2,...,sls15) are actual sales.
I want to calculate the textbook tracking signal and cumulative percent error.

1) I am showing some of the calculations below. How can I make a loop out of 
this instead of manually doing this 15 times?
2) Once all these calculations are done, how do I put all these columns 
(err1,err2, etc.) into the same data frame?

Thanks.

attach(all_corn)

cum_sls1 <- sls1
err1 <- sls1-stat1
cum_err1 <- sls1-stat1
cum_abs_err1 <- abs(err1)
mad1 <- abs(cum_err1)/1
cum_pct_err1 <- (ifelse(cum_sls1 > 0, cum_err1/cum_sls1, 1))*100
ts1 <- ifelse(mad1 > 0, cum_err1/mad1, 0)

cum_sls2 <- cum_sls1 + sls2
err2 <- sls2-stat2
cum_err2 <- cum_err1 + sls2-stat2
cum_abs_err2 <- cum_abs_err1 + abs(err2)
mad2 <- cum_abs_err2/2
cum_pct_err2 <- (ifelse(cum_sls2 > 0, cum_err2/cum_sls2, 1))*100
ts2 <- ifelse(mad2 > 0, cum_err2/mad2, 0)
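For reference, a hedged sketch of one way to loop this and write the
results straight back into the same data frame (assuming the column
naming above; untested against the real data):

cum_sls <- cum_err <- cum_abs_err <- 0
for (i in 1:15) {
  sls  <- all_corn[[paste("sls",  i, sep = "")]]
  stat <- all_corn[[paste("stat", i, sep = "")]]
  err  <- sls - stat
  cum_sls     <- cum_sls + sls
  cum_err     <- cum_err + err
  cum_abs_err <- cum_abs_err + abs(err)
  mad         <- cum_abs_err / i
  all_corn[[paste("err", i, sep = "")]]         <- err
  all_corn[[paste("cum_pct_err", i, sep = "")]] <- ifelse(cum_sls > 0, cum_err/cum_sls, 1) * 100
  all_corn[[paste("ts", i, sep = "")]]          <- ifelse(mad > 0, cum_err/mad, 0)
}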

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] .Rprofile replacement function setwd() causing errors

2009-10-30 Thread Michael Friendly
In my .Rprofile  I have the following functions which display the 
current directory in the main R window title bar,
and modify base::setwd() to keep this up to date.  I like this because I 
can always tell where I am in the file system.


cd <- function(dir) {
 base::setwd(dir)
 utils::setWindowTitle( short.path(base::getwd()) )
}

short.path <- function(dir, len=2) {
   np <- length(parts <- unlist(strsplit(dir, '/')))
   parts <- rev(rev(parts)[1:min(np, len)])
   dots <- ifelse(np > len, '...', '')
   paste(dots, paste(parts, '/', sep = '', collapse = ''))
}

utils::setWindowTitle(short.path(base::getwd()))
utils::assignInNamespace("setwd",
   function(dir){
   .Internal(setwd(dir))
   utils::setWindowTitle( short.path(base::getwd()) )
   },
   "base")

However, this causes errors in some cases where setwd is used by other 
functions, particularly example():


> library(HistData)
> example(Snow)
Error in setwd(olddir) : cannot change working directory
> traceback()
6: setwd(olddir)
5: open.srcfile(srcfile, first)
4: open(srcfile, first)
3: getSrcLines(srcfile, lastshown + 1, srcref[3L])
2: source(zfile, local, echo = echo, prompt.echo = paste(prompt.prefix,
  getOption("prompt"), sep = ""), continue.echo = paste(prompt.prefix,
  getOption("continue"), sep = ""), verbose = verbose, 
max.deparse.length = Inf,

  encoding = encoding, skip.echo = skips, keep.source = TRUE)
1: example(Snow)
>

Questions:
* Is there some way I can rewrite these functions to avoid such problems?
* Is there some way I can 'hide' my custom functions like cd() and
short.path() so they don't appear in the global environment but are
available in my R session?
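On the second question, one common idiom - sketched here, untested in
this exact setup - is to define the helpers inside an attached
environment, so they are found via the search path but do not sit in
the global environment:

## in .Rprofile: attach an empty environment and define helpers there
.my <- attach(NULL, name = "myprofile")
assign("cd", function(dir) {
  base::setwd(dir)
  ## basename() stands in for short.path() in this sketch
  utils::setWindowTitle(basename(base::getwd()))
}, envir = .my)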

thanks
-Michael

--
Michael Friendly Email: friendly AT yorku DOT ca 
Professor, Psychology Dept.

York University  Voice: 416 736-5115 x66249 Fax: 416 736-5814
4700 Keele Streethttp://www.math.yorku.ca/SCS/friendly.html
Toronto, ONT  M3J 1P3 CANADA

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Function to manage seconds?

2009-10-30 Thread Gabor Grothendieck
Try this:

> library(chron)
> times(100/(24*60*60)) # 100 seconds
[1] 00:01:40
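A small base-R alternative (a sketch; valid for spans under 24 hours):

mytime <- function(s)
  sprintf("%02d:%02d:%02d", s %/% 3600, (s %% 3600) %/% 60, s %% 60)
mytime(7385)  # "02:03:05"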


On Fri, Oct 30, 2009 at 5:37 AM, Stefano Leonardi
 wrote:
> Hi,
> I could not find a function that returns a time in hours and minutes given
> seconds as argument. I looked also in the chron package.
>
> I would need something like this:
>
> mytime(61) #61 are seconds
> [1] "00:01:01"
>
> mytime(7385) #7385 are seconds
> [1] "02:03:05"
>
> Am I missing something? or should I write it by myself?
> Thank you
> Stefano
>
>
> --
> ==
>  Stefano Leonardi
>  Dipartimento di Scienze Ambientali
>  Universita` di Parma               E-mail: stefano.leonardi at unipr.it
>  Viale Usberti 11/A                            Phone : +39-0521-905659
>  43100 PARMA  (Italy)                          Fax   : +39-0521-905402
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] R-help Digest, Vol 80, Issue 30

2009-10-30 Thread Huidong TIAN
Dear friends,

I will be very happy if anyone can tell me how to change the working
directory permanently?
I mean not using setwd(), which only changes it temporarily: when you
close the console, it reverts to the old directory.
Sys.setenv(R_USER = '') also doesn't work.
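A hedged note: the usual approach is to put the setwd() call in a
startup file such as ~/.Rprofile, which runs at the beginning of every
session. For example:

## in ~/.Rprofile (the path below is illustrative)
setwd("C:/my/projects")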

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] R strucchange question: recursive-based CUSUM

2009-10-30 Thread Julia Bondarenko

Hello R users:

I'm trying now to apply the package strucchange to see whether there is 
a structural change in linear regression. I have noted the following 
problem that arises in my case with recursive-based CUSUM: generic 
function recresid() in efp() generates an error, since (probably) it 
cannot compute the inverse matrix of (X^(i-1)^T)*(X^(i-1)) at each step 
(i-1), because the  matrix (X^(i-1)^T)*(X^(i-1)) does not have full rank 
for all i (X consists of dummy variables). Does any solution of this 
problem exist (for example, to replace the ordinary inverse by the 
generalised inverse, ginv())?


Thank you in advance for your help!

Julia

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Function to manage seconds?

2009-10-30 Thread Stefano Leonardi

Hi,
I could not find a function that returns a time in hours and minutes 
given seconds as argument. I looked also in the chron package.


I would need something like this:

mytime(61) #61 are seconds
[1] "00:01:01"

mytime(7385) #7385 are seconds
[1] "02:03:05"

Am I missing something? or should I write it by myself?
Thank you
Stefano


--
==
 Stefano Leonardi
 Dipartimento di Scienze Ambientali
 Universita` di Parma   E-mail: stefano.leonardi at unipr.it
 Viale Usberti 11/APhone : +39-0521-905659
 43100 PARMA  (Italy)  Fax   : +39-0521-905402

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Quarterly data in PLM package

2009-10-30 Thread Jose Iparraguirre D'Elia
Dear all,

 

Does anyone know if the PLM package (to run Panel Data Analysis) accepts 
quarterly data?

The package vignette and documentation only use annual data, and the only time 
index available seems to work for years.

 

José 

 

 

Mr José Luis Iparraguirre

Senior Research Economist

Economic Research Institute of Northern Ireland

2 -14 East Bridge Street

Belfast BT1 3NQ

Northern Ireland

United Kingdom

 

Tel: +44 (0)28 9072 7365

 


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] opacity under dispersion command under plotrix

2009-10-30 Thread Joe King
Is there any way to make the lines in the dispersion command come forward in
a plot, and to allow the fill in the dispersion parameter to be transparent,
so all of the lines I am using to mark confidence intervals are shown?

 

---

Joe King, M.A.

Ph.D. Student 

University of Washington - Seattle

206-913-2912  

j...@joepking.com

---

"Never throughout history has a man who lived a life of ease left a name
worth remembering." --Theodore Roosevelt

 


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] different L2 regularization behavior between lrm, glmnet, and penalized? (original question)

2009-10-30 Thread J.J.Goeman
 Dear Robert,
 
The differences have to do with different scaling defaults.
 
lrm by default standardizes the covariates to unit sd before applying
penalization. penalized by default does not do any standardization, but
if asked standardizes on unit second central moment. In your example:
 
x = c(-2, -2, -2, -2, -1, -1, -1, 2, 2, 2, 3, 3, 3, 3)
z = c(0, 0, 0, 1, 0, 0, 1, 0, 1, 1, 0, 1, 1, 1)
 
You got:
 
> g = lrm(z ~ x, penalty = 1)
> g$coefficients
 Intercept  x 
-0.1620727  0.3241454 
 
Now:
 
> coef(penalized(z ~ x, lambda2 = 1,standardize=T))
 
(Intercept)   x 
 -0.1651565   0.3303130 
 
is already more similar. To counter the difference between dividing by n
(penalized) or n-1 (lrm), do
 
> coef(penalized(z ~ x, lambda2 = 14/13,standardize=T))
 
(Intercept)   x 
 -0.1620727   0.3241454 
 
The glmnet case should be similar, but I don't know the details here. It
seems glmnet uses its own peculiar way of defining the penalty, but some
choice of scaling should be able to bring glmnet in line as well.
 
Kind regards,
 
Jelle
 
 





From: Robert V (Bob) Sasseen [mailto:sass...@ai.sri.com] 
Sent: 29 October 2009 19:32
To: Goeman, J.J. (MSTAT)
Subject: different L2 regularization behavior between lrm,
glmnet, and penalized? (original question)


Jelle,

Below is the original question. The formatting apparently got a
bit messed up when it was parsed by the list server. This may be easier
to read.

Bob



-

The following R code using different packages gives the same
results for a simple logistic regression without regularization, but
different results with regularization. This may just be a matter of
different scaling of the regularization parameters, but if anyone
familiar with these packages has insight into why the results differ,
I'd appreciate hearing about it. I'm new to R. Thanks. (Version info
below. Same results on Windows and Solaris 8, except that I haven't
gotten glmnet to compile on the latter.)

Robert V (Bob) Sasseen
sass...@ai.sri.com


> # Some x values (predictive var).
> x = c(-2, -2, -2, -2, -1, -1, -1, 2, 2, 2, 3, 3, 3, 3)

> # Some z values--the observed outcome.
> # Effect is that for 
> # x = -2, p = 1/4; 
> # x = -1, p = 1/3; 
> # x =  2, p = 2/3; 
> # x =  3, p = 3/4.
> z = c(0, 0, 0, 1, 0, 0, 1, 0, 1, 1, 0, 1, 1, 1)


> library(Design)
> g = lrm(z ~ x)
> g$coefficients
 Intercept  x 
-0.2224842  0.4449685 

> g = lrm(z ~ x, penalty = 1)
> g$coefficients
 Intercept  x 
-0.1620727  0.3241454 


> library(glmnet)
> g = glmnet(cbind(x), cbind(1-z, z), family = "binomial",
+     lambda = 0, standardize = FALSE)
> coef(g)
           1
  -0.2224843
x  0.4449687

> g = glmnet(cbind(x), cbind(1-z, z), family = "binomial",
+     lambda = 1, alpha = 0, standardize = FALSE)
> coef(g)
           1
  -0.1098361
x  0.2196721


> library(penalized)
> fit = penalized(z ~ x)
> coefficients(fit, "all")
(Intercept)   x 
 -0.2224843   0.4449687 

> fit = penalized(z ~ x, lambda2 = 1)
> coefficients(fit, "all")
(Intercept)   x 
 -0.2060658   0.4121315 


> sessionInfo()
R version 2.9.2 (2009-08-24) 
i386-pc-mingw32 

locale:
LC_COLLATE=English_United States.1252;LC_CTYPE=English_United
States.1252;LC_MONETARY=English_United
States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252

attached base packages:
[1] splines   stats graphics  grDevices utils datasets
methods  
[8] base 

other attached packages:
[1] penalized_0.9-26   glmnet_1.1-3   Matrix_0.999375-30
lattice_0.17-25   
[5] Design_2.3-0   Hmisc_3.7-0survival_2.35-4   

loaded via a namespace (and not attached):
[1] cluster_1.12.0 grid_2.9.2


-- 

Robert V (Bob) Sasseen 
Senior Software Engineer 
Advanced Analytics 
Artificial Intelligence Center 
SRI International 


  


3661 Valley Centre Drive, Suite 150 
San Diego, CA 92130 
robert.sass...@sri.com 
(858) 350-2035 




[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Multidimensional scaling

2009-10-30 Thread Hollix

Hi there,

when conducting a multidimensional scaling analysis, is it possible to do
this on a correlation matrix?

I calculated a dissimilarity matrix by subtracting the corr. matrix from 1.
Then I calculated the distance matrix with d <- dist(dissimilarity matrix)
and ran the isoMDS function.

Is that correct? 

In addition: The stress value was 2.9. I read that this is a percentage. So,
according to the usual manner of reporting stress values from 0 to 1, this
would be .029?

Finally, I compared the solution (diagram) with the alscal-algorithm from
SPSS (however computed on the raw data) and there were some slight
differences. Can you tell me the differences between alscal and isoMDS?

As you might see, I am not very experienced in MDS. Thus, any recommendation
(literature etc.) would be a great help.
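For what it's worth, the usual idiom is as.dist(), which treats an
existing dissimilarity matrix as distances, rather than dist(), which
computes distances between the rows of its argument. A minimal sketch,
assuming a correlation matrix cm:

library(MASS)
d <- as.dist(1 - cm)     # dissimilarities from correlations
fit <- isoMDS(d, k = 2)  # non-metric MDS in two dimensions
fit$stress               # isoMDS reports stress in percent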

Thanks
Holger
-- 
View this message in context: 
http://old.nabble.com/Multidimensional-scaling-tp26126655p26126655.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] possible memory leak in predict.gbm(), package gbm ?

2009-10-30 Thread Markus Loecher
Dear gbm users,
When running predict.gbm() on a "large" dataset (150,000 rows, 300 columns,
500 trees), I notice that the memory used by R grows beyond reasonable
limits. My 14GB of RAM are often not sufficient. I am interpreting this as a
memory leak since there should be no reason to expand memory needs once the
data are loaded and passed to predict.gbm() ?


Running R version 2.9.2 on Linux, gbm package 1.6-3.
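A hedged workaround sketch (not a fix for the underlying issue):
predicting in row chunks keeps the peak memory bounded. fit and newdata
are illustrative names.

idx <- split(seq_len(nrow(newdata)), ceiling(seq_len(nrow(newdata)) / 10000))
pred <- unlist(lapply(idx, function(i)
  predict(fit, newdata = newdata[i, ], n.trees = 500)))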

Thanks,

Markus

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Back-to-back graph with lattice package

2009-10-30 Thread Erik Svensson
Hi,
I am an [R] and lattice beginner.
I am doing an epidemiological study and want to present my data in as
condensed and clear a manner as possible.

The example below is fake, but representative of the data I want to show.

I have a data frame with different categorical data and one column of
numerical data:

Country, Hair, Eyes, Sex, Age
sweden, blond, blue, male, 25
sweden, dark, brown, female, 30
denmark, dark, blue, male, 22
denmark, blond, blue, female 33
norway, dark, blue, female, 21
norway, blond, blue, male, 31
finland, dark, blond, male, 43
island, blond, blue, female, 40
...and so on for 300 rows...

Country <- as.factor(rep(c("Sweden", "Denmark", "Norway", "Finland",
"Iceland","Sweden", "Denmark", "Norway", "Finland","Sweden" ),50))
Hair <- as.factor(rep(c("blond",  "dark", "blond", "dark", "blond"),100))
Eyes <- as.factor(rep(c("blue",  "blue", "blue", "brown"),125))
Sex <- as.factor(rep(c("male",  "female", "female", "male"),125))
Age <- rep(c(1:20, 1:100, 76:80),4)


Country has 5 levels, Age has many.
Hair, Eyes, and Sex have two levels each

I want to do one lattice graph with 5 columns of panels with these
descriptive data.
I would prefer to use the lattice barchart or the corresponding
dotplot with lines (except for the Age panel)

The  Age panel and probably the Country panel I think I can solve myself:
barchart(Country)  # This only gives one panel, though...
densityplot(~ Age|Country,  layout=c(1,5), xlim=c(0,100),scales =
list(y=list (relation="free")))

But, for the dichotomous variables Hair, Eyes, and Sex columns I would
like to have a bihistogram aka back-to-back aka age-sex-pyramid
solution.
I want all bars to be horizontal, the names of the countries to the
left on the y-axis.
I would also like to sort the country names according to a list of their names.

For the back-to-back graph  I have only managed to get two panels per
column instead of one:
barchart(table(Country, Hair),groups=FALSE)

How can I do all this? Is it possible at all?

Erik Svensson

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Fractal Package - Hurst exponent

2009-10-30 Thread Ricardo Rios
Hi, everybody.

My name is Ricardo and I'm doing my doctorate at USP (Brazil). I have used
the 'fractal' package, developed by William Constantine and Donald
Percival, a lot in my research.

However, I have a doubt about the execution of the RoverS function. I was
reading an article (http://www.bearcave.com/misl/misl_tech/wavelets/hurst/)
about the Hurst exponent and I decided to compare the results presented
there with the results provided by the RoverS function. Initially, I got a synthetic
data set, available in
http://www.bearcave.com/misl/misl_tech/wavelets/hurst/brown72.h, and I
loaded the data set into the variable dataH and executed it using the
RoverS(dataH) command.

So, this command returned the result:

> RoverS(dataH)
X
0.7165741

However, the result of the Hurst exponent provided by the website is 0.7270.
Is this a problem, or can I use the value 0.72? Can this difference cause
bigger errors in other data sets? Could anyone help me?

Thanks for your attention!
Ricardo

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Time series temporal disaggregation

2009-10-30 Thread Gabor Grothendieck
See:

https://stat.ethz.ch/pipermail/r-sig-finance/2009q3/004677.html

On Fri, Oct 30, 2009 at 6:09 AM, Axel Leroix  wrote:
>
> Hi,
> This is a newbie question.
> I would like to be able to convert annual time series of flow data into quarterly 
> data. I wonder if there is any existing R function which permits this? In 
> what package?
>
> In the archive, I found that some people mention the "tempDis" package for 
> performing time series temporal disaggregation, but when I try to download it 
> I cannot find it in the list of proposed packages.
>
> Thank you in advance for your help.
>
>
>
>        [[alternative HTML version deleted]]
>
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
>

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Multicore package: sharing/modifying variable accross processes

2009-10-30 Thread Renaud Gaujoux

Hi,

I want to parallelize some computations when it's possible on multicore 
machines.
Each computation produces a big object that I don't want to store if 
not necessary: in the end only the object that best fits my data has to 
be returned. In non-parallel mode, a single global object is updated if 
the current computation gets a better result than the best previously found.
My plan was to use the multicore package. But there is obviously an issue of 
concurrent access to the global result variable.
Is there a way to implement something like a lock/mutex to make the 
procedure thread-safe?

Maybe something already exist to deal with such things?
It looks like the multicore package runs the different processes in different 
environments, with copy-on-change of everything when forking. Has anybody 
experimented with a shared environment and the multicore package?
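One hedged pattern that sidesteps shared state entirely: have each job
return only a small score (plus whatever is needed to recreate the fit),
then recompute just the winner afterwards. run_model() is an illustrative
name for the user's fitting function.

library(multicore)
scores <- unlist(mclapply(1:20, function(i) run_model(i)$score))
best <- run_model(which.max(scores))  # refit only the single best candidate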


Thanks.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Names of list members in a plot using sapply

2009-10-30 Thread Gabor Grothendieck
You could lapply over the names:

junk <- lapply(names(list1), function(nm)
    with(list1[[nm]], plot(x, y, type = "l", main = paste("Graphic of", nm))))

which is a slight improvement though that still involves dealing with
a list and its names separately.

On Fri, Oct 30, 2009 at 3:41 AM, Kenneth Roy Cabrera Torres
 wrote:
> Hi R users:
>
> I got this code to generate a graphic for each member of a lists.
>
> list1<-list(A=data.frame(x=c(1,2),y=c(5,6)),B=data.frame(x=c(8,9),y=c(12,6)))
> names1<-names(list1)
> sapply(1:length(list1),function(i)
> with(list1[[i]],plot(x,y,type="l",main=paste("Graphic of",names1[i]))))
>
> Is there a more elegant solution for not to use two separate lists?
>
> I would like to just use
>
> sapply(list1, somePlotfunction)
>
> and plot for each graphic in the title have the name of the member of
> the list.
>
> Thank you for your help.
>
> Kenneth
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] [R-pkgs] ascii package updated

2009-10-30 Thread David Hajage
Hello,

ascii package has been updated to version 0.3.2.
ascii is a R package for writing asciidoc, txt2tags, sphinx or org documents
with embeded R code. See http://eusebe.github.com/ascii/ for some examples.

News since version 0.2 are:

 - new sphinx driver and output (see http://sphinx.pocoo.org/)
 - new org driver and output (see http://orgmode.org/)
 - ascii.simple.list method
 - ftable method
 - packageDescription method
 - sessionInfo method
 - new option "label" for list.type argument
 - new rownames and colnames arguments
 - cgroup argument works with txt2tags output
 - improve col alignment in txt2tags output
 - improve row and col span (cgroup and rgroup)
 - remove SweaveAscii() function
 - new Asciidoc(), T2t(), Sphinx() and Org() functions (wrapper for
Sweave("file.Rnw", RweaveXxx))

Best,

David Hajage

[[alternative HTML version deleted]]

___
R-packages mailing list
r-packa...@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-packages

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Names of list members in a plot using sapply

2009-10-30 Thread jim holtman
sapply(names(list1), function(.data.){
    with(list1[[.data.]], plot(x, y, type='l', main=paste("Graphic of", .data.)))
})

On Fri, Oct 30, 2009 at 3:41 AM, Kenneth Roy Cabrera Torres
 wrote:
> Hi R users:
>
> I got this code to generate a graphic for each member of a lists.
>
> list1<-list(A=data.frame(x=c(1,2),y=c(5,6)),B=data.frame(x=c(8,9),y=c(12,6)))
> names1<-names(list1)
> sapply(1:length(list1),function(i)
> with(list1[[i]],plot(x,y,type="l",main=paste("Graphic of",names1[i]))))
>
> Is there a more elegant solution for not to use two separate lists?
>
> I would like to just use
>
> sapply(list1, somePlotfunction)
>
> and plot for each graphic in the title have the name of the member of
> the list.
>
> Thank you for your help.
>
> Kenneth
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Time series temporal disaggregation

2009-10-30 Thread Axel Leroix
 
Hi,
This is a newbie question. 
I would like to be able to convert annual time series of flow data into 
quarterly data. I wonder if there is any existing R function which permits 
this? In what package?
 
In the archive, I found that some people mention the "tempDis" package for 
performing time series temporal disaggregation, but when I try to download 
it I cannot find it in the list of proposed packages.
 
Thank you in advance for your help.


  
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] data.frame extracting data row-wise

2009-10-30 Thread Christian Lerch
Nice workaround :-). Thank you.
Best,
Christian
 Original-Nachricht 
> Datum: Fri, 30 Oct 2009 10:54:49 +0100
> Von: Karl Ove Hufthammer 
> An: r-h...@stat.math.ethz.ch
> Betreff: Re: [R] data.frame extracting data row-wise

> On Fri, 30 Oct 2009 10:42:12 +0100 Christian Lerch  
> wrote:
> > I am struggling with extracting data from a data frame:
> > x=data.frame(a=1:11,b=100:110)
> > 
> > What I want is a list/vector in this sense: 1 100 2 101 3 102...
> 
> y=t(as.matrix(x))
> as.vector(y)
> 
> Happy to help.
> 
> -- 
> Karl Ove Hufthammer
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] insert a text in panels, always in the same position (lattice, ltext, ?prepanel?)

2009-10-30 Thread Deepayan Sarkar
On Fri, Oct 30, 2009 at 2:10 AM, Ottorino-Luca Pantani
 wrote:
> Dear R-users,
> my present problem is related to lattice.
>
> I would like to put some text in each panel, namely a p-value.
> I therefore wrote a simple panel function as reported here below.
>
> I'm able to write the value in each panel at the maximum value of y for each
> panel,
> but this obviously overlay the text to the points.
>
> What I'm looking for is to write the text always in the same position along
> the panels,
> say in the middle of each panel
> something like  "ltext(x = 1.5,  y = (ylim[2] - ylim[1])/2 )"
>
> I tried to play with "ylim" and prepanel but unsuccessfully.

Use the (lower-level) grid function grid.text(), which gives you more
flexibility in units. You want something like

 grid.text(label, x = unit(0.5, "npc"), y = unit(0.5, "npc"))

(Of course, a lattice-only solution is also possible; you could also
use current.panel.limits() to compute the center of the panel in
native coordinates, and then use panel.text().)

-Deepayan

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] data.frame extracting data row-wise

2009-10-30 Thread Karl Ove Hufthammer
On Fri, 30 Oct 2009 10:42:12 +0100 Christian Lerch  
wrote:
> I am struggling with extracting data from a data frame:
> x=data.frame(a=1:11,b=100:110)
> 
> What I want is a list/vector in this sense: 1 100 2 101 3 102...

y=t(as.matrix(x))
as.vector(y)

Happy to help.

-- 
Karl Ove Hufthammer

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] data.frame extracting data row-wise

2009-10-30 Thread Christian Lerch
Dear All,

I am struggling with extracting data from a data frame:
x=data.frame(a=1:11,b=100:110)

What I want is a list/vector in this sense: 1 100 2 101 3 102...

For single rows, this works fine:
as.matrix(x)[1,]

For, say 2 rows, this works fine:
z=c(as.matrix(x)[1,],as.matrix(x)[2,])

But
z=c(as.matrix(x)[1:2,])

gives 1 2 100 101!?

Is there an 'automated way' to produce a list/vector from a data frame in which 
data of each row are just added to the end of the previous row?

Many thanks,
Christian

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Applying a function on n nearest neighbours

2009-10-30 Thread Karl Ove Hufthammer
On Fri, 30 Oct 2009 10:28:49 +0100 Karl Ove Hufthammer  
wrote:
> $ (pos=which(order(abs(iris$Sepal.Length-x)) %in% 2:6))

This should of course be:
(pos=order(abs(iris$Sepal.Length-x))[2:6])

-- 
Karl Ove Hufthammer

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] whitespace with subscript text

2009-10-30 Thread Uwe Ligges



e-letter wrote:

Readers,

I have the following command:

expression(A[1]B~"%")

but that causes an error so I changed to:

expression(A[1]~B~"%")

but the result is too much whitespace between subscript and B. Is
there a way to reduce this whitespace?



Use * rather than ~
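For example:

expression(A[1]*B~"%")   # "*" juxtaposes B right after the subscript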

Uwe Ligges



Yours,

rhelp at conference.jabber.org
r 251
mandriva 2008

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] how to test for stationarity in time series?

2009-10-30 Thread Javed Pathan
Hi Luna,
"tseries" package provides various stationarity and unit root
tests including Augmented Dickey-Fuller, Phillips-Perron, and KPSS.
Alternative implementations of the ADF and KPSS tests are in the
"urca" package, which also includes further methods such as
Elliott-Rothenberg-Stock, Schmidt-Phillips and Zivot-Andrews tests. The
"fUnitRoots" package also provides the MacKinnon test. "CADFtest"  provides
implementations of both the standard ADF and a covariate-augmented ADF
(CADF) test.
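For example (a minimal sketch on an illustrative series):

library(tseries)
x <- ts(rnorm(100))
adf.test(x)   # Augmented Dickey-Fuller; null hypothesis: unit root
kpss.test(x)  # KPSS; null hypothesis: level stationarity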

Thanks,
Javed Pathan
Chief Statistician
Manthan Software Services Ltd, Bangalore
India


On Fri, Oct 30, 2009 at 1:39 PM, Liviu Andronic wrote:

> On 10/30/09, Luna Laurent  wrote:
> >  Could anybody tell me how to test for stationarity in time series?
> >
>
> http://www.rseek.org/?cx=010923144343702598753%3Aboaz1reyxd4&q=stationarity+time-series&sa=Search+functions%2C+lists%2C+and+more&cof=FORID%3A11
>
> Liviu
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] x-y plot as an histogram

2009-10-30 Thread Federico Abascal
Thanks for all the suggestions.

Cut + plotmeans (gplots) solved my problem.
> groups <- cut(t[,1],breaks=5)
> plotmeans(t[,2]~groups)

Thanks!
Federico

--- El jue, 29/10/09, Jim Lemon  escribió:

De: Jim Lemon 
Asunto: Re: [R] x-y plot as an histogram
Para: "Federico Abascal" 
CC: r-help@r-project.org
Fecha: jueves, 29 de octubre, 2009 23:29

On 10/30/2009 06:12 AM, Federico Abascal wrote:
> Hi,
> 
> I am investigating a problem for which I found no solution.
> I have a matrix with two columns. I have plotted it as an x-y plot 
> (plot(matrix[,1],matrix[,2])
> But this is not apropriate for my purposes. I need to group the data in 
> matrix[,1] into groups (as an histogram would do). Then I have to calculate 
> the average of matrix[,2] for each group.
> 
> Before starting with loops and loops I would like to know if is there some 
> way to do this easily with R.
Hi Federico,
If I understand your problem, I think you want first to use "cut":

groups<-cut(matrix[,1],breaks=???)

where ??? is the vector of breakpoints that you want. Then you can use whatever 
averaging function (AVEFUN) you want with something like "by":

by(matrix[,2],groups,AVEFUN)

Jim




  
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Temperature Prediction Model

2009-10-30 Thread bartjoosen

Hi,

also interested...

If you are checking for non-normal behaviour, first let us define normal
behaviour: only small temperature changes and no steep ramps?

If so, maybe you can make a rolling average of the last x points, and check
whether the following point deviates by more than ... ?

Or is it more like a trend analysis, to see whether the last variation is a
trend or normal variation?
Then you can take a look at change point analysis.

gl

Bart



Aneeta wrote:
> 
> Thank you Clint for your response. I am happy to know that you have gotten
> interested in this analysis.:-)
> 
> Let me give you some details about sensor networks that would help you
> understand my goal better. Sensor nodes run on small batteries which have
> limited life. So communication amongst various nodes is kept at a minimum
> to preserve battery life. Hence, I am working on a localized analysis
> which each node could perform on its own. 
> 
> The disturbances caused by the malicious nodes would be abrupt and for a
> short span. But please note that the data set that I have supplied is from
> a perfectly working sensor network which is not under any attack. The
> attack model will be simulated separately. I apologize if I have given you
> the idea that the data set consists of noisy data. 
> 
> Ideally I would first like to build a model for the normal behaviour of
> each node from the data set which is at hand. Once we are able to define
> the normal behaviour I could introduce some gorillas and see if we can
> detect that some node is under attack. I have this part figured out.
> 
> I am not looking at a day-of-week dependence. Rather I am looking for a
> time-of-day dependence. We could take into consideration more than 7 days
> of data if that gives us a stronger model. What I was hoping to do was to
> plot the data for the past 7-days on top of each other to get a general
> idea on how the temperature varies throughout the day and thus build an
> equation which would calculate the temperature given the time of the day. 
> 
> Thank you again for your help.
> 
> Best Regards,
> Aneeta
> 
> 
> 
> 
> 
> Clint Bowman wrote:
>> 
>> Aneeta,
>> 
>> My "gorilla and mouse" analogies were referring to the magnitude of 
>> the disturbance and also to its time signature.  Are you only 
>> interested in the large disturbance which is abrupt (the gorilla)? 
>> Or do you also want to be able to detect the more surreptitious 
>> attack which may be quite gradual (the mouse)?
>> 
>> You will want to define the magnitude (and perhaps the associated 
>> duration) of the smallest disturbance that would be important.  I 
>> would look at the entire data set to see what would be the 
>> likelihood of detecting such a change given the noise in the 
>> temperature data.  Or alternatively, use the global analysis to 
>> help define the minimum disturbance that could be detected.
>> 
>> Then see what can be done with just the first 7 days of data (or 
>> for matter the past 7 days regardless of when they occur).
>> 
>> I applaude your goal of looking at each sensor without referring to 
>> other nodes but I think I would develop the analysis by looking for 
>> anomalies in one sensor's data when compared with other sensors and 
>> then focusing on those periods to determine an approach for 
>> detecting a disturbance.
>> 
>> Because you are looking at 7 days, should we assume that you expect 
>> a day-of-week dependence?  If so, I'd be more comfortable if you 
>> used more than one week to develop it.
>> 
>> I fear that you've gotten me quite interested in this analysis, 
>> good luck.
>> 
>> Clint
>> 
>> -- 
>> Clint Bowman INTERNET:   cl...@ecy.wa.gov
>> Air Quality Modeler  INTERNET:   cl...@math.utah.edu
>> Department of EcologyVOICE:  (360) 407-6815
>> PO Box 47600 FAX:(360) 407-7534
>> Olympia, WA 98504-7600
>> 
>> On Sun, 25 Oct 2009, Aneeta wrote:
>> 
>>>
>>> Thank you everyone for all the responses.
>>>
>>> Clint you are correct in assuming that the problem deals with sensors in
>>> a
>>> lab setup which can be assumed to be isolated from outside temperature
>>> changes. And, I am only dealing with temperature so the other parameters
>>> are
>>> not important.
>>>
>>> There will be no gorillas or mouses in the picture but rather some
>>> malicious
>>> attacker who would try to cause disturbances in the normal readings.
>>> That is
>>> why it is important to have an equation that defines 'normal behaviour'.
>>>
>>> The data-sets contain readings for multiple days. I want to take the
>>> first 7
>>> days for each node and establish a relationship between time(column 2)
>>> and
>>> temperature(column 4).
>>>
>>> My objective is not to model temperature variation throughout the year
>>> and
>>> take into consideration climatic changes. Rather, it is to define a
>>> model
>>> for the given data which happens to be temperature recorded by nodes. In
>>> a
>>> simple w

Re: [R] similarity measure for binary data

2009-10-30 Thread Javed Pathan
Hi Karuna,
You can try vegdist() function in the vegan package. The function
computes dissimilarity indices that are useful for or popular with community
ecologists. All indices use quantitative data, although they would be named
by the corresponding binary index, but you can calculate the binary index
using an appropriate argument.
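A minimal sketch, assuming a 0/1 community matrix m:

library(vegan)
d <- vegdist(m, method = "jaccard", binary = TRUE)  # binary Jaccard dissimilarity
hc <- hclust(d)  # feeds directly into hierarchical clustering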
Thanks,
Javed Pathan
Chief Statistician
Manthan Software Services Ltd,Bangalore
India



On Fri, Oct 30, 2009 at 1:19 PM, Rodrigo Villegas wrote:

> The 'Gower' metric is one that is commonly used.
>
> Rod
>
> On Thu, Oct 29, 2009 at 10:22 AM, karuna m  wrote:
> > I am doing hierarchical clustering with the cluster package.  I could not find
> similarity measures like the matching coefficient, Jaccard coefficient, and Sokal
> and Sneath. Could anyone please tell me of a package with similarity measures for
> binary data?
> > kind regards,
> >
> > Ms.Karunambigai M
> > PhD Scholar
> > Dept. of Biostatistics
> > NIMHANS
> > Bangalore
> > India
> >
> >
> >[[alternative HTML version deleted]]
> >
> >
> > __
> > R-help@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
> >
> >
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Applying a function on n nearest neighbours

2009-10-30 Thread Karl Ove Hufthammer
I'm having a problem where I have to apply a function to a subset of a 
variable, where the subset is defined by the n nearest neighbours of a 
second variable.

Here's an example applied to the 'iris' dataset:

$ head(iris)
  Sepal.Length Sepal.Width Petal.Length Petal.Width Species
1  5.1 3.5  1.4 0.2  setosa
2  4.9 3.0  1.4 0.2  setosa
3  4.7 3.2  1.3 0.2  setosa
4  4.6 3.1  1.5 0.2  setosa
5  5.0 3.6  1.4 0.2  setosa
6  5.4 3.9  1.7 0.4  setosa

For each row, I look at the value of Sepal.Length. I then figure out the 
n rows where the value of Sepal.Length is closest to that in the 
original row, and apply a function on the values of Sepal.Width to these 
rows (typically returning a scalar).

For example, setting n = 5 and calculating the mean on a slightly 
modified dataset, based on the first row (Sepal.Length ~= 5.1):

$ set.seed(1)
$ iris[,1:4]=iris[,1:4]+runif(150)/100
$ x=iris$Sepal.Length[1]
$ (pos=which(order(abs(iris$Sepal.Length-x)) %in% 2:6))
[1] 18 26 40 42 52
$ mean(iris$Sepal.Width[pos])
[1] 3.086595

Now, I could easily use a 'for' loop or 'sapply' to do this for all 
rows, but I would think there is a better (and perhaps even faster?) 
way. Anyone know of a specific function in a package for this sort of 
thing?

Also note that this way of doing it won't necessarily work on the 
unmodified dataset, where a number of rows have the same values for 
'Sepal.Length', and the original row won't necessarily have 'order' 
value equal to 1. (Exactly how to break ties when there are more than n 
observations with the same distance to the original row isn't very 
important, though. For example, using the ones with the lowest row 
numbers would be an OK solution, and so would n random ones.)

-- 
Karl Ove Hufthammer

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] whitespace with subscript text

2009-10-30 Thread e-letter
Readers,

I have the following command:

expression(A[1]B~"%")

but that causes an error so I changed to:

expression(A[1]~B~"%")

but the result is too much whitespace between subscript and B. Is
there a way to reduce this whitespace?

Yours,

rhelp at conference.jabber.org
r 251
mandriva 2008

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] insert a text in panels, always in the same position (lattice, ltext, ?prepanel?)

2009-10-30 Thread Ottorino-Luca Pantani

Dear R-users,
my present problem is related to lattice.

I would like to put some text in each panel, namely a p-value.
I therefore wrote a simple panel function as reported here below.

I'm able to write the value in each panel at the maximum value of y for 
each panel, but this obviously overlays the text on the points.

What I'm looking for is to write the text always in the same position 
along the panels,

say in the middle of each panel
something like  "ltext(x = 1.5,  y = (ylim[2] - ylim[1])/2 )"

I tried to play with "ylim" and prepanel but unsuccessfully.

Thanks a lot in advance

#

df <-
 expand.grid(a = gl(3,1),
 b = gl(2,1),
 c = gl(2,1),
 d = gl(2,1),
 rep = gl(3,1))
df$x <-rnorm(nrow(df))
df$x[c(1, 3, 25, 27, 49, 51)] <- df$x[c(1, 3, 25, 27, 49, 51)] + 5
## modify some values in order to have some significant values (1st and 
3rd panel)

my.panel <- function (x, y, ...)
{
 lm1 <- lm(y ~ x) ## build the model
 p.value.diff <- coef(summary(lm1))[2, 4]## extract the p-value
  panel.fill(col = ifelse(p.value.diff < 0.05, "lightgrey", "transparent"))
  ## background color as a function of the p-value

  panel.xyplot(x, y) ## draw the points
  ## insert the p-value in each panel
  ltext(x = 1.1,
        y = max(y),
        ## the above is the problematic point; I tried also
        ## y = panel.default.xyplot(list(ylim[2])) but without success
        labels = paste("p", as.character(round(p.value.diff, 3)), sep = "="))

}
##
xyplot(x ~ d| a + b + c,
  data = df,
  layout = c(6, 2),
  strip = strip.custom(strip.names = TRUE, strip.levels =TRUE),   
  panel = my.panel)


--
Ottorino-Luca Pantani, Università di Firenze
Dip. Scienza del Suolo e Nutrizione della Pianta
P.zle Cascine 28 50144 Firenze Italia
Ubuntu 8.04.3 LTS -- GNU Emacs 23.0.60.1 (x86_64-pc-linux-gnu, GTK+ 
Version 2.12.9)

ESS version 5.5 -- R 2.9.2

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Removing & generating data by category

2009-10-30 Thread Adaikalavan Ramasamy
Hmm, so if I read correctly you want to remove exactly duplicated rows. So 
maybe try the following to begin with.


 duplicated(newdf[ , c("id", "loc", "clm")])
 [1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE  TRUE  TRUE  TRUE


Then you can remove the duplicated rows before proceeding with what has 
been suggested before.


Also you can try unique(newdf[ , c("id", "loc", "clm")]) if you are not 
interested in carrying over other corresponding variables.


See help(duplicated) and help(unique).

Regards, Adai




David Winsemius wrote:

Color me puzzled. Can you express the rules more clearly in Boolean logic?

If someone has five policies: 3 Life and 2 General ...  is he in or out?

Applying the alternate strategy to that data set I get:
out <- tapply(dat$clm, dat$uid, paste, collapse=",")
 > out
                         A1.B1                          A2.B2                          A3.B1
                     "General"                 "General,Life"                      "General"
                         A3.B3                          A4.B4                          A5.B5
"General,Life,General,General"         "General,Life,General"                 "General,Life"


Please explain why you want A3.B3.



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Agglomerative Information Bottleneck (AIB)

2009-10-30 Thread rami jiossy

Hi;

Is anybody aware of an R implementation of AIB (Agglomerative Information 
Bottleneck) by Slonim and Tishby?

In which package might I find it?

Thanks
  
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] [R-SIG-Finance] Fast optimizer

2009-10-30 Thread Christophe Dutang


On 30 Oct 2009, at 02:21, R_help Help wrote:


Ok. I have the following likelihood function.

L <- p*dpois(x,a)*dpois(y,b+c)+(1-p)*dpois(x,a+c)*dpois(y,b)

where I have 100 points of (x,y) and parameters c(a,b,c,p) to
estimate. Constraints are:

0 < p < 1
a,b,c > 0
c < a
c < b

I construct a loglikelihood function out of this. First ignoring the
last two constraints, it takes optim with box constraint about 1-2 min
to estimate this. I have to estimate the MLE on about 200 rolling
windows. This will take very long. Is there any faster implementation?
Take a look at the CRAN task view on optimisation; you may find faster  
algorithms.




Secondly, I cannot incorporate the last two constraints using the optim  
function.

You can try the trick:

Ltilde <- function(a, b, c, p)
  ifelse(c < a & c < b,
         p*dpois(x,a)*dpois(y,b+c) + (1-p)*dpois(x,a+c)*dpois(y,b),
         -10^6)


Christophe



Thank you,

rc



On Thu, Oct 29, 2009 at 9:02 PM, Ravi Varadhan   
wrote:


You have hardly given us any information for us to be able to help  
you.  Give us more information on your problem, and, if possible, a  
minimal, self-contained example of what you are trying to do.


Ravi.


Ravi Varadhan, Ph.D.
Assistant Professor,
Division of Geriatric Medicine and Gerontology
School of Medicine
Johns Hopkins University

Ph. (410) 502-2619
email: rvarad...@jhmi.edu


- Original Message -
From: R_help Help 
Date: Thursday, October 29, 2009 8:24 pm
Subject: [R] Fast optimizer
To: r-help@r-project.org, r-sig-fina...@stat.math.ethz.ch



Hi,

I'm using optim with box constraints to do MLE on about 100 data points.

It goes quite slowly even on a 4GB machine. I'm wondering if R has any
faster implementation? Also, if I'd like to impose equality/inequality
constraints on the parameters, which package should I use? Any help
would be appreciated. Thank you.

rc

__
R-help@r-project.org mailing list

PLEASE do read the posting guide
and provide commented, minimal, self-contained, reproducible code.




___
r-sig-fina...@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-sig-finance
-- Subscriber-posting only.
-- If you want to post, subscribe first.


--
Christophe Dutang
Ph.D. student at ISFA, Lyon, France
website: http://dutangc.free.fr

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] how to test for stationarity in time series?

2009-10-30 Thread Liviu Andronic
On 10/30/09, Luna Laurent  wrote:
>  Could anybody tell me how to test for stationarity in time series?
>
http://www.rseek.org/?cx=010923144343702598753%3Aboaz1reyxd4&q=stationarity+time-series&sa=Search+functions%2C+lists%2C+and+more&cof=FORID%3A11

Liviu

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] ggplot2: Histogram with negative values on x-axis doesn't work

2009-10-30 Thread baptiste auguie
2009/10/30 hadley wickham :
> I read anything that mentions
> ggplot2 no matter where it is.
>

... one should hope this statement only applies to "the Internet"
though, does it? Please do share your regexp if it's not the case.

:)

baptiste

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] similarity measure for binary data

2009-10-30 Thread Rodrigo Villegas
The 'Gower' metric is one that is commonly used.

Rod

On Thu, Oct 29, 2009 at 10:22 AM, karuna m  wrote:
> I am doing hierarchical clustering with the cluster package.  I could not find 
> similarity measures like the matching coefficient, Jaccard coefficient, and Sokal 
> and Sneath. Could anyone please tell me of a package with similarity measures for 
> binary data?
> kind regards,
>
> Ms.Karunambigai M
> PhD Scholar
> Dept. of Biostatistics
> NIMHANS
> Bangalore
> India
>
>
>        [[alternative HTML version deleted]]
>
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
>

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

