Re: [R] readline() and Rterm in Windows

2005-11-03 Thread Mikkel Grum
I've tried your proposal in a number of ways, and
there must be something I'm not understanding. If I
run your script (using source() in RGui, or ctrl-R
from the R Editor, I get:

 conout - file('CONOUT$','w')
Error in file(CONOUT$, w) : unable to open
connection
In addition: Warning message:
cannot open file 'CONOUT$', reason 'Permission denied'



so I added the path as in:

conout - file('C:\\R\\R-2.2.0\\CONOUT$','w')
conin - file('C:\\R\\R-2.2.0\\CONIN$', 'r')
cat('Please enter an ID:', file=conout)
flush(conout)
id - readLines(conin, 1)
print(id)


Using RGui and ctrl-R from the R Editor, I get

 conout - file('C:\\R\\R-2.2.0\\CONOUT$','w')
 conin - file('C:\\R\\R-2.2.0\\CONIN$', 'r')
Error in file(C:\\R\\R-2.2.0\\CONIN$, r) : 
unable to open connection
In addition: Warning message:
cannot open file 'C:\R\R-2.2.0\CONIN$', reason 'No
such file or directory' 
 cat('Please enter an ID:', file=conout)
 flush(conout)
 id - readLines(conin, 1)
Error in readLines(conin, 1) : object conin not
found

and with
 source(foo.R)
Error in file(C:\\R\\R-2.2.0\\CONIN$, r) : 
unable to open connection
In addition: Warning message:
cannot open file 'C:\R\R-2.2.0\CONIN$', reason 'No
such file or directory' 
 

When I create a batch file with the following command
:
C:\R\R-2.2.0\bin\Rterm.exe --vanilla
C:\R\R-2.2.0\foo.R C:\R\R-2.2.0\foo.out

and double click on the batch file, the out file gives
me:

R : Copyright 2005, The R Foundation for Statistical
Computing
Version 2.2.0  (2005-10-06 r35749)
ISBN 3-900051-07-0
. . .
Type 'q()' to quit R.

 conout - file('C:\\R\\R-2.2.0\\CONOUT$','w')
 conin - file('C:\\R\\R-2.2.0\\CONIN$', 'r')

and nothing else. In none of the situations do I get
prompted for input. What am I doing hopelessly wrong?

Mikkel

--- Duncan Murdoch [EMAIL PROTECTED] wrote:

 Mikkel Grum wrote:
  Duncan and Prof, thanks for your comments and
  apologies for not being more specific. I'm not
 getting
  the same results you get from the steps you
 propose.
  
  If I write a script foo.R with two lines
  
  id - readline(Please enter an ID: )
  id
  
  and then use source(foo.R) (either at the Rterm
  prompt, or in RGui) it is true that get prompted,
 but
  the second line does not visibly run, i.e. I get
  
 source(id.r)
  
  Please enter an ID: 5
  
  
  and if I then type id, I get
  
 id
  
  [1] id
  
  If I cut and paste the two lines in RGui (in one
 go),
  I get
  
 id - readline(Please enter an ID: )
  
  Please enter an ID: id
  
  
  What I really want is a batch file on the desktop
 with
  the following commands:
  
 c:\r\R-2.2.0\bin\Rterm.exe --no-save
 --no-restore
  script.R script.out 21
 c:\texmf\miktex\bin\latex
  \nonstopmode\input{blue.tex}
  
  
  and script.R reads something like:
  
 id - readline(Please enter an ID: )
 id
 Sweave(blue.Rnw)
  
  I said that script.R didn't run, which was an
  incorrect description. It runs without prompting
 for
  the ID, and gives error messages all through
 because
  blue.Rnw needs the id.
  
  This is a very simplified version of what I'm
 doing,
  but if I use only the first line of the batch file
 and
  the first two lines of the script and could get
 that
  to work, I could figure out the rest.
 
 It won't work so simply.  You're redirecting stdin,
 so user input would 
 be taken from there; you're redirecting stdout and
 stderr, so the prompt 
 won't be visible to the user.
 
 You need to open new handles to the console.  The
 code below will do it 
 in Windows; the syntax to specify the console in
 Unix-alikes will be 
 different (but I don't know what it is).
 
 conout - file('CONOUT$','w')
 conin - file('CONIN$', 'r')
 cat('Please enter an ID:', file=conout)
 flush(conout)
 id - readLines(conin, 1)
 print(id)
 
 Duncan Murdoch


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] R save very huge matrices in files

2005-11-03 Thread moritz . marienfeld

I have to work with really huge matrices (about 1000*1000 or more). And I want 
to save those matrices in some file on my computer.
I tried to do so by using the command

write.tabe(SMatrix,file=C:/Programme/rw1061/SMatrix.txt,sep= 
,quote=FALSE,row.names=FALSE,col.names=FALSE)

SMatrix is the matrix I want as a file.

Unfortunately this does not work. Error message:

Error: cannot allocate vector of size 32665 Kb
In addition: Warning message: 
Reached total allocation of 255Mb: see help(memory.size) 

Is there any command, which could help? What can I to in order to save SMatrix?

Moritz Marienfeld

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] readline() and Rterm in Windows

2005-11-03 Thread Duncan Murdoch
Mikkel Grum wrote:
 I've tried your proposal in a number of ways, and
 there must be something I'm not understanding. If I
 run your script (using source() in RGui, or ctrl-R
 from the R Editor, I get:

It requires a command line console, i.e. it will only work in Rterm, not 
Rgui.  I was assuming you'd run it using the style of your batch file 
down below, but without changing the paths.

Duncan Murdoch
 
 
conout - file('CONOUT$','w')
 
 Error in file(CONOUT$, w) : unable to open
 connection
 In addition: Warning message:
 cannot open file 'CONOUT$', reason 'Permission denied'
 
 
 
 so I added the path as in:
 
 conout - file('C:\\R\\R-2.2.0\\CONOUT$','w')
 conin - file('C:\\R\\R-2.2.0\\CONIN$', 'r')
 cat('Please enter an ID:', file=conout)
 flush(conout)
 id - readLines(conin, 1)
 print(id)
 
 
 Using RGui and ctrl-R from the R Editor, I get
 
 
conout - file('C:\\R\\R-2.2.0\\CONOUT$','w')
conin - file('C:\\R\\R-2.2.0\\CONIN$', 'r')
 
 Error in file(C:\\R\\R-2.2.0\\CONIN$, r) : 
 unable to open connection
 In addition: Warning message:
 cannot open file 'C:\R\R-2.2.0\CONIN$', reason 'No
 such file or directory' 
 
cat('Please enter an ID:', file=conout)
flush(conout)
id - readLines(conin, 1)
 
 Error in readLines(conin, 1) : object conin not
 found
 
 and with
 
source(foo.R)
 
 Error in file(C:\\R\\R-2.2.0\\CONIN$, r) : 
 unable to open connection
 In addition: Warning message:
 cannot open file 'C:\R\R-2.2.0\CONIN$', reason 'No
 such file or directory' 
 
 
 When I create a batch file with the following command
 :
 C:\R\R-2.2.0\bin\Rterm.exe --vanilla
 C:\R\R-2.2.0\foo.R C:\R\R-2.2.0\foo.out
 
 and double click on the batch file, the out file gives
 me:
 
 R : Copyright 2005, The R Foundation for Statistical
 Computing
 Version 2.2.0  (2005-10-06 r35749)
 ISBN 3-900051-07-0
 . . .
 Type 'q()' to quit R.
 
 
conout - file('C:\\R\\R-2.2.0\\CONOUT$','w')
conin - file('C:\\R\\R-2.2.0\\CONIN$', 'r')
 
 
 and nothing else. In none of the situations do I get
 prompted for input. What am I doing hopelessly wrong?
 
 Mikkel
 
 --- Duncan Murdoch [EMAIL PROTECTED] wrote:
 
 
Mikkel Grum wrote:

Duncan and Prof, thanks for your comments and
apologies for not being more specific. I'm not

getting

the same results you get from the steps you

propose.

If I write a script foo.R with two lines

 id - readline(Please enter an ID: )
 id

and then use source(foo.R) (either at the Rterm
prompt, or in RGui) it is true that get prompted,

but

the second line does not visibly run, i.e. I get


source(id.r)

Please enter an ID: 5


and if I then type id, I get


id

[1] id

If I cut and paste the two lines in RGui (in one

go),

I get


id - readline(Please enter an ID: )

Please enter an ID: id


What I really want is a batch file on the desktop

with

the following commands:

   c:\r\R-2.2.0\bin\Rterm.exe --no-save

--no-restore

script.R script.out 21
   c:\texmf\miktex\bin\latex
\nonstopmode\input{blue.tex}


and script.R reads something like:

   id - readline(Please enter an ID: )
   id
   Sweave(blue.Rnw)

I said that script.R didn't run, which was an
incorrect description. It runs without prompting

for

the ID, and gives error messages all through

because

blue.Rnw needs the id.

This is a very simplified version of what I'm

doing,

but if I use only the first line of the batch file

and

the first two lines of the script and could get

that

to work, I could figure out the rest.

It won't work so simply.  You're redirecting stdin,
so user input would 
be taken from there; you're redirecting stdout and
stderr, so the prompt 
won't be visible to the user.

You need to open new handles to the console.  The
code below will do it 
in Windows; the syntax to specify the console in
Unix-alikes will be 
different (but I don't know what it is).

conout - file('CONOUT$','w')
conin - file('CONIN$', 'r')
cat('Please enter an ID:', file=conout)
flush(conout)
id - readLines(conin, 1)
print(id)

Duncan Murdoch

 
 
 
 
   
   
 __ 
 Yahoo! Mail - PC Magazine Editors' Choice 2005 
 http://mail.yahoo.com

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] RODBC and Excel: Wrong Data Type Assumed on Import

2005-11-03 Thread Petr Pikal
Hi

As I now exclusively use copy paste method to transfer data from 
Excel to R I tried it and I got correctly a factor column when there 
were some non numeric data in Excel.

Ctrl-C in Excel
mydf-read.delim(clipboard) in R


Are you sure that a respective column in Excel has values 275a and 
275b in it? 

If yes I had tried to define colClasses vector for your columns.

HTH
Petr


On 2 Nov 2005 at 12:45, Earl F. Glynn wrote:

To: r-help@stat.math.ethz.ch
From:   Earl F. Glynn [EMAIL PROTECTED]
Date sent:  Wed, 2 Nov 2005 12:45:53 -0600
Subject:[R] RODBC and Excel:  Wrong Data Type Assumed on Import

 The first column in my Excel sheet has mostly numbers but I need to
 treat it as character data:
 
  library(RODBC)
  channel - odbcConnectExcel(U:/efg/lab/R/Plasmid/construct
  list.xls) plasmid - sqlFetch(channel,Sheet1, as.is=TRUE)
  odbcClose(channel)
 
  names(plasmid)
 [1] Plasmid Number PlasmidConcentration  Comments
 Lost
 
 # How is the type decided?  I need a character type.
  class(plasmid$Plasmid Number)
 [1] numeric
  typeof(plasmid$Plasmid Number)
 [1] double
 
  plasmid$Plasmid Number[273:276]
 [1] 274  NA  NA 276
 
 The two NAs are supposed to be 275a and 275b.  I tried the
 as.is=TRUE but that didn't help.
 
 I consulted Section 4, Relational databases, in the R Data
 Import/Export document (for Version 2.2.0).
 
 Section 4.2.2, Data types, was not helpful.  In particular, this did
 not seem helpful:  The more comprehensive of the R interface packages
 hide the type conversion issues from the user.
 
 Section 4.3.2, Package RODBC, provided a simple example of using ODBC
 .. with a(sic) Excel spreadsheet but is silent on how to control the
 data type on import.  Could the documentation be expanded to address
 this issue?
 
 I really need to show Plasmid 275a and Plasmid 275b instead of
 Plasmid NA.
 
 Thanks for any help with this.
 
 efg
 --
 Earl F. Glynn
 Scientific Programmer
 Bioinformatics Department
 Stowers Institute for Medical Research
 
 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide!
 http://www.R-project.org/posting-guide.html

Petr Pikal
[EMAIL PROTECTED]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Bug report on get.hist.quote

2005-11-03 Thread Adrian Trapletti


get.hist.quote(instrument=INR/USD, provider=oanda, start=2005-10-20)
  

trying URL 
'http://www.oanda.com/convert/fxhistory?lang=endate1=10%2F20%2F2005date=11%2F01%2F2005date_fmt=usexch=INRexch2=expr=USDexpr2=margin_fixed=0SUBMIT=Get+Tableformat=ASCIIredirected=1'
Content type 'text/html' length unknown
opened URL
.. ...
downloaded 13Kb

2005-10-20 2005-10-21 2005-10-22 2005-10-23 2005-10-24 2005-10-25 2005-10-26 
   0.022200.022180.022240.022240.022240.022190.02226 
2005-10-27 2005-10-28 2005-10-29 2005-10-30 2005-10-31 2005-11-01 
   0.022240.022250.022240.022240.022240.02221 


The answer is wrong. What is shown here is USD/INR, not INR/USD.


  

Ajay,

This is not a bug! The answer is correct. *1 Indian Rupee = 0.02220 US 
Dollar on *2005-10-20 and this is also the standard way currencies are 
quoted. Pls read, e.g.

http://fxtrade.oanda.com/currency_trading/fxbasics.shtml

Best regards
Adrian

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] quadratic form

2005-11-03 Thread Alvarez Pedro
On page 22 of the R-introduction guide it's written:

the quadratic form x^{'} A^{-1} x which is used in
multivariate computations, should be computed by
something like x%*%solve(A,x), rather than computing
the inverse of A.

Why isn't it good to compute t(x) %*% solve(A) %*% x?

Thanks a lot for help!

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] merging dataframes

2005-11-03 Thread Gavin Simpson
Dear List,

I often have to merge two or more data frames containing unique row
names but with some columns (names) common to the two data frames and
some columns not common. This toy example will explain the kind of setup
I am talking about:

mat1 - as.data.frame(matrix(rnorm(20), nrow = 5))
mat2 - as.data.frame(matrix(rnorm(20), nrow = 4))
rownames(mat1) - paste(site, 1:5, sep=)
rownames(mat2) - paste(site, 6:9, sep=)
names(mat1) - paste(species, c(1,3,5,7), sep=)
names(mat2) - paste(species, c(2,3,4,7,9), sep=)
mat1
mat2

So sites (rows) are unique across both data frames, but there are only 7
unique species (columns):

unique(c(names(mat1), names(mat2)))

merge(mat1, mat2, all = TRUE)

gives almost what I want, but it drops or looses the rownames()
information from the two merged data frames, and it re-orders the rows
so that one simply cannot write back the correct row names.

How might I go about merging two data frames as I have described, but
preserve the row names and more importantly, keep the order of rows the
same, so that rows from mat1 come before the rows of mat2?

Many thanks,

Gavin
-- 
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
Gavin Simpson [T] +44 (0)20 7679 5522
ENSIS Research Fellow [F] +44 (0)20 7679 7565
ENSIS Ltd.  ECRC [E] gavin.simpsonATNOSPAMucl.ac.uk
UCL Department of Geography   [W] http://www.ucl.ac.uk/~ucfagls/cv/
26 Bedford Way[W] http://www.ucl.ac.uk/~ucfagls/
London.  WC1H 0AP.
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] ML optimization question--unidimensional unfolding scalin g

2005-11-03 Thread Liaw, Andy
Alternatively, just type debug(optim) before using it, then step through it
by hitting enter repeatedly...

When you're done, do undebug(optim).

Andy

 From: Spencer Graves
 
 Have you looked at the code for optim?  If you 
 execute optim, it 
 will list the code.  You can copy that into a script file and walk 
 through it line by line to figure out what it does.  By doing 
 this, you 
 should be able to find a place in the iteration where you can 
 test both 
 branches of each bifurcation and pick one -- or keep a list 
 of however 
 many you want and follow them all more or less 
 simultaneously, pruning 
 the ones that seem too implausible.  Then you can alternate between a 
 piece of the optim code, bifurcating and pruning, adjusting 
 each and 
 printing intermediate progress reports to help you understand 
 what it's 
 doing and how you might want to modify it.
 
 With a bit more effort, you can get the official 
 source code with 
 comments.  To do that, I think you go to www.r-project.org 
 - CRAN - 
 (select a local mirror) - Software:  R sources.  From there, just 
 download The latest release:  R-2.2.0.tar.gz.
 
 For more detailed help, I suggest you try to think of 
 the simplest 
 possible toy problem that still contains one of the issues 
 you find most 
 difficult.  Then send that to this list.  If readers can copy a few 
 lines of R code from your email into R and try a couple of things in 
 less than a minute, I think you might get more useful replies quicker.
 
 Best Wishes,
 Spencer Graves
 
 Peter Muhlberger wrote:
 
  Hi Spencer:  Thanks for your interest!  Also, the posting 
 guide was helpful.
  
  I think my problem might be solved if I could find a way to 
 terminate nlm or
  optim runs from within the user-given minimization function 
 they call.
  Optimization is unconstrained.
  
  I'm essentially using normal like curves that translate 
 observed values on a
  set of variables (one curve per variable) into latent 
 unfolded values.  The
  observed values are on the Y-axis  the latent (hence 
 parameters to be
  estimated) are on the X-axis.  The problem is that there 
 are two points into
  which an observed value can map on a curve--one on either 
 side of the curve
  mean.  Only one of these values actually will be optimal 
 for all observed
  variables, but it's easy to show that most estimation 
 methods will get stuck
  on the non-optimal value if they find that one first.  
 Moving away from that
  point, the likelihood gets a whole lot worse before the 
 routine will 'see'
  the optimal point on the other side of the normal curve.
  
  SANN might work, but I kind of wonder how useful it'd be in 
 estimating
  hundreds of parameters--thanks to that latent scale.
  
  My (possibly harebrained) thought for how to estimate this 
 unfolding using
  some gradient-based method would be to run through some 
 iterations and then
  check to see whether a better solution exists on the 'other 
 side' of the
  normal curves.  If it does, replace those parameters with 
 the better ones.
  Because this causes the likelihood to jump, I'd probably 
 have to start the
  estimation process over again (maybe).  But, I see no way 
 from within the
  minimization function called by NLM or optim to tell NLM or optim to
  terminate its current run.  I could make the algorithm 
 recursive, but that
  eats up resources  will probably have to be terminated w/ an error.
  
  Peter
  
  
  On 10/11/05 11:11 PM, Spencer Graves 
 [EMAIL PROTECTED] wrote:
  
  
  There may be a few problems where ML (or more generally 
 Bayes) fails
 to give sensible answers, but they are relatively rare.
 
  What is your likelihood?  How many parameters are you trying to
 estimate?
 
  Are you using constrained or unconstrained optimization?  If
 constrained, I suggest you remove the constraints by appropriate
 transformation.  When considering alternative transformations, I
 consider (a) what makes physical sense, and (b) which transformation
 produces a log likelihood that is more close to being parabolic.
 
  Hou are you calling optim?  Have you tried all SANN as well as
 Nelder-Mead, BFGS, and CG?  If you are using constrained
 optimization, I suggest you move the constraints to Inf by 
 appropriate
 transformation and use the other methods, as I just suggested.
 
  If you would still like more suggestions from this group, please
 provide more detail -- but as tersely as possible.  The 
 posting guide
 is, I believe, quite useful (www.R-project.org/posting-guide.html).
 
  spencer graves
  
  
  __
  R-help@stat.math.ethz.ch mailing list
  https://stat.ethz.ch/mailman/listinfo/r-help
  PLEASE do read the posting guide! 
 http://www.R-project.org/posting-guide.html
 
 -- 
 Spencer 
 Graves, PhD
 Senior Development Engineer
 PDF Solutions, Inc.
 333 West San Carlos Street Suite 700
 San Jose, CA 95110, USA
 
 [EMAIL PROTECTED]
 

[R] How to calculate errors in histogram values

2005-11-03 Thread Kilian Hagemann
Hi there,

I'm new to R but I thought this is the most likely place I could get advice or 
hints w.r.t the following problem:

I have a series of measurements xi with associated uncertainties dxi. I would 
like to construct the probability density histogram of this data where each 
density estimate has an associated error that is derived from the dxi. In 
other words, for large dxi the histogram should also display large 
uncertainties and vice versa. I need this for a curve fitting algorithm.

I have seen many crude ways of working out the error in each bin based on the 
bin count alone, but that's obviously independent of the dxi and thus not 
what I'm after.

So,

1) Is there an R package that can do this (there's nothing in the refence of 
2.1.1)? If so, what algorithm does it use?

2) Could anybody please point me in the right direction (papers, books, 
websites etc.)

Thanks,

-- 
Kilian Hagemann

Climate Systems Analysis Group
University of Cape Town
Republic of South Africa
Tel(w): ++27 21 650 2748

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] quadratic form

2005-11-03 Thread Uwe Ligges
Alvarez Pedro wrote:

 On page 22 of the R-introduction guide it's written:
 
 the quadratic form x^{'} A^{-1} x which is used in
 multivariate computations, should be computed by
 something like x%*%solve(A,x), rather than computing
 the inverse of A.
 
 Why isn't it good to compute t(x) %*% solve(A) %*% x?
 
 Thanks a lot for help!
 
 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

See

Bates, D. (2004): Least Squares Calculations in R. R News 4(1), 17-20, 
http://CRAN.R-project.org/doc/Rnews/

Uwe Ligges

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] merging dataframes

2005-11-03 Thread Roger Bivand
On Thu, 3 Nov 2005, Gavin Simpson wrote:

 Dear List,
 
 I often have to merge two or more data frames containing unique row
 names but with some columns (names) common to the two data frames and
 some columns not common. This toy example will explain the kind of setup
 I am talking about:
 
 mat1 - as.data.frame(matrix(rnorm(20), nrow = 5))
 mat2 - as.data.frame(matrix(rnorm(20), nrow = 4))
 rownames(mat1) - paste(site, 1:5, sep=)
 rownames(mat2) - paste(site, 6:9, sep=)
 names(mat1) - paste(species, c(1,3,5,7), sep=)
 names(mat2) - paste(species, c(2,3,4,7,9), sep=)
 mat1
 mat2
 
 So sites (rows) are unique across both data frames, but there are only 7
 unique species (columns):
 
 unique(c(names(mat1), names(mat2)))
 
 merge(mat1, mat2, all = TRUE)
 
 gives almost what I want, but it drops or looses the rownames()
 information from the two merged data frames, and it re-orders the rows
 so that one simply cannot write back the correct row names.
 
 How might I go about merging two data frames as I have described, but
 preserve the row names and more importantly, keep the order of rows the
 same, so that rows from mat1 come before the rows of mat2?

merge(mat1, mat2, all = TRUE, sort=FALSE)

seems to fix the second question. The first is mentioned tangentially in 
the help page details if you are merging on row names, which you are not - 
maybe prepend to both a column called sites:

mat1a - data.frame(sites=row.names(mat1), mat1)
mat2a - data.frame(sites=row.names(mat2), mat2)
data.frame(merge(mat1a, mat2a, all = TRUE, sort=FALSE), row.names=sites)

is a bit long-winded, but gets you there.

Roger

 
 Many thanks,
 
 Gavin
 

-- 
Roger Bivand
Economic Geography Section, Department of Economics, Norwegian School of
Economics and Business Administration, Helleveien 30, N-5045 Bergen,
Norway. voice: +47 55 95 93 55; fax +47 55 95 95 43
e-mail: [EMAIL PROTECTED]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] merging dataframes

2005-11-03 Thread Dimitris Rizopoulos
you could use something like:

mat1$id1 - 1:nrow(mat1)
mat2$id2 - 1:nrow(mat2)

out - merge(mat1, mat2, all = TRUE)
out[order(out$id1, out$id2), ]

I hope it helps.

Best,
Dimitris


Dimitris Rizopoulos
Ph.D. Student
Biostatistical Centre
School of Public Health
Catholic University of Leuven

Address: Kapucijnenvoer 35, Leuven, Belgium
Tel: +32/(0)16/336899
Fax: +32/(0)16/337015
Web: http://www.med.kuleuven.be/biostat/
 http://www.student.kuleuven.be/~m0390867/dimitris.htm


- Original Message - 
From: Gavin Simpson [EMAIL PROTECTED]
To: R-help R-help@stat.math.ethz.ch
Sent: Thursday, November 03, 2005 2:08 PM
Subject: [R] merging dataframes


 Dear List,

 I often have to merge two or more data frames containing unique row
 names but with some columns (names) common to the two data frames 
 and
 some columns not common. This toy example will explain the kind of 
 setup
 I am talking about:

 mat1 - as.data.frame(matrix(rnorm(20), nrow = 5))
 mat2 - as.data.frame(matrix(rnorm(20), nrow = 4))
 rownames(mat1) - paste(site, 1:5, sep=)
 rownames(mat2) - paste(site, 6:9, sep=)
 names(mat1) - paste(species, c(1,3,5,7), sep=)
 names(mat2) - paste(species, c(2,3,4,7,9), sep=)
 mat1
 mat2

 So sites (rows) are unique across both data frames, but there are 
 only 7
 unique species (columns):

 unique(c(names(mat1), names(mat2)))

 merge(mat1, mat2, all = TRUE)

 gives almost what I want, but it drops or looses the rownames()
 information from the two merged data frames, and it re-orders the 
 rows
 so that one simply cannot write back the correct row names.

 How might I go about merging two data frames as I have described, 
 but
 preserve the row names and more importantly, keep the order of rows 
 the
 same, so that rows from mat1 come before the rows of mat2?

 Many thanks,

 Gavin
 -- 
 %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
 Gavin Simpson [T] +44 (0)20 7679 5522
 ENSIS Research Fellow [F] +44 (0)20 7679 7565
 ENSIS Ltd.  ECRC [E] gavin.simpsonATNOSPAMucl.ac.uk
 UCL Department of Geography   [W] 
 http://www.ucl.ac.uk/~ucfagls/cv/
 26 Bedford Way[W] http://www.ucl.ac.uk/~ucfagls/
 London.  WC1H 0AP.
 %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide! 
 http://www.R-project.org/posting-guide.html
 


Disclaimer: http://www.kuleuven.be/cwis/email_disclaimer.htm

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] merging dataframes

2005-11-03 Thread Liaw, Andy
The `Value' section of ?merge does say that `... in all cases the result has
no special row names', so you're left to handle that on your own.  One
possibility is to use

  result - merge(mat1, mat2, all=TRUE, sort=FALSE)

so that the sorting is not done, then you can just do

  rownames(result) - c(rownames(mat1), rownames(mat2))

Cheers,
Andy


 From: Gavin Simpson
 
 Dear List,
 
 I often have to merge two or more data frames containing unique row
 names but with some columns (names) common to the two data frames and
 some columns not common. This toy example will explain the 
 kind of setup
 I am talking about:
 
 mat1 - as.data.frame(matrix(rnorm(20), nrow = 5))
 mat2 - as.data.frame(matrix(rnorm(20), nrow = 4))
 rownames(mat1) - paste(site, 1:5, sep=)
 rownames(mat2) - paste(site, 6:9, sep=)
 names(mat1) - paste(species, c(1,3,5,7), sep=)
 names(mat2) - paste(species, c(2,3,4,7,9), sep=)
 mat1
 mat2
 
 So sites (rows) are unique across both data frames, but there 
 are only 7
 unique species (columns):
 
 unique(c(names(mat1), names(mat2)))
 
 merge(mat1, mat2, all = TRUE)
 
 gives almost what I want, but it drops or looses the rownames()
 information from the two merged data frames, and it re-orders the rows
 so that one simply cannot write back the correct row names.
 
 How might I go about merging two data frames as I have described, but
 preserve the row names and more importantly, keep the order 
 of rows the
 same, so that rows from mat1 come before the rows of mat2?
 
 Many thanks,
 
 Gavin
 -- 
 %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~
 %~%~%~%~%
 Gavin Simpson [T] +44 (0)20 7679 5522
 ENSIS Research Fellow [F] +44 (0)20 7679 7565
 ENSIS Ltd.  ECRC [E] gavin.simpsonATNOSPAMucl.ac.uk
 UCL Department of Geography   [W] 
 http://www.ucl.ac.uk/~ucfagls/cv/
 26 Bedford Way  
   [W] http://www.ucl.ac.uk/~ucfagls/
 London.  WC1H 0AP.
 %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~
 %~%~%~%~%
 
 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide! 
 http://www.R-project.org/posting-guide.html
 


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] quadratic form

2005-11-03 Thread Prof Brian Ripley
On Thu, 3 Nov 2005, Alvarez Pedro wrote:

 On page 22 of the R-introduction guide it's written:

 the quadratic form x^{'} A^{-1} x which is used in
 multivariate computations, should be computed by
 something like x%*%solve(A,x), rather than computing
 the inverse of A.

 Why isn't it good to compute t(x) %*% solve(A) %*% x?

The answer is only two lines above;

   Numerically, it is both inefficient and potentially unstable to compute
   @code{x - solve(A) %*% b} instead of @code{solve(A,b)}.

See also the footnote.

-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] readline() and Rterm in Windows

2005-11-03 Thread Mikkel Grum
And that was the only combination I didn't try, duhh.
As you say, it works. Excellent! TX! They'll be a
number of pleased usres too.

--- Duncan Murdoch [EMAIL PROTECTED] wrote:

 Mikkel Grum wrote:
  I've tried your proposal in a number of ways, and
  there must be something I'm not understanding. If
 I
  run your script (using source() in RGui, or ctrl-R
  from the R Editor, I get:
 
 It requires a command line console, i.e. it will
 only work in Rterm, not 
 Rgui.  I was assuming you'd run it using the style
 of your batch file 
 down below, but without changing the paths.
 
 Duncan Murdoch
  
  
 conout - file('CONOUT$','w')
  
  Error in file(CONOUT$, w) : unable to open
  connection
  In addition: Warning message:
  cannot open file 'CONOUT$', reason 'Permission
 denied'
  
  
  
  so I added the path as in:
  
  conout - file('C:\\R\\R-2.2.0\\CONOUT$','w')
  conin - file('C:\\R\\R-2.2.0\\CONIN$', 'r')
  cat('Please enter an ID:', file=conout)
  flush(conout)
  id - readLines(conin, 1)
  print(id)
  
  
  Using RGui and ctrl-R from the R Editor, I get
  
  
 conout - file('C:\\R\\R-2.2.0\\CONOUT$','w')
 conin - file('C:\\R\\R-2.2.0\\CONIN$', 'r')
  
  Error in file(C:\\R\\R-2.2.0\\CONIN$, r) : 
  unable to open connection
  In addition: Warning message:
  cannot open file 'C:\R\R-2.2.0\CONIN$', reason 'No
  such file or directory' 
  
 cat('Please enter an ID:', file=conout)
 flush(conout)
 id - readLines(conin, 1)
  
  Error in readLines(conin, 1) : object conin not
  found
  
  and with
  
 source(foo.R)
  
  Error in file(C:\\R\\R-2.2.0\\CONIN$, r) : 
  unable to open connection
  In addition: Warning message:
  cannot open file 'C:\R\R-2.2.0\CONIN$', reason 'No
  such file or directory' 
  
  
  When I create a batch file with the following
 command
  :
  C:\R\R-2.2.0\bin\Rterm.exe --vanilla
  C:\R\R-2.2.0\foo.R C:\R\R-2.2.0\foo.out
  
  and double click on the batch file, the out file
 gives
  me:
  
  R : Copyright 2005, The R Foundation for
 Statistical
  Computing
  Version 2.2.0  (2005-10-06 r35749)
  ISBN 3-900051-07-0
  . . .
  Type 'q()' to quit R.
  
  
 conout - file('C:\\R\\R-2.2.0\\CONOUT$','w')
 conin - file('C:\\R\\R-2.2.0\\CONIN$', 'r')
  
  
  and nothing else. In none of the situations do I
 get
  prompted for input. What am I doing hopelessly
 wrong?
  
  Mikkel
  
  --- Duncan Murdoch [EMAIL PROTECTED] wrote:
  
  
 Mikkel Grum wrote:
 
 Duncan and Prof, thanks for your comments and
 apologies for not being more specific. I'm not
 
 getting
 
 the same results you get from the steps you
 
 propose.
 
 If I write a script foo.R with two lines
 
id - readline(Please enter an ID: )
id
 
 and then use source(foo.R) (either at the Rterm
 prompt, or in RGui) it is true that get prompted,
 
 but
 
 the second line does not visibly run, i.e. I get
 
 
 source(id.r)
 
 Please enter an ID: 5
 
 
 and if I then type id, I get
 
 
 id
 
 [1] id
 
 If I cut and paste the two lines in RGui (in one
 
 go),
 
 I get
 
 
 id - readline(Please enter an ID: )
 
 Please enter an ID: id
 
 
 What I really want is a batch file on the desktop
 
 with
 
 the following commands:
 
c:\r\R-2.2.0\bin\Rterm.exe --no-save
 
 --no-restore
 
 script.R script.out 21
c:\texmf\miktex\bin\latex
 \nonstopmode\input{blue.tex}
 
 
 and script.R reads something like:
 
id - readline(Please enter an ID: )
id
Sweave(blue.Rnw)
 
 I said that script.R didn't run, which was an
 incorrect description. It runs without prompting
 
 for
 
 the ID, and gives error messages all through
 
 because
 
 blue.Rnw needs the id.
 
 This is a very simplified version of what I'm
 
 doing,
 
 but if I use only the first line of the batch
 file
 
 and
 
 the first two lines of the script and could get
 
 that
 
 to work, I could figure out the rest.
 
 It won't work so simply.  You're redirecting
 stdin,
 so user input would 
 be taken from there; you're redirecting stdout and
 stderr, so the prompt 
 won't be visible to the user.
 
 
=== message truncated ===

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] merging dataframes

2005-11-03 Thread ManuelPerera-Chang




Hi,

what about padding both datasets with dummy missing records ... and then
play with cbind and rbind

... like e.g.

 species5-c(NA,NA,NA,NA)

 modmat2-cbind(mat2,species1,species5)

and then similarly with mat1 ...
e.g.
species2-c(NA,NA,NA,NA,NA)

 modmad1-cbind(mat1,species2,species4,species9)

 rbind(modmad1,modmat2)
species1species3   species5   species7   species2   species4
species9
site1 -0.7190044 -0.52482580 -1.1813567 -1.5584831 NA NA
NA
site2 -1.1782180  1.72337964  0.1652343 -0.9026087 NA NA
NA
site3  0.3823015 -0.07226644 -1.2907470 -0.3692091 NA NA
NA
site4 -1.3051131 -0.61107947  0.6264416  1.5259373 NA NA
NA
site5  0.2028565 -1.28374638  1.6284780 -1.2975163 NA NA
NA
site6 NA  1.19088414 NA  0.3159949 -0.1624538  0.5987733
0.2205512
site7 NA  0.75292176 NA  1.7524988  0.8335334 -0.7998774
-0.9788762
site8 NA -0.47803396 NA -1.3041628  1.7925165 -0.4153879
-0.4708165
site9 NA -0.20063523 NA  1.8119115  1.5351801 -1.3334419
0.5812675

it will need modifications of course if you are working with several
datasets

Saludos,

Manuel

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] merging dataframes

2005-11-03 Thread Gavin Simpson
On Thu, 2005-11-03 at 08:33 -0500, Liaw, Andy wrote:
 The `Value' section of ?merge does say that `... in all cases the result has
 no special row names', so you're left to handle that on your own.  One
 possibility is to use
 
   result - merge(mat1, mat2, all=TRUE, sort=FALSE)
 
 so that the sorting is not done, then you can just do
 
   rownames(result) - c(rownames(mat1), rownames(mat2))
 
 Cheers,
 Andy

Thanks Roger, Andy and Dimitris for your solutions.

I'd missed the default for sort being set to TRUE - must pay more
attention in class. Roger's and Andy's approaches seem like the most
fool-proof way of canning this into a function that does all the I asked
and a bit of tidying up of NA's etc.

Cheers,

G

 
  From: Gavin Simpson
  
  Dear List,
  
  I often have to merge two or more data frames containing unique row
  names but with some columns (names) common to the two data frames and
  some columns not common. This toy example will explain the 
  kind of setup
  I am talking about:
  
  mat1 - as.data.frame(matrix(rnorm(20), nrow = 5))
  mat2 - as.data.frame(matrix(rnorm(20), nrow = 4))
  rownames(mat1) - paste(site, 1:5, sep=)
  rownames(mat2) - paste(site, 6:9, sep=)
  names(mat1) - paste(species, c(1,3,5,7), sep=)
  names(mat2) - paste(species, c(2,3,4,7,9), sep=)
  mat1
  mat2
  
  So sites (rows) are unique across both data frames, but there 
  are only 7
  unique species (columns):
  
  unique(c(names(mat1), names(mat2)))
  
  merge(mat1, mat2, all = TRUE)
  
  gives almost what I want, but it drops or looses the rownames()
  information from the two merged data frames, and it re-orders the rows
  so that one simply cannot write back the correct row names.
  
  How might I go about merging two data frames as I have described, but
  preserve the row names and more importantly, keep the order 
  of rows the
  same, so that rows from mat1 come before the rows of mat2?
  
  Many thanks,
  
  Gavin
  -- 
  %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~
  %~%~%~%~%
  Gavin Simpson [T] +44 (0)20 7679 5522
  ENSIS Research Fellow [F] +44 (0)20 7679 7565
  ENSIS Ltd.  ECRC [E] gavin.simpsonATNOSPAMucl.ac.uk
  UCL Department of Geography   [W] 
  http://www.ucl.ac.uk/~ucfagls/cv/
  26 Bedford Way  
[W] http://www.ucl.ac.uk/~ucfagls/
  London.  WC1H 0AP.
  %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~
  %~%~%~%~%
  
  __
  R-help@stat.math.ethz.ch mailing list
  https://stat.ethz.ch/mailman/listinfo/r-help
  PLEASE do read the posting guide! 
  http://www.R-project.org/posting-guide.html
  
  
 
 
 --
 Notice:  This e-mail message, together with any attachment...{{dropped}}

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] Potential for R to conflict with other softwares

2005-11-03 Thread Soukup, Mat
Hi.

After some time, my collegues at the Food and Drug Adminstration have
finally acknowledged R as a powerful statistical computing environment.
However, in order to comply with the Office of Information and Technology
standards there are a couple of questions about whether R could interfere
with other software. As I'm more of a driver of the R software and not a
mechanic, I was hoping for the insight of the many great useRs. Below is a
list of 5 proposed questions to which I value any comment.

Thank you for your time,

-Mat


1. Does R have high resolution graphics?

2. Does R have .dll files, or other executables which are not located in the
R software directory tree?

3. Does R modify the Windows registry in a non-obvious way, i.e. other than
defining itself and what extensions to associate with R, and what are those
extensions?

4. Does R add macros to any part of MS Office?

5. Can you anticipate any other way in which installing and using R could
disrupt the operation of another software?
 

***
Mat Soukup, Ph.D.
Food and Drug Administration
10903 New Hampshire Ave. 
BLDG 22 RM 5329
Silver Spring, MD 20993-0002
Phone: 301.796.1005
***


[[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] problems with pan(): Indizierung ausserhalb der Grenzen = subscript out of bounds

2005-11-03 Thread Leo Gürtler
Dear alltogether,

I tried pan() to impute NAs for longitudinal data.
The terminology in the following output follows the pan manpage. No data 
are attached to this script as this may be too huge.


y = 15 responses
pred = at first just intercept was tried (later on covariates should follow)
subj = 168 different subjects with 4 to 6 observations for each subject 
at time points t1, t2, ..., t6

# extract of y
  y[1:4,]
anpr impr kepr   lernpr  lstofpr  nachwela   nachfaknachw 
sbpr widapr zdompr   zerfpr zgleipr  zstimpr  zugrupr
2   3.50 2.75  3.4 2.22 2.67  3.33 15.00 5.909091 
2.33  21.5 3.67   4 3.00 3.56
202 2.25 2.50  3.6 2.22 3.67 12.00 13.75 7.78 
1.67  22.0 3.33   4 3.33 3.56
402   NA   NA   NA   NA   NANANA   NA   
NA NA NA   NA  NA   NA   NA
602 1.75 2.75  3.4 1.56 3.33  2.67  6.67 5.00 
2.00  21.0 3.33   4 2.67 3.33
  dim(y)
[1] 940  15 # matrix y with 15 responses and 940 obs
# y is ordered according to subj
  length(subj)
[1] 940 #940 obs of 168 different subjects (persons)
# extract of subj
  subj[1:30]
 [1] 2 2 2 2 2 2 3 3 3 3 3 4 4 4 4 4 5 5 5 5 5 5 6 6 6 6 6 6 7 7
# how many observations for each subject
  table(subj)
subj
  2   3   4   5   6   7   8   9  12  13  14  15  16  17  18  19  20  21  
23  24  25  26  27  28  29  30  31  32  33  34  35  36  37  38  39  41  
42  43  44  46  47  48  49
  6   5   5   6   6   6   5   6   6   6   6   6   5   6   6   5   6   
6   6   6   5   6   6   5   6   6   4   6   5   5   6   6   6   5   6   
6   4   6   5   6   5   6   6
 50  52  53  55  57  58  59  60  61  62  64  66  67  68  69  70  72  73  
75  76  77  78  79  81  82  83  84  85  86  87  88  89  90  91  92  93  
94  95  97  98  99 100 101
  6   5   5   5   6   6   5   5   6   6   6   6   4   6   6   5   5   
6   6   5   6   4   6   5   5   4   6   5   6   6   4   5   5   6   6   
6   6   6   6   6   6   6   6
103 104 105 106 107 108 109 110 112 113 114 115 116 117 118 121 122 123 
124 125 126 127 128 129 130 131 132 133 134 135 137 138 139 140 141 142 
144 146 147 148 149 150 151
  5   6   5   6   6   6   5   5   5   6   6   5   5   6   5   6   6   
6   6   6   5   4   6   6   6   6   6   4   4   6   5   4   6   5   4   
6   6   6   6   6   6   5   5
152 153 154 155 156 157 158 159 160 162 164 165 166 167 170 171 173 174 
175 176 178 179 181 182 185 186 187 188 189 190 191 192 193 195 196 197 
198 199 200
  5   6   6   6   5   6   6   6   6   6   6   6   6   4   5   6   6   
6   6   5   6   6   6   6   6   6   6   5   6   6   6   5   6   6   6   
6   6   6   6
  pred - cbind(interc=rep(1,dim(y)[1])) # just intercept (at first)
  dim(pred)
[1] 940  1
xcol - 1:dim(pred)[2]
  xcol
[1] 1
#xcol = 1 , using all number of cols of pred[]
  zcol - c(1)# = 1 , number of cols to use
  y.ncol - dim(y)[2]
  n.zcol - length(zcol)
  prior - list(a=y.ncol,
+  Binv=diag(y.ncol),
+  c=n.zcol,
+  Dinv=diag(n.zcol))
  prior
$a
[1] 15

$Binv
  [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] 
[,13] [,14] [,15]
 [1,]100000000 0 0 0 
0 0 0
 [2,]010000000 0 0 0 
0 0 0
 [3,]001000000 0 0 0 
0 0 0
 [4,]000100000 0 0 0 
0 0 0
 [5,]000010000 0 0 0 
0 0 0
 [6,]000001000 0 0 0 
0 0 0
 [7,]000000100 0 0 0 
0 0 0
 [8,]000000010 0 0 0 
0 0 0
 [9,]000000001 0 0 0 
0 0 0
[10,]000000000 1 0 0 
0 0 0
[11,]000000000 0 1 0 
0 0 0
[12,]000000000 0 0 1 
0 0 0
[13,]000000000 0 0 0 
1 0 0
[14,]000000000 0 0 0 
0 1 0
[15,]000000000 0 0 0 
0 0 1

$c
[1] 1

$Dinv
 [,1]
[1,]1

#prior a = number of cols in y
#  Binv = identity matrix (ncols = nrows = y)
#  c = length of zcol[]
#  Dinv = identity matrix (ncols = length(nzcol))

Now the error message:

  pan(y, subj, pred, xcol, zcol, prior, seed=1234,iter=1000)
Fehler: Indizierung außerhalb der Grenzen
# error massage = subscript out of bounds

I do not understand that. Thank you very 

Re: [R] quadratic form

2005-11-03 Thread Peter Dalgaard
Alvarez Pedro [EMAIL PROTECTED] writes:

 On page 22 of the R-introduction guide it's written:
 
 the quadratic form x^{'} A^{-1} x which is used in
 multivariate computations, should be computed by
 something like x%*%solve(A,x), rather than computing
 the inverse of A.
 
 Why isn't it good to compute t(x) %*% solve(A) %*% x?

It's just a waste of CPU time. For k x k matrices, solution of Ax=b is
of computational complexity O(k^2) whereas inversion of A is O(k^3).
This is obviously more important for k=1000 than for k=5.

-- 
   O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark  Ph:  (+45) 35327918
~~ - ([EMAIL PROTECTED])  FAX: (+45) 35327907

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Potential for R to conflict with other softwares

2005-11-03 Thread Duncan Murdoch
On 11/3/2005 9:11 AM, Soukup, Mat wrote:
 Hi.
 
 After some time, my collegues at the Food and Drug Adminstration have
 finally acknowledged R as a powerful statistical computing environment.
 However, in order to comply with the Office of Information and Technology
 standards there are a couple of questions about whether R could interfere
 with other software. As I'm more of a driver of the R software and not a
 mechanic, I was hoping for the insight of the many great useRs. Below is a
 list of 5 proposed questions to which I value any comment.
 
 Thank you for your time,
 
 -Mat
 
 

These answers are about the Windows version only, but from the 
questions, I think that's what you were looking for.  They apply to all 
versions since 1.6.x at least (though the earlier ones would have put 
fewer entries into the registry, they put them in the same places).

 1. Does R have high resolution graphics?

Yes, but I don't think I get the point of this question.  How would that 
interfere with other software?
 
 2. Does R have .dll files, or other executables which are not located in the
 R software directory tree?

No, it installs everything below R_HOME.
 
 3. Does R modify the Windows registry in a non-obvious way, i.e. other than
 defining itself and what extensions to associate with R, and what are those
 extensions?

I think all of its modifications would count as obvious.  They are 
mainly below HKLM/Software/R-core or HKCU/Software/R-core (where the 
file locations are recorded); additionally file associations are set up 
for .Rdata files (which are called RWorkspace files there), and an 
uninstall entry is made.
 
 4. Does R add macros to any part of MS Office?

No.
 
 5. Can you anticipate any other way in which installing and using R could
 disrupt the operation of another software?

No, not really.  Maybe users will become addicted to it?  ;-)

Duncan Murdoch
  
 
 ***
 Mat Soukup, Ph.D.
 Food and Drug Administration
 10903 New Hampshire Ave. 
 BLDG 22 RM 5329
 Silver Spring, MD 20993-0002
 Phone: 301.796.1005
 ***
 
 
   [[alternative HTML version deleted]]
 
 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] Problems with abline adding regression line to a graph

2005-11-03 Thread CG Pettersson
Hello all,

R2.1.1, W2k

I try to make a plot of a simple regression model in this way:

  with(njfA_bcd, {
+ plot(TC_OS.G31,Prot,cex = 2, col = red, xlab= TC/OS at GS32,
+ ylab=Grain crude protein (CP))
+ })

This part works well and produces the datapoints as red circles.
When I try to add a line, using a fitted linear model in a way
that works perfect with other variables in the same dateset the
following happens:

  with(njfA_bcd, {
+abline(lm(predict(m1tc) ~ TC_OS.G31), lty = 1, col = red)
+ })
Error in model.frame(formula, rownames, variables, varnames, extras, 
extranames,  :
variable lengths differ

And this means?

There exists missing values for TC_OS.G31 in the dataset. From the 
beginning
m1tc was a lm() object, which gave the same Error message. To try to fix the
problem I changed to lme() and used na.action=na.omit explicitely, but this
didn´t help.

Here is the summary of m1tc:

  summary(m1tc)
Linear mixed-effects model fit by REML
 Data: njfA_bcd
   AIC  BIClogLik
  209.4914 219.0692 -100.7457

Random effects:
 Formula: ~1 | Trial
(Intercept)  Residual
StdDev:1.242184 0.6520464

Fixed effects: Prot ~ TC_OS.G31
Value Std.Error DF   t-value p-value
(Intercept)  14.86209  0.957630 68 15.519662   0
TC_OS.G31   -24.22286  4.792801 68 -5.054008   0
 Correlation:
  (Intr)
TC_OS.G31 -0.935

Standardized Within-Group Residuals:
Min  Q1 Med  Q3 Max
-1.68329774 -0.73751040 -0.05600477  0.68301243  2.21693174

Number of Observations: 83
Number of Groups: 14
 

What is happening and what shall I do about it?

Cheers
/CG

-- 
CG Pettersson, MSci, PhD Stud.
Swedish University of Agricultural Sciences (SLU)
Dept. of Crop Production Ecology. Box 7043.
SE-750 07 UPPSALA, Sweden.
+46 18 671428, +46 70 3306685
[EMAIL PROTECTED]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] Rserve/Python

2005-11-03 Thread Barry Rowlingson
Has anyone done anything on a Python client for Rserve?

Simon Urbanek (Rserve dev) tells me he heard of some people working on 
it a couple of years ago but nothing came of it. If anyone has done 
anything, or might find it interesting, please get in touch with me.

I know there's also the RSPython package but I'm tied to a particular 
python version and I can't be sure I'll get RSPython working (in Windows 
as well as Linux). At least if I write a Python-Rserve client its my job 
to make it work!

Thanks,

Barry Rowlingson
Maths and Stats
Lancaster University

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] locfit: simultaneous confidence band

2005-11-03 Thread Liaw, Andy
Apologies for coming to this so late...

Variance is rarely known in real life data.  You should really consult the
book `Local Regression and Likelihood' by Prof. Loader for the details on
simultaneous confidence bands.  `Locfit' is the support software for that
book.

Andy

 From: Michael Gälger
 
 I'm using the package 'locfit' for nonparametric regression. 
 This package
 contains the function 'scb' to compute simultaneous confidence bands. 
 The variance of the data is unknown. Up to now I compute a fit with
 'locfit'. Afterwards an estimate of the residual variance is 
 computed by the
 function 'rv'. The weights in the 'scb'-function are set 1/sigma^2 to
 compute the confidence band.
 Is this procedure correct or is there any other way to 
 compute confidence
 bands with unknown variance?
 
 Thanks very much for any help you can offer. 
 
 Michael Gälger
 
 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide! 
 http://www.R-project.org/posting-guide.html
 


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] Search within a file

2005-11-03 Thread Tuszynski, Jaroslaw W.
Hi,

I am looking for a way to search a file for position of some expression,
from within R. My current code:

 sha1Pos = gregexpr(sha1, readChar(filename,
file.info(filename)$size))[[1]]

Works fine for small files, but text files I will be working with might get
up to Gb range, so I was trying to accomplish the same without loading the
whole file into R.

I realize this is not what R is designed to do, but maybe there is some way
I am missing.

Jarek Tuszynski

[[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Search within a file

2005-11-03 Thread Gabor Grothendieck
Would this be ok (on Windows, use grep on UNIX):

# line numbers of all lines conitaining R in the R README file
setwd(R.home())
as.numeric(sub(:.*, , system(findstr /n R README, intern = TRUE)))


On 11/3/05, Tuszynski, Jaroslaw W. [EMAIL PROTECTED] wrote:
 Hi,

 I am looking for a way to search a file for position of some expression,
 from within R. My current code:

  sha1Pos = gregexpr(sha1, readChar(filename,
 file.info(filename)$size))[[1]]

 Works fine for small files, but text files I will be working with might get
 up to Gb range, so I was trying to accomplish the same without loading the
 whole file into R.

 I realize this is not what R is designed to do, but maybe there is some way
 I am missing.

 Jarek Tuszynski

[[alternative HTML version deleted]]

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] npmc package

2005-11-03 Thread Liaw, Andy
I just downloaded the file from that location, and took a quick look.  At
least the only thing that R CMD check under 2.2.0 only complained about the
file report.Rd.  The problem seems to be 

  \keyword { print }

instead of 

  \keyword{print}

If no one else is aware of any other problems with the package (e.g., ones
that R CMD check can not test for), please let me know.  Otherwise I'll take
over the maintanence and submit it to CRAN.

Andy



 From: Prof Brian Ripley
 
 On Fri, 21 Oct 2005, Kjetil Holuerson wrote:
 
  Martin Maechler wrote:
  Carlos == Carlos Mauricio Cardeal Mendes [EMAIL PROTECTED]
  on Wed, 19 Oct 2005 15:11:32 -0300 writes:
 
  Carlos So, is there another package to substitute those
  Carlos functions described on ORPHANED npmc package ?
 
  May be not.
  But nobody stops you from becoming the new maintainer of the
 
  Just checked. This package is not  now in the ORPHANES
  subdirectory, neither in the main CRAN  listing.
 
 See http://cran.r-project.org/src/contrib/Archive/N/
 
 The Orphaned (sic) subdirectory applies to packages dropped 
 by the former 
 maintainer which no longer pass R CMD check, and also to 
 those where the 
 CRAN maintainers are able to deduce it has been dropped.  
 Others which 
 just fail without a positive indication of no active 
 maintainer may end in 
 the archive.
 
 -- 
 Brian D. Ripley,  [EMAIL PROTECTED]
 Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
 University of Oxford, Tel:  +44 1865 272861 (self)
 1 South Parks Road, +44 1865 272866 (PA)
 Oxford OX1 3TG, UKFax:  +44 1865 272595
 
 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide! 
 http://www.R-project.org/posting-guide.html
 


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] quadratic form

2005-11-03 Thread Robin Hankin
Hi Alvarez


If you define

quad.form.inv -  function (M, x)
{
 drop(crossprod(x, solve(M, x)))
}

then you will avoid an expensive call to %*% as well.


HTH


Robin

On 3 Nov 2005, at 13:01, Alvarez Pedro wrote:

 On page 22 of the R-introduction guide it's written:

 the quadratic form x^{'} A^{-1} x which is used in
 multivariate computations, should be computed by
 something like x%*%solve(A,x), rather than computing
 the inverse of A.

 Why isn't it good to compute t(x) %*% solve(A) %*% x?

 Thanks a lot for help!

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide! http://www.R-project.org/posting- 
 guide.html


--
Robin Hankin
Uncertainty Analyst
National Oceanography Centre, Southampton
European Way, Southampton SO14 3ZH, UK
  tel  023-8059-7743

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] RODBC and Excel: Wrong Data Type Assumed on Import

2005-11-03 Thread Kevin Wright
From my experience (somewhat of a guess):

1.

Excel uses the first 16 rows of data to determine if a column is numeric or
character. The data type which is most common in the first 16 rows will then
be used for the whole column. If you sort the data so that at least the
first 9 rows have character data, you may find this allows the data to be
interpreted as character. There is supposedly a registy setting that can
control how many lines to use (instead of 16), but I have not had success
with the setting. I suspect that ODBC uses JET4, which may be the real
source of the problem. See more here:
http://www.dicks-blog.com/archives/2004/06/03/external-data-mixed-data-types/

2.

The gregmisc bundle has a different read.xls function that uses a Perl
script (xls2csv) and seems to be safer with mixed-type columns.
Requires a working version of Perl.

Best,

Kevin Wright



The first column in my Excel sheet has mostly numbers but I need to treat it
as character data:

 library(RODBC)
http://tolstoy.newcastle.edu.au/R/help/05/09/11324.html#14938qlink1
* channel - odbcConnectExcel(U:/efg/lab/R/Plasmid/construct list.xls) *
* plasmid - sqlFetch(channel,Sheet1, as.is=TRUE) *
* odbcClose(channel) *

 names(plasmid)

[1] Plasmid Number Plasmid Concentration Comments Lost

# How is the type decided? I need a character type.
 class(plasmid$Plasmid Number)

[1] numeric
 typeof(plasmid$Plasmid Number)

[1] double

 plasmid$Plasmid Number[273:276]

[1] 274 NA NA 276

The two NAs are supposed to be 275a and 275b. I tried the as.is=TRUE but
that didn't help.

I consulted Section 4, Relational databases, in the R Data Import/Export
document (for Version 2.2.0).

Section 4.2.2, Data types, was not helpful. In particular, this did not seem
helpful: The more comprehensive of the R interface packages hide the type
conversion issues from the user.

Section 4.3.2, Package RODBC, provided a simple example of using ODBC ..
with a(sic) Excel spreadsheet but is silent on how to control the data type
on import. Could the documentation be expanded to address this issue?

I really need to show Plasmid 275a and Plasmid 275b instead of Plasmid
NA.

Thanks for any help with this.

efg

--
Earl F. Glynn
Scientific Programmer
Bioinformatics Department

[[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] How to calculate errors in histogram values

2005-11-03 Thread Ted Harding
On 03-Nov-05 Kilian Hagemann wrote:
 Hi there,
 
 I'm new to R but I thought this is the most likely place I
 could get advice or hints w.r.t the following problem:
 
 I have a series of measurements xi with associated uncertainties dxi.
 I would like to construct the probability density histogram of this
 data where each density estimate has an associated error that is
 derived from the dxi.
 In other words, for large dxi the histogram should also display
 large uncertainties and vice versa. I need this for a curve fitting
 algorithm.
 
 I have seen many crude ways of working out the error in each bin based
 on the bin count alone, but that's obviously independent of the dxi
 and thus not what I'm after.
 
 So,
 
 1) Is there an R package that can do this (there's nothing in the
 refence of 
 2.1.1)? If so, what algorithm does it use?
 
 2) Could anybody please point me in the right direction (papers, books,
 websites etc.)
 
 Thanks,
 
 -- 
 Kilian Hagemann

I don't know about an R package that would deal with this directly,
but I can think of an aproach, not difficult to implement in R,
which may be helpful.

I'm going to assume (at any rate for the time being), that you
are interested in a per-bin uncertainty, i.e. that you want
to be able to to answer For each bin, what is the uncertainty
in the count for this bin, regardless of any other bins?
I.e. you are ignoring the fact that there is correlation between
bins (what goes into one bin can not go into another).

1. Say you have N observations (i = 1:N). Draw a preliminary
   histogram, and from this decide on a good set of fixed breaks.

2. Extend this list by a few bins on either side (you may have
   to return to this point, depending on the outcome of later
   stages). Say this gives you K bins (j = 1:K).

3. For each data-point xi, with associated dxi, and for each bin j,
   use this to compute the probability pij that a point with
   mean xi and error dxi should fall into bin j. This might be
   based on something as naive as integrating a Normal distribution
   with mean xi and SD dxi over the range of the bin j.

4. You then have an array P = p[i,j], say, where in row i you
   have the computed probabilities for bins 1:K

5. Now: for bin j, you have an N-column of pij values.

   The expected number of the N which might really be in bin j
   is then

 Ej = sum(P[,j])

   and its variance (assuming that the errors in the xi are
   independent of each other) is

 Vj = sum(P[,j]*(1 - P[,j]))

6. So now you have the bin that might have been, with expected
   value Ej and standard deviation Sj = sqrt(Vj). Now draw a
   histogram (you can use 'lines()' for this in R) with bin
   heights Ej, and errors bars +/- Sj.

   It is at this stage that you may have to go back to stage 2.
   In order to be sure that you will not overlook xi values that
   spill outside the range of the bins you chose in stage 2,
   you need to verify that the bins you are using extend beyond
   the range of the original data, and that the two end-bins
   have negligible E1 and EK.

7. NOTE that the Ej will in general be different from the counts
   in bin j from the original data. This is due to overspill:
   if you have an original bin with a small count, which has next
   to it a bin with a large count, the uncertainty about whether
   some of the latter should really be in the former will contribute
   positively to the bin with the small count, by a larger amount
   than the bin with the small count will contribute to the bin
   with the large count.

   As you can see, this will have an effect of somewhat flattening
   the histogram, and of smoothing irregular variation from bin to
   bin.

This is just an outline of a possible approach, which you may be
able to develop to better suit your purposes if they are different
from what I've been assuming.

Best wishes,
Ted.



E-Mail: (Ted Harding) [EMAIL PROTECTED]
Fax-to-email: +44 (0)870 094 0861
Date: 03-Nov-05   Time: 15:45:35
-- XFMail --

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Potential for R to conflict with other softwares

2005-11-03 Thread Peter Dalgaard
Duncan Murdoch [EMAIL PROTECTED] writes:

 On 11/3/2005 9:11 AM, Soukup, Mat wrote:
  Hi.
  
  After some time, my collegues at the Food and Drug Adminstration have
  finally acknowledged R as a powerful statistical computing environment.
  However, in order to comply with the Office of Information and Technology
  standards there are a couple of questions about whether R could interfere
  with other software. As I'm more of a driver of the R software and not a
  mechanic, I was hoping for the insight of the many great useRs. Below is a
  list of 5 proposed questions to which I value any comment.
  
  Thank you for your time,
  
  -Mat
  
  
 
 These answers are about the Windows version only, but from the 
 questions, I think that's what you were looking for.  They apply to all 
 versions since 1.6.x at least (though the earlier ones would have put 
 fewer entries into the registry, they put them in the same places).
 
  1. Does R have high resolution graphics?
 
 Yes, but I don't think I get the point of this question.  How would that 
 interfere with other software?

Device drivers! We have seen cases where the FPU control word had to
be reset (DM will know this more precisely than me, I think). We do
have code in place to catch that particular issue though. 

If you want to be really sure, I believe that batch runs using the
postscript() or pdf() drivers would be immune to such issues (right?).

(BTW, does the FDA trust SAS for Windows? I've seen a weird thing or
two happening to the display there...)

  
  2. Does R have .dll files, or other executables which are not located in the
  R software directory tree?
 
 No, it installs everything below R_HOME.

We do rely on MCVCRT.DLL and some other system DLLs though, and I
think this can get modified by other software.

  
  3. Does R modify the Windows registry in a non-obvious way, i.e. other than
  defining itself and what extensions to associate with R, and what are those
  extensions?
 
 I think all of its modifications would count as obvious.  They are 
 mainly below HKLM/Software/R-core or HKCU/Software/R-core (where the 
 file locations are recorded); additionally file associations are set up 
 for .Rdata files (which are called RWorkspace files there), and an 
 uninstall entry is made.
  
  4. Does R add macros to any part of MS Office?
 
 No.
  
  5. Can you anticipate any other way in which installing and using R could
  disrupt the operation of another software?
 
 No, not really.  Maybe users will become addicted to it?  ;-)
 
 Duncan Murdoch
   
  
  ***
  Mat Soukup, Ph.D.
  Food and Drug Administration
  10903 New Hampshire Ave. 
  BLDG 22 RM 5329
  Silver Spring, MD 20993-0002
  Phone: 301.796.1005
  ***

( I think Mat is owed a big Thank you for his efforts).

-- 
   O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark  Ph:  (+45) 35327918
~~ - ([EMAIL PROTECTED])  FAX: (+45) 35327907

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] MDS: Sample in one group appears twice. Why?

2005-11-03 Thread A Ezhil
Hi All,

I am trying to apply MDS for 4 groups in my data. The
groups are:  

groups = list( Day1C=c(9), Day1T=c(7,8,10),
Day2C=c(1,2,3,6,11,13,14,15), Day2T=c(4,5,12,16,17,18)
)

When I do the MDS plot the group1 member appears twice
instead of one time in the plot. I don't know why this
is happening.

I would greatly appreciate your help in fixing this
problem.

Thanks in Advance.
Ezhil

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] Help on model selection using AICc

2005-11-03 Thread german . lopez
Hi,
   I'm fitting poisson regression models to counts of birds in 
1x1 km squares using several environmental variables as predictors. 
I do this in a stepwise way, using the stepAIC function. However the 
resulting models appear to be overparametrized, since too much 
variables were included. 
  I would like to know if there is the possibility of fitting models 
by steps but using the AICc instead of AIC. Or at least I wonder if it 
would be possible to save the AIC value and number of parameters of 
models fitted in each step and to calculate AICc afterward.
   Help on this will be very much appreciated
   German Lopez

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] ML optimization question--unidimensional unfolding scalin g

2005-11-03 Thread Spencer Graves
Hi, Andy and Peter:

  That's interesting.  I still like the idea of making my own local 
copy, because I can more easily add comments and test ideas while 
working through the code.  I haven't used debug, but I think I should 
try it, because some things occur when running a function that don't 
occur when I walk through it line by line, e.g., parsing the call and 
... arguments.

  Two more comments on the original question:

  1.  What is the structure of your data?  Have you considered 
techniques for Multidimensional Scaling (MDS)?  It seems that your 
problem is just a univariate analogue of the MDS problem.  For metric 
MDS from a complete distance matrix, the solution is relatively 
straightforward computation of eigenvalues and vectors from a matrix 
computed from the distance matrix, and there is software widely 
available for the nonmetric MDS problem.  For a terse introduction to 
that literature, see Venables and Ripley (2002) Modern Applied 
Statistics with S, 4th ed. (Springer, distance methods in sec. 11.1, 
pp. 306-308).

  2.  If you don't have a complete distance matrix, might it be 
feasible to approach the problem starting small and building larger, 
i.e., start with 3 nodes, then add a fourth, etc.?

  spencer graves

Liaw, Andy wrote:

 Alternatively, just type debug(optim) before using it, then step through it
 by hitting enter repeatedly...
 
 When you're done, do undebug(optim).
 
 Andy
 
 
From: Spencer Graves

Have you looked at the code for optim?  If you 
execute optim, it 
will list the code.  You can copy that into a script file and walk 
through it line by line to figure out what it does.  By doing 
this, you 
should be able to find a place in the iteration where you can 
test both 
branches of each bifurcation and pick one -- or keep a list 
of however 
many you want and follow them all more or less 
simultaneously, pruning 
the ones that seem too implausible.  Then you can alternate between a 
piece of the optim code, bifurcating and pruning, adjusting 
each and 
printing intermediate progress reports to help you understand 
what it's 
doing and how you might want to modify it.

With a bit more effort, you can get the official 
source code with 
comments.  To do that, I think you go to www.r-project.org 
- CRAN - 
(select a local mirror) - Software:  R sources.  From there, just 
download The latest release:  R-2.2.0.tar.gz.

For more detailed help, I suggest you try to think of 
the simplest 
possible toy problem that still contains one of the issues 
you find most 
difficult.  Then send that to this list.  If readers can copy a few 
lines of R code from your email into R and try a couple of things in 
less than a minute, I think you might get more useful replies quicker.

Best Wishes,
Spencer Graves

Peter Muhlberger wrote:


Hi Spencer:  Thanks for your interest!  Also, the posting 

guide was helpful.

I think my problem might be solved if I could find a way to 

terminate nlm or

optim runs from within the user-given minimization function 

they call.

Optimization is unconstrained.

I'm essentially using normal like curves that translate 

observed values on a

set of variables (one curve per variable) into latent 

unfolded values.  The

observed values are on the Y-axis  the latent (hence 

parameters to be

estimated) are on the X-axis.  The problem is that there 

are two points into

which an observed value can map on a curve--one on either 

side of the curve

mean.  Only one of these values actually will be optimal 

for all observed

variables, but it's easy to show that most estimation 

methods will get stuck

on the non-optimal value if they find that one first.  

Moving away from that

point, the likelihood gets a whole lot worse before the 

routine will 'see'

the optimal point on the other side of the normal curve.

SANN might work, but I kind of wonder how useful it'd be in 

estimating

hundreds of parameters--thanks to that latent scale.

My (possibly harebrained) thought for how to estimate this 

unfolding using

some gradient-based method would be to run through some 

iterations and then

check to see whether a better solution exists on the 'other 

side' of the

normal curves.  If it does, replace those parameters with 

the better ones.

Because this causes the likelihood to jump, I'd probably 

have to start the

estimation process over again (maybe).  But, I see no way 

from within the

minimization function called by NLM or optim to tell NLM or optim to
terminate its current run.  I could make the algorithm 

recursive, but that

eats up resources  will probably have to be terminated w/ an error.

Peter


On 10/11/05 11:11 PM, Spencer Graves 

[EMAIL PROTECTED] wrote:


There may be a few problems where ML (or more generally 

Bayes) fails

to give sensible answers, but they are relatively rare.

What is your likelihood?  How many parameters are you trying 

[R] fatal error unused tempdir

2005-11-03 Thread SANDRINE COELHO

Hello,

I am running R on XP Windows machine, and frequently I have enable to start R. I
have this message: Fatal Error: cannot find unused tempdir name. I don't know
why.

Thanks, in advance, for your help

Sandrine Coelho

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] PL2006 Program Ballot

2005-11-03 Thread Gabor Grothendieck
Anyone interested in voting for R to be included in the annual
Pricelessware list (Windows freeware list voted on by the readers
of the alt.comp.freeware newsgroup, see www.pricelesswarehome.org)
can reply to this message:

http://groups.google.com/group/alt.comp.freeware/msg/d9e52e406d9e2ceb

simply deleting all programs you don't want to vote for and leaving
R (and any other program) you wish to vote for.  Deadline for voting
is November 7.

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Potential for R to conflict with other softwares

2005-11-03 Thread Duncan Murdoch
On 11/3/2005 10:46 AM, Peter Dalgaard wrote:
 Duncan Murdoch [EMAIL PROTECTED] writes:
 
 On 11/3/2005 9:11 AM, Soukup, Mat wrote:
  Hi.
  
  After some time, my collegues at the Food and Drug Adminstration have
  finally acknowledged R as a powerful statistical computing environment.
  However, in order to comply with the Office of Information and Technology
  standards there are a couple of questions about whether R could interfere
  with other software. As I'm more of a driver of the R software and not a
  mechanic, I was hoping for the insight of the many great useRs. Below is a
  list of 5 proposed questions to which I value any comment.
  
  Thank you for your time,
  
  -Mat
  
  
 
 These answers are about the Windows version only, but from the 
 questions, I think that's what you were looking for.  They apply to all 
 versions since 1.6.x at least (though the earlier ones would have put 
 fewer entries into the registry, they put them in the same places).
 
  1. Does R have high resolution graphics?
 
 Yes, but I don't think I get the point of this question.  How would that 
 interfere with other software?
 
 Device drivers! We have seen cases where the FPU control word had to
 be reset (DM will know this more precisely than me, I think). We do
 have code in place to catch that particular issue though. 
 If you want to be really sure, I believe that batch runs using the
 postscript() or pdf() drivers would be immune to such issues (right?).
 
 (BTW, does the FDA trust SAS for Windows? I've seen a weird thing or
 two happening to the display there...)
 
  
  2. Does R have .dll files, or other executables which are not located in 
  the
  R software directory tree?
 
 No, it installs everything below R_HOME.
 
 We do rely on MCVCRT.DLL and some other system DLLs though, and I
 think this can get modified by other software.

That's MSVCRT.DLL, the run-time library for MS Visual C++.  It has been 
distributed with the OS since Win98 or so, and we don't touch it, so 
this is more of a possibility of other software interfering with us by 
replacing it with a bad version.

 
  
  3. Does R modify the Windows registry in a non-obvious way, i.e. other than
  defining itself and what extensions to associate with R, and what are those
  extensions?
 
 I think all of its modifications would count as obvious.  They are 
 mainly below HKLM/Software/R-core or HKCU/Software/R-core (where the 
 file locations are recorded); additionally file associations are set up 
 for .Rdata files (which are called RWorkspace files there), and an 
 uninstall entry is made.
  
  4. Does R add macros to any part of MS Office?
 
 No.
  
  5. Can you anticipate any other way in which installing and using R could
  disrupt the operation of another software?
 
 No, not really.  Maybe users will become addicted to it?  ;-)
 
 Duncan Murdoch
   
  
  ***
  Mat Soukup, Ph.D.
  Food and Drug Administration
  10903 New Hampshire Ave. 
  BLDG 22 RM 5329
  Silver Spring, MD 20993-0002
  Phone: 301.796.1005
  ***
 
 ( I think Mat is owed a big Thank you for his efforts).

Yes, indeed!

Duncan Murdoch

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] text mining with R

2005-11-03 Thread Ken Termiso
Hi all,

Just wondering if anyone knows of any text mining projects in R...I googled 
a bit but didn't get anything...

TIA,
ken

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] quadratic form

2005-11-03 Thread Jari Oksanen

On 3 Nov 2005, at 17:25, Robin Hankin wrote:

 Hi Alvarez


 If you define

 quad.form.inv -  function (M, x)
 {
  drop(crossprod(x, solve(M, x)))
 }

 then you will avoid an expensive call to %*% as well.

Is %*% really expensive in all platforms? I had a function that used QR 
decomposition instead of quadratic forms, but then I got a message from 
Canada suggesting that %*% would be faster. Indeed, it was in 
not-too-large data sets and in Mac (or powerpc). I run some tests with 
real applications, and found that my 800MHz iBook G4 run like a 2.5GHz 
Intel machine when %*% was used. This really was architecture 
dependent, since the performance boost was similar under OS X and Linux 
in the very same PowerPC. So it seems that %*% is very cheap if you 
have PowerPC, but it may be expensive in Intel. (I also run a test in 
Sun, and it was somewhere between Intel and PowerPC.)

cheers, jari oksanen
--
Jari Oksanen, Oulu, Finland

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] quadratic form

2005-11-03 Thread Peter Dalgaard
Peter Dalgaard [EMAIL PROTECTED] writes:

 Alvarez Pedro [EMAIL PROTECTED] writes:
 
  On page 22 of the R-introduction guide it's written:
  
  the quadratic form x^{'} A^{-1} x which is used in
  multivariate computations, should be computed by
  something like x%*%solve(A,x), rather than computing
  the inverse of A.
  
  Why isn't it good to compute t(x) %*% solve(A) %*% x?
 
 It's just a waste of CPU time. For k x k matrices, solution of Ax=b is
 of computational complexity O(k^2) whereas inversion of A is O(k^3).
 This is obviously more important for k=1000 than for k=5.

Erm, Thomas Lumley points out that solution of Ax=b is also of order
k^3 ingeneral (R uses a QR decomposition). So it's just the multiplier
that differs. The savings should still be on the order of k^3 though.

-- 
   O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark  Ph:  (+45) 35327918
~~ - ([EMAIL PROTECTED])  FAX: (+45) 35327907

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] text mining with R

2005-11-03 Thread Andy Bunn
 Just wondering if anyone knows of any text mining projects in 
 R...I googled 
 a bit but didn't get anything...

RSiteSearch(text mining) turns up 85 hits...

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] fatal error unused tempdir

2005-11-03 Thread Duncan Murdoch
On 11/3/2005 11:12 AM, SANDRINE COELHO wrote:
 Hello,
 
 I am running R on XP Windows machine, and frequently I have enable to start 
 R. I
 have this message: Fatal Error: cannot find unused tempdir name. I don't 
 know
 why.

This has come up before.  When R starts up, it tries to create a folder 
for temporary files, using the TMP, TEMP or R_USER environment variables 
(in that order) to find a base path.  The tempdir() function will tell 
you the folder name.

It chooses the name by appending a random number between 0 and 65535 to 
a base name.  It makes 100 attempts at this, and if it can't find a name 
that works, it bails out.  That's what you're seeing.  Sometimes it gets 
lucky and finds a name in the first 100 tries; then you don't get the 
error.

R tries to delete the directory when it quits, but if there are files 
there, deletion will fail.  So you should probably try to track down 
what is producing files and not getting rid of them.

What you need to do is to find the directory containing all these 
temporary files (print tempdir() on one of your successful attempts), 
and delete all the old ones.

Duncan Murdoch

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] newbie graphics question: Two density plots in same frame ?

2005-11-03 Thread Alpert, William
I swear I've scoured the help files and several texts before posting
what feels like a dumb newbie question.  
 
How can I draw two kernel density plots in the same frame ? I have
similar variables in two separate data frames, and I would like to show
their two histograms/densities in a single picture.  Same units, scale,
range for both, so I'm simply trying to draw one and then add the other
to the picture.  Nothin' fancy.
 

Bill Alpert

Sr. Editor

Barron's

212.416.2742

[EMAIL PROTECTED]

 

 

[[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] newbie graphics question: Two density plots in same frame ?

2005-11-03 Thread bogdan romocea
Here's a function that you can customize to fit your needs. lst is a named list.

multicomp - function(lst)
{
clr - c(darkgreen,red,blue,brown,magenta)
alldens - lapply(lst,function(x) {density(x,from=min(x),to=max(x))})
allx - sapply(alldens,function(d) {d$x})
ally - sapply(alldens,function(d) {d$y})
plot(allx,ally,type=n)
for (i in 1:length(lst)) {
lines(alldens[[i]]$x,alldens[[i]]$y,lty=i,col=clr[i],lwd=3)
}
legend(topright,xjust=1,legend=names(lst),lwd=3,lty=1:length(lst),col=head(clr,length(lst)))
}
#---
toplot - list(var1=dfr1$var,var2=dfr2$var)
multicomp(toplot)


 -Original Message-
 From: Alpert, William [mailto:[EMAIL PROTECTED]
 Sent: Thursday, November 03, 2005 11:45 AM
 To: r-help@stat.math.ethz.ch
 Subject: [R] newbie graphics question: Two density plots in
 same frame ?


 I swear I've scoured the help files and several texts before posting
 what feels like a dumb newbie question.

 How can I draw two kernel density plots in the same frame ? I have
 similar variables in two separate data frames, and I would
 like to show
 their two histograms/densities in a single picture.  Same
 units, scale,
 range for both, so I'm simply trying to draw one and then add
 the other
 to the picture.  Nothin' fancy.


 Bill Alpert

 Sr. Editor

 Barron's

 212.416.2742

 [EMAIL PROTECTED]





   [[alternative HTML version deleted]]

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide!
 http://www.R-project.org/posting-guide.html


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] Add dots at the mean of a bwplot using panel.points

2005-11-03 Thread Andy Bunn
How can I modify the example below to put a dot at the mean of each violin
plot? I assume I use panel.points but that's as far as I can go.

 bwplot(voice.part ~ height, singer,
panel = function(..., box.ratio) {
panel.violin(..., col = transparent,
 varwidth = FALSE, box.ratio = box.ratio)
#panel.points(mean(x.))
} )

TIA, Andy

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] newbie graphics question: Two density plots in same frame ?

2005-11-03 Thread Francisco J. Zagmutt
To plot two Kernel densities you can use matplot:

x1-density(rnorm(100))
x2-density(rnorm(100))
matplot(cbind(x1$y,x2$y), type=l)


Or if both distributions are really very similar and you don't have to 
adjust the axes you can simply use
plot(x1)
lines(x2, col=red)


Finally if you want to have two histograms in the same picture (I would not 
recomend it tough since the distributions are similar so the overlapping 
will make it very messy) you can use the argument add within hist

hist(rnorm(100), col=red)
hist(rnorm(100), col=blue, add=T)

I hope this helps

Francisco

From: Alpert, William [EMAIL PROTECTED]
To: r-help@stat.math.ethz.ch
Subject: [R] newbie graphics question: Two density plots in same frame ?
Date: Thu, 3 Nov 2005 11:45:21 -0500

I swear I've scoured the help files and several texts before posting
what feels like a dumb newbie question.

How can I draw two kernel density plots in the same frame ? I have
similar variables in two separate data frames, and I would like to show
their two histograms/densities in a single picture.  Same units, scale,
range for both, so I'm simply trying to draw one and then add the other
to the picture.  Nothin' fancy.


Bill Alpert

Sr. Editor

Barron's

212.416.2742

[EMAIL PROTECTED]





   [[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! 
http://www.R-project.org/posting-guide.html

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Visualizing a Data Distribution -- Was: breaks in hist()

2005-11-03 Thread Leaf Sun
 Thanks for all the response. I think plotting a cdf or taking transformation 
could make the plot look better.

 But my further question is how to set the breaks to make the histogram 
concentrate in the interval of (0.01,0.2). I can even ignore the other parts of 
the values. 

Thanks!

Leaf



=== At 2005-11-02, 12:07:12 you wrote: ===

  Leaf Sun wrote:
  The histogram is highly screwed to the right, say, the range
  of the vector is [0, 2], but 95% of the value is squeezed in
  the interval (0.01, 0.2).

I guess the histogram is as you wrote. See
http://web.maths.unsw.edu.au/~tduong/seminars/intro2kde/
for a short explanation.


 -Original Message-
 From: Berton Gunter [mailto:[EMAIL PROTECTED]
 Sent: Wednesday, November 02, 2005 1:10 PM
 To: 'Leaf Sun'; r-help@stat.math.ethz.ch
 Subject: [R] Visualizing a Data Distribution -- Was: breaks in hist()


 Leaf:

 An interesting question concerning graphical perception. As
 you have noted,
 choice of bin boundaries in a histogram can have a big effect on how a
 distribution is perceived. My $.02 (U.S.):

 Histograms are a relic of manual data plotting. We have much better
 alternatives these days that should be used instead. e.g.

 1. (my preference, but properly not consumer-friendly). Plot
 the cdf instead
 (?ecdf) .

 2. Plot a density estimator (?density ; ?densityplot)

 3. See David Scott's ash package, perhaps the KernSmooth package also
 (though density() probably already has anything that you'd
 need from it).

 Cheers,

 -- Bert Gunter
 Genentech Non-Clinical Statistics
 South San Francisco, CA

 The business of the statistician is to catalyze the
 scientific learning
 process.  - George E. P. Box



  -Original Message-
  From: [EMAIL PROTECTED]
  [mailto:[EMAIL PROTECTED] On Behalf Of Leaf Sun
  Sent: Wednesday, November 02, 2005 9:49 AM
  To: r-help@stat.math.ethz.ch
  Subject: [R] breaks in hist()
 
  Dear listers,
 
  A quick question about breaks in hist().
 
  The histogram is highly screwed to the right, say, the range
  of the vector is [0, 2], but 95% of the value is squeezed in
  the interval (0.01, 0.2). My question is : how to set the
  breaks then make the histogram look even?
 
  Thanks in advance,
 
  Leaf
 
 

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide!
 http://www.R-project.org/posting-guide.html


= = = = = = = = = = = = = = = = = = = =

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

Re: [R] quadratic form

2005-11-03 Thread Liaw, Andy
If you meant QR vs. inverting X'X for linear regression, the motivation for
using QR is not speed, but numerical stability.  There's no univerally good
least squares algorithm that would be uniformly better than anything else
for any kind of data.

Andy

 From: Jari Oksanen
 
 
 On 3 Nov 2005, at 17:25, Robin Hankin wrote:
 
  Hi Alvarez
 
 
  If you define
 
  quad.form.inv -  function (M, x)
  {
   drop(crossprod(x, solve(M, x)))
  }
 
  then you will avoid an expensive call to %*% as well.
 
 Is %*% really expensive in all platforms? I had a function 
 that used QR 
 decomposition instead of quadratic forms, but then I got a 
 message from 
 Canada suggesting that %*% would be faster. Indeed, it was in 
 not-too-large data sets and in Mac (or powerpc). I run some 
 tests with 
 real applications, and found that my 800MHz iBook G4 run like 
 a 2.5GHz 
 Intel machine when %*% was used. This really was architecture 
 dependent, since the performance boost was similar under OS X 
 and Linux 
 in the very same PowerPC. So it seems that %*% is very cheap if you 
 have PowerPC, but it may be expensive in Intel. (I also run a test in 
 Sun, and it was somewhere between Intel and PowerPC.)
 
 cheers, jari oksanen
 --
 Jari Oksanen, Oulu, Finland
 
 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide! 
 http://www.R-project.org/posting-guide.html
 


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] nlme questions

2005-11-03 Thread Christian Mora




Dear R users;

Ive got two questions concerning nlme library 3.1-65 (running on R 2.2.0 /
Win XP Pro). The first one is related to augPred function. Ive been working
with a nonlinear mixed model with no problems so far. However, when the
parameters of the model are specified in terms of some other covariates,
say treatment (i.e. phi1~trt1+trt2, etc) the augPred function give me the
following error: Error in predict.nlme(object,
value[1:(nrow(value)/nL),,drop=FALSE], : Levels 0,1 not allowed for trt1,
trt2. The same model specification as well as the augPred function under
SPlus 2000 run without problems. The second question has to deal with the
time needed for the model to converge. It really takes a lot of time to fit
the model on R in relation to the time required to fit the same model on
SPlus. I can imagine this is related to the optimization algorithm or
something like that, but I would like to have a different opinion on these
two issues.

Thanks in advance

Christian Mora

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] ML optimization question--unidimensional unfolding scaling

2005-11-03 Thread Peter Muhlberger
Hi Spencer  Andy:  Thanks for your thoughtful input!  I did at one point
look at the optim() function  run debug on it (wasn't aware of
browser--that's helpful!).  My impression is that optim() simply calls a C
function that handles the maximization.  So if I want to break out of my
likelihood function to restart optim() w/ new values, it seems I'd have to
somehow communicate to C that it's time to stop.  May need to rewrite the C,
with which I'm not familiar--Java yes, so maybe when I have some real free
time

Another possibility might be finding some jerry-rigged way to break out of
optim.  Maybe if I tell the likelihood function to freeze its returned value
at some point, optim will conclude it's done and stop.  Probably inefficient
 I will have the problem of telling when the break point ought to occur.
Just wish there were some programmatic way to say 'stop this and return
control to the higher-level calling function 'blah''.

A third possibility is one suggested by Spencer who seems to think it's ok
for the routine to pursue multiple branches w/o restarting, hence no restart
problem.  But w/ Newtonian-style convergence the latent scale values (which
are parameters to be estimated) have current positions  are supposed to
smoothly move toward lower likelihood values.  What will happen in branched
convergence, however, is that some of the latent values will prove to have
better values on the other side of a normal curve from their current
position.  My guess is that this will cause the likelihood function to make
a sudden, non-continuous jump not predictable by derivatives, which may mean
it can't converge properly.

Spencer's MDS alternative is intriguing  I'll need to think more about it.
Maybe I should also consider full-out Bayesian Monte Carlo methods (if I
have time), which would simultaneously explore the whole solution space.

Thanks,
Peter

On 11/2/05 9:01 PM, Spencer Graves [EMAIL PROTECTED] wrote:

  Have you looked at the code for optim?  If you execute optim, it
 will list the code.  You can copy that into a script file and walk
 through it line by line to figure out what it does.  By doing this, you
 should be able to find a place in the iteration where you can test both
 branches of each bifurcation and pick one -- or keep a list of however
 many you want and follow them all more or less simultaneously, pruning
 the ones that seem too implausible.  Then you can alternate between a
 piece of the optim code, bifurcating and pruning, adjusting each and
 printing intermediate progress reports to help you understand what it's
 doing and how you might want to modify it.
 
  With a bit more effort, you can get the official source code with
 comments.  To do that, I think you go to www.r-project.org - CRAN -
 (select a local mirror) - Software:  R sources.  From there, just
 download The latest release:  R-2.2.0.tar.gz.
 
  For more detailed help, I suggest you try to think of the simplest
 possible toy problem that still contains one of the issues you find most
 difficult.  Then send that to this list.  If readers can copy a few
 lines of R code from your email into R and try a couple of things in
 less than a minute, I think you might get more useful replies quicker.

On 11/3/05 8:08 AM, Liaw, Andy [EMAIL PROTECTED] wrote:

 Alternatively, just type debug(optim) before using it, then step through it
 by hitting enter repeatedly...
 
 When you're done, do undebug(optim).

On 11/3/05 11:06 AM, Liaw, Andy [EMAIL PROTECTED] wrote:

 Essentially all that debug() does is like inserting browser() as the first
 line of the function being debug()ed.  You can type just about any command
 at the browser prompt, e.g., for checking data, etc.  ?browser has list of
 special commands for the browser prompt.
 
 Andy

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] Fitting heteroscedastic linear models/ problems with varIdent of nlme

2005-11-03 Thread Andreas Cordes
Hi,
I would like to fit a model for a factorial design that allows for 
unequal variances in all groups. If I am not mistaken, this can be done 
in lm by specifying weights.
A function intended to specify weights for unequal variance structures 
is provided in the nlme library with the varIdent function. Is it 
apropriate to use these weights with lm? If not, is there another 
possibility to do factorial designs with heteroscedasticity?

When trying to use varIdent I get an errormessage that says that 
varIndent is not a defined class. The function calls are written in the 
same way as in Pinheiro  Bates Book. Their example works. With my data 
it doesnt. I somehow fail to figure out the difference between them.
In the remainder is a subset of my Dataset in which the problem also 
occours. A,B,C are metric variables, D is a factor.

I wold be thankful if any Ideas you might have.

thank you for your attention
Andreas

  dat2
A  B  C D
1   990.5 20 46 1
2   990.5 20 44 1
3   704.5 19 35 1
4   990.5 20 39 1
5  2240.5 25 79 2
6  2240.5 25 43 2
7  2240.5 25 44 2
8  2240.5 25 50 2
9  2240.5 25 56 2
10  470.0 17 51 2

  vi-varIdent(form=~1|D)
  vi-initialize(vi,dat2)
Error in getClass(Class) : c(\varIdent\ is not a defined class, 
\varFunc\ is not a defined class)
In addition: Warning message:
the condition has length  1 and only the first element will be used in: 
if (!is.na(match(Class, .BasicClasses))) return(newBasic(Class, 
 

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] ML optimization question--unidimensional unfolding scalin g

2005-11-03 Thread Peter Muhlberger
Hi Spencer:  Just realized I may have misunderstood your comments about
branching--you may have been thinking about a restart.  Sorry if I
misrepresented them.

See below:


On 11/3/05 11:03 AM, Spencer Graves [EMAIL PROTECTED] wrote:

 Hi, Andy and Peter:
 
  That's interesting.  I still like the idea of making my own local
 copy, because I can more easily add comments and test ideas while
 working through the code.  I haven't used debug, but I think I should
 try it, because some things occur when running a function that don't
 occur when I walk through it line by line, e.g., parsing the call and
 ... arguments.
 

Debug's handy tho I think it is line by line.

  Two more comments on the original question:
 
  1.  What is the structure of your data?  Have you considered
 techniques for Multidimensional Scaling (MDS)?  It seems that your
 problem is just a univariate analogue of the MDS problem.  For metric
 MDS from a complete distance matrix, the solution is relatively
 straightforward computation of eigenvalues and vectors from a matrix
 computed from the distance matrix, and there is software widely
 available for the nonmetric MDS problem.  For a terse introduction to
 that literature, see Venables and Ripley (2002) Modern Applied
 Statistics with S, 4th ed. (Springer, distance methods in sec. 11.1,
 pp. 306-308).
 

I was looking for something on MDS in R, that'll be handy!

The data structure is a set of variables (say about 6) that I have reason to
believe measure an underlying dimension.  I suspect that several of the
variables are unfolding--that is, they have their highest value for some
point on the scale and fall off w/ distance from that point in either
direction.  The degree of fall-off may vary depending on the variable.  Some
seem to fall off very rapidly, others not.  A couple variables probably
monotonically increase w/ the underlying scale, so they don't unfold.  I can
construct a distance matrix consisting of distances between these variables.

Do you think MDS might be able to handle an arrangement like this, w/ some
values folded about a scale point and with drop-off varying between
variables?  The distances between the variables do not map in any
straightforward way into distances on the underlying scale because of
folding and non-linearity.

  2.  If you don't have a complete distance matrix, might it be
 feasible to approach the problem starting small and building larger,
 i.e., start with 3 nodes, then add a fourth, etc.?
 

Not sure I follow, but I do have a complete distance matrix of distances
between the variables.

  spencer graves

Thanks,

Peter

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] Specify Z matrix with lmer function

2005-11-03 Thread Mark Lyman
Is there a way to specify a Z matrix using the lmer function, where the 
model is written as y = X*Beta + Z*u + e?

I am trying to reproduce smoothing methods illustrated in the paper 
Smoothing with Mixed Model Software my Long Ngo and M.P. Wand. 
published in the /Journal of Statistical Software/ in 2004 using the 
lme4 and Matrix packages. The code and data sets used can be found at 
http://www.jstatsoft.org/v09/i01/.

There original code did  not work for me without slight modifications 
here is the code that I used with my modifications noted.

x - fossil$age
y - 10*fossil$strontium.ratio
knots - seq(94,121,length=25)
n - length(x)
X - cbind(rep(1,n),x)
Z - outer(x,knots,-)
Z - Z*(Z0)
# I had to create the groupedData object with one group to fit the model 
I wanted
grp - rep(1,n)
grp.dat-groupedData(y~Z|grp)
fit - lme(y~-1+X,random=pdIdent(~-1+Z),data=grp.dat)

I would like to know how I could fit this same model using the lmer 
function. Specifically can I specify a Z matrix in the same way as I do 
above in lme?

Thanks,
Mark

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] multidimensional integration not over a multidimensionalrectangle

2005-11-03 Thread Lynette Sun
Hi,

anyone knows about any functions in R can get multidimensional integration
not over a multidimensional rectangle (not adapt).

For example, I tried the following function f(x,n)=x^n/n!

phi.fun-function(x,n)
{ if (n==1) {
x
}else{
integrate(phi.fun, lower=0, upper=x, n=n-1)$value
}
}

I could get f(4,2)=4^2/2!=8, but failed in f(4,3)=4^3/3! Thanks

Best,
Lynette

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Fitting heteroscedastic linear models/ problems with varIdent of nlme

2005-11-03 Thread Dieter Menne
Andreas Cordes andreas.cordes at stud.uni-goettingen.de writes:

 
 Hi,
 I would like to fit a model for a factorial design that allows for 
 unequal variances in all groups. If I am not mistaken, this can be done 
 in lm by specifying weights.

 A function intended to specify weights for unequal variance structures 
 is provided in the nlme library with the varIdent function. Is it 
 apropriate to use these weights with lm? If not, is there another 
 possibility to do factorial designs with heteroscedasticity?

No, varIdent and friends is made for use with mixed effect models in package 
nlme.

Dieter

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] multivariate nonparametric regression with e = 0

2005-11-03 Thread Maciej Kalisiak
Hello all,

I'm a relatively new user of R, having mostly used it only for plotting so
far.  I'm also not very familiar with regression methods, hence forgive my
greenness on the topic.

What I want to do in R is multivariate nonparametric regression, with a slight
hitch.  From my experimental data I have a multitude of samples whose values
approximate a function `f' that is defined over a 5D space (i.e., f: R^5-R).
The values of the collected samples, call these `y', approximate `f', but due
to the process by which they are collected, they always over-estimate (i.e., y
= f + e, e = 0).  The distribution of the error `e' can likely be modelled
using the positive half of the normal distribution.

Naturally I'm trying to obtain a smooth and relatively faithful approximation
of `f' using the collected samples `y'.  What would be the most fruitful
approach in R to doing this?  Even suggestions on which package/function to
use would be tremendously helpful, as I don't yet know what their
strengths/weaknesses are.

Also, I would consider parametric regression as well, but in the general case
I don't think I can assume/guess for my data at what the appropriate
parametric basis functions should be...

-- 
Maciej Kalisiak [EMAIL PROTECTED]
http://www.dgp.toronto.edu/~mac/

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] newbie graphics question: Two density plots in same frame ?

2005-11-03 Thread Deepayan Sarkar
On 11/3/05, Alpert, William [EMAIL PROTECTED] wrote:
 I swear I've scoured the help files and several texts before posting
 what feels like a dumb newbie question.

 How can I draw two kernel density plots in the same frame ? I have
 similar variables in two separate data frames, and I would like to show
 their two histograms/densities in a single picture.  Same units, scale,
 range for both, so I'm simply trying to draw one and then add the other
 to the picture.  Nothin' fancy.

Using densityplot from lattice:

library(lattice)
d1 = data.frame(x = rnorm(100))
d2 = data.frame(x = rnorm(100, mean = 0.5))
densityplot(~d1$x + d2$x, plot.points = FALSE)

-Deepayan

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Problems with pf() with certain noncentral values/degrees of freedom combinations

2005-11-03 Thread Thomas Lumley

The problem is in src/nmath/pnbeta.c, which has an iteration limit of 
100, not enough for these problems.  Increasing the iteration limit to 
1000 seems to work.

-thomas


On Mon, 24 Oct 2005, Ken Kelley wrote:

 Hello all.

 It seems that the pf() function when used with noncentral parameters can
 behave badly at times. I've included some examples below, but what is
 happening is that with some combinations of df and ncp parameters,
 regardless of how large the quantile gets, the same probability value is
 returned. Upon first glance noncentral values greater than 200 may seem
 large, but they are in some contexts not large at all. The problems with
 pf() can thus have serious implications (for example, in the context of
 sample size planning).

 I noticed that in in 1999 and 2000 issues with large degrees of freedom
 came about (PR#138), but I couldn't find the present issue reported
 anywhere.

 Might there be a way to make the algorithm more stable? I'm not sure how
 difficult this issue might be to fix, but hopefully it won't be too bad
 and can be easily done. Any thoughts on a workaround until then?

 Thanks,
 Ken Kelley

 # Begin example code
 X - seq(10, 600, 10)

 # Gets stuck at .99135
 
 round(pf(X, 10, 1000, 225), 5)
 round(pf(X, 10, 200, 225), 5)

 round(pf(X, 5, 1000, 225), 5)
 round(pf(X, 5, 200, 225), 5)

 round(pf(X, 1, 1000, 225), 5)
 round(pf(X, 1, 200, 225), 5)

 # Gets stuck at .97035
 
 round(pf(X, 10, 1000, 250), 5)
 round(pf(X, 10, 200, 250), 5)

 round(pf(X, 5, 1000, 250), 5)
 round(pf(X, 5, 200, 250), 5)

 round(pf(X, 1, 1000, 250), 5)
 round(pf(X, 1, 200, 250), 5)

 # Gets stuck at .93539
 
 round(pf(X, 10, 1000, 275), 5)
 round(pf(X, 10, 200, 275), 5)

 round(pf(X, 5, 1000, 275), 5)
 round(pf(X, 5, 200, 275), 5)

 round(pf(X, 1, 1000, 275), 5)
 round(pf(X, 1, 200, 275), 5)
 # end example code

  version
  _
 platform i386-pc-mingw32
 arch i386
 os   mingw32
 system   i386, mingw32
 status
 major2
 minor2.0
 year 2005
 month10
 day  06
 svn rev  35749
 language R

 -- 
 Ken Kelley, Ph.D.
 Inquiry Methodology Program
 Indiana University
 201 North Rose Avenue, Room 4004
 Bloomington, Indiana 47405
 http://www.indiana.edu/~kenkel

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Thomas Lumley   Assoc. Professor, Biostatistics
[EMAIL PROTECTED]   University of Washington, Seattle

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] margins too large

2005-11-03 Thread Paul Murrell
Hi


Sara Mouro wrote:
 Dear all,
 
 How can I explian and solve the error message:
   margins too large


Do you mean figure margins too large?  If so, it means that there is 
not enough room for your plot;  try making the graphics window (or page 
size) bigger.  Depending on how standard plot.fasp() is, you could 
also try reducing the plot margins by something like ...

par(mar=rep(1, 4))

Paul


 which appears when I do something like: 
 KK - alltypes(SpatData, K)
 plot.fasp(KK)
 
 Hope someone can please help me on this.
 
 Regards,
 Sara Mouro
 
   [[alternative HTML version deleted]]
 
 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


-- 
Dr Paul Murrell
Department of Statistics
The University of Auckland
Private Bag 92019
Auckland
New Zealand
64 9 3737599 x85392
[EMAIL PROTECTED]
http://www.stat.auckland.ac.nz/~paul/

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] Help in expand.grid() (Restricted combination)

2005-11-03 Thread Prasanna
Dear Rs:

BY having the following code:

candidates-expand.grid(e=c(nearest-neighbor,exaustive),
d=c(70,75,80,85,90,92,94,96,98,99),
n=c(20,25,30,35,40))

results in :

e d n
1 nearest-neighbor 70 20
2 exaustive 70 20
3 nearest-neighbor 75 20
4 exaustive 75 20

90 exaustive 90 40
91 nearest-neighbor 92 40
92 exaustive 92 40
93 nearest-neighbor 94 40
94 exaustive 94 40
95 nearest-neighbor 96 40
96 exaustive 96 40
97 nearest-neighbor 98 40
98 exaustive 98 40
99 nearest-neighbor 99 40
100 exaustive 99 40


I need to associate nearest-neighbor with
d=c(70,75,80,85,90,92,94,96,98,99), n=c(20,25,30,35,40)
but exaustive only with d=c(70,75,80,85,90,92,94,96,98,99). Therefore I
will have only 50+10 combinations not 100.
I need to have combination as shown below

1 nearest-neighbor 70 20
2 exaustive 70
3 nearest-neighbor 75 20
4 exaustive 75

60

Thanks
Prasanna


--
Prasanna BALAPRAKASH
IRIDIA, Université Libre de Bruxelles
50, Av. F. Roosevelt, CP 194/6
1050 Brussels
Belgium.

[[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

Re: [R] RODBC and Excel: Wrong Data Type Assumed on Import

2005-11-03 Thread Earl F. Glynn
Kevin Wright [EMAIL PROTECTED] wrote in message
news:[EMAIL PROTECTED]
 From my experience (somewhat of a guess):

 Excel uses the first 16 rows of data to determine if a column is numeric
or
 character. The data type which is most common in the first 16 rows will
then
 be used for the whole column.

I ran some experiments trying to force RODBC to read column 1 of my
worksheet as character data (the data are mostly numbers with two
exceptions, 275a and 275b, as mentioned earlier).



Here's the base code:



 library(RODBC)

 channel - odbcConnectExcel(U:/efg/lab/R/Krumlauf-Plasmid/construct
list.xls)

 plasmid - sqlFetch(channel,Sheet1, as.is=TRUE)

 odbcClose(channel)

 names(plasmid)

[1] Plasmid Number PlasmidConcentration  Comments
Lost



When Excel Sheet1 has rows 2:13 as an X to attempt to force treatment of
column 1 as character data:



 class(plasmid$Plasmid Number)

[1] numeric

 typeof(plasmid$Plasmid Number)

[1] double

 plasmid$Plasmid Number[1:20]

 [1] NA NA NA NA NA NA NA NA NA NA NA NA  2  3  4  5  6  7  8  9



Why would any software with 12 consecutive X character strings assume
the data are purely numeric?



Add one more X so rows 2:14 have an X to attempt to force treatment of
column 1 as character data:



 class(plasmid$Plasmid Number)

[1] character

 typeof(plasmid$Plasmid Number)

[1] character

 plasmid$Plasmid Number[1:20]

 [1] X X X X X X X X X X X X X NA  NA  NA  NA  NA
NA  NA



So RODBC now recognizes character Xs in column 1 and then declares all
numbers as invalid?  These are incredibly (bad) assumptions.



I say this is a bug, but it may be an ODBC problem and not one with R.
And if this is not an official bug, then it's a serious design problem.
Minimally, this issue should be described in the R Data Import/Export
document, which everyone is told to read before asking a question.



It's frustrating when packages like this work for toy problems, and the
documentation never mentions the pitfalls of real data.


 The gregmisc bundle has a different read.xls function that uses a Perl
 script (xls2csv) and seems to be safer with mixed-type columns.
 Requires a working version of Perl.

Thanks for this suggestion, but I think I'll just convert the Excel
spreadsheet to a .csv and maintain it in that format.

efg

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Fitting heteroscedastic linear models/ problems with varIdent of nlme

2005-11-03 Thread Peter Dalgaard
Dieter Menne [EMAIL PROTECTED] writes:

 Andreas Cordes andreas.cordes at stud.uni-goettingen.de writes:
 
  
  Hi,
  I would like to fit a model for a factorial design that allows for 
  unequal variances in all groups. If I am not mistaken, this can be done 
  in lm by specifying weights.
 
  A function intended to specify weights for unequal variance structures 
  is provided in the nlme library with the varIdent function. Is it 
  apropriate to use these weights with lm? If not, is there another 
  possibility to do factorial designs with heteroscedasticity?
 
 No, varIdent and friends is made for use with mixed effect models in package 
 nlme.

- or with generalized least squares, using the gls() function in the same
package. 

-- 
   O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark  Ph:  (+45) 35327918
~~ - ([EMAIL PROTECTED])  FAX: (+45) 35327907

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Add dots at the mean of a bwplot using panel.points

2005-11-03 Thread Andy Bunn
 How can I modify the example below to put a dot at the mean of each violin
 plot? I assume I use panel.points but that's as far as I can go.

  bwplot(voice.part ~ height, singer,
 panel = function(..., box.ratio) {
 panel.violin(..., col = transparent,
  varwidth = FALSE, box.ratio = box.ratio)
 #panel.points(mean(x.))
 } )

Well, I answered my own question and learned something of the dark art of
lattice plots in the process. If this is an inane way to go about this, then
somebody please say so.

panel.mean - function(x,y,...){
  y - as.numeric(y)
  y.unique - sort(unique(y))
  for(Y in y.unique) {
X - x[y == Y]
if (!length(X)) next
mean.value - list(x = mean(X), y = Y)
do.call(lpoints, c(mean.value, pch = 20))
  }
}

bwplot(voice.part ~ height, singer,
panel = function(...) {
  panel.violin(...,col = transparent)
  panel.mean(...)
  })

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] Using R with SGE

2005-11-03 Thread Jon Savian
Hello,

I was wondering if there is a way to use R with the Sun Grid Engine,
as opposed to using Rmpi.  Our sge has a lammpi parrallel environment
as well.  Does anyone have a script to submit a batch R job using the
SGE scheduler?  Or can point me to a good tutorial?

Thanks

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] Error message: The following object(s) are masked

2005-11-03 Thread Ettinger, Nicholas
Hello!

First time posting here:

Here is my code:

x - c(1:22)
finaloutput=cidrm=NULL
finaldiversityoutput=diversitym=NULL

diversityinfo=read.table(Diversity_info.txt, header=T, sep=\t,
row.names=NULL)
attach(diversityinfo)
diversitynr=nrow(diversityinfo)
diversitytemp - matrix(0,nrow=diversitynr,ncol=1)

for(j in 1:length(x))
{
diversitym=read.table(paste(paste(Div-Chr,x[j],
sep=),_corr.txt,sep=), header=T, sep=\t, row.names=NULL)
attach(diversitym)
diversitytemp - cbind(diversitytemp,diversitym)
print(paste(paste(Diversity-Chrom,x[j], sep=),.txt,sep=))
print(dim(diversitytemp))
}

I am essentially trying to combine several tab-delimited text data files
together into one big file.

I recently upgraded to R 2.2.0.  I now get multiple error messages of
the form:
The following object(s) are masked from diversitym ( position 4 ) :

 X X.1 X.2

I searched on masked and read the manual about conflicts but I don't
really understand what the issue is.  I didn't get this error message,
using the same code with R 2.1.0.  Can somebody help (I'm not a
programmer).

[[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Help in expand.grid() (Restricted combination)

2005-11-03 Thread Berton Gunter
If I understand you correctly, you cannot do this with a data.frame, which
must be rectangular with equal numbers of entries (columns) in each row. See
?.data.frame and read An Introduction to R for these basics.  You could
make the 3rd column for exhaustive = NA (or maybe an empty string, ) I
suppose, and rbind the results you want:

part1-expand.grid(e='nearest-neighbor',d=c(70,75,80,85,90,92,94,96,98,99),
n=c(20,25,30,35,40))

part2-data.frame(e='exhaustive',d=c(70,75,80,85,90,92,94,96,98,99),n=NA)

result-rbind(part1,part2)


-- Bert Gunter
Genentech Non-Clinical Statistics
South San Francisco, CA
 
The business of the statistician is to catalyze the scientific learning
process.  - George E. P. Box
 
 

 -Original Message-
 From: [EMAIL PROTECTED] 
 [mailto:[EMAIL PROTECTED] On Behalf Of Prasanna
 Sent: Thursday, November 03, 2005 12:07 PM
 To: r-help@stat.math.ethz.ch
 Subject: [R] Help in expand.grid() (Restricted combination)
 
 Dear Rs:
 
 BY having the following code:
 
 candidates-expand.grid(e=c(nearest-neighbor,exaustive),
 d=c(70,75,80,85,90,92,94,96,98,99),
 n=c(20,25,30,35,40))
 
 results in :
 
 e d n
 1 nearest-neighbor 70 20
 2 exaustive 70 20
 3 nearest-neighbor 75 20
 4 exaustive 75 20
 
 90 exaustive 90 40
 91 nearest-neighbor 92 40
 92 exaustive 92 40
 93 nearest-neighbor 94 40
 94 exaustive 94 40
 95 nearest-neighbor 96 40
 96 exaustive 96 40
 97 nearest-neighbor 98 40
 98 exaustive 98 40
 99 nearest-neighbor 99 40
 100 exaustive 99 40
 
 
 I need to associate nearest-neighbor with
 d=c(70,75,80,85,90,92,94,96,98,99), n=c(20,25,30,35,40)
 but exaustive only with d=c(70,75,80,85,90,92,94,96,98,99). 
 Therefore I
 will have only 50+10 combinations not 100.
 I need to have combination as shown below
 
 1 nearest-neighbor 70 20
 2 exaustive 70
 3 nearest-neighbor 75 20
 4 exaustive 75
 
 60
 
 Thanks
 Prasanna
 
 
 --
 Prasanna BALAPRAKASH
 IRIDIA, Universiti Libre de Bruxelles
 50, Av. F. Roosevelt, CP 194/6
 1050 Brussels
 Belgium.
 
   [[alternative HTML version deleted]]
 


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Potential for R to conflict with other softwares

2005-11-03 Thread Soukup, Mat
I just wanted to make one clarification about my statement: After some
time, my collegues at the Food and Drug Adminstration have finally
acknowledged R as a powerful statistical computing environment. I did not
intend for this to read that R has been acknowledged as being  21 CFR Part
11 compliant. This is a whole other ball game. What I meant to say is that,
within the Center of Drug Evaluation and Research (CDER) the Office of
Biostatistics is willing to look into whether or not reviewers can download
R onto their government issued PC's. And this must all be approved by the
Office of Information and Technology. So admittedly, this is only a small
step, but nonetheless it is a step in the right direction.

I apologize for any confusion. Cheers,

Mat

Disclaimer which I also forgot in the original post: The following views are
those of the writer and do not necessarily reflect those of the FDA.



 -Original Message-
 From: [EMAIL PROTECTED] 
 [mailto:[EMAIL PROTECTED] On Behalf Of Duncan Murdoch
 Sent: Thursday, November 03, 2005 6:30 AM
 To: Soukup, Mat
 Cc: 'r-help@stat.math.ethz.ch'
 Subject: Re: [R] Potential for R to conflict with other softwares
 
 On 11/3/2005 9:11 AM, Soukup, Mat wrote:
  Hi.
  
  After some time, my collegues at the Food and Drug 
 Adminstration have
  finally acknowledged R as a powerful statistical computing 
 environment.
  However, in order to comply with the Office of Information 
 and Technology
  standards there are a couple of questions about whether R 
 could interfere
  with other software. As I'm more of a driver of the R 
 software and not a
  mechanic, I was hoping for the insight of the many great 
 useRs. Below is a
  list of 5 proposed questions to which I value any comment.
  
  Thank you for your time,
  
  -Mat
  
  
 
 These answers are about the Windows version only, but from the 
 questions, I think that's what you were looking for.  They 
 apply to all 
 versions since 1.6.x at least (though the earlier ones would have put 
 fewer entries into the registry, they put them in the same places).
 
  1. Does R have high resolution graphics?
 
 Yes, but I don't think I get the point of this question.  How 
 would that 
 interfere with other software?
  
  2. Does R have .dll files, or other executables which are 
 not located in the
  R software directory tree?
 
 No, it installs everything below R_HOME.
  
  3. Does R modify the Windows registry in a non-obvious way, 
 i.e. other than
  defining itself and what extensions to associate with R, 
 and what are those
  extensions?
 
 I think all of its modifications would count as obvious.  They are 
 mainly below HKLM/Software/R-core or HKCU/Software/R-core (where the 
 file locations are recorded); additionally file associations 
 are set up 
 for .Rdata files (which are called RWorkspace files there), and an 
 uninstall entry is made.
  
  4. Does R add macros to any part of MS Office?
 
 No.
  
  5. Can you anticipate any other way in which installing and 
 using R could
  disrupt the operation of another software?
 
 No, not really.  Maybe users will become addicted to it?  ;-)
 
 Duncan Murdoch
   
  
  
 **
 *
  Mat Soukup, Ph.D.
  Food and Drug Administration
  10903 New Hampshire Ave. 
  BLDG 22 RM 5329
  Silver Spring, MD 20993-0002
  Phone: 301.796.1005
  
 **
 *
  
  
  [[alternative HTML version deleted]]
  
  __
  R-help@stat.math.ethz.ch mailing list
  https://stat.ethz.ch/mailman/listinfo/r-help
  PLEASE do read the posting guide! 
 http://www.R-project.org/posting-guide.html
 
 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide! 
 http://www.R-project.org/posting-guide.html


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Add dots at the mean of a bwplot using panel.points

2005-11-03 Thread Deepayan Sarkar
On 11/3/05, Andy Bunn [EMAIL PROTECTED] wrote:
  How can I modify the example below to put a dot at the mean of each
 violin
  plot? I assume I use panel.points but that's as far as I can go.
 
   bwplot(voice.part ~ height, singer,
  panel = function(..., box.ratio) {
  panel.violin(..., col = transparent,
   varwidth = FALSE, box.ratio = box.ratio)
  #panel.points(mean(x.))
  } )

 Well, I answered my own question and learned something of the dark art of
 lattice plots in the process. If this is an inane way to go about this,
 then
 somebody please say so.

panel.mean - function(x, y, ...) {
tmp - tapply(x, y, FUN = mean)
panel.points(tmp, seq(tmp), pch = 20, ...)
}

is more direct, but otherwise your solution is fine.

-Deepayan

 panel.mean - function(x,y,...){
   y - as.numeric(y)
   y.unique - sort(unique(y))
   for(Y in y.unique) {
 X - x[y == Y]
 if (!length(X)) next
 mean.value - list(x = mean(X), y = Y)
 do.call(lpoints, c(mean.value, pch = 20))
   }
 }

 bwplot(voice.part ~ height, singer,
 panel = function(...) {
   panel.violin(...,col = transparent)
   panel.mean(...)
   })

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] RODBC and Excel: Wrong Data Type Assumed on Import

2005-11-03 Thread Gabor Grothendieck
You could try using the COM interface rather than the ODBC
interface.  Try code such as this:

library(RDCOMClient)
xls - COMCreate(Excel.Application)
xls[[Workbooks]]$Open(MySpreadsheet.xls)
sheet - xls[[ActiveSheet]]
mydata - sheet[[UsedRange]][[value]]
xls$Quit()

# convert mydata to a character matrix
mydata.char - matrix(unlist(mydata), nc = length(xx))



On 11/3/05, Kevin Wright [EMAIL PROTECTED] wrote:
 From my experience (somewhat of a guess):

 1.

 Excel uses the first 16 rows of data to determine if a column is numeric or
 character. The data type which is most common in the first 16 rows will then
 be used for the whole column. If you sort the data so that at least the
 first 9 rows have character data, you may find this allows the data to be
 interpreted as character. There is supposedly a registy setting that can
 control how many lines to use (instead of 16), but I have not had success
 with the setting. I suspect that ODBC uses JET4, which may be the real
 source of the problem. See more here:
 http://www.dicks-blog.com/archives/2004/06/03/external-data-mixed-data-types/

 2.

 The gregmisc bundle has a different read.xls function that uses a Perl
 script (xls2csv) and seems to be safer with mixed-type columns.
 Requires a working version of Perl.

 Best,

 Kevin Wright



 The first column in my Excel sheet has mostly numbers but I need to treat it
 as character data:

  library(RODBC)
 http://tolstoy.newcastle.edu.au/R/help/05/09/11324.html#14938qlink1
 * channel - odbcConnectExcel(U:/efg/lab/R/Plasmid/construct list.xls) *
 * plasmid - sqlFetch(channel,Sheet1, as.is=TRUE) *
 * odbcClose(channel) *

  names(plasmid)

 [1] Plasmid Number Plasmid Concentration Comments Lost

 # How is the type decided? I need a character type.
  class(plasmid$Plasmid Number)

 [1] numeric
  typeof(plasmid$Plasmid Number)

 [1] double

  plasmid$Plasmid Number[273:276]

 [1] 274 NA NA 276

 The two NAs are supposed to be 275a and 275b. I tried the as.is=TRUE but
 that didn't help.

 I consulted Section 4, Relational databases, in the R Data Import/Export
 document (for Version 2.2.0).

 Section 4.2.2, Data types, was not helpful. In particular, this did not seem
 helpful: The more comprehensive of the R interface packages hide the type
 conversion issues from the user.

 Section 4.3.2, Package RODBC, provided a simple example of using ODBC ..
 with a(sic) Excel spreadsheet but is silent on how to control the data type
 on import. Could the documentation be expanded to address this issue?

 I really need to show Plasmid 275a and Plasmid 275b instead of Plasmid
 NA.

 Thanks for any help with this.

 efg

 --
 Earl F. Glynn
 Scientific Programmer
 Bioinformatics Department

[[alternative HTML version deleted]]

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] FIGARCH

2005-11-03 Thread Spencer Graves
  RSiteSearch(FIGARCH) revealed that Diethelm Wuertz prepared an R 
interface to Ox Garch.  However, Ox and all its components are 
copyright of Jurgen A. Doornik. The Console (command line) versions may 
be used freely for academic research and teaching purposes only. 
Commercial users and others who do not qualify for the free version must 
purchase the Windows version of Ox ... .  See 
http://finzi.psych.upenn.edu/R/library/fSeries/html/A3-GarchOxModelling.html;.

  I don't know if this will help you.
  Best Wishes,
  Spencer Graves

Sumanta Basak wrote:
 Hi All,
 
  
 
 Currently I'm working in FIGARCH process [Fractionally Integrated
 Generalized Autoregressive Conditional Heteroscedasticity]. I've already
 got the codes to do the process in S-Plus. Can anyone help me to do it
 in R?
 
  
 
  
 
 Thanks,
 
 SUMANTA BASAK.
 
 
 ---
 This e-mail may contain confidential and/or privileged infor...{{dropped}}
 
 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

-- 
Spencer Graves, PhD
Senior Development Engineer
PDF Solutions, Inc.
333 West San Carlos Street Suite 700
San Jose, CA 95110, USA

[EMAIL PROTECTED]
www.pdf.com http://www.pdf.com
Tel:  408-938-4420
Fax: 408-280-7915

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] Plotting Factorial GLMs

2005-11-03 Thread Jarrett Byrnes
Hello all,
I'm attempting to plot the functions from a generalized linear model 
while iterating over multiple levels of a factor in the model.  In 
other words, I have a data set

Block, Treatment.Level, Response.Level

So, the glm and code to plot should be

logit.reg-glm(formula = Response.Level ~ Treatment.Level + Block,
family=quasibinomial(link=logit)))

plot( Response.Level ~ Treatment.Level)

logit.reg.function - function (trt, blk) predict(logit.reg, 
data.frame(Treatment.Level=trt, Block=blk)

curve(logit.reg.function(x, A), add=TRUE)


But I get the error:
Error in xy.coords(x, y) : 'x' and 'y' lengths differ

Now, if I set Block=A in the function, and take blk out, as well as 
taking the A out of the curve statement, it plots just fine.  What am 
I doing wrong, as this would be a nice, quick, and easy way to whip up 
multiple curves from a factorial dataset!

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Search within a file

2005-11-03 Thread Seth Falcon
On  3 Nov 2005, [EMAIL PROTECTED] wrote:
 I am looking for a way to search a file for position of some
 expression, from within R. My current code:

 sha1Pos = gregexpr(sha1, readChar(filename,
 file.info(filename)$size))[[1]]

 Works fine for small files, but text files I will be working with
 might get up to Gb range, so I was trying to accomplish the same
 without loading the whole file into R.

I would think you could use readLines to read in a batch of lines, run
(g)regexpr, and keep track of matches and position.

Create a connection to the file using file() first, and then
subsequent calls to readLines will start where you left off.

But you will need to adjust the position indices returned by gregexpr
by how far into the file you are.  Seems very doable.

+ seth

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Problems with abline adding regression line to a graph

2005-11-03 Thread Petr Pikal
Hi

On 3 Nov 2005 at 16:03, CG Pettersson wrote:

Date sent:  Thu, 03 Nov 2005 16:03:05 +0100
From:   CG Pettersson [EMAIL PROTECTED]
To: r-help@stat.math.ethz.ch
Subject:[R] Problems with abline adding regression line to a 
graph

 Hello all,
 
 R2.1.1, W2k
 
 I try to make a plot of a simple regression model in this way:
 
   with(njfA_bcd, {
 + plot(TC_OS.G31,Prot,cex = 2, col = red, xlab= TC/OS at GS32,
 + ylab=Grain crude protein (CP))
 + })
 
 This part works well and produces the datapoints as red circles.
 When I try to add a line, using a fitted linear model in a way
 that works perfect with other variables in the same dateset the
 following happens:
 
   with(njfA_bcd, {
 +abline(lm(predict(m1tc) ~ TC_OS.G31), lty = 1, col = red)
 + })
 Error in model.frame(formula, rownames, variables, varnames, extras,
 extranames,  :
 variable lengths differ
 
 And this means?
 
 There exists missing values for TC_OS.G31 in the dataset. From the
 beginning m1tc was a lm() object, which gave the same Error message.
 To try to fix the problem I changed to lme() and used
 na.action=na.omit explicitely, but this didn´t help.

na.action=na.exclude

should help.

It should be probably mentioned in lm help page, at least as a 
hyperlink to na.* parametrs.

HTH
Petr



 
 Here is the summary of m1tc:
 
   summary(m1tc)
 Linear mixed-effects model fit by REML
  Data: njfA_bcd
AIC  BIClogLik
   209.4914 219.0692 -100.7457
 
 Random effects:
  Formula: ~1 | Trial
 (Intercept)  Residual
 StdDev:1.242184 0.6520464
 
 Fixed effects: Prot ~ TC_OS.G31
 Value Std.Error DF   t-value p-value
 (Intercept)  14.86209  0.957630 68 15.519662   0
 TC_OS.G31   -24.22286  4.792801 68 -5.054008   0
  Correlation:
   (Intr)
 TC_OS.G31 -0.935
 
 Standardized Within-Group Residuals:
 Min  Q1 Med  Q3 Max
 -1.68329774 -0.73751040 -0.05600477  0.68301243  2.21693174
 
 Number of Observations: 83
 Number of Groups: 14
  
 
 What is happening and what shall I do about it?
 
 Cheers
 /CG
 
 -- 
 CG Pettersson, MSci, PhD Stud.
 Swedish University of Agricultural Sciences (SLU)
 Dept. of Crop Production Ecology. Box 7043.
 SE-750 07 UPPSALA, Sweden.
 +46 18 671428, +46 70 3306685
 [EMAIL PROTECTED]
 
 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide!
 http://www.R-project.org/posting-guide.html

Petr Pikal
[EMAIL PROTECTED]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html