Re: [R] compile error with C code and standalone R math C library

2003-12-14 Thread Prof Brian Ripley
This is a known bug already fixed (we believe) in R-patched and R-devel.

On Sun, 14 Dec 2003, Dirk Eddelbuettel wrote:

 On Sun, Dec 14, 2003 at 12:51:09AM -0500, Faheem Mitha wrote:
  On Sat, 13 Dec 2003, Dirk Eddelbuettel wrote:
  
   On Sat, Dec 13, 2003 at 07:44:46PM -0500, Faheem Mitha wrote:
I just went back to an old piece of C code. On trying to compile it with
the R math standalone C library I got the following error. Can anyone
enlighten me what I am doing wrong, if anything? C file (rr-sa.c) follows.
   
I'm on Debian sarge. I'm running R version 1.8.1. Gcc is version
3.3.1.
   [...]
faheem ~/co/rr/trunk$ gcc -o rr rr-sa.c -lRmath -lm
/usr/lib/gcc-lib/i486-linux/3.3.2/../../../libRmath.so: undefined
reference to `Rlog1p'
collect2: ld returned 1 exit status
  
   The linker tells you that it cannot find object code for a function Rlog1p.
   So let's check:
  
   [EMAIL PROTECTED]:~$ grep Rlog1p /usr/include/Rmath.h
   [EMAIL PROTECTED]:~$ grep log1p /usr/include/Rmath.h
   double  log1p(double); /* = log(1+x) {care for small x} */
   [EMAIL PROTECTED]:~$
  
   Turns out that there is none defined in Rmath.h, but log1p exists.  This may
   have gotten renamed since you first wrote your code.
  
  Maybe I am being dense, but how is this my fault? I am not using either
  Rlog1p or log1p in my code (as far as I can see).
 
 Indeed -- it looks like we have a problem. Looking at the NEWS file, some
 logic regarding (R)log1p was changed in the 1.8.* series, and it seems to be
 going wrong here.
 
 As a stop-gap measure, just define an empty Rlog1p() to complete linking:
 
 [EMAIL PROTECTED]:/tmp$ grep Rlog1p rr-sa.c
 void Rlog1p(void) {;}
 [EMAIL PROTECTED]:/tmp$ gcc -Wall -o rr rr-sa.c -lRmath -lm
 [EMAIL PROTECTED]:/tmp$ ls -l rr
 -rwxr-xr-x    1 edd  edd  13889 Dec 14 00:12 rr
 [EMAIL PROTECTED]:/tmp$
 
 For the record, on my build system, a log1p function is found but deemed not
 good enough:
 
 [EMAIL PROTECTED]:/tmp$ grep log1p ~/src/debian/build-logs/r-base_1.8.1-1.log | head -2
 checking for log1p... yes
 checking for working log1p... no
 
 
 Dirk
 
 

-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help


RE: [R] density plot for very large dataset

2003-12-14 Thread Liaw, Andy
You might want to try hexbin (hexagonal binning) in the BioConductor suite
(see www.bioconductor.org).

HTH,
Andy

 From: Obi Griffith
 
 I'm new to R and am trying to perform a simple, yet problematic task.  I
 have two variables for which I would like to measure the correlation and
 plot versus each other.  However, I have ~30 million measurements of each
 variable.  I can read this into R from file and produce a plot with
 plot(x0, x1) but, as you would expect, it's not pretty to look at and
 produces a postscript file of about 700MB.  A google search found a few
 mentions of doing density plots, but they seemed to assume you already
 have the density matrix.  Can anyone point me in the right direction,
 keeping in mind that I am a complete R newbie.
 
 Obi



__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help


[R] Problem with data conversion

2003-12-14 Thread arinbasu
Hi All: 

I came across the following problem while working with a dataset, and 
wondered if there could be a solution I sought here. 

My dataset consists of information on 402 individuals with the following five 
variables (age, sex, status = a binary variable with levels "case" or 
"control", mma, dma). 

During data check, I found that in the raw data, the data entry operator had 
mistakenly put a 0 for one participant, so now, the levels show 

> levels(status) 
[1] "0"       "control" "case" 

The variables mma, and dma are actually numerical variables but in the 
dataframe, they are represented as characters. I tried to change the type 
of the variables (from character to numeric) using the edit function (and 
bringing up the data grid where then I made changes), but the changes were 
not saved. I tried 

mma1 <- as.numeric(mma) 

but I was not successful in converting mma from a character variable to a 
numeric variable. 

So, to edit and clean the data, I exported the dataset as a text file to 
Epi Info 2002 (version 2, Windows). I used the following code: 

mysubset <- subset(workingdat, select = c(age, sex, status, mma, dma))
write.table(mysubset, file = "mysubset.txt", sep = "\t", col.names = NA) 

After I made changes in the variables using Epi Info (I created a new 
variable called statusrec containing values "case" and "control"), I 
exported the file as a .rec file (filename mydata.rec). I used the 
following code to read the file in R: 

require(foreign)
myData <- read.epiinfo("mydata.rec", read.deleted = NA) 

Now, the problem is this, when I want to run a logistic regression, R 
returns the following error message: 

> glm(statusrec ~ mma, family = binomial(link = "logit"))
Error in model.frame(formula, rownames, variables, varnames, extras, 
extranames,  :
  invalid variable type 

I cannot figure out the solution. I want to run a logistic regression now 
with the variable statusrec (which is a binary variable containing values 
"case" and "control") and another variable (say mma, which is now a 
numeric variable). What does the above error message mean and what could 
be a possible solution? 

Would greatly appreciate your insights and wisdom. 

-Arin Basu

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help


Re: [R] Problem with data conversion

2003-12-14 Thread Christian Schulz
 The variables mma, and dma are actually numerical variables but in the
 dataframe, they are represented as characters. I tried to change the
type
 of the variables (from character to numeric) using the edit function (and
 bringing up the data grid where then I made changes), but the changes were
 not saved. I tried

 mma1 <- as.numeric(mma)

I'm not sure I understand your problem correctly, but is it possible that
you forgot the data frame?  Suppose your data.frame is df:

df$mma <- as.numeric(df$mma)    # should work
df$mma[df$mma == 0] <- 1        # or any other value

Regards, Christian
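One classic pitfall worth adding here (my own example, not from the thread): if a column was read in as a factor rather than a character vector, as.numeric() on it returns the internal level codes, not the printed values.

```r
## as.numeric() on a factor gives level codes, not the data you see printed
mma <- factor(c("1.2", "3.4", "0"))
as.numeric(mma)                 # 2 3 1 -- the level codes (levels sort as "0","1.2","3.4")
as.numeric(as.character(mma))   # 1.2 3.4 0.0 -- go via character first
```

The second form is the standard safe conversion in this situation.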

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help


Re: [R] Problem with data conversion

2003-12-14 Thread Prof Brian Ripley
The message probably means that the variable is a character variable and 
not numerical (as you intended) nor factor.

Although you said there was a trip to epiinfo, you never said where the 
data came from.  Try dumping out the data, editing the file, and reading 
it with read.table.  There are other ways, but one of your steps has a bug 
and we have no idea what the steps actually were.

When you are finished, try

sapply(mydf, class)

on your dataframe `mydf'.  You should see only numeric or factor 
variables.
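For instance, on a toy data frame of my own making, a column that failed to convert shows up immediately:

```r
## build a small data frame with one numeric, one factor and one character column
mydf <- data.frame(age = c(30, 41), status = factor(c("case", "control")))
mydf$mma <- c("1.2", "3.4")   # assigned after creation, so it stays character
sapply(mydf, class)           # age: "numeric", status: "factor", mma: "character"
```

The "character" entry flags the offending column.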

On Sun, 14 Dec 2003 [EMAIL PROTECTED] wrote:

 Hi All: 
 
 I came across the following problem while working with a dataset, and 
 wondered if there could be a solution I sought here. 
 
 
 My dataset consists of information on 402 individuals with the followng five 
 variables (age,sex, status = a binary variable with levels case or 
 control, mma, dma). 
 
 During data check, I found that in the raw data, the data entry operator had 
 mistakenly put a 0 for one participant, so now, the levels show 
 
  > levels(status) 
 [1] "0"       "control" "case" 
 
 The variables mma, and dma are actually numerical variables but in the 
 dataframe, they are represented as characters. I tried to change the type 
 of the variables (from character to numeric) using the edit function (and 
 bringing up the data grid where then I made changes), but the changes were 
 not saved. I tried 
 
 mma1 <- as.numeric(mma) 
 
 but I was not successful in converting mma from a character variable to a 
 numeric variable. 
 
 So, to edit and clean the data, I exported the dataset as a text file to 
 Epi Info 2002 (version 2, Windows). I used the following code: 
 
 mysubset <- subset(workingdat, select = c(age, sex, status, mma, dma))
 write.table(mysubset, file = "mysubset.txt", sep = "\t", col.names = NA) 
 
 After I made changes in the variables using Epi Info (I created a new 
 variable called statusrec containing values case and control), I 
 exported the file as a .rec file (filename mydata.rec). I used the 
 following code to read the file in R: 
 
 require(foreign)
 myData <- read.epiinfo("mydata.rec", read.deleted = NA) 
 
 Now, the problem is this, when I want to run a logistic regression, R 
 returns the following error message: 
 
  > glm(statusrec ~ mma, family = binomial(link = "logit"))
 Error in model.frame(formula, rownames, variables, varnames, extras, 
 extranames,  :
invalid variable type 
 
 
 I cannot figure out the solution. I want to run a logistic regression now 
 with the variable statusrec (which is a binary variable containing values 
 case and control), and another
 variable (say mma, which is now a numeric variable). What does the above 
 error message mean and what could be a possible solution? 
 
 Would greatly appreciate your insights and wisdom. 
 
  -Arin Basu
 
 __
 [EMAIL PROTECTED] mailing list
 https://www.stat.math.ethz.ch/mailman/listinfo/r-help
 
 

-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help


Re: [R] half normal probability plot in R

2003-12-14 Thread kjetil
On 13 Dec 2003 at 4:00, Allison Jones wrote:

 I have generated the effects in a factorial design and now want to put
 them in a half normal probability plot. Is there an easy way to do
 this in R??? I can't find the command. Thanks much -
 

qqnorm.aov in package gregmisc (on CRAN)
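If gregmisc is not at hand, a half-normal plot can also be drawn directly in base R. A sketch (my own; `eff` is a placeholder for the vector of estimated effects):

```r
## half-normal plot: sorted |effects| against half-normal quantiles
set.seed(1)
eff <- rnorm(15)                              # placeholder factorial effects
ae  <- sort(abs(eff))
n   <- length(ae)
q   <- qnorm(0.5 * (1 + ((1:n) - 0.5) / n))   # half-normal plotting positions
plot(q, ae, xlab = "Half-normal quantiles", ylab = "|effect|")
```

Effects that stand well off the line through the bulk of the points are the candidates for being real.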

Kjetil Halvorsen


 Ali Jones


__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help


Re: [R] correlation and causality examples

2003-12-14 Thread Jean lobry
Dear R-users,

many thanks to all who replied to my request, approx. one month
ago, about examples illustrating that correlation does not imply
causality. I have tried to compile your suggestions into a
web site, whose URL is given in the screenshot, in PNG format
here:
http://pbil.univ-lyon1.fr/members/lobry/z.png
and in JPEG format here:
http://pbil.univ-lyon1.fr/members/lobry/z.jpg
It's far from being perfect because of a heavy teaching period,
but I hope to improve it in the future, so your suggestions
and comments are always welcome.
All the best,

Jean
--
Jean R. Lobry
Laboratoire BBE-CNRS-UMR-5558, Univ. C. Bernard - LYON I,
43 Bd 11/11/1918, F-69622 VILLEURBANNE CEDEX, FRANCE
allo  : +33 472 43 12 87 fax: +33 472 43 13 88
http://pbil.univ-lyon1.fr/members/lobry/
__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help


[R] contour() should handle the asp parameter

2003-12-14 Thread Patrick Giraudoux
Hi all,

To my knowledge, the current version of contour.default() does not handle the 'asp' 
parameter. This can be embarrassing when displaying e.g. geographical maps. 
Submitted to the opinion of more experienced R programmers, the contour.default() 
function should be changed as follows:

line 7:  add = FALSE, asp = NA, ...)
line 33: plot.window(xlim, ylim, asp = asp, ...)

The new script would be:

contour.default <-
function (x = seq(0, 1, len = nrow(z)), y = seq(0, 1, len = ncol(z)),
    z, nlevels = 10, levels = pretty(zlim, nlevels), labels = NULL,
    xlim = range(x, finite = TRUE), ylim = range(y, finite = TRUE),
    zlim = range(z, finite = TRUE), labcex = 0.6, drawlabels = TRUE,
    method = "flattest", vfont = c("sans serif", "plain"), axes = TRUE,
    frame.plot = axes, col = par("fg"), lty = par("lty"), lwd = par("lwd"),
    add = FALSE, asp = NA, ...)
{
    if (missing(z)) {
        if (!missing(x)) {
            if (is.list(x)) {
                z <- x$z
                y <- x$y
                x <- x$x
            }
            else {
                z <- x
                x <- seq(0, 1, len = nrow(z))
            }
        }
        else stop("no `z' matrix specified")
    }
    else if (is.list(x)) {
        y <- x$y
        x <- x$x
    }
    if (any(diff(x) <= 0) || any(diff(y) <= 0))
        stop("increasing x and y values expected")
    if (!is.matrix(z) || nrow(z) <= 1 || ncol(z) <= 1)
        stop("no proper `z' matrix specified")
    if (!add) {
        plot.new()
        plot.window(xlim, ylim, asp = asp, ...)
        title(...)
    }
    if (!is.double(z))
        storage.mode(z) <- "double"
    method <- pmatch(method[1], c("simple", "edge", "flattest"))
    if (!is.null(vfont))
        vfont <- c(typeface = pmatch(vfont[1], Hershey$typeface) -
            1, fontindex = pmatch(vfont[2], Hershey$fontindex))
    if (!is.null(labels))
        labels <- as.character(labels)
    .Internal(contour(as.double(x), as.double(y), z, as.double(levels),
        labels, labcex, drawlabels, method, vfont, col = col,
        lty = lty, lwd = lwd))
    if (!add) {
        if (axes) {
            axis(1)
            axis(2)
        }
        if (frame.plot)
            box()
    }
    invisible()
}
<environment: namespace:base>




Best regards,

Patrick

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help


Re: [R] contour() should handle the asp parameter

2003-12-14 Thread Prof Brian Ripley
Given contour() has an add= argument whose use is demonstrated for this 
exact reason in `all good books on S', why complicate the contour 
function?

Contrary to popular belief, the `asp parameter' is not a parameter, but an 
argument of plot's default method.

On Sun, 14 Dec 2003, Patrick Giraudoux wrote:

 To my knowledge, the current version of contour.default() does not handle the 'asp' 
 parameter. This can be embarrassing when displaying e.g. geographical maps. 
 Submitted to the opinion of more experienced R programmers, the contour.default() 
 function should be changed as follows:
 
 line 7:  add = FALSE, asp = NA, ...)
 line 33: plot.window(xlim, ylim, asp = asp, ...)

And of course alter the documentation.

I suspect you could get away with passing ... to the plot.window call, 
which for clarity should be plot.window(xlim, ylim, , asp = asp)
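For completeness, the add=TRUE route that the reply alludes to looks like this (a sketch with made-up data, not from the thread):

```r
## set up the coordinate system yourself, with the aspect ratio you want,
## then overlay the contour lines on it
x <- seq(0, 2, len = 25)
y <- seq(0, 1, len = 25)
z <- outer(x, y, function(u, v) sin(pi * u) * cos(pi * v))
plot(range(x), range(y), type = "n", asp = 1, xlab = "", ylab = "")
contour(x, y, z, add = TRUE)   # inherits the asp already in force
```

This keeps contour() itself unchanged: the aspect ratio belongs to the coordinate system, which the user sets up before adding the contours.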

-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help


[R] A faster plotOHLC() for the tseries package

2003-12-14 Thread Dirk Eddelbuettel

The plotOHLC function in the tseries package is useful to plot timeseries of
various financial assets with open/high/low/close data.  I had often
wondered if it could be made to run a little faster. It turns out that the
following patch does 

--- plotOHLC.R.orig 2003-12-14 12:02:20.0 -0600
+++ plotOHLC.R  2003-12-14 12:03:42.0 -0600
@@ -21,14 +21,9 @@
     ylim <- range(x[is.finite(x)])
     plot.new()
     plot.window(xlim, ylim, ...)
-    for (i in 1:NROW(x)) {
-        segments(time.x[i], x[i, "High"], time.x[i], x[i, "Low"], 
-            col = col[1], bg = bg)
-        segments(time.x[i] - dt, x[i, "Open"], time.x[i], x[i, 
-            "Open"], col = col[1], bg = bg)
-        segments(time.x[i], x[i, "Close"], time.x[i] + dt, x[i, 
-            "Close"], col = col[1], bg = bg)
-    }
+    segments(time.x, x[, "High"], time.x, x[, "Low"], col = col[1], bg = bg)
+    segments(time.x - dt, x[, "Open"], time.x, x[, "Open"], col = col[1], bg = bg)
+    segments(time.x, x[, "Close"], time.x + dt, x[, "Close"], col = col[1], bg = bg)
     if (ann) 
         title(main = main, xlab = xlab, ylab = ylab, ...)
     if (axes) {

decrease the time spent on a series of ~500 points by a factor of sixty:

> IBM <- get.hist.quote("IBM", "2001-12-14")
trying URL
`http://chart.yahoo.com/table.csv?s=IBM&a=11&b=13&c=2001&d=11&e=12&f=2003&g=d&q=q&y=0&z=IBM&x=.csv'
Content type `application/octet-stream' length unknown
opened URL
.. .. ...
downloaded 23Kb

time series starts 2001-12-12
time series ends   2003-12-11
> system.time(plotOHLC(IBM))        # original
[1] 1.56 0.26 5.11 0.00 0.00
> system.time(fastplotOHLC(IBM))    # patched
[1] 0.02 0.00 0.05 0.00 0.00
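The gain comes from segments() itself being vectorized over its coordinate arguments: one call with n sets of endpoints replaces n calls made from an R-level loop. A toy illustration of the same effect (my own, not from the post):

```r
## one vectorized segments() call vs. n calls from an R-level loop
n <- 500
x <- runif(n); y <- runif(n)
plot.new(); plot.window(c(0, 1), c(0, 1))
system.time(for (i in 1:n) segments(x[i], 0, x[i], y[i]))   # n calls
system.time(segments(x, 0, x, y))                           # one call
```

The drawn output is identical; only the number of interpreter round-trips differs.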


Regards,  Dirk

-- 
Those are my principles, and if you don't like them... well, I have others.
-- Groucho Marx

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help


Re: [R] A faster plotOHLC() for the tseries package

2003-12-14 Thread Dirk Eddelbuettel

Hi Gabor,

On Sun, Dec 14, 2003 at 02:35:40PM -0500, Gabor Grothendieck wrote:
 
 Dirk,
 
 Could you please explain to me how to interpret the lines
 that you posted (which I gather are intended to be used with
 some program that combines them with the original source)?

That is the usual paradigm: use the output of diff(1)

 $ diff -u old new > diff.txt

as input to the patch(1) program, e.g.

 $ patch < diff.txt     # try   patch --dry-run < diff.txt   first
On win2k, you can get them for sure with Cygwin, probably also with
mingw/msys and likely also with BDR's set of tools to build R from source.
 
Patch, written by Larry Wall of Perl fame, reads an entry such as

   --- plotOHLC.R.orig 2003-12-14 12:02:20.0 -0600
   +++ plotOHLC.R 2003-12-14 12:03:42.0 -0600
   @@ -21,14 +21,9 @@

and knows to replace lines marked with '-' (taken from the old file) with
those marked '+'.  In this example, it is trivial as there is only one
segment, in which the code 

 for (i in 1:NROW(x)) {
     segments(time.x[i], x[i, "High"], time.x[i], x[i, "Low"], 
         col = col[1], bg = bg)
     segments(time.x[i] - dt, x[i, "Open"], time.x[i], x[i, 
         "Open"], col = col[1], bg = bg)
     segments(time.x[i], x[i, "Close"], time.x[i] + dt, x[i, 
         "Close"], col = col[1], bg = bg)
 }

with 

 segments(time.x, x[, "High"], time.x, x[, "Low"], col = col[1], bg = bg)
 segments(time.x - dt, x[, "Open"], time.x, x[, "Open"], col = col[1], bg = bg)
 segments(time.x, x[, "Close"], time.x + dt, x[, "Close"], col = col[1], bg = bg)
 
 Alternately, could you just send the revised OHLC source?

Well, the source is different from what you find in $R_HOME/library/tseries/R,
so you may as well edit there by hand. I don't have access to a Windows box
right now, but if the above doesn't help, email off-line and I'll start up the
laptop from work.

Hope this helps,  Dirk

-- 
Those are my principles, and if you don't like them... well, I have others.
-- Groucho Marx

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help


[R] reverse lexicographic order

2003-12-14 Thread Murray Jorgensen
Hi all,

I have some email addresses that I would like to sort in reverse 
lexicographic order so that addresses from the same domain will be 
grouped together. How might that be done?

Murray

--
Dr Murray Jorgensen  http://www.stats.waikato.ac.nz/Staff/maj.html
Department of Statistics, University of Waikato, Hamilton, New Zealand
Email: [EMAIL PROTECTED]    Fax 7 838 4155
Phone +64 7 838 4773 wk    +64 7 849 6486 home    Mobile 021 1395 862
__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help


[R] pca

2003-12-14 Thread fciclone
Dear listmates, I've done a PCA analysis in R (v. 1.8) 
with the command

pca(Matrix, cent=FALSE, scle=FALSE)  

I have obtained a v matrix very different from the  
component matrix resulting from a factor analysis in SPSS, 
unrotated and with extraction from a correlation 
matrix. Any clues about this difference?
Thanks in advance,
Alex.
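A common source of such discrepancies (my reading of the symptoms, not a confirmed diagnosis): SPSS's unrotated extraction works from the correlation matrix, i.e. centred and scaled data, while cent=FALSE, scle=FALSE analyses raw cross-products. In base R, the correlation-matrix analogue is:

```r
## PCA of the correlation matrix, for comparison with SPSS's unrotated solution
set.seed(1)
M  <- matrix(rnorm(100), 20, 5)   # toy data standing in for Matrix
pc <- princomp(M, cor = TRUE)     # cor = TRUE: centre AND scale
unclass(pc$loadings)              # comparable up to sign flips per column
```

Note that SPSS additionally scales each component by the square root of its eigenvalue, so the R loadings may need rescaling before the matrices match entry for entry.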
 

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help


Re: [R] reverse lexicographic order

2003-12-14 Thread Peter Dalgaard
Thomas W Blackwell [EMAIL PROTECTED] writes:

 Murray  -
 
 If you could guarantee that all of the email addresses have
 exactly one occurrence of the @ character in them, then
 something like
snip

Otherwise, try something like this (I don't think we have a string
reversal function anywhere, do we?):

> mychar <- scan(what = "")
1: I have some email addresses that I would like to sort in reverse
14: lexicographic order so that addresses from the same domain will be
25: grouped together. How might that be done?
32:
Read 31 items

> mychar[order(sapply(lapply(strsplit(mychar, ""), rev), paste, collapse = ""))]
 [1] together. done? I I
 [5] lexicographic grouped   would be
 [9] bethe   like  same
[13] some  reverse   have  email
[17] will  from  indomain
[21] sotoorder addresses
[25] addresses that  that  that
[29] might sort  How



  Hi all,
 
  I have some email addresses that I would like to sort in reverse
  lexicographic order so that addresses from the same domain will be
  grouped together. How might that be done?
 
  Murray
 
  --
  Dr Murray Jorgensen  http://www.stats.waikato.ac.nz/Staff/maj.html
  Department of Statistics, University of Waikato, Hamilton, New Zealand
  Email: [EMAIL PROTECTED]    Fax 7 838 4155
  Phone +64 7 838 4773 wk    +64 7 849 6486 home    Mobile 021 1395 862
 
  __
  [EMAIL PROTECTED] mailing list
  https://www.stat.math.ethz.ch/mailman/listinfo/r-help
 
 
 __
 [EMAIL PROTECTED] mailing list
 https://www.stat.math.ethz.ch/mailman/listinfo/r-help
 

-- 
   O__  ---- Peter Dalgaard             Blegdamsvej 3
  c/ /'_ --- Dept. of Biostatistics     2200 Cph. N
 (*) \(*) -- University of Copenhagen   Denmark      Ph:  (+45) 35327918
~~~~~~~~~~ - ([EMAIL PROTECTED])                     FAX: (+45) 35327907

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help


Re: [R] reverse lexicographic order

2003-12-14 Thread Murray Jorgensen
Hi Duncan, Hi Peter,

thanks for those ideas!  I'm sure I will learn a lot by fooling around 
with them.

Cheers,

Murray

Duncan Murdoch wrote:

On Mon, 15 Dec 2003 10:08:31 +1300, you wrote:


Hi all,

I have some email addresses that I would like to sort in reverse 
lexicographic order so that addresses from the same domain will be 
grouped together. How might that be done?


I'm not sure this is what you want, but this function sorts a
character vector by last letters, then 2nd last, 3rd last, and so on:
revsort <- function(x, ...) {
    x[order(unlist(lapply(strsplit(x, ''),
        function(x) paste(rev(x), collapse = ''))), ...)]
}

> revsort(as.character(1:20))
 [1] "10" "20" "1"  "11" "2"  "12" "3"  "13" "4"  "14" "5"  "15" "6"  "16" "7" 
[16] "17" "8"  "18" "9"  "19"

The ... args are given to order(), so na.last=FALSE and
decreasing=TRUE are possibilities.
Duncan Murdoch




--
Dr Murray Jorgensen  http://www.stats.waikato.ac.nz/Staff/maj.html
Department of Statistics, University of Waikato, Hamilton, New Zealand
Email: [EMAIL PROTECTED]    Fax 7 838 4155
Phone +64 7 838 4773 wk    +64 7 849 6486 home    Mobile 021 1395 862
__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help


Re: [R] extensive grid docs?

2003-12-14 Thread Paul Murrell
Hi

Most of the documentation that exists for grid is currently linked off 
my grid web site (http://www.stat.auckland.ac.nz/~paul/grid/grid.html).

Paul

[EMAIL PROTECTED] wrote:


[EMAIL PROTECTED] wrote on 10/12/2003 22:23:43:


I'm looking for extensive docs on using grid (for the somewhat newbie).
I'm attempting to create a set of graphics that look similar to the
attached image (I hope this doesn't get bounced) and have only come
across the R newsletters, and it appears that grid was new as of 1.8.0?
I think the best way to proceed is to create the plot, clip it using a
polygon, then manually add the annotation. Is that correct?

I couldn't find much in the FAQ regarding creating really goofy plots
with grid, and any hints would be greatly appreciated.



Here's an R graphics tutorial by Paul Murrell

http://www.ci.tuwien.ac.at/Conferences/DSC-2003/tutorials.html



HTH,
Tobias
__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help


--
Dr Paul Murrell
Department of Statistics
The University of Auckland
Private Bag 92019
Auckland
New Zealand
64 9 3737599 x85392
[EMAIL PROTECTED]
http://www.stat.auckland.ac.nz/~paul/
__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help


RE: [R] read.spss question warning compression bias

2003-12-14 Thread Mulholland, Tom
So it would appear that if the above is correct, there is no user adjustment to the 
bias value.
The only scenario that I can envision is if the user SAVE's the .sav file in an 
uncompressed
format, where the bias value **might** be set to 0.

Perhaps a r-help reader with access to current SPSS manuals can confirm the above.


The windows version 11.5.0 appears the same (I assume the negative sign on -99 was 
somehow dropped)

COMPRESSED and UNCOMPRESSED Subcommands

COMPRESSED saves the file in compressed form. UNCOMPRESSED saves the file in 
uncompressed form. In a compressed file, small integers (from -99 to 155) are 
stored in one byte instead of the eight bytes used in an uncompressed file. 

The only specification is the keyword COMPRESSED or UNCOMPRESSED. There are 
no additional specifications. 

Compressed data files occupy less disk space than do uncompressed data files.

Compressed data files take longer to read than do uncompressed data files.

The GET command, which reads SPSS-format data files, does not need to specify 
whether the files it reads are compressed or uncompressed.

Only one of the subcommands COMPRESSED or UNCOMPRESSED can be specified per 
SAVE command. COMPRESSED is usually the default, though UNCOMPRESSED may be 
the default on some systems.

Ciao, Tom

_
 
Tom Mulholland
Senior Policy Officer
WA Country Health Service
Tel: (08) 9222 4062
 
The contents of this e-mail transmission are confidential and may be protected by 
professional privilege. The contents are intended only for the named recipients of 
this e-mail. If you are not the intended recipient, you are hereby notified that any 
use, reproduction, disclosure or distribution of the information contained in this 
e-mail is prohibited. Please notify the sender immediately.


-Original Message-
From: Marc Schwartz [mailto:[EMAIL PROTECTED] 
Sent: Friday, 12 December 2003 3:56 AM
To: Thomas Lumley
Cc: [EMAIL PROTECTED]
Subject: Re: [R] read.spss question warning compression bias


On Thu, 2003-12-11 at 12:32, Thomas Lumley wrote:
 On Thu, 11 Dec 2003, Marc Schwartz wrote:
 
  An additional question might be, if the file is not compressed, what 
  is the default bias value set by SPSS? If it is 0, then the check is 
  meaningless. On the other hand, if the default value is 100, whether 
  or not the file is compressed, then the warning message would serve 
  a purpose in flagging the possibility of other issues. Reasonably, 
  that setting may be SPSS version specific.
 
 
 I think the issue is that the format is not documented, so the author 
 of the code (Ben Pfaff) didn't know what a change in the value would 
 imply. If the file is apparently read correctly it seems that it 
 doesn't imply anything.
 
   -thomas



Thanks for the clarification Thomas.

I did some searching of the PSPP site and found the following:

http://www.gnu.org/software/pspp/manual/pspp_18.html#SEC170

The compression bias is defined as:

flt64 bias;
Compression bias. Always set to 100. The significance of this
value is that only numbers between (1 - bias) and (251 - bias)
can be compressed.


So it would seem to potentially impact aspects of the file compression data structure, 
when compression is used.

I am not sure if the "Always set to 100" is unique to PSPP in how Ben elected 
to do things. Presumably if that is always the case, even with SPSS, one might 
reasonably wonder: why have it, if it does not vary?

It leaves things unclear as to under what circumstances this value would change. 

I did some Googling and found the following text snippet from a presumably dated SPSS 
manual for the syntax of the SAVE command:


SAVE OUTFILE=file 

 [/VERSION={3**}]
           {2  }

 [/UNSELECTED=[{RETAIN}]
               {DELETE}

 [/KEEP={ALL**  }] [/DROP=varlist]
        {varlist}

 [/RENAME=(old varlist=new varlist)...] 

 [/MAP] 

 [/{COMPRESSED  }]
   {UNCOMPRESSED}

**Default if the subcommand is omitted.


COMPRESSED and UNCOMPRESSED Subcommands 

COMPRESSED saves the file in compressed form. UNCOMPRESSED saves the file in 
uncompressed form. In a compressed file, small integers (from 
99 to 155) are stored in one byte instead of the eight bytes used in an uncompressed 
file.

The only specification is the keyword COMPRESSED or UNCOMPRESSED. There are no 
additional specifications. 

Compressed data files occupy less disk space than do uncompressed data files. 

Compressed data files take longer to read than do uncompressed data files. 

The GET command, which reads SPSS-format data files, does not need to specify whether 
the files it reads are compressed or uncompressed. 

Only one of the subcommands COMPRESSED or UNCOMPRESSED can be specified per SAVE 
command. COMPRESSED is usually the default, though UNCOMPRESSED may be the default on 
some systems.




So it would appear that if the above is correct, there is no user adjustment to the 
bias value. The only scenario that I can envision is if 

Re: [R] Problem with data conversion

2003-12-14 Thread Paul E. Johnson
I sympathize with your trouble bringing in data, but you need to catch 
your breath and figure out what you really have.  I think when you get a 
bit more R practice, you will be able to manage what you bring in 
without going back to that editor so much.

I feel certain your data is not what you think it is.  Here's an example 
where a factor DOES work on the lhs of a glm:

> y <- factor(c("S","N","S","N","S","N","S","N"))
> x <- rnorm(8)
> glm(y ~ x, family = binomial(link = "logit"))

Look here: the system knows y is a factor:

> attributes(y)
$levels
[1] "N" "S"

$class
[1] "factor"
My guess is that your variables are not really factors, but rather 
character vectors.  You have to convert them into factors.
Watch: the error I get is the same one you got.

> y <- c("S","N","S","N","S","N","S","N")
> glm(y ~ x, family = binomial(link = "logit"))
Error in model.frame(formula, rownames, variables, varnames, extras, 
extranames,  :
    invalid variable type

Note the system doesn't know y is supposed to be a factor. It just 
sees characters.

> y
[1] "S" "N" "S" "N" "S" "N" "S" "N"
> levels(y)
NULL
> attributes(y)
NULL

but look:

> glm(as.factor(y) ~ x, family = binomial(link = "logit"))


[EMAIL PROTECTED] wrote:

Hi All:
I came across the following problem while working with a dataset, and 
wondered if there could be a solution I sought here.

My dataset consists of information on 402 individuals with the 
followng five variables (age,sex, status = a binary variable with 
levels case or control, mma, dma).
During data check, I found that in the raw data, the data entry 
operator had mistakenly put a 0 for one participant, so now, the 
levels show

> levels(status) 
[1] "0"       "control" "case"
The variables mma, and dma are actually numerical variables but in the 
dataframe, they are represented as characters. I tried to change the 
type of the variables (from character to numeric) using the edit 
function (and bringing up the data grid where then I made changes), 
but the changes were not saved. I tried
mma1 <- as.numeric(mma)
but I was not successful in converting mma from a character variable 
to a numeric variable.
So, to edit and clean the data, I exported the dataset as a text 
file to Epi Info 2002 (version 2, Windows). I used the following code:
mysubset <- subset(workingdat, select = c(age, sex, status, mma, dma))
write.table(mysubset, file = "mysubset.txt", sep = "\t", col.names = NA)
After I made changes in the variables using Epi Info (I created a new 
variable called statusrec containing values case and control), I 
exported the file as a .rec file (filename mydata.rec). I used the 
following code to read the file in R:
require(foreign)
myData <- read.epiinfo("mydata.rec", read.deleted = NA)
Now, the problem is this, when I want to run a logistic regression, R 
returns the following error message:

> glm(statusrec ~ mma, family = binomial(link = "logit"))
Error in model.frame(formula, rownames, variables, varnames, extras, 
extranames,  :
  invalid variable type

I cannot figure out the solution. I want to run a logistic regression 
now with the variable statusrec (which is a binary variable containing 
values case and control), and another
variable (say mma, which is now a numeric variable). What does the 
above error message mean and what could be a possible solution?
Would greatly appreciate your insights and wisdom.
-Arin Basu

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help


--
Paul E. Johnson   email: [EMAIL PROTECTED]
Dept. of Political Sciencehttp://lark.cc.ukans.edu/~pauljohn
University of Kansas  Office: (785) 864-9086
Lawrence, Kansas 66045FAX: (785) 864-5700
__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help


[R] help in lme

2003-12-14 Thread cbotts1
To anyone who can help,

  I have two stupid questions and one fairly intelligent question.

Stupid question (1):  is there an R function to calculate the factorial of a number?
That is, is there a function g(.) such that g(3) = 6, g(4) = 24, g(6) = 720, etc.?

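For stupid question (1), the classical route in R is through the gamma function, since gamma(n + 1) equals n!:

```r
## n! via the gamma function
g <- function(n) gamma(n + 1)
g(3)        # 6
g(4)        # 24
g(6)        # 720
prod(1:6)   # 720 as well, for small integer n
```

Either form serves as the g(.) asked for.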
Stupid question (2):  how do you extract the estimated covariance matrix of the random 
effects in an lme object?

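For stupid question (2), nlme ships VarCorr() for exactly this. A sketch on the Orthodont data bundled with nlme (not the poster's data):

```r
## variances, standard deviations and correlations of the random effects
library(nlme)
fit <- lme(distance ~ age, data = Orthodont, random = ~ age | Subject)
VarCorr(fit)
```

The printed table has one row per random effect plus a residual row.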

Intelligent question (1):  I keep trying to fit a linear mixed model in R using 
'lme(y ~ fxd.dsgn, data = data.mtrx, ~rnd.dsgn | group)', where fxd.dsgn and 
rnd.dsgn are the fixed and random design matrices, respectively.  The function 
won't work, though.  It keeps telling me that it can't find the object 
'rnd.dsgn'.  What's the matter here?

Any help would be greatly appreciated.

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help


Re: [R] help with random numbers and Rmpi

2003-12-14 Thread Faheem Mitha


On Mon, 1 Dec 2003, A.J. Rossini wrote:

 Faheem Mitha [EMAIL PROTECTED] writes:

  So, can this (parallelization at the C level) be done without running a
  bunch of C slaves along the lines I had previously written? Any examples
  would be helpful.

 How much heavy lifting happens before you spawn the slaves, and can
 that not be moved to R?

 Your best bet is to read the SNOW code for handling SPRNG/RSPRNG,
 otherwise.

I've tried to use snow as suggested. I have a R function mg.randvec which
generates a vector of random variates. This function calls a C routine via
the .C call. This works fine if I call it like say...

*
> mg.randvec(3, 2, 10, 5)
$val
 [1] -1.9967464 -1.8634205 -0.7459255 -1.7591047 -1.7811685 -1.9953316
 [7] -1.7932502 -1.9823565 -1.7999789 -1.0501179 -1.9679886  0.1484859
[13]  0.5768898  1.9117889  1.9366872 -1.3847453 -1.5554107 -1.4933195
[19] -1.8508795 -1.6715850 -1.8951212 -1.8900167 -1.1630852 -1.3989748
[25] -1.9400337 -1.6774471 -1.8136065 -1.8685709 -1.9119879 -1.3378416
*

However, with snow I get

**
> clusterCall(cl, mg.randvec, 3, 2, 10, 5)
[[1]]
[1] "Error in .C(\"rocftp\", as.integer(k), as.integer(len),
as.double(theta),  : \n\tC/Fortran function name not in load table\n"
attr(,"class")
[1] "try-error"

[[2]]
[1] "Error in .C(\"rocftp\", as.integer(k), as.integer(len),
as.double(theta),  : \n\tC/Fortran function name not in load table\n"
attr(,"class")
[1] "try-error"

[[3]]
[1] "Error in .C(\"rocftp\", as.integer(k), as.integer(len),
as.double(theta),  : \n\tC/Fortran function name not in load table\n"
attr(,"class")
[1] "try-error"


In the cluster case it seems to have difficulty loading up the C routine.

I think snow is working Ok, because basic examples like the following
work.


> clusterCall(cl, runif, 3)
[[1]]
[1] 0.1527429 0.1134621 0.8663094

[[2]]
[1] 0.2256776 0.8981241 0.1120226

[[3]]
[1] 0.2371450 0.5090693 0.2776081
**

Can anyone tell me what I am doing wrong? All data files are shared across
all three machines I am using (AFS space).
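A guess at the cause (mine, not confirmed in the thread): "C/Fortran function name not in load table" from .C() on the slaves usually means the shared library was dyn.load()ed only in the master session. Loading it on every node first should help; `cl` and mg.randvec are the poster's objects, and the .so path below is hypothetical:

```r
library(snow)
## `cl` is the cluster already created with makeCluster(); substitute the
## real location of the compiled rocftp code for the placeholder path.
clusterEvalQ(cl, dyn.load("/path/to/rocftp.so"))   # load the C code on each slave
clusterCall(cl, mg.randvec, 3, 2, 10, 5)           # now .C("rocftp") should resolve
```

runif() works on the slaves because it lives in R's own shared code, which every node loads at startup; user code has to be loaded explicitly.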

Thanks in advance.

Faheem.

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help