[R] The function predict

2008-02-11 Thread Carla Rebelo
Good Morning!

Could you help me? I need to understand the function predict: the
algorithm it implements and the calculations involved. Where
can I find this information?

Thank You!

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
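
One way to approach the question above: predict is a generic function, so the
calculations live in a class-specific method such as predict.lm, and the
method's help page and source are the places to look. The sketch below assumes
a linear model fitted to the built-in cars data; the object names are only
illustrative, and other model classes have their own methods.

```r
# predict() is generic: the work is done by a method chosen from the class of the fit.
fit <- lm(dist ~ speed, data = cars)
class(fit)                      # "lm", so predict() dispatches to predict.lm

# List the available methods and fetch the source of one of them.
predict_methods <- methods(predict)
lm_method <- getAnywhere("predict.lm")

# For lm, the prediction is the linear predictor: intercept + slope * speed.
new_speed <- data.frame(speed = 10)
pred <- predict(fit, newdata = new_speed)
manual <- coef(fit)[["(Intercept)"]] + coef(fit)[["speed"]] * 10
all.equal(unname(pred), manual)
```

The help page ?predict.lm documents the arguments; printing lm_method shows
the code that is actually run.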


[R] tree() producing NA's

2008-02-11 Thread Amnon Melzer
Hi

 

Hoping someone can help me (a newbie).

 

I am trying to construct a tree using tree() in package tree. One of the
fields is a factor field (owner), with many levels. In the resulting tree, I
see many NA's (see below), yet in the actual data there are none.

 

 rr200.tr <- tree(backprof ~ ., rr200)
 rr200.tr
1) root 200 1826.00 -0.2332
...
[snip]
...
5) owner: Cliveden Stud,NA,NA,NA,NA,NA,NA,NA,NA 10   14.25  1.5870 *
  3) owner: B E T Partnership,Flaming Sambuca Syndicate,NA,NA,NA,NA,NA,NA,NA,NA 11  384.40 10.5900
6) decodds < 12 5   74.80  6.3000 *
7) decodds > 12 6  140.80 14.1700 *
 

Can anyone tell me why this happens and what I can do about it?

 

Regards

 

Amnon

 




[R] Dendrogram for agglomerative hierarchical clustering result

2008-02-11 Thread noorpiilur
Hey group,

I have a problem drawing a dendrogram from the result of my program
written in C. My algorithm is an approximation algorithm for the single
linkage method. As a result I get the following data:

[Average distance] [cluster A] [cluster B]

For example:
42.593141   1   26
42.593141   4   6
42.593141   123 124
42.593141   4   113
74.244206   1   123
74.244206   4   133
74.244206   1   36

So far I have used C to generate a bitmap output but I would like to
use the computed result as an input for R to just draw the dendrogram.

As I'm new to R any help is appreciated.

Thanks,
Risto
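
A possible route, sketched under assumptions: build an object of class hclust
by hand from the merge list and let R's own plot method draw it. The small
merge table below is made up; the sketch assumes each row merges the clusters
that currently contain items A and B, that the rows are in merge order, and
that there are exactly n - 1 rows for n items. The helper leaf_order is
hypothetical, not part of R.

```r
# Hypothetical merge list in the layout from the post: distance, item A, item B.
merges <- data.frame(dist = c(1.0, 1.5, 3.0, 4.0),
                     a    = c(1, 3, 1, 1),
                     b    = c(2, 4, 3, 5))

items <- sort(unique(c(merges$a, merges$b)))
cluster_of <- -seq_along(items)   # hclust convention: singletons are negative

# Translate each row into hclust's merge matrix: negative = single item,
# positive = index of the earlier merge step that formed the cluster.
merge_mat <- matrix(0L, nrow(merges), 2)
for (k in seq_len(nrow(merges))) {
  ca <- cluster_of[match(merges$a[k], items)]
  cb <- cluster_of[match(merges$b[k], items)]
  merge_mat[k, ] <- c(ca, cb)
  cluster_of[cluster_of %in% c(ca, cb)] <- k
}

# Leaf order for plotting: walk the tree from the last merge downwards.
leaf_order <- function(k) {
  unlist(lapply(merge_mat[k, ], function(m) if (m < 0) -m else leaf_order(m)))
}

hc <- structure(list(merge = merge_mat, height = merges$dist,
                     order = leaf_order(nrow(merges)),
                     labels = as.character(items)),
                class = "hclust")
plot(hc)
```

The C program's output could be read with read.table() into a data frame like
merges; whether the assumptions above match that output needs checking.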



[R] image quality

2008-02-11 Thread John Lande
Dear all,
I am writing Sweave documentation for my analysis, and I am plotting huge
scatter plots of microarray data.
Unfortunately this takes a lot of resources on my PC because the image
quality is too high (the PC stalls while drawing each single spot).
How can I overcome this problem? Is there a way to make the images lighter?


john
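
One option, sketched with simulated data: draw the plot on a bitmap device
such as png() instead of a vector device. A vector file stores every point as
a separate object, while a bitmap stores only pixels. The data, file name and
image size below are assumptions, and png() needs an R build with PNG support.

```r
# Simulated stand-in for a large microarray scatter plot (values are made up).
set.seed(1)
n <- 50000
x <- rnorm(n)
y <- x + rnorm(n, sd = 0.5)

# A bitmap device stores pixels, not one vector object per point,
# so the file stays small and the viewer stays responsive.
out_file <- file.path(tempdir(), "scatter.png")
png(out_file, width = 800, height = 800)
plot(x, y, pch = ".", xlab = "sample 1", ylab = "sample 2")
dev.off()
file.exists(out_file)
```

In a Sweave document the resulting file could then be pulled in with
\includegraphics rather than produced by a fig=TRUE chunk.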



[R] Conditional rows

2008-02-11 Thread Ng Stanley
Hi,

Given a simple example,  test <- matrix(c(0.1, 0.2, 0.1, 0.2, 0.1, 0.1, 0.3,
0.1, 0.1), 3, 3)

How can I generate the indexes of the rows whose values are all
less than or equal to 0.2? For this example, rows 2 and 3 are the correct
ones.

Thanks
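
A minimal sketch of one way to do this, using the matrix from the question:
compare every element with 0.2, then keep the rows where all comparisons are
TRUE. The name rows_ok is only illustrative.

```r
test <- matrix(c(0.1, 0.2, 0.1, 0.2, 0.1, 0.1, 0.3, 0.1, 0.1), 3, 3)

# test <= 0.2 is a logical matrix; apply(..., 1, all) checks each row.
rows_ok <- which(apply(test <= 0.2, 1, all))
rows_ok   # 2 3
```

Row 1 is excluded because its third value is 0.3.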



Re: [R] how to generate a column based on other columns in a data frame

2008-02-11 Thread Henrique Dallazuanna
Try this:

x2 <- merge(x, cbind(unique(x), Site=sprintf("S%d",
seq_len(nrow(unique(x))))), by=c("X", "Y"))
x2[order(x2$Site), ]

On 11/02/2008, Weidong Gu [EMAIL PROTECTED] wrote:
 HI,



 I am working on a data set with multiple collections of mosquitoes at
 sampling sites. Each row represents a collection of individual samples
 with coordinates for each collection.

 ... X,  Y,...

 1  36.435 30.118

 2  36.435 30.118

 3  36.435 30.118

 4  35.329 29.657

 5  35.329 29.657

 6  36.431 30.111

 7  36.431 30.111

 8  35.421 29.797

 9  35.421 29.797

 10 35.421 29.797



 Unfortunately, there is no 'site' entry. I would like to add a column of
 'site' based on the coordinates of samples so that samples from the same
 sites have the same site ID like S1, S2,



 How to do this in R way? Thanks.





 Weidong Gu,

 Department of Medicine
 University of Alabama, Birmingham
 1900 University Blvd., Birmingham, Alabama 35294
 Email: [EMAIL PROTECTED]
 PH: (205)-975-9053







-- 
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40 S 49° 16' 22 O
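
An alternative sketch that keeps the original row order, which merge() does
not guarantee. It assumes the coordinates are in columns named X and Y, as in
the question, and builds a text key per row; the names key and Site are
illustrative.

```r
x <- data.frame(
  X = c(36.435, 36.435, 36.435, 35.329, 35.329,
        36.431, 36.431, 35.421, 35.421, 35.421),
  Y = c(30.118, 30.118, 30.118, 29.657, 29.657,
        30.111, 30.111, 29.797, 29.797, 29.797)
)

# One key per coordinate pair; match() numbers the pairs in order of first appearance.
key <- paste(x$X, x$Y)
x$Site <- paste0("S", match(key, unique(key)))
x
```

Rows 1 to 3 get S1, rows 4 and 5 get S2, rows 6 and 7 get S3, and rows 8 to 10
get S4.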



[R] gcc 4.3 any known issues?

2008-02-11 Thread Stefan Grosse
Hi,

Fedora is switching to gcc 4.3 for Fedora 9. Before I test it (rawhide) I want
to be sure that R runs. So my question is whether there have been any
issues compiling R and packages with gcc 4.3?

Stefan



Re: [R] Tinn-R not working well with latest R

2008-02-11 Thread John C Frain
Pending the solution of the problem I use tinn-R as follows.

1) I make none of the recommended additions to the Rprofile.site file
2) I start tinn-R from the desktop and then load an r-file from my
working directory.
3) I then start R from the  R| start preferred Rgui | menu in Tinn-R.
This has the effect of starting R in the work directory and any saved
data there will be loaded automatically.

I may be missing some functionality but I like the way it works.  As
far as I can see this is not documented.

I do occasionally have a problem which I cannot replicate at will.
If I have the cursor in the middle of a line containing an R command,
anything I try to insert at the cursor is instead inserted one
character per line above the line that I am amending.  For example,
if I try to change

x <- a + b + c + d + e to
x <- a + b123 + c + d + e

Tinn-R displays

1
2
3
x <- a + b + c + d + e

with the cursor staying after the b.

To recover, close Tinn-R and restart, and the problem vanishes.  Any suggestions?

Best Regards

John


On 11/02/2008, Farrel Buchinsky [EMAIL PROTECTED] wrote:
 I recently installed R 2.6.2 and am getting errors on startup that relate to
 svIDE being loaded by Tinn-R.


 Loading required package: tcltk
 Loading Tcl/Tk interface ... done
 Warning messages:
 1: '\A' is an unrecognized escape in a character string
 2: unrecognized escape removed from ;for Options\AutoIndent: 0=Off,
 1=follow language scoping and 2=copy from previous line\n
 3: In grep(paste([{]TclEval , topic, [}], sep = ), tclvalue(.Tcl(dde
 services TclEval {})), :
 argument 'useBytes = TRUE' will be ignored
 Loading required package: svMisc
 Loading required package: R2HTML


 Any idea what is going on?
 I use R 2.6.2 on Windows XP.

 I also started R without the profile that Tinn-R made.
 If I manually enter library(svIDE) then I get:
  library(svIDE)
 Warning messages:
 1: '\A' is an unrecognized escape in a character string
 2: unrecognized escape removed from ;for Options\AutoIndent: 0=Off,
 1=follow language scoping and 2=copy from previous line\n

 So the underlying problem may be svIDE
 see: http://tolstoy.newcastle.edu.au/R/e2/help/07/04/15738.html

 Apparently, because of this error, several great features in Tinn-R are not
 working properly.
 Any solutions or workarounds?


 --
 Farrel




-- 
John C Frain
Trinity College Dublin
Dublin 2
Ireland
www.tcd.ie/Economics/staff/frainj/home.html
mailto:[EMAIL PROTECTED]
mailto:[EMAIL PROTECTED]
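
On the startup warnings quoted above: '\A' is reported because a single
backslash inside an R string starts an escape sequence, and \A is not one R
knows. A literal backslash has to be written as two. The sketch below only
illustrates that rule; the string is taken from the warning text, and where
exactly it sits inside svIDE is not something this example shows.

```r
# A single backslash starts an escape sequence inside an R string,
# so a literal backslash has to be written as two.
key <- "Options\\AutoIndent"
cat(key, "\n")        # prints Options\AutoIndent
nchar(key)            # 18: the doubled backslash is one character
```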



Re: [R] Subsetting a data.frame degenerates at one column?

2008-02-11 Thread Allen S. Rout
jim holtman [EMAIL PROTECTED] writes:

 try:

 input[,targets, drop=FALSE]

 see:

 ?"["

 for an explanation.


Thanks, you who responded; this was exactly helpful, and a good
reference to the part of the FM I was missing.  To unpack (and
demonstrate some comprehension gained.. ;) the subsetting operations
on data frames, by default, use the most basic data type capable of
representing the answer.  

Either the drop=FALSE or the inputs[targets] solution give me the
result I had in mind.  I mildly prefer the [targets] statement from a
visual perspective.





- Allen S. Rout
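
The behaviour described above can be seen on a toy data frame. This is a
sketch with made-up data; input and targets mirror the names used in the
thread.

```r
input <- data.frame(a = 1:3, b = letters[1:3])
targets <- "a"

v  <- input[, targets]                 # one column: simplified to a vector
d1 <- input[, targets, drop = FALSE]   # drop = FALSE keeps the data frame
d2 <- input[targets]                   # list-style indexing also keeps it

is.data.frame(v)    # FALSE
is.data.frame(d1)   # TRUE
is.data.frame(d2)   # TRUE
```

With two or more target columns all three forms return a data frame, which is
why the problem only shows up at one column.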



Re: [R] Tinn-R not working well with latest R

2008-02-11 Thread John C Frain
Corinna

Thanks for the suggestion.  I can not duplicate the error myself.  I
generally have a code segment open in Tinn-R and have sent it to R and
wish to rerun it with some changes but  when I try to make the changes
they are transferred to the line above one character per line.  This
has happened about 5/6 times since Christmas.  The code segments were
different. I can think of no common factor that might have caused the
problem.  Closing tinn-r and R and restarting always cured the problem
which then did not occur again for  several days and in different
circumstances.  I have not looked for help because I have been unable
to replicate the problem.

John Frain

On 11/02/2008, Schmitt, Corinna [EMAIL PROTECTED] wrote:



 Hello,

  I had the same problems before. I think the best solution is to copy the
 code you need out of Tinn-R with Ctrl+C. Then open R directly from
 your desktop, NOT from Tinn-R. Then paste in the command. You can still make
 changes to the command before you press Enter by using the arrow
 keys on the keyboard: put the cursor where you want in the command line
 and change it.

  Hope that is what you want. I cannot reproduce your example.

  Corinna





-- 
John C Frain
Trinity College Dublin
Dublin 2
Ireland
www.tcd.ie/Economics/staff/frainj/home.html
mailto:[EMAIL PROTECTED]
mailto:[EMAIL PROTECTED]



[R] WG: Tinn-R not working well with latest R

2008-02-11 Thread Schmitt, Corinna

 
I am in the R command window and just press Ctrl+V.

Corinna


-----Original Message-----
From: Farrel Buchinsky [mailto:[EMAIL PROTECTED]
Sent: Mon 11.02.2008 21:16
To: Schmitt, Corinna
Subject: Re: Tinn-R not working well with latest R
 
I can easily get R to open without an error. I simply removed the Tinn-R 
related lines from the Rprofile.site file
C:\Program Files\R-2.6.2\etc\Rprofile.site

but then when I try to manually load the svIDE library by entering 
library(svIDE) from the command line, I get a similar error.

So when you say "Then paste in the command", what command are you referring
to?
What do you change it to?


Schmitt, Corinna [EMAIL PROTECTED] wrote in message 
news:[EMAIL PROTECTED]


 Hello,

 I had the same problems before. I think the best solution is to copy the
 code you need out of Tinn-R with Ctrl+C. Then open R directly
 from your desktop, NOT from Tinn-R. Then paste in the command. You can
 still make changes to the command before you press Enter by using
 the arrow keys on the keyboard: put the cursor where you want in the
 command line and change it.

 Hope that is what you want. I cannot reproduce your example.

 Corinna






Re: [R] Histogram in Lattice with 3 factors

2008-02-11 Thread Deepayan Sarkar
On 2/11/08, willem vervoort [EMAIL PROTECTED] wrote:
 Dear R-help list,

 I am trying to construct a lattice histogram using 3 factors.

 My dataframe looks like this: (simulating a waterbalance over
 groundwater with different salinities)

 s     days  model  EC  EC_max
 0.4   1     A      10  9
 0.42  2     A      10  9
 0.44  3     A      10  9
 :     :     :      :   :
 0.4   1     B      10  9
 :     :     :      :   :
 0.4   1     A      30  9
 :     :     :      :   :
 0.4   1     A      30  36

 Anyway you get the gist
 EC_max has two levels 9 and 36, EC has 3 levels 10, 30 and 70, and
 model has two levels (A and B). There are say 365 days and s is
 the variable of interest (soil saturation)

 Can maybe be reproduced with:
 data <- data.frame(s = rnorm(2*3*365*2),rep(1:365,12), model =
 sort(rep(c("A","B"),6*365)),
 EC = rep(sort(rep(c(10,30,70),365*2)),2), EC_max =
 rep(sort(rep(c(9,36),3*365)),2))

 I would like to plot histograms with the three factors using Lattice
 so I had the following code:

 my.strip <- function(which.given, ..., factor.levels) {
 levs <- if (which.given == 1)  c("Model A","Model B")
 else {if(which.given == 2) paste("EC = ",as.character(EC),"dS/m")
   else paste("ECmax = ",as.character(EC_max),"dS/m")}
 strip.default(which.given, ..., factor.levels = levs)
  }

 histogram(~s|model*as.factor(EC)*as.factor(EC_max),data=Store,
 xlab="soil saturation",type="density",strip=my.strip)

 But I am doing something wrong, because it plots the histogram for
 factor level EC_max =9 first and than straight over it the histogram
 for factor level 36, so only 6 panels on the graph rather than 12.

 I searched the archives, but no luck so far.

Look up the 'layout' argument in ?xyplot. By default, for 2 or more
conditioning variables, the levels of the first two define columns and
rows, and the rest are spread out over multiple pages. In your
example, you could try layout = c(6, 2) for starters.

-Deepayan
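
The layout suggestion above can be sketched as follows. The data are simulated
with the same factor structure as in the question (the values of s are made
up), and the custom strip function is left out to keep the example short.

```r
library(lattice)

# Simulated data with the same factor structure as in the question.
set.seed(1)
dat <- expand.grid(days = 1:365, EC = c(10, 30, 70),
                   EC_max = c(9, 36), model = c("A", "B"))
dat$s <- runif(nrow(dat))

# 2 x 3 x 2 = 12 panels; layout = c(columns, rows) puts them all on one page.
p <- histogram(~ s | model * factor(EC) * factor(EC_max), data = dat,
               type = "density", xlab = "soil saturation",
               layout = c(6, 2))
print(p)
dim(p)   # levels per conditioning variable: 2 3 2
```

Without the layout argument the third conditioning variable goes to a second
page, which on a screen device looks like one plot drawn over the other.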



[R] Histogram in Lattice with 3 factors

2008-02-11 Thread willem vervoort
Dear R-help list,

I am trying to construct a lattice histogram using 3 factors.

My dataframe looks like this: (simulating a waterbalance over
groundwater with different salinities)

s     days  model  EC  EC_max
0.4   1     A      10  9
0.42  2     A      10  9
0.44  3     A      10  9
:     :     :      :   :
0.4   1     B      10  9
:     :     :      :   :
0.4   1     A      30  9
:     :     :      :   :
0.4   1     A      30  36

Anyway you get the gist
EC_max has two levels 9 and 36, EC has 3 levels 10, 30 and 70, and
model has two levels (A and B). There are say 365 days and s is
the variable of interest (soil saturation)

Can maybe be reproduced with:
data <- data.frame(s = rnorm(2*3*365*2),rep(1:365,12), model =
sort(rep(c("A","B"),6*365)),
EC = rep(sort(rep(c(10,30,70),365*2)),2), EC_max =
rep(sort(rep(c(9,36),3*365)),2))

I would like to plot histograms with the three factors using Lattice
so I had the following code:

my.strip <- function(which.given, ..., factor.levels) {
levs <- if (which.given == 1)  c("Model A","Model B")
else {if(which.given == 2) paste("EC = ",as.character(EC),"dS/m")
  else paste("ECmax = ",as.character(EC_max),"dS/m")}
strip.default(which.given, ..., factor.levels = levs)
 }

histogram(~s|model*as.factor(EC)*as.factor(EC_max),data=Store,
xlab="soil saturation",type="density",strip=my.strip)

But I am doing something wrong, because it plots the histogram for
factor level EC_max =9 first and than straight over it the histogram
for factor level 36, so only 6 panels on the graph rather than 12.

I searched the archives, but no luck so far.

Any help is appreciated

Willem

platform       i386-pc-mingw32
arch           i386
os             mingw32
system         i386, mingw32
status
major          2
minor          6.1
year           2007
month          11
day            26
svn rev        43537
language       R
version.string R version 2.6.1 (2007-11-26)



Re: [R] [OT] good reference for mixed models and EM algorithm

2008-02-11 Thread Murray Jorgensen

Erin,

as well as P & B, can I recommend

McCulloch CE, Searle SR (2000), Generalized, Linear, and Mixed Models, Wiley

I also found

Data analysis using regression and multilevel/hierarchical models by
Andrew Gelman and Jennifer Hill.
  Cambridge ; New York : Cambridge University Press, 2007.

useful although it takes a Bayesian rather than EM approach.

Cheers,  Murray

-- 
Dr Murray Jorgensen  http://www.stats.waikato.ac.nz/Staff/maj.html
Department of Statistics, University of Waikato, Hamilton, New Zealand
Email: [EMAIL PROTECTED]    Fax 7 838 4155
Phone  +64 7 838 4773 wk    Home +64 7 825 0441    Mobile 021 1395 862



Re: [R] R on Mac PRO does anyone have experience with R on such a platform ?

2008-02-11 Thread Charilaos Skiadas
JiHO, in case you are not following TextMate's mailing list, you  
might want to check out Hans-Jorg Bibiko's work on Rdaemon:

http://article.gmane.org/gmane.editors.textmate.general/24195/

It provides a lot of the terminal functionality within a TextMate  
window, uses X11 for the plots, and opens help files either in a  
browser or in a TextMate HTML window. It essentially runs an R  
process in the background, and communicates with it, so I'm not sure  
it would allow you to run R on a remote server. But I think it is  
worth checking out otherwise. Currently you have to install the  
bundles from the above link, but I'm hoping soon we'll be able to  
commit these bundles to TextMate's bundle repository.

Anyone interested in trying it out and having problems, you can email  
TextMate's mailing list (http://macromates.com/community), which both  
I and Hans-Jorg follow closely.

Haris Skiadas
Department of Mathematics and Computer Science
Hanover College

PS: Yes, it is the best $40 I've ever spent, by far.

On Feb 11, 2008, at 2:08 PM, jiho wrote:

 On 2008-February-11  , at 19:14 , Roger Day wrote:
 My experience with R.app on a MACbook has been mostly very positive.
 I like the interface much better than that of Windows--
 with two exceptions.

 a)  I use stepping thru code with control-R.  It's not as convenient
 on Mac-
 the code you want to run has to be actually selected; not good
 enough just
 to be on the line you want.
 That slows down code-stepping.
 b)  saveHistory() doesn't save the history of the current session --
 beware,
 I lost some work that way.  you have to actually click a button.
 c) no resizing graphs post-hoc,
 d) saving graphics to a file is inconvenient except for pdf output.

 Some plusses are:
 a) better built-in editor (if you're not using ESS), including
 delimiter
 matching
 b) the history pane is nice,
 c) the package installer and manager are nicer than on Win,
 d) autocompletion with ctrl-period,
 e) you can select text on the current or past command line much
 easier,
 f) attractive interface with lots of cosmetic options.

 I've done some tkrplot work in both (using X11 in OSX)
 -- some inconsistencies with placement of widgets show up.

 This is off the top of my head.
 Check out the mailing list R-sig-mac for more info.

 After using R via R-app (which is indeed very nice to start with) I
 eventually switched to a combination of TextMate + Terminal + CarbonEL
 - TextMate[1] is a very powerful editor, well worth the $40 price tag,
 and has nice goodies for R besides syntax highlighting such as command
 autocompletion, command templates, plenty of snippets, etc.
 - I run R in a regular Terminal window. This way I get command line
 editing and searching through history. In addition it makes it as easy
 to run R on a remote server as on my local machine (useful to run
 demanding tasks on a large CPU). I can send code from TextMate to the
 terminal prompt using AppleScript commands in TextMate[2]. This lets
 me send selected text _or_ the current line directly to the Terminal with
 just a keystroke.
 - CarbonEL is a package which allows plotting to a quartz window even
 from a simple Terminal (quartz is the Mac OS X graphics engine). The plots
 on quartz look gorgeous and going back to X11 would have been a pain.
 Another similar solution would be to use the Cairo package.

 All in all, I found it a very convenient and flexible way to use R. It
 has the added bonus that the same combination (TM+Terminal) works for
 anything that can run in a terminal window (MATLAB, Scilab, python
 etc.). So, even if you don't use only R, you can keep the same habits
 with a nice editor.

 I haven't tried Emacs+ESS. I've heard a lot of good things about it
 but learning Emacs is a task in itself.

 [1] http://macromates.com/
 [2] modification of those http://jo.irisson.free.fr/?p=32 for the
 built-in Terminal, since Terminal on Leopard finally has tabs

 JiHO
 ---
 http://jo.irisson.free.fr/



Re: [R] barplot or maybe related?

2008-02-11 Thread Manuel Morales
On Mon, 2008-02-11 at 09:31 -0800, questions? wrote:
 I have two distributions, represented by heights of several intervals.
 e.g. the distribution is partitioned into 10 segments, I have
 numbers(freq or counts) associated
 with each region in the format as:
 
 0.2  0.3
 0.1  0.1
 .
 
 0.01 0.02
 
 
 I want to plot the two distributions side by side in meaning that, for
 each region,the
 two bars(in barplot) from the two distribution are adjacent to each
 other.
 
 If you do barplot(beside=T), the two distribution are plotted side by
 side, not interleaved.
 I was wondering there are ways to do what I want

Compare:
mat1 <- matrix(c(1:10), nrow=5, ncol=2)
mat2 <- t(mat1) # Transpose mat1

barplot(mat1, beside=TRUE)
barplot(mat2, beside=TRUE)

-- 
http://mutualism.williams.edu
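
Applied to data shaped like the question's (one column per distribution, one
row per segment), the comparison above comes down to transposing before
plotting. The heights below are made up and the segment labels are
illustrative.

```r
# Two distributions over the same 3 segments (made-up heights), one per column.
heights <- cbind(dist1 = c(0.2, 0.1, 0.7),
                 dist2 = c(0.3, 0.1, 0.6))

# barplot(beside = TRUE) draws one group of bars per column, so transpose to
# get one group per segment with the two distributions next to each other.
mids <- barplot(t(heights), beside = TRUE,
                names.arg = paste("segment", 1:3),
                legend.text = colnames(heights))
dim(mids)   # 2 bars in each of 3 groups
```

barplot() returns the bar midpoints, which is handy for adding labels or error
bars afterwards.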



Re: [R] Logistic regression with repeated measures

2008-02-11 Thread Dieter Menne
Steven Vamosi smvamosi at gmail.com writes:

 In a nutshell, the experiment involved presenting females from two
 groups (treatment, control) with an opportunity to mate with a virgin
 male every 6 hours for 48 hours. Every female was presented this
 opportunity at every time step (i.e., whether or not she mated at 6
 hr, she was again presented with a male at 12 hr, and so on).

. 
 female  group    mass  time  mate
 1       control  5.7   0     1
 1       control  5.7   6     1
. 
 How, then, to determine whether treatment females display different
 mating patterns over time than control females? Here's my crack at it:
 
 foo1 <- lmer2(mate ~ group * mass * time + (time | female), family=binomial)
 

And what happened post-crack? An error (...singular)? If so, did you try
replacing the * with + as a first step?

Dieter



Re: [R] gcc 4.3 any known issues?

2008-02-11 Thread Peter Dalgaard
Stefan Grosse wrote:
 Hi,

 Fedora is for Fedora 9 switching to gcc 4.3. Before I test it (rawhide) I 
 want 
 to be sure that R is running. So my question is whether there have been 
 issues compiling R + packages using 4.3? 
   
I suspect that not many have tried.

There is an R-2.6.2 RPM in Fedora 9 alpha, so _something_ seems to work.


-- 
   O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark  Ph:  (+45) 35327918
~~ - ([EMAIL PROTECTED])  FAX: (+45) 35327907



Re: [R] R on Mac PRO does anyone have experience with R on such a platform ?

2008-02-11 Thread jiho
On 2008-February-11  , at 19:14 , Roger Day wrote:
 My experience with R.app on a MACbook has been mostly very positive.
 I like the interface much better than that of Windows--
 with two exceptions.

 a)  I use stepping thru code with control-R.  It's not as convenient  
 on Mac-
 the code you want to run has to be actually selected; not good  
 enough just
 to be on the line you want.
 That slows down code-stepping.
 b)  saveHistory() doesn't save the history of the current session --  
 beware,
 I lost some work that way.  you have to actually click a button.
 c) no resizing graphs post-hoc,
 d) saving graphics to a file is inconvenient except for pdf output.

 Some plusses are:
 a) better built-in editor (if you're not using ESS), including  
 delimiter
 matching
 b) the history pane is nice,
 c) the package installer and manager are nicer than on Win,
 d) autocompletion with ctrl-period,
 e) you can select text on the current or past command line much  
 easier,
 f) attractive interface with lots of cosmetic options.

 I've done some tkrplot work in both (using X11 in OSX)
 -- some inconsistencies with placement of widgets show up.

 This is off the top of my head.
 Check out the mailing list R-sig-mac for more info.

After using R via R-app (which is indeed very nice to start with) I  
eventually switched to a combination of TextMate + Terminal + CarbonEL
- TextMate[1] is a very powerful editor, well worth the $40 price tag,  
and has nice goodies for R besides syntax highlighting such as command  
autocompletion, command templates, plenty of snippets, etc.
- I run R in a regular Terminal window. This way I get command line  
editing and searching through history. In addition it makes it as easy
to run R on a remote server as on my local machine (useful to run
demanding tasks on a large CPU). I can send code from TextMate to the
terminal prompt using AppleScript commands in TextMate[2]. This lets
me send selected text _or_ the current line directly to the Terminal with
just a keystroke.
- CarbonEL is a package which allows plotting to a quartz window even
from a simple Terminal (quartz is the Mac OS X graphics engine). The plots
on quartz look gorgeous and going back to X11 would have been a pain.
Another similar solution would be to use the Cairo package.

All in all, I found it a very convenient and flexible way to use R. It
has the added bonus that the same combination (TM+Terminal) works for  
anything that can run in a terminal window (MATLAB, Scilab, python  
etc.). So, even if you don't use only R, you can keep the same habits  
with a nice editor.

I haven't tried Emacs+ESS. I've heard a lot of good things about it  
but learning Emacs is a task in itself.

[1] http://macromates.com/
[2] modification of those http://jo.irisson.free.fr/?p=32 for the  
built-in Terminal, since Terminal on Leopard finally has tabs

JiHO
---
http://jo.irisson.free.fr/



[R] Interpretation of log odds

2008-02-11 Thread Schmitt, Corinna
Hello,

 fit12 <- lmFit(qrg[,1:2])
 t12 <- toptable(fit12,adjust="fdr",number=15000,genelist=qrg$genes[,1])
 t12
     ID         logFC        t             P.Value                  adj.P.Val            B
1560 orf6.2714  -5,95911144  -7,504537362  0,0616459272630          0,00430961073320568  20,85141454
8689 SW23       2,709344216  3,41198098    0,000644926129763921000  0,03967585550307640  -0,62704052

The example data come from one experiment, in which I want to know whether
genes are differentially expressed. As I saw in the online help for toptable,
the value B is the log odds that the gene is differentially expressed. When I
look at the B value 20,85141454, I read it as saying that the gene orf6.2714
is differentially expressed with a probability of 20,85%. Is that right? And
how should I interpret the second example, SW23, with a negative B value?

Can anyone describe it to me in simple words? ;-)

Thanks, Corinna
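
A short worked sketch of the arithmetic, assuming B is a natural-log odds: a
log odds is not a percentage, and it converts to a probability through
p = exp(B) / (1 + exp(B)). B = 0 means even odds, positive B means
differential expression is more likely than not, and negative B means less
likely than not. The two B values are taken from the table above, with decimal
commas written as points.

```r
# B values from the table above (decimal commas written as points).
B <- c(orf6.2714 = 20.85141454, SW23 = -0.62704052)

# Log odds -> probability: p = exp(B) / (1 + exp(B)), which is plogis(B).
p <- plogis(B)
round(p, 3)
```

On this reading orf6.2714 comes out at a probability very close to 1 and SW23
at roughly 0.35. These probabilities depend on the prior assumptions built
into the B statistic, so they are probably safer to use for ranking genes than
as exact probabilities.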





Re: [R] Using R in a university course: dealing with proposal comments

2008-02-11 Thread Stas Kolenikov
On 2/11/08, Paul Gilbert [EMAIL PROTECTED] wrote:

 Stas Kolenikov wrote:
  ...
  Training researchers of tomorrow might be great, but if your students get on
  the market at the end of the semester, they won't have the luxury of waiting
  until R becomes THE package of choice.
 
 Not being a teacher, I usually follow these discussions with a bit of
 amusement and some befuddlement. We hire young people hoping they will
 bring in bright new ideas from academia, and academics are training the
 students based on what they think are the old things we use.
 Fortunately, R is already one of the packages of choice in many places.

 Another point that needs more emphasis is that R is actually a
 programming language, like Matlab and APL, so it really has more
 general usefulness than statistics packages that one might use in the
 narrower context of a statistics course.


There are people who will be developing and pricing novel financial
derivatives. Your young people probably hold Ph.D.s in finance,
statistics or economics, and yes, programming is a must at the research
level, and R is a great choice (although economists might say that
GAUSS or Stata is an even greater choice). The original question, however,
was about the first and most likely the only statistics class the health
students will ever take, and the words "graduate level" should not fool
anybody: it will have to be a non-calculus data analysis class (Arin Basu
can surprise me here if it is different!). I would predict that the students
coming out of it will run the routine analyses that are spelled out by the
FDA and the like, and I would think the FDA regulations could go as far as
specific SAS syntax, or at least specify the SAS PROCs to be used. GPL
software does not necessarily thrive in commercial and even academic
environments. I have plenty of acquaintances in academia who prefer
commercial flavors of LaTeX over the free MiKTeX distribution for the
illusion of technical support they get for their money; I expect those
people to prefer SPSS or SAS over R for similar reasons (plus the GUI).
I am not disputing that R is a great tool for innovative work; I am
questioning whether it is the best tool for a basic stats class taught to a
not-so-technical audience, given the work those students will go on to do.

Of course if you are a full professor you can dismiss any of the comments
and teach the way you like. That's what I'll be planning to do when I get
there :))

-- 
Stas Kolenikov, also found at http://stas.kolenikov.name
Small print: Please do not reply to my Gmail address as I don't check it
regularly.



[R] Viable Approach to Parallel R?

2008-02-11 Thread Lewis, Daniel (IS Consultant)
All,

We are researching approaches to parallel R with the end goal of running
R in a distributed manner on a Linux cluster. We expect of course to do
some work decomposing our problems to be task-parallel or data-parallel,
but wouldn't mind getting an initial boost working with embarrassingly
parallel code sections and one of the approaches below. 

Incidentally, our environment includes R 2.6.1, RHEL 5.1, Solaris 10, SGE
(Sun Grid Engine) and OpenMPI 1.2.4 (SunHPC 7.1).

In researching previous work, the most promising approaches seem to be:

A. Snow (with Rmpi or Rpvm), as described in
http://www.r-project.org/useR-2006/Slides/Harrington+Salibian-Barrera.pdf
from the 2006 R User Conference

It is my understanding that this approach is viable, and works with
OpenMPI 1.2.4. Is anyone using this method with good results?

B. taskpR, RScaLAPACK, pMatrix

I read a paper
http://sdm.lbl.gov/sdmcenter/projects/SDM.center.parallel.r.2-pager.4.doc
coming out of ORNL, describing what they call parallel R, which
included taskpR, RScaLAPACK and pMatrix. I notice that taskpR is no longer
available in contrib, nor is pMatrix.

An old link indicates the packages are available at
http://www.ASPECT-SDM.org/Parallel-R but that site displays a notice
that the server is migrating. Has this work been discontinued? Anyone
using this? I see RScaLAPACK is still available, from reading the above
it seems that was bundled with taskpR. Does it function without the
other components? (Guess I'll try it and find out :)

C. Sleigh & NetworkSpaces

I see that SCAI (Scientific Computing Associates) offers a parallel R
package based on something they call NetworkSpaces and Sleigh
(inspired by Snow). They sell services around the product, but it is open
source. They also have an enhanced version that they sell and support:
http://www.lindaspaces.com/hp/BenchmarksWithCharts.pdf. Has anyone
investigated this approach or its open source components?
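
For the embarrassingly parallel sections mentioned above, the snow-style
pattern (create a cluster, apply a function over tasks, stop the cluster)
can be sketched as follows. This is only an illustration under
assumptions: it uses the parallel package that ships with current R
(which took over the snow interface and is not available in R 2.6.1),
two local worker processes instead of MPI, and a made-up bootstrap task
named boot_mean.

```r
library(parallel)   # ships with current R; not available in R 2.6.1

# Hypothetical embarrassingly parallel task: one bootstrap mean per call.
boot_mean <- function(i, x) mean(sample(x, replace = TRUE))

cl <- makeCluster(2)                 # two local worker processes
clusterSetRNGStream(cl, iseed = 42)  # reproducible random streams on the workers
x <- c(5.1, 4.9, 6.2, 5.8, 5.5, 6.0)
res <- parLapply(cl, 1:100, boot_mean, x = x)
stopCluster(cl)                      # always release the workers

summary(unlist(res))
```

On a real cluster the makeCluster() call is where the transport (sockets,
MPI, ...) would be chosen; the rest of the code stays the same.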

TIA for any information, direction, suggestions, and if I've missed any
other approaches please advise.

Dan Lewis






[R] Logistic regression with repeated measures

2008-02-11 Thread Steven Vamosi
Hello R list,

I am hoping to conduct a logistic regression with repeated measures,
and would love an actual code run through for such an analysis. I
found only one related post on this list, but a full answer was never
provided. I understand that the routine lmer (or lmer2) in the lme4
package is often recommended in such a case, but actually implementing
it is where I've hit a wall.

In a nutshell, the experiment involved presenting females from two
groups (treatment, control) with an opportunity to mate with a virgin
male every 6 hours for 48 hours. Every female was presented this
opportunity at every time step (i.e., whether or not she mated at 6
hr, she was again presented with a male at 12 hr, and so on). In
addition to which group a female belongs to, we have an a priori
reason to want to test the effect of her initial body mass as a
covariate. A subset of the data looks like this:

female  group   mass    time    mate
1   control 5.7 0   1
1   control 5.7 6   1
1   control 5.7 12  0
1   control 5.7 18  0
1   control 5.7 24  0
1   control 5.7 30  1
1   control 5.7 36  0
1   control 5.7 42  1
1   control 5.7 48  0
2   treatm  5.3 0   1
2   treatm  5.3 6   0
2   treatm  5.3 12  0
2   treatm  5.3 18  0
2   treatm  5.3 24  0
2   treatm  5.3 30  1
2   treatm  5.3 36  0
2   treatm  5.3 42  0
2   treatm  5.3 48  0
3   control 6.1 0   1
3   control 6.1 6   0
3   control 6.1 12  0
3   control 6.1 18  0
3   control 6.1 24  1
3   control 6.1 30  1
3   control 6.1 36  0
3   control 6.1 42  1
3   control 6.1 48  0
...

How, then, to determine whether treatment females display different
mating patterns over time than control females? Here's my crack at it:

foo1 <- lmer2(mate ~ group * mass * time + (time | female), family=binomial)
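
A minimal sketch of how such a model could be set up end to end, under
several assumptions: the data frame is rebuilt from the three females
shown above (the name mating is made up), the call is glmer() because
lmer2() no longer exists in current versions of lme4, and the model is
cut down to a random intercept per female because three females cannot
support the full group * mass * time model with random slopes.

```r
# Rebuild the posted subset: 3 females, 9 time points each.
mating <- data.frame(
  female = factor(rep(1:3, each = 9)),
  group  = factor(rep(c("control", "treatm", "control"), each = 9)),
  mass   = rep(c(5.7, 5.3, 6.1), each = 9),
  time   = rep(seq(0, 48, by = 6), times = 3),
  mate   = c(1, 1, 0, 0, 0, 1, 0, 1, 0,
             1, 0, 0, 0, 0, 1, 0, 0, 0,
             1, 0, 0, 0, 1, 1, 0, 1, 0)
)

# Fit only if lme4 is installed; the group:time term asks whether the
# change in mating probability over time differs between the groups.
if (requireNamespace("lme4", quietly = TRUE)) {
  fit <- lme4::glmer(mate ~ group * time + (1 | female),
                     data = mating, family = binomial)
  print(summary(fit))
}
```

With the full data set, mass and a random slope for time could be added
back, e.g. (time | female), if the fit supports it.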

Thanks in advance,
Steve



[R] barplot or maybe related?

2008-02-11 Thread questions?
I have two distributions, each represented by the heights of several intervals.
E.g. each distribution is partitioned into 10 segments, and I have
numbers (frequencies or counts) associated
with each segment in the format:

0.2  0.3
0.1  0.1
.

0.01 0.02


I want to plot the two distributions side by side, meaning that, for
each segment, the
two bars (in barplot) from the two distributions are adjacent to each
other.

If you do barplot(beside=T), the two distributions are plotted side by
side, not interleaved.
I was wondering whether there is a way to do what I want.
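
One possible way to get interleaved bars, sketched under the assumption
that the data sit in a two-column matrix like the one above (only the
three rows shown in the post are used; the names counts, first and
second are made up): barplot() with beside=TRUE groups bars by column,
so transposing the matrix makes each segment one group of two bars.

```r
# Rows = segments, columns = the two distributions (values from the post).
counts <- cbind(first = c(0.2, 0.1, 0.01), second = c(0.3, 0.1, 0.02))
rownames(counts) <- paste("segment", 1:3)

# After t(), each column is a segment, so its two bars are drawn adjacent.
mids <- barplot(t(counts), beside = TRUE, legend.text = TRUE,
                ylab = "frequency")
```

barplot() returns the bar midpoints, here a 2 x 3 matrix, which is handy
for adding labels later.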

Thanks



Re: [R] Loading Data to R

2008-02-11 Thread Gregory Warnes
On Microsoft Windows systems, it may be more convenient to install  
and use the xlsReadWrite package.  For non-Windows systems, the  
gdata package provides this function, but requires perl to be present.
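
If neither package is an option, a package-free route (not mentioned in
this thread, so take it as a suggestion only) is to save the sheet as CSV
from Excel and read it with read.csv(). The sketch below writes a small
stand-in file first; the file and its columns are made up.

```r
# Stand-in for a sheet saved from Excel as CSV (hypothetical columns).
csv_file <- tempfile(fileext = ".csv")
writeLines(c("id,score", "1,3.5", "2,4.25"), csv_file)

dat <- read.csv(csv_file)   # header row becomes the column names
str(dat)
```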

-Greg
(Maintainer of gdata)



On Feb 9, 2008, at 1:09PM , Henrique Dallazuanna wrote:

 You need library(gdata) before

 On 08/02/2008, Wensui Liu [EMAIL PROTECTED] wrote:
 # READ DATA FROM XLS FILE #

 xls <- read.xls(file = "C:/projects/Rintro/Part01/export.xls",  
 sheet = 3,
 type = "data.frame", from = 1, colNames = TRUE)

 On Feb 8, 2008 3:49 PM, Christine Lynn [EMAIL PROTECTED]  
 wrote:
 This is the most basic question ever...I haven't used R in a  
 couple years
 since college so I forget and haven't been able to find what I'm  
 looking for
 in any of the manuals.

 I just need to figure out how to load a dataset into the program  
 from excel!

 Thanks!

 CL





 --
 ===
 WenSui Liu
 ChoicePoint Precision Marketing
 Phone: 678-893-9457
 Email : [EMAIL PROTECTED]
 Blog   : statcompute.spaces.live.com




 -- 
 Henrique Dallazuanna
 Curitiba-Paraná-Brasil
 25° 25' 40 S 49° 16' 22 O



[R] Hastie - Tibshirani - Friedman pg 141 nnet question

2008-02-11 Thread G Ilhamto
Dear helper,

I am working with nnet on a large data set (23K) and have some questions.
I have a binary response (occurrence / non-occurrence of an event) with 8
predictors.

(1) How can I reproduce the plot in Hastie et al. (page 141), i.e. the
tensor product of natural cubic splines?
(2) How does nnet treat the response? It seems that by default it
treats Y as numeric (?). Can I change the response to a factor? Or does
it not matter?
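
On question (2), a small sketch with simulated data (all names and
values here are made up): according to the nnet documentation, when the
response in the formula is a factor a classification network is built,
with a single output and entropy fit for two levels, whereas a numeric
0/1 response is fitted by least squares unless told otherwise.

```r
library(nnet)

set.seed(1)
n <- 200
d <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
# Binary response stored as a factor, so nnet treats it as a class label.
d$event <- factor(ifelse(d$x1 + d$x2 + rnorm(n) > 0,
                         "occurrence", "non-occurrence"))

fit <- nnet(event ~ x1 + x2, data = d, size = 2, trace = FALSE)
pred <- predict(fit, d, type = "class")   # predicted class labels
table(observed = d$event, predicted = pred)
```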

Thank you,

Ilham



Re: [R] Dendrogram for agglomerative hierarchical clustering result

2008-02-11 Thread noorpiilur
Thank you for your reply Wolfgang
I've seen these examples but my problem is that I don't know how to
make the input data out of my given data. According to the example
below hclust is making the clustering and will result in hclust object
hc. In my case the clustering is already done and I need to create the
hclust object out of my clustering result. So I probably have to study
how to create the hclust object first..

dndrgr> hc <- hclust(dist(USArrests), "ave")
dndrgr> (dend1 <- as.dendrogram(hc))
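
A rough sketch of building the hclust object by hand from an external
merge list, under loudly stated assumptions: each input row is
(distance, cluster A, cluster B), a cluster is named by one of its
members, and the list contains all n - 1 merges. The toy data and the
helper name build_hclust are made up; they are not the data from this
thread.

```r
# Hypothetical complete merge list: distance, cluster A, cluster B.
merges <- data.frame(height = c(1, 1, 2),
                     a = c(10, 30, 10),
                     b = c(20, 40, 30))

build_hclust <- function(height, a, b) {
  ids <- sort(unique(c(a, b)))
  n <- length(ids)
  stopifnot(length(height) == n - 1)      # a full tree needs n - 1 merges
  current <- -seq_len(n)                  # negative code = singleton, as in hclust
  merge <- matrix(0L, n - 1, 2)
  for (k in seq_len(n - 1)) {
    ia <- current[match(a[k], ids)]       # cluster currently holding member a
    ib <- current[match(b[k], ids)]       # cluster currently holding member b
    stopifnot(ia != ib)                   # the two sides must be distinct
    merge[k, ] <- sort(c(ia, ib))         # singletons first
    current[current %in% c(ia, ib)] <- k  # members now belong to merge step k
  }
  # Leaf order that keeps merged clusters next to each other in the plot.
  leaves <- function(code) {
    if (code < 0) -code else c(leaves(merge[code, 1]), leaves(merge[code, 2]))
  }
  structure(list(merge = merge, height = height, order = leaves(n - 1L),
                 labels = as.character(ids), method = "single"),
            class = "hclust")
}

hc <- with(merges, build_hclust(height, a, b))
dend <- as.dendrogram(hc)
plot(dend)
```

See ?hclust for the meaning of the merge, height and order components.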

Risto

On 11 Feb, 19:18, Wolfgang Huber [EMAIL PROTECTED] wrote:
 Hi Risto,
 You could try

example(dendrogram)

 best wishes
 Wolfgang

 noorpiilur scripsit:



  Hey group,

  I have a problem drawing a dendrogram from the result of my program
  written in C. My algorithm is an approximation algorithm for the single
  linkage method. As a result I will get the following data:

  [Average distance] [cluster A] [cluster B]

  For example:
  42.593141  1   26
  42.593141  4   6
  42.593141  123 124
  42.593141  4   113
  74.244206  1   123
  74.244206  4   133
  74.244206  1   36

  So far I have used C to generate a bitmap output but I would like to
  use the computed result as an input for R to just draw the dendrogram.

  As I'm new to R any help is appreciated.

  Thanks,
  Risto



[R] Difference between P.Value and adj.P.Value

2008-02-11 Thread Schmitt, Corinna
Hello,


> fit12 <- lmFit(qrg[,1:2])
> t12 <- toptable(fit12, adjust="fdr", number=25, genelist=qrg$genes[,1])
> t12
           ID     logFC         t      P.Value  adj.P.Val        B
522   PLAU_OP -6.836144 -8.420414 5.589416e-05 0.01212520 2.054965
1555 CD44_WIZ -6.569622 -8.227938 6.510169e-05 0.01212520 1.944046

Can anyone tell me what the difference is between P.Value and
adj.P.Val? I need to analyse microarrays and say whether there are
differentially expressed genes. Which p-value should I use?
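
To illustrate the difference with base R only: P.Value is the raw
per-gene p-value, while the adjusted column corrects for testing many
genes at once. The sketch below uses made-up p-values and assumes that
adjust="fdr" corresponds to the Benjamini-Hochberg method, which is what
"fdr" means in p.adjust().

```r
p_raw <- c(0.001, 0.01, 0.02, 0.04, 0.5)   # made-up raw p-values, one per gene
p_bh  <- p.adjust(p_raw, method = "BH")    # Benjamini-Hochberg adjustment

# Each adjusted value is the smallest false discovery rate at which
# that gene would still be called significant.
data.frame(P.Value = p_raw, adj.P.Val = p_bh)
```

When many genes are tested, the adjusted values are usually the ones to
threshold.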

Thanks, Corinna



Re: [R] Length problem

2008-02-11 Thread Benilton Carvalho
if data was your data.frame, data[4:length(data)] was also a  
data.frame.


but, c(data[4:length(data)] ) coerces it to a list.

therefore coppie is a list.

coppie[1] is also a list of length 1...

compare that to: coppie[[1]]
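
The same point as a runnable sketch, using the first two measurement
columns that Paolo posted (the data frame is called dat here to avoid
masking the data() function):

```r
dat <- data.frame(
  yy = 2003, mm = 1, dd = 1:5,
  C.531 = c(0.9941125, 1.7931375, NA, 1.0840625, 1.1558000),
  C.542 = c(1.412338, 2.786900, 3.657775, 1.766925, 2.128488)
)

coppie <- c(dat[4:length(dat)])  # c() drops the data.frame class: plain list

length(coppie[1])    # single brackets keep a list holding one column
length(coppie[[1]])  # double brackets extract the column itself
```

The first length is 1 (a list with one element), the second is 5 (the
five values in the column).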

b

On Feb 11, 2008, at 10:38 AM, milton ruser wrote:


Ciao Paolo,

Could you show some rows of your data?
How many columns does your data.frame have? One?
By the way, data is not such a good name for your data frame.

We will be very happy to help you

Kindly,

Miltinho
Brasile

On 2/11/08, Paolo Grillo [EMAIL PROTECTED] wrote:



 Hi all
 I have this problem:
 In my .dta database, called data, I have five rows
 data <- read.dta("C:\\2_CO_mmobile_ALL_Rid.dta")
 # From this database I would like to create another
 coppie <- c(data[4:length(data)])
 but I find this

 # Length of  original data
 length(data[,4])
 5   RIGHT!!
 # Length of new data
 length(coppie[1])
 1  WHY??
 Thank you all for your help
 Paolo Grillo


Re: [R] Length problem

2008-02-11 Thread jim holtman
You were asking for the length of the first element of the vector
coppie, which is of course 1.  Did you mean to say length(coppie)?
length(data[,4]) is asking how many elements are in that column, which
seems to be 5.  Also, your statement

coppie <- c(data[4:length(data)])

seems strange.  What did you intend to do?

On 2/11/08, Paolo Grillo [EMAIL PROTECTED] wrote:

   Hi all
   I have this problem:
   In my .dta database, called data, I have five rows
   data <- read.dta("C:\\2_CO_mmobile_ALL_Rid.dta")
   # From this database I would like to create another
   coppie <- c(data[4:length(data)])
   but I find this

   # Length of  original data
   length(data[,4])
   5   RIGHT!!
   # Length of new data
   length(coppie[1])
   1  WHY??
   Thank you all for your help
   Paolo Grillo



-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?



Re: [R] Length problem

2008-02-11 Thread Paolo Grillo

   Ciao Miltinho
   Here it is
   > data
       yy mm dd     C.531    C.542     C.558    C.565
   1 2003  1  1 0.9941125 1.412338 0.8996750 2.258200
   2 2003  1  2 1.7931375 2.786900        NA 3.108725
   3 2003  1  3        NA 3.657775 1.7269750 2.541938
   4 2003  1  4 1.0840625 1.766925 1.2313375 2.321300
   5 2003  1  5 1.1558000 2.128488 0.9670375       NA
   > # New data
   > coppie <- c(data[4:length(data)])
   > # Length of original data
   > data[,4]
   [1] 0.9941125 1.7931375        NA 1.0840625 1.1558000
   > length(data[,4])
   [1] 5
   > 5   # Right !!!
   [1] 5
   > # Length of new data
   > coppie[1]
   $C.531
   [1] 0.9941125 1.7931375        NA 1.0840625 1.1558000
   > length(coppie[1])
   [1] 1
   > 1   # Why ??
   Thank you for your help
   Paolo
   Italia
   milton ruser wrote:

   Ciao Paolo,



   Could you show some rows of your data?

   How many columns does your data.frame have? One?

   By the way, data is not such a good name for your data frame.



   We will be very happy to help you

   Kindly,



   Miltinho

   Brasile


   On 2/11/08, Paolo Grillo [EMAIL PROTECTED] wrote:

   Hi all
   I have this problem:
   In my .dta database, called data, I have five rows
   data <- read.dta("C:\\2_CO_mmobile_ALL_Rid.dta")
   # From this database I would like to create another
   coppie <- c(data[4:length(data)])
   but I find this
   # Length of  original data
   length(data[,4])
   5   RIGHT!!
   # Length of new data
   length(coppie[1])
   1  WHY??
   Thank you all for your help
   Paolo Grillo


Re: [R] R programming style

2008-02-11 Thread Roland Rau
Hi,

I think using Emacs+ESS [1,2] is always a good starting point for a 
clear layout with consistent and meaningful indentation.

I don't know how other people think about it, but in my opinion, 
"Elements of Programming Style" by Kernighan and Plauger is still an 
interesting read -- although their programs are in either Fortran or PL/1 
and the book itself is 30 years old or so. Of course, I am not always 
successful, but at least I try to incorporate their 'mantras':
- write clearly, don't be too clever [3]
- say what you mean, simply and directly
- use library functions
- write clearly -- don't sacrifice clarity for efficiency
- let the machine do the dirty work
- parenthesize to avoid ambiguity
- 10.0 times 0.1 is hardly ever 1.0
- ...
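
The floating-point mantra is easy to try in R. A small sketch: the
literal product 10 * 0.1 happens to round to exactly 1 in double
precision, so the example uses a sum instead, and all.equal() shows the
tolerance-based comparison that is usually wanted.

```r
a <- 0.1 + 0.2

a == 0.3                    # exact comparison of doubles: FALSE
isTRUE(all.equal(a, 0.3))   # comparison within a small tolerance: TRUE
print(a, digits = 17)       # shows the tiny representation error
```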

I hope this helps?

Best,
Roland


[1] http://www.gnu.org/software/emacs/
[2] http://ess.r-project.org/
[3] I guess this is what Kernighan meant in his famous(?) quote: 
"Everyone knows that debugging is twice as hard as writing a program in 
the first place. So if you're as clever as you can be when you write it, 
how will you ever debug it?" 
(http://en.wikiquote.org/wiki/Brian_W._Kernighan )






David Scott wrote:
 I am aware of one (unofficial) guide to style for R programming:
 http://www1.maths.lth.se/help/R/RCC/
 from Henrik Bengtsson.
 
 Can anyone provide further pointers to good style?
 
 Views on Bengtsson's ideas would interest me as well.
 
 David Scott
 
 
 
 _
 David Scott   Department of Statistics, Tamaki Campus
   The University of Auckland, PB 92019
   Auckland 1142,NEW ZEALAND
 Phone: +64 9 373 7599 ext 86830   Fax: +64 9 373 7000
 Email:[EMAIL PROTECTED]
 
 Graduate Officer, Department of Statistics
 Director of Consulting, Department of Statistics
 


[R] Odp: question_encoding

2008-02-11 Thread Petr PIKAL
Hi

Is it only a question of PDF export, or are the glyphs distorted in the plot 
window too?

If it is in the plot window, try looking at the Rdevga file in the etc folder.

If it happens during export from the plot window to PDF, then try to produce 
the PDF file with 

pdf()
plot(1,1, type="n")
text(1,1, "Ě Š Ť Č Ř Ň Á Í É Ó Ý Ž")
dev.off()

I am on Windows XP, and there the export from the plot window does not work 
either, but creating the file directly with the pdf command seems to work.

Regards

Petr
[EMAIL PROTECTED]

[EMAIL PROTECTED] wrote on 08.02.2008 23:32:48:

 Hello,
 I would like to ask you one question. When I export a graph to .pdf
 and I need some Czech fonts, I use the parameter encoding="ISOLatin2.enc"
 for these special fonts. But the exported text is bad. I tried ISOLatin1 and
 MacRoman, but it is the same. I don't know what I am doing wrong, because
 in quartz the graph is OK. Sorry, I forgot: I have a Mac with
 Leopard and R ver. 2.6.1. Thank you.
 jena
 


Re: [R] grep etc.

2008-02-11 Thread John Kane
I don't understand exactly what you are asking.

You can change v from 'insd-otsd' 'sppr-unsp' to
'insd--otsd', 'sppr--unsp' with

sub("-", "--", v) 

However, do you want to change the entire assignment
statement?
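
As a complete, runnable version of the above (the result name doubled is
made up; fixed = TRUE is added so the pattern is taken literally rather
than as a regular expression):

```r
v <- c("insd-otsd", "sppr-unsp")

# sub() replaces only the first match in each string; gsub() would
# replace every hyphen if a string contained several.
doubled <- sub("-", "--", v, fixed = TRUE)
doubled
```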

--- Michael Kubovy [EMAIL PROTECTED] wrote:

 Dear R-helpers,
 
 How do I transform
 v <- c('insd-otsd', 'sppr-unsp')
 into
 c('insd--otsd', 'sppr--unsp')
 ?
 _
 Professor Michael Kubovy
 University of Virginia
 Department of Psychology
 USPS: P.O.Box 400400Charlottesville, VA
 22904-4400
 Parcels:Room 102Gilmer Hall
  McCormick RoadCharlottesville, VA 22903
 Office:B011+1-434-982-4729
 Lab:B019+1-434-982-4751
 Fax:+1-434-982-4766
 WWW:http://www.people.virginia.edu/~mk9y/
 


Re: [R] Using R in a university course: dealing with proposal comments

2008-02-11 Thread bartjoosen

You can use a GUI to teach R, so the programming style is gone.
But using the command line approach forces you to think about your
analysis.
In a GUI, it's easy to point and click without knowing what you are doing.
With the command line, you know where you start, and from there you go to
the next step, and so on.
I think you learn more this way.

And of course, it's free, so once they are out of school or somewhat further
along, at work, they still have the possibility to use what they have
learned (in contrast to SPSS, maybe).

Bart


Arin Basu-3 wrote:
 
 Hi All,
 
 I am scheduled to teach a graduate course on research methods in
 health sciences at a university. While drafting the course proposal, I
 decided to include a brief introduction to R, primarily with an
 objective to enable the students to do data analysis using R. It is
 expected that enrolled students of this course have all at least a
 formal first level introduction to quantitative methods in health
 sciences and following completion of the course, they are all expected
 to either evaluate, interpret, or conduct primary research studies in
 health. The course would be delivered over 5 months, and R was
 proposed to be taught as several laboratory based hands-on sessions
 along with required readings within the coursework.
 
 The course proposal went to a few colleagues in the university for
 review. I received review feedback from them; two of them commented
 on the inclusion of R in the proposal.
 
 In quoting parts of these mails, I have masked the names/identities of
 the referees, and have included just part of the relevant text with
 their comments. Here are the comments:
 
 Comment 1:
 
 In my quick glance, I did not see that statistics would be taught,
 but I did see that R would be taught.  Of course, R is a statistics
 programme. I worry that teaching R could overwhelm the class.  Or
 teaching R would be worthless, because the students do not understand
 statistics.  (Prof LR)
 
 Comment 2:
 
 Finally, on a minor point, why is R the statistical software being
 used? SPSS is probably more widely available in the workplace –
 certainly in areas of social policy etc.  (Prof NB)
 
 I am interested to know if any of you have faced similar questions
 from colleagues about inclusion of R in non-statistics based
 university graduate courses. If you did and were required to address
 these concerns, how would you respond?
 
 TIA,
 Arin Basu
 
 
 



[R] Power law, lognormal and exponential statistical testing in one sample population

2008-02-11 Thread ΠΑΝΤΟΠΟΥΛΟΣ ΓΕΩΡΓΙΟ Σ
Hello,
My name is George Pantopoulos; I am a PhD student in the Dept. of Geology,
University of Patras, Greece.
I am studying the statistical behaviour of bed thickness datasets taken
from outcrops. So far, 4 statistical distributions seem to fit my
datasets: power law, lognormal, lognormal mixture with 2 modes, and
exponential.
I have already used the MIX package in R to detect a possible 2-mode lognormal
mixture in the datasets.
My questions are:
Can I do ONE-SAMPLE (one population) non-parametric tests (KS or chi-squared)
in R for the above distributions, with the distribution parameters NOT
estimated from the data (especially for the KS test)? If yes, could someone
show me the exact steps that I must follow in R? Also, must I generate
artificial populations of data to do what I want?
The datasets are in text files, one variable (bed thicknesses only).
If somebody knows how to do any of the above in R, it would
be valuable for my work.
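
A sketch of what such one-sample KS tests could look like with
ks.test() from the stats package, under clear assumptions: the
thickness values are simulated stand-ins, every parameter value
(meanlog, sdlog, rate, xmin, alpha) is a placeholder that would have to
come from theory or an independent data set, and ppowerlaw is a
hand-written CDF for a power law with lower cutoff xmin.

```r
set.seed(1)
thickness <- rlnorm(200, meanlog = 2, sdlog = 0.5)  # simulated stand-in data

# Fully specified lognormal and exponential hypotheses.
ks_lnorm <- ks.test(thickness, "plnorm", meanlog = 2, sdlog = 0.5)
ks_exp   <- ks.test(thickness, "pexp", rate = 0.1)

# Power law (Pareto) CDF: density proportional to q^(-alpha) for q >= xmin.
ppowerlaw <- function(q, xmin, alpha) {
  ifelse(q < xmin, 0, 1 - (q / xmin)^(1 - alpha))
}
ks_pl <- ks.test(thickness, ppowerlaw, xmin = 1, alpha = 2.5)

c(lognormal = ks_lnorm$p.value, exponential = ks_exp$p.value,
  powerlaw = ks_pl$p.value)
```

No artificial populations are needed for this form of the test; real data
would be read with something like scan("file.txt") in place of the
simulated vector.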
Thank you.



Re: [R] Conditional rows

2008-02-11 Thread Gabor Csardi
Because you need

test = 0.2 | test 0.3

See ?| 
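
A short sketch of the difference, using the matrix from the original
question: | works element by element and returns a full logical matrix,
which is what apply() needs, whereas || is meant for a single TRUE/FALSE
condition, as in if(). The second condition below (test >= 0.3) is made
up for illustration.

```r
test <- matrix(c(0.1, 0.2, 0.1, 0.2, 0.1, 0.1, 0.3, 0.1, 0.1), 3, 3)

# Rows in which every value is <= 0.2.
rows_ok <- which(apply(test <= 0.2, 1, all))
rows_ok

# Element-wise OR keeps the 3 x 3 shape, one TRUE/FALSE per cell.
either <- test <= 0.1 | test >= 0.3
dim(either)
```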

Gabor

On Mon, Feb 11, 2008 at 09:12:57PM +0800, Stanley Ng wrote:
 That works beautifully. Why does using test=0.2 || test 0.3 give an error?
 
 -Original Message-
 From: Gabor Csardi [mailto:[EMAIL PROTECTED] 
 Sent: Monday, February 11, 2008 18:27
 To: Ng Stanley
 Cc: r-help
 Subject: Re: [R] Conditional rows
 
 which(apply(test <= 0.2, 1, all))
 
 See ?which, ?all, and in particular ?apply.
 
 Gabor
 
 On Mon, Feb 11, 2008 at 06:22:09PM +0800, Ng Stanley wrote:
  Hi,
  
  Given a simple example,  test <- matrix(c(0.1, 0.2, 0.1, 0.2, 0.1, 
  0.1, 0.3, 0.1, 0.1), 3, 3)
  
  How to generate row indexes for which their corresponding row values 
  are less than or equal to 0.2 ? For this example, row 2 and 3 are the 
  correct ones.
  
  Thanks
[...]

-- 
Csardi Gabor [EMAIL PROTECTED]UNIL DGM



Re: [R] RGTK2 and glade on Windows - GUI newbie

2008-02-11 Thread Liviu Andronic
On 2/11/08, Anja Kraft [EMAIL PROTECTED] wrote:
 I'd like to write a GUI (first choice with GTK+).

There is also pmg [1], which uses GTK+, and, albeit more specific,
playwith [2]. Also, issues around creating a GUI under R were discussed
previously; specifically, this reference [3] may give you useful ideas.

Liviu

[1] http://wiener.math.csi.cuny.edu/pmg
[2] http://cran.r-project.org/src/contrib/Descriptions/playwith.html
[3] https://stat.ethz.ch/pipermail/r-sig-gui/2005-October/000504.html



[R] [R-pkgs] Release 3.2.0 of randomSurvivalForest is now available

2008-02-11 Thread Udaya B. Kogalur
Dear useRs:

Release 3.2.0 of the CRAN package randomSurvivalForest is now available.

--

Release 3.2.0 represents a significant upgrade in the functionality of
the product.  Key changes are as follows:

o A second method of perturbing the data set in order to calculate
variable importance (VIMP) has been implemented.  In addition to
permuting the values for a single variable, a random split approach
has been taken in which a data point is randomly assigned to the left
or right daughter node when a split occurs on the specified variable.

o The joint VIMP among multiple variables of a (potentially proper)
subset of the GROW data can now be calculated using the new function
interaction.rsf().  This represents a third mode of operation for the
application, and follows rsf.default (GROW) and predict.rsf (PREDICT).
See the documentation for details.

o An additional option in GROW mode can now be specified.  The option
'varUsed' allows users to quantify which variables have been split
upon within a single tree or over the entire forest.  See the
documentation for more details.

o The ability to multiply impute data has been implemented.  This
involves imputing data while growing a forest and using the results to
grow a new forest in order to better impute the data.

o In GROW mode, the application now outputs both the in-bag and OOB
summary imputed values.

o An additional split rule 'randomsplit' has been implemented.
See the documentation for more details.

o The split rule 'logrankscore' is now calculated correctly.

o The split rule 'logrankapprox' has been removed and replaced by
the new split rule 'logrankrandom'.  See the documentation for more details.


[EMAIL PROTECTED]

Udaya B. Kogalur, Ph.D.
Kogalur Shear Corporation
5425 Nestleway Drive, Suite L1
Clemmons, NC 27012

___
R-packages mailing list
[EMAIL PROTECTED]
https://stat.ethz.ch/mailman/listinfo/r-packages



Re: [R] Linux, UNIX, XP32, Vista X64 or ...? [SEC=UNCLASSIFIED]

2008-02-11 Thread Prof Brian Ripley
On Tue, 12 Feb 2008, [EMAIL PROTECTED] wrote:

 Thanks to all for your kind suggestions.

 After some discussion with our IT staff, I was told the UNIX system we have
 is Solaris and installation of R is very time consuming because "Given that
 this software is not standard, and given the amount of time required to
 compile the software (and potentially its dependencies), it will need to be
 resourced as a project ..." From my experience with IT staff, it may take
 quite a long time for them to set up such a project, let alone the
 installation.

Prebuilt versions of R are available for Solaris -- and the 'R 
Installation and Administration' manual told them so.

 Given that, I wonder if it is possible to install it myself. As I have
 mentioned before, I have no experience in using UNIX, but I will have
 access to the UNIX system soon. Any suggestions and help are greatly
 appreciated.

It is easy to install R from the sources if you have the compilers and 
e.g. Tcl/Tk installed.  But a Solaris box quite possibly does not, and 
then a binary install is much easier.


 Regards,
 Jin

 -Original Message-
 From: Gabor Grothendieck [mailto:[EMAIL PROTECTED]
 Sent: Monday, 28 January 2008 11:38
 To: Li Jin
 Cc: r-help@r-project.org
 Subject: Re: [R] Linux, UNIX, XP32, Vista X64 or ...? [SEC=UNCLASSIFIED]

 On the PC there is a builtin GUI but not on UNIX and there are
 some packages that are OS specific in which case you might
 get more or less selection but probably more.  Also depending
 on the specific system you may have greater difficulty installing
 certain packages due to the need to compile them on UNIX
 and the possibility exists that you don't quite have the right
 libraries.  On Windows you get binaries so this is not a problem.
 I have repeatedly found that common packages that I took
 for granted on Windows had some problem with installation
 on UNIX and I had to hunt around and figure out what the problem
 was with my UNIX libraries or possibly some other problem.
 For pure R packages this won't be a problem but for packages
 that use C and FORTRAN this can be.  Although I am lumping
 all UNIX systems together I think this varies quite a bit from
 one particular type/distro of UNIX/Linux to another and I suspect if you
 are careful in picking out the right one (if you have a choice) you
 will actually have zero problems.

 On Jan 23, 2008 6:08 PM,  [EMAIL PROTECTED] wrote:
 Dear All,
 I am currently using R on a Windows PC with 2 GB of RAM. Some pretty large
 datasets are expected soon, perhaps on the order of several GB. I am facing a
 situation similar to Ralph's: either get a new PC with more RAM or find
 something else. I am just wondering whether R is faster on other systems like
 UNIX or Linux. Any suggestions are appreciated.
 Regards,
 Jin
 
 Jin Li, PhD
 Spatial Modeller/
 Computational Statistician
 Marine & Coastal Environment
 Geoscience Australia
 Ph: 61 (02) 6249 9899
 Fax: 61 (02) 6249 9956
 email: [EMAIL PROTECTED]
 


 -Original Message-
 From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On
 Behalf Of Prof Brian Ripley
 Sent: Thursday, 24 January 2008 12:05
 To: Ralph79
 Cc: r-help@r-project.org
 Subject: Re: [R] Problems with XP32-3GB-patch?/ Worth upgrading to Vista
 X64?

 On Wed, 23 Jan 2008, Ralph79 wrote:


 Dear R-Users,

 as I will start a huge simulation in a few weeks, I am about to buy a new
 and fast PC. I have noticed, that the RAM has been the limiting factor in
 many of my calculations up to now (I had 2 GB in my old system, but
 Windows still used quite a lot of virtual memory), hence my new computer
 will have 4 GB of fast DDR2-800 RAM.

 However, I know that 1.) Windows 32 bit cannot make use of more than about
 3.2 GB RAM and 2.) it is normally not allowed to allocate more than 2 GB of
 RAM to one single application (at least under XP, I don't know if that has
 changed under Vista?).

 I remember from the R-FAQ that you can manually adjust XP so that it
 allocates up to 3 GB to one application (the 3GB patch), but I read in a
 PC magazine and on some message boards that this may cause problems. Has
 any of you used this trick successfully, without any problems?

 Yes, many people: most 32-bit Exchange servers use it.  Please don't rate
 the advice in the R documentation below tittle-tattle you read on the web.

 Would it be wise to use a 64-bit OS, e.g. Vista X64? I think under Vista
 X64 it should be no problem to allocate 4 GB of RAM to R. Any experiences
 with that?

 That's what the rw-FAQ says, and we do write answers based on experience!

 Thanks in advance,
 Ralph Wirth


 -
 Ralph Wirth
 University Erlangen-Nuremberg, Chair of Statistics
 GfK Group, Department of Methods and Product Development



 --
 Brian D. Ripley,  [EMAIL PROTECTED]
 Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
 University of Oxford,  

Re: [R] gcc 4.3 any known issues?

2008-02-11 Thread Prof Brian Ripley
On Mon, 11 Feb 2008, Peter Dalgaard wrote:

 Stefan Grosse wrote:
 Hi,

 Fedora is switching to gcc 4.3 for Fedora 9. Before I test it (rawhide) I want
 to be sure that R is running. So my question is whether there have been
 issues compiling R + packages using 4.3?

 I suspect that not many have tried.

 There is an R-2.6.2 RPM in Fedora 9 alpha, so _something_ seems to work.

But enough have to have sorted out many of the issues.  For example, the 
gcc 4.3 series uses C99-style inlining and that has been implemented 
(conditionally on GCC capabilities), and various bits of R have been 
rewritten to work around mis-compiles by the gcc 4.3 branch.

But note that gcc 4.3 is not released and not even branched with rather a 
lot of remaining regressions.  I'd be wary of using high levels of 
optimization: last time I looked R failed to build correctly at -O3 on 
x86_64, and there were more problems with packages.

Surely this is a topic that the posting guide directs to R-devel -- all 
the issues mentioned are programming ones, not R ones.


-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595



Re: [R] WG: Tinn-R not working well with latest R

2008-02-11 Thread Gabor Grothendieck
Or try
source("clipboard")


On Feb 11, 2008 3:30 PM, Schmitt, Corinna
[EMAIL PROTECTED] wrote:


 I am in the R command window and just press Ctrl+V.

 Corinna


 -Original Message-
 From: Farrel Buchinsky [mailto:[EMAIL PROTECTED]
 Sent: Mon 11.02.2008 21:16
 To: Schmitt, Corinna
 Subject: Re: Tinn-R not working well with latest R


 I can easily get R to open without an error. I simply removed the Tinn-R
 related lines from the Rprofile.site file
 C:\Program Files\R-2.6.2\etc\Rprofile.site

 but then when I try to manually load the svIDE library by entering
 library(svIDE) from the command line, I get a similar error.

 So when you say "Then paste in the command", what command are you referring
 to?
 What do you change it to?


 Schmitt, Corinna [EMAIL PROTECTED] wrote in message
 news:[EMAIL PROTECTED]
 
 
  Hello,
 
  I had the same problems before. I think the best solution is to copy
  the code you need out of Tinn-R with Ctrl+C. Then open R directly
  from your desktop, NOT from Tinn-R. Then paste in the command. You can
  still edit the command before you press Enter by using
  the arrow keys: move the cursor to where you want in the
  command line and change it.
 
  Hope that is what you want. I cannot reproduce your example.
 
  Corinna
 
 
 
 


Re: [R] image quality

2008-02-11 Thread Henrik Bengtsson
Have a look at the smoothScatter() function in the 'geneplotter'
(Bioconductor) package.  That might be sufficient for you.
Alternatively, generate a bitmap (e.g. PNG) image plot instead (at
least pdflatex can import those as is).
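
A minimal sketch of both suggestions, assuming base R's png() device and
simulated data in place of the microarray values; the file names and sizes
are made up for illustration. Note the assumption that smoothScatter() is
available from the graphics package (as in current R versions) rather than
from 'geneplotter'.

```r
# Simulated stand-in for a large microarray scatter plot (hypothetical data).
set.seed(1)
n <- 50000
x <- rnorm(n)
y <- x + rnorm(n, sd = 0.5)

# Bitmap output: the file size depends on the pixel dimensions,
# not on the number of points, unlike a vector PDF.
out <- file.path(tempdir(), "scatter.png")    # hypothetical file name
png(out, width = 800, height = 800)
plot(x, y, pch = ".")
dev.off()

# Density-based alternative: draws a smoothed 2-D density
# instead of plotting every single point.
out2 <- file.path(tempdir(), "smooth.png")    # hypothetical file name
png(out2, width = 800, height = 800)
smoothScatter(x, y)
dev.off()
```

The resulting PNG files can then be pulled into the Sweave document with an
ordinary \includegraphics call when building with pdflatex.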

/Henrik

On Feb 11, 2008 2:18 AM, John Lande [EMAIL PROTECTED] wrote:
 Dear all,
 I am writing Sweave documentation for my analysis, and I am plotting huge
 scatter plots of microarray data.
 Unfortunately this takes a lot of resources on my PC because the quality of
 the image is too high (I see the PC get stuck on every single spot).
 How can I overcome this problem? Is there a way to make a lighter image?


 john



Re: [R] Linux, UNIX, XP32, Vista X64 or ...? [SEC=UNCLASSIFIED]

2008-02-11 Thread Don MacQueen

At 10:27 AM +1100 2/12/08, [EMAIL PROTECTED] wrote:
Thanks to all for your kind suggestions.

After some discussion with our IT staff, I was told the UNIX system we have
is Solaris and installation of R is very time consuming because "Given that
this software is not standard, and given the amount of time required to
compile the software (and potentially its dependencies), it will need to be
resourced as a project ..."

Even if pre-built versions of R were not available, I think your IT 
staff is exaggerating -- or at least being overly cautious when faced 
with something unfamiliar.

Although I have used unix for many years, I am not a trained or 
experienced Solaris system administrator. Yet I have been able to 
install R from source on a modern Solaris in no more than a day 
or so. And most of that time was spent doing things that an 
experienced Solaris sysadmin should be able to do relatively quickly.

-Don

  From my experience with IT staff, it may take
quite a long time for them to set up such a project, let alone the
installation.

Given that, I wonder if it is possible to install it myself. As I have
mentioned before, I have no experience in using UNIX, but I will have
access to the UNIX system soon. Any suggestions and help are greatly
appreciated.

Regards,
Jin


Re: [R] R programming style

2008-02-11 Thread Roland Rau
Hi,

Earl F. Glynn wrote:
 Instead of using 1 or 2 in an apply, I'll write something like this 
 trying for some sort of mnemonic
 
 apply(x, BY.ROW<-1, sum)
 or
 apply(z, BY.COL<-2, mean)
 
I think it makes sense to use those magic numbers in the given case.
Please let me give you several arguments:

- In such a setting, I'd probably also use more mnemonic functions:
rowMeans
rowSums
colMeans
colSums

- The numbering of the MARGINs (the name of the second argument) is what 
I remember from maths: the first index is for rows, the second is for columns, ... So I 
don't think the numbering is counter-intuitive. For sure, you have to 
check the help page at least once. But this is also the case for using 
mnemonic arguments.

- The first argument in apply() is an array which is not restricted to 
two dimensions. For example, if you are working with three dimensions, 
how would you specify it? BY.LAYER? Maybe, but then four dimensions or 
five dimensions?[1]
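
To make the points above concrete, here is a small sketch with toy data
(the object names are just for illustration): the dedicated row/column
functions give the same results as apply() with margins 1 and 2, and the
same MARGIN argument extends naturally to higher-dimensional arrays.

```r
# Toy matrix: 2 rows, 3 columns
x <- matrix(1:6, nrow = 2)

row_totals <- apply(x, 1, sum)     # MARGIN = 1: one result per row
col_means  <- apply(x, 2, mean)    # MARGIN = 2: one result per column

# The mnemonic helpers give the same numbers
all(row_totals == rowSums(x))      # TRUE
all(col_means  == colMeans(x))     # TRUE

# Three-dimensional array: MARGIN = 3 works on each "layer"
a <- array(1:24, dim = c(2, 3, 4))
layer_totals <- apply(a, 3, sum)   # one sum per layer: 21 57 93 129

# Several margins at once: keep dimensions 1 and 3, sum over dimension 2
a13 <- apply(a, c(1, 3), sum)
dim(a13)                           # 2 4
```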

Please don't consider this as a personal criticism. I am sure that 
users' criticism improves R. But using mnemonics instead of the margins 
in the apply() case is not a convincing example, I think. Maybe you have 
another example?

Best,
Roland

[1] If you are curious whether there are practical applications of four- or 
five-dimensional arrays, I can write to you off-list how useful they were 
in real world projects.



Re: [R] overdispersion + GAM

2008-02-11 Thread Ravi Varadhan
No.  Binomial data can indeed be overdispersed.  See McCullagh & Nelder
(1989, section 4.5).  Accounting for over(under)dispersion in binomial and
Poisson distributions was, in fact, one of the original motivations for
GEE-type developments.  See also a nice paper by Liang & McCullagh
(Biometrics 1993, p. 623-630), which discusses numerous examples of
overdispersion in binary data.
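
As a rough illustration of one common check (not a formal test), the sketch
below fits an ordinary binomial GLM to simulated grouped data and divides the
Pearson chi-square statistic by the residual degrees of freedom; values well
above 1 suggest overdispersion. The data, the variable names and the use of
glm() rather than a GAM are all assumptions made for the example.

```r
set.seed(42)
m    <- 60                         # number of groups (hypothetical)
size <- rep(20, m)                 # trials per group
x    <- seq(-1, 1, length.out = m)

# A group-level random effect on the logit scale induces
# extra-binomial variation.
p    <- plogis(0.5 * x + rnorm(m, sd = 1))
succ <- rbinom(m, size, p)

fit <- glm(cbind(succ, size - succ) ~ x, family = binomial)

# Pearson chi-square divided by residual degrees of freedom
dispersion <- sum(residuals(fit, type = "pearson")^2) / df.residual(fit)
dispersion

# The quasibinomial family estimates the same quantity and uses it
# to rescale the standard errors.
fit_q <- glm(cbind(succ, size - succ) ~ x, family = quasibinomial)
summary(fit_q)$dispersion
```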

Ravi.


---
Ravi Varadhan, Ph.D.
Assistant Professor, The Center on Aging and Health
Division of Geriatric Medicine and Gerontology
Johns Hopkins University
Ph: (410) 502-2619
Fax: (410) 614-9625
Email: [EMAIL PROTECTED]
Webpage:  http://www.jhsph.edu/agingandhealth/People/Faculty/Varadhan.html


-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On
Behalf Of Gavin Simpson
Sent: Monday, February 11, 2008 12:37 PM
To: anna banana
Cc: r-help@r-project.org
Subject: Re: [R] overdispersion + GAM

On Mon, 2008-02-11 at 07:35 -0800, anna banana wrote:
 Hi,
 
 there are a lot of messages dealing with overdispersion, but I couldn't find
 anything about how to test for overdispersion. I applied a GAM with a binomial
 distribution to my presence/absence data, and would like to check for
 overdispersion. Does anyone know the command?

Bernoulli data (presence/absence of single species say) can't be
overdispersed, so there is no need to test or correct for it.

G

 
 Many thanks,
 
 Anna
-- 
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
 Dr. Gavin Simpson [t] +44 (0)20 7679 0522
 ECRC, UCL Geography,  [f] +44 (0)20 7679 0565
 Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk
 Gower Street, London  [w] http://www.ucl.ac.uk/~ucfagls/
 UK. WC1E 6BT. [w] http://www.freshwaters.org.uk
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%



Re: [R] image quality

2008-02-11 Thread Philippe Glaziou
On 2/11/08, John Lande [EMAIL PROTECTED] wrote:
 I am writing Sweave documentation for my analysis, and I am plotting huge
 scatter plots of microarray data.
 Unfortunately this takes a lot of resources on my PC because the quality of
 the image is too high (I see the PC get stuck on every single spot).
 How can I overcome this problem? Is there a way to make a lighter image?
 john


John,
You may try to plot random samples of your data. E.g.:

# simulated data: many more rows than the 1000 that get plotted
df1 <- data.frame(x=rnorm(100000), y=rnorm(100000))
df1.small <- df1[sample(nrow(df1), 1000), ]
with(df1.small, plot(x,y))


HTH,

Philippe



Re: [R] R programming style

2008-02-11 Thread Earl F. Glynn
David Scott [EMAIL PROTECTED] wrote in message 
news:[EMAIL PROTECTED]

 Can anyone provide further pointers to good style?

While not written for R specifically, the book Code Complete:  A Practical 
Handbook of Software Construction (2nd Edition) discusses a number of good 
concepts for writing good code in any language:
http://www.amazon.com/Code-Complete-Practical-Handbook-Construction/dp/0735619670

In particular, Part IV Statements gives a number of useful suggestions by 
type of statement, e.g., straight-line code, conditionals, loops, ...

There are some practices used in R that I think should be improved.  For 
example, many years ago I was taught in a software engineering class that 
the use of "magic numbers" was a bad practice, yet we find magic numbers 
used in R in many places.

Instead of using 1 or 2 in an apply, I'll write something like this 
trying for some sort of mnemonic

apply(x, BY.ROW<-1, sum)
or
apply(z, BY.COL<-2, mean)


I find BY.ROW or BY.COL to be more mnemonic than the magic numbers 1 and 2.
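
The same mnemonic effect can be had without assigning inside the call by
defining the names once up front. A minimal sketch; the constant names are
only illustrative, and none of them is defined by R itself.

```r
# Named constants for apply() margins (hypothetical names)
BY.ROW <- 1
BY.COL <- 2

x <- matrix(1:6, nrow = 2)
row_sums  <- apply(x, BY.ROW, sum)    # 9 12
col_means <- apply(x, BY.COL, mean)   # 1.5 3.5 5.5

# Named constants for the 'side' argument of axis() (hypothetical names)
BOTTOM <- 1; LEFT <- 2; TOP <- 3; RIGHT <- 4

pdf(NULL)                 # null device, so the sketch runs without a display
plot(1:10, axes = FALSE)
axis(BOTTOM)
axis(LEFT)
box()
dev.off()
```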

The sides 1, 2, 3, and 4 in an axis statement should have some sort of 
mnemonic definition, too, perhaps:

axis(BOTTOM<-1, ...)

But I believe I was ostracized in this E-mail list the last time I suggested 
such mnemonics instead of magic numbers.

efg
Earl F. Glynn
Bioinformatics
Stowers Institute for Medical Research



[R] How to fit discrete distribution to data in R having non standard shape?

2008-02-11 Thread Aswad Gurjar
Hello,

I have 421 readings of time and the number of requests arriving at each
time. Basically I have data at one-minute intervals with the corresponding
number of requests arriving per minute. It is discrete in nature. I am
collecting data from 9 AM to 4 PM, but some of the readings are 0. When I
plotted a histogram of the data I could not recognize the shape of any
standard distribution. Now, my aim is to find the distribution that best
fits my data.

How can I do that with R? The major problem is that the shape is not a
standard one, so I am not able to fit any standard model. Do I need to use
the empirical distribution? Or is there another way, such as dividing the
data and trying to fit a model to each part?

Please help me on this issue.
Thank You.

Aswad



Re: [R] controlling the edge linewidth in Rgraphviz

2008-02-11 Thread Gabor Grothendieck
Check out:

http://finzi.psych.upenn.edu/R/Rhelp02a/archive/111006.html

On Feb 11, 2008 9:56 PM, Adrian Dragulescu [EMAIL PROTECTED] wrote:

 Hello,

 I would like to have different linewidths for the edges of my graph.  I
 read the documentation but could not find how to control this.  On the
 Graphviz help page I've seen that there is something called penwidth but I
 could not find it in the R edge attributes.

 Thanks a lot for any help.

 Adrian Dragulescu



[R] controlling the edge linewidth in Rgraphviz

2008-02-11 Thread Adrian Dragulescu

Hello,

I would like to have different linewidths for the edges of my graph.  I
read the documentation but could not find how to control this.  On the
Graphviz help page I've seen that there is something called penwidth but I
could not find it in the R edge attributes.

Thanks a lot for any help.

Adrian Dragulescu



Re: [R] R on Mac PRO does anyone have experience with R on such a platform ?

2008-02-11 Thread Roger Day

My experience with R.app on a MacBook has been mostly very positive.
I like the interface much better than that of Windows,
with a few exceptions.


a)  I step through code with control-R.  It's not as convenient on the Mac:
the code you want to run has to be actually selected; it is not enough just
to be on the line you want.
That slows down code-stepping.
b)  savehistory() doesn't save the history of the current session -- beware,
I lost some work that way.  You have to actually click a button.
c) no resizing graphs post-hoc,
d) saving graphics to a file is inconvenient except for pdf output.
 
Some plusses are:
a) better built-in editor (if you're not using ESS), including delimiter
matching
b) the history pane is nice,
c) the package installer and manager are nicer than on Win,
d) autocompletion with ctrl-period,
e) you can select text on the current or past command line much easier,
f) attractive interface with lots of cosmetic options.

I've done some tkrplot work in both (using X11 in OSX)
 -- some inconsistencies with placement of widgets show up.

This is off the top of my head.
Check out the mailing list R-sig-mac for more info.


Maura E Monville wrote:
 
 I saw there exists an R version for Mac/OS.
 I'd like to hear from someone who is running R on a Mac/OS before
 venturing
 on getting  the following  computer system.
 I am in the process of choosing a powerful laptop 17 MB PRO
 2.6GHZ(dual-core)  4GBRAM 
 
 Thank you so much,
 -- 
 Maura E.M
 
 


Re: [R] User defined split function in rpart

2008-02-11 Thread R Help
I had a similar problem, trying to use lme within a custom rpart
function.  I got around it by passing the dataframe I needed through
the parms option in rpart, and then using the parms option in
evaluation, init and split as a dataset.  It's not the most elegant
solution, but it will work.
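
A rough sketch of that workaround, under several assumptions that are not in
the messages here: the left-hand side of the model is simply the row number,
so each node can look up its own rows in the data frame passed through parms;
only numeric predictors are handled; and the node summary is a plain weighted
mean instead of an lme or polr fit. The function signatures follow the
usersplits example shipped with rpart; all other names are hypothetical.

```r
library(rpart)

# Hypothetical data: the real response is 'resp'; 'row' is the row index
# used as the formal response so nodes can find their rows again.
dat <- data.frame(x = 1:40)
dat$resp <- ifelse(dat$x > 20, 10, 0) + sin(dat$x)
dat$row  <- seq_len(nrow(dat))

# init: hand the parms list back so rpart passes it on to eval and split
init_fn <- function(y, offset, parms, wt) {
  sfun <- function(yval, dev, wt, ylevel, digits)
    paste("mean =", format(signif(yval, digits)))
  list(y = c(y), parms = parms, numresp = 1, numy = 1, summary = sfun)
}

# eval: called once per node; y holds the row indices of that node
eval_fn <- function(y, wt, parms) {
  r <- parms$data$resp[y]
  wmean <- sum(r * wt) / sum(wt)
  list(label = wmean, deviance = sum(wt * (r - wmean)^2))
}

# split: called once per variable per node (numeric predictors only here);
# y and x arrive sorted by x, one goodness value per possible cut point
split_fn <- function(y, wt, x, parms, continuous) {
  if (!continuous) stop("this sketch handles numeric predictors only")
  r <- parms$data$resp[y]
  n <- length(r)
  r <- r - sum(r * wt) / sum(wt)
  csum     <- cumsum(r * wt)[-n]
  left.wt  <- cumsum(wt)[-n]
  right.wt <- sum(wt) - left.wt
  lmean <- csum / left.wt
  rmean <- -csum / right.wt
  list(goodness  = (left.wt * lmean^2 + right.wt * rmean^2) / sum(wt * r^2),
       direction = sign(lmean))
}

fit <- rpart(row ~ x, data = dat,
             method  = list(init = init_fn, eval = eval_fn, split = split_fn),
             parms   = list(data = dat),
             control = rpart.control(xval = 0))
fit$frame[, c("var", "n", "yval")]
```

Inside eval_fn and split_fn, parms$data[y, ] is the data of the current node,
which is where a model fit such as polr() could go.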

Have you (or anyone else) figured out the details of the summary and
text options in the init function?  I know that they are used to fill
out the summary of the model and the text.rpart plotting, but I can't
seem to use any of the variables being passed to them efficiently (or
at all).

Hope that helps,
Sam Stewart

On Feb 20, 2007 2:47 PM, Tobias Guennel [EMAIL PROTECTED] wrote:
 I have made some progress with the user defined splitting function and I got
 a lot of the things I needed to work. However, I am still stuck on accessing
 the node data. It would probably be enough if somebody could tell me, how I
 can access the original data frame of the call to rpart.
 So if the call is: fit0 <- rpart(Sat ~ Infl + Cont + Type,
  housing, control=rpart.control(minsplit=10, xval=0),
  method=alist)
 how can I access the housing data frame within the user defined splitting
 function?

 Any input would be highly appreciated!

 Thank you
 Tobias Guennel


 -Original Message-
 From: Tobias Guennel [mailto:[EMAIL PROTECTED]
 Sent: Monday, February 19, 2007 3:40 PM
 To: '[EMAIL PROTECTED]'
 Subject: [R] User defined split function in rpart

 Maybe I should explain my problem in a little more detail.
 The rpart package allows for user defined split functions. An example is
 given in the source/test directory of the package as usersplits.R.
 The comments say that three functions have to be supplied:
 1. The 'evaluation' function.  Called once per node.
    Produce a label (1 or more elements long) for labeling each node,
    and a deviance.
 2. The split function, where most of the work occurs.
    Called once per split variable per node.
 3. The init function:
    fix up y to deal with offsets
    return a dummy parms list
    numresp is the number of values produced by the eval routine's label.

 I have altered the evaluation function and the split function for my needs.
 Within those functions, I need to fit a proportional odds model to the data
 of the current node. I am using the polr() routine from the MASS package to
 fit the model.
 Now my problem is, how can I call the polr() function only with the data of
 the current node. That's what I tried so far:

 evalfunc <- function(y,x,parms,data) {

   pomnode <- polr(data$y~data$x,data,weights=data$Freq)
   parprobs <- predict(pomnode,type="probs")
   dev <- 0
   K <- dim(parprobs)[2]
   N <- dim(parprobs)[1]/K
   for(i in 1:N){
     tempsum <- 0
     Ni <- 0
     for(l in 1:K){
       Ni <- Ni+data$Freq[K*(i-1)+l]
     }
     for(j in 1:K){
       tempsum <- tempsum+data$Freq[K*(i-1)+j]/Ni*log(parprobs[i,j]*Ni/data$Freq[K*(i-1)+j])
     }
     dev=dev+Ni*tempsum
   }
   dev=-2*dev
   wmean <- 1
   list(label= wmean, deviance=dev)

 }

 I get the error: Error in eval(expr, envir, enclos) : argument "data" is
 missing, with no default

 How can I use the data of the current node?

 Thank you
 Tobias Guennel



Re: [R] mistake during subscription

2008-02-11 Thread Ted Harding
On 11-Feb-08 16:34:36, Anna Meissner wrote:
 Dear R-helper,
 
 I made a mistake during my subscription, and think that
 I turned off from the mailing list. I confirm that I want
 to join the mailing list and wish to post some emails.
 
 Cheers, Anna Meissner

Anna,
If you [re-]visit the info/subscription web page at

  http://stat.ethz.ch/mailman/listinfo/r-help

and enter your details in the section "Subscribing to R-help",
and then click the "Subscribe" button, then you should get
subscribed to the list. This may not take effect immediately,
since I think the list administrator (Martin Maechler) may
have to approve you first. (And this may be why nothing
has happened for you yet, since Martin has been away for
a week and, I think, has only just returned).

In any case, if you post an email to the list even though
you are not subscribed, it will reach the list once it has
been approved (though you would have to visit the archives
at https://stat.ethz.ch/pipermail/r-help/ to see any replies).

Hoping this helps,
Ted.


E-Mail: (Ted Harding) [EMAIL PROTECTED]
Fax-to-email: +44 (0)870 094 0861
Date: 11-Feb-08   Time: 17:20:03
-- XFMail --



Re: [R] how to generate a column based on other columns in a data frame

2008-02-11 Thread Gabor Grothendieck
Assuming this data frame:
DF <- data.frame(X = c(36.435, 36.435, 36.435, 35.329, 35.329,
36.431, 36.431, 35.421, 35.421, 35.421), Y = c(30.118, 30.118,
30.118, 29.657, 29.657, 30.111, 30.111, 29.797, 29.797, 29.797))

# Try this:
DF$site - as.numeric(factor(interaction(DF$X, DF$Y)))

If X and Y can vary slightly while still referring to the same
site then round them to k decimal places first.  See ?round
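
A small sketch of the rounding idea, using the same coordinates as above;
k = 3 decimal places and the S1, S2, ... labels (numbered in order of first
appearance) are illustrative choices, not part of the suggestion itself.

```r
DF <- data.frame(X = c(36.435, 36.435, 36.435, 35.329, 35.329,
                       36.431, 36.431, 35.421, 35.421, 35.421),
                 Y = c(30.118, 30.118, 30.118, 29.657, 29.657,
                       30.111, 30.111, 29.797, 29.797, 29.797))

k <- 3                                        # assumed tolerance: 3 decimal places
key <- paste(round(DF$X, k), round(DF$Y, k))  # one string per coordinate pair

# match() against the unique keys numbers sites in order of first appearance
DF$site <- paste0("S", match(key, unique(key)))
DF$site
# "S1" "S1" "S1" "S2" "S2" "S3" "S3" "S4" "S4" "S4"
```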


On Feb 11, 2008 11:30 AM, Weidong Gu [EMAIL PROTECTED] wrote:
 HI,



 I am working on a data set with multiple collections of mosquitoes at
 sampling sites. Each row represents a collection of individual samples
 with coordinates for each collection.

 ... X,  Y,...
 1  36.435 30.118
 2  36.435 30.118
 3  36.435 30.118
 4  35.329 29.657
 5  35.329 29.657
 6  36.431 30.111
 7  36.431 30.111
 8  35.421 29.797
 9  35.421 29.797
 10 35.421 29.797



 Unfortunately, there is no 'site' entry. I would like to add a column of
 'site' based on the coordinates of samples so that samples from the same
 sites have the same site ID like S1, S2,



 How to do this in R way? Thanks.





 Weidong Gu,

 Department of Medicine
 University of Alabama, Birmingham
 1900 University Blvd., Birmingham, Alabama 35294
 Email: [EMAIL PROTECTED]
 PH: (205)-975-9053






Re: [R] Length problem

2008-02-11 Thread milton ruser
Ciao Paolo,
my.data <- read.table(stdin(), header=TRUE, sep=",")
yy,mm,dd,C.531,C.542,C.558,C.565
2003,1,1,0.9941125,1.412338,0.8996750,2.258200
2003,1,2,1.7931375,2.786900,NA,3.108725
2003,1,3,NA,3.657775,1.7269750,2.541938
2003,1,4,1.0840625,1.766925,1.2313375,2.321300
2003,1,5,1.1558000,2.128488,0.9670375,NA


coppie <- c(my.data[4:length(my.data)])

my.data[,4]

length(my.data[,4])

coppie[1]

length(coppie[1]) # here you get 1 because coppie[1] is a list holding one object ($C.531)

length(coppie[[1]]) # here you get what you want.
 Good luck
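
A small self-contained sketch of the same point, using a made-up two-column
excerpt of the data (the object names are only for illustration): single
brackets keep the list wrapper, double brackets extract the vector itself.

```r
# Made-up excerpt of the data, typed in directly instead of read from stdin().
my.data <- data.frame(yy = 2003, mm = 1, dd = 1:5,
                      C.531 = c(0.9941125, 1.7931375, NA, 1.0840625, 1.1558000),
                      C.542 = c(1.412338, 2.786900, 3.657775, 1.766925, 2.128488))

coppie <- c(my.data[4:length(my.data)])  # c() on a data frame gives a plain list

length(coppie)        # number of columns kept
length(coppie[1])     # a list with one element, so 1
length(coppie[[1]])   # the vector inside that element
sapply(coppie, length)  # length of every column at once
```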


Miltinho


On 2/11/08, Paolo Grillo [EMAIL PROTECTED] wrote:

 Ciao Milthinho
 Here it is

 > data
     yy mm dd     C.531    C.542     C.558    C.565
 1 2003  1  1 0.9941125 1.412338 0.8996750 2.258200
 2 2003  1  2 1.7931375 2.786900        NA 3.108725
 3 2003  1  3        NA 3.657775 1.7269750 2.541938
 4 2003  1  4 1.0840625 1.766925 1.2313375 2.321300
 5 2003  1  5 1.1558000 2.128488 0.9670375       NA

 # New data
 > coppie <- c(data[4:length(data)])

 # Length of original data
 > data[,4]
 [1] 0.9941125 1.7931375        NA 1.0840625 1.1558000
 > length(data[,4])
 [1] 5
 > 5   # Right !!!
 [1] 5
 # Length of new data
 > coppie[1]
 $C.531
 [1] 0.9941125 1.7931375        NA 1.0840625 1.1558000

 > length(coppie[1])
 [1] 1
 > 1   # Why ??

 Thank you for your help

 Paolo
 Italia


 milton ruser wrote:

 Ciao Paolo,

 How about you show some row of your data?
 How many columns have your data.frame? One?
 By the way data is not a so good name for your data frame.

 We will be very happy to help you

 Kindly,

 Miltinho
 Brasile

 On 2/11/08, Paolo Grillo [EMAIL PROTECTED] wrote:
 
 
Hi all
I have this problem:
In my database .dta, called data I have five rows
data <- read.dta("C:\\2_CO_mmobile_ALL_Rid.dta")
# From this database I would like to create another
coppie <- c(data[4:length(data)])
but I find this

# Length of original data
length(data[,4])
5   RIGHT!!
# Length of new data
length(coppie[1])
1  WHY??
Thank you all for your help
Paolo Grillo


[R] mistake during subscription

2008-02-11 Thread Anna Meissner
Dear R-helper,

I made a mistake during my subscription, and think that I turned off from
the mailing list. I confirm that I want to join the mailing list and wish to
post some emails.

Cheers, Anna Meissner



Re: [R] Using R in a university course: dealing with proposal comments

2008-02-11 Thread David Whiting

On Mon, Feb 11, 2008 at 07:37:04AM -0800, Neil Shephard wrote:
 
 
 
 Arin Basu-3 wrote:
  
  Comment 2:
  
  Finally, on a minor point, why is R the statistical software being
  used? SPSS is probably more widely available in the workplace –
  certainly in areas of social policy etc.  (Prof NB)
  
  
 
 What struck me in the above is the "probably".  How probable is it? Is there
 anything to substantiate the claim?
 
 Anyway, whether one package is more widely available in the workplace than
 another is somewhat of a moot point.  If a student learns how to use one
 software package then they start to get pigeon-holed into using that
 particular software package.
 
 Many jobs are advertised with SPSS/SAS/Stata/S-Plus (add/subtract at will)
 skills/knowledge required (or at least desirable).  The prospective job
 applicant may think "Well, I don't know how to use that so I shan't bother
 applying", or they may be unwilling to learn a new stats
 package after months/years of investment in learning how to use another
 package; alternatively, they may well just lose out to someone who already
 has the experience/skills.
 
 (Most) of this problem isn't negated when using R.  Start a new job and use
 the (excellent, extensible, and free) software that you've been using for
 years.

And you could even argue that learning R means you'll be able to do
more with SPSS: http://www.spss.com/spss/data_management_book.htm

[I have not read this book so I don't know anything about the details
of how they implement this, I just came across this by accident, but I
was intrigued by the idea of extending SPSS using R.]


David


 
 I'd stick with using R to teach your statistics, in the long-run any of them
 who continue to perform statistical analysis will be grateful.
 
 Neil
 

-- 
David Whiting, Ph.D.
Advancing Research in Chronic Disease Epidemiology (ARCHEPI) programme
Institute of Health and Society, The Medical School, 
Newcastle University, Framlington Place, Newcastle upon Tyne, NE2 4HH. 
Tel: +44 191 222 7045;  Extn: 7375; Fax: +44 191 222 8211.
http://research.ncl.ac.uk/archepi
www.ncl.ac.uk/ihs



[R] how to generate a column based on other columns in a data frame

2008-02-11 Thread Weidong Gu
HI,

 

I am working on a data set with multiple collections of mosquitoes at
sampling sites. Each row represents a collection of individual samples
with coordinates for each collection. 

... X,  Y,...
1  36.435 30.118
2  36.435 30.118
3  36.435 30.118
4  35.329 29.657
5  35.329 29.657
6  36.431 30.111
7  36.431 30.111
8  35.421 29.797
9  35.421 29.797
10 35.421 29.797

 

Unfortunately, there is no 'site' entry. I would like to add a column of
'site' based on the coordinates of samples so that samples from the same
sites have the same site ID like S1, S2, 

 

How to do this in R way? Thanks.

 

 

Weidong Gu, 

Department of Medicine
University of Alabama, Birmingham
1900 University Blvd., Birmingham, Alabama 35294
Email: [EMAIL PROTECTED]
PH: (205)-975-9053

 




Re: [R] Difference between P.Value and adj.P.Value

2008-02-11 Thread john seers (IFR)
 
Hi Corinna

The adj.P.Val value is the p-value adjusted for multiple
comparisons.

Enter ?p.adjust to get more of an explanation.
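
As a rough illustration of what the adjustment does, here is a sketch with
made-up p-values (they are not from the microarray data in the question).
It uses p.adjust() from base R with the "fdr" method, which is an alias for
Benjamini-Hochberg ("BH").

```r
# Made-up raw p-values, one per hypothetical gene.
p_raw <- c(0.001, 0.008, 0.039, 0.041, 0.042, 0.060, 0.074, 0.205)

# False discovery rate adjustment; "fdr" and "BH" are the same method.
p_adj <- p.adjust(p_raw, method = "fdr")

# Adjusted values are never smaller than the raw ones, so fewer genes
# pass a fixed cut-off once multiple testing is taken into account.
n_raw <- sum(p_raw < 0.05)
n_adj <- sum(p_adj < 0.05)
print(data.frame(p_raw, p_adj))
```

When many genes are tested at once, it is the adjusted column that is
normally compared against the chosen cut-off.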

Regards


JS

 
---
-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
On Behalf Of Schmitt, Corinna
Sent: 11 February 2008 16:02
To: r-help@r-project.org
Subject: [R] Difference between P.Value and adj.P.Value

Hallo,


> fit12 <- lmFit(qrg[,1:2])
> t12 <- toptable(fit12,adjust="fdr",number=25,genelist=qrg$genes[,1])
> t12
           ID     logFC         t      P.Value  adj.P.Val        B
522   PLAU_OP -6.836144 -8.420414 5.589416e-05 0.01212520 2.054965
1555 CD44_WIZ -6.569622 -8.227938 6.510169e-05 0.01212520 1.944046
1555  CD44_WIZ -6.569622 -8.227938 6.510169e-05 0.01212520 1.944046

Can anyone tell me what the difference is between P.Value and
adj.P.Value? I need to analyse microarrays and should say whether there are
differentially expressed genes. Which P.Value should I use?

Thanks, Corinna



[R] ROracle for windows

2008-02-11 Thread Daniel Ito
..

Can somebody help me to connect an Oracle database with R?

I'm just a user of this software and I don't know about these special
things!

sorry about my english...

att


-- 
Atenciosamente
Daniel Ito
Estatística UNICAMP
EPR - CPFL



Re: [R] Length problem

2008-02-11 Thread milton ruser
Ciao Paolo,

How about you show some row of your data?
How many columns have your data.frame? One?
By the way data is not a so good name for your data frame.

We will be very happy to help you

Kindly,

Miltinho
Brasile

On 2/11/08, Paolo Grillo [EMAIL PROTECTED] wrote:


   Hi all
   I have this problem:
   In my database .dta, called data I have five rows
   data <- read.dta("C:\\2_CO_mmobile_ALL_Rid.dta")
   # From this database I would like to create another
   coppie <- c(data[4:length(data)])
   but I find this

   # Length of original data
   length(data[,4])
   5   RIGHT!!
   # Length of new data
   length(coppie[1])
   1  WHY??
   Thank you all for your help
   Paolo Grillo


[R] overdispersion + GAM

2008-02-11 Thread anna banana

Hi,

there are a lot of messages dealing with overdispersion, but I couldn't find
anything about how to test for overdispersion. I applied a GAM with binomial
distribution on my presence/absence data, and would like to check for
overdispersion. Does anyone know the command?

Many thanks,

Anna
-- 
View this message in context: 
http://www.nabble.com/overdispersion-%2B-GAM-tp15413120p15413120.html
Sent from the R help mailing list archive at Nabble.com.



[R] Length problem

2008-02-11 Thread Paolo Grillo

   Hi all
   I have this problem:
   In my database .dta, called data I have five rows
   data <- read.dta("C:\\2_CO_mmobile_ALL_Rid.dta")
   # From this database I would like to create another
   coppie <- c(data[4:length(data)])
   but I find this

   # Length of original data
   length(data[,4])
   5   RIGHT!!
   # Length of new data
   length(coppie[1])
   1  WHY??
   Thank you all for your help
   Paolo Grillo


Re: [R] Using R in a university course: dealing with proposal comments

2008-02-11 Thread Bernard Leemon
Hi Arin,
Others have commented wisely on your first issue.  As for your 2nd issue, I
had my own concerns about using R in undergraduate teaching because I had
always used a point-and-click program for that level.  I should not have
worried.  The current generation has been typing on their keyboards and
their phones for a long time; they are very skilled.  They LIKE a
command-line interface, so long as someone gives them an initial cheat sheet
to get them going.  They like the price, they like having it on their own
computers, and they like that they can use it in other courses.  Some students
are sometimes upset that no one has ever told them about R before.  Two
hours after the first lab in which I had students download R to their
laptops, I received an email from a student telling me about how she had
used R to do her physics homework.  I like the (almost)
platform-independence of R.  I've resisted using Rcmdr and JGR because I
want students to be able to use base R well.  If they want to customize
later, then fine.  But what I teach them will apply wherever they next
encounter R, whereas if I were to use a lot of packages--especially one I
would be tempted to create to match my teaching more closely--then they
wouldn't be sure what to expect later.

gary mcclelland
Colorado



Re: [R] Using R in a university course: dealing with proposal comments

2008-02-11 Thread Stas Kolenikov
I've been teaching an intro stats class to engineering students (who are
better in calculus and math than med students, I would imagine), and use of
R has never been received very warmly. I might not be teaching it right, but
their (quite valid, from their standpoint) concerns were that they would
have to learn a tool that they will never use (so they might have been
better off with statistics toolbox from Matlab say, as they use the latter
in their DiffEq, Circuits and other classes), and that they did not get enough
credit points for doing those (and indeed I was suggesting using R as an
extra credit, essentially as a bypass so as not to use the tables at the end
of the book). With health sciences people, I would expect they would want to
learn the tool that they would use for life
-- at least that's my impression with the applied researchers that
I've interacted with: their computer literacy is often limited to a
small number of software titles, but they know each of them quite
well. R might be just too dynamic for them.
Again, it's not terribly clear whether they will use it at all if that's the
only statistics class they take for breadth requirement. If anything, I
would expect SAS and Stata to be more widely used in biostatistics, so
teaching any of those might be of greater service and use to your students.
Training researchers of tomorrow might be great, but if your students get on
the market at the end of the semester, they won't have the luxury of waiting
until R becomes THE package of choice.

-- 
Stas Kolenikov, also found at http://stas.kolenikov.name
Small print: Please do not reply to my Gmail address as I don't check it
regularly.



Re: [R] Using R in a university course: dealing with proposal comments

2008-02-11 Thread Longinus
I will also evaluate what the students used before in the
introductory statistics class and how proficient they have become in
using it. If they only barely touched it, I will use my class as a
chance to further refine their familiarity with the software they saw
before. A tool is a tool; I consider it more important to use at least
one tool really well, regardless of whether it's trendy or free or
user-friendly, than to be able to juggle many software packages
superficially.

SPSS, despite its price, is still widely recognized. I wouldn't
feel too bad teaching it. However, I will definitely shift the focus
from the GUI to syntax writing so that the students will be better
prepared for other command-based software.

Ken



Re: [R] R programming style

2008-02-11 Thread Bernard Leemon
I just got a copy of
A First Course in Statistical Programming with R by W. John Braun and Duncan
J. Murdoch.  Cambridge.  at amazon:
 http://www.amazon.com/First-Course-Statistical-Programming-R/dp/0521694248/

first couple of chapters are base R that most everyone would know before
wanting to program but then the other chapters on programming itself seem
pretty good so far.

gary mcclelland
colorado

On Mon, Feb 11, 2008 at 3:47 AM, David Scott [EMAIL PROTECTED] wrote:


 I am aware of one (unofficial) guide to style for R programming:
 http://www1.maths.lth.se/help/R/RCC/
 from Henrik Bengtsson.

 Can anyone provide further pointers to good style?

 Views on Bengtsson's ideas would interest me as well.

 David Scott



 _
 David Scott Department of Statistics, Tamaki Campus
The University of Auckland, PB 92019
Auckland 1142,NEW ZEALAND
 Phone: +64 9 373 7599 ext 86830 Fax: +64 9 373 7000
 Email:  [EMAIL PROTECTED]

 Graduate Officer, Department of Statistics
 Director of Consulting, Department of Statistics



[R] Lower Confidence Bound

2008-02-11 Thread Feifei
Hi,

I'm analysing microarray data with the affylmGUI package, and I want to
compare the results from affylmGUI with those from dChip
( http://biosun1.harvard.edu/complab/dchip/ another tool to analyse
microarrays).

I'm trying to use the same criteria as far as I can, but there's a 90%
lower confidence bound used to filter genes. Does anyone know how to
calculate this lower confidence bound in R or by using a specific
package?

Thanks!

Feifei Ding



Re: [R] Linux, UNIX, XP32, Vista X64 or ...? [SEC=UNCLASSIFIED]

2008-02-11 Thread Jin.Li
Thanks to all for your kind suggestions.

After some discussion with our IT staff, I was told that the UNIX system we
have is Solaris and that installing R is very time consuming because "Given
that this software is not standard, and given the amount of time required to
compile the software (and potentially it's dependencies), it will need to be
resourced as a project ..." From my experience with IT staff, it may take
quite a long time for them to set up such a project, let alone the
installation.

Given that, I wonder if it is possible to install it myself. As I have
mentioned before, I have no experience in using UNIX, but I will have an
access to the UNIX system soon. Any suggestions and help are greatly
appreciated.

Regards,
Jin

-Original Message-
From: Gabor Grothendieck [mailto:[EMAIL PROTECTED] 
Sent: Monday, 28 January 2008 11:38
To: Li Jin
Cc: r-help@r-project.org
Subject: Re: [R] Linux, UNIX, XP32, Vista X64 or ...? [SEC=UNCLASSIFIED]

On the PC there is a builtin GUI but not on UNIX and there are
some packages that are OS specific in which case you might
get more or less selection but probably more.  Also depending
on the specific system you may have greater difficulty installing
certain packages due to the need to compile them on UNIX
and the possibility exists that you don't quite have the right
libraries.  On Windows you get binaries so this is not a problem.
I have repeatedly found that common packages that I took
for granted on Windows had some problem with installation
on UNIX and I had to hunt around and figure out what the problem
was with my UNIX libraries or possibly some other problem.
For all R packages this won't be a problem but for packages
that use C and FORTRAN this can be.  Although I am lumping
all UNIX systems together I think this varies quite a bit from
one particular type/distro of UNIX/Linux to another and I suspect if you
are careful in picking out the right one (if you have a choice) you
will actually have zero problems.

On Jan 23, 2008 6:08 PM,  [EMAIL PROTECTED] wrote:
 Dear All,
 I am currently using R in Windows PC with a 2 GB of RAM. Some pretty large
 datasets are expected soon, perhaps in an order of several GB. I am facing a
 similar situation like Ralph, either to get a new PC with a bigger RAM or
 else. I am just wondering if R is getting faster in other systems like UNIX
 or Linux. Any suggestions are appreciated.
 Regards,
 Jin
 
 Jin Li, PhD
 Spatial Modeller/
 Computational Statistician
 Marine  Coastal Environment
 Geoscience Australia
 Ph: 61 (02) 6249 9899
 Fax: 61 (02) 6249 9956
 email: [EMAIL PROTECTED]
 


 -Original Message-
 From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On
 Behalf Of Prof Brian Ripley
 Sent: Thursday, 24 January 2008 12:05
 To: Ralph79
 Cc: r-help@r-project.org
 Subject: Re: [R] Problems with XP32-3GB-patch?/ Worth upgrading to Vista
 X64?

 On Wed, 23 Jan 2008, Ralph79 wrote:

 
  Dear R-Users,
 
  as I will start a huge simulation in a few weeks, I am about to buy a new
  and fast PC. I have noticed, that the RAM has been the limiting factor in
  many of my calculations up to now (I had 2 GB in my old system, but
  Windows still used quite a lot of virtual memory), hence my new computer
  will have 4 GB of fast DDR2-800 RAM.
 
  However, I know that 1.) Windows 32 bit cannot make use of more than about
  3,2 GB RAM and 2.) it is normally not allowed to allocate more than 2 GB of
  RAM to one single application (at least under XP, I don't know if that has
  changed under Vista?).
 
  I remember from the R-FAQ that you can manually adjust XP so that it
  allocates up to 3 GB to one application (the 3GB patch), but I read in a
  PC-magazine and some message boards that this may cause problems. Does
  anybody of you successfully use this trick without any problems?

 Yes, many people: most 32-bit Exchange servers use it.  Please don't rate
 the advice in the R documentation below tittle-tattle you read on the web.

  Would it be wise to use a 64bit OS, as e.g. Vista X64? I think, under Vista
  X64 it should be no problem to allocate 4 GB of RAM to R. Any experiences
  with that?

 That's what the rw-FAQ says, and we do write answers based on experience!

  Thanks in advance,
  Ralph Wirth
 
 
  -
  Ralph Wirth
  University Erlangen-Nuremberg, Chair of Statistics
  GfK Group, Department of Methods and Product Development
 
 

 --
 Brian D. Ripley,  [EMAIL PROTECTED]
 Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
 University of Oxford, Tel:  +44 1865 272861 (self)
 1 South Parks Road, +44 1865 272866 (PA)
 Oxford OX1 3TG, UKFax:  +44 1865 272595


Re: [R] Conditional rows

2008-02-11 Thread Stanley Ng
That works beautifully. Why does using test<=0.2 || test 0.3 give an error?

-Original Message-
From: Gabor Csardi [mailto:[EMAIL PROTECTED] 
Sent: Monday, February 11, 2008 18:27
To: Ng Stanley
Cc: r-help
Subject: Re: [R] Conditional rows

which(apply(test<=0.2, 1, all))

See ?which, ?all, and in particular ?apply.
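
A short sketch of the suggestion, run on the matrix from the question, plus
a note on the follow-up about ||. The second condition below (test >= 0.3)
is only a made-up example, since the original condition is not fully
legible. As far as I know, || is meant for single logical values (recent
versions of R raise an error when given longer inputs), whereas | works
element by element and is the one to use on a matrix.

```r
test <- matrix(c(0.1, 0.2, 0.1, 0.2, 0.1, 0.1, 0.3, 0.1, 0.1), 3, 3)

# Rows in which every value is <= 0.2
rows_all <- which(apply(test <= 0.2, 1, all))

# An equivalent form without apply(): no value in the row exceeds 0.2
rows_alt <- which(rowSums(test > 0.2) == 0)

# Element-wise "or" on the whole matrix: use | rather than ||
# (the second threshold is a hypothetical example)
flag <- test <= 0.1 | test >= 0.3

print(rows_all)
```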

Gabor

On Mon, Feb 11, 2008 at 06:22:09PM +0800, Ng Stanley wrote:
 Hi,
 
 Given a simple example,  test <- matrix(c(0.1, 0.2, 0.1, 0.2, 0.1, 
 0.1, 0.3, 0.1, 0.1), 3, 3)
 
 How to generate row indexes for which their corresponding row values 
 are less than or equal to 0.2 ? For this example, row 2 and 3 are the 
 correct ones.
 
 Thanks
 

-- 
Csardi Gabor [EMAIL PROTECTED]UNIL DGM



Re: [R] Using R in a university course: dealing with proposal comments

2008-02-11 Thread John Fox
Dear Arin,

 -Original Message-
 From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
 project.org] On Behalf Of Arin Basu
 Sent: February-10-08 10:41 PM
 To: r-help@r-project.org
 Subject: [R] Using R in a university course: dealing with proposal
 comments
 
 Hi All,
 
 I am scheduled to teach a graduate course on research methods in
 health sciences at a university. While drafting the course proposal, I
 decided to include a brief introduction to R, primarily with an
 objective to enable the students to do data analysis using R. It is
 expected that enrolled students of this course have all at least a
 formal first level introduction to quantitative methods in health
 sciences and following completion of the course, they are all expected
 to either evaluate, interpret, or conduct primary research studies in
 health. The course would be delivered over 5 months, and R was
 proposed to be taught as several laboratory based hands-on sessions
 along with required readings within the coursework.
 
 The course proposal went to a few colleagues in the university for
 review. I received review feedbacks from them; two of them commented
 about inclusion of R in the proposal.
 
 In quoting parts these mails, I have masked the names/identities of
 the referees, and have included just part of the relevant text with
 their comments. Here are the comments:
 
 Comment 1:
 
 In my quick glance, I did not see that statistics would be taught,
 but I did see that R would be taught.  Of course, R is a statistics
 programme. I worry that teaching R could overwhelm the class.  Or
 teaching R would be worthless, because the students do not understand
 statistics.  (Prof LR)

As others have pointed out, this is potentially a valid point, but it is
applicable to all statistical software. I use R in several different courses
for social-science undergraduates and grad students, but the focus is on the
statistical methods, with R as a tool. In introductory courses, I use the
Rcmdr package to simplify students' interaction with R. Beyond that level, I
want students to learn to use R as a practical tool for data analysis, so I
teach them to write commands. In all courses, students have much more
difficulty with the substantive course content than with R, which they pick
up readily.

 Comment 2:
 
 Finally, on a minor point, why is R the statistical software being
 used? SPSS is probably more widely available in the workplace -
 certainly in areas of social policy etc.  (Prof NB)

I don't have concrete data on this, and I'm sure that usage varies by field,
but I'd bet that R is now more widely used overall (and internationally)
than SPSS. Moreover, it wouldn't take students long to learn to
point-and-click their way through SPSS if they have to use it in future.

I hope this helps,
 John

 
 I am interested to know if any of you have faced similar questions
 from colleagues about inclusion of R in non-statistics based
 university graduate courses. If you did and were required to address
 these concerns, how you would respond?
 
 TIA,
 Arin Basu
 


Re: [R] RGTK2 and glade on Windows - GUI newbie

2008-02-11 Thread Felix Andrews
Yes, a GUI based on GTK+ (with or without Glade) will work on Windows XP.

If what you want to do is relatively straightforward (say, without any
fancy formatting, or advanced event handling) then you should consider
gWidgets. Look at the vignette in the gWidgets package.

If you do decide to go with RGtk2 directly rather than gWidgets, look
at demo(package="RGtk2"). If you are using Glade, you might want to
look at the source code for Rattle, which is a GUI built with Glade:
see http://rattle.googlecode.com/
and http://datamining.togaware.com/survivor/Installation_Details.html
(Another example is hydrosanity: http://hydrosanity.googlecode.com/)

By the way, there is a special mailing list for GUI issues:
https://stat.ethz.ch/mailman/listinfo/r-sig-gui

Felix

On Mon, Feb 11, 2008 at 9:52 PM, Anja Kraft
[EMAIL PROTECTED] wrote:
 Hallo,

  I'd like to write a GUI (first choice with GTK+).
  I've surfed through the R- an Omegahat-Pages, because I'd like to use
  RGTK2, GTK 2.10.11 in combination with glade on Windows XP (perhaps later
  Unix, Mac).
  I've found a lot of different information. Because of the information I'm
  not sure, if this combination is running on Windows XP and I'm unsure how
  it works.

  Is there anyone, who has experience with this combination (if it works)
  and could tell me, where I could find something like a tutorial, how this
  combination is used together and how it works?

  Thank you very much,

  Anja Kraft





-- 
Felix Andrews / 安福立
PhD candidate
Integrated Catchment Assessment and Management Centre
The Fenner School of Environment and Society
The Australian National University (Building 48A), ACT 0200
Beijing Bag, Locked Bag 40, Kingston ACT 2604
http://www.neurofractal.org/felix/
3358 543D AAC6 22C2 D336  80D9 360B 72DD 3E4C F5D8



[R] PDF with computationally expensive normalizing constant

2008-02-11 Thread Robin Hankin
Hi

I am writing some functionality for a multivariate  PDF.

One problem is that evaluating the normalizing constant (NC) is
massively computationally intensive [one recent example
took 4 hours and bigger examples would take much much longer]
and it would be good to allow for this in the
design of the package somehow.

For example, the likelihood function doesn't need the NC
but (eg) the moment generating function does.

So a user wanting a maximum-likelihood estimate shouldn't have
to evaluate the NC but a user wanting a
mean has to.  Some simple forms of the PDF have an
easily-evaluated analytical expression for the NC.

And once the NC is evaluated, it would be
good to store it somehow.

I thought perhaps I could define an S4 class  with a slot for
the parameters and a slot for the NC; and
if the NC is unknown this would have an NA entry.

Then a user could execute something like

a <- CalculateNormalizingConstant(a)

and after this, object a  would then have the numerically
computed NC  in place.
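
A minimal sketch of the design described above, assuming a hypothetical
class name "LazyPDF" and a toy one-dimensional kernel standing in for the
expensive integral (neither is from any existing package). The NC slot
starts as NA and is filled in only when the user asks for it:

```r
library(methods)

# Hypothetical class: 'params' holds the PDF parameters, 'NC' caches the
# normalizing constant and stays NA until it has been computed.
setClass("LazyPDF",
         representation(params = "numeric", NC = "numeric"),
         prototype(NC = NA_real_))

setGeneric("CalculateNormalizingConstant",
           function(x, ...) standardGeneric("CalculateNormalizingConstant"))

setMethod("CalculateNormalizingConstant", "LazyPDF", function(x, ...) {
  if (is.na(x@NC)) {
    # Stand-in for the expensive step: integrate an unnormalized density.
    kernel <- function(t) exp(-x@params[1] * t^2)
    x@NC <- integrate(kernel, -Inf, Inf)$value
  }
  x  # return the updated object; the caller reassigns it
})

a <- new("LazyPDF", params = 1)
is.na(a@NC)                           # NC not yet known
a <- CalculateNormalizingConstant(a)
a@NC                                  # now cached in the object
```

Functions that do not need the NC (a likelihood, say) can ignore the slot,
while functions that do need it can test is.na(a@NC) and either compute it
or stop with an informative message.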



Is this a Good Idea?

Are there any PDFs implemented in R  in which this is an issue?






--
Robin Hankin
Uncertainty Analyst and Neutral Theorist,
National Oceanography Centre, Southampton
European Way, Southampton SO14 3ZH, UK
  tel  023-8059-7743



Re: [R] Dendrogram for agglomerative hierarchical clustering result

2008-02-11 Thread Wolfgang Huber
Hi Risto,
You could try

   example(dendrogram)

best wishes
Wolfgang

noorpiilur scripsit:
 Hey group,
 
 I have a problem of drawing dendrogram as the result of my program
 written in C. My algorithm is a approximation algorithm for single
 linkage method. AS a result I will get the following data:
 
 [Average distance] [cluster A] [cluster B]
 
 For example:
 42.593141 1   26
 42.593141 4   6
 42.593141 123 124
 42.593141 4   113
 74.244206 1   123
 74.244206 4   133
 74.244206 1   36
 
 So far I have used C to generate a bitmap output but I would like to
 use the computed result as an input for R to just draw the dendrogram.
 
 As I'm new to R any help is appreciated.
 
 Thanks,
 Risto




Re: [R] local variance estimation using gam or locfit

2008-02-11 Thread Takatsugu Kobayashi
Hi,

I would appreciate it if anyone could give me clues about the following problem.

I have map data, x, y, z, and d, where (x,y) is the coordinate of a 
point, d is a distance from the urban center (0,0), and z is 
population density.  I would like to calculate local standard 
deviations of these points.  Let me say hypothetically,

x <- rnorm(100)
y <- rnorm(100)
z <- runif(100)
d <- sqrt(x^2+y^2)*runif(100,1,1.5)

mod <- gam(z~s(x,y,by=d))

std.res.loc <- residuals/loc.std

So, I would like to calculate loc.std.  Is there any function available 
for this? Or should I manually compute it?

I am reading Generalized Additive Model: Introduction to R by Dr. Wood.

Thank you very much.

Tk



Re: [R] Help with write.csv

2008-02-11 Thread Richard . Cotton
 I am new to R. I am using the impute package with data contained in csv
 file.
 I have followed the example in the impute package as follows:
 
  mydata = read.csv("sample_impute.csv", header = TRUE)
  mydata.expr <- mydata[-1,-(1:2)]
  mydata.imputed <- impute.knn(as.matrix(mydata.expr))
 
 The impute is successful.
 
 Then I try to write the imputation results (mydata.imputed) to a csv file
 as follows:
 
  write.csv(mydata.imputed, file = "sample_imputed.csv")
 Error in data.frame(data = c(-0.07, -1.22, -0.09, -0.6, 0.65, -0.36, 0.25, :
   arguments imply differing number of rows: 18, 1, 0

When you use write.csv, the object that you are writing to a file must 
look something like a data frame or a matrix, i.e. a rectangle of data. 
The error message suggests that different columns of the thing you are 
trying to write have different numbers of rows. 

This means that mydata.imputed isn't the matrix it is supposed to be. 
You'll have to do some detective work to figure out what mydata.imputed 
really is. Try this:

mydata.imputed
class(mydata.imputed)
dim(mydata.imputed)

Then you need to see why mydata.imputed isn't a matrix.  Here there are 
two possibilities
1. There are some lines of code that you didn't tell us about, where you 
overwrote mydata.imputed with another value.
2. The impute wasn't as successful as you thought.
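
As an illustration of the detective work above, here is a small sketch. It
assumes (the error text "data.frame(data = c(...)" hints at this, but it is
an assumption) that the result is a list with a component named data rather
than a bare matrix. The object below is a made-up stand-in, not real
impute.knn output:

```r
# Hypothetical stand-in for the imputation result: a list holding the
# imputed matrix plus some bookkeeping entries of different lengths.
mydata.imputed <- list(
  data      = matrix(c(-0.07, -1.22, -0.09, -0.6), nrow = 2),
  rng.seed  = 1L,
  rng.state = NULL
)

class(mydata.imputed)       # a list, not a matrix
is.matrix(mydata.imputed)
str(mydata.imputed)         # shows which component is rectangular

# Write only the rectangular component.
out <- tempfile(fileext = ".csv")
write.csv(mydata.imputed$data, file = out)
back <- read.csv(out, row.names = 1)
dim(back)
```

If str() shows something else entirely, possibility 1 or 2 above is the more
likely explanation.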

Regards,
Richie.

Mathematical Sciences Unit
HSL





[R] svm: is this right?

2008-02-11 Thread Weiwei Shi
Hi,
I have a question on using svm{e1071} for a classification task:

No matter how I split the data into training and test, I always end with a
perfect accuracy in training but sensitivity = 0 for test. One example is
like this
      1   2
  1 209   0
  2   0  67

   pred1
     1  2
  1 47  0
  2 17  0

My question is, is there anything wrong with the following call:
m2 <- best.svm(class~., data=x1, gamma=2^(-3:3), cost=2^(0:5)) # x1 is training data
pred1 <- predict(m2, x3)  # x3 is test data

Thanks!

-- 

Weiwei Shi, Ph.D
Research Scientist
GeneGO, Inc.

Did you always know?
No, I did not. But I believed...
---Matrix III



Re: [R] learning S4

2008-02-11 Thread Robin Hankin
Christophe

you might find the Brobdingnag package on CRAN helpful here.

I wrote the package partly to teach myself S4; it includes a
  vignette that builds the various S4 components from scratch,
in a step-by-step annotated cookbook.


HTH


rksh




On 8 Feb 2008, at 15:30, [EMAIL PROTECTED] wrote:

 Hi list,

 I am trying to learn S4 programming. I found the wiki and several documents,
 but I still have a few questions...

 1. To define 'representation', we can use two syntaxes:
- representation=list(temps = 'numeric',traj = 'matrix')
- representation(temps = 'numeric',traj = 'matrix')
   Is there any difference?
 2. 'validityMethod' checks the initialisation of a new object, but not
   later modifications. Is it possible to set up a validation that checks
   every modification?
 3. When we use setMethod('initialize',...), does the validityMethod
   become unused?
 4. Is it possible to set up several initialization processes? One
   that builds an object from a data.frame, one from a matrix...

 Thanks

 Christophe

 

--
Robin Hankin
Uncertainty Analyst and Neutral Theorist,
National Oceanography Centre, Southampton
European Way, Southampton SO14 3ZH, UK
  tel  023-8059-7743



Re: [R] Tinn-R not working well with latest R

2008-02-11 Thread Farrel Buchinsky
I can easily get R to open without an error. I simply removed the Tinn-R 
related lines from the Rprofile.site file
C:\Program Files\R-2.6.2\etc\Rprofile.site

but then when I try to manually load the svIDE library by entering 
library(svIDE) from the command line, I get a similar error.

So when you say "Then paste in the command", what command are you referring 
to?
What do you change it to?


Schmitt, Corinna [EMAIL PROTECTED] wrote in message 
news:[EMAIL PROTECTED]


 Hallo,

 I had the same problems before. I think the best solution is to copy the 
 code you need out of Tinn-R with Ctrl+C. Then open R directly from your 
 desktop, NOT from Tinn-R. Then paste in the command. You can still edit the 
 command before pressing Enter by using the arrow keys: put the cursor where 
 you want in the command line and change it.

 Hope that is what you want. I cannot reproduce your example.

 Corinna






Re: [R] R programming style

2008-02-11 Thread Scillieri, John
I second that, Code Complete is a great book! For anyone interested in
improving their code no matter what language, (it has a C++/Java-type
focus but is definitely applicable to R), it would definitely be a good
place to start.

I've read some negative reviews claiming that everything he writes is
'obvious' (use good variable names, short concise functions, limit
nested conditionals, etc) but on more than one occasion I've gone back
over the book and thought of new places to improve my code. 

HTH,
John 

-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
On Behalf Of Earl F. Glynn
Sent: Monday, February 11, 2008 2:30 PM
To: [EMAIL PROTECTED]
Subject: Re: [R] R programming style

David Scott [EMAIL PROTECTED] wrote in message
news:[EMAIL PROTECTED]

 Can anyone provide further pointers to good style?

While not written for R specifically, the book "Code Complete: A
Practical Handbook of Software Construction" (2nd Edition) discusses a
number of good concepts for writing good code in any language:
http://www.amazon.com/Code-Complete-Practical-Handbook-Construction/dp/0735619670

In particular, Part IV, "Statements", gives a number of useful suggestions
by type of statement, e.g., straight-line code, conditionals, loops, ...

There are some practices used in R that I think should be improved.  For
example, many years ago I was taught in a software engineering class
that the use of "magic numbers" was a bad practice, yet we find magic
numbers used in R in many places.

Instead of using 1 or 2 in an apply, I'll write something like
this, trying for some sort of mnemonic:

apply(x, BY.ROW<-1, sum)
or
apply(z, BY.COL<-2, mean)


I find BY.ROW or BY.COL to be more mnemonic than the magic numbers 1 and
2.

The sides 1, 2, 3, and 4 in an axis statement should have some sort of
mnemonic definition, too, perhaps:

axis(BOTTOM<-1, ...)
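
A variant of the same idea, as a sketch only: define the names once near the
top of a script instead of assigning them inside each call. The constant
names are the hypothetical ones used above; R itself does not provide them.

```r
# User-defined mnemonic constants (not built into R).
BY.ROW <- 1L
BY.COL <- 2L

x <- matrix(1:6, nrow = 2)          # rows: (1, 3, 5) and (2, 4, 6)

row.sums  <- apply(x, BY.ROW, sum)  # one sum per row
col.means <- apply(x, BY.COL, mean) # one mean per column

row.sums
col.means
```

Defining the constants once avoids the side effect of creating BY.ROW in the
calling environment every time apply() is called.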

But I believe I was ostracized in this E-mail list the last time I
suggested such mnemonics instead of magic numbers.

efg
Earl F. Glynn
Bioinformatics
Stowers Institute for Medical Research



Re: [R] Viable Approach to Parallel R?

2008-02-11 Thread Scillieri, John
We've also had substantial success with the Condor project
[http://www.cs.wisc.edu/condor/], not just with R, but as a generic
computation grid.

John 

-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
On Behalf Of Lewis, Daniel (IS Consultant)
Sent: Monday, February 11, 2008 1:09 PM
To: r-help@r-project.org
Subject: [R] Viable Approach to Parallel R?

All,

We are researching approaches to parallel R with the end goal of running
R in a distributed manner on a Linux cluster. We expect of course to do
some work decomposing our problems to be task-parallel or data-parallel,
but wouldn't mind getting an initial boost working with embarrassingly
parallel code sections and one of the approaches below. 

Incidentally, our environment includes R 2.6.1, RHEL 5.1, Solaris 10, SGE
(Sun Grid Engine) and OpenMPI 1.2.4 (SunHPC 7.1).

In researching previous work, the most promising approaches seem to be:

A. Snow (with Rmpi or Rpvm) (as described in
http://www.r-project.org/useR-2006/Slides/Harrington+Salibian-Barrera.pdf
from the 2006 R User Conference)

It is my understanding that this approach is viable, and works with
OpenMPI 1.2.4. Is anyone using this method with good results?

B. taskpR, RScaLAPACK, pMatrix

I read a paper
http://sdm.lbl.gov/sdmcenter/projects/SDM.center.parallel.r.2-pager.4.doc
coming out of the ORNL, describing what they call parallel R, which
included taskpR, RScaLAPACK, pMatrix. I notice that taskpR is no longer
available in contrib, nor is pMatrix.

An old link indicates the packages are available at
http://www.ASPECT-SDM.org/Parallel-R but that site displays a notice
that the server is migrating. Has this work been discontinued? Anyone
using this? I see RScaLAPACK is still available, from reading the above
it seems that was bundled with taskpR. Does it function without the
other components? (Guess I'll try it and find out :)

C. Sleigh & NetworkSpaces

I see that SCAI (Scientific Computing Associates) offers a parallel R
package based on something they call NetworkSpaces and Sleigh
(inspired by Snow). They sell services around the product but it is open
source. They have an enhanced version that they sell & support.
http://www.lindaspaces.com/hp/BenchmarksWithCharts.pdf. Has anyone
investigated this approach or its open source components?

TIA for any information, direction, suggestions, and if I've missed any
other approaches please advise.

Dan Lewis






Re: [R] Using R in a university course: dealing with proposal comments

2008-02-11 Thread Neil Shephard



Neil Shephard wrote:
 
 (Most) of this problem isn't negated when using R.  Start a new job and
 use the (excellent, extensible, and free) software that you've been using
 for years.
 

Apologies for the double negative, that should have read

(Most) of this problem _is_ negated when using R.

Neil
-- 
View this message in context: 
http://www.nabble.com/Using-R-in-a-university-course%3A-dealing-with-proposal-comments-tp15405138p15416301.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] how to generate a column based on other columns in a data frame

2008-02-11 Thread Peter Dalgaard
Henrique Dallazuanna wrote:
 Try this:

 x2 <- merge(x, cbind(unique(x), Site=sprintf("S%d",
 seq_len(nrow(unique(x))))), by=c("X", "Y"))
 x2[order(x2$Site), ]
   
That was (close to) my first thought as well. But what about

site <- with(x, interaction(X,Y, drop=TRUE))
levels(site) <- paste("S", seq_len(length(levels(site))), sep="")
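
For illustration, the two lines above applied to the coordinates quoted
below (a sketch; the data frame is typed in by hand from that listing and
contains only the X and Y columns):

```r
x <- data.frame(
  X = c(36.435, 36.435, 36.435, 35.329, 35.329,
        36.431, 36.431, 35.421, 35.421, 35.421),
  Y = c(30.118, 30.118, 30.118, 29.657, 29.657,
        30.111, 30.111, 29.797, 29.797, 29.797)
)

# One factor level per distinct (X, Y) pair; drop = TRUE discards the
# combinations that never occur in the data.
site <- with(x, interaction(X, Y, drop = TRUE))
levels(site) <- paste("S", seq_len(nlevels(site)), sep = "")

x$site <- site
x
```

Note that the numbering follows the order of the factor levels, not the
order in which the sites first appear in the rows.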

-p

 On 11/02/2008, Weidong Gu [EMAIL PROTECTED] wrote:
   
 HI,



 I am working on a data set with multiple collections of mosquitoes at
 sampling sites. Each row represents a collection of individual samples
 with coordinates for each collection.

 ... X,  Y,...
 1  36.435 30.118
 2  36.435 30.118
 3  36.435 30.118
 4  35.329 29.657
 5  35.329 29.657
 6  36.431 30.111
 7  36.431 30.111
 8  35.421 29.797
 9  35.421 29.797
 10 35.421 29.797



 Unfortunately, there is no 'site' entry. I would like to add a column of
 'site' based on the coordinates of samples so that samples from the same
 sites have the same site ID like S1, S2,



 How to do this in R way? Thanks.





 Weidong Gu,

 Department of Medicine
 University of Alabama, Birmingham
 1900 University Blvd., Birmingham, Alabama 35294
 Email: [EMAIL PROTECTED]
 PH: (205)-975-9053





 


   


-- 
   O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark  Ph:  (+45) 35327918
~~ - ([EMAIL PROTECTED])  FAX: (+45) 35327907



Re: [R] [OT] good reference for mixed models and EM algorithm

2008-02-11 Thread Ravi Varadhan
Hi Doug & Ted,

The multivariate Aitken accelerator suggested by Ted is numerically
ill-conditioned.  I have written a globally-convergent, general-purpose EM
accelerator that works well. It is quite simple to implement for any EM-type
algorithm (e.g. ECM, ECME which are all monotone in likelihood).  My paper
on that should be coming out soon in Scandinavian J of Stats.  I would be
interested in helping with its implementation for EM acceleration in large
data sets with non-nested random effects.  

Best,
Ravi.


---

Ravi Varadhan, Ph.D.

Assistant Professor, The Center on Aging and Health

Division of Geriatric Medicine and Gerontology 

Johns Hopkins University

Ph: (410) 502-2619

Fax: (410) 614-9625

Email: [EMAIL PROTECTED]

Webpage:  http://www.jhsph.edu/agingandhealth/People/Faculty/Varadhan.html

 




-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On
Behalf Of Ted Harding
Sent: Monday, February 11, 2008 11:19 AM
To: r-help@r-project.org
Subject: Re: [R] [OT] good reference for mixed models and EM algorithm

On 11-Feb-08 15:07:37, Douglas Bates wrote:
 [...]
 Except that Doug Bates doesn't use the EM algorithm for fitting mixed
 models any more.  The lme4 package previously had an option for
 starting with EM (actually ECME, which is a variant of EM) iterations
 but I have since removed it.  For large data sets and especially for
 models with non-nested random effects, the EM iterations just slowed
 things down relative to direct optimisation of the log-likelihood.
 [...]

The raw EM Algorithm can be slow. I have had good success
using Aitken Acceleration for it.

The basic principle is that, once an iterative algorithm
gets to a stage where (approximately)

[A]  (X[n+1] - X) = k*(X[n] - X)

where X[n] is the result at the n-th iteration, -1 < k < 1, and
X is the limit, then you can use recent results to predict the
limit. Taking the above equation literally, along with its
analogue for the next step:

[B]  (X[n+2] - X) = k*(X[n+1] - X)

from which (subtracting [A] from [B])

     k = (X[n+2] - X[n+1])/(X[n+1] - X[n])

and then, solving [A] for X,

[C]  X = (X[n+1] - k*X[n])/(1 - k).

If X is multidimensional (say dimension = p), then k is a
pxp matrix, and you want all its eigenvalues to be less than 1
in modulus. Then you use the matrix analogues of the above
equations, based on (p+1) successive iterations
X[n], X[n+1], ... , X[n+p+1], i.e. on the differences

  c(X[n+1]-X[n], X[n+2]-X[n+1], ... , X[n+p+1]-X[n+p])

I have had good experience with this too!

The best method of proceeding is:

Stage 1: Monitor the sequence {X[n]} until it seems that
  equation [A] is beginning to be approximately true;

Stage 2: Apply equations [A], [B], [C] to estimate X.

Stage 3: Starting at this X, run a few more iterations
  so that you get a better (later) estimate of k, and
  then apply [A], [B], [C] again to re-estimate X.

Repeat stage 3 until happy (or bored).

The EM Algorithm, in most cases, falls into the class
of procedures to which Aitken Acceleration is applicable.

Best wishes to all,
Ted.


E-Mail: (Ted Harding) [EMAIL PROTECTED]
Fax-to-email: +44 (0)870 094 0861
Date: 11-Feb-08   Time: 16:18:45
-- XFMail --



Re: [R] Using R in a university course: dealing with proposal comments

2008-02-11 Thread Paul Gilbert
Stas Kolenikov wrote:
 ...
 Training researchers of tomorrow might be great, but if your students get on
 the market in the end of the semester, they won't have the luxury of waiting
 until R becomes THE package of choice.
   
Not being a teacher, I usually follow these discussions with a bit of 
amusement and some befuddlement. We hire young people hoping they will 
bring in bright new ideas from academia, and academics are training the 
students based on what they think are the old things we use.   
Fortunately, R is already one of the packages of choice many places.

Another point that needs more emphasis is that R is actually a 
programming language, like Matlab and and APL, so it really has more 
general usefulness than statistics packages that one might use in the 
narrower context of a statistics course.

Paul Gilbert




Re: [R] [OT] good reference for mixed models and EM algorithm

2008-02-11 Thread Ted Harding
On 11-Feb-08 15:07:37, Douglas Bates wrote:
 [...]
 Except that Doug Bates doesn't use the EM algorithm for fitting mixed
 models any more.  The lme4 package previously had an option for
 starting with EM (actually ECME, which is a variant of EM) iterations
 but I have since removed it.  For large data sets and especially for
 models with non-nested random effects, the EM iterations just slowed
 things down relative to direct optimisation of the log-likelihood.
 [...]

The raw EM Algorithm can be slow. I have had good success
using Aitken Acceleration for it.

The basic principle is that, once an iterative algorithm
gets to a stage where (approximately)

[A]  (X[n+1] - X) = k*(X[n] - X)

where X[n] is the result at the n-th iteration, -1 < k < 1, and
X is the limit, then you can use recent results to predict the
limit. Taking the above equation literally, along with its
analogue for the next step:

[B]  (X[n+2] - X) = k*(X[n+1] - X)

from which (subtracting [A] from [B])

     k = (X[n+2] - X[n+1])/(X[n+1] - X[n])

and then, solving [A] for X,

[C]  X = (X[n+1] - k*X[n])/(1 - k).

If X is multidimensional (say dimension = p), then k is a
pxp matrix, and you want all its eigenvalues to be less than 1
in modulus. Then you use the matrix analogues of the above
equations, based on (p+1) successive iterations
X[n], X[n+1], ... , X[n+p+1], i.e. on the differences

  c(X[n+1]-X[n], X[n+2]-X[n+1], ... , X[n+p+1]-X[n+p])

I have had good experience with this too!

The best method of proceeding is:

Stage 1: Monitor the sequence {X[n]} until it seems that
  equation [A] is beginning to be approximately true;

Stage 2: Apply equations [A], [B], [C] to estimate X.

Stage 3: Starting at this X, run a few more iterations
  so that you get a better (later) estimate of k, and
  then apply [A], [B], [C] again to re-estimate X.

Repeat stage 3 until happy (or bored).

The EM Algorithm, in most cases, falls into the class
of procedures to which Aitken Acceleration is applicable.
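
A minimal scalar sketch of the extrapolation step described above. The
function name aitken() and the toy linear map g() are made up for the
illustration; this is not an EM implementation, only the acceleration
formula applied to three successive iterates:

```r
# Aitken extrapolation from three successive scalar iterates x0, x1, x2.
aitken <- function(x0, x1, x2) {
  k <- (x2 - x1) / (x1 - x0)   # estimated rate: ratio of successive steps
  (x2 - k * x1) / (1 - k)      # solve (x2 - X) = k * (x1 - X) for X
}

# Toy fixed-point iteration with known limit 2 and exact rate k = 0.5.
g <- function(x) 0.5 * x + 1

x0 <- 0
x1 <- g(x0)
x2 <- g(x1)

est <- aitken(x0, x1, x2)
est
```

Because the toy map is exactly linear, the extrapolated value lands on the
limit in one step; for a real EM sequence the relation holds only
approximately, hence the repeated Stage 3 above.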

Best wishes to all,
Ted.


E-Mail: (Ted Harding) [EMAIL PROTECTED]
Fax-to-email: +44 (0)870 094 0861
Date: 11-Feb-08   Time: 16:18:45
-- XFMail --



Re: [R] Length problem

2008-02-11 Thread Henrique Dallazuanna
I think that coppie is a list, so

length(coppie[[1]])
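
A small sketch of the difference, using a made-up five-row data frame in
place of the .dta file (the column names are placeholders):

```r
data <- data.frame(a = 1:5, b = 6:10, c = 11:15, d = 16:20, e = 21:25)

# c() on a data frame drops the data.frame class and returns a plain list
# with one element per selected column.
coppie <- c(data[4:length(data)])

class(coppie)
length(coppie)        # number of columns kept
length(coppie[1])     # single brackets: a list holding one column
length(coppie[[1]])   # double brackets: the column itself, one entry per row
```

Dropping the c() and keeping data[4:length(data)] as a data frame would let
nrow() and the [ , j] indexing work as before.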

On 11/02/2008, Paolo Grillo [EMAIL PROTECTED] wrote:

Hi all
I have this problem:
In my .dta database, called data, I have five rows
data <- read.dta("C:\\2_CO_mmobile_ALL_Rid.dta")
# From this database I would like to create another
coppie <- c(data[4:length(data)])
but I find this

# Length of original data
length(data[,4])
5   RIGHT!!
# Length of new data
length(coppie[1])
1  WHY??
Thank you all for your help
Paolo Grillo



-- 
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40 S 49° 16' 22 O



Re: [R] genetics package not working

2008-02-11 Thread Farrel Buchinsky
Finally I found something that provides lower level examples.
I was looking around the genetics package. I came across 
write.pop.file(genetics) and there I found the format of 'pedigree' files is 
documented at http://www.sph.umich.edu/csg/abecasis/GOLD/docs/pedigree.html

That reference lays out exactly the format that is being used.


Farrel Buchinsky [EMAIL PROTECTED] wrote in message 
news:[EMAIL PROTECTED]
 From crawling around the internet it appears to me as if genetics has 
 given
 way to GeneticsBase and is part of bioconductor. The basic data structure 
 has changed to something called geneSet class. There is a pdf document 
 that promises to help me. 
 http://www.bioconductor.org/packages/2.1/bioc/vignettes/GeneticsBase/inst/doc/SummaryTables.pdf.
  
 Unfortunately it does not. My dataset which was created using genetics 
 package does not seem to fit (or should I say does not seem to easily 
 fit) the read in formats demonstrated in the document: standard pedigree 
 format, hapmap format, Pfizer format, Perlegen format.

 Can anyone point me to a resource with lower level instructions and 
 examples?

 My format is as follows (rs numbers are not correct but do not worry about 
 that detail)
 str(ped.seq[,2:15])
 'data.frame':   608 obs. of  14 variables:
 $ pedigree  : int  1 1 2 3 3 4 4 5 6 6 ...
 $ id: Factor w/ 30 levels 1,2,3,4,..: 3 2 3 3 2 3 2 3 3 2 
 ...
 $ id.father : int  1 0 1 1 0 1 0 1 1 0 ...
 $ id.mother : int  2 0 2 2 0 2 0 2 2 0 ...
 $ PtCode: Factor w/ 608 levels AJM16001FA,AJM16001MO,..: 74 73 77 
 117 116 80 79 83 86 85 ...
 $ HS.nr : int  32940 32941 32960 32963 32964 32967 32968 32970 32972 
 32973 ...
 $ affected  : int  2 1 2 2 1 2 1 2 2 1 ...
 $ sex   : int  2 2 1 1 2 1 2 2 2 2 ...
 $ rs11684: Factor w/ 1 level C/C: 1 1 1 1 1 1 1 1 1 1 ...
  ..- attr(*, allele.names)= chr C
  ..- attr(*, allele.map)= chr [1, 1:2] C C
 $ rs1144: Factor w/ 3 levels A/A,G/A,G/G: 3 3 3 3 3 2 3 3 3 3 ...
  ..- attr(*, allele.names)= chr  G A
  ..- attr(*, allele.map)= chr [1:3, 1:2] A G G A ...
 $ rs120: Factor w/ 2 levels A/A,A/G: 1 1 1 1 1 1 1 1 1 1 ...
  ..- attr(*, allele.names)= chr  A G
  ..- attr(*, allele.map)= chr [1:2, 1:2] A A A G



 Farrel Buchinsky [EMAIL PROTECTED] wrote in message 
 news:[EMAIL PROTECTED]
 Has something changed in R that requires an update in the genetics 
 package
 by Gregory Warnes? I am using R version 2.5.0
 This used to work
 summary(founders[,59])

 to prove that it is  a genotype class
 class(founders[,59])
 [1] genotype factor

 Now when I issue the command:
 summary(founders[,59])

 I get:

 Error in attr(retval, which) <- which : attempt to set an attribute on NULL
 In addition: Warning message:
 $ operator is deprecated for atomic vectors, returning NULL in:
 x$allele.names

 Clearly, I am missing something. What am I missing?

 -- 
 Farrel Buchinsky



Re: [R] [OT] good reference for mixed models and EM algorithm

2008-02-11 Thread Douglas Bates
On Feb 10, 2008 2:32 PM, Spencer Graves [EMAIL PROTECTED] wrote:
 Hi, Erin:

   Have you looked at Pinheiro and Bates (2000) Mixed-Effects Models
 in S and S-Plus (Springer)?

   As far as I know, Doug Bates has been the leading innovator in
 this area for the past 20 years.  Pinheiro was one of his graduate
 students.  The 'nlme' package was developed  by him or under his
 supervision, and 'lme4' is his current development platform.  The
 ~R\library\scripts subdirectory contains ch01.R, ch02.R, etc. =
 script files to work the examples in the book (where ~R = your R
 installation directory).  There are other good books, but I recommend
 you start with Pinheiro and Bates.

Except that Doug Bates doesn't use the EM algorithm for fitting mixed
models any more.  The lme4 package previously had an option for
starting with EM (actually ECME, which is a variant of EM) iterations
but I have since removed it.  For large data sets and especially for
models with non-nested random effects, the EM iterations just slowed
things down relative to direct optimization of the log-likelihood.


   Spencer Graves

 Erin Hodgess wrote:
  Dear R People:
 
  Sorry for the off-topic.  Could someone recommend a good reference for
  using the EM algorithm on mixed models, please?
 
  I've been looking and there are so many of them.  Perhaps someone here
  can narrow things down a bit.
 
  Thanks in advance,
  Sincerely,
  Erin
 




Re: [R] building packages for Linux vs. Windows

2008-02-11 Thread Paul Gilbert
Erin Hodgess wrote:
 Hi R People:

 I'm sure that this is a really easy question, but here goes:

 I'm trying to build a package that will run on both Linux and Windows.

 However, there are several commands in a section that will be
 different in Linux than they are in Windows.
   
Erin

Several people have indicated how to do this, but I encourage you to be 
sure you really need to do it. Many things can be made to work the same 
way on all OSs, and packages are much easier to maintain if you do not 
have several variants.  You might consider posting a few examples of 
where you find this necessary, and ask if there is an OS independent way 
to do it.

Paul Gilbert
 Would I be better off just to build two separate packages, please?
 If just one is needed, how could I determine which system is running
 in order to use the correct command, please?

 Thanks in advance,
 Erin
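
A minimal sketch of both ideas in this exchange: detecting the OS at run time, and avoiding the branch altogether with OS-independent base R functions. The helper name listing_command is hypothetical and the "dir"/"ls" pair is only a stand-in for whatever commands actually differ in the package.

```r
# Hypothetical helper: pick a directory-listing shell command for the current OS.
# .Platform$OS.type is "windows" on Windows and "unix" on Linux and macOS.
listing_command <- function() {
  if (.Platform$OS.type == "windows") "dir" else "ls"
}
listing_command()

# Often no branch is needed: base R already wraps many OS-specific tasks.
tmp <- file.path(tempdir(), "example.txt")   # builds the path portably
writeLines("hello", tmp)                      # create a small file
file.exists(tmp)                              # check it without a shell call
list.files(tempdir(), pattern = "^example\\.txt$")
unlink(tmp)                                   # remove it again
```

With a single run-time test like this, one package can serve both systems; two separate packages should not be necessary.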


   




[R] Tinn-R not working well with latest R

2008-02-11 Thread Schmitt, Corinna

 
Hello,

I had the same problems before. I think the best solution is simply to copy 
the code you need out of Tinn-R with Ctrl+C. Then open R directly from your 
desktop, NOT from Tinn-R, and paste in the command. As long as you have not 
pressed Enter, you can still edit the command: use the arrow keys to move the 
cursor to the place you want in the command line and change it.

Hope that is what you want. I cannot reproduce your example.

Corinna






[R] nlme special case of corARMA?

2008-02-11 Thread Reid Landes
Dear All:

I am trying to fit a special case of a 2-banded Toeplitz correlation
structure.  A 2-banded Toeplitz has ones on the diagonal, a
correlation, RHO1, on the first off-diagonal, and a correlation, RHO2,
on the second off-diagonal, with zeros on all subsequent
off-diagonals.  After reading relevant sections in Mixed-Effects
Models in S and S-PLUS (Pinheiro & Bates, 2000) and searching on the
R-help archives, I've figured out how to get the 2-banded Toeplitz,
but not the desired special case.

In the example below, the initial value RHO1 = 0 and RHO2= -0.3.  The
output matrix below is an example of the special case I'd like to fit
-- a 2-banded Toeplitz constraining RHO1=0.

Fitting the 2-banded Toeplitz structure to the ``Orthodont'' example
dataset provided in R-Help, we estimate RHO1 also (since the example
matrix below contains INITIAL values).

--- Start R-code & output ---

> # This initializes a 2-banded Toeplitz structure
> cs1ARMA <- corARMA(value = c(0, -.3), form = ~ 1 | Subject, p = 2, q = 0)
> cs1ARMA <- Initialize(cs1ARMA, data = Orthodont)
> corMatrix(cs1ARMA)$M01

     [,1] [,2] [,3] [,4]
[1,]  1.0  0.0 -0.3  0.0
[2,]  0.0  1.0  0.0 -0.3
[3,] -0.3  0.0  1.0  0.0
[4,]  0.0 -0.3  0.0  1.0

> TOEP2 <- gls(distance ~ Sex * I(age - 11), Orthodont,
+              correlation = corARMA(value = c(0, -.3),
+                                    form = ~ 1 | Subject, p = 2, q = 0),
+              weights = varIdent(form = ~ 1 | age))

# --- Selected output follows ---
Correlation Structure: ARMA(2,0)
 Formula: ~1 | Subject
 Parameter estimate(s):
     Phi1      Phi2
0.3269544 0.4897645

--- End R-code & output ---

I cannot figure out how to restrict RHO1 = 0, while allowing
estimation of RHO2.
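
As a cross-check on the matrix shown above (not a solution to the constraint problem), the correlations implied by an AR(2) process with Phi1 = 0 and Phi2 = -0.3 can be computed with ARMAacf() from the stats package. This is only a sketch; the variable names rho and R4 are made up for illustration.

```r
# Autocorrelations implied by an AR(2) process with Phi1 = 0 and Phi2 = -0.3.
rho <- unname(ARMAacf(ar = c(0, -0.3), lag.max = 5))
round(rho, 3)        # correlations at lags 0 to 5

# With four occasions per subject only lags 0 to 3 enter the matrix.
R4 <- toeplitz(rho[1:4])
R4
```

The 4 x 4 matrix matches the corMatrix() output, but the lag-4 correlation is Phi2^2 = 0.09 rather than zero. So the AR(2) structure looks 2-banded here only because Orthodont has four occasions per subject; with five or more occasions it would not be strictly banded.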

Maybe an answer lies in specifying a different ``position vector''
other than the default: corARMA(..., form = ~ 1 | Subject ...).  (See
p. 226 of Pinheiro & Bates, 2000, for an explanation of a position vector.)
But I'm not totally sure if I understand the position vector and I
know I don't know how it works in R.

Then again, there is likely a completely different way to solve this
problem.

Any help will be appreciated!

Reid



Re: [R] Using R in a university course: dealing with proposal comments

2008-02-11 Thread Neil Shephard



Arin Basu-3 wrote:
 
 Comment 2:
 
 Finally, on a minor point, why is R the statistical software being
 used? SPSS is probably more widely available in the workplace –
 certainly in areas of social policy etc.  (Prof NB)
 
 

What struck me in the above is the "probably".  How probable is it? Is there
anything to substantiate the claim?

Anyway, whether one package is more widely available in the workplace than
another is somewhat of a moot point.  If a student learns how to use one
software package then they start to get pigeon-holed into using that
particular software package.

Many jobs are advertised with SPSS/SAS/Stata/S-Plus (add/subtract at will)
skills/knowledge required (or at least desirable).  The prospective job
applicant may think "Well, I don't know how to use that, so I shan't bother
applying", or they may be unwilling to learn a new stats
package after months/years of investment in learning how to use another
package; alternatively, they may well just lose out to someone who already
has the experience/skills.

(Most) of this problem isn't negated when using R.  Start a new job and use
the (excellent, extensible, and free) software that you've been using for
years.

I'd stick with using R to teach your statistics, in the long-run any of them
who continue to perform statistical analysis will be grateful.

Neil



[R] Gini index of frequencies in a data frame

2008-02-11 Thread m. teodosiu

Dear All,
I wish to calculate the Gini index (ineq() from the package of the same name) 
and some other indices for the diameter distribution of each plot (data frame 
dgtot).

dgtot:
     IDPlot Diameter(cm)
1         4         34.0
2         4         23.0
3         4         38.0
...
51        5         16.0
52        5          8.0
53        5          9.0
...
5301    140         25.0
5302    140         12.0
5303    140          7.0

I use:
 aggregate(dgtot, by = list(dgtot$IDSupr), FUN = ineq(dsp))

where
dsp <- function(x) # compute frequency distribution for each plot
{
 cd <- seq(5, max(x), by = 2)
 Fi <- table(cut(x, br = seq(5, max(x) + 1, 2), right = FALSE))
 K <- length(names(Fi))
}

but, the result was:
Error in sort.int(x, na.last = na.last, decreasing = decreasing, ...) : 'x' 
must be atomic
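
A likely cause of this error is that FUN = ineq(dsp) calls ineq() immediately with the function dsp as its data, instead of handing aggregate() a function to apply to each plot. Note also that dsp as written returns K, the number of classes, not the frequencies. Below is a minimal sketch of a per-plot calculation. The Gini formula is written out by hand so that the sketch has no package dependency, the column names IDPlot and Diameter are assumptions (the call above uses IDSupr), and the data are just the nine rows shown in the post.

```r
# Gini coefficient of a non-negative numeric vector (no small-sample correction).
gini_coef <- function(x) {
  x <- sort(x)
  n <- length(x)
  (2 * sum(x * seq_len(n)) / sum(x) - (n + 1)) / n
}

# The nine rows shown in the post; the real data frame is much longer.
dgtot <- data.frame(
  IDPlot   = c(4, 4, 4, 5, 5, 5, 140, 140, 140),
  Diameter = c(34, 23, 38, 16, 8, 9, 25, 12, 7)
)

# Pass the function itself as FUN; do not call it inside the argument.
gini_by_plot <- aggregate(Diameter ~ IDPlot, data = dgtot, FUN = gini_coef)
gini_by_plot
```

If the index should be computed on class frequencies rather than on raw diameters, the same pattern applies: wrap the table(cut(...)) step and the index in one function that returns a single number, and pass that function as FUN.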

I'm at the beginning in R and I kindly request your experienced help.

Thank you,
Marius Teodosiu



Re: [R] tree() producing NA's

2008-02-11 Thread Prof Brian Ripley
Take a look at the levels of 'owner'.

On Mon, 11 Feb 2008, Amnon Melzer wrote:

 Hi



 Hoping someone can help me (a newbie).



 I am trying to construct a tree using tree() in package tree. One of the
 fields is a factor field (owner), with many levels. In the resulting tree, I
 see many NA's (see below), yet in the actual data there are none.

You are misinterpreting this: those are level names.

Using a tree with a factor with many levels is a very bad idea: it takes a 
long time to compute (unless the response is binary) and almost surely 
overfits.
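
A minimal sketch of what "take a look at the levels" can involve, using a made-up factor in place of rr200$owner (the real data are not available here). It shows how to inspect levels, drop unused ones after subsetting, and lump rare levels together; the threshold of two observations is arbitrary.

```r
# Made-up stand-in for a factor such as rr200$owner.
owner <- factor(c("A", "A", "B", "C", "D", "D", "D"))

# Subsetting keeps every original level, even those with no rows left.
sub <- owner[owner != "C"]
levels(sub)      # still lists "C"
nlevels(sub)     # number of levels, used or not
table(sub)       # observations per level; "C" has a count of 0

# Drop levels that no longer occur.
sub2 <- droplevels(sub)

# Lump levels with fewer than 2 observations into a single "Other" level.
counts <- table(sub2)
rare <- names(counts)[counts < 2]
lumped <- sub2
levels(lumped)[levels(lumped) %in% rare] <- "Other"
table(lumped)
```

Reducing the number of levels this way also addresses the computing-time and overfitting concerns mentioned above.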



 rr200.tr <- tree(backprof ~ ., rr200)

 rr200.tr

 1) root 200 1826.00 -0.2332

 ...

 [snip]

 ...

5) owner: Cliveden Stud,NA,NA,NA,NA,NA,NA,NA,NA 10   14.25  1.5870 *

  3) owner: B E T Partnership,Flaming Sambuca
 Syndicate,NA,NA,NA,NA,NA,NA,NA,NA 11  384.40 10.5900

6) decodds < 12 5   74.80  6.3000 *

7) decodds > 12 6  140.80 14.1700 *



 Can anyone tell me why this happens and what I can do about it?

Well, you could follow the request at the footer of this and every R-help 
message.



 Regards



 Amnon






-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Any one is porting or has ported prtools (http://www.prtools.org/) to R ?

2008-02-11 Thread Kwang Loong Stanley Ng
Hi, 

Any one is porting or has ported prtools (http://www.prtools.org/) to R ? 

Thanks
Stanley



Re: [R] scatterplot in CAR

2008-02-11 Thread John Fox
Dear Aimin,

 -Original Message-
 From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
 project.org] On Behalf Of Aimin Yan
 Sent: February-11-08 7:22 AM
 To: r-help@r-project.org
 Subject: [R] scatterplot in CAR
 
 I am trying to use scatterplot function in CAR like the following:
 
 scatterplot(X~Y)
 
 I want to label X points and Y points using different colors.
 Any idea for this?
 
 Aimin

I'm afraid that I don't understand the question: scatterplot(X~Y) will make
a scatterplot with the variable X on the vertical axis and Y on the
horizontal axis. Did you really want to do that? Moreover, as in any
scatterplot, the variables Y and X will define the coordinates of the points
-- there are not distinct X points and Y points.
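
To illustrate the point about axes, here is a minimal sketch in base graphics (used instead of car's scatterplot() only to keep the example dependency-free). The data and the grouping factor g are made up; coloring by a grouping variable is one guess at what the original question may have been after.

```r
set.seed(1)                                  # made-up data for illustration
x <- rnorm(20)
y <- 2 * x + rnorm(20)
g <- factor(rep(c("a", "b"), each = 10))     # hypothetical grouping variable

# In y ~ x, the variable left of ~ goes on the vertical axis and the
# variable right of ~ on the horizontal axis. Each point is one (x, y) pair.
cols <- c("blue", "red")[as.integer(g)]      # one color per point, by group
plot(y ~ x, col = cols, pch = 19)
legend("topleft", legend = levels(g), col = c("blue", "red"), pch = 19)
```

Colors can therefore distinguish groups of observations, but not "X points" from "Y points", since every point carries both coordinates.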

Regards,
 John


John Fox, Professor
Department of Sociology
McMaster University
Hamilton, Ontario, Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox



[R] scatterplot in CAR

2008-02-11 Thread Aimin Yan
I am trying to use scatterplot function in CAR like the following:

scatterplot(X~Y)

I want to label X points and Y points using different colors.
Any idea for this?

Aimin



Re: [R] The function predict

2008-02-11 Thread Dieter Menne
Carla Rebelo <crebelo at liaad.up.pt> writes:

 May you help me? I need to understand the function predict. I need to 
 understand the algorithm implemented, the calculations associated. Where 
 can I find this information?

In the documentation: 

predict is a generic function for predictions from the results of various model
fitting functions. The function invokes particular methods which depend on the
class of the first argument.

There is no information available for the default predict function, but there is
information for the predict.XXX implementations mentioned further below:

See Also

predict.glm, predict.lm, predict.loess, predict.nls, predict.poly,
predict.princomp, predict.smooth.spline.

For time-series prediction, predict.ar, predict.Arima, predict.arima0,
predict.HoltWinters, predict.StructTS.

For details, you should look into the examples provided with predict.lm (as the
simplest starter), and the code.
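
A minimal sketch of that suggestion, using the cars data set that ships with R. It shows that predict() dispatches on the class of the fitted object, that for a linear model the prediction is just the fitted linear combination, and where to look for the code. The model is chosen only for illustration.

```r
fit <- lm(dist ~ speed, data = cars)    # 'cars' ships with R
class(fit)                               # "lm", so predict() dispatches to predict.lm

new <- data.frame(speed = c(10, 20))
p <- predict(fit, newdata = new)

# For a linear model the prediction is intercept + slope * new value.
manual <- coef(fit)[["(Intercept)"]] + coef(fit)[["speed"]] * new$speed
all.equal(unname(p), manual)

# Reading the code: the generic is tiny; the work happens in the methods.
predict                                  # prints the generic, a call to UseMethod
methods(predict)                         # lists the predict.* methods available
# getAnywhere("predict.lm") prints the source of the lm method.
```

The calculations for other model classes differ, so the method for the class of the fitted object is the place to look.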

Dieter
