date:20090426

1. ?merge

2. sqldf package whose home page is at:
http://sqldf.googlecode.com
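
As a hedged illustration of option 1, here is a minimal sketch of an inner join on C using base R's merge(). The data frames `left` and `right` and their values are made up for the example; only the column names A to E come from the question.

```r
# Two toy tables sharing the key column C (values are invented for illustration)
left  <- data.frame(A = 1:3, B = c("x", "y", "z"), C = c(10, 20, 30))
right <- data.frame(C = c(20, 30, 40), D = c(TRUE, FALSE, TRUE), E = c(2.5, 3.5, 4.5))

# merge() keeps only rows whose C value occurs in both tables (an inner join)
joined <- merge(left, right, by = "C")

# merge() puts the key column first; reorder to A, B, C, D, E as requested
joined <- joined[, c("A", "B", "C", "D", "E")]
joined
```

Passing all = TRUE, all.x = TRUE or all.y = TRUE to merge() would give outer, left or right joins instead.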

On Sat, Apr 25, 2009 at 9:15 PM, Nigel Birney na...@cam.ac.uk wrote:

 Hello all,

 Apologize for the newbie question. What's the easiest way to do a SQL inner
 table join in R?

 Say I have a table containing column names A, B, C and another which has
 columns named C, D, E. I would like to do an inner table join on C and
 produce a table A, B, C, D, E.

 thanks a lot,

 N.
 --
 View this message in context: 
 http://www.nabble.com/THE-EQUIVALENT-OF-SQL-INNER-TABLE-JOIN-IN-R-tp23238179p23238179.html
 Sent from the R help mailing list archive at Nabble.com.

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.



[R] Stochastic Gradient Ascent for logistic regression

2009-04-26 Thread Tim LIU




Hi guys,

I am trying to write my own Stochastic Gradient Ascent for logistic 
regression in R, but it seems that I am having a convergence problem.

Am I doing anything wrong, or is the data just off?

Here is my code in R -



lbw <- read.table("http://www.biostat.jhsph.edu/~ririzarr/Teaching/754/lbw.dat",
                  header=TRUE)

attach(lbw)

> lbw[1:2,]
  low age lwt race smoke ptl ht ui ftv  bwt
1   0  19 182    2     0   0  0  1   0 2523
2   0  33 155    3     0   0  0  0   3 2551

#-R implementation of logistic regression : gradient descent --
sigmoid <- function(z)
{
  1/(1 + exp(-1*z))
}

X <- cbind(age, lwt, smoke, ht, ui)

#y <- low

my_logistic <- function(X, y)
{
  alpha <- 0.005
  n <- 5
  m <- 189
  max_iters <- 189 #number of obs

  ll <- 0

  X <- cbind(1, X)

  theta <- rep(0, 6) # intercept and 5 regressors
  #theta <- c(1.39, -0.034, -0.01, 0.64, 1.89, 0.88) #glm estimates as starting values
  theta_all <- theta
  for (i in 1:max_iters)
  {
    dim(X)
    length(theta)
    hx <- sigmoid(X %*% theta) # matrix product

    ix <- i

    for (j in 1:6)
    {
      theta[j] <- theta[j] + alpha * ((y-hx)[ix]) * X[ix,j] #stochastic gradient !
    }

    logl <- sum( y * log(hx) + (1 - y) * log(1 - hx) ) #direct multiplication

    ll <- rbind(ll, logl)

    theta_all = cbind(theta_all, theta)
  }

  par(mfrow=c(4,2))

  plot(na.omit(ll[,1]))
  lines(ll[,1])

  for (j in 1:6)
  {
    plot(theta_all[j,])
    lines(theta_all[j,])
  }

  #theta_all
  #ll
  cbind(ll, t(theta_all))
}


my_logistic(X, low)
==


The parameter estimates jumped after 130+ iterations...

It does not converge even when I use the parameter estimates from
glm (family=binomial) as starting values.


help!
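
One possible reading of the symptoms, offered as a guess rather than a diagnosis: age and lwt are on a much larger scale than the 0/1 regressors, so a fixed step of 0.005 can overshoot, and a single pass of 189 updates is rarely enough for a stochastic method. The sketch below is not the poster's code. It assumes simulated data in place of lbw.dat, standardised regressors, several passes over the data in random order, and illustrative names (sga_logistic, loglik, alpha, epochs). The resulting coefficients are on the standardised scale.

```r
set.seed(1)
sigmoid <- function(z) 1 / (1 + exp(-z))

# Stochastic gradient ascent on the logistic log-likelihood
sga_logistic <- function(X, y, alpha = 0.05, epochs = 50) {
  Xs <- cbind(1, scale(X))            # intercept plus standardised regressors
  theta <- rep(0, ncol(Xs))
  for (e in seq_len(epochs)) {
    for (i in sample(nrow(Xs))) {     # visit the observations in random order
      hx_i <- sigmoid(sum(Xs[i, ] * theta))
      theta <- theta + alpha * (y[i] - hx_i) * Xs[i, ]   # one-observation update
    }
  }
  theta
}

# Full-data log-likelihood, used only to check progress
loglik <- function(theta, Xs, y) {
  p <- sigmoid(Xs %*% theta)
  sum(y * log(p) + (1 - y) * log(1 - p))
}

# Simulated stand-in for two of the regressors (assumption: not the real data)
n <- 200
X <- cbind(age = rnorm(n, 25, 5), lwt = rnorm(n, 130, 30))
y <- rbinom(n, 1, sigmoid(-0.5 + 0.8 * scale(X)[, 1] - 0.5 * scale(X)[, 2]))

theta_hat <- sga_logistic(X, y)
theta_hat
```

The estimates can be compared with coef(glm(y ~ scale(X), family = binomial)); with a constant step size they will hover near, not exactly at, the glm values.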




-- 
View this message in context: 
http://www.nabble.com/Stochastic-Gradient-Ascent-for-logistic-regression-tp23239378p23239378.html
Sent from the R help mailing list archive at Nabble.com.


Re: [R] THE EQUIVALENT OF SQL INNER TABLE JOIN IN R

2009-04-26 Thread Peter Dalgaard


Nigel Birney wrote:

Hello all,

Apologize for the newbie question. What's the easiest way to do a SQL inner
table join in R? 


Say I have a table containing column names A, B, C and another which has
columns named C, D, E. I would like to do an inner table join on C and
produce a table A, B, C, D, E.


merge(), perhaps? Otherwise describe what an inner table join does.

-pd

--
   O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark  Ph:  (+45) 35327918
~~ - (p.dalga...@biostat.ku.dk)  FAX: (+45) 35327907


Re: [R] Nomogram with stratified cph in Design package

2009-04-26 Thread reneepark


I'm sorry - I meant a median survival estimate, not a median risk.

I see - I didn't realize that by stratifying it would pool the levels of the
stratified variable. Hm, that is unfortunate, considering the stratified
variable is one that I would like to keep in the nomogram.

Thank you for your help!
~Renee
-- 
View this message in context: 
http://www.nabble.com/Nomogram-with-stratified-cph-in-Design-package-tp23237422p23239686.html
Sent from the R help mailing list archive at Nabble.com.


Re: [R] Question of Quantile Regression for Longitudinal Data

2009-04-26 Thread Tirthankar Chakravarty

This is a nontrivial problem. This comes up often on the Statalist
(-qreg- is for cross-section quantile regression):

You want to fit a plane through the origin using the L-1 norm.
This is not as easy as with L-2 norm (LS), as it is more
than a matter of dropping a constant predictor yet otherwise using the
same criterion of fit. You are placing another constraint on a
problem that already does not have a closed-form solution,
and it does not surprise me that -qreg- does not support this.
(N.J. Cox)
http://www.stata.com/statalist/archive/2007-10/msg00809.html

You will probably have to program this by hand. Note also the
degeneracy conditions in Koenker (2003, pg. 36--). I am not sure how
this extends to panel data though.

References:
@book{koenker2005qre,
title={{Quantile Regression; Econometric Society Monographs}},
author={Koenker, R.},
year={2005},
publisher={Cambridge University Press}
}

On Sun, Apr 26, 2009 at 8:24 AM, Helen Chen 96258...@nccu.edu.tw wrote:

Hi,

I am trying to estimate a quantile regression using panel data. I am trying
to use the model that is described in Dr. Koenker's article, so I use the
code that is posted at the following link:

http://www.econ.uiuc.edu/~roger/research/panel/rq.fit.panel.R

How do I estimate the panel data quantile regression if the regression
contains no constant term? I tried to change the code of rq.fit.panel by
deleting X=cbind(1,x) and would like to know whether that is correct.

Thanks
I really would appreciate some suggestions.
Best
Helen Chen
--
View this message in context:
http://www.nabble.com/Question-of-%22Quantile-Regression-for-Longitudinal-Data%22-tp23239896p23239896.html
Sent from the R help mailing list archive at Nabble.com.


--
To every ω-consistent recursive class κ of formulae there correspond
recursive class signs r, such that neither v Gen r nor Neg(v Gen r)
belongs to Flg(κ) (where v is the free variable of r).

Re: [R] Conditional plot labels

2009-04-26 Thread baptiste auguie


Hi,

Have you considered using high-level plotting functions provided by  
the ggplot2 or lattice package? Here's a dummy example,




x <- seq(0, 10, length=100)
y1 <- sin(x)
y2 <- cos(x)
y3 <- x^2/100
y4 <- 1/x

d <- data.frame(x, y1, y2, y3, y4)

library(reshape)
dm <- melt(d, id="x")

dm$type1 <- rep(LETTERS[1:2], each=2*length(x)) # dummy factors
dm$type2 <- rep(letters[1:2], each=length(x))

library(ggplot2)
p1 <-
qplot(x, value, data=dm, geom="line", facets=type1~type2)

p1 # you can customise the appearance if the default doesn't please you


library(lattice)

p2 <-
xyplot(value~x|type1*type2, data=dm, t="l") # here the strips are on top of each other by default


library(latticeExtra)

useOuterStrips(p2) # this makes the layout more like you want



Alternatively, you can also use raw Grid commands and define your own  
layout where to place the different graphical objects, but it's more  
work.


Hope this helps,

baptiste


On 26 Apr 2009, at 01:31, Christian Bustamante wrote:


Hi all,
I'm trying to do multiple graphs in a window like this:

   ___  ___   ___
ylab  |__|  |__|   |__|
   ___  ___   ___
ylab  |__|  |__|   |__|
   ___  ___   ___
ylab  |__|  |__|   |__|
xl xl xl

If I try to put the labels manually, some graphs become smaller than
others and the output is really ugly.
In the thread title I put the word "conditional" because I'm trying to
write a function, and in that function I want to print the y labels if the
plot position is in the first column of the graph matrix, and the x labels
if the position is in the last row of the matrix.

How can I achieve these two things?

Thanks for your help

--
CdeB



_

Baptiste Auguié

School of Physics
University of Exeter
Stocker Road,
Exeter, Devon,
EX4 4QL, UK

Phone: +44 1392 264187

http://newton.ex.ac.uk/research/emag


Re: [R] Scatterplot of two groups side-by-side?

2009-04-26 Thread baptiste auguie


Hi,

You could do this very easily using ggplot2,


#install.packages("ggplot2", dep=TRUE)

library(ggplot2)
 c <- ggplot(mtcars, aes(y=wt, x=mpg)) + facet_grid(. ~ cyl)
 c + stat_smooth(method="lm") + geom_point()


See more examples on Hadley's website: http://had.co.nz/ggplot2/

Hope this helps,

baptiste


On 26 Apr 2009, at 10:29, nonu...@yahoo.de wrote:


Dear all

I'm really new to R, so I hope you can help me, as I didn't find any
solution in the common books.


For some days I have been trying to create the following plot: a
scatterplot showing two different groups side by side with their
regression lines. Both datasets have only the same five
factors, so the points will form a kind of column at each factor.
When I use "scatterplot" (package "car"), I can plot two groups
in the same graph by using the argument "groups", but the points of
both groups are then plotted on top of each other using different
symbols and can hardly be distinguished. How can I plot them
side by side, so that the groups do not overlap? And how can I give
different colours to the groups and their regression lines?
(This is what I have so far: http://img7.imageshack.us/img7/227/almostgood.jpg)


I tried to use the arguments of "boxplot" to solve this
problem. With that command, it's possible to plot different datasets
side by side by defining the position of the boxes (example: at = 1:5
- 0.4). A second boxplot can then be added by adding
add=TRUE to the call and defining another position. Neither
argument works within the "scatterplot" command.


By the way: it's really necessary to plot the data as points and
not as boxplots. With the command "plot", I cannot plot the data by
groups (I tried it with the arguments "subset" and "groups", but
apparently there is no way to do so).


I'm grateful for every (simple) solution.
Thanks in advance

Karin Schneeberger
MSc-student
University of Berne
Switzerland





_

Baptiste Auguié

School of Physics
University of Exeter
Stocker Road,
Exeter, Devon,
EX4 4QL, UK

Phone: +44 1392 264187

http://newton.ex.ac.uk/research/emag


Re: [R] 3 questions regarding matrix copy/shuffle/compares


Hello David,

Let me try again, I don't think this was the best post I've ever made :-)
Hopefully this is clearer, or otherwise I may break this up into
three separate simple queries as this may be too long.


 == is not an assignment operator in R, so the answer is that it
 would do neither.  <- and = can do assignment. In neither case
 would it be a deep copy.

It was late when I posted the code, I made a mistake with regard to
the assignment operator and used the boolean compare instead -- thanks
for catching that.

It should have been:

keep_pop[1:POP_SIZE] = pop[1:POP_SIZE]


 Here's an edited and clearer version I hope:


The basic idea is that I am trying to keep track of a number of bitstrings.

Therefore I am creating a matrix (named 'pop') whose rows are made up
of bit vectors (i.e. my bitstrings).  I only initialize half of the rows
with my bitstrings of random 1s and 0s (the rest of the rows are set
to all zeros).

So I use following function call to create a matrix and fill it with
bit strings:

   pop=create_pop_2(POP_SIZE, LEN)

where

   POP_SIZE refers to the number of rows
   LEN to the columns (length of my bitstrings)



This is the code I call:


# create a random binary vector of size len
#
create_bin_Chromosome <- function(len)
{
  sample(0:1, len, replace=T)
}



## create_population ###
# create population of chromosomes of length len
# the matrix contains twice as much space as popsize
#
create_pop_2 <- function(popsize, len)
{
  datasize=len*popsize
  print(datasize)
  npop <- matrix(0, popsize*2, len, byrow=T)

  for(i in 1:popsize)
    npop[i,] = create_bin_Chromosome(len)

  npop
}


My 3 questions:

(1) If I did

keep_pop[1:POP_SIZE] = pop[1:POP_SIZE]

to keep a copy of the original data structure before manipulating
'pop' potentially, would this make a deep copy or just shallow? Ie
if I change something in pop would keep_pop change too? I would
like two independent copies so that 'keep_pop' stays intact while
'pop' may change.

 <- and = can do assignment. In neither case would it be a
 deep copy.

Is there a deepcopy operator, or would I have to have two nested
loops and iterate through them? Or is there a nice R-idiomatic way
to do this?


(2) If I wanted to change the order of rows in my matrix 'pop', is
there an easy way to shuffle these?  I.e., I don't want to change
any of the bitstring vectors/rows, just the order of the rows in the
matrix 'pop'. (E.g., in Python I could just say something like
shuffle(pop); is there an equivalent for R?)

So if pop is

  [ [0, 0, 0]
    [1, 1, 1]
    [1, 1, 0] ]

after the shuffle it may look like

  [ [1, 1, 0]    (originally at index 2)
    [1, 1, 1]    (originally at index 1)
    [0, 0, 0] ]  (originally at index 0)

the rows themselves remain intact, just their order changes.
This is a tiny example; in my case I may have 100 rows (POP_SIZE)
and rows of LEN 200.


(3) I would like to compare the contents of 'keep_pop' (a copy of the
original 'pop') with the current 'pop'. Though the order of rows
may be different between the two, it should not matter as long as
the same rows are present.  So for the example given above, the
comparison should return True.

For instance, in Python this would be simply

if sorted(keep_pop) == sorted(pop):
   print('they are equal')
else:
   print('they are not equal')

Is there an equivalent R code segment?


I hope this post is clearer than my original one. Thank you David for
pointing out some of the shortcomings of my earlier post.

Thanks,

Esmail


[R] Help to select the raw in a data.frame with the max value

2009-04-26 Thread Alessandro

Dear User,

Thanks for your attention. I have a data.frame with 5 columns (e.g. ID,
a1, a2, a3, a4) and 1000 rows. I wish to find the overall maximum value in the
whole data.frame and save a new data.frame containing the row where that
value occurs. Example:

ID: 1,2,3,4,5,6,7,8,9,10
a1: 1,2,3,4,5,6,7,8,9,10
a2: 11,12,13,14,15,16,17,18,19,20
a3: 21,22,23,24,25,26,27,28,29,30
a4: 31,32,33,34,35,36,37,38,39,40

The max value in the four columns (a1,a2,a3,a4) is 40. The new data.frame is

ID: 10
a1: 10
a2: 20
a3: 30
a4: 40

Thanks

Ale



[R] re moving entries from one vector that are in another

2009-04-26 Thread onyourmark


I have various objects defined, and I am trying to remove the elements of
one vector from another using the loops at the end of this post, but I get
the error shown at the very end of this post.

> str(x)
 num [1:923, 1:923] 1 -0.00371 -0.00102 -0.00204 -0.00102 ...
 - attr(*, "dimnames")=List of 2
  ..$ : chr [1:923] "a4.1" "abdomen.2" "abdomimal.3" "abdominal.4" ...
  ..$ : chr [1:923] "a4.1" "abdomen.2" "abdomimal.3" "abdominal.4" ...
> str(answer2)
 int [1:2129, 1:2] 1 399 653 2 3 600 4 5 271 870 ...
 - attr(*, "dimnames")=List of 2
  ..$ : chr [1:2129] "a4.1" "hirschsprung.399" "peritoneal.653" "abdomen.2" ...
  ..$ : chr [1:2] "row" "col"
> str(answer3)
 int [1:1206, 1:2] 399 653 600 271 870 185 298 620 119 162 ...
 - attr(*, "dimnames")=List of 2
  ..$ : chr [1:1206] "hirschsprung.399" "peritoneal.653" "occult.600" "enteroclysis.271" ...
  ..$ : chr [1:2] "row" "col"


# Trying to delete all variables in my correlation matrix x that have
# correlation greater than .6
answer2 <- which(((x > .6) | (x < (-.6))), arr.ind = TRUE)

# also need to delete the diagonal entries of x (where a var is correlated
# with itself) because my goal is to drop variables from mydataN, but a variable
# being correlated with itself is not a reason to drop it:
answer3 <- answer2
answer3 <- answer2[answer2[,1]!=answer2[,2],]

# so now the second column of answer3 is a list of highly correlated variables.

# now take the entire 2nd column of answer3: in other words we want a list of
# correlated variables that we are going to eliminate.

uniqueFromColumn2ofAnswer2=unique(answer3[,2])
> str(uniqueFromColumn2ofAnswer2)
 int [1:561] 1 3 5 10 12 13 15 17 18 19 ...

# now create a holder for mydataN minus those columns which have bad or
# highly correlated variables.

mydataNMinusHighCorForAll <- mydataN

# and now go ahead and take them out:

for (i in uniqueFromColumn2ofAnswer2) {for (j in mydataN[0,]){if(i==j){mydataNMinusHighCorForAll[,-j]}}}

Error in if (i == j) { : argument is of length zero
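
A likely cause of the error, assuming mydataN is a data frame: mydataN[0,] has zero rows, so the inner loop iterates over zero-length columns and i == j has length zero. The expression mydataNMinusHighCorForAll[,-j] also computes a result without assigning it. The loops can be avoided by dropping columns by index in one step. The sketch below uses made-up data (a small mydataN with one nearly duplicated column) and illustrative names (high, drop_cols). It keeps the first variable of each highly correlated pair, which is one choice among several.

```r
set.seed(1)
# Toy stand-in for mydataN: 5 independent columns plus a near-copy of V1
mydataN <- as.data.frame(matrix(rnorm(500), nrow = 100, ncol = 5))
mydataN$V6 <- mydataN$V1 + rnorm(100, sd = 0.01)

x <- cor(mydataN)

# Pairs with |correlation| > 0.6; upper.tri() skips the diagonal and
# counts each pair once (row index < column index)
high <- which(abs(x) > 0.6 & upper.tri(x), arr.ind = TRUE)

# Drop the later variable of every highly correlated pair
drop_cols <- unique(high[, "col"])
mydataNMinusHighCorForAll <-
  mydataN[, setdiff(seq_along(mydataN), drop_cols), drop = FALSE]

names(mydataNMinusHighCorForAll)
```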
-- 
View this message in context: 
http://www.nabble.com/removing-entries-from-one-vector-that-are-in-another-tp23241912p23241912.html
Sent from the R help mailing list archive at Nabble.com.


[R] Is their any function can generate orthogonal tables(e.g. L_8(2^7)

2009-04-26 Thread dzuswxbylw

I want to generate some orthogonal tables for my experimental design.

I searched http://search.r-project.org/ (RSiteSearch) with key words such as
"orthogonal table", "orthogonal design", "latin square" etc. and got no useful
results. A Google site search of the main CRAN web pages gave the same result.

Some packages, such as crossdes and AlgDesign, can construct designs based on
mutually orthogonal Latin squares. I tried des.MOLS (package crossdes):

> des.MOLS(4,3) # this may be 4 treatments, 3 periods
      [,1] [,2] [,3]
 [1,]    1    2    3
 [2,]    2    1    4
 [3,]    3    4    1
 [4,]    4    3    2
 [5,]    1    3    4
 [6,]    2    4    3
 [7,]    3    1    2
 [8,]    4    2    1
 [9,]    1    4    2
[10,]    2    3    1
[11,]    3    2    4
[12,]    4    1    3

But this is not the orthogonal table I want. Have I misunderstood this function?

Does anyone know how to construct orthogonal tables with R, for example
L_8(2^7), meaning 8 runs, 2 levels and 7 factors,
and other orthogonal tables such as L_4(2^3), L_18(2*3^7) and so on?
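
For two-level tables with 2^k runs, one standard construction starts from a full factorial in k base factors and adds every interaction column as a mod-2 sum. The sketch below builds L_8(2^7) this way in base R only. The level coding 0/1 and the names base and L8 are illustrative choices, and the approach does not cover mixed-level tables such as L_18(2*3^7).

```r
# Full factorial in three two-level base factors: 8 runs
base <- expand.grid(a = 0:1, b = 0:1, c = 0:1)

# Seven columns: the base factors plus all their mod-2 interactions
L8 <- with(base, cbind(a, b, ab = (a + b) %% 2,
                       c, ac = (a + c) %% 2, bc = (b + c) %% 2,
                       abc = (a + b + c) %% 2))
L8

# Orthogonality check: in every pair of columns, each of the four
# level combinations should appear exactly twice
pairs <- combn(ncol(L8), 2)
is_orthogonal <- all(apply(pairs, 2, function(p)
  all(table(L8[, p[1]], L8[, p[2]]) == 2)))
is_orthogonal
```

Taking only the columns a, b and ab of a 4-run factorial in a and b gives L_4(2^3) by the same idea.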

Thanks for any suggestions.

Best wishes,
xjx



Re: [R] dotplot: labeling coordinates for each point

2009-04-26 Thread Qifei Zhu

Hi David,

Thanks! It looks much better now. But is there any way to add (x,y)
coordinates as labels to all the points in the graph? Ideally I could
enforce a condition such as: if (y > 10,000) label, else no label. Any
advice is appreciated.
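
One way this could be approached in lattice is a custom panel function that draws the dots and then adds text only where a condition holds. The sketch below is an assumption-laden illustration, not a reply from the thread: it uses the OrchardSprays data from the earlier example, and the threshold 50 stands in for the 10,000 mentioned above.

```r
library(lattice)

p <- dotplot(decrease ~ treatment, data = OrchardSprays,
             panel = function(x, y, ...) {
               panel.dotplot(x, y, ...)          # draw the usual dots first
               big <- y > 50                     # illustrative threshold
               if (any(big)) {
                 # label only the points above the threshold with "(x, y)"
                 panel.text(as.numeric(x)[big], y[big],
                            labels = paste0("(", x[big], ", ", y[big], ")"),
                            pos = 4, cex = 0.7)
               }
             })
# print(p) draws the plot
```

Changing the condition in `big` is enough to label a different subset of points.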

Best,
Tony

-Original Message-
From: David Winsemius [mailto:dwinsem...@comcast.net] 
Sent: Friday, April 24, 2009 10:48 PM
To: Qifei Zhu
Cc: r-help@r-project.org
Subject: Re: [R] dotplot: labeling coordinates for each point


On Apr 24, 2009, at 9:23 PM, Qifei Zhu wrote:
 I used dotplot to draw a graph for a dataset of size 100. Since the
 x-axis labels are all text, they overlap and are not readable. Is there any
 way to make them readable, or how can I add labels to all the points with their
 (x,y) coordinates? Thanks for your help.

Look up information on the scales parameter and rotate your label text:

dotplot(decrease ~ treatment, OrchardSprays, groups = rowpos,  
scales=list(x=list(rot=60,  
labels=c(,,,,,,,) )))

David Winsemius, MD
Heritage Laboratories
West Hartford, CT


Re: [R] Help to select the raw in a data.frame with the max value

2009-04-26 Thread Jorge Ivan Velez

Dear Alessandro,
Here is one way:

DF <- data.frame(ID,a1,a2,a3,a4)
Row <- which( DF == max(DF[,-1]),  arr.ind = TRUE)[1]
DF[Row,]
#    ID a1 a2 a3 a4
# 10 10 10 20 30 40

See ?which and ?max for more details.
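
A self-contained variant of the same idea, as a sketch: it builds the example data explicitly and searches only the value columns, so that a large ID can never be mistaken for the maximum. The names vals and hit are illustrative.

```r
# Example data from the question
DF <- data.frame(ID = 1:10, a1 = 1:10, a2 = 11:20, a3 = 21:30, a4 = 31:40)

vals <- as.matrix(DF[, -1])                       # exclude ID from the search
hit  <- which(vals == max(vals), arr.ind = TRUE)  # (row, col) of every maximum

# Keep each matching row once, even if the maximum occurs several times in it
result <- DF[unique(hit[, "row"]), ]
result
```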

HTH,

Jorge


On Sun, Apr 26, 2009 at 8:02 AM, Alessandro alessandro.monta...@unifi.it wrote:

 Dear User,

 Thanks for your attention. I have a data.frame with 5 columns (e.g. ID,
 a1, a2, a3, a4) and 1000 rows. I wish to find the overall maximum value in the
 whole data.frame and save a new data.frame containing the row where that
 value occurs. Example:

 ID: 1,2,3,4,5,6,7,8,9,10
 a1: 1,2,3,4,5,6,7,8,9,10
 a2: 11,12,13,14,15,16,17,18,19,20
 a3: 21,22,23,24,25,26,27,28,29,30
 a4: 31,32,33,34,35,36,37,38,39,40

 The max value in the four columns (a1,a2,a3,a4) is 40. The new data.frame
 is

 ID: 10
 a1: 10
 a2: 20
 a3: 30
 a4: 40

 Thanks

 Ale





Re: [R] Nomogram with stratified cph in Design package

2009-04-26 Thread Frank E Harrell Jr


David Winsemius wrote:


On Apr 25, 2009, at 6:57 PM, reneepark wrote:



Hello,
I am using Dr. Harrell's Design package to make a nomogram. I was able to
make a beautiful one without stratifying; however, I will need to stratify
to meet PH assumptions. This is where I go wrong, but I'm not sure where.



Non-Stratified Nomogram:

f <- cph(S~A+B+C+D+E+F+H,x=T,y=T,surv=T,time.inc=10*12,method="breslow")
srv=Survival(f)
srv120=function(lp) srv(10*12,lp)
quant=Quantile(f)
med=function(lp) quant(.5,lp)
at.surv=c(0.1,0.2,0.3,0.4,0.5,0.6,0.7,0.8,0.9)
at.med=c(120,80,60,40,30,20,15,10,8,6,4,2,0)
nomogram(f,lp=F, fun=list(srv120, med),
         funlabel=c("120-mo Survival","Median Survival"),
         fun.at=list(at.surv, at.med))

I get the following warning:
Warning message:
In approx(fu[s], xseq[s], fat) : collapsing to unique 'x' values

However, a great nomogram is constructed.


But then I try to stratify...




Stratified Nomogram:

f <- cph(S~A+B+C+D+E+F+strat(H),x=T,y=T,surv=T,time.inc=10*12,method="breslow")
srv=Survival(f)
surv.p <- function(lp) srv(10*12, lp, stratum="Hist=P")
surv.f <- function(lp) srv(10*12, lp, stratum="Hist=F")
surv.o <- function(lp) srv(10*12, lp, stratum="Hist=O")
quant=Quantile(f)
med.p <- function(lp) quant(.5, lp, stratum="Hist=P")
med.f <- function(lp) quant(.5, lp, stratum="Hist=F")
med.o <- function(lp) quant(.5, lp, stratum="Hist=O")
nomogram(f, fun=list(surv.p, surv.f, surv.o, med.p, med.f, med.o),
         funlabel=c("S(120|P)","S(120|F)","S(120|O)",
                    "med(P)","med(F)","med(O)"),
         fun.at=list(c(0.1,0.2,0.3,0.4,0.5,0.6,0.7,0.8,0.9),
                     c(120,80,60,40,30,20,15,10,8,6,4,2,0)))


The final nomogram only gives me a survival probability line for one of the
3 Hist categories, S(120|P). It does show the label S(120|F), but there
is no survival probability line; there is nothing for the last category O,
and no median risk at all.


Those outputs seem consistent with the fact that stratification is not computing
separate models, but rather a pooled model. See Section 19.1.7 of RMS.


But you can think of stratification as using a different transformation 
for each stratum, and as long as you create a separate function for each 
level of the stratification variable, as Rene did, all should be well.





I considered the idea that I was exceeding some
sort of space limitation, and tried to set total.sep.page=T, but it didn't
change the output.


Does a 'median risk' exist when you stratify? You are allowing 3 separate survival
functions to be created so that you estimate the remaining parameters.
It's possible that you can extract information about them, but you may be on your own
about how to recombine them.


Yes it exists, using the separate function approach.

Rene, if you can duplicate the problem with a simple simulated or real 
dataset and send that to me, I can try to go through this step by step. 
It's probably a scaling, units of measurement, or extrapolation problem 
where the median is not defined.  You can evaluate the created functions 
yourself at several settings to see if the results are reasonable and to 
learn where extrapolation is not possible because of truncated follow-up.


Frank





I get the following error message:

Error in axis(sides[jj], at = scaled[jj], label = fat[jj], pos = y, cex.axis
= cex.axis,  :
 no locations are finite


I would very much appreciate any assistance in this matter. Thank you very
much.

~Renee Park


David Winsemius, MD
Heritage Laboratories
West Hartford, CT





--
Frank E Harrell Jr   Professor and Chair   School of Medicine
 Department of Biostatistics   Vanderbilt University


Re: [R] 3 questions regarding matrix copy/shuffle/compares



On Apr 26, 2009, at 7:48 AM, Esmail wrote:


Hello David,

Let me try again, I don't think this was the best post I've ever made :-)

Hopefully this is clearer, or otherwise I may break this up into
three separate simple queries as this may be too long.


 == is not an assignment operator in R, so the answer is that it
 would do neither.  <- and = can do assignment. In neither case
 would it be a deep copy.

It was late when I posted the code, I made a mistake with regard to
the assignment operator and used the boolean compare instead -- thanks
for catching that.

It should have been:

   keep_pop[1:POP_SIZE] = pop[1:POP_SIZE]


 Here's an edited and clearer version I hope:


The basic idea is that I am trying to keep track of a number of bitstrings.

Therefore I am creating a matrix (named 'pop') whose rows are made up
of bit vectors (i.e. my bitstrings).  I only initialize half of the rows
with my bitstrings of random 1s and 0s (the rest of the rows are set
to all zeros).

So I use following function call to create a matrix and fill it with
bit strings:

  pop=create_pop_2(POP_SIZE, LEN)

where

  POP_SIZE refers to the number of rows
  LEN to the columns (length of my bitstrings)



This is the code I call:


# create a random binary vector of size len
#
create_bin_Chromosome <- function(len)
{
  sample(0:1, len, replace=T)
}



## create_population ###
# create population of chromosomes of length len
# the matrix contains twice as much space as popsize
#
create_pop_2 <- function(popsize, len)
{
  datasize=len*popsize
  print(datasize)
  npop <- matrix(0, popsize*2, len, byrow=T)

  for(i in 1:popsize)
    npop[i,] = create_bin_Chromosome(len)

  npop
}


My 3 questions:

(1) If I did

   keep_pop[1:POP_SIZE] = pop[1:POP_SIZE]

   to keep a copy of the original data structure before manipulating
   'pop' potentially, would this make a deep copy or just shallow? Ie
   if I change something in pop would keep_pop change too? I would
   like two independent copies so that 'keep_pop' stays intact while
   'pop' may change.

 <- and = can do assignment. In neither case would it be a
 deep copy.

   Is there a deepcopy operator, or would I have to have two nested
   loops and iterate through them? Or is there a nice R-idiomatic way
   to do this?


Not that I know of, although my knowledge of R depth is not  
encyclopedic. You might get the desired sort of effect by creating a  
copy  inside a function, working on it inside the function in the  
manner desired, and then comparing the output to the original. There  
might be other strategies to get certain effects by creating specific  
environments.
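
For ordinary R objects such as matrices, assignment has value semantics (copy-on-modify), so a plain assignment already behaves like an independent copy from the user's point of view. A minimal sketch with a made-up 3 x 3 bit matrix:

```r
pop <- matrix(c(0, 0, 0,
                1, 1, 1,
                1, 1, 0), nrow = 3, byrow = TRUE)

keep_pop <- pop      # behaves as an independent copy (copy-on-modify)
pop[1, 1] <- 1       # modify 'pop' only

keep_pop[1, 1]       # unchanged by the modification of 'pop'
```

Environments and other reference-like objects are the exception to this behaviour; plain vectors, matrices, lists and data frames follow it.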



(2) If I wanted to change the order of rows in my matrix 'pop', is
   there an easy way to shuffle these?  I.e., I don't want to change
   any of the bitstring vectors/rows, just the order of the rows in the
   matrix 'pop'. (E.g., in Python I could just say something like
   shuffle(pop); is there an equivalent for R?)

   So if pop is

     [ [0, 0, 0]
       [1, 1, 1]
       [1, 1, 0] ]

   after the shuffle it may look like

     [ [1, 1, 0]    (originally at index 2)
       [1, 1, 1]    (originally at index 1)
       [0, 0, 0] ]  (originally at index 0)

   the rows themselves remain intact, just their order changes.
   This is a tiny example; in my case I may have 100 rows (POP_SIZE)
   and rows of LEN 200.


Yes. As I said before, I am going to refrain from posting speculation
until you provide valid R code that will create an object that can be
the subject of operations.
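
As a hedged sketch of one common idiom for the row shuffle: index the matrix with a random permutation of its row numbers. The small matrix is the example from the question; the name shuffled and the seed are illustrative.

```r
set.seed(42)
pop <- matrix(c(0, 0, 0,
                1, 1, 1,
                1, 1, 0), nrow = 3, byrow = TRUE)

# sample(n) returns a random permutation of 1..n; rows move as whole units
shuffled <- pop[sample(nrow(pop)), , drop = FALSE]
shuffled
```

The same line works unchanged for a 100 x 200 matrix, since only the row index is permuted.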


(3) I would like to compare the contents of 'keep_pop' (a copy of the
   original 'pop') with the current 'pop'. Though the order of rows
   may be different between the two, it should not matter as long as
   the same rows are present.  So for the example given above, the
   comparison should return True.

   For instance, in Python this would be simply

   if sorted(keep_pop) == sorted(pop):
      print('they are equal')
   else:
      print('they are not equal')

   Is there an equivalent R code segment?


If you created a random index vector that was used to sort the rows  
for display or computational purposes only, you could maintain the  
original ordering so that row wise comparisons could be done.
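
If the original ordering is not kept, an order-insensitive comparison can be sketched by turning each row into a string key and comparing the sorted keys. The function name same_rows is hypothetical, and the approach assumes the entries are single characters such as 0 and 1 once pasted together.

```r
# TRUE when both matrices contain the same rows (with the same
# multiplicities), regardless of row order
same_rows <- function(a, b) {
  if (!identical(dim(a), dim(b))) return(FALSE)
  key <- function(m) sort(apply(m, 1, paste, collapse = ""))
  identical(key(a), key(b))
}

pop <- matrix(c(0, 0, 0,
                1, 1, 1,
                1, 1, 0), nrow = 3, byrow = TRUE)
keep_pop <- pop
pop <- pop[c(3, 2, 1), ]        # reorder the rows

same_rows(keep_pop, pop)        # same rows, different order

changed <- pop
changed[1, 1] <- 0              # alter one bit
same_rows(keep_pop, changed)    # the row sets now differ
```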


I hope this post is clearer than my original one. Thank you David for
pointing out some of the shortcomings of my earlier post.

Thanks,

Esmail


David Winsemius, MD
Heritage Laboratories
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Problem installing packages


Since version 2.9.0 I have had a problem installing packages:

install.packages("sp")
--- Please select a CRAN mirror for use in this session ---
Loading Tcl/Tk interface ... done
Warning: unable to access index for repository 
http://piotrkosoft.net/pub/mirrors/CRAN/src/contrib

Warning messages:
1: In open.connection(con, r) : unable to resolve ''
2: In list.files(lib) : list.files: 'sp' is not a readable directory

The repository is working and accessible, and the sp package is in the repo.

Am I doing something wrong?

Jarek


Re: [R] issue building my own package... moving from Apple OS to Windows




Daryl Morris wrote:

Thanks for the various responses.

It was easier than I thought to get all the tools together and setup 
Windows paths for the build process.  I've been successful now on Windows.


BUT... how do I make a package which DOES NOT require R CMD INSTALL to 
install?  Obviously, it should not be required for everyone to have Perl 
on their Windows box to install a package.



Well, you can build a binary package for Windows (as is done on CRAN); 
then only you (but not your users) need the tools installed.


Best,
Uwe




When I tried to install the .tar.gz file from the GUI, I got the same 
errors as I did when I used the version I had built on my Apple.

Error in gzfile(file,r) : cannot open the connection
In addition: Warning messages:
1: In unzip(zipname, exdir=dest) :error 1 in extracting from zip file
2: In gzfile(file,r):
cannot open compressed file 
'multgeneriskpredperf_1.0.tar.gz/DESCRIPTION', probable reason 'No such 
file or directory'


Thanks, Daryl


Uwe Ligges wrote:


You need to INSTALL a source package (once it has been built), as on any 
other platform. How to collect the tools needed to make R CMD INSTALL 
work under Windows (and other OSs) is described in the R Installation 
and Administration manual.


Best,
Uwe Ligges




Re: [R] Scatterplot of two groups side-by-side?

2009-04-26 Thread Stefan Grosse

On Sun, 26 Apr 2009 09:29:39 + (GMT) nonu...@yahoo.de
nonu...@yahoo.de wrote:

ND I'm really new to R, so I hope you can help me, as I didn't find any
ND solution in the common books.

Have a look at the online resources at:
http://cran.r-project.org/other-docs.html
There is also stuff on graphics.

Furthermore the lattice package and book are highly recommended as well.

ND By the way: It's really necessary to plot the data as scatters and
ND not as boxplots. With the command plot, I cannot plot the data
ND by groups (I tried it with the commands subset and groups, but
ND obviously, there is no way to do so).

There is always a way. I just don't understand why it is necessary to
plot this as a scatterplot. 

Look, your problem is that your data have integer values. So they will
clearly be overplotted, and the reader has no idea at which points there
are many observations, even when you split the data on the x axis into
groups, or even if you make a per-group plot as Baptiste suggested
(which would be possible with lattice as well).

I can offer an easy solution. You can split the data into groups manually
by shifting the x values slightly for each group. But you still don't see
how many data points sit on each spot. You could add some noise with the
jitter function (see ?jitter), so that one sees that there are many
observations at one point. However, this gives the impression that you
are not dealing with categorical data, which might not be intended...

daten <- data.frame(y=sample(c(1,2,3),24,replace=TRUE),
x=rep(c(1,2),each=12),group=rep(c(1,2)))

daten


# plot with overplotting, no information gain
plot(daten$x,daten$y)

# plot with jitter

# prepare data
daten$x2 <- ifelse(daten$group==1,daten$x-0.02,daten$x+0.02)

plot(c(0,2),c(0,4),type="n") # empty plot; you could use the real data range

# plot points, see ?jitter for options
points(jitter(y)~x2,data=subset(daten,group==1),col="blue",pch=1)
points(jitter(y)~x2,data=subset(daten,group==2),col="red",pch=2)

# regression lines added:
abline(lm(y~x,data=subset(daten,group==1)),col="blue")
abline(lm(y~x,data=subset(daten,group==2)),col="red")

legend("topleft",c("group 1","group 2",
"regression group 1","regression group 2"),lty=c(0,0,1,1),
pch=c(1,2,NA,NA), col=rep(c("blue","red"),2),bty="n")

But I believe there are better solutions. You should think about a
different plot, like a balloon plot or so.

Also, I doubt whether a linear regression is really appropriate here,
since we are dealing with categorical data...

ND I'm grateful for every (simple) solution

Sorry if it is not simple. You see R has the advantage that it is
highly configurable. But you still need to know the message...

hth
Stefan


Re: [R] 3 questions regarding matrix copy/shuffle/compares


David Winsemius wrote:


Yes. As I said before I am going to refrain from posting speculation 
until you provide valid R code

that will create an object that can be the subject of operations.



The code I have provided works, here is a run that may prove helpful:

POP_SIZE = 6
LEN = 8

pop=create_pop_2(POP_SIZE, LEN)

print(pop)
      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
 [1,]    0    1    0    1    1    0    0    1
 [2,]    0    0    0    0    0    0    0    0
 [3,]    1    1    0    0    1    0    0    0
 [4,]    0    0    0    0    0    0    0    1
 [5,]    0    0    1    1    0    0    1    0
 [6,]    1    0    0    0    0    0    1    0
 [7,]    0    0    0    0    0    0    0    0
 [8,]    0    0    0    0    0    0    0    0
 [9,]    0    0    0    0    0    0    0    0
[10,]    0    0    0    0    0    0    0    0
[11,]    0    0    0    0    0    0    0    0
[12,]    0    0    0    0    0    0    0    0

I want to (1) create a deep copy of pop, (2) be able to shuffle
the rows only, and (3) be able to compare two copies of these objects
for equality and have it return True if only the rows have been shuffled.


Re: [R] Help to select the raw in a data.frame with the max value



On Apr 26, 2009, at 8:02 AM, Alessandro wrote:


Dear User,
thanks for your attention. I have a data.frame with 5 columns (e.g. ID,
a1, a2, a3, a4) and 1000 rows. I wish to find the absolute maximum value
over the whole data.frame and save a new data.frame containing the row
where that value occurs. Example:


ID: 1,2,3,4,5,6,7,8,9,10

a1:1,2,3,4,5,6,7,8,9,10

a2:11,12,13,14,15,16,17,18,19,20

a3:21,22,23,24,25,26,27,28,29,30

a4:31,32,33,34,35,36,37,38,39,40

The max value in the four columns (a1,a2,a3,a4) is 40. The new  
data.frame is



ID:10

A1:10

A2:20

A3:30

A4 :40



df <- data.frame(ID= c( 1,2,3,4,5,6,7,8,9,10),
 a1 =c(1,2,3,4,5,6,7,8,9,10),
 a2 =c(11,12,13,14,15,16,17,18,19,20),
 a3 = c(21,22,23,24,25,26,27,28,29,30),
 a4 = c(31,32,33,34,35,36,37,38,39,40) )
 df
---output---
   ID a1 a2 a3 a4
1   1  1 11 21 31
2   2  2 12 22 32
3   3  3 13 23 33
4   4  4 14 24 34
5   5  5 15 25 35
6   6  6 16 26 36
7   7  7 17 27 37
8   8  8 18 28 38
9   9  9 19 29 39
10 10 10 20 30 40
-

apply(df, 2, max)

# If you want the names to be as specified, then look at the colnames
# function, but at this point I am concerned that I may have already
# done too much of your homework.

--output---
ID a1 a2 a3 a4
10 10 20 30 40


 max(df)

[1] 40
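
The calls above give the column maxima and the overall maximum, but not the row that holds it. A minimal sketch of one way to pull out that row, assuming the example data frame from this thread and assuming the ID column should be left out of the search (the names vals, hit and max_row are only illustrative):

```r
df <- data.frame(ID = 1:10, a1 = 1:10, a2 = 11:20, a3 = 21:30, a4 = 31:40)

# Search only the value columns, not the ID column.
vals <- as.matrix(df[, c("a1", "a2", "a3", "a4")])

# which(..., arr.ind = TRUE) gives the (row, col) position(s) of the maximum.
hit <- which(vals == max(vals), arr.ind = TRUE)

# Keep every row that contains the maximum (usually just one).
max_row <- df[unique(hit[, "row"]), ]
print(max_row)
```

If the maximum occurs in several rows, all of them are kept; whether that is wanted depends on the data.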

--


David Winsemius, MD
Heritage Laboratories
West Hartford, CT


[R] R: constrained optimization

2009-04-26 Thread mauede

Thank you.
Unfortunately, what makes the problem only apparently simple (for me) is that
the functions are not differentiable and the parameter space is not continuous,
which dramatically reduces the number of choices.
I would be grateful to chat with anyone who has tackled a similar problem.

Maura

-Original Message-
From: David Winsemius [mailto:dwinsem...@comcast.net]
Sent: Sun 26/04/2009 6.55
To: mau...@alice.it
Cc: r-h...@stat.math.ethz.ch
Subject: Re: [R] constrained optimization
 
http://search.r-project.org/cgi-bin/namazu.cgi?query=%22constrained+optimization%22&max=100&result=normal&sort=score&idxname=functions&idxname=Rhelp08

And that is only the help messages from the last two years.



On Apr 26, 2009, at 12:00 AM, mau...@alice.it wrote:

 Is there any R package addressing problems of constrained  
 optimization ?
 I have the following apparently simple problem:

 Given a set V with fixed cardinality:nv
 Given a set S whose cardinality is a parameter:nHat
 Let the cardinality of the intersection S.and.V be:   nHatv

 The problem consists of maximizing   nHatv/nv  subject to a penalty  
 if  nHat  nHatv

 It is allowed and even desirable to make set S contain set V

 Thank you so much



David Winsemius, MD
Heritage Laboratories
West Hartford, CT






Re: [R] Scatterplot of two groups side-by-side?

2009-04-26 Thread John Fox

Dear Karin,

If I understand correctly what you want, the scatterplot function in the car
package isn't designed to produce it, but there are many ways to draw
side-by-side scatterplots. Here is one, using basic R graphics:

par(mfrow=c(1,2))
by(Data, Data$group, 
function(x) {
plot(Pulls ~ Resistance, data=x, main=paste("group =", x$group[1]))
abline(lm(Pulls ~ Resistance, data=x))
}
)

This assumes that your data are in a data frame named Data, with variables
group, Pulls, and Resistance.

I hope this helps,
 John

--
John Fox, Professor
Department of Sociology
McMaster University
Hamilton, Ontario, Canada
web: socserv.mcmaster.ca/jfox


 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On
 Behalf Of nonu...@yahoo.de
 Sent: April-26-09 5:30 AM
 To: r-help@r-project.org
 Subject: [R] Scatterplot of two groups side-by-side?
 
 Dear all

 I'm really new to R, so I hope you can help me, as I didn't find any
 solution in the common books.

 For some days I have been trying to create the following plot: a scatterplot
 showing two different groups side-by-side with their corresponding regression
 lines. Both datasets have only the same five factors, so the points will form
 a kind of column at each factor. When I use scatterplot (package car), then
 I can plot two groups in the same graph by using the command groups, but
 the points of both groups are then plotted on top of each other using
 different symbols and they can hardly be distinguished. How can I plot them
 side by side, so that the groups do not overlap? And how can I give different
 colours to the groups and the corresponding regression lines? (This is what I
 have got so far: http://img7.imageshack.us/img7/227/almostgood.jpg)

 I tried to use the commands used in boxplot to solve this problem. In this
 command, it's possible to plot different datasets side-by-side by defining
 the position of the bars (example: at = 1:5 - 0.4). A second boxplot chart
 can then be added by adding the command add=TRUE to the line and defining
 another position. Neither command works within the scatterplot command.

 By the way: It's really necessary to plot the data as scatters and not as
 boxplots. With the command plot, I cannot plot the data by groups (I tried
 it with the commands subset and groups, but obviously, there is no way to
 do so).

 I'm grateful for every (simple) solution
 Thanks in advance
 
 Karin Schneeberger
 MSc-student
 University of Berne
 Switzerland
 
 
 

Re: [R] Problem installing packages




Jarek Jasiewicz wrote:

Since version 2.9.0 I have had a problem installing packages:

install.packages("sp")
--- Please select a CRAN mirror for use in this session ---
Loading Tcl/Tk interface ... done
Warning: unable to access index for repository 
http://piotrkosoft.net/pub/mirrors/CRAN/src/contrib

Warning messages:
1: In open.connection(con, r) : unable to resolve ''
2: In list.files(lib) : list.files: 'sp' is not a readable directory

The repository is working and accessible, and the sp package is in the repo.

Am I doing something wrong?



Have you tried another mirror (that one seems to be quite slow currently)?

Uwe Ligges



Jarek


Re: [R] fclustindex, e1071 package




Caroline Wallis wrote:

Hi,
 
I'm using the e1071 package to do fuzzy cluster analysis. My dataset (ra) has
5237 observations and 2 variables - depth and velocity. I used fuzzy c-means
to create 6 fuzzy classes.

ra.flcust6 <- cmeans(ra, 6, iter.max=100, verbose=FALSE, dist="euclidean",
                     method="cmeans", m=1.7, rate.par=NULL, weights=1)

 


I would like to calculate the value of all the fuzzy validity measures using
the fclustIndex function. However it returned the following error:

fclustIndex(ra.flcust6, ra, index="all")
Error in solve.default(scatter[, , i]) : 
  Lapack routine dgesv: system is exactly singular



 


Please could anyone explain what this means and what I have done wrong?



Well, it tells you that the matrix is exactly singular. First, I'd 
check your data to see whether this is plausible.


Uwe Ligges





 


thanks

Caroline Wallis

 




Re: [R] Overlapping parameters k in different functions in ipred




WilDsc0p wrote:

Dear List,

I have a question regarding the ipred package. Under 10-fold cv, for different 
knn values (1,3,...,25), I am getting the same misclassification errors:
#
library(ipred) 
data(iris) 
cv.k = 10 ## 10-fold cross-validation 
bwpredict.knn <- function(object, newdata) predict.ipredknn(object, newdata, type="class") 
for (i in seq(1,25,2)){

set.seed(19)
a <- errorest(Species ~ ., data=iris, model=ipredknn, estimator="cv", est.para=control.errorest(k=cv.k), predict=bwpredict.knn, nk = i)$err 
print(a)

}
[1] 0.0267
[1] 0.0267
[1] 0.0267
[1] 0.0267
[1] 0.0267
[1] 0.0267
[1] 0.0267
[1] 0.0267
[1] 0.0267
[1] 0.0267
[1] 0.0267
[1] 0.0267
[1] 0.0267
#
I think it is because ipredknn and control.errorest have the same argument 
k (I guess the default k=5 in ipredknn is being used). If I use



No, but because you are resetting the seed of the random number 
generator to the same value in each iteration of your loop.







errorest(Species ~ ., data=iris, model=ipredknn, estimator="cv", 
est.para=control.errorest(k=cv.k), predict=bwpredict.knn, k = 1)

the following message is generated:
Error in cv.factor(y, formula, data, model = model, predict = predict,  : 
  formal argument k matched by multiple actual arguments


I can't seem to change k to nk in ipredknn. If I try
ipred::ipredknn <- function (formula, data, subset, na.action, nk = 5, 
...){...}
I get the message:
Error in ipred::ipredknn <- function(formula, data, subset, na.action,  : 
  object ipred not found


How can I fix that? Any suggestion would be greatly appreciated.



You don't need to (see above), but you can with assignInNamespace() and 
friends, see ?assignInNamespace.



Uwe Ligges





Thanks in advance,


- mek

R.Version() $platform i386-pc-mingw32 $arch i386 $os mingw32 $system i386, mingw32 $status  $major 2 $minor 8.1 $year 
2008 $month 12 $day 22 $`svn rev` 47281 $language R $version.string R version 2.8.1 (2008-12-22)


Re: [R] help with plotting results of lda


Works for me with

library(MASS)
plot(lda(Species~., data=iris))

hence you may want to provide the data to enable us to reproduce your 
problem...


Uwe Ligges
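
One possible cause, offered as a guess since the poster's data are not available: the str() output quoted below shows a plain list of five components, which is what lda() returns when called with CV=TRUE (leave-one-out class predictions and posterior probabilities) rather than a fitted object of class "lda". In that case plot() has no discriminant scores to draw. A sketch using iris as stand-in data; the names fit_cv and fit are only illustrative:

```r
library(MASS)

# With CV = TRUE the result is a list (class, posterior, ...), not an "lda" fit.
fit_cv <- lda(Species ~ ., data = iris, CV = TRUE)
print(class(fit_cv))

# Use the cross-validated predictions for the classification table ...
print(table(iris$Species, Classified = fit_cv$class))

# ... and a separate fit without CV for plotting.
fit <- lda(Species ~ ., data = iris)
plot(fit)
```

Fitting twice keeps the cross-validated table and the plot separate; the plotted fit uses all observations.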



pgseye wrote:

Hi,

I've performed an lda and obtained a classification table for some of my
data:


efa.dfa <- lda(groups~.,efa.scores.8,CV=TRUE)
str(efa.dfa)

List of 5
 $ class: Factor w/ 2 levels 1,2: 1 2 1 2 1 1 2 2 1 2 ...
 $ posterior: num [1:160, 1:2] 0.99083 0.00852 0.93983 0.23186 0.85931 ...
  ..- attr(*, dimnames)=List of 2
  .. ..$ : chr [1:160] 1 2 3 4 ...
  .. ..$ : chr [1:2] 1 2
 $ terms:Classes 'terms', 'formula' length 3 groups ~ Comp.1 + Comp.2 +
Comp.3 + Comp.4 + Comp.5 + Comp.6 +  Comp.7 + Comp.8 + Comp.9 + Comp.10
+ Comp.11 + Comp.12 +  ...
  .. ..- attr(*, variables)= language list(groups, Comp.1, Comp.2, Comp.3,
Comp.4, Comp.5, Comp.6,  Comp.7, Comp.8, Comp.9, Comp.10, Comp.11,
Comp.12, Comp.13,  ...
  .. ..- attr(*, factors)= int [1:35, 1:34] 0 1 0 0 0 0 0 0 0 0 ...
  .. .. ..- attr(*, dimnames)=List of 2
  .. .. .. ..$ : chr [1:35] groups Comp.1 Comp.2 Comp.3 ...
  .. .. .. ..$ : chr [1:34] Comp.1 Comp.2 Comp.3 Comp.4 ...
  .. ..- attr(*, term.labels)= chr [1:34] Comp.1 Comp.2 Comp.3
Comp.4 ...
  .. ..- attr(*, order)= int [1:34] 1 1 1 1 1 1 1 1 1 1 ...
  .. ..- attr(*, intercept)= int 1
  .. ..- attr(*, response)= int 1
  .. ..- attr(*, .Environment)=environment: R_GlobalEnv 
  .. ..- attr(*, predvars)= language list(groups, Comp.1, Comp.2, Comp.3,

Comp.4, Comp.5, Comp.6,  Comp.7, Comp.8, Comp.9, Comp.10, Comp.11,
Comp.12, Comp.13,  ...
  .. ..- attr(*, dataClasses)= Named chr [1:35] numeric numeric
numeric numeric ...
  .. .. ..- attr(*, names)= chr [1:35] groups Comp.1 Comp.2 Comp.3
...
 $ call : language lda(formula = groups ~ ., data = efa.scores.8, CV =
T)
 $ xlevels  : list()

table(groups, Classified=efa.dfa$class)

  Classified
groups  1  2
 1 59 21
 2 10 70

but when I try to plot the results I get:


plot(efa.dfa)

Error in plot.window(...) : need finite 'xlim' values
In addition: Warning messages:
1: In min(x) : no non-missing arguments to min; returning Inf
2: In max(x) : no non-missing arguments to max; returning -Inf
3: In min(x) : no non-missing arguments to min; returning Inf
4: In max(x) : no non-missing arguments to max; returning -Inf

anyone have any ideas?

Thanks a lot,

Paul



Re: [R] How to get rid of loop?




Ken-JP wrote:

set.seed(1)
x <- runif(100)

# I want to calculate y such that:
#
# 1. if x>0.75, y <- 1
# 2. else if x<0.25, y <- -1
# 3. else if y_prev==1 && x<0.5, y <- 0
# 4. else if y_prev==-1 && x>0.5, y <- 0
# 5. else y <- y_prev
#
# 1. and 2. are directly doable without looping.
#
# How do I do 3.-5. without looping?  The problem is, I need to run this
algorithm over gigs of data, so I
# need to avoid looping, if at all possible...
#
# - Ken






If y_prev is meant to be from a former iteration of a loop, you probably 
can't get rid of the loop. Original working code might have helped us to 
understand your problem better.
Anyway, perhaps you can improve your loop in other ways, but again, 
we'd need to see at least some code.


Uwe Ligges
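
As a starting point for that discussion, here is a minimal sketch of the loop itself. The comparison operators in the question were lost in the archive, so the thresholds are read here as x > 0.75, x < 0.25, "previous state 1 and x < 0.5" and "previous state -1 and x > 0.5"; that reading, the starting state y0 = 0 and the function name hysteresis are all assumptions:

```r
set.seed(1)
x <- runif(100)

# Hypothetical helper: one pass over x, carrying the previous state along.
# y0 is the assumed state before the first observation.
hysteresis <- function(x, y0 = 0) {
  y <- numeric(length(x))   # preallocate instead of growing the vector
  y_prev <- y0
  for (i in seq_along(x)) {
    if (x[i] > 0.75) {
      y_prev <- 1
    } else if (x[i] < 0.25) {
      y_prev <- -1
    } else if (y_prev == 1 && x[i] < 0.5) {
      y_prev <- 0
    } else if (y_prev == -1 && x[i] > 0.5) {
      y_prev <- 0
    }                       # otherwise keep the previous state
    y[i] <- y_prev
  }
  y
}

y <- hysteresis(x)
print(table(y))
```

Preallocating y avoids repeated reallocation, which is often the main cost of such loops in R; whether that is fast enough for very large inputs would need to be measured.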


Re: [R] Problem installing packages


I tried several mirrors


Re: [R] 3 questions regarding matrix copy/shuffle/compares



On Apr 26, 2009, at 9:43 AM, Esmail wrote:


David Winsemius wrote:
Yes. As I said before I am going to refrain from posting  
speculation until you provide valid R code

that will create an object that can be the subject of operations.



The code I have provided works, here is a run that may prove helpful:

POP_SIZE = 6
LEN = 8

pop=create_pop_2(POP_SIZE, LEN)

print(pop)
      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
 [1,]    0    1    0    1    1    0    0    1
 [2,]    0    0    0    0    0    0    0    0
 [3,]    1    1    0    0    1    0    0    0
 [4,]    0    0    0    0    0    0    0    1
 [5,]    0    0    1    1    0    0    1    0
 [6,]    1    0    0    0    0    0    1    0
 [7,]    0    0    0    0    0    0    0    0
 [8,]    0    0    0    0    0    0    0    0
 [9,]    0    0    0    0    0    0    0    0
[10,]    0    0    0    0    0    0    0    0
[11,]    0    0    0    0    0    0    0    0
[12,]    0    0    0    0    0    0    0    0

I want to (1) create a deep copy of pop,


I have already said *I* do not know how to create a deep copy in R.


(2) be able to shuffle the rows only, and


I have suggested shuffling by way of a randomly selected external 
index:


 pop=create_pop_2(POP_SIZE, LEN)
[1] 48
 pop
      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
 [1,]    1    1    0    0    1    0    1    1
 [2,]    1    0    1    0    0    0    1    0
 [3,]    1    1    0    1    0    1    0    0
 [4,]    0    0    0    0    1    0    0    0
 [5,]    1    0    0    1    1    1    1    1
 [6,]    1    1    0    0    0    0    0    0
 [7,]    0    0    0    0    0    0    0    0
 [8,]    0    0    0    0    0    0    0    0
 [9,]    0    0    0    0    0    0    0    0
[10,]    0    0    0    0    0    0    0    0
[11,]    0    0    0    0    0    0    0    0
[12,]    0    0    0    0    0    0    0    0

 dx <- sample(1:nrow(pop), nrow(pop) )
 dx
 [1] 12 10  8  9  3  1  6 11  5  7  4  2
 pop[dx,]
      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
 [1,]    0    0    0    0    0    0    0    0
 [2,]    0    0    0    0    0    0    0    0
 [3,]    0    0    0    0    0    0    0    0
 [4,]    0    0    0    0    0    0    0    0
 [5,]    1    1    0    1    0    1    0    0
 [6,]    1    1    0    0    1    0    1    1
 [7,]    1    1    0    0    0    0    0    0
 [8,]    0    0    0    0    0    0    0    0
 [9,]    1    0    0    1    1    1    1    1
[10,]    0    0    0    0    0    0    0    0
[11,]    0    0    0    0    1    0    0    0
[12,]    1    0    1    0    0    0    1    0
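
The index approach shown in the session above can be sketched in a self-contained form. Since create_pop_2() is the poster's own function and is not shown in this thread, a random 6 x 8 bit matrix stands in for it here; the names shuffled and restored are only illustrative:

```r
set.seed(42)
# Stand-in for the poster's create_pop_2(): a 6 x 8 matrix of random bits.
pop <- matrix(sample(0:1, 6 * 8, replace = TRUE), nrow = 6)

dx <- sample(nrow(pop))          # random permutation of the row indices
shuffled <- pop[dx, ]            # rows reordered, each row left intact

# order(dx) is the inverse permutation, so this restores the original order.
restored <- shuffled[order(dx), ]
print(identical(restored, pop))
```

Keeping dx around is what makes a later row-wise comparison with the original possible.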


(3) be able to compare two copies of these objects
for equality and have it return True if only the rows have been  
shuffled.


I see two possible questions, the first easier (for me) than the  
second. Do you want to work on a copy with a known permutation of  
rows... or on a copy with an unknown ordering? In the first case I am  
unclear why you would not create an original and a copy, work on the  
copy, and compare with the original that is also sorted by the  
external index.





David Winsemius, MD
Heritage Laboratories
West Hartford, CT


Re: [R] Problem installing packages


I tried several mirrors

But, what may be more important.
This error was mentioned earlier on VISTA and WindowsXP, I use Ubuntu 8.04


[R] Memory issues in R

2009-04-26 Thread Neotropical bat risk assessments


   How do people deal with R and memory issues?
   I have tried using gc() to see how much memory is used at each step.
   Scanned Crawley R-Book and all other R books I have available and the FAQ
   on-line but no help really found.
   Running WinXP Pro (32 bit) with 4 GB RAM.
   One SATA drive pair is in RAID 0 configuration with 1 MB allocated as
   virtual memory.
   I do have another machine set up with Ubuntu but it only has 2 GB RAM and
   have not been able to get R installed on that system.
   I can run smaller sample data sets w/o problems and everything plots as
   needed.
   However I need to review large data sets.
   Using latest R version 2.9.0 (2009-04-17)
   My  data is in CSV format with a header row and is a big data set with
   1,200,240 rows!
   E.g. below:
   Dur,TBC,Fmax,Fmin,Fmean,Fc,S1,Sc,
   9.81,0,28.78,24.54,26.49,25.81,48.84,14.78,
   4.79,1838.47,37.21,29.41,31.76,29.52,241.77,62.83,
   4.21,5.42,28.99,26.23,27.53,27.4,76.03,11.44,
   10.69,193.48,30.53,25.4,27.69,25.4,-208.19,26.05,
   15.5,248.18,30.77,24.32,26.57,24.92,-202.76,18.64,
   14.85,217.47,31.25,24.62,26.93,25.56,-88.4,10.32,
   11.86,158.01,33.61,25.24,27.66,25.32,83.32,17.62,
   14.05,229.74,30.65,24.24,26.76,25.24,61.87,14.06,
   8.71,264.02,31.01,25.72,27.56,25.72,253.18,19.2,
   3.91,10.3,25.32,24.02,24.55,24.02,-71.67,16.83,
   16.11,242.21,29.85,24.02,26.07,24.62,79.45,19.11,
   16.81,246.48,28.57,23.05,25.46,23.81,-179.82,15.95,
   16.93,255.09,28.78,23.19,25.75,24.1,-112.21,16.38,
   5.12,107.16,32,29.41,30.46,29.41,134.45,20.88,
   16.7,150.49,27.97,22.92,24.91,23.95,42.96,16.81
    etc
   I am getting the following warning/error message:
   Error: cannot allocate vector of size 228.9 Mb
   Complete listing from R console below:
library(batcalls)
   Loading required package: ggplot2
   Loading required package: proto
   Loading required package: grid
   Loading required package: reshape
   Loading required package: plyr
   Attaching package: 'ggplot2'
   The following object(s) are masked from package:grid :
nullGrob
gc()
used (Mb) gc trigger (Mb) max used (Mb)
   Ncells 186251  5.0 407500 10.9   35  9.4
   Vcells  98245  0.8 786432  6.0   358194  2.8
BR <- read.csv("C:/R-Stats/Bat calls/Reduced bats.csv")
gc()
 used (Mb) gc trigger  (Mb) max used  (Mb)
   Ncells  188034  5.1 667722  17.9   378266  10.2
   Vcells 9733249 74.3   20547202 156.8 20535538 156.7
attach(BR)
library(ggplot2)
library(MASS)
library(batcalls)
BRC <- kde2d(Sc,Fc)
   Error: cannot allocate vector of size 228.9 Mb
gc()
  used  (Mb) gc trigger  (Mb)  max used  (Mb)
   Ncells   198547   5.4 667722  17.9378266  10.2
   Vcells 19339695 147.6  106768803 814.6 124960863 953.4
   
   Tnx for any insight,
   Bruce
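
A note on the failed allocation, hedged because it rests on how MASS::kde2d is understood to work: it appears to build intermediate matrices with one entry per (grid point, observation) pair, and with the default 25 grid points 1,200,240 x 25 doubles come to about 229 Mb, which matches the error message. One possible workaround is to estimate the density from a random subsample of the rows. In the sketch below the simulated Sc and Fc values and the subsample size of 50,000 are assumptions for illustration only:

```r
library(MASS)

set.seed(1)
n_rows <- 200000                       # simulated stand-in for the real data
BR <- data.frame(Sc = rnorm(n_rows, mean = 15, sd = 5),
                 Fc = rnorm(n_rows, mean = 25, sd = 2))

# Estimate the density from a random subsample of the rows.
keep <- sample(nrow(BR), 50000)
BRC <- kde2d(BR$Sc[keep], BR$Fc[keep], n = 25)

print(dim(BRC$z))
```

Subsampling changes the estimate slightly, so it is worth checking that the result is stable across a few different seeds.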

Re: [R] 3 questions regarding matrix copy/shuffle/compares

 I want to (1) create a deep copy of pop,

 I have already said *I* do not know how to create a deep copy in R.

Creating a deep copy is easy, because all copies are deep copies.
You need to try very hard to create a reference in R.
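
A small sketch of the difference, using a toy matrix rather than the poster's data. Environments are used here as one example of an object with reference behaviour; the names e1, e2 and flag are only illustrative:

```r
pop <- matrix(0, nrow = 2, ncol = 3)
keep_pop <- pop        # behaves like an independent copy
pop[1, 1] <- 99
print(keep_pop[1, 1])  # the saved copy is not affected by the change

# Environments are an exception: assignment shares the same object.
e1 <- new.env()
e1$flag <- "original"
e2 <- e1               # e2 refers to the same environment as e1
e2$flag <- "changed"
print(e1$flag)         # the change made through e2 is visible through e1
```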

Hadley


-- 
http://had.co.nz/


Re: [R] 3 questions regarding matrix copy/shuffle/compares

My understanding of the OP's request was for some sort of copy which  
did change when entries in the original were changed; the sort of  
behavior that might be seen  in a spreadsheet that had a copy by  
reference.


On Apr 26, 2009, at 11:28 AM, hadley wickham wrote:


I want to (1) create a deep copy of pop,


I have already said *I* do not know how to create a deep copy in R.


Creating a deep copy is easy, because all copies are deep copies.
You need to try very hard to create a reference in R.

Hadley

--

David Winsemius, MD
Heritage Laboratories
West Hartford, CT


Re: [R] 3 questions regarding matrix copy/shuffle/compares


David,

Good news! It seems that R has deep copy by default. I ran this simplified
test and it seems I can change 'pop' without changing the saved version.

POP_SIZE = 4
LEN = 8
pop=create_pop_2(POP_SIZE, LEN)
cat('printing original pop\n')
print(pop)

keep_pop = pop
pop[1,1] = 99

cat('printing changed pop\n')
print(pop)
cat('printing keep_pop\n')
print(keep_pop)



---

 source('mat.R')
[1] 32
printing original pop
     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
[1,]    0    1    1    0    1    0    0    1
[2,]    1    0    1    0    0    0    1    1
[3,]    0    1    0    1    1    1    0    1
[4,]    0    0    0    1    0    1    0    0
[5,]    0    0    0    0    0    0    0    0
[6,]    0    0    0    0    0    0    0    0
[7,]    0    0    0    0    0    0    0    0
[8,]    0    0    0    0    0    0    0    0


printing changed pop
     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
[1,]   99    1    1    0    1    0    0    1
[2,]    1    0    1    0    0    0    1    1
[3,]    0    1    0    1    1    1    0    1
[4,]    0    0    0    1    0    1    0    0
[5,]    0    0    0    0    0    0    0    0
[6,]    0    0    0    0    0    0    0    0
[7,]    0    0    0    0    0    0    0    0
[8,]    0    0    0    0    0    0    0    0


printing keep_pop
     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
[1,]    0    1    1    0    1    0    0    1
[2,]    1    0    1    0    0    0    1    1
[3,]    0    1    0    1    1    1    0    1
[4,]    0    0    0    1    0    1    0    0
[5,]    0    0    0    0    0    0    0    0
[6,]    0    0    0    0    0    0    0    0
[7,]    0    0    0    0    0    0    0    0
[8,]    0    0    0    0    0    0    0    0


Re Shuffle

I tried using sample based on your earlier post, but your example
really helped, thanks!  That solves the shuffling issue.

dx <- sample(1:POP_SIZE, POP_SIZE)
cat('shuffled index:')
print(dx)
print(pop[dx,])

cat('shuffled pop')
pop[1:POP_SIZE,] = pop[dx,]
print(pop)
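
The shuffle above can be sketched in a self-contained form. This is only an
illustration: create_pop_2 from the post is not shown, so rbinom() stands in
for it, and the set.seed() call is an assumption added so the example is
repeatable. It also shows that keeping the index dx around makes it possible
to undo the shuffle with order(dx).

```r
set.seed(123)                      # assumption: only for repeatability
POP_SIZE <- 4
LEN <- 8

# Stand-in for create_pop_2(): a POP_SIZE x LEN matrix of random bits
pop <- matrix(rbinom(POP_SIZE * LEN, 1, 0.5), nrow = POP_SIZE)
keep_pop <- pop                    # independent copy of the original

dx <- sample(POP_SIZE)             # random permutation of the row indices
shuffled <- pop[dx, ]              # rows in shuffled order

# order(dx) is the inverse permutation, so this restores the original rows
restored <- shuffled[order(dx), ]
identical(restored, keep_pop)
```

Because dx is a permutation of 1:POP_SIZE, dx[order(dx)] is 1:POP_SIZE again,
which is why indexing with order(dx) puts every row back in its old place.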


re compare:

 I am unclear why you would not create an original and a copy,

Well .. that I wanted to do from the start (hence my question about
deep copy :-)

 work on the copy, and compare with the original that is also sorted
 by the external index.

That's a great idea, hadn't thought of keeping the index around for
this, I'll give this a try.

Final question, how do I compare these two structures so that I get
one result, true or false? Right now

keep == pop yields all these individual comparisons:

 pop==keep

  [,1] [,2]  [,3] [,4]  [,5]
[1,] FALSE TRUE FALSE TRUE FALSE
[2,] FALSE TRUE FALSE TRUE FALSE
[3,]  TRUE TRUE  TRUE TRUE  TRUE
[4,]  TRUE TRUE  TRUE TRUE  TRUE
[5,]  TRUE TRUE  TRUE TRUE  TRUE
[6,]  TRUE TRUE  TRUE TRUE  TRUE

Thanks for the help, much appreciated.

Esmail


Re: [R] 3 questions regarding matrix copy/shuffle/compares


hadley wickham wrote:

I want to (1) create a deep copy of pop,

I have already said *I* do not know how to create a deep copy in R.


Creating a deep copy is easy, because all copies are deep copies.
You need to try very hard to create a reference in R.


Hi Hadley

Right you are .. I discovered this now too. It's really confusing to
go back and forth between different languages. I have been programming
in Python for the last 2 months and everything there is a reference .. so
I have to worry about deep copy etc.

Thanks!
Esmail


Re: [R] 3 questions regarding matrix copy/shuffle/compares

In that case, you would want a shallow copy, and you'd need to jump
through a lot of hoops to do that in R.

Hadley

On Sun, Apr 26, 2009 at 10:35 AM, David Winsemius
dwinsem...@comcast.net wrote:
 My understanding of the OP's request was for some sort of copy which did
 change when entries in the original were changed; the sort of behavior that
 might be seen  in a spreadsheet that had a copy by reference.

 On Apr 26, 2009, at 11:28 AM, hadley wickham wrote:

 I want to (1) create a deep copy of pop,

 I have already said *I* do not know how to create a deep copy in R.

 Creating a deep copy is easy, because all copies are deep copies.
 You need to try very hard to create a reference in R.

 Hadley

 --

 David Winsemius, MD
 Heritage Laboratories
 West Hartford, CT





-- 
http://had.co.nz/


Re: [R] 3 questions regarding matrix copy/shuffle/compares


 David Winsemius

dwinsem...@comcast.net wrote:

My understanding of the OP's request was for some sort of copy which did
change when entries in the original were changed; the sort of behavior that
might be seen  in a spreadsheet that had a copy by reference.


You misunderstood (my phrasing probably wasn't the best), but I was
clear about wanting two independent copies.

From my earlier post:

(1) If I did

keep_pop[1:POP_SIZE] = pop[1:POP_SIZE]

to keep a copy of the original data structure before manipulating
'pop' potentially, would this make a deep copy or just shallow? Ie
if I change something in 'pop' would it be reflected in 'keep_pop'
too? (I don't think so, but just wanted to check). I would like
two independent copies.

Regardless, the net outcome was new knowledge, so this is a good outcome.

Esmail


Re: [R] Stochastic Gradient Ascent for logistic regression

2009-04-26 Thread Ravi Varadhan

Hi Tim,

There are two main problems with your implementation of the stochastic gradient 
algorithm:

1.  You are only implementing one cycle of the algorithm, i.e. it cycles over 
each data point only once.  You need to do this several times, until the 
parameters converge.
2.  The stochastic gradient algorithm converges very slowly.  It can be really 
slow if the predictors are not scaled properly.  

I am attaching code that takes care of (1) and (2).  It gives results that 
are in good agreement with the glm() results.  Beware that it is still very slow.  

This seems like your homework assignment.  If so, you should acknowledge that 
you got help from the R group.  

Ravi.


Ravi Varadhan, Ph.D.
Assistant Professor,
Division of Geriatric Medicine and Gerontology
School of Medicine
Johns Hopkins University

Ph. (410) 502-2619
email: rvarad...@jhmi.edu


- Original Message -
From: Tim LIU timothy.sli...@gmail.com
Date: Sunday, April 26, 2009 7:41 am
Subject: [R]  Stochastic Gradient Ascent for logistic regression
To: r-help@r-project.org


  
  
  Hi. guys,
  
  I am trying to write my own Stochastic Gradient Ascent for logistic 
  regression in R. But it seems that I am having convergence problem.
  
  Am I doing anything wrong, or just the data is off?
  
  Here is my code in R -
  
  
  
  lbw -
  read.table(
  , header=TRUE)
  
  attach(lbw)
  
  
  
  lbw[1:2,]
  low age lwt race smoke ptl ht ui ftv bwt
  1 0 19 182 2 0 0 0 1 0 2523
  2 0 33 155 3 0 0 0 0 3 2551
  
  
  
  
  #-R implementation of logistic regression : gradient descent --
  sigmoid-function(z)
  {
  1/(1 + exp(-1*z))
  
  }
  
  
  
  
  X-cbind(age,lwt, smoke, ht, ui)
  
  #y-low
  
  
  my_logistic-function(X,y)
  {
  
  alpha - 0.005
  n-5 
  m-189
  max_iters - 189 #number of obs
  
  ll-0
  
  X-cbind(1,X)
  
  theta -rep(0,6) # intercept and 5 regerssors
  #theta - c(1.39, -0.034, -0.01, 0.64, 1.89, 0.88) #glm estimates as 
 
  starting values
  theta_all-theta
  for (i in 1:max_iters) 
  { 
  dim(X)
  length(theta)
  hx - sigmoid(X %*% theta) # matrix 
  product
  
  ix-i
  
  for (j in 1:6)
  {
  theta[j] - theta[j] + alpha * ((y-hx)[ix]) * X[ix,j] 
  #stochastic gradient !
  
  }
  
  
  
  
  
  logl - sum( y * log(hx) + (1 - y) * log(1 - hx) ) #direct 
  multiplication
  
  ll-rbind(ll, logl)
  
  
  theta_all = cbind(theta_all,theta)
  }
  
  par(mfrow=c(4,2))
  
  
  plot(na.omit(ll[,1]))
  lines(ll[,1])
  
  for (j in 1:6)
  {
  
  plot(theta_all[j,])
  lines(theta_all[j,])
  } 
  
  
  #theta_all
  #ll
  cbind(ll,t(theta_all))
  }
  
  
  my_logistic(X,low)
  ==
  
  
  parameter estimates values jumped after 130+ iterations...
  
  not converging even when I use parameter estimates as starting values 
 
  from glm (family=binomial)
  
  
  help!
  
  
  
  
  -- 
  View this message in context: 
  Sent from the R help mailing list archive at Nabble.com.
  
lbw <- 
read.table("http://www.biostat.jhsph.edu/~ririzarr/Teaching/754/lbw.dat", 
header=TRUE)
attach(lbw)
lbw[1:2,]
low age lwt race smoke ptl ht ui ftv bwt
1 0 19 182 2 0 0 0 1 0 2523
2 0 33 155 3 0 0 0 0 3 2551


#-R implementation of logistic regression : gradient descent --
sigmoid <- function(z) 1/(1 + exp(-1*z))

X <- cbind(age,lwt, smoke, ht, ui)
X.orig <- X
X <- scale(X.orig) # scaling improves convergence

# One pass (cycle) of stochastic gradient ascent over all observations
my.logistic <- function(par, X,y, alpha, plot=FALSE)
{

n <- ncol(X)
m <- nrow(X)
ll <- rep(NA, m)
theta_all <- matrix(NA, n, m)

X <- cbind(1,X)
theta <- par # start from the values passed in, not from a global theta
#theta <- c(1.39, -0.034, -0.01, 0.64, 1.89, 0.88) #glm estimates as starting values
theta_all <- theta
for (i in 1:m) 
{ 
dim(X)
length(theta)
hx <- sigmoid(X %*% theta) # matrix product
theta <- theta + alpha * (y - hx)[i] * X[i, ] # update from observation i only
logl <- sum( y * log(hx) + (1 - y) * log(1 - hx) ) #direct multiplication

ll[i] <- logl

theta_all = cbind(theta_all, theta)
}

if(plot) {
par(mfrow=c(4,2))
plot(na.omit(ll))
lines(ll[1:i])

for (j in 1:6)
{
plot(theta_all[j, 1:i])
lines(theta_all[j, 1:i])
} 
}

return(list(par=theta, loglik=logl))
}

theta <- rep(0,6) # intercept and 5 regressors
delta <- 1
# repeat full cycles until the parameters stop changing
while (delta > 0.0001) {
ans <- my.logistic(theta, X,low, alpha=0.0005,plot=TRUE)
theta.new <- ans$par
delta <- max(abs(theta - theta.new))
theta <- theta.new
}
ans  # divide the slope coefficients by the column std. dev. to get back slopes on the original scale

==

Re: [R] Problem installing packages




Jarek Jasiewicz wrote:

I tried several mirrors

But what may be more important: this error was mentioned earlier on Vista
and Windows XP, whereas I use Ubuntu 8.04.



Unfortunately, I cannot reproduce this for sp on Windows XP.

Uwe



Uwe Ligges wrote:



Jarek Jasiewicz wrote:

since 2.9.0 version I have a problem with installing packages:

install.packages("sp")
--- Please select a CRAN mirror for use in this session ---
Loading Tcl/Tk interface ... done
Warning: unable to access index for repository 
http://piotrkosoft.net/pub/mirrors/CRAN/src/contrib

Warning messages:
1: In open.connection(con, r) : unable to resolve ''
2: In list.files(lib) : list.files: 'sp' is not a readable directory

the repository is working, is accessible and the sp package is in the repo

Am I doing something wrong?



Have you tried another mirror (that one seems to be quite slow 
currently)?


Uwe Ligges



Jarek




Re: [R] Memory issues in R



On Apr 26, 2009, at 11:20 AM, Neotropical bat risk assessments wrote:



  How do people deal with R and memory issues?


They should read the R-FAQ and the Windows FAQ as you say you have.

http://cran.r-project.org/bin/windows/base/rw-FAQ.html#There-seems-to-be-a-limit-on-the-memory-it-uses_0021


  I have tried using gc() to see how much memory is used at each step.
  Scanned Crawley R-Book and all other R books I have available and the FAQ
  on-line but no help really found.
  Running WinXP Pro (32 bit) with 4 GB RAM.
  One SATA drive pair is in RAID 0 configuration with 1 MB allocated as
  virtual memory.


On the basis of my Windows experience this may not be enough  
information. (The drive information is fairly irrelevant.)

The R-Win-FAQ suggests:

?Memory
?memory.size   # for information about memory usage. The limit can
               # be raised by calling memory.limit


Although you read the FAQs,  have you zeroed in on the relevant  
sections? What does memory.size report? And what happens when you run  
R alone in WinXP and alter the default settings with memory.limit?





  I do have another machine set up with Ubuntu but it only has 2 GB RAM and
  have not been able to get R installed on that system.
  I can run smaller sample data sets w/o problems and everything plots as
  needed.
  However I need to review large data sets.
  Using latest R version 2.9.0 (2009-04-17)
  My  data is in CSV format with a header row and is a big data set with
  1,200,240 rows!


It's long, but not particularly wide. Last year I was getting
satisfactory work done on a 990K by 50-60 column dataset in a memory
constraint of 4GB on a different OS. Your constraint is in the
2.5-3.0 GB area but your dataframe is only a third of the size.


  E.g. below:
  Dur,TBC,Fmax,Fmin,Fmean,Fc,S1,Sc,
  9.81,0,28.78,24.54,26.49,25.81,48.84,14.78,
  4.79,1838.47,37.21,29.41,31.76,29.52,241.77,62.83,
  4.21,5.42,28.99,26.23,27.53,27.4,76.03,11.44,
  10.69,193.48,30.53,25.4,27.69,25.4,-208.19,26.05,
  15.5,248.18,30.77,24.32,26.57,24.92,-202.76,18.64,
  14.85,217.47,31.25,24.62,26.93,25.56,-88.4,10.32,
  11.86,158.01,33.61,25.24,27.66,25.32,83.32,17.62,
  14.05,229.74,30.65,24.24,26.76,25.24,61.87,14.06,
  8.71,264.02,31.01,25.72,27.56,25.72,253.18,19.2,
  3.91,10.3,25.32,24.02,24.55,24.02,-71.67,16.83,
  16.11,242.21,29.85,24.02,26.07,24.62,79.45,19.11,
  16.81,246.48,28.57,23.05,25.46,23.81,-179.82,15.95,
  16.93,255.09,28.78,23.19,25.75,24.1,-112.21,16.38,
  5.12,107.16,32,29.41,30.46,29.41,134.45,20.88,
  16.7,150.49,27.97,22.92,24.91,23.95,42.96,16.81
   etc
  I am getting the following warning/error message:
  Error: cannot allocate vector of size 228.9 Mb


So you got the data into memory. That does not appear to exceed the  
capacity of your hardware setup, if you address the options offered  
above.
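
One possible reading of the 228.9 Mb figure, offered as an assumption rather
than a diagnosis: MASS::kde2d evaluates the kernel on a grid with n = 25
points per axis by default, and it builds intermediate matrices with one row
per grid point and one column per observation. With 1,200,240 rows of doubles
(8 bytes each), the size of one such matrix can be worked out as follows.

```r
n_obs  <- 1200240   # number of rows reported in the post
n_grid <- 25        # default 'n' argument of MASS::kde2d
bytes_per_double <- 8

bytes <- n_obs * n_grid * bytes_per_double
mb <- bytes / 2^20  # R reports "Mb" in units of 2^20 bytes
round(mb, 1)
```

If this reading is right, the failing allocation is one of several matrices of
that size needed at the same time, so a smaller n or a random subsample of the
rows would reduce the request proportionally.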





  Complete listing from R console below:

library(batcalls)

  Loading required package: ggplot2
  Loading required package: proto
  Loading required package: grid
  Loading required package: reshape
  Loading required package: plyr
  Attaching package: 'ggplot2'
  The following object(s) are masked from package:grid :
   nullGrob

gc()

   used (Mb) gc trigger (Mb) max used (Mb)
  Ncells 186251  5.0 407500 10.9   35  9.4
  Vcells  98245  0.8 786432  6.0   358194  2.8

BR <- read.csv("C:/R-Stats/Bat calls/Reduced bats.csv")
gc()

used (Mb) gc trigger  (Mb) max used  (Mb)
  Ncells  188034  5.1 667722  17.9   378266  10.2
  Vcells 9733249 74.3   20547202 156.8 20535538 156.7


Looks like you need to use memory.limit(some bigger number)



attach(BR)
library(ggplot2)
library(MASS)
library(batcalls)
BRC <- kde2d(Sc, Fc)

  Error: cannot allocate vector of size 228.9 Mb

gc()

 used  (Mb) gc trigger  (Mb)  max used  (Mb)
  Ncells   198547   5.4 667722  17.9378266  10.2
  Vcells 19339695 147.6  106768803 814.6 124960863 953.4



  Tnx for any insight,
  Bruce

--

David Winsemius, MD
Heritage Laboratories
West Hartford, CT


Re: [R] Memory issues in R

2009-04-26 Thread Stefan Grosse

On Sun, 26 Apr 2009 09:20:12 -0600 Neotropical bat risk assessments
neotropical.b...@gmail.com wrote:

NBRA> How do people deal with R and memory issues?
NBRA> I have tried using gc() to see how much memory is used at each
NBRA> step. Scanned Crawley R-Book and all other R books I have
NBRA> available and the FAQ on-line but no help really found.
NBRA> Running WinXP Pro (32 bit) with 4 GB RAM.

There is a limit on windows, read the FAQ:
http://cran.r-project.org/bin/windows/base/rw-FAQ.html#There-seems-to-be-a-limit-on-the-memory-it-uses_0021

So either you use a (64bit) Linux with enough memory or you use
packages or a SQL solution that is able to deal with huge datasets.
(biglm for example)

Stefan


[R] Matching in R

2009-04-26 Thread dirk567

Dear R users,

I am trying to do exact matching with replacement on a large dataset
(500,000 obs), with treatment and control groups of about equal size. At the
moment I use the Match function of the Matching package. I match on 2
covariates, and all observations in the treatment group have at least one
exact counterpart in the control group. Now I want to introduce observation
weights. I set ties=FALSE, as I want exactly one-to-one matching. Is there a
way to draw randomly from the individuals in the control group that have the
same covariate values as the individual in the treatment group, with the
probability of being drawn proportional to the weights of the individuals in
the control group? E.g. I have three individuals which all have the same
covariate values as the one observation I want to find a partner for, and the
first of the three has a very large weight. When drawing randomly among those
three, I want the probability that the first one is drawn to be very large.

I'd really appreciate any suggestions: the weights option does not do the 
job; it seems to work only when setting ties=TRUE.
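
The weighted draw described above could be sketched in base R, outside the
Matching package, roughly as follows. The data frame, the column names (id,
x1, x2, w) and the helper draw_match are hypothetical and only illustrate the
idea; this is not how Match() works internally.

```r
# Hypothetical control group: two covariates (x1, x2) and a weight (w)
controls <- data.frame(
  id = 1:6,
  x1 = c(1, 1, 1, 2, 2, 2),
  x2 = c(0, 0, 0, 1, 1, 1),
  w  = c(10, 1, 1, 1, 1, 1)
)

# Draw one control with the same covariate values as a treated unit,
# with probability proportional to the weights
draw_match <- function(x1, x2, controls) {
  cand <- controls[controls$x1 == x1 & controls$x2 == x2, ]
  if (nrow(cand) == 0) return(NA_integer_)
  # sampling row positions avoids the special case of sample(x)
  # when x is a single number
  cand$id[sample.int(nrow(cand), size = 1, prob = cand$w)]
}

set.seed(1)  # assumption: only for repeatability
draws <- replicate(2000, draw_match(1, 0, controls))
prop.table(table(draws))
```

With weights 10, 1 and 1, control 1 should be drawn in roughly 10/12 of the
cases. For 500,000 observations a per-row loop like this would be slow, so
grouping the controls by covariate pattern first would likely be needed.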

Thanks
Dirk 
--


Re: [R] help with plotting results of lda

2009-04-26 Thread Prof Brian Ripley


On Sun, 26 Apr 2009, Uwe Ligges wrote:


Works for me with

library(MASS)
plot(lda(Species~., data=iris))

hence you may want to provide the data to enable us to reproduce your 
problem...


He is trying to plot the results from a cross-validation.  As the help 
page clearly states, that is a list (and has no assigned class). 
Plotting an arbitrary list makes little sense (nor does plotting 
the results of an LDA cross-validation).
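
The difference can be seen on the iris data used earlier in this thread. The
sketch below is an illustration only (the poster's efa.scores.8 data are not
available): with CV=TRUE the result is a plain list whose useful parts are
the predicted classes and posterior probabilities, while a fit without CV is
an object of class "lda" that plot() knows how to draw.

```r
library(MASS)

# Leave-one-out cross-validation: returns a list, not an "lda" object
cv_res <- lda(Species ~ ., data = iris, CV = TRUE)
is.list(cv_res)
inherits(cv_res, "lda")

# The cross-validated predictions are what can be tabulated
conf_tab <- table(Actual = iris$Species, Classified = cv_res$class)
conf_tab

# A fit without CV has class "lda" and can be plotted
fit <- lda(Species ~ ., data = iris)
plot(fit)
```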




Uwe Ligges



pgseye wrote:

Hi,

I've performed an lda and obtained a classification table for some of my
data:


efa.dfa <- lda(groups~.,efa.scores.8,CV=T)
str(efa.dfa)

List of 5
 $ class: Factor w/ 2 levels 1,2: 1 2 1 2 1 1 2 2 1 2 ...
 $ posterior: num [1:160, 1:2] 0.99083 0.00852 0.93983 0.23186 0.85931 ...
  ..- attr(*, dimnames)=List of 2
  .. ..$ : chr [1:160] 1 2 3 4 ...
  .. ..$ : chr [1:2] 1 2
 $ terms:Classes 'terms', 'formula' length 3 groups ~ Comp.1 + Comp.2 +
Comp.3 + Comp.4 + Comp.5 + Comp.6 +  Comp.7 + Comp.8 + Comp.9 + Comp.10
+ Comp.11 + Comp.12 +  ...
  .. ..- attr(*, variables)= language list(groups, Comp.1, Comp.2, 
Comp.3,

Comp.4, Comp.5, Comp.6,  Comp.7, Comp.8, Comp.9, Comp.10, Comp.11,
Comp.12, Comp.13,  ...
  .. ..- attr(*, factors)= int [1:35, 1:34] 0 1 0 0 0 0 0 0 0 0 ...
  .. .. ..- attr(*, dimnames)=List of 2
  .. .. .. ..$ : chr [1:35] groups Comp.1 Comp.2 Comp.3 ...
  .. .. .. ..$ : chr [1:34] Comp.1 Comp.2 Comp.3 Comp.4 ...
  .. ..- attr(*, term.labels)= chr [1:34] Comp.1 Comp.2 Comp.3
Comp.4 ...
  .. ..- attr(*, order)= int [1:34] 1 1 1 1 1 1 1 1 1 1 ...
  .. ..- attr(*, intercept)= int 1
  .. ..- attr(*, response)= int 1
  .. ..- attr(*, .Environment)=environment: R_GlobalEnv   .. ..- 
attr(*, predvars)= language list(groups, Comp.1, Comp.2, Comp.3,

Comp.4, Comp.5, Comp.6,  Comp.7, Comp.8, Comp.9, Comp.10, Comp.11,
Comp.12, Comp.13,  ...
  .. ..- attr(*, dataClasses)= Named chr [1:35] numeric numeric
numeric numeric ...
  .. .. ..- attr(*, names)= chr [1:35] groups Comp.1 Comp.2 
Comp.3

...
 $ call : language lda(formula = groups ~ ., data = efa.scores.8, CV =
T)
 $ xlevels  : list()

table(groups, Classified=efa.dfa$class)

  Classified
groups  1  2
 1 59 21
 2 10 70

but when I try to plot the results I get:


plot(efa.dfa)

Error in plot.window(...) : need finite 'xlim' values
In addition: Warning messages:
1: In min(x) : no non-missing arguments to min; returning Inf
2: In max(x) : no non-missing arguments to max; returning -Inf
3: In min(x) : no non-missing arguments to min; returning Inf
4: In max(x) : no non-missing arguments to max; returning -Inf

anyone have any ideas?

Thanks a lot,

Paul





--
Brian D. Ripley,  rip...@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595


[R] Install packages not working in latest version?

2009-04-26 Thread Neotropical bat risk assessments


Seems that the latest version R version 2.9.0 (2009-04-17)
 has a glitch and will not install packages. Issue with unzipping?

Works fine with R version 2.8.1 (2008-12-22)


 install.packages("ff")
--- Please select a CRAN mirror for use in this session ---
trying URL 'http://cran.fhcrc.org/bin/windows/contrib/2.9/ff_2.0.1.zip'
Content type 'application/zip' length 779664 bytes (761 Kb)
opened URL
downloaded 761 Kb

Error in .Internal(int.unzip(zipname, NULL, dest)) :
  no internal function int.unzip

Bruce


Re: [R] Install packages not working in latest version?




Neotropical bat risk assessments wrote:

Seems that the latest version R version 2.9.0 (2009-04-17)
 has a glitch and will not install packages. Issue with unzipping?

Works fine with R version 2.8.1 (2008-12-22)


  install.packages("ff")
--- Please select a CRAN mirror for use in this session ---
trying URL 'http://cran.fhcrc.org/bin/windows/contrib/2.9/ff_2.0.1.zip'
Content type 'application/zip' length 779664 bytes (761 Kb)
opened URL
downloaded 761 Kb

Error in .Internal(int.unzip(zipname, NULL, dest)) :
  no internal function int.unzip



Please check if you have old base packages in a library that is used 
before the standard library in R_HOME/library when you start R-2.9.0.
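
A rough way to carry out this check from within R is sketched below. The
helper name find_shadowed_base is made up for this illustration; it simply
looks for directories named like base packages in any library on the search
path other than the standard one.

```r
# Hypothetical helper: list base packages that also exist in a library
# other than the standard one (.Library, i.e. R_HOME/library)
find_shadowed_base <- function(libs = .libPaths()) {
  base_pkgs <- rownames(installed.packages(lib.loc = .Library,
                                           priority = "base"))
  std <- normalizePath(.Library)
  hits <- list()
  for (lib in libs) {
    if (normalizePath(lib) == std) next      # skip the standard library
    found <- intersect(base_pkgs, list.files(lib))
    if (length(found) > 0) hits[[lib]] <- found
  }
  hits
}

.libPaths()            # libraries in the order R searches them
find_shadowed_base()   # empty list if nothing shadows the base packages
```

If the result is not empty, the listed directories are candidates for old
copies of base packages that are picked up before the ones shipped with the
running R version.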


Uwe Ligges



 
Bruce




[R] comparing matrices


I'm trying to compare two matrices made up of bits.

doing a simple comparison of

matA == matB

yields this sort of output.

  [,1] [,2]  [,3]  [,4]  [,5]  [,6]
[1,] FALSE TRUE FALSE  TRUE  TRUE FALSE
[2,]  TRUE TRUE  TRUE  TRUE  TRUE  TRUE
[3,] FALSE TRUE FALSE FALSE FALSE  TRUE
[4,] FALSE TRUE  TRUE FALSE FALSE FALSE
[5,]  TRUE TRUE  TRUE  TRUE FALSE FALSE
[6,]  TRUE TRUE  TRUE  TRUE FALSE FALSE

I really would like just one comprehensive value to say TRUE or FALSE.

This is the hack (rather ugly I think) I put together that works,
but there has to be a nicer way, no?

res=pop[1:ROWS,] == keep[1:ROWS,]

if ((ROWS*COL) == sum(res))
 {
   cat('they are equal\n')
 }else
   cat('they are NOT equal\n')
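
For a single TRUE or FALSE, base R offers a few options that may be tidier
than counting matches. The sketch below uses small made-up matrices (matA and
matB here are illustrative, not the poster's data).

```r
matA <- matrix(c(0, 1, 1, 0, 1, 0), nrow = 2)
matB <- matA

same_before <- identical(matA, matB)    # exact equality, attributes included
all_before  <- all(matA == matB)        # collapses the element-wise result

matB[1, 1] <- 99
same_after <- identical(matA, matB)
near_after <- isTRUE(all.equal(matA, matB))  # equality up to a tolerance

c(same_before, all_before, same_after, near_after)
```

identical() is the strictest of the three and also works when the dimensions
differ, whereas all(matA == matB) stops with an error for non-conformable
matrices. all.equal() returns a description of the differences rather than
FALSE, which is why it is wrapped in isTRUE().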

Thanks!

Esmail


[R] figure layout

2009-04-26 Thread hesicaia


Hello, 

I have a specific question regarding figure layout. I am trying to make a 2
by 1 figure, but I would like the bottom panel to be slightly larger than
the top one. I have read through the help posts and have tried
"fig=c(),new=T" and also "split.screen". Both work for most types of
plotting (e.g. hist, plot, etc.), but for some reason they do not work when
I use them with the metaplot function in package rmeta. The plots come out
OK, but they are on two separate pages instead of the same one. I realize
this is a very specific question, but was hoping someone might be able to
suggest how I could achieve this. Below is my code for both fig and
split.screen, minus the axis labels, which take up a lot of space in the code.
Thanks very much, Daniel.

***
library(rmeta)
meta.n <- meta.summaries(ttable.n$lin.yeff,ttable.n$lin.se,method="random")
conf.n <- 1/(meta.n$se.summary^2)
meta.c <- meta.summaries(ttable.c$lin.yeff,ttable.c$lin.se,method="random")
conf.c <- 1/(meta.c$se.summary^2)

bitmap("/scratch/dboyce/nmfs/figs/yeareffs.all.metaplot.dev.pdf",type="pdfwrite",res=800,height=6,width=6,pointsize=12)
par(mfrow=c(2,1),mar=c(1,2,1,1),oma=c(4,6,.5,.5),cex.axis=.8,fig=c(0,1,.6,1))
metaplot(mn=ttable.n$lin.yeff,se=ttable.n$lin.se,nn=(ttable.n$dev)-.05,labels=NULL,conf.level=0.95,summn=meta.n$summary,sumse=meta.n$se.summary,sumnn=conf.n/700,logeffect=F,colors=meta.colors(box="firebrick3",lines="gray38",zero="black",summary="blue",text="black"),xlim=c(-.06,.16),cex=1.2,ylab="",summlabel="",xaxt="n")
axis(side=1,at=seq(-.06,.16,by=.03),labels=T)
text(-.06,0.5,"A",cex=1.4)
box()
 
par(fig=c(0,1,0,.6),new=T)
metaplot(ttable.c$lin.yeff,ttable.c$lin.se,nn=ttable.c$dev,labels=NULL,conf.level=0.95,summn=meta.c$summary,sumse=meta.c$se.summary,sumnn=conf.c/7000,logeffect=F,colors=meta.colors(box="firebrick3",lines="gray38",zero="black",summary="blue",text="black"),xlim=c(-.02,.06),cex=1.2,summlabel="",ylab="",xaxt="n")
text(-.02,0.5,"B",cex=1.4)
axis(side=1,at=seq(-.02,.06,by=.01),labels=T)
mtext("Instantaneous rate of change",side=1,line=3,cex=1.5)
box()
dev.off()


split.screen(figs=c(2,1),erase=F)
screen(1)
metaplot(mn=ttable.n$lin.yeff,se=ttable.n$lin.se,nn=(ttable.n$dev)-.05,labels=NULL,conf.level=0.95,summn=meta.n$summary,sumse=meta.n$se.summary,sumnn=conf.n/700,logeffect=F,colors=meta.colors(box="firebrick3",lines="gray38",zero="black",summary="blue",text="black"),xlim=c(-.06,.16),cex=1.2,ylab="",summlabel="",xaxt="n")
screen(2)
par(new=T)
metaplot(ttable.c$lin.yeff,ttable.c$lin.se,nn=ttable.c$dev,labels=NULL,conf.level=0.95,summn=meta.c$summary,sumse=meta.c$se.summary,sumnn=conf.c/7000,logeffect=F,colors=meta.colors(box="firebrick3",lines="gray38",zero="black",summary="blue",text="black"),xlim=c(-.02,.06),cex=1.2,summlabel="",ylab="",xaxt="n")
close.screen(all=T)
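
For ordinary base-graphics plots, unequal panel heights are often set with
layout() rather than fig or split.screen. The sketch below uses plot() and
hist() as stand-ins and writes to a temporary PDF; whether metaplot() honours
layout() has not been checked here, so treat this as an idea to try rather
than a confirmed fix.

```r
out_file <- file.path(tempdir(), "two_panels.pdf")  # illustrative file name
pdf(out_file, width = 6, height = 6)

# Two rows, one column; the bottom panel gets 3/5 of the height
layout(matrix(1:2, nrow = 2), heights = c(2, 3))
par(mar = c(4, 4, 1, 1))

plot(1:10, (1:10)^2, xlab = "x", ylab = "y")        # top panel
mtext("A", side = 3, adj = 0, line = -1.5)

hist(rnorm(100), main = "", xlab = "value")         # bottom panel
mtext("B", side = 3, adj = 0, line = -1.5)

dev.off()
```

The heights argument takes relative sizes, so c(2, 3) corresponds to the
0.4 / 0.6 split used with fig in the code above.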

-- 
View this message in context: 
http://www.nabble.com/figure-layout-tp23242699p23242699.html
Sent from the R help mailing list archive at Nabble.com.


[R] doubt in vglm output

2009-04-26 Thread priyabrata panigrahi

I am using vglm for multiple logistic regression.
I have 1 response variable (4 categories in total)
and 5 predictors.

Call:
vglm(formula = class ~ PC1 + PC2 + PC3 + PC4 + PC5, family = multinomial(),
na.action = na.pass)

Coefficients:
(Intercept):1 (Intercept):2         PC1:1         PC1:2         PC2:1
   -0.5480417    -1.0716498     0.5146799     0.1578941    -0.3111874
        PC2:2         PC3:1         PC3:2         PC4:1         PC4:2
    0.5213314    -0.9584294    -0.9889684     0.8510812     1.2110904
        PC5:1         PC5:2
    0.5832257     0.5126038

Degrees of Freedom: 330 Total; 318 Residual
Residual Deviance: 216.9244
Log-likelihood: -108.4622


I do not understand whether this model is good or not.
What does the log-likelihood value tell me? Should it be low or high?

I ask because I used this model to predict the category of the response
variable for the same data points that were used to fit the model.

72% of the training data (those used to fit the model) were predicted
correctly.

please help
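
As a small worked check on the output above: the residual deviance printed
there equals minus twice the log-likelihood, so the two numbers carry the
same information. A higher (less negative) log-likelihood means a better fit
to the same data, but the value is only meaningful relative to another model
fitted to those data. The AIC line below uses the usual definition with the
12 coefficients shown; it is an illustration, not something vglm printed.

```r
loglik <- -108.4622   # log-likelihood from the output above
k      <- 12          # number of estimated coefficients in the output

deviance_from_loglik <- -2 * loglik   # matches the residual deviance shown
aic <- -2 * loglik + 2 * k            # standard AIC definition

deviance_from_loglik
aic
```

Comparing this AIC (or the deviance) with that of an intercept-only model
fitted to the same data would give a sense of how much the predictors help.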

[[alternative HTML version deleted]]


[R] Problem with create a tree

2009-04-26 Thread Grześ


I have a small database (file csv):

'District';'HouseType';'Income';'PreviousCustomer';'Outcome'
'Suburban';'Detached';'High';'No';'Nothing'
'Suburban';'Detached';'High';'Yes';'Nothing'
'Rural';'Detached';'High';'No';'Responded' ... etc.

Running str() on it gives:

 str(zielone)
'data.frame': 14 obs. of 5 variables:
$ X.District. : Factor w/ 3 levels 'Rural','Suburban',..: 2 2 1 3 3 3 1
2 2 3 ...
$ X.HouseType. : Factor w/ 3 levels 'Detached','Semi-detached',..: 1 1 1
2 2 2 2 3 2 3 ...
$ X.Income. : Factor w/ 2 levels 'High','Low': 1 1 1 1 2 2 2 1 2 2 ...
$ X.PreviousCustomer.: Factor w/ 2 levels 'No','Yes': 1 2 1 1 1 2 2 1 1
1 ...
$ X.Outcome. : Factor w/ 2 levels 'Nothing','Responded': 1 1 2 2 2 1 2 1
2 2 ...

But when I try to build a tree I get an error. What's wrong with my file?

 t.zielone=rpart(zielone$X.District~.,zielone)
 plot(t.zielone)
Error in plot.rpart(t.zielone) : fit is not a tree, just a root
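
Two things may be worth checking, sketched below under stated assumptions.
First, the X.District. style names suggest the single quotes were read as
part of the text; passing quote="'" to read.csv handles that (shown on the
three rows from the post). Second, rpart's default minsplit is 20, so with
only 14 observations the root node is never split; the second half shows
this on 14 rows of iris, used here as a stand-in because the full file is
not available.

```r
library(rpart)

# The three rows shown in the post, read with ';' separators and ' quotes
txt <- "'District';'HouseType';'Income';'PreviousCustomer';'Outcome'
'Suburban';'Detached';'High';'No';'Nothing'
'Suburban';'Detached';'High';'Yes';'Nothing'
'Rural';'Detached';'High';'No';'Responded'"
zielone3 <- read.csv(text = txt, sep = ";", quote = "'")
names(zielone3)

# Stand-in data set with 14 observations (NOT the poster's data)
small <- iris[c(1:5, 51:55, 101:104), ]

# Default control: 14 < minsplit (20), so only a root node is produced
fit_default <- rpart(Species ~ ., data = small)

# Relaxed control values (illustrative) allow the tree to split
fit_small <- rpart(Species ~ ., data = small,
                   control = rpart.control(minsplit = 2, minbucket = 1,
                                           xval = 0))

nrow(fit_default$frame)   # number of nodes in the default fit
nrow(fit_small$frame)     # number of nodes with relaxed control
```

Whether a tree grown on 14 rows is useful is a separate question; the
relaxed settings only show why the default fit is "just a root".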

-- 
View this message in context: 
http://www.nabble.com/Problem-with-create-a-tree-tp23243589p23243589.html
Sent from the R help mailing list archive at Nabble.com.


Re: [R] ANOVA/statistics question

2009-04-26 Thread drmh


Hello again,
In my situation, I have three variables: pretest, posttest, and cohesion. 

I want to work out the correlation between posttest and cohesion. 

I looked at multiple sets of data and created ANOVA tables of them. However,
as pretest and posttest are sometimes correlated (with a statistical
significance < 0.05), it is necessary to discount the effect of pretest to
work out the real correlation of posttest and coherence. I need a system
for working out the strength of the correlation between posttest and
coherence, when it does actually occur. 
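
Discounting the effect of pretest in this way is usually called a partial
correlation. The sketch below is a hedged illustration, not a recommendation
for this particular study. The first part uses the sums of squares from the
ANOVA table quoted further down in this thread (pre entered first, then
coh); the second part uses simulated data, which are NOT the poster's data,
to show that the same quantity can be obtained by correlating residuals.

```r
# Sums of squares copied from the ANOVA table quoted in this thread
ss_coh <- 0.0431130207164516
ss_res <- 0.119575396598395

# Share of the post variance left after pre that coh accounts for
partial_r2 <- ss_coh / (ss_coh + ss_res)
partial_r  <- sqrt(partial_r2)   # magnitude only; the table has no sign
partial_r2
partial_r

# Same idea from raw data (simulated, for illustration only)
set.seed(42)
n    <- 18
pre  <- runif(n)
coh  <- rnorm(n)
post <- 0.3 * pre + 0.1 * coh + rnorm(n, sd = 0.1)

# Correlate what is left of post and coh after removing pre from both
r_resid <- cor(resid(lm(post ~ pre)), resid(lm(coh ~ pre)))

# Squared, it matches the ratio computed from the sequential ANOVA table
tab <- anova(lm(post ~ pre + coh))
ss  <- tab[["Sum Sq"]]
r2_table <- ss[2] / (ss[2] + ss[3])
c(r_resid^2, r2_table)
```

For the posted table this gives a squared partial correlation of about
0.265, which is a measure of strength, whereas Pr(>F) only speaks to
statistical significance.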

According to my understanding level refers the amount or magnitude of
experimental units. Pretest, posttest are scores - range from any value from
0 to 1. Cohesion can be any value.

What exactly would
cor(y[pre == 1], x[pre == 1])
cor(y[pre == 2], x[pre == 2])
give me?

I understand that my lack of understanding may be exasperating, but it is
not for want of effort - I have put in hours trying to understand this
stuff.

Thanks,
Douglas Holmes



Tal Galili wrote:
 
 Hi Douglas.
 So you want to check for correlation or regression ?
 
 how many levels does pre have ?
 you could subset the variables you want to check correlation on, by the
 pre
 levels.
 
 for example:
 Let's say pre has two levels: 1 and 2. then you can do:
 cor(y[pre == 1], x[pre == 1])
 cor(y[pre == 2], x[pre == 2])
 
 
 Also, If you want to go for regression, you can go to something like:
 summary(lm(y ~ x + pre))
 But I suggest getting much more understanding of what linear regression is
 before using it's methods...
 
 Good luck,
 Tal
 
 
 
 On Sat, Apr 25, 2009 at 1:26 PM, drmh
 douglasrmhol...@googlemail.comwrote:
 

 Hi, thanks for your prompt reply

 In my situation, the dependent variable is post-test and the
 independent
 variables are pre and coh.
 Howw would I find the correlation between coh and post with the effect of
 pre regressed using your commands?



 Tal Galili wrote:
 
  Hi Douglas
  I would go for a different command then aov.
  something like:
  ?cor
  or
  ?cor.test
  To also get the p value of the correlation.
 
  Cheers,
  Tal
 
 
 
  On Sat, Apr 25, 2009 at 8:27 AM, drmh
  douglasrmhol...@googlemail.comwrote:
 
 
  (Have searched for this already)
 
  Hi,
 
  How do you find the strength of correlation between two variables using an
  ANOVA table?  Pr(>F) gives the statistical significance of the
  association, but not the strength of the correlation.
 
  See data (from R) below
 
  Readable:
              Df   Sum Sq     Mean Sq      F value     Pr(>F)
  pre          1   0.00593    0.00593936   0.7450563   0.401636958677004
  coh          1   0.04311    0.04311302   5.4082639   0.0344751749542619
  Residuals   15   0.11957    0.00797169   NA          NA

  Original:
  Df Sum Sq Mean Sq F value Pr(>F)
  pre 1 0.0059393604629317 0.0059393604629317 0.745056336657567 0.401636958677004
  coh 1 0.0431130207164516 0.0431130207164516 5.40826398359156 0.0344751749542619
  Residuals 15 0.119575396598395 0.00797169310655964 NA NA
 
  Any help would be greatly appreciated,
  Douglas Holmes

Re: [R] Memory issues in R


Then post the material that would make sense for Windows.

What _does_ memory.limit() return? This _was_ asked and you did not
answer.
How many other objects do you have in your workspace?
How big are they?

Jim Holtman offered this function that displays memory occupation by  
object and total:

my.ls <- function (pos = 1, sorted = FALSE)
{
    # size in bytes of every object found at search position 'pos'
    .result <- sapply(ls(pos = pos, all.names = TRUE), function(..x)
        object.size(eval(as.symbol(..x))))
    if (sorted) {
        .result <- rev(sort(.result))
    }
    .ls <- as.data.frame(rbind(as.matrix(.result), `**Total` =
        sum(.result)))
    names(.ls) <- "Size"
    .ls$Size <- formatC(.ls$Size, big.mark = ",", digits = 0,
        format = "f")
    .ls$Mode <- c(unlist(lapply(rownames(.ls)[-nrow(.ls)],
        function(x) mode(eval(as.symbol(x))))),
        "---")
    .ls
}


On Apr 26, 2009, at 12:19 PM, Neotropical bat risk assessments wrote:


 Thanks for the comments,

 I did read the FAQ and that link you sent the first time. No help  
 and very general.

 I did set  memory.size(max = TRUE)  but still get same warning-error  
 message.

 Bruce

 At 09:58 AM 4/26/2009, you wrote:

 On Apr 26, 2009, at 11:20 AM, Neotropical bat risk assessments wrote:


   How do people deal with R and memory issues?

 They should read the R-FAQ and the Windows FAQ as you say you have.

 http://cran.r-project.org/bin/windows/base/rw-FAQ.html#There-seems-to-be-a-limit-on-the-memory-it-uses_0021

David Winsemius, MD
Heritage Laboratories
West Hartford, CT



Re: [R] comparing matrices

2009-04-26 Thread ONKELINX, Thierry

Have a look at all.equal

matA <- matrix(1:4, ncol = 2)
matB <- matA
all.equal(matA, matB)
matB[1,1] <- -10
all.equal(matA, matB)
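
Note that all.equal() returns either TRUE or a character description of the differences, so a single TRUE/FALSE answer needs a wrapper. A minimal sketch on made-up bit matrices (the data and the variable names below are hypothetical, not the poster's pop and keep):

```r
# Two small bit matrices (hypothetical data).
matA <- matrix(c(0, 1, 1, 0, 1, 1), ncol = 3)
matB <- matA

same_exact <- identical(matA, matB)          # TRUE: same type, dims and values
same_elem  <- all(matA == matB)              # TRUE: every element matches

matB[1, 1] <- 1                              # flip one bit

diff_exact <- identical(matA, matB)          # FALSE
diff_toler <- isTRUE(all.equal(matA, matB))  # FALSE; all.equal() alone returns a message
n_diff     <- sum(matA != matB)              # number of differing cells
```

identical() is the strictest test (it also compares storage type and attributes), while all(matA == matB) only compares values, which may be enough for bit matrices of equal shape.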

HTH,

Thierry 




ir. Thierry Onkelinx
Instituut voor natuur- en bosonderzoek / Research Institute for Nature
and Forest
Cel biometrie, methodologie en kwaliteitszorg / Section biometrics,
methodology and quality assurance
Gaverstraat 4
9500 Geraardsbergen
Belgium 
tel. + 32 54/436 185
thierry.onkel...@inbo.be 
www.inbo.be 

To call in the statistician after the experiment is done may be no more
than asking him to perform a post-mortem examination: he may be able to
say what the experiment died of.
~ Sir Ronald Aylmer Fisher

The plural of anecdote is not data.
~ Roger Brinner

The combination of some data and an aching desire for an answer does not
ensure that a reasonable answer can be extracted from a given body of
data.
~ John Tukey

-----Original message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org]
On behalf of Esmail
Sent: Sunday, 26 April 2009 19:03
To: R mailing list
Subject: [R] comparing matrices

I'm trying to compare two matrices made up of bits.

doing a simple comparison of

 matA == matB

yields this sort of output.

   [,1] [,2]  [,3]  [,4]  [,5]  [,6]
[1,] FALSE TRUE FALSE  TRUE  TRUE FALSE
[2,]  TRUE TRUE  TRUE  TRUE  TRUE  TRUE
[3,] FALSE TRUE FALSE FALSE FALSE  TRUE
[4,] FALSE TRUE  TRUE FALSE FALSE FALSE
[5,]  TRUE TRUE  TRUE  TRUE FALSE FALSE
[6,]  TRUE TRUE  TRUE  TRUE FALSE FALSE

I really would like just one comprehensive value to say TRUE or FALSE.

This is the hack (rather ugly I think) I put together that works,
but there has to be a nicer way, no?

 res=pop[1:ROWS,] == keep[1:ROWS,]

 if ((ROWS*COL) == sum(res))
  {
cat('they are equal\n')
  }else
cat('they are NOT equal\n')

Thanks!

Esmail


Re: [R] THE EQUIVALENT OF SQL INNER TABLE JOIN IN R

2009-04-26 Thread Wacek Kusnierczyk

Peter Dalgaard wrote:
 Nigel Birney wrote:
 Hello all,

 Apologize for the newbie question. What's the easiest way to do a SQL
 inner
 table join in R?
 Say I have a table containing column names A, B, C and another which has
 columns named C, D, E. I would like to do an inner table join on C and
 produce a table A, B, C, D, E.

 merge(), perhaps? Otherwise describe what an inner table join does.

btw., i think ?merge has it wrong when it comes to the sql join terminology:

 In SQL database terminology, the default value of 'all = FALSE'
 gives a _natural join_, a special case of an _inner join_.

following [1, sec. 6.5] (and in concordance with the typical use of the
terms in the db lingo, as of my rather limited knowledge), a natural
join is a join where values are compared pairwise for columns with the
same names across the joined tables.  the result from merge with
all=FALSE does not have to be a natural join, while it will be an inner
join, as in:

d1 = data.frame(a=1:5, b=rnorm(5))
d2 = data.frame(c=3:7, d=rnorm(5))

merge(d1, d2, all=FALSE)
# 25 rows, a cross join (an outer join)
 # *not* an inner join, even less so a natural join

merge(d1, d2, by.x='a', by.y='c', all=FALSE)
# 3 rows, an inner join
# *not* a natural join

the point is, all=FALSE gives a natural join iff by is equivalent to
intersect(names(x), names(y)), and these two conditions together are
necessary (and sufficient) for a join to be a natural join.
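
For the tables in the original question (columns A, B, C and C, D, E) the default 'by' is exactly the shared column C, so the default merge() is a natural join. A small sketch with made-up data (the values and the object names t1, t2, joined, keyed, crossed are only illustrative):

```r
# Hypothetical tables shaped like the ones in the question.
t1 <- data.frame(A = 1:4, B = letters[1:4], C = c(10, 20, 30, 40))
t2 <- data.frame(C = c(20, 30, 50), D = c("x", "y", "z"),
                 E = c(TRUE, FALSE, TRUE))

# by defaults to intersect(names(t1), names(t2)), i.e. "C";
# all = FALSE keeps only rows whose C occurs in both tables.
joined <- merge(t1, t2, all = FALSE)

# merge() puts the key column first; reorder to A, B, C, D, E if desired.
joined_ordered <- joined[, c("A", "B", "C", "D", "E")]

# Differently named keys: an inner join that is not a natural join.
keyed <- merge(data.frame(a = 1:5), data.frame(c = 3:7),
               by.x = "a", by.y = "c")

# No common column names at all: every row is paired with every row.
crossed <- merge(data.frame(a = 1:5), data.frame(c = 3:7))
```

Here joined has two rows (C equal to 20 and 30), keyed has three, and crossed has 25.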

the snippet from ?merge quoted above is wrong and misleading, and should
be corrected to sth like:

 In SQL database terminology, the default value of 'all = FALSE'
 gives an _inner join_.  If, in addition, 'by' is equivalent to
'intersect(names(x), names(y))', then the join is a _natural join_, a
special case of an _inner join_.

or, if the authors insist ?merge is correct, would they provide a reference?

(in fact, the terminology is not that coherent;  e.g., in mysql natural
merely refers to column names, and not to how to choose rows, and one
can have natural outer joins -- which are not, in general, inner joins.)


vQ

[1] c.j. date's, sql and relational theory, o'reilly 2009


Re: [R] comparing matrices

2009-04-26 Thread baptiste auguie


I'm not sure I'm following you but have you tried,

identical(matrix(c(1,1,1,1),ncol=2), matrix(c(1,1,1,1),ncol=2))

?all.equal
?isTRUE
?identical

and possibly the compare package,

compare(matrix(c(1,1,1,1),ncol=2),matrix(c(1,1,1,1),ncol=2))


HTH,

baptiste

On 26 Apr 2009, at 18:02, Esmail wrote:


I'm trying to compare two matrices made up of bits.

doing a simple comparison of

matA == matB

yields this sort of output.

  [,1] [,2]  [,3]  [,4]  [,5]  [,6]
[1,] FALSE TRUE FALSE  TRUE  TRUE FALSE
[2,]  TRUE TRUE  TRUE  TRUE  TRUE  TRUE
[3,] FALSE TRUE FALSE FALSE FALSE  TRUE
[4,] FALSE TRUE  TRUE FALSE FALSE FALSE
[5,]  TRUE TRUE  TRUE  TRUE FALSE FALSE
[6,]  TRUE TRUE  TRUE  TRUE FALSE FALSE

I really would like just one comprehensive value to say TRUE or FALSE.

This is the hack (rather ugly I think) I put together that works,
but there has to be a nicer way, no?

res=pop[1:ROWS,] == keep[1:ROWS,]

if ((ROWS*COL) == sum(res))
 {
   cat('they are equal\n')
 }else
   cat('they are NOT equal\n')

Thanks!

Esmail



_

Baptiste Auguié

School of Physics
University of Exeter
Stocker Road,
Exeter, Devon,
EX4 4QL, UK

Phone: +44 1392 264187

http://newton.ex.ac.uk/research/emag


Re: [R] Problem installing packages

Unfortunately this problem is difficult to reproduce. If it were not, it would
have been reported and fixed earlier.


It probably happens only in some specific circumstances. I will look into it.

Jarek

Uwe Ligges wrote:



Jarek Jasiewicz wrote:

I tried several mirrors

But what may be more important: this error was reported earlier on Vista and
Windows XP, whereas I use Ubuntu 8.04.



Unfortunately, I cannot reproduce this for sp on Windows XP.

Uwe



Uwe Ligges wrote:



Jarek Jasiewicz wrote:

Since version 2.9.0 I have had a problem installing packages:

install.packages("sp")
--- Please select a CRAN mirror for use in this session ---
Loading Tcl/Tk interface ... done
Warning: unable to access index for repository 
http://piotrkosoft.net/pub/mirrors/CRAN/src/contrib

Warning messages:
1: In open.connection(con, r) : unable to resolve ''
2: In list.files(lib) : list.files: 'sp' is not a readable directory

The repository is working and accessible, and the sp package is in the repo.

Am I doing something wrong?



Have you tried another mirror (that one seems to be quite slow 
currently)?


Uwe Ligges



Jarek


Re: [R] comparing matrices



On Apr 26, 2009, at 1:02 PM, Esmail wrote:


I'm trying to compare two matrices made up of bits.

doing a simple comparison of

   matA == matB


 identical( matrix((1:4), ncol=2), matrix((1:4), nrow=2))
[1] TRUE
 identical( matrix((1:4), ncol=2), matrix((2:5), nrow=2))
[1] FALSE




yields this sort of output.

 [,1] [,2]  [,3]  [,4]  [,5]  [,6]
[1,] FALSE TRUE FALSE  TRUE  TRUE FALSE
[2,]  TRUE TRUE  TRUE  TRUE  TRUE  TRUE
[3,] FALSE TRUE FALSE FALSE FALSE  TRUE
[4,] FALSE TRUE  TRUE FALSE FALSE FALSE
[5,]  TRUE TRUE  TRUE  TRUE FALSE FALSE
[6,]  TRUE TRUE  TRUE  TRUE FALSE FALSE

I really would like just one comprehensive value to say TRUE or FALSE.

This is the hack (rather ugly I think) I put together that works,
but there has to be a nicer way, no?

   res=pop[1:ROWS,] == keep[1:ROWS,]


   if ((ROWS*COL) == sum(res))
{
  cat('they are equal\n')
}else
  cat('they are NOT equal\n')


That code is meaningless to us without a definition (in R) of ROWS,   
COL, keep, and pop



Thanks!

Esmail



David Winsemius, MD
Heritage Laboratories
West Hartford, CT


Re: [R] comparing matrices


baptiste auguie wrote:

I'm not sure I'm following you but have you tried,

identical(matrix(c(1,1,1,1),ncol=2), matrix(c(1,1,1,1),ncol=2))

?all.equal
?isTRUE
?identical

and possibly the compare package,

compare(matrix(c(1,1,1,1),ncol=2),matrix(c(1,1,1,1),ncol=2))


HTH,

baptiste


Hi Baptiste,

Thanks for pointing out the various options that exist. R is
a very rich language indeed and it's good to know how to accomplish
tasks in various ways.

Cheers,
Esmail


Re: [R] comparing matrices


ONKELINX, Thierry wrote:

Have a look at all.equal

matA <- matrix(1:4, ncol = 2)
matB <- matA
all.equal(matA, matB)
matB[1,1] <- -10
all.equal(matA, matB)


Hi Thierry,

Thanks, all.equal does indicate if it's all equal so that
works great!

Much nicer than my hack - thanks,

Esmail


Re: [R] figure layout

2009-04-26 Thread Cuvelier Etienne



hesicaia wrote:
Hello, 


I have a specific question regarding figure layout. I am trying to make a 2
by 1 figure but I would like to make the bottom figure slightly larger than
the top figure. I have read through the help posts and have tried to use
"fig=c(), new=T" and have also tried to use "split.screen", and although
they both work for most types of plotting (i.e. hist, plot, etc.), for some
strange reason they do not function when I try to use them with the
"metaplot" function in package "rmeta". The plots come out OK, but they are
on two separate pages instead of the same one. I realize this is a very
specific question, but was hoping someone might be able to suggest how I
could achieve this. Below is my code for both "fig" and "split.screen",
minus the axis labels, which take up a lot of space in the code.
Thanks very much, Daniel.


You can try to use the layout function:

?layout
#Example 1
# First plot takes 3/5 of the screen width
# Second plot takes 5/5 of the screen width
layout(matrix(c(0,1,1,1,0,2,2,2,2,2),nrow = 2, byrow=TRUE))
plot(density(rnorm(1000)))
plot(density(rnorm(1000)))

#Example 2
# First plot takes 4/6 of the screen width
# Second plot takes 6/6 of the screen width
layout(matrix(c(0,1,1,1,1,0,2,2,2,2,2,2),nrow = 2, byrow=TRUE))
plot(density(rnorm(1000)))
plot(density(rnorm(1000)))
...
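
If the aim is a taller bottom panel rather than a wider one, layout() also accepts a heights argument. A minimal sketch with ordinary density plots (the 2:3 height ratio and the name n_panels are just illustrative; whether metaplot honours the layout is not checked here):

```r
# Two stacked panels; the bottom one gets 3/5 of the device height.
n_panels <- layout(matrix(1:2, nrow = 2), heights = c(2, 3))

plot(density(rnorm(1000)), main = "top (2/5 of the height)")
plot(density(rnorm(1000)), main = "bottom (3/5 of the height)")

layout(1)   # restore the default single-panel layout
```

layout.show(n_panels) can be called right after layout() to preview the panel outlines before plotting.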

I hope it helps.

Etienne



***
library(rmeta)
meta.n<-meta.summaries(ttable.n$lin.yeff,ttable.n$lin.se,method="random")
conf.n<-1/(meta.n$se.summary^2)
meta.c<-meta.summaries(ttable.c$lin.yeff,ttable.c$lin.se,method="random")
conf.c<-1/(meta.c$se.summary^2)

bitmap("/scratch/dboyce/nmfs/figs/yeareffs.all.metaplot.dev.pdf",type="pdfwrite",res=800,height=6,width=6,pointsize=12)
par(mfrow=c(2,1),mar=c(1,2,1,1),oma=c(4,6,.5,.5),cex.axis=.8,fig=c(0,1,.6,1))
metaplot(mn=ttable.n$lin.yeff,se=ttable.n$lin.se,nn=(ttable.n$dev)-.05,labels=NULL,conf.level=0.95,summn=meta.n$summary,sumse=meta.n$se.summary,sumnn=conf.n/700,logeffect=F,colors=meta.colors(box="firebrick3",lines="gray38",zero="black",summary="blue",text="black"),xlim=c(-.06,.16),cex=1.2,ylab="",summlabel="",xaxt="n")
axis(side=1,at=seq(-.06,.16,by=.03),labels=T)
text(-.06,0.5,"A",cex=1.4)
box()

par(fig=c(0,1,0,.6),new=T)

metaplot(ttable.c$lin.yeff,ttable.c$lin.se,nn=ttable.c$dev,labels=NULL,conf.level=0.95,summn=meta.c$summary,sumse=meta.c$se.summary,sumnn=conf.c/7000,logeffect=F,colors=meta.colors(box="firebrick3",lines="gray38",zero="black",summary="blue",text="black"),xlim=c(-.02,.06),cex=1.2,summlabel="",ylab="",xaxt="n")
text(-.02,0.5,"B",cex=1.4)
axis(side=1,at=seq(-.02,.06,by=.01),labels=T)
mtext("Instantaneous rate of change",side=1,line=3,cex=1.5)
box()
dev.off()


split.screen(figs=c(2,1),erase=F)
screen(1)
metaplot(mn=ttable.n$lin.yeff,se=ttable.n$lin.se,nn=(ttable.n$dev)-.05,labels=NULL,conf.level=0.95,summn=meta.n$summary,sumse=meta.n$se.summary,sumnn=conf.n/700,logeffect=F,colors=meta.colors(box="firebrick3",lines="gray38",zero="black",summary="blue",text="black"),xlim=c(-.06,.16),cex=1.2,ylab="",summlabel="",xaxt="n")
screen(2)
par(new=T)
metaplot(ttable.c$lin.yeff,ttable.c$lin.se,nn=ttable.c$dev,labels=NULL,conf.level=0.95,summn=meta.c$summary,sumse=meta.c$se.summary,sumnn=conf.c/7000,logeffect=F,colors=meta.colors(box="firebrick3",lines="gray38",zero="black",summary="blue",text="black"),xlim=c(-.02,.06),cex=1.2,summlabel="",ylab="",xaxt="n")
close.screen(all=T)




[R] eager to learn how to use sapply, lapply, ...

2009-04-26 Thread mauede

After a year my R programming style is still very C like.
I am still writing a lot of for loops and finding it difficult to recognize 
where, in place of loops, I could just do the
same with one line of code, using sapply, lapply, or the like.
On-line examples for such high level function do not help me.
Even if, sooner or later, I am getting my R scripts to do what I expect, I 
would really like to shake my C programming style off.
I am staring at my R script and thinking how can I improve it ?
For instance, I have a lot of loops similar to the following one and wonder 
whether I can replace them with a proper call to a high level R function that 
does the same:

Nstart <- Nfour/(2^Lev) + 1
Nfinish <- Nstart - 1 + Nfour/(2^Lev)
LengLev <- Nfinish - Nstart + 1
NW <- floor(LengLev*N/Nfour)
if (NW > 0) {
  for (j in Nstart:(Nstart + NW - 1)) {
    Dw <- abs(Y[j])
    Rnorm <- Rnorm + Dw^2
  }
}


Thank you very much for helping me get better.
Maura






Re: [R] eager to learn how to use sapply, lapply, ...

2009-04-26 Thread jim holtman

I think you can replace your 'for' loop with vectorized operations:

  if (NW > 0) {
      Rnorm <- Rnorm + sum(abs(Y[Nstart:(Nstart + NW - 1)])^2)
  }

On Sun, Apr 26, 2009 at 1:42 PM,  mau...@alice.it wrote:
 After a year my R programming style is still very C like.
 I am still writing a lot of for loops and finding it difficult to recognize 
 where, in place of loops, I could just do the
 same with one line of code, using sapply, lapply, or the like.
 On-line examples for such high level function do not help me.
 Even if, sooner or later, I am getting my R scripts to do what I expect, I 
 would really like to shake my C programming style off.
 I am staring at my R script and thinking how can I improve it ?
 For instance, I have a lot of loops similar to the following one and wonder 
 whether I can replace them with a proper call to a high level R function that 
 does the same:

     Nstart <- Nfour/(2^Lev) + 1
     Nfinish <- Nstart - 1 + Nfour/(2^Lev)
     LengLev <- Nfinish - Nstart + 1
     NW <- floor(LengLev*N/Nfour)
     if (NW > 0) {
       for (j in Nstart:(Nstart + NW - 1)) {
          Dw <- abs(Y[j])
          Rnorm <- Rnorm + Dw^2
       }
     }


 Thank you very much for helping me get better.
 Maura









-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?


Re: [R] ANOVA/statistics question

2009-04-26 Thread Peter Flom

drmh douglasrmhol...@googlemail.com wrote

Hello again,
In my situation, I have three variables: pretest, posttest, and cohesion. 

I want to work out the correlation between posttest and cohesion.


cor(cohesion, posttest) gives you this.

I looked at multiple sets of data and created ANOVA tables of them. However,
as pretest and posttest are sometimes correlated (with a statistical
significance < 0.05), it is necessary to discount the effect of pretest to
work out the real correlation of posttest and coherence. I need a system
for working out the strength of the correlation between posttest and
coherence, when it does actually occur.

Whether pretest and posttest are correlated, and whether that correlation is 
statistically significant, is irrelevant to your question as posed.  Correlation
is defined between two variables, not among three.  

You might want some sort of regression such as 

lm(cohesion~pretest+posttest)

but you might not
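
If what is wanted is the correlation of posttest and cohesion with pretest held fixed, that is a partial correlation, and its square can be read off a sequential ANOVA table as SS(coh) / (SS(coh) + SS(Residuals)). A hedged sketch: the two sums of squares are copied from the table posted earlier in this thread, while the raw data are simulated and purely hypothetical (18 observations, to match the posted degrees of freedom):

```r
# Sums of squares copied from the ANOVA table posted in this thread.
ss_coh   <- 0.0431130207164516
ss_resid <- 0.119575396598395
partial_r2 <- ss_coh / (ss_coh + ss_resid)  # squared partial correlation
partial_r  <- sqrt(partial_r2)              # magnitude only; the sign comes
                                            # from the regression coefficient

# The same quantity from raw data (simulated, hypothetical values).
set.seed(42)
pretest  <- runif(18)
cohesion <- rnorm(18)
posttest <- 0.5 * pretest + 0.1 * cohesion + rnorm(18, sd = 0.1)

# Correlate what is left of each variable after removing pretest.
r_resid <- cor(resid(lm(posttest ~ pretest)),
               resid(lm(cohesion ~ pretest)))

# Cross-check against the sequential ANOVA table of the full model.
tab <- anova(lm(posttest ~ pretest + cohesion))
r2_from_table <- tab["cohesion", "Sum Sq"] /
  (tab["cohesion", "Sum Sq"] + tab["Residuals", "Sum Sq"])
```

For the posted table this gives a squared partial correlation of about 0.265, i.e. a magnitude of roughly 0.51. The identity only holds when pretest enters the model before cohesion, as it does in the posted table.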


According to my understanding, "level" refers to the amount or magnitude of
experimental units.


What is level?  You mention pretest, posttest and cohesion - now you mention 
level.
What are these experimental units?

 Pretest and posttest are scores that can take any value from
0 to 1. Cohesion can be any value.

What exactly would
cor(y[pre == 1], x[pre == 1])
cor(y[pre == 2], x[pre == 2])
give me?


well, you said above that pretest and posttest can range from 0 to 1; if this 
is the case, pre would rarely be 1 and never be 2, so the first line above 
wouldn't give you much, and the second wouldn't give you anything.  Also, you 
are now using y and x instead of (presumably) cohesion and posttest, and pre 
instead of, presumably, pretest.


Peter

Peter L. Flom, PhD
Statistical Consultant
www DOT peterflomconsulting DOT com


Re: [R] eager to learn how to use sapply, lapply, ...

Also you don't need the abs since you are
squaring it anyways and seq with length
argument is a bit cleaner:

ix <- seq(Nstart, length = NW)
sum(Y[ix]^2)
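
To check that the loop and the vectorized form agree, here is a small self-contained comparison. The values of Y, Nfour, Lev and N are made up for illustration (the original post did not supply any), and loop_norm and vec_norm are hypothetical names:

```r
# Hypothetical inputs standing in for the poster's data.
set.seed(1)
Y     <- rnorm(64)
Nfour <- 64
Lev   <- 2
N     <- 32

Nstart  <- Nfour / (2^Lev) + 1        # 17
LengLev <- Nfour / (2^Lev)            # 16, same as Nfinish - Nstart + 1
NW      <- floor(LengLev * N / Nfour) # 8

# C-style accumulation, as in the original post.
loop_norm <- 0
if (NW > 0) {
  for (j in Nstart:(Nstart + NW - 1)) {
    loop_norm <- loop_norm + abs(Y[j])^2
  }
}

# Vectorized form: index once, square, sum.
ix       <- seq(Nstart, length.out = NW)
vec_norm <- sum(Y[ix]^2)

# With NW = 0 the index is empty and the sum is 0, so no guard is needed.
empty_norm <- sum(Y[seq(Nstart, length.out = 0)]^2)
```

The empty-index case is the main practical gain of seq(..., length.out = NW) over Nstart:(Nstart + NW - 1), which counts backwards when NW is 0.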

Also please read the last line on every
message to r-help.  The code in your
post lacks all 4 of the asked for
ingredients:  1. there are no comments,
2. it has lines not subsequently used
(not minimal), 3. it is not self-contained
and 4. there is no reproducible
calculation.

On Sun, Apr 26, 2009 at 2:04 PM, jim holtman jholt...@gmail.com wrote:
 I think you can replace your 'for' loop with vectorized operations:

  if (NW > 0) {
      Rnorm <- Rnorm + sum(abs(Y[Nstart:(Nstart + NW - 1)])^2)
    }

 On Sun, Apr 26, 2009 at 1:42 PM,  mau...@alice.it wrote:
 After a year my R programming style is still very C like.
 I am still writing a lot of for loops and finding it difficult to 
 recognize where, in place of loops, I could just do the
 same with one line of code, using sapply, lapply, or the like.
 On-line examples for such high level function do not help me.
 Even if, sooner or later, I am getting my R scripts to do what I expect, I 
 would really like to shake my C programming style off.
 I am staring at my R script and thinking how can I improve it ?
 For instance, I have a lot of loops similar to the following one and wonder 
 whether I can replace them with a proper call to a high level R function 
 that does the same:

     Nstart <- Nfour/(2^Lev) + 1
     Nfinish <- Nstart - 1 + Nfour/(2^Lev)
     LengLev <- Nfinish - Nstart + 1
     NW <- floor(LengLev*N/Nfour)
     if (NW > 0) {
       for (j in Nstart:(Nstart + NW - 1)) {
          Dw <- abs(Y[j])
          Rnorm <- Rnorm + Dw^2
       }
     }


 Thank you very much for helping me get better.
 Maura









 --
 Jim Holtman
 Cincinnati, OH
 +1 513 646 9390

 What is the problem that you are trying to solve?


Re: [R] dotplot: labeling coordinates for each point

2009-04-26 Thread Deepayan Sarkar

On 4/26/09, Qifei Zhu zhu_qi...@yahoo.com.sg wrote:
 Hi David,

  Thanks! It looks much better now. but is there any way to add (x,y)
  coordinates as labels to all the points in the graph? Best case if I can
  enforce some conditions saying if (y > 10,000) label, else no label. Any
  advice is appreciated.

Sure, write a panel function. See the examples in ?xyplot.
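
A minimal sketch of such a panel function, using xyplot and made-up data (the data frame df, the 10,000 cut-off taken from the question, and the label format are illustrative assumptions; a dotplot would take the same kind of panel argument):

```r
library(lattice)

# Hypothetical data; only points with y > 10000 get a label.
df <- data.frame(x = 1:6,
                 y = c(2000, 15000, 8000, 12000, 500, 30000))

p <- xyplot(y ~ x, data = df,
            panel = function(x, y, ...) {
              panel.xyplot(x, y, ...)        # draw the points as usual
              big <- y > 10000               # which points to annotate
              if (any(big)) {
                panel.text(x[big], y[big],
                           labels = paste0("(", x[big], ", ", y[big], ")"),
                           pos = 3)          # place the label above the point
              }
            })
print(p)
```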

-Deepayan


[R] Generate ramified structures

2009-04-26 Thread Talita Perciano

Hello,

I would like to generate ramified structures like plant root systems, river
networks or trees and save the generated structure as an image. Does anyone
know if there is a way to do that with R?

Thank you in advance,

Talita


[R] Rmpi failing to install with all latest MPI packages and config arguments

2009-04-26 Thread Vince Fulco

On FC10 with openmpi and mpich2 installed, the command R CMD INSTALL
Rmpi_0.5-7.tar.gz --configure-args="--with-mpi=/usr/lib64/openmpi" (or
/usr/lib64/mpich2) fails with the error "cannot find mpi.h". Doing a
(s)locate indicates no header file labeled as such. Would appreciate
any trailheads.

TIA,

V.

-- 
Vince Fulco, CFA, CAIA
612.424.5477 (universal)
vful...@gmail.com


[R] RWeka: How to access AttributeEvaluators

2009-04-26 Thread Michael Olschimke

Hi,

I'm trying to use Information Gain for feature selection.

There is an InfoGain implementation in Weka:

*weka.attributeSelection.InfoGainAttributeEval*

Is it possible to use this function with RWeka? If yes how?
list_Weka_interfaces doesn't show it and there is no make function for
AttributeEvaluators.

Is there any other implementation of InformationGain in R?

Thank you

Michael Olschimke
MSIS Graduate Student
Santa Clara University


Re: [R] simulate arima model

2009-04-26 Thread Rolf Turner



On 26/04/2009, at 3:56 PM, Rebecca1117 wrote:



I am new to R.

I can simulate ARMA using arima.sim.

However, I want to simulate an ARIMA model, say (1-B)Zt=5+(1-B)at. I do not
know how to deal with the 5 in this model.

Can anyone help me?
Thank you very much!


If this is a homework problem your instructor needs to learn some
time series!

The specific model that you have stated is ill-defined.  First of all
note that Z_t cannot be stationary in mean, otherwise you'd have
mu - mu = 5, or 0 = 5, which is not true!

If you assume that E(Z_t) = mu_t you get mu_t = 5 + mu_{t-1} so
mu_t = 5t + mu_0.

So you ``could'' (but wait a bit, you can't!) generate say W_t
according to (1-B)W_t = (1-B)a_t and then set Z_t = W_t + 5t + mu_0
(for any mu_0 that you like).

But the problem is that the (1-B) ``cancels'' in the W_t model so the
W_t are not well-defined.  You need to get it clearer what you want to do.

Note that in general having (1-B) terms in the coefficient of a_t is
to be avoided.  This makes the model non-invertible which implies
problems with forecasting.

For a ***stationary*** ARMA model phi(B)Z_t = phi_0 + theta(B)a_t you
could generate W_t according to phi(B)W_t = theta(B)a_t and then set

Z_t = W_t + mu

where mu = phi_0/phi(1).
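
As a rough illustration of that recipe for the stationary case, here is a sketch for an ARMA(1,1). The coefficients 0.5 and 0.3 and the constant 5 are arbitrary choices made for the example, and note that arima.sim writes the MA part as 1 + theta*B:

```r
set.seed(123)

# Hypothetical stationary model: (1 - 0.5 B) Z_t = 5 + (1 + 0.3 B) a_t
phi   <- 0.5
theta <- 0.3
phi0  <- 5

# phi(1) = 1 - phi for an AR(1) polynomial, so mu = phi_0 / phi(1).
mu <- phi0 / (1 - phi)

# Zero-mean series W_t with phi(B) W_t = theta(B) a_t ...
W <- arima.sim(model = list(ar = phi, ma = theta), n = 5000)

# ... shifted to the required mean.
Z <- W + mu
```

With these values mu is 10, and mean(Z) should come out close to that for a series of this length.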

HTH

cheers,

Rolf Turner


[R] lme - nlminb problem, convergence error code = 1

2009-04-26 Thread Katrin Wolf

Hi there,

I have a problem fitting a linear mixed model.
I have a repeated-measurement project for several gases. These were measured at
3 different altitudes (sites) and three different positions within these sites
(plot). I now wanted to find out whether the gas fluxes depend more on the
site variable or the plot variable. For that I created a column named zu that
contains a name for every plot at each site.
I selected month as the random effect and zu as the fixed effect.
By calculating the following model:

flux <- read.table("C:/Für die Arbeit/DatenEcua/CSVDateien/PM.csv", sep=",",
                   header=T)


# Mixed effects models with temporal pseudoreplication #

library(nlme)
library(lattice)

fixed = CO2~zu
random = ~month|nr

results <- groupedData(CO2~month |nr ,outer=~zu, flux)
results


plot(results)
plot(results,outer=T)


model1 <- lme(CO2~zu-1,random=~month|nr, data=results, na.action=na.exclude)
summary(model1)

I get a result.
I am trying to compute the same model with another gas of the same original 
data table, I get the ollowing error:

   model1-lme(N2O~zu-1,random=~month|nr, data=results, na.action=na.exclude)
Error in lme.formula(N2O ~ zu - 1, random = ~month | nr, data = results,  : 
  nlminb problem, convergence error code = 1
  message = iteration limit reached without convergence (9)


Could you help me with that problem? I already tried to transform the data into 
log functions and I tried to calculate different subsets od the data. Sometimes 
I works and sometimes I get the same error reply. 
I would be graet if you could help me solve this problem!
Thank a lot!
Katrin



Here is the data:


site  zu  nr  plot  month  CO2          N2O           CH4           NO
B     BQ  1   Q     5      344602.6924  5.130347965   -59.8957518   1.55065598
B     BQ  2   Q     5      398901.7053  1.101199283   -271.0059474  1.488116402
B     BQ  3   Q     5      454894.8994  8.696055188   -55.47626826  1.205838717
B     BQ  4   Q     5      327387.374   7.096757818   -60.02169134  0.871854232
B     BQ  5   Q     5      384996.0642  1.801588959   -80.04503625  1.935790284
B     BQ  6   Q     5      422636.9806  14.07652557   -82.79209814  2.109318575
B     BL  7   L     5      505346.8111  10.54944828   -32.92247366  2.401900921
B     BL  8   L     5      327256.1889  4.54452141    -108.9732999  NA
B     BL  9   L     5      297821.3081  5.790905912   -69.12999069  0.97480589
B     BL  10  L     5      312800.8307  1.799366694   -37.87140656  1.867861474
B     BL  11  L     5      276755.2239  16.39601357   -116.8598973  3.983788919
B     BL  12  L     5      325992.6072  -1.662834798  -59.07928704  0.754559239
B     BF  13  F     5      407045.5569  5.520809388   -69.10727005  0.217938814
B     BF  14  F     5      252150.4253  0.808068555   -58.6850321   0.121090905
B     BF  15  F     5      313959.6195  3.835605888   -54.55179174  0.037920753
B     BF  16  F     5      434645.7327  20.46019005   -99.85724667  2.434052894
B     BF  17  F     5      416709.795   8.075076552   -131.455198   3.872888994
B     BF  18  F     5      516290.4874  14.85285242   -127.2814077  1.683358728
B     BQ  1   Q     7      298659.4224  3.156305761   -113.8899137  2.26663945
B     BQ  2   Q     7      373406.6274  2.192962923   -112.3176478  1.823378652
B     BQ  3   Q     7      570586.4459  8.363801259   -28.99612608  1.699736881
B     BQ  4   Q     7      282344.4747  4.603852092   -39.11868047  2.105567533
B     BQ  5   Q     7      365304.795   38.2299814    -39.50095483  1.257537129
B     BQ  6   Q     7      374278.1826  -3.108669512  -72.55479777  2.356678228
B     BL  7   L     7      591279.0009  8.331596502   -58.67609711  7.167617014
B     BL  8   L     7      281588.779   2.587431098   -137.6322706  4.522019435
B     BL  9   L     7      308005.3214  2.252557185   -86.73860188  1.686565651
B     BL  10  L     7      160381.235   9.466091319   -41.25831933  3.912095613
B     BL  11  L     7      234260.8773  4.12114588    -120.199607   2.507914542
B     BL  12  L     7      236250.0334  2.089088085   -52.13223363  1.020600881
B     BF  13  F     7      434360.9711  3.847437137   -68.6534629   0.131454075
B     BF  14  F     7      298024.1078  1.664335803   -54.33299935  0.121889317
B     BF  15  F     7      373623.5483

Re: [R] Question of Quantile Regression for Longitudinal Data

2009-04-26 Thread roger koenker

I was trying to resist responding to this question since the original
questioner had already been admonished twice last October about asking
questions on R-help about posted code that was not only not a part of
R-base, but not even a part of an R package. But the quoted comment about
Stata is too enticing a provocation to resist.

First, it should be said that omitting intercepts in any regression
setting should be undertaken at one's peril: it is generally a very
dangerous activity, somewhat akin to fitting interactions without main
effects. But if there is a good rationale for it, it is no different in
principle for median regression than for mean regression. It may well be
that Stata prohibits this sort of thing out of some sort of paternalistic
motive, but in R the usual formula convention y ~ x1 + x2 - 1 suffices.
Of course, in situations in which such a formula is used for several
quantiles it should be understood that forcing each conditional quantile
function through the origin effectively implies that the conditional
distribution degenerates to a point mass at the origin.

Second, I would like to remark that closed-form solutions are in the eye
of the beholder, and many people who can recall the infamous formula:

betahat = (X'X)^{-1} X'y

would be hard pressed to dredge up enough linear algebra to use the
formula for anything more than the bivariate case on the proverbial
desert island without the aid of their trusty laptop Friday.

Finally, cbind(1,x) does introduce an intercept in the code originally
asked about, so if you don't want an intercept don't do that, but be sure
that that is really what you want to do.
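
To make the intercept point concrete, here is a small base-R sketch using ordinary least squares rather than quantile regression (the simulated data and all object names are made up for illustration). It evaluates the closed-form formula with and without the column of ones from cbind(1, x), and checks that this matches the formula interface with and without "- 1".

```r
# Sketch: the column of ones in cbind(1, x) *is* the intercept.
set.seed(1)
n <- 50
x <- rnorm(n)
y <- 1 + 2 * x + rnorm(n)

# Closed form betahat = (X'X)^{-1} X'y, with an intercept column
X_with <- cbind(1, x)
b_with <- solve(crossprod(X_with), crossprod(X_with, y))

# Same formula, intercept column dropped
X_without <- cbind(x)
b_without <- solve(crossprod(X_without), crossprod(X_without, y))

# Formula interface: "- 1" removes the intercept
fit_with    <- lm(y ~ x)
fit_without <- lm(y ~ x - 1)

cbind(closed_form = as.numeric(b_with), lm = coef(fit_with))
c(closed_form = as.numeric(b_without), lm = unname(coef(fit_without)))
```

The same "- 1" convention is what the message above refers to for median regression; only the fitting function differs.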

url:    www.econ.uiuc.edu/~roger        Roger Koenker
email:  rkoen...@uiuc.edu               Department of Economics
vox:    217-333-4558                    University of Illinois
fax:    217-244-6678                    Champaign, IL 61820

On Apr 26, 2009, at 6:35 AM, Tirthankar Chakravarty wrote:

This is a nontrivial problem. This comes up often on the Statalist
(-qreg- is for cross-section quantile regression):

You will probably have to program this by hand. Note also the
degeneracy conditions in Koenker (2005, pg. 36--). I am not sure how
this extends to panel data though.

References:
@book{koenker2005qre,
title={{Quantile Regression; Econometric Society Monographs}},
author={Koenker, R.},
year={2005},
publisher={Cambridge University Press}
}

On Sun, Apr 26, 2009 at 8:24 AM, Helen Chen 96258...@nccu.edu.tw
wrote:

Hi,

I am trying to estimate a quantile regression using panel data. I am
trying to use the model that is described in Dr. Koenker's article, so I
use the code that is posted at the following link:

http://www.econ.uiuc.edu/~roger/research/panel/rq.fit.panel.R

How do I estimate the panel data quantile regression if the regression
contains no constant term? I tried to change the code of rq.fit.panel by
deleting X=cbind(1,x) and would like to know whether that is correct.

--
To every ω-consistent recursive class κ of formulae there correspond
recursive class signs r, such that neither v Gen r nor Neg(v Gen r)
belongs to Flg(κ) (where v is the free variable of r).

[R] R 64-bit for Ubuntu 9.04 64-bit

2009-04-26 Thread Tom La Bone


I just installed Ubuntu 9.04 but there does not seem to be a repository of
binaries for this version. Are such repositories going to be set up in
the near future?

Tom   

[R] how to install R really locally?

2009-04-26 Thread Oliver Kullmann

Hello,

my first attempt at installing version 2.9.0 failed
because I got the error
Error in library(pspline) : there is no package called 'pspline'

Later I realised that this comes from $HOME/.Rprofile, and removing
that file solves that problem.

However, I'm actually glad that this error happened, since it shows
a deeper problem (which is actually not solved yet):
My context is that I re-distribute R as a part of an open-source
library I develop, and this library (actually I call it a research
environment) installs many things (like R and Maxima, gcc, git, ...),
and this all purely locally --- it shouldn't interfere with anything
the user has installed.

So my question is: how can I tell R at installation time that it should
not look at any configuration files or other files whatsoever (so it should
for example ignore $HOME/.Rprofile)?

The installation instructions mention the variable rhome, but I don't
understand what type of home is meant here. What I would need here
is a redefinition of the user home directory (to a local directory
in my installation), but I guess that is not what rhome means.

Hope somebody can help here.

Oliver


Re: [R] How to get rid of loop?

2009-04-26 Thread Peter Alspach

Tena koe Ken

Would something along the following lines do what you require:

set.seed(1)
x <- runif(100)
y <- rep(NA, length(x))
y[x < 0.25] <- -1
y[x > 0.75] <- 1
y[-1][y[-length(y)] %in% 1 & (x[-1] >= 0.25 & x[-1] < 0.5)] <- 0
# etc

HTH 

Peter Alspach

 -Original Message-
 From: r-help-boun...@r-project.org 
 [mailto:r-help-boun...@r-project.org] On Behalf Of Ken-JP
 Sent: Saturday, 25 April 2009 12:16 p.m.
 To: r-help@r-project.org
 Subject: [R] How to get rid of loop?
 
 
 set.seed(1)
 x <- runif(100)
 
 # I want to calculate y such that:
 #
 # 1. if x>0.75, y <- 1
 # 2. else if x<0.25, y <- -1
 # 3. else if y_prev==1 & x<0.5, y <- 0
 # 4. else if y_prev==-1 & x>0.5, y <- 0
 # 5. else y <- y_prev
 #
 # 1. and 2. are directly doable without looping.
 #
 # How do I do 3.-5. without looping?  The problem is, I need
 # to run this algorithm over gigs of data, so I need to avoid
 # looping, if at all possible...
 #
 # - Ken
 
 
 

Re: [R] R 64-bit for Ubuntu 9.04 64-bit

2009-04-26 Thread Dirk Eddelbuettel


On 26 April 2009 at 13:25, Tom La Bone wrote:
| I just installed Ubuntu 9.04 but there does not seem to be repository for
| binaries for this version. Are there going to be such repositories set up in
| the near future?

As I understand it Michael and Vincent are working on it right now.

This would have been a good question for the r-sig-debian list for Debian / Ubuntu.

Dirk

-- 
Three out of two people have difficulties with fractions.


Re: [R] Rmpi failing to install with all latest MPI packages and config arguments

2009-04-26 Thread Dirk Eddelbuettel


On 26 April 2009 at 14:50, Vince Fulco wrote:
| On FC10 with openmpi and mpich2 installed, the command R CMD INSTALL
| Rmpi_0.5-7.tar.gz --configure-args=--with-mpi=/usr/lib64/openmpi (or
| /usr/lib64/mpich2) fails with the error "cannot find mpi.h".  Doing a
| (s)locate indicates no header file labeled as such. Would appreciate
| any trailheads.

Do you actually have the relevant -dev packages installed?

This could have been a good question for r-sig-hpc or the fc/rh sig list.

Dirk

-- 
Three out of two people have difficulties with fractions.


Re: [R] Rmpi failing to install with all latest MPI packages and config arguments

2009-04-26 Thread Vince Fulco

Agreed on the redirect to SIG.  mpich2-devel and -libs are installed.

V.

On Sun, Apr 26, 2009 at 4:51 PM, Dirk Eddelbuettel e...@debian.org wrote:

 On 26 April 2009 at 14:50, Vince Fulco wrote:
 | On FC10 with openmpi and mpich2 installed, the command R CMD INSTALL
 | Rmpi_0.5-7.tar.gz --configure-args=--with-mpi=/usr/lib64/openmpi (or
 | /usr/lib64/mpich2) fails with the error "cannot find mpi.h".  Doing a
 | (s)locate indicates no header file labeled as such. Would appreciate
 | any trailheads.

 Do you actually have the relevant -dev packages installed?

 This could have been a good question for r-sig-hpc or the fc/rh sig list.

 Dirk

 --
 Three out of two people have difficulties with fractions.




-- 
Vince Fulco, CFA, CAIA
612.424.5477 (universal)
vful...@gmail.com


Re: [R] function returns R object with name based on input

2009-04-26 Thread Jennifer Brea


Thanks for your replies.  I ended up using the following:



df = data.frame(year=c(1991,1991,1992,1992,1993,1993,1992,1991),x=rnorm(8),y=rnorm(8))

df

  year          x           y
1 1991  0.5565083 -1.31364232
2 1991  0.1686598 -0.20344656
3 1992 -0.1010090 -0.65681852
4 1992  0.6130324 -0.10788605
5 1993 -0.9061458 -0.64872139
6 1993 -0.4460332  0.07253762
7 1992 -0.3865464 -1.87445996
8 1991  0.9252679  0.14891506

dfs = split(df,df$year)
dfs[['1991']]

  year         x          y
1 1991 0.5565083 -1.3136423
2 1991 0.1686598 -0.2034466
8 1991 0.9252679  0.1489151

dfs[['1992']]

  year          x          y
3 1992 -0.1010090 -0.6568185
4 1992  0.6130324 -0.1078861
7 1992 -0.3865464 -1.8744600

Notice that split automatically uses a character version of the values
of the split variable to name its output.

Once you've created the list, you can use sapply or lapply
to process each piece.  Let's say we wanted the regression coefficients
for the regression of y on x for each year:

regs = sapply(dfs,function(d)coef(lm(y~x,data=d)))
regs

                  1991       1992      1993
(Intercept) -0.6964841 -0.9456066 0.7717261
x            0.4370229  1.5752294 1.5675705

David Winsemius wrote:


On Apr 24, 2009, at 11:56 AM, Jennifer Brea wrote:

I wanted to ask how I can make a for loop or a function return an R
object with a unique name based on either some XX of the for loop or
some input for the function.


For example

if I have a function:

fn <- function(data, year) {
  # does some stuff
}

How do I return an object from the function called X.year, such that
if I run fn(data,1989), the output is an object called X.1989?


Read:
?assign
?paste
#and FAQ 7.21
http://cran.r-project.org/doc/FAQ/R-FAQ.html#How-can-I-turn-a-string-into-a-variable_003f



In a separate but related process, I'm also trying to subset data by
year, where there are multiple observations by years, using the
subset() function.  For example:


data.1946 <- subset(data, year==1946)
data.1947 <- subset(data, year==1947)
data.1948 <- subset(data, year==1948)
data.1949 <- subset(data, year==1949)
...


list.of.subsets <- lapply(1946:2000, function(x) subset(data, year==x))
# with no example ... untested


Using data as a data frame name is poor R programming practice, since
many functions use data as a parameter name and it is also a function
name.
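
As a rough illustration of the advice above, the sketch below builds a named list of per-year subsets and then, only if separate objects are really wanted, creates variables such as X.1946 with assign() and paste() as described in FAQ 7.21. The data frame dat and its columns are invented for the example.

```r
# Hypothetical example data; "dat" avoids the name "data", as advised above.
dat <- data.frame(year  = c(1946, 1946, 1947, 1948, 1948),
                  value = c(1.2, 3.4, 5.6, 7.8, 9.0))

years <- sort(unique(dat$year))

# One list element per year, named X.<year>
subsets <- lapply(years, function(yr) subset(dat, year == yr))
names(subsets) <- paste("X", years, sep = ".")

subsets[["X.1946"]]

# If stand-alone objects are required, assign() creates them by name
for (nm in names(subsets)) assign(nm, subsets[[nm]])
```

Keeping the pieces in one list is usually easier to work with afterwards, since lapply and sapply can then process every year in a single call.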





How should I set this up?  I was thinking of writing a for loop, but 
I have never written a for loop that creates objects based on the 
loop's index, for example a loop for(i in 1946:2000) that returns 55 
objects with the object names based on the index.




David Winsemius, MD
Heritage Laboratories
West Hartford, CT



[R] Bumps chart in R

2009-04-26 Thread Andreas Christoffersen

Hi there,

I would like to make a 'bumps chart' like the ones described e.g.
here: http://junkcharts.typepad.com/junk_charts/bumps_chart/

Purpose: I'd like to plot the proportion of people in select countries
living on less than one USD per day in 1994 and 2004 respectively. I
have already constructed a barplot, but I think a bumps chart would
be better.

# The barplot and data
countries <- c("U-lande", "Afrika syd for sahara", "Europa og Centralasien",
    "Lantinamerika og Caribien", "Mellemøstenog Nordafrika", "Sydasien",
    "ØStasien og stillehaveet", "Kina", "Brasilien")
poor_1990 <- c(28.7,46.7,0.5,10.2,2.3,43,29.8,33,14)
poor_2004 <- c(18.1,41.1,0.9,8.6,1.5,30.8,9.1,9.9,7.5)
poor <- cbind(poor_1990,poor_2004)
rownames(poor) <- countries
png("poor.png")
# set the margins after opening the png device so they apply to it
oldpar <- par(mar=c(15,5,5,1))
barplot(t(poor[order(poor[,2]),]),beside=TRUE,col=c(1,2),las=3,ylab="% poor",
    main="Percent living for < 1 USD per day (1993 prices)",ylim=c(0,50))
legend("topleft",c("1990","2004"),fill=c(1,2),bty="n")
par(oldpar)
dev.off()

I guess I need to start with a normal plot? Something like the below,
but there is a long way to go...

# A meager start - how to finish my bumps chart
plot(c(rep(1,9),rep(2,9)),c(poor_1990,poor_2004),type="b",ann=FALSE)

Thankful for any help.

Cheers.

Andreas


Re: [R] eager to learn how to use sapply, lapply, ...

Have a look at the plyr package and associated documentation -
http://had.co.nz/plyr

Hadley

On Sun, Apr 26, 2009 at 12:42 PM,  mau...@alice.it wrote:
 After a year my R programming style is still very C like.
 I am still writing a lot of for loops and finding it difficult to recognize 
 where, in place of loops, I could just do the
 same with one line of code, using sapply, lapply, or the like.
 On-line examples for such high level function do not help me.
 Even if, sooner or later, I am getting my R scripts to do what I expect, I 
 would really like to shake my C programming style off.
 I am staring at my R script and thinking how can I improve it ?
 For instance, I have a lot of loops similar to the following one and wonder
 whether I can replace them with a proper call to a high level R function that
 does the same:

     Nstart <- Nfour/(2^Lev) + 1
     Nfinish <- Nstart - 1 + Nfour/(2^Lev)
     LengLev <- Nfinish - Nstart + 1
     NW <- floor(LengLev*N/Nfour)
     if (NW > 0) {
       for (j in Nstart:(Nstart + NW - 1)) {
          Dw <- abs(Y[j])
          Rnorm <- Rnorm + Dw^2
       }
     }


 Thank you very much for helping me get better.
 Maura
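
The loop in the quoted message accumulates squared absolute values over a contiguous index range, so one possible loop-free form is a single sum() over that range. The sketch below is only an illustration: the values of Nfour, Lev, N and the vector Y are invented, and plain vector indexing is used rather than sapply or plyr.

```r
# Hypothetical inputs, chosen only so the example runs
set.seed(1)
Nfour <- 64
Lev   <- 2
N     <- 48
Y     <- complex(real = rnorm(Nfour), imaginary = rnorm(Nfour))

Nstart  <- Nfour / (2^Lev) + 1
Nfinish <- Nstart - 1 + Nfour / (2^Lev)
LengLev <- Nfinish - Nstart + 1
NW      <- floor(LengLev * N / Nfour)

# Original loop form
Rnorm_loop <- 0
if (NW > 0) {
  for (j in Nstart:(Nstart + NW - 1)) {
    Rnorm_loop <- Rnorm_loop + abs(Y[j])^2
  }
}

# Vectorised form: seq_len(NW) is empty when NW is 0, so no if() is needed
idx       <- Nstart - 1 + seq_len(NW)
Rnorm_vec <- sum(abs(Y[idx])^2)

c(loop = Rnorm_loop, vectorised = Rnorm_vec)
```

Because abs() and ^ already work element-wise, no apply-family function is needed for this particular pattern.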









-- 
http://had.co.nz/


Re: [R] Bumps chart in R

In statistics, a bumps chart is more commonly called a parallel
coordinates plot.
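
For two time points, a plot of this kind can be roughed out in base graphics with matplot(), one line per region. The sketch below reuses the figures posted in this thread; the margins, label sizes and other plotting choices are assumptions made for illustration, not a finished design.

```r
# Data as posted in the thread (labels kept verbatim)
countries <- c("U-lande", "Afrika syd for sahara", "Europa og Centralasien",
               "Lantinamerika og Caribien", "Mellemøstenog Nordafrika",
               "Sydasien", "ØStasien og stillehaveet", "Kina", "Brasilien")
poor_1990 <- c(28.7, 46.7, 0.5, 10.2, 2.3, 43, 29.8, 33, 14)
poor_2004 <- c(18.1, 41.1, 0.9, 8.6, 1.5, 30.8, 9.1, 9.9, 7.5)
poor <- cbind("1990" = poor_1990, "2004" = poor_2004)
rownames(poor) <- countries

# Wide side margins leave room for the labels
op <- par(mar = c(4, 12, 2, 12))

# t(poor) has one row per year and one column per region,
# so matplot draws one line per region across the two years
matplot(x = 1:2, y = t(poor), type = "b", lty = 1, pch = 16, col = 1,
        xaxt = "n", yaxt = "n", xlab = "Year", ylab = "", bty = "n")
axis(1, at = 1:2, labels = colnames(poor))

# Region labels next to each end point; xpd = NA lets text enter the margins
text(1, poor[, 1], countries, pos = 2, xpd = NA, cex = 0.7)
text(2, poor[, 2], countries, pos = 4, xpd = NA, cex = 0.7)

par(op)
```

Labels for regions with similar values will overlap in this simple version.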

Hadley





-- 
http://had.co.nz/


Re: [R] Bumps chart in R

Have a look at plotweb in the bipartite package.


[R] RODBC - XLSX files - dropping/clearing sheets

2009-04-26 Thread Daniel Bradley

Hi!

I'm manipulating XLSX data using RODBC; however, a limitation which appears
to be driver-based is that you can't clear or drop sheets from the XLSX
files, as per the following example:

> library(RODBC)
> xlsx <- odbcDriverConnect("DRIVER=Microsoft Excel Driver (*.xls, *.xlsx, *.xlsm, *.xlsb);DBQ=c:\\documents and settings\\desktop\\testxlsx.xlsx; ReadOnly=False")
> sqlClear(xlsx, "newsheet2", errors=TRUE)
[1] "[RODBC] ERROR: Could not SQLExecDirect"
[2] "HY000 -5410 [Microsoft][ODBC Excel Driver] Deleting data in a linked table is not supported by this ISAM."
> sqlClear(xlsx, "newsheet2", errors=TRUE)
[1] "[RODBC] ERROR: Could not SQLExecDirect"
[2] "HY000 -5410 [Microsoft][ODBC Excel Driver] Deleting data in a linked table is not supported by this ISAM."

I'm wondering if anyone has or knows of a workaround for this beyond
converting the sheets to CSV files.  For context, I'm trying to update data
on about 20 spreadsheets as a daily event, pulling data from MySQL,
formatting it, then overwriting the existing data on the spreadsheets.  This
is the last piece of the puzzle.  Until the next puzzle.


Thanks!
Dan


Re: [R] How to get rid of loop?

2009-04-26 Thread Ken-JP



The code below shows what I'm trying to get rid of.

If there is no way to get rid of the loop, I will try to use package( inline ).
I'm just curious as to whether there is a vector way of doing this
algorithm.

# -----------------------------------------------------------------

set.seed(1)
x <- runif(100)
n <- length( x )
y <- rep(NA, n)
yprev <- 0;
for ( i in (1:n)) {
    if ( x[i]>0.75 ) {
        y[i] <- 1;
    } else if ( x[i]<0.25 ) {
        y[i] <- -1;
    } else if ( yprev==1 & x[i]<0.5) {
        y[i] <- 0;
    } else if ( yprev==-1 & x[i]>0.5) {
        y[i] <- 0;
    } else {
        y[i] <- yprev
    }
    yprev <- y[i];
}

> y
  [1]  0  0  0  1 -1  1  1  1  1 -1 -1 -1  0  0  1  0  0  1  0  1  1 -1  0 -1 -1
 [26] -1 -1 -1  1  0  0  0  0 -1  1  1  1 -1  0  0  1  1  1  1  1  1 -1 -1  0  0
 [51]  0  1  0 -1 -1 -1 -1  0  0  0  1  0  0  0  0  0  0  1 -1  1  0  1  0  0  0
 [76]  1  1  0  1  1  0  0  0  0  1 -1  0 -1 -1 -1 -1 -1  0  1  1  1  0  0  1  1


Re: [R] Bumps chart in R

2009-04-26 Thread Mike Lawrence

Here's a ggplot2 based solution:

#load the ggplot2 library
library(ggplot2)
#melt() with variable_name comes from the reshape package;
#load it explicitly if your ggplot2 does not attach it
library(reshape)

#here's the data provided by Andreas
countries <- c("U-lande", "Afrika syd for sahara", "Europa og Centralasien",
    "Lantinamerika og Caribien", "Mellemøstenog Nordafrika", "Sydasien",
    "ØStasien og stillehaveet", "Kina", "Brasilien")
poor_1990 <- c(28.7,46.7,0.5,10.2,2.3,43,29.8,33,14)
poor_2004 <- c(18.1,41.1,0.9,8.6,1.5,30.8,9.1,9.9,7.5)

#reformat the data
data = data.frame(countries,poor_1990,poor_2004)
data = melt(data,id=c('countries'),variable_name='year')
levels(data$year) = c('1990','2004')

#make a new column to make the text justification easier
data$hjust = 1-(as.numeric(data$year)-1)

#start the percentage plot
p = ggplot(
    data
    ,aes(
        x=year
        ,y=value
        ,group=countries
    )
)

#add the axis labels
p = p + labs(
    x = '\nYear'
    , y = '%\n'
)

#add lines
p = p + geom_line()

#add the text
p = p + geom_text(
    aes(
        label=countries
        , hjust = hjust
    )
)

#expand the axis to fit the text
p = p + scale_x_discrete(
    expand=c(2,2)
)

#show the plot
print(p)


#rank the countries
data$rank = NA
data$rank[data$year=='1990'] = rank(data$value[data$year=='1990'])
data$rank[data$year=='2004'] = rank(data$value[data$year=='2004'])

#start the rank plot
r = ggplot(
    data
    ,aes(
        x=year
        ,y=rank
        ,group=countries
    )
)

#add the axis labels
r = r + labs(
    x = '\nYear'
    , y = 'Rank\n'
)

#add the lines
r = r + geom_line()

#add the text
r = r + geom_text(
    aes(
        label=countries
        , hjust = hjust
    )
)

#expand the axis to fit the text
r = r + scale_x_discrete(
    expand=c(2,2)
)

#show the plot
print(r)


-- 
Mike Lawrence
Graduate Student
Department of Psychology
Dalhousie University

Looking to arrange a meeting? Check my public calendar:
http://tr.im/mikes_public_calendar

~ Certainty is folly... I think. ~


[R] How to create a graph layout?

2009-04-26 Thread Christian Bustamante

Hi all,
I want to create a graph layout in a 3x3 matrix like this:

ylab  |__|  |__|   |__|
   ___  ___   ___
ylab  |__|  |__|   |__|
   ___  ___   ___
ylab  |__|  |__|   |__|
xl xl xl

With this layout, I'll then insert the 9 plots. How can I create it?


-- 
CdeB


[R] Plotting polynomial fit

2009-04-26 Thread Ronnen Levinson

Hi.

Is there an analog to abline() that can be used to plot a polynomial fit?

For example, I can draw the straight-line fit

fit - lm(y ~ x)

via

abline(coef=fit$coef)

but I'm not sure how to draw the polynomial fit

fit - lm(y ~ poly(x,2))

I do see the function curve(), but not how to prepare an expr for 
curve() based on the coefficients returned by the polynomial fit.

Thanks for your help,

Ronnen.

P.S. E-mailed CCs of posted replies appreciated.
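
One common way to draw such a fit, offered here only as a sketch with simulated data, is to let predict() evaluate the fitted model on a grid rather than building an expression from the coefficients. This matters with poly(x, 2), whose coefficients refer to orthogonal polynomials and not to 1, x and x^2 directly. The object names xs and yhat are made up for the example.

```r
# Simulated data for illustration only
set.seed(1)
x <- seq(0, 10, length.out = 40)
y <- 1 + 0.5 * x - 0.2 * x^2 + rnorm(length(x))

fit <- lm(y ~ poly(x, 2))

plot(x, y)

# Option 1: predict on a fine grid and join the points
xs   <- seq(min(x), max(x), length.out = 200)
yhat <- predict(fit, newdata = data.frame(x = xs))
lines(xs, yhat)

# Option 2: curve() supplies its own vector called x to the expression
curve(predict(fit, newdata = data.frame(x = x)), add = TRUE, lty = 2)
```

Both options draw the same curve; the second is the closer analogue of abline() for a fitted polynomial.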


Re: [R] RODBC - XLSX files - dropping/clearing sheets