Re: [R] mlogit and model-based recursive partitioning

2012-10-02 Thread Achim Zeileis

Tudor:

Has anyone tried to model-based recursive partition (using mob from 
package party; thanks Achim and colleagues) a data set based on a 
multinomial logit model (using mlogit from package mlogit; thanks Yves)?


Interesting question: in principle, this is possible but I wouldn't know 
of anyone who has tried this.


I attempted to do so, but there are at least two reasons why I could 
not. First, I am not quite sure that a model of class StatModel 
exists for mlogit models.  Second, as mlogit uses the pipe character | 
to specify the model, I wonder how this would interact with mob, which 
uses the pipe to differentiate between explanatory and segmentation 
variables.


This is one but not the only complication when trying to actually combine 
mlogit and mob. I think the building blocks would have to be:


- Set up the data plus formula handling. As you point out, that would need 
a three-part formula separating alternative-specific regressors, 
subject-specific regressors, and partitioning variables. Furthermore you 
would probably need to translate from the long format used by mlogit 
(subjects x alternatives) to the wide format, because mob would want to 
partition the subjects.


- A StatModel object would be required. Personally, if I wanted to do it, I 
would try to set up the StatModel object on the fly (rather than predefine 
it in a package) so that the StatModel creator can depend on the 
formula/data. The formula/data processing described above can be done 
inside the StatModel object.


- Finally, the required methods for the fitted model object would have to 
be defined. In particular, the subject-specific gradients would be 
required. I think currently, mlogit just provides the overall gradient.


So, in summary: It can be done but it would likely need more than just an 
hour of coding...
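For reference, this is what the mob() call looks like for a model class that party already supports; a linear-model sketch (data and variable names are made up), just to show where an mlogit StatModel would have to plug in:

```r
library(party)  # provides mob() and the linearModel StatModel

# hypothetical data: response y, regressor x, partitioning variables z1, z2
set.seed(1)
d <- data.frame(x = rnorm(200), z1 = gl(2, 100), z2 = runif(200))
d$y <- ifelse(d$z1 == "1", 1 + 2 * d$x, 3 - d$x) + rnorm(200, sd = 0.3)

# regressors go before the |, partitioning variables after it;
# an mlogit StatModel would replace linearModel here
fm <- mob(y ~ x | z1 + z2, data = d, model = linearModel)
fm
```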


hth,
Z


An example (not working) of what I would like to accomplish follows below.

Thanks a lot.

Tudor

library(party)
library(mlogit)
data("Fishing", package = "mlogit")
Fish <- mlogit.data(Fishing, varying = c(2:9), shape = "wide", choice = "mode")
# FIT AN mlogit MODEL
m1 <- mlogit(mode ~ price + catch, data = Fish)
# THE DESIRED END RESULT: SEGMENT m1 BASED ON INCOME AND/OR OTHER POSSIBLE COVARIATES





--
View this message in context: 
http://r.789695.n4.nabble.com/mlogit-and-model-based-recursive-partitioning-tp4644743.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.





Re: [R] Basic question about: <- and method start with dot.

2012-10-02 Thread Rolf Turner


On 02/10/12 17:00, Fabrice Tourre wrote:

Dear list,

When I read some source code, I find a lot of places using the symbol <- , e.g.

lastTime <- newTime;

What is the meaning here?

?"<-"

See also:

require(fortunes)
fortune("<-")


Also, I find some methods whose names start with a dot, e.g.

.RowStandardizeCentered = function(x) {
div = sqrt( rowSums(x^2) );
div[ div == 0 ] = 1;
return( x/div );
}

What is the special meaning of a method name starting with a dot?

?ls
Note the argument all.names.
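A minimal illustration of both points, in a fresh session (object names are arbitrary):

```r
# `<-` is R's assignment operator; see ?"<-"
visible <- 1
.hidden <- 42   # a leading dot is legal but hides the object from ls()

ls()                  # only "visible"
ls(all.names = TRUE)  # ".hidden" appears as well
```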

cheers,

Rolf Turner



Re: [R] Transform pairwise observations into a table

2012-10-02 Thread arun
HI David,

Thanks for your input.
The third and fourth solutions look very interesting.

with(dat1,tapply(coef,list(ind1,ind2),function(x) x))
# also gave the same result.
# I tried with aggregate() and ddply(), but maybe not as elegant a solution as 
yours.
 
matrix(with(dat1,aggregate(coef,list(ind1,ind2),FUN=function(x) x)[,3]),nc=4)
library(plyr)
matrix(ddply(dat1,.(ind1,ind2),FUN=function(x) x$coef)[,3],nc=4)
#  [,1]  [,2]  [,3] [,4]
#[1,] 1.000 0.250 0.125  0.5
#[2,] 0.250 1.000 0.125  0.5
#[3,] 0.125 0.125 1.000  0.5
#[4,] 0.500 0.500 0.500  1.0
A.K.





- Original Message -
From: David Winsemius dwinsem...@comcast.net
To: arun smartpink...@yahoo.com
Cc: Hadjixenofontos, Athena ahadjixenofon...@med.miami.edu; R help 
r-help@r-project.org; Bert Gunter gunter.ber...@gene.com
Sent: Monday, October 1, 2012 10:33 PM
Subject: Re: [R] Transform pairwise observations into a table


On Oct 1, 2012, at 2:30 PM, arun wrote:

 HI AHJ,
 No problem.
 
 One more way in addition to reshape() (Rui's suggestion) to get the same 
 result.
 library(reshape)
 
 as.matrix(cast(melt(dat1, id = c("ind1", "ind2")), ind1 ~ ind2, value = "value"))
 #      1     2     3   4
 #1 1.000 0.250 0.125 0.5
 #2 0.250 1.000 0.125 0.5
 #3 0.125 0.125 1.000 0.5
 #4 0.500 0.500 0.500 1.0
 A.K.
 
That looks a tad ... well, ... complicated. So perhaps these base-only 
solutions with tapply might be more accessible: Some of them do border on the 
whimsical, I will admit:

with(dat1, tapply(coef, list(ind1, ind2), I))

with(dat1, tapply(coef, list(ind1, ind2), c))

with(dat1, tapply(coef, list(ind1, ind2), "^", 1))

with(dat1, tapply(coef, list(ind1, ind2), "+", 0))

It is a specific response to the request for a `table`-like function that 
would allow the application of other functions. Change the `1` to `2` in the 
third instance and you get the tabulated squares. And do not forget the 
availability of `ftable` to flatten the values returned by `tapply`.
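A toy illustration of the "^" variant and ftable flattening (made-up 2x2 data):

```r
dat1 <- data.frame(ind1 = rep(1:2, each = 2),
                   ind2 = rep(1:2, 2),
                   coef = c(1, 0.25, 0.25, 1))

# "^" with the extra argument 2 tabulates the squares of each coefficient
tab <- with(dat1, tapply(coef, list(ind1, ind2), "^", 2))
tab

# flatten the tapply result; as.table() converts the array first
ftable(as.table(tab))
```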


-- 
David.
 
 
 - Original Message -
 From: Hadjixenofontos, Athena ahadjixenofon...@med.miami.edu
 To: arun smartpink...@yahoo.com
 Cc: R help r-help@r-project.org
 Sent: Monday, October 1, 2012 12:59 PM
 Subject: Re: [R] Transform pairwise observations into a table
 
 Thank you. I had looked at xtabs but misunderstood the syntax. This is great. 
 :)
 
 AHJ
 
 
 
 On Oct 1, 2012, at 12:53 PM, arun smartpink...@yahoo.com wrote:
 
 Hi,
 Try this:
 
 dat1 <- read.table(text = "
 ind1 ind2 coef 
 1 1 1 
 1 2 0.25 
 1 3 0.125 
 1 4 0.5 
 2 2 1 
 2 1 0.25 
 2 3 0.125 
 2 4 0.5 
 3 3 1 
 3 1 0.125 
 3 2 0.125 
 3 4 0.5 
 4 4 1 
 4 1 0.5 
 4 2 0.5 
 4 3 0.5 
 ", sep = "", header = TRUE)
 mat1 <- as.matrix(xtabs(coef ~ ind1 + ind2, data = dat1))
 
 #     ind2 
 # ind1     1     2     3     4 
 #    1 1.000 0.250 0.125 0.500 
 #    2 0.250 1.000 0.125 0.500 
 #    3 0.125 0.125 1.000 0.500 
 #    4 0.500 0.500 0.500 1.000 
 
 A.K.
 
 
 
 - Original Message -
 From: AHJ ahadjixenofon...@med.miami.edu
 To: r-help@r-project.org
 Cc: 
 Sent: Monday, October 1, 2012 12:17 PM
 Subject: [R] Transform pairwise observations into a table
 
 Hi,
 
 I have a table of pairs of individuals and a coefficient that belongs to the
 pair:
 
 ind1    ind2    coef
 1    1    1
 1    2    0.25
 1    3    0.125
 1    4    0.5
 2    2    1
 2    1    0.25
 2    3    0.125
 2    4    0.5
 3    3    1
 3    1    0.125
 3    2    0.125
 3    4    0.5
 4    4    1
 4    1    0.5
 4    2    0.5
 4    3    0.5
 
 And I want to convert it to a matrix where each individual is both a row and
 a column and at the intersection of each pair is the coefficient that
 belongs to that pair:
 
       1      2      3      4
  1    1      0.25   0.125  0.5
  2    0.25   1      0.125  0.5
  3    0.125  0.125  1      0.5
  4    0.5    0.5    0.5    1
 
 If table() would allow me to specify something other than frequencies to
 fill the table with, it would be what I need. I tried a few different
 combinations of t() and unique() but none of it made enough sense to post as
 my starting code... I am just lost. Any help would be greatly appreciated.
 
 Thank you,
 AHJ
 
 
 
 --
 View this message in context: 
 http://r.789695.n4.nabble.com/Transform-pairwise-observations-into-a-table-tp4644706.html
 Sent from the R help mailing list archive at Nabble.com.
 

David Winsemius, MD
Alameda, CA, USA

__

Re: [R] Basic question about: <- and method start with dot.

2012-10-02 Thread peter dalgaard

On Oct 2, 2012, at 06:00 , Fabrice Tourre wrote:

 Dear list,
 
 When I read some source code, I find a lot of places using the symbol <- , e.g.
 
 lastTime <- newTime;
 
 What is the meaning here?

Did you check help("<-")? The explanation there seems at least as clear as 
anything I could cook up in a quick mail...

 
 Also, I find some methods whose names start with a dot, e.g.
 
 .RowStandardizeCentered = function(x) {
   div = sqrt( rowSums(x^2) );
   div[ div == 0 ] = 1;
   return( x/div );
 }
 
 What is the special meaning of a method name starting with a dot?

It means nothing in particular, except that such objects don't show up in ls() 
by default. The _intention_ is usually that the function is only to be used 
internally and not for end-user use.


-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd@cbs.dk  Priv: pda...@gmail.com



[R] smoothScatter plot

2012-10-02 Thread JiangZhengyu




Hi, I want to make a plot similar to sm1 (attached). The code I tried is:

dcols <- densCols(x, y)
smoothScatter(x, y, col = dcols, pch = 20, xlab = "A", ylab = "B")
abline(h = 0, col = "red")

But it turned out to be s1 (attached) with big dots. I was wondering if 
anything is wrong with my code. Thanks,
Zhengyu

attachment: sm1.png  attachment: s1.png


[R] problem about R

2012-10-02 Thread Leung Chen
How to run pie chart for each categorical variable in a dataset?



[R] Is there any R function for data normalization?

2012-10-02 Thread Rui Esteves
Hello,

I have a matrix with values, with columns c1..cn.
I need the values to be normalized between 0 and 1 by column.
Therefore, 0 should correspond to the minimum value in column c1 and
1 should correspond to the maximum value in column c1.
The remaining columns should be organized in the same way.

Does a function in R exist for this purpose?

Thanks,
Rui



Re: [R] Is there any R function for data normalization?

2012-10-02 Thread Duncan Murdoch

On 12-10-02 5:51 AM, Rui Esteves wrote:

Hello,

I have a matrix with values, with columns c1..cn.
I need the values to be normalized between 0 and 1 by column.
Therefore, 0 should correspond to the minimum value in column c1 and
1 should correspond to the maximum value in column c1.
The remaining columns should be organized in the same way.

Does a function in R exist for this purpose?


No, but it is easy to construct one using sweep.  First subtract column 
mins, then divide by column maxes:


normalize <- function(x) {
  x <- sweep(x, 2, apply(x, 2, min))
  sweep(x, 2, apply(x, 2, max), "/")
}
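A quick sanity check of the sweep-based approach (hypothetical matrix; note that a constant column would give NaN from dividing by a zero range):

```r
normalize <- function(x) {
  x <- sweep(x, 2, apply(x, 2, min))   # subtract column minima
  sweep(x, 2, apply(x, 2, max), "/")   # divide by column ranges
}

m <- matrix(c(1, 3, 5, 10, 20, 30), ncol = 2)
nm <- normalize(m)
apply(nm, 2, range)  # each column now runs from 0 to 1
```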

Duncan Murdoch



Re: [R] Is there any R function for data normalization?

2012-10-02 Thread Joshua Wiley
Hi Rui,

It doesn't really need one...

doit <- function(x) {
  (x - min(x, na.rm = TRUE)) / (max(x, na.rm = TRUE) - min(x, na.rm = TRUE))
}

# use lapply to apply doit() to every column in a data frame
# mtcars is built into R
normed <- as.data.frame(lapply(mtcars, doit))
# verify that the range of all columns is [0, 1]
lapply(normed, range)

Cheers,

Josh

On Tue, Oct 2, 2012 at 2:51 AM, Rui Esteves ruimax...@gmail.com wrote:
 Hello,

 I have a matrix with values, with columns c1..cn.
 I need the values to be normalized between 0 and 1 by column.
 Therefore, 0 should correspond to the minimum value in column c1 and
 1 should correspond to the maximum value in column c1.
 The remaining columns should be organized in the same way.

 Does a function in R exist for this purpose?

 Thanks,
 Rui




-- 
Joshua Wiley
Ph.D. Student, Health Psychology
Programmer Analyst II, Statistical Consulting Group
University of California, Los Angeles
https://joshuawiley.com/



Re: [R] Is there any R function for data normalization?

2012-10-02 Thread Pascal Oettli

Hello,

?scale in base package.

Best Regards,
Pascal


Le 12/10/02 18:51, Rui Esteves a écrit :

Hello,

I have a matrix with values, with columns c1..cn.
I need the values to be normalized between 0 and 1 by column.
Therefore, 0 should correspond to the minimum value in column c1 and
1 should correspond to the maximum value in column c1.
The remaining columns should be organized in the same way.

Does a function in R exist for this purpose?

Thanks,
Rui



[R] Release plans: R-2.15.2 on October 26

2012-10-02 Thread Peter Dalgaard
Just a note that we intend to have a second patch release version on October 
26. The nickname will be Trick or Treat.

Details of current changes can be found at 
http://stat.ethz.ch/R-manual/R-devel/doc/html/NEWS.html 
(scroll past the R-devel stuff and look at "2.15.1 patched".) 

-- 
Peter Dalgaard, Professor
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd@cbs.dk  Priv: pda...@gmail.com

___
r-annou...@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-announce



Re: [R] Is there any R function for data normalization?

2012-10-02 Thread Rui Barradas

Hello,

Try the following.

fun <- function(x){
    a <- min(x)
    b <- max(x)
    (x - a)/(b - a)
}

mat <- matrix(rnorm(12), ncol = 3)
apply(mat, 2, fun)

Hope this helps,

Rui Barradas
Em 02-10-2012 10:51, Rui Esteves escreveu:

Hello,

I have a matrix with values, with columns c1..cn.
I need the values to be normalized between 0 and 1 by column.
Therefore, 0 should correspond to the minimum value in column c1 and
1 should correspond to the maximum value in column c1.
The remaining columns should be organized in the same way.

Does a function in R exist for this purpose?

Thanks,
Rui



Re: [R] problem about R

2012-10-02 Thread Rui Barradas

Hello,

What is a dataset? In R there are lots of types, including 'data.frame' 
and 'list'. And R's type 'factor' corresponds to categorical variables. 
You must be more specific, show us an example dataset using dput().


dput( head(x, 30) )  # paste the output of this in a post


Hope this helps,

Rui Barradas

Em 02-10-2012 05:50, Leung Chen escreveu:

How to run pie chart for each categorical variable in a dataset?



[R] lattice xyplot, get current level

2012-10-02 Thread Christof Kluß
Hi

xyplot(y ~ x | subject) plots a separate graph of y against x for each
level of subject. But I would like to have an own function for each
level. Something like

xyplot(y ~ x | subject,
   panel = function(x,y) {
 panel.xyplot(x,y)

 panel.curve(x,y) {
   # something that dependents on the current subject
   ...
 }
   })


How do I get the current subject? (The current subject is also the title of 
the graph.)

thx
Christof
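One possible approach (a sketch on made-up data, not tested against Christof's setup): lattice exposes the index of the conditioning level inside a panel function via which.packet(), which can be mapped back to the level name:

```r
library(lattice)

# hypothetical data
dat <- data.frame(x = rep(1:10, 3),
                  y = rnorm(30),
                  subject = factor(rep(c("a", "b", "c"), each = 10)))

p <- xyplot(y ~ x | subject, data = dat,
            panel = function(x, y, ...) {
              panel.xyplot(x, y, ...)
              # which.packet() gives the index of the current
              # conditioning level; map it back to the level name
              subj <- levels(dat$subject)[which.packet()]
              # anything subject-specific can be drawn here, e.g.
              panel.text(5, 0, labels = subj)
            })
print(p)
```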



Re: [R] Basic question about: - and method start with dot.

2012-10-02 Thread Hadley Wickham
 What is the special meaning of a method name starting with a dot?

 It means nothing in particular, except that such objects don't show up in 
 ls() by default. The _intention_ is usually that the function is only to be 
 used internally and not for end-user use.

But these days, if you're writing a package, you're better off using namespaces.

Hadley

-- 
RStudio / Rice University
http://had.co.nz/



Re: [R] Reading labels for very large heatmaps

2012-10-02 Thread Jean V Adams
Jacob,

Try increasing the size of the pdf.  For example, I can read all 919 
labels in this plot ...

pdf(width=200, height=200)
plot(1:919, 1:919, axes=FALSE)
axis(1)
axis(2, at=1:919, las=1, cex=0.01)
box()
graphics.off()

Jean



JIMonroe jim...@virginia.edu wrote on 10/01/2012 03:42:24 PM:
 
 Hello,
 I have a large (919 X 919), hierarchically clustered heatmap that I 
would
 like to read the labels off of.  I have tried saving the figure in pdf
 format and enlarging it until I can see the labels, but if I make the 
labels
 small enough to read (so that they don't overlap) using cexRow and 
cexCol,
 they do not appear in the pdf.  The limit seems to be anything below 
cexRow
 or Col = 0.06.  Is there a way to have smaller labels visible in the 
pdf? 
 Is this an issue with resolution?  I could get by with just a quarter of 
the
 heatmap (i.e. half of a row and half of a column) so that the labels 
don't
 have to be so small, but the heatmap must be clustered before this is 
done. 
 I tried to cut the dendrogram at just the halfway point of the heatmap, 
but
 was not successful.  Any suggestions?
 Thanks,
 Jacob



Re: [R] Is there any R function for data normalization?

2012-10-02 Thread Loukia Spineli
In this case you could use the apply function.

Let your k*l matrix be named y. Then, in order to standardize the values
within each column, use the following:

aver <- apply(y, 2, mean) # calculate the mean within each column
std <- apply(y, 2, sd)    # calculate the standard deviation within each column

z <- matrix(0, nrow = k, ncol = l)
for(i in 1:k){
    for(j in 1:l){
        z[i, j] <- (y[i, j] - aver[j]) / std[j]
    }
}
z
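For what it's worth, the double loop above is equivalent to base R's vectorized scale() (a sketch on made-up data). Note this standardizes to mean 0 and sd 1 rather than mapping to [0, 1] as the thread's other replies do:

```r
set.seed(1)
y <- matrix(rnorm(12, mean = 5, sd = 2), nrow = 4)

# (y[i, j] - mean_j) / sd_j for every column, in one call
z <- scale(y, center = TRUE, scale = TRUE)

round(colMeans(z), 10)  # 0 in every column
apply(z, 2, sd)         # 1 in every column
```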

On Tue, Oct 2, 2012 at 12:51 PM, Rui Esteves ruimax...@gmail.com wrote:

 Hello,

 I have a matrix with values, with columns c1..cn.
 I need the values to be normalized between 0 and 1 by column.
 Therefore, 0 should correspond to the minimum value in column c1 and
 1 should correspond to the maximum value in column c1.
 The remaining columns should be organized in the same way.

 Does a function in R exist for this purpose?

 Thanks,
 Rui



Re: [R] mlogit and model-based recursive partitioning

2012-10-02 Thread tudor
Hi Achim:

Excellent points.  Thank you so much for your prompt reply.

Tudor



--
View this message in context: 
http://r.789695.n4.nabble.com/mlogit-and-model-based-recursive-partitioning-tp4644743p4644767.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] problem about R

2012-10-02 Thread Jim Lemon

On 10/02/2012 02:50 PM, Leung Chen wrote:

How to run pie chart for each categorical variable in a dataset?


Hi Leung,
It depends upon your definition of "categorical". If a character variable 
has 100 possible values, you won't get a very informative pie chart out 
of it. Therefore I suggest the following function as a start:


pie_shopping <- function(x, maxval = 5) {
  xname <- deparse(substitute(x))
  for(var in 1:length(x)) {
    if(length(unique(x[, var])) <= maxval) {
      png(paste(xname, var, ".png", sep = ""))
      pie(table(x[, var]))
      dev.off()
    }
  }
}

Jim



Re: [R] Generating an autocorrelated binary variable

2012-10-02 Thread Simon Zehnder
Thank you very much for your answer Rolf. It helped. 

I try to simulate a trade indicator model from market microstructure, where the 
1 or -1 indicate a buyer or seller initiated trade respectively. 
I use a Gaussian copula for simulation, so I can put in some correlation if I 
want to. So I generate my multivariate Gaussian sample and transform it to a 
uniform sample so I can use whatever marginal distribution I want to. 
Then I generate the autocorrelated binary variable, beginning with a transformation of 
the first observation and then comparing the uniform sample to 
0.6. Afterwards I have a binary variable with 0.6 autocorrelation:

sampleCop <- function(n = 1000, rho = 0.6) {

    require(splus2R)
    mvrs <- rmvnorm(n + 1, mean = rep(0, 3), cov = diag(3))
    pmvrs <- pnorm(mvrs, 0, 1)
    var1 <- matrix(0, nrow = n + 1, ncol = 1)
    var1[1] <- qbinom(pmvrs[1, 1], 1, 0.5)
    if(var1[1] == 0) var1[1] <- -1
    for(i in 1:(nrow(pmvrs) - 1)) {
        if(pmvrs[i + 1, 1] <= rho) var1[i + 1] <- var1[i]
        else var1[i + 1] <- var1[i] * (-1)
    }
    sample <- matrix(0, nrow = n, ncol = 4)
    sample[, 2] <- var1[1:nrow(var1) - 1]
    sample[, 3] <- var1[2:nrow(var1)]
    sample[, 4] <- qnorm(pmvrs[1:nrow(var1) - 1, 2], 0, 1, 1, 0) +
        qnorm(pmvrs[1:nrow(var1) - 1, 3], 0, 1, 1, 0) -
        qnorm(pmvrs[2:nrow(var1), 3], 0, 1, 1, 0)
    sample[, 1] <- sample[, 2] * (0.015 + 0.03) - sample[, 3] * (0.015 +
        0.2 * 0.03) + sample[, 4]

    sample

}

It is interesting, though, that if I run my GMM model on it, it needs very high 
numbers of observations to converge to the real parameters. Using the 
gmm function

gmmfun <- function(theta, data) {

    res <- data[, 1] - (theta[1] + theta[2]) * data[, 2] + (theta[1] +
        theta[3] * theta[2]) * data[, 3]

    moments <- matrix(0, nrow = nrow(data), ncol = 3)
    moments[, 1] <- data[, 2] * data[, 3] - data[, 2]^2 * theta[3]
    moments[, 2] <- res * data[, 2]
    moments[, 3] <- res * data[, 3]

    return(moments)
}

And the objective function 

obj_fun <- function(theta, data) {
    moments <- gmmfun(theta, data)
    moments_sum <- colSums(moments)
    val <- moments_sum %*% diag(3) %*% moments_sum * 1.0 / nrow(data)
    val
}

Running then the command 

optim(fn = obj_fun, par = c(0.05, 0.05, .1), method = "BFGS")

gives a result with the first parameter fairly far from its true value 
(true values for the three parameters were (0.015, 0.03, 0.2)). I wonder if 
this is due to the autocorrelation in the binary variable (btw. I also tried 
lm here, as the third parameter can be estimated via the correlation, but it 
remains highly dependent on the number of observations).

Any suggestions here?

Best regards

Simon 

On Sep 28, 2012, at 5:02 AM, Rolf Turner rolf.tur...@xtra.co.nz wrote:

 
 I have no idea what your code is doing, nor why you want correlated binary
 variables.  Correlation makes little or no sense in the context of binary 
 random
 variables --- or more generally in the context of discrete random variables.
 
 Be that as it may, it is an easy calculation to show that if X and Y are 
 binary
 random variables both with success probability of 0.5 then cor(X,Y) = 0.2 if
 and only if Pr(X=1 | Y = 1) = 0.6.  So just generate X and Y using that fact:
 
 set.seed(42)
 X <- numeric(1000)
 Y <- numeric(1000)
 for(i in 1:1000) {
   Y[i] <- rbinom(1, 1, 0.5)
   X[i] <- if(Y[i] == 1) rbinom(1, 1, 0.6) else rbinom(1, 1, 0.4)
 }
 
 # Check:
 cor(X, Y) # Get 0.2012336
 
 Looks about right.  Note that the sample proportions are 0.484 and
 0.485 for X and Y respectively.  These values do not differ significantly
 from 0.5.
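The "easy calculation" mentioned above can be checked directly; with P(X=1) = P(Y=1) = 0.5 and P(X=1 | Y=1) = 0.6:

```r
p_xy   <- 0.5 * 0.6         # P(X=1, Y=1) = P(Y=1) * P(X=1 | Y=1)
cov_xy <- p_xy - 0.5 * 0.5  # E[XY] - E[X]E[Y] = 0.05
var_b  <- 0.5 * (1 - 0.5)   # variance of a Bernoulli(0.5) is 0.25
cov_xy / var_b              # correlation = 0.2
```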
 
cheers,
 
Rolf Turner
 
 On 28/09/12 08:26, Simon Zehnder wrote:
 Hi R-fellows,
 
 I am trying to simulate a multivariate correlated sample via the Gaussian 
 copula method. One variable is a binary variable, that should be 
 autocorrelated. The autocorrelation should be rho = 0.2. Furthermore, the 
 overall probability to get either outcome of the binary variable should be 
 0.5.
 Below you can see the R code (I use for simplicity a diagonal matrix in 
 rmvnorm even if it produces no correlated sample):
 
 sampleCop <- function(n = 1000, rho = 0.2) {
  
  require(splus2R)
  mvrs <- rmvnorm(n + 1, mean = rep(0, 3), cov = diag(3))
  pmvrs <- pnorm(mvrs, 0, 1)
  var1 <- matrix(0, nrow = n + 1, ncol = 1)
  var1[1] <- qbinom(pmvrs[1, 1], 1, 0.5)
  if(var1[1] == 0) var1[nrow(mvrs)] <- -1
  for(i in 1:(nrow(pmvrs) - 1)) {
      if(pmvrs[i + 1, 1] <= rho) var1[i + 1] <- var1[i]
      else var1[i + 1] <- var1[i] * (-1)
  }
  sample <- matrix(0, nrow = n, ncol = 4)
  sample[, 1] <- var1[1:nrow(var1) - 1]
  sample[, 2] <- var1[2:nrow(var1)]
  sample[, 3] <- qnorm(pmvrs[1:nrow(var1) - 1, 2], 0, 1, 1, 0)
  sample[, 4] <- qnorm(pmvrs[1:nrow(var1) - 1, 3], 0, 1, 1, 0)
  
  sample

[R] glpk package missing?

2012-10-02 Thread Steven Backues
I have a piece of code (from Xie et al. 2009 Autophagy 5:217) that runs 
in R and requires the glpk package. A year or so ago, I was able to 
download and install the glpk package directly from inside the R 
program (for Windows), and everything worked fine. Now I have installed 
R for Windows on a new computer, and I cannot find the glpk package in 
the list of available packages on my local CRAN mirror.


I do see two packages labeled Rglpk and glpkAPI, but as far as I can 
understand these just interface with an outside version of glpk, which I 
don't have. I also found an old link to R glpk in a Wikibooks page, but 
it now says Package ‘glpk’ was removed from the CRAN repository.


Is there a reason why the glpk package is no longer available? Can 
anyone suggest the easiest workaround? I tried but failed to install an 
outside version of glpk for windows. I am working on putting together a 
protocol paper that includes the use of this code, so I need some option 
that is simple enough that anyone can do it, without first needing to be 
a computer expert.


Thanks in advance for your help.

Steven Backues



Re: [R] smoothScatter plot

2012-10-02 Thread John Kane
It's hard to know what's wrong with your code since you did not supply it.  

Please supply a small working example and some data.  To supply data use the 
dput() function, see ?dput() for details.   

John Kane
Kingston ON Canada


 -Original Message-
 From: zhyjiang2...@hotmail.com
 Sent: Tue, 2 Oct 2012 11:38:31 +0800
 To: r-help@r-project.org
 Subject: [R] smoothScatter plot
 
 
 
 
 
 Hi, I want to make a plot similar to sm1 (attached). The code I tried is:
 dcols <- densCols(x, y)
 smoothScatter(x, y, col = dcols, pch = 20, xlab = "A", ylab = "B")
 abline(h = 0, col = "red")
  But it turned out to be s1 (attached) with big dots. I was wondering if
 anything is wrong with my code. Thanks, Zhengyu





Re: [R] ffbase, help with %in%

2012-10-02 Thread Jan van der Laan


It doesn't seem possible to index an ff-vector using a logical  
ff-vector. You can use subset (also in ffbase) or first convert 'a' to  
a normal logical vector:



library(ff)
library(ffbase)

data1 <- as.ffdf(data.frame(a = letters[1:10], b = 1:10))
data2 <- as.ffdf(data.frame(a = letters[5:26], b = 5:26))

a <- data1[[1]] %in% data2$a

subset(data1, a)
data1[a[], ]


HTH,
Jan




Lucas Chaparro lpchaparro...@gmail.com schreef:


Hello to everyone.
I'm trying to use the %in% to match to vectors in ff format.
a <- as.ff(data[,1]) %in% fire$fecha


a
ff (open) logical length=3653 (3653)

   [1]    [2]    [3]    [4]    [5]    [6]    [7]    [8]        [3646]
 FALSE  FALSE  FALSE  FALSE  FALSE  FALSE  FALSE  FALSE   :    FALSE
[3647] [3648] [3649] [3650] [3651] [3652] [3653]
 FALSE  FALSE  FALSE  FALSE  FALSE  FALSE  FALSE


Here you see a part of the data:

data[1:20,]  (just a sample, data has 3653 obs)

fecha juliano altura  UTM.E   UTM.N
1  1990-07-01 182 15 248500 6239500
2  1990-07-02 183 15 248500 6239500
3  1990-07-03 184 15 248500 6239500
4  1990-07-04 185 15 248500 6239500
5  1990-07-05 186 15 248500 6239500
6  1990-07-06 187 15 248500 6239500
7  1990-07-07 188 15 248500 6239500
8  1990-07-08 189 15 248500 6239500
9  1990-07-09 190 15 248500 6239500
10 1990-07-10 191 15 248500 6239500
11 1990-07-11 192 15 248500 6239500
12 1990-07-12 193 15 248500 6239500
13 1990-07-13 194 15 248500 6239500
14 1990-07-14 195 15 248500 6239500
15 1990-07-15 196 15 248500 6239500
16 1990-07-16 197 15 248500 6239500
17 1990-07-17 198 15 248500 6239500
18 1990-07-18 199 15 248500 6239500
19 1990-07-19 200 15 248500 6239500
20 1990-07-20 201 15 248500 6239500


fire$fecha[1:20,]
 [1] 1984-11-08 1984-11-08 1984-11-09 1984-11-09 1984-11-09

 [6] 1984-11-10 1984-11-10 1984-11-11 1984-11-11 1984-11-11
[11] 1984-11-11 1984-11-11 1984-11-11 1984-11-12 1984-11-12
[16] 1984-11-13 1984-11-13 1984-11-13 1984-11-14 1984-11-14


to see if a got any match:


table.ff(a)


FALSE  TRUE
 1687  1966
Warning messages:
In if (useNA == "no") c(NA, NaN) :
  the condition has length > 1 and only the first element will be used


in a regular data.frame I use data[a,] to extract the rows that a ==
TRUE, but when i do this in a ffdf i get this error:



data[a,]
Error: vmode(index) == "integer" is not TRUE



I'm just learning how to use the ff package so, obviously I'm  
missing something



If any of you knows how to solve this, please teach me.

Thank you so much.


Lucas.





Re: [R] (no subject)

2012-10-02 Thread John Kane
It's hard to know what's wrong since you did not supply your code.

Please supply a small working example and some data.  To supply data use the 
dput()
function, see ?dput() for details.

Welcome to R.
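In the meantime, a hedged sketch of the merge-with-NA-fill behaviour you describe (the column names `id` and `response` are guesses at your layout):

```r
# Two toy longitudinal tables sharing id and response columns
d1 <- data.frame(id = c(1, 1, 2), response = c("a", "b", "a"), t1 = c(10, 11, 12))
d2 <- data.frame(id = c(1, 2, 2), response = c("a", "a", "b"), t2 = c(20, 21, 22))

# all = TRUE keeps every id/response pair from both tables and fills
# the missing side with NA instead of duplicating rows
m <- merge(d1, d2, by = c("id", "response"), all = TRUE)
m
#   id response t1 t2
# 1  1        a 10 20
# 2  1        b 11 NA
# 3  2        a 12 21
# 4  2        b NA 22
```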


John Kane
Kingston ON Canada


 -Original Message-
 From: mbhpat...@gmail.com
 Sent: Mon, 1 Oct 2012 14:25:20 -0700
 To: r-help@r-project.org
 Subject: [R] (no subject)
 
 Hello,
 I am a new R -user and request your help for the following problem.
 I need to merge two datasets from a longitudinal study which have two columns
 (id and response) in common. When I used the merge option to join the data
 side by side, because of the repeated subject ids I got a larger data
 set, which is not accurate.
 
 I would like to connect the two data sets by id and response in such a
 way that the data are matched on the same id and response type, and if the
 same subject has fewer data points in one data set, then produce NA.
 Sample data sets are attached.
 
 Bibek




Re: [R] lattice xyplot, get current level

2012-10-02 Thread Christof Kluß
subj <- levels(subject)
subj[panel.number()]

seems to be a good solution
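A minimal self-contained sketch of that panel.number() idea (simulated data; the subject-specific overlay is just an illustrative abline):

```r
library(lattice)

set.seed(1)
d <- data.frame(subject = factor(rep(c("s1", "s2"), each = 20)),
                x = rep(1:20, 2), y = rnorm(40))

p <- xyplot(y ~ x | subject, data = d,
            panel = function(x, y, ...) {
              panel.xyplot(x, y, ...)
              # Which level is this panel drawing?
              cur <- levels(d$subject)[panel.number()]
              if (cur == "s1") panel.abline(h = 0, lty = 2)
            })
print(p)  # panels (and panel.number()) are evaluated at print time
```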

is there something like panel.legend (instead of panel.text)?

On 02-10-2012 12:59, Christof Kluß wrote:
 Hi
 
 xyplot(y ~ x | subject) plots a separate graph of y against x for each
 level of subject. But I would like to have an own function for each
 level. Something like
 
 xyplot(y ~ x | subject,
panel = function(x,y) {
  panel.xyplot(x,y)
 
  panel.curve(x,y) {
# something that dependents on the current subject
...
  }
})
 
 
 How do I get the current subject? (The current subject is the title of the
 graph, too.)
 
 thx
 Christof




[R] Problem with multi-dimensional array

2012-10-02 Thread Loukia Spineli
I want to make a multi-dimensional array. To be specific I want to make the
following array

results <- array(0, dim = c(2, 2, 64, 7))

This is the code I have created but it gives no result due to the error
subscript out of bound.

x <- rep(7, 7)  # Missingness in intervention
y <- rep(7, 7)  # Missingness in control

arraynames <- list(Group = c("Success", "Failure"),
                   Outcome = c("Intervention", "Control"),
                   Trial = c(1:7))
mat.stat <- array(c(9, 16, 10, 15, 66, 12, 44, 23, 102, 88, 66, 104, 277, 60,
                    247, 119, 23, 43, 20, 41, 201, 162, 122, 263, 14, 41, 4, 41),
                  dim = c(2, 2, 7), dimnames = arraynames)
mat.stat

Nx <- rep(0, length(x))
Ny <- rep(0, length(y))
n <- (x + 1) * (y + 1)
results <- array(0, dim = c(2, 2, 64, 7))
l <- 1
for (i in 1:length(x)) {
  Nx[i] <- length(1:(x[i] + 1))
  Ny[i] <- length(1:(y[i] + 1))
  for (j in 1:(Nx[i])) {
    for (k in 1:(Ny[i])) {
      results[, , l, i] <- mat.stat[, , i] +
        matrix(c(c(0:x[i])[j], c(0:y[i])[k], -c(0:x[i])[j], -c(0:y[i])[k]),
               nrow = 2, ncol = 2, byrow = TRUE)
      l <- l + 1
    }
  }
}
results

Any suggestion would be really welcome!
Thank you very much!

Loukia




Re: [R] lattice xyplot, get current level

2012-10-02 Thread Bert Gunter
Christof:

You are aware, I assume, that the subject level name can be
incorporated into the strip label via the strip function argument;
e.g.

xyplot(...,
strip = strip.custom(style = 1, strip.levels=c(TRUE,TRUE)),
...)

Cheers,
Bert
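A runnable sketch of that strip.custom() suggestion, on simulated data:

```r
library(lattice)

set.seed(2)
d <- data.frame(subject = factor(rep(c("s1", "s2"), each = 10)),
                x = rep(1:10, 2), y = rnorm(20))

# strip.levels = c(TRUE, TRUE) asks for the level name to be shown in the
# strip for both factor and shingle conditioning variables
p <- xyplot(y ~ x | subject, data = d,
            strip = strip.custom(style = 1, strip.levels = c(TRUE, TRUE)))
p
```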

On Tue, Oct 2, 2012 at 6:45 AM, Christof Kluß ckl...@email.uni-kiel.de wrote:
 subj <- levels(subject)
 subj[panel.number()]

 seems to be a good solution

 is there something like panel.legend (instead of panel.text)?

 On 02-10-2012 12:59, Christof Kluß wrote:
 Hi

 xyplot(y ~ x | subject) plots a separate graph of y against x for each
 level of subject. But I would like to have an own function for each
 level. Something like

 xyplot(y ~ x | subject,
panel = function(x,y) {
  panel.xyplot(x,y)

  panel.curve(x,y) {
# something that dependents on the current subject
...
  }
})


 How do I get the current subject? (The current subject is the title of the
 graph, too.)

 thx
 Christof





-- 

Bert Gunter
Genentech Nonclinical Biostatistics

Internal Contact Info:
Phone: 467-7374
Website:
http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm



Re: [R] glpk package missing?

2012-10-02 Thread Spencer Graves

On 10/2/2012 6:08 AM, Steven Backues wrote:
I have a piece of code (from Xie et al. 2009 Autophagy 5:217) that 
runs in R and requires the glpk package. A year or so ago, I was able 
to download and install the glpk package directly from inside the R 
program (for Windows), and everything worked fine. Now I have 
installed R for Windows on a new computer, and I cannot find the glpk 
package on the list of available packages on my local CRAN mirror.


I do see two packages labeled Rglpk and glpkAPI, but as far as I 
can understand these just interface with an outside version of glpk, 
which I don't have. I also found an old link to R glpk in a Wikibooks 
page, but it now says Package ‘glpk’ was removed from the CRAN 
repository.


Is there a reason why the glpk package is no longer available? Can 
anyone suggest the easiest workaround? I tried but failed to install 
an outside version of glpk for windows. I am working on putting 
together a protocol paper that includes the use of this code, so I 
need some option that is simple enough that anyone can do it, without 
first needing to be a computer expert.



  Have you tried to contact the author(s) and maintainer(s) of 
Rglpk and glpkAPI to see what they recommend?  If it were my project, I 
think I might try that first.



  Parallel with that, I might download glpk_4.8-0.5.tar.gz from 
CRAN -> Packages -> Archived.  Then I'd run R CMD check and R CMD INSTALL on 
that.  Then I'd look for the author(s) and maintainer(s) listed there, 
and try to contact them (or him or her).
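A hedged sketch of that route from within R; the archive URL pattern below is my guess at the layout and should be verified on CRAN before use:

```r
# Install a source package straight from the CRAN archive.
# URL/version are unverified assumptions; on Windows this needs Rtools.
url <- "https://cran.r-project.org/src/contrib/Archive/glpk/glpk_4.8-0.5.tar.gz"
if (interactive()) install.packages(url, repos = NULL, type = "source")
```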



  I don't know, but I'd guess that glpk was removed from CRAN after 
it failed to pass some new CRAN test and the maintainer failed to reply 
to requests from the CRAN maintainers for updates.



  What other package(s) will be referenced in your protocol paper?  
Is there one you control?  If yes, and you only wanted a few functions 
from glpk, you might just copy that into your package and run with it.  
If no, I think you could name yourself the maintainer of glpk, give it a 
newer number, get it to pass R CMD check, and submit your new version to 
CRAN.  Along the way, you could add other things to make the protocol 
even easier.



  Hope this helps.
  Spencer



Thanks in advance for your help.

Steven Backues




--
Spencer Graves, PE, PhD
President and Chief Technology Officer
Structure Inspection and Monitoring, Inc.
751 Emerson Ct.
San José, CA 95126
ph:  408-655-4567
web:  www.structuremonitoring.com



Re: [R] Transform pairwise observations into a table

2012-10-02 Thread William Dunlap
Another base-R-only solution uses a 2-column matrix of subscripts
to fill a matrix.  E.g.,

f <- function(data) {
  mat <- matrix(NA_real_, nrow = max(data[[1]]), ncol = max(data[[2]]))
  mat[cbind(data[[1]], data[[2]])] <- data[[3]]
  mat
}

f(dat1)
      [,1]  [,2]  [,3] [,4]
[1,] 1.000 0.250 0.125  0.5
[2,] 0.250 1.000 0.125  0.5
[3,] 0.125 0.125 1.000  0.5
[4,] 0.500 0.500 0.500  1.0

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com


 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
 Behalf
 Of David Winsemius
 Sent: Monday, October 01, 2012 7:33 PM
 To: arun
 Cc: R help; Bert Gunter; Hadjixenofontos, Athena
 Subject: Re: [R] Transform pairwise observations into a table
 
 
 On Oct 1, 2012, at 2:30 PM, arun wrote:
 
  HI AHJ,
  No problem.
 
  One more way in addition to reshape() (Rui's suggestion) to get the same 
  result.
  library(reshape)
 
  as.matrix(cast(melt(dat1, id = c("ind1", "ind2")), ind1 ~ ind2, value = "value"))
  #  1 2 3   4
  #1 1.000 0.250 0.125 0.5
  #2 0.250 1.000 0.125 0.5
  #3 0.125 0.125 1.000 0.5
  #4 0.500 0.500 0.500 1.0
  A.K.
 
 That looks a tad ... well, ... complicated. So perhaps these base-only 
 solutions with tapply
 might be more accessible: Some of them do border on the whimsical, I will 
 admit:
 
 with(dat1, tapply(coef, list(ind1, ind2), I))
 
 with(dat1, tapply(coef, list(ind1, ind2), c))
 
 with(dat1, tapply(coef, list(ind1, ind2), "^", 1))
 
 with(dat1, tapply(coef, list(ind1, ind2), "+", 0))
 
 It is a specific response to the request for a `table`-like function that 
 would allow the application of other functions. Change the `1` to `2` in the 
 third instance and you get the tabulated squares. And do not forget the 
 availability of `ftable` to flatten the output of `tapply` returned values.
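To make those one-liners reproducible on their own, here is a small sketch with a stand-in `dat1` (a 2x2 subset of the thread's data):

```r
# Stand-in for the dat1 defined earlier in the thread
dat1 <- data.frame(ind1 = c(1, 1, 2, 2),
                   ind2 = c(1, 2, 1, 2),
                   coef = c(1, 0.25, 0.25, 1))

# tapply() places each coef at the (ind1, ind2) cell of a matrix
m <- with(dat1, tapply(coef, list(ind1, ind2), c))
m
#      1    2
# 1 1.00 0.25
# 2 0.25 1.00
```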
 
 
 --
 David.
 
 
  - Original Message -
  From: Hadjixenofontos, Athena ahadjixenofon...@med.miami.edu
  To: arun smartpink...@yahoo.com
  Cc: R help r-help@r-project.org
  Sent: Monday, October 1, 2012 12:59 PM
  Subject: Re: [R] Transform pairwise observations into a table
 
  Thank you. I had looked at xtabs but misunderstood the syntax. This is 
  great. :)
 
  AHJ
 
 
 
  On Oct 1, 2012, at 12:53 PM, arun smartpink...@yahoo.com wrote:
 
  Hi,
  Try this:
 
 dat1 <- read.table(text = "
 ind1 ind2 coef
 1 1 1
 1 2 0.25
 1 3 0.125
 1 4 0.5
 2 2 1
 2 1 0.25
 2 3 0.125
 2 4 0.5
 3 3 1
 3 1 0.125
 3 2 0.125
 3 4 0.5
 4 4 1
 4 1 0.5
 4 2 0.5
 4 3 0.5
 ", sep = "", header = TRUE)
 mat1 <- as.matrix(xtabs(coef ~ ind1 + ind2, data = dat1))
 
 #     ind2
 # ind1     1     2     3     4
 #    1 1.000 0.250 0.125 0.500
 #    2 0.250 1.000 0.125 0.500
 #    3 0.125 0.125 1.000 0.500
 #    4 0.500 0.500 0.500 1.000
 
  A.K.
 
 
 
  - Original Message -
  From: AHJ ahadjixenofon...@med.miami.edu
  To: r-help@r-project.org
  Cc:
  Sent: Monday, October 1, 2012 12:17 PM
  Subject: [R] Transform pairwise observations into a table
 
  Hi,
 
  I have a table of pairs of individuals and a coefficient that belongs to 
  the
  pair:
 
 ind1 ind2 coef
 1    1    1
 1    2    0.25
 1    3    0.125
 1    4    0.5
 2    2    1
 2    1    0.25
 2    3    0.125
 2    4    0.5
 3    3    1
 3    1    0.125
 3    2    0.125
 3    4    0.5
 4    4    1
 4    1    0.5
 4    2    0.5
 4    3    0.5
 
  And I want to convert it to a matrix where each individual is both a row 
  and
  a column and at the intersection of each pair is the coefficient that
  belongs to that pair:
 
       1      2      3      4
 1  1      0.25   0.125  0.5
 2  0.25   1      0.125  0.5
 3  0.125  0.125  1      0.5
 4  0.5    0.5    0.5    1
 
  If table() would allow me to specify something other than frequencies to
  fill the table with, it would be what I need. I tried a few different
  combinations of t() and unique() but none of it made enough sense to post 
  as
  my starting code... I am just lost. Any help would be greatly appreciated.
 
  Thank you,
  AHJ
 
 
 
 
 
 
 
 
 David Winsemius, MD
 Alameda, CA, USA
 

Re: [R] Basic question about: - and method start with dot.

2012-10-02 Thread peter dalgaard

On Oct 2, 2012, at 13:35 , Hadley Wickham wrote:

 What is the special meaning for the method name start with a dot?
 
 It means nothing in particular, except that such objects don't show up in 
 ls() by default. The _intention_ is usually that the function is only to be 
 used internally and not for end-user use.
 
 But these days, if you're writing a package, you're better off using 
 namespaces.

Sure, but that doesn't keep package writers from using that kind of naming 
convention for non-exported objects. (Notice that this started as a question 
about reading 3rd party source codes.)

-- 
Peter Dalgaard, Professor
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd@cbs.dk  Priv: pda...@gmail.com



Re: [R] Problem with multi-dimensional array

2012-10-02 Thread Berend Hasselman

On 02-10-2012, at 16:20, Loukia Spineli spinelilouki...@gmail.com wrote:

 I want to make a multi-dimensional array. To be specific I want to make the
 following array
 
 results <- array(0, dim = c(2, 2, 64, 7))
 
 This is the code I have created but it gives no result due to the error
 subscript out of bound.
 
 x <- rep(7, 7)  # Missingness in intervention
 y <- rep(7, 7)  # Missingness in control
 
 arraynames <- list(Group = c("Success", "Failure"),
                    Outcome = c("Intervention", "Control"),
                    Trial = c(1:7))
 mat.stat <- array(c(9, 16, 10, 15, 66, 12, 44, 23, 102, 88, 66, 104, 277, 60,
                     247, 119, 23, 43, 20, 41, 201, 162, 122, 263, 14, 41, 4, 41),
                   dim = c(2, 2, 7), dimnames = arraynames)
 mat.stat
 
 Nx <- rep(0, length(x))
 Ny <- rep(0, length(y))
 n <- (x + 1) * (y + 1)
 results <- array(0, dim = c(2, 2, 64, 7))
 l <- 1
 for (i in 1:length(x)) {
   Nx[i] <- length(1:(x[i] + 1))
   Ny[i] <- length(1:(y[i] + 1))
   for (j in 1:(Nx[i])) {
     for (k in 1:(Ny[i])) {
       results[, , l, i] <- mat.stat[, , i] +
         matrix(c(c(0:x[i])[j], c(0:y[i])[k], -c(0:x[i])[j], -c(0:y[i])[k]),
                nrow = 2, ncol = 2, byrow = TRUE)
       l <- l + 1
     }
   }
 }
 results

If you had inserted this line

cat("i=", i, " l=", l, " j=", j, " k=", k, "\n")

before the assignment to results[,,l,i] in the inner loop, you could quite 
easily have seen what the problem is: the counter l (the letter el) keeps 
increasing in your code, and when it reaches 65 R complains.

So at an appropriate place in the code you need to reset l to 1.
I suggest you move the assignment l <- 1 to just before the line with for (j in 
1:Nx[i]) ...
Whether the results are then what you desire is up to you to decide.

Please use more whitespace and indent your code. As posted it is unreadable.

Berend


Re: [R] Problem with multi-dimensional array

2012-10-02 Thread Rui Barradas

Hello,
See if this is it.


Nx <- rep(0, length(x))
Ny <- rep(0, length(y))
n <- (x + 1) * (y + 1)
results <- array(0, dim = c(2, 2, 64, 7))
# l <- 1 # <-- This changed place
for (i in 1:length(x)) {
  Nx[i] <- length(1:(x[i] + 1))
  Ny[i] <- length(1:(y[i] + 1))
  l <- 1 # <-- To here
  for (j in 1:(Nx[i])) {
    for (k in 1:(Ny[i])) {


Hope this helps,

Rui Barradas
On 02-10-2012 15:20, Loukia Spineli wrote:

I want to make a multi-dimensional array. To be specific I want to make the
following array

results <- array(0, dim = c(2, 2, 64, 7))

This is the code I have created but it gives no result due to the error
subscript out of bound.

x <- rep(7, 7)  # Missingness in intervention
y <- rep(7, 7)  # Missingness in control

arraynames <- list(Group = c("Success", "Failure"),
                   Outcome = c("Intervention", "Control"),
                   Trial = c(1:7))
mat.stat <- array(c(9, 16, 10, 15, 66, 12, 44, 23, 102, 88, 66, 104, 277, 60,
                    247, 119, 23, 43, 20, 41, 201, 162, 122, 263, 14, 41, 4, 41),
                  dim = c(2, 2, 7), dimnames = arraynames)
mat.stat

Nx <- rep(0, length(x))
Ny <- rep(0, length(y))
n <- (x + 1) * (y + 1)
results <- array(0, dim = c(2, 2, 64, 7))
l <- 1
for (i in 1:length(x)) {
  Nx[i] <- length(1:(x[i] + 1))
  Ny[i] <- length(1:(y[i] + 1))
  for (j in 1:(Nx[i])) {
    for (k in 1:(Ny[i])) {
      results[, , l, i] <- mat.stat[, , i] +
        matrix(c(c(0:x[i])[j], c(0:y[i])[k], -c(0:x[i])[j], -c(0:y[i])[k]),
               nrow = 2, ncol = 2, byrow = TRUE)
      l <- l + 1
    }
  }
}
results

Any suggestion would be really welcome!
Thank you very much!

Loukia






[R] kmeans cluster analysis. How do I (1) determine probability of cluster membership (2) determine cluster membership for a new subject

2012-10-02 Thread John Sorkin
Window XP
R 2.15
 
I am running a cluster analysis in which I ask for three clusters (see code 
below). The analysis nicely tells me what cluster each of the subjects in my 
input dataset belongs to. I would like two pieces of information
(1) for every subject in my input data set, what is the probability of the 
subject belonging to each of the three clusters?
(2) given a new subject, someone who was not in my original dataset, how can I 
determine their cluster assignment?
Thanks,
John
 
# K-Means Cluster Analysis
jclusters <- 3
fit <- kmeans(datascaled, jclusters)  # 3-cluster solution
 
and fit$cluster tells me what cluster each observation in my input dataset 
belongs to (output truncated for brevity):
 
fit$cluster
  1   2   3   4   5   6   7   8   9  10  11  12  13  14  15  16  17 . . . .
  1   1   1   1   3   1   1   1   1   2   1   2   1   1   1   1   1 . . . .

How do I get the probability of being in cluster 1, cluster 2, and cluster 3 
for a given subject, e.g. datascaled[1,]? How do I get the cluster assignment 
for a new subject?

Thanks,
John
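On (2), kmeans() itself gives only hard assignments, but a new (identically scaled) observation can be assigned to its nearest center. A sketch under simulated data; the helper name and data are made up:

```r
set.seed(42)
datascaled <- scale(matrix(rnorm(60), ncol = 2))  # simulated stand-in data
fit <- kmeans(datascaled, centers = 3)

# Hypothetical helper: nearest-center assignment for new scaled rows
predict_kmeans <- function(fit, newdata) {
  newdata <- as.matrix(newdata)
  d2 <- sapply(seq_len(nrow(fit$centers)), function(k)
    colSums((t(newdata) - fit$centers[k, ])^2))   # squared distances
  d2 <- matrix(d2, nrow = nrow(newdata))          # rows: obs, cols: centers
  max.col(-d2)                                    # index of the closest center
}

predict_kmeans(fit, fit$centers)  # each center maps to its own cluster: 1 2 3
```

On (1), kmeans produces no membership probabilities; a model-based alternative such as Mclust() in the mclust package returns posterior membership probabilities directly.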
John David Sorkin M.D., Ph.D.
Chief, Biostatistics and Informatics
University of Maryland School of Medicine Division of Gerontology
Baltimore VA Medical Center
10 North Greene Street
GRECC (BT/18/GR)
Baltimore, MD 21201-1524
(Phone) 410-605-7119
(Fax) 410-605-7913 (Please call phone number above prior to faxing)
Confidentiality Statement:
This email message, including any attachments, is for the sole use of the 
intended recipient(s) and may contain confidential and privileged information.  
Any unauthorized use, disclosure or distribution is prohibited.  If you are not 
the intended recipient, please contact the sender by reply email and destroy 
all copies of the original message. 


[R] Integration in R

2012-10-02 Thread Naser Jamil
Dear R-users,
I am facing a problem with integrating in R a likelihood function which is a
function of four parameters. It gives me the result in the end, but it takes
more than half an hour to run. I'm wondering whether there is a more
efficient way to deal with this. The following is my code. I am ready to
provide any other description of my function if you need it to move forward.



library(cubature)
dose <- c(2, 3, 5)
y0 <- c(2, 1, 0)
y1 <- c(1, 1, 1)
y2 <- c(0, 1, 2)

lf <- function(x) {
  v <- 1
  for (i in 1:length(dose)) {
    psi0 <- 1 / ((1 + exp(x[1] + x[2] * dose[i])) * (1 + exp(x[3] + x[4] * dose[i])))
    psi1 <- exp(x[1] + x[2] * dose[i]) /
      ((1 + exp(x[1] + x[2] * dose[i])) * (1 + exp(x[3] + x[4] * dose[i])))
    v <- v * (psi0^y0[i]) * (psi1^y1[i]) * ((1 - psi0 - psi1)^y2[i])
  }
  return(v)
}

adaptIntegrate(lf, lowerLimit = c(-20, 0, -20, 0), upperLimit = c(0, 10, 0, 10))
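One knob worth trying before restructuring anything: adaptIntegrate()'s `tol` and `maxEval` arguments trade accuracy for speed. A hedged sketch with a stand-in smooth integrand (not the likelihood above):

```r
library(cubature)  # assumed installed

# Stand-in smooth 4-D integrand
f <- function(x) exp(-sum(x^2))

# A looser tolerance and an evaluation cap can cut runtime dramatically;
# compare against a tighter run to check the answer is still accurate enough.
res <- adaptIntegrate(f, lowerLimit = rep(-2, 4), upperLimit = rep(2, 4),
                      tol = 1e-3, maxEval = 1e5)
res$integral  # about 9.69 for this integrand
```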

-
Thanks for your attention.


Kind regards,

Jamil.




Re: [R] Basic question about: - and method start with dot.

2012-10-02 Thread Fabrice Tourre
require(fortunes)
Loading required package: fortunes
fortune("<<-")

I wish <<- had never been invented, as it makes an esoteric and dangerous
feature of the language *seem* normal and reasonable. If you want to dumb down
R/S into a macro language, this is the operator for you.
   -- Bill Venables
  R-help (July 2001)

On Tue, Oct 2, 2012 at 11:14 AM, peter dalgaard pda...@gmail.com wrote:

 On Oct 2, 2012, at 13:35 , Hadley Wickham wrote:

 What is the special meaning for the method name start with a dot?

 It means nothing in particular, except that such objects don't show up in 
 ls() by default. The _intention_ is usually that the function is only to be 
 used internally and not for end-user use.

 But these days, if you're writing a package, you're better off using 
 namespaces.

 Sure, but that doesn't keep package writers from using that kind of naming 
 convention for non-exported objects. (Notice that this started as a question 
 about reading 3rd party source codes.)

 --
 Peter Dalgaard, Professor
 Center for Statistics, Copenhagen Business School
 Solbjerg Plads 3, 2000 Frederiksberg, Denmark
 Phone: (+45)38153501
 Email: pd@cbs.dk  Priv: pda...@gmail.com




[R] count data as independent variable in logistinc regression

2012-10-02 Thread vlagani


Dear R users,

I would like to employ count data as covariates while fitting a  
logistic regression model. My question is:


do I violate any assumption of the logistic (and, more in general, of  
the generalized linear) models by employing count, non-negative  
integer variables as independent variables?


I found a lot of references in the literature regarding how to use  
count data as outcome, but not as covariates; see for example the very  
clear paper: N E Breslow (1996) Generalized Linear Models: Checking  
Assumptions and Strengthening Conclusions, Congresso Nazionale Societa  
Italiana di Biometria, Cortona June 1995, available at

http://biostat.georgiahealth.edu/~dryu/course/stat9110spring12/land16_ref.pdf.

Loosely speaking, it seems that glm assumptions may be expressed as follows:

iid residuals;
the link function must correctly represent the relationship among  
dependent and independent variables;

absence of outliers

Does anybody know whether there exists any other  
assumption or technical problem that would suggest using some other type  
of model for dealing with count covariates?


Finally, please notice that my data contain relatively few samples  
(100) and that count variables' ranges can vary within 3-4 orders of  
magnitude (i.e. some variables have values in the range 0-10, while  
other variables may have values within 0-1).


A simple example code follows:

###

# generating simulated data
var1 = sample(0:10, 100, replace = TRUE);
var2 = sample(0:1000, 100, replace = TRUE);
var3 = sample(0:10, 100, replace = TRUE);
outcome = sample(0:1, 100, replace = TRUE);
dataset = data.frame(outcome, var1, var2, var3);

#fitting the model
model = glm(outcome ~ ., family=binomial, data = dataset)

#inspecting the model
print(model)

###
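Not an answer to the assumptions question, but on the orders-of-magnitude point: covariates with very different ranges are often log-transformed or standardized before fitting, mainly for numerical stability and interpretability. A hedged sketch on the simulated data above:

```r
set.seed(1)
var1 <- sample(0:10, 100, replace = TRUE)
var2 <- sample(0:1000, 100, replace = TRUE)
outcome <- sample(0:1, 100, replace = TRUE)

# log1p() compresses the wide-ranging count but keeps zero counts defined
dataset <- data.frame(outcome, var1, log_var2 = log1p(var2))
model <- glm(outcome ~ ., family = binomial, data = dataset)
coef(model)
```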

Regards,

--
Vincenzo Lagani
Research Fellow
BioInformatics Laboratory
Institute of Computer Science
Foundation for Research and Technology - Hellas



Re: [R] count data as independent variable in logistinc regression

2012-10-02 Thread Bert Gunter
This is not primarily an R question, although I grant you that it
might intersect packages in R that do what you want. Nevertheless, I
think you would do better posting on a statistical list, like
stats.stackexchange.com . Maybe once you've figured out there what you
want, you can come back to R to find an implementation.

Cheers,
Bert

On Tue, Oct 2, 2012 at 9:10 AM,  vlag...@ics.forth.gr wrote:

 Dear R users,

 I would like to employ count data as covariates while fitting a logistic
 regression model. My question is:

 do I violate any assumption of the logistic (and, more in general, of the
 generalized linear) models by employing count, non-negative integer
 variables as independent variables?

 I found a lot of references in the literature regarding how to use count
 data as outcome, but not as covariates; see for example the very clear
 paper: N E Breslow (1996) Generalized Linear Models: Checking Assumptions
 and Strengthening Conclusions, Congresso Nazionale Societa Italiana di
 Biometria, Cortona June 1995, available at
 http://biostat.georgiahealth.edu/~dryu/course/stat9110spring12/land16_ref.pdf.

 Loosely speaking, it seems that glm assumptions may be expressed as follows:

 iid residuals;
 the link function must correctly represent the relationship among dependent
 and independent variables;
 absence of outliers

 Does anybody know whether there exists any other assumption or technical
 problem that would suggest using some other type of model for dealing with
 count covariates?

 Finally, please notice that my data contain relatively few samples (100)
 and that count variables' ranges can vary within 3-4 orders of magnitude
 (i.e. some variables have values in the range 0-10, while other variables may
 have values within 0-1).

 A simple example code follows:

 ###

 # generating simulated data
 var1 = sample(0:10, 100, replace = TRUE);
 var2 = sample(0:1000, 100, replace = TRUE);
 var3 = sample(0:10, 100, replace = TRUE);
 outcome = sample(0:1, 100, replace = TRUE);
 dataset = data.frame(outcome, var1, var2, var3);

 #fitting the model
 model = glm(outcome ~ ., family=binomial, data = dataset)

 #inspecting the model
 print(model)

 ###

 Regards,

 --
 Vincenzo Lagani
 Research Fellow
 BioInformatics Laboratory
 Institute of Computer Science
 Foundation for Research and Technology - Hellas




-- 

Bert Gunter
Genentech Nonclinical Biostatistics

Internal Contact Info:
Phone: 467-7374
Website:
http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm



Re: [R] Error messages when attempting to calculate polychoric correlation matrices using the psych package

2012-10-02 Thread Kevin Cheung
Dear Professor Fox,

Apologies for my oversight relating to the polychor command and thank you for 
your advice. I turned to the polychor command when trying to find an equivalent 
for the polychoric command found in the psych package (I am following a 
procedure outlined in Gadermann, Guhn & Zumbo, 2012, that uses this command 
rather than the polycor package) . The polychoric command is returning an error:

 polychoric(data.matrix18)
Error in max(x, na.rm = TRUE) - min(x, na.rm = TRUE) :
  non-numeric argument to binary operator

I have now computed a polychoric correlation matrix using your hetcor command 
and the two-step estimation method (ML returned an error that I have read about 
in other posts); I didn't have to modify the numeric data. I have framed the 
output into a matrix using crude copy and paste methods with Excel as I was not 
able to frame the output values using R.
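Incidentally, the copy-and-paste step should be avoidable: hetcor() returns an object whose numeric correlation matrix can be extracted directly. A sketch on simulated ordinal items (the real data frame is not available here, so these items are made up):

```r
library(polycor)  # assumed installed

set.seed(123)
# Three correlated latent variables cut into ordered categories
z <- matrix(rnorm(600), ncol = 3) %*%
  chol(matrix(c(1, .5, .3, .5, 1, .4, .3, .4, 1), 3))
items <- as.data.frame(lapply(as.data.frame(z), function(v)
  ordered(cut(v, c(-Inf, -0.5, 0.5, Inf)))))

hc <- hetcor(items, ML = FALSE)   # two-step estimates
pol.matrix <- hc$correlations     # plain numeric matrix, no Excel needed
round(pol.matrix, 2)
```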

pol.matrix -
structure(list(V1 = c(1, 0.1810841, 0.380326, 0.2882147, 0.3106325,
0.2296951, 0.2055215, 0.262325, 0.6560471, 0.3579354, 0.4106915,
0.4857118, 0.3846493, 0.3773653, 0.3806744, 0.3021153, 0.56,
0.2136178), V2 = c(0.1810841, 1, 0.4008101, 0.4456874, 0.5201655,
0.2299313, 0.1537878, 0.6107596, 0.1961909, 0.245016, 0.261282,
0.232519, 0.3282176, 0.3853363, 0.3923762, 0.2678565, 0.594987,
0.5488453), V3 = c(0.380326, 0.4008101, 1, 0.5972196, 0.5136961,
0.2760067, 0.2929657, 0.3148125, 0.4218976, 0.4259719, 0.4035297,
0.4345366, 0.5482426, 0.6126576, 0.5080505, 0.3579256, 0.4045489,
0.2919147), V4 = c(0.2882147, 0.4456874, 0.5972196, 1, 0.4262034,
0.2351873, 0.278928, 0.434891, 0.3305545, 0.3283345, 0.2996847,
0.3660209, 0.4479252, 0.4864451, 0.4788135, 0.2796844, 0.3950568,
0.3319467), V5 = c(0.3106325, 0.5201655, 0.5136961, 0.4262034,
1, 0.2102538, 0.3045312, 0.5690656, 0.2422597, 0.3044538, 0.2995296,
0.217496, 0.3671066, 0.421044, 0.4349045, 0.2683132, 0.6243316,
0.551007), V6 = c(0.2296951, 0.2299313, 0.2760067, 0.2351873,
0.2102538, 1, 0.428175, 0.2127635, 0.2863833, 0.3223868, 0.5212511,
0.3909867, 0.3676703, 0.3744179, 0.3025505, 0.7096538, 0.308744,
0.2504025), V7 = c(0.2055215, 0.1537878, 0.2929657, 0.278928,
0.3045312, 0.428175, 1, 0.2071432, 0.2756563, 0.1874141, 0.4163661,
0.3852869, 0.336576, 0.3489307, 0.3197317, 0.5118747, 0.3065872,
0.2786799), V8 = c(0.262325, 0.6107596, 0.3148125, 0.434891,
0.5690656, 0.2127635, 0.2071432, 1, 0.3062833, 0.3587556, 0.2742062,
0.2397737, 0.3471734, 0.3757533, 0.4882055, 0.2490953, 0.7115149,
0.635405), V9 = c(0.6560471, 0.1961909, 0.4218976, 0.3305545,
0.2422597, 0.2863833, 0.2756563, 0.3062833, 1, 0.4575778, 0.433267,
0.6138503, 0.3967928, 0.4067687, 0.4040989, 0.3540081, 0.2188774,
0.2753278), V10 = c(0.3579354, 0.245016, 0.4259719, 0.3283345,
0.3044538, 0.3223868, 0.1874141, 0.3587556, 0.4575778, 1, 0.3523538,
0.4246141, 0.3921508, 0.4832027, 0.4696606, 0.3012235, 0.3051279,
0.2754337), V11 = c(0.4106915, 0.261282, 0.4035297, 0.2996847,
0.2995296, 0.5212511, 0.4163661, 0.2742062, 0.433267, 0.3523538,
1, 0.5791717, 0.4740649, 0.4200025, 0.4236943, 0.6216296, 0.3404673,
0.2766948), V12 = c(0.4857118, 0.232519, 0.4345366, 0.3660209,
0.217496, 0.3909867, 0.3852869, 0.2397737, 0.6138503, 0.4246141,
0.5791717, 1, 0.6038785, 0.5410787, 0.5420467, 0.4370163, 0.2656908,
0.1940703), V13 = c(0.3846493, 0.3282176, 0.5482426, 0.4479252,
0.3671066, 0.3676703, 0.336576, 0.3471734, 0.3967928, 0.3921508,
0.4740649, 0.6038785, 1, 0.6883049, 0.553011, 0.3793382, 0.3710298,
0.3068081), V14 = c(0.3773653, 0.3853363, 0.6126576, 0.4864451,
0.421044, 0.3744179, 0.3489307, 0.3757533, 0.4067687, 0.4832027,
0.4200025, 0.5410787, 0.6883049, 1, 0.6113808, 0.3660114, 0.3863373,
0.3410903), V15 = c(0.3806744, 0.3923762, 0.5080505, 0.4788135,
0.4349045, 0.3025505, 0.3197317, 0.4882055, 0.4040989, 0.4696606,
0.4236943, 0.5420467, 0.553011, 0.6113808, 1, 0.407302, 0.5054361,
0.4577031), V16 = c(0.3021153, 0.2678565, 0.3579256, 0.2796844,
0.2683132, 0.7096538, 0.5118747, 0.2490953, 0.3540081, 0.3012235,
0.6216296, 0.4370163, 0.3793382, 0.3660114, 0.407302, 1, 0.3743952,
0.3585206), V17 = c(0.56, 0.594987, 0.4045489, 0.3950568,
0.6243316, 0.308744, 0.3065872, 0.7115149, 0.2188774, 0.3051279,
0.3404673, 0.2656908, 0.3710298, 0.3863373, 0.5054361, 0.3743952,
1, 0.7651242), V18 = c(0.2136178, 0.5488453, 0.2919147, 0.3319467,
0.551007, 0.2504025, 0.2786799, 0.635405, 0.2753278, 0.2754337,
0.2766948, 0.1940703, 0.3068081, 0.3410903, 0.4577031, 0.3585206,
0.7651242, 1)), .Names = c("V1", "V2", "V3", "V4", "V5", "V6",
"V7", "V8", "V9", "V10", "V11", "V12", "V13", "V14", "V15", "V16",
"V17", "V18"), class = "data.frame", row.names = c(NA, -18L))


I have then used this matrix with the alpha command in psych:

> alpha(pol.matrix)

Reliability analysis
Call: alpha(x = pol.matrix)

  raw_alpha std.alpha G6(smc) average_r
       0.92      0.92    0.94      0.39

 Reliability if an item is dropped:
   raw_alpha std.alpha G6(smc) average_r
V1      0.92      0.92    0.94      0.39
V2      0.92      0.92    0.94      0.39
V3      0.91

Re: [R] R for commercial use

2012-10-02 Thread S Ellison
 
 cool, I'm none the wiser now, but thanks anyway
That would be because your IT questionnaire requires telepathy to understand 
what the writer wanted to know.

My take on these, in case it helps:

1. Is R a Clientsoftware / Serversoftware / Systemsoftware? 
R can run both on a client or on a server.  
However, it is normally installed on a client machine in the same way as Word, 
Excel, Gimp, etc. If you intend to install it on a desktop, I suggest the 
answer 'client'.

2. Does R need a challenge-response treatment for activation? 
No

3. Is R proxy-able? 
R can access the internet through a proxy. Usually installing using the 
Internet2 option lets R work through Windows' existing proxy setup.
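For completeness, a proxy can also be set for the current session from within R itself, before any download is attempted; the host and port below are placeholders, not real values:

```r
# hypothetical proxy host/port -- substitute your site's values
Sys.setenv(http_proxy = "http://proxy.example.com:8080")
Sys.getenv("http_proxy")
```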

4. Is the personalised WINDOWS-NTLM-authorization at ISA-proxy possible? 
I have no idea why any sensible IT staff would expect an end user to be able to 
answer this question. I deduce your IT staff are not sensible. Further, this 
refers to a LAN security protocol that authenticates client-server interactions 
(badly, I assume, as even Microsoft no longer recommends it). 
I suggest you answer 'N/A' because that authentication should be handled by 
client-server networking protocols and not by R


5. Are any exceptions for virus scanner needed? If yes, which ones? 
Not usually; R does not trigger antivirus alerts unless the antivirus software 
is making mistakes. That happened once to my knowledge.

6. Does R change system files or folders (e.g. system32, registry)? If yes, 
which ones? 
R's installation process adds the R version to the registry and associates 
.Rdata files with R via the usual registry mechanism. You can tell it not to do 
that. Further, if it fails to do so for permission reasons R will still run 
normally. 
I prefer to get our admin staff to install R as administrators to get the 
registry properly updated, and I also ask them to grant write permission to the 
R subdirectory in the program files directory so that R can install contributed 
packages there. But that is not essential; R can install packages in user space 
instead if permission is not granted for program files space.
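A sketch of that user-space fallback (the commented install line names an arbitrary example package; recent R versions will also offer to create this directory automatically when the main library is read-only):

```r
# Per-user library directory, taken from the R_LIBS_USER environment variable
userlib <- Sys.getenv("R_LIBS_USER")
dir.create(userlib, recursive = TRUE, showWarnings = FALSE)
.libPaths(c(userlib, .libPaths()))   # search the user library first

# install.packages("polycor", lib = userlib)  # installs would then land here
.libPaths()
```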

Of course, if a user uses R to read and write from the system areas, then it 
can change files there; but that is equally true of Word, Excel, etc.



***
This email and any attachments are confidential. Any use...{{dropped:8}}



[R] R process must die - can I save history?

2012-10-02 Thread Mike Miller
I connected from my desktop Linux box to a Linux server using ssh in an 
xterm, but that xterm was running in Xvnc.  I'm running R on the server in 
that xterm (over ssh).  Something went wrong with Xvnc that has caused it 
to hang, probably this bug:


https://bugs.launchpad.net/ubuntu/+source/vnc4/+bug/819473

So I can't get back to that ssh session or to R.  I had done a bunch of 
work in R but the command history hasn't been written out.  If I kill R, I 
assume the command history is gone.  I wish I could somehow cause R to 
dump the command history.  Is there any way to tell the running R process 
to write the history somewhere?


Thanks in advance.

Mike

--
Michael B. Miller, Ph.D.
Minnesota Center for Twin and Family Research
Department of Psychology
University of Minnesota



Re: [R] Integration in R

2012-10-02 Thread Berend Hasselman

On 02-10-2012, at 17:23, Naser Jamil jamilnase...@gmail.com wrote:

 Dear R-users,
 I am facing a problem with integrating in R a likelihood function which is a
 function of four parameters. It gives me the result in the end but
 takes more than half an hour to run. I'm wondering if there is any more
 efficient way to deal with it. The following is my code. I am ready to provide
 any other description of my function if you need it to move forward.
 
 
 
 library(cubature)
 dose <- c(2,3,5)
 y0 <- c(2,1,0)
 y1 <- c(1,1,1)
 y2 <- c(0,1,2)
 
 lf <- function (x) {
     v <- 1
     for (i in 1:length(dose)) {
         psi0 <- 1/((1+exp(x[1]+x[2]*dose[i]))*(1+exp(x[3]+x[4]*dose[i])))
         psi1 <- exp(x[1]+x[2]*dose[i])/((1+exp(x[1]+x[2]*dose[i]))*(1+exp(x[3]+x[4]*dose[i])))
         v <- v*(psi0^y0[i])*(psi1^y1[i])*((1-psi0-psi1)^y2[i])
     }
     return(v)
 }
 
 adaptIntegrate(lf, lowerLimit = c(-20, 0,-20,0), upperLimit = c(0,10,0,10))
 

There are several things you could do.

1. Use the compiler package to make a compiled version of your function.
2. rewrite your function by putting x[1] etc. in separate variables x1, x2,... 
so avoiding the [..] indexing. You can do the same for dose[i].
And also compiling this version of the function.
3. do not recompute expressions such as exp(x1+x2*dose.i) several times. Store 
the result in a temporary variable and use that variable.

With these functions

library(compiler)

lf.c <- cmpfun(lf)

lf1 <- function (x) {
    v <- 1
    x1 <- x[1]
    x2 <- x[2]
    x3 <- x[3]
    x4 <- x[4]
    for (i in 1:length(dose)) {
        dose.i <- dose[i]
        z1 <- exp(x1+x2*dose.i)
        z2 <- exp(x3+x4*dose.i)
        psi0 <- 1/((1+z1)*(1+z2))
        psi1 <- z1*psi0
        v <- v*(psi0^y0[i])*(psi1^y1[i])*((1-psi0-psi1)^y2[i])
    }
    return(v)
}

lf1.c <- cmpfun(lf1)

I tested relative speeds with this code (small tolerance and max. function 
evaluations)

library(rbenchmark)

f1 <- function() adaptIntegrate(lf   , lowerLimit = c(-20, 0,-20,0), upperLimit 
= c(0,10,0,10), tol=1e-3, maxEval=5)
f2 <- function() adaptIntegrate(lf.c , lowerLimit = c(-20, 0,-20,0), upperLimit 
= c(0,10,0,10), tol=1e-3, maxEval=5)
f3 <- function() adaptIntegrate(lf1  , lowerLimit = c(-20, 0,-20,0), upperLimit 
= c(0,10,0,10), tol=1e-3, maxEval=5)
f4 <- function() adaptIntegrate(lf1.c, lowerLimit = c(-20, 0,-20,0), upperLimit 
= c(0,10,0,10), tol=1e-3, maxEval=5)
benchmark(z1 <- f1(), z2 <- f2(), z3 <- f3(), z4 <- f4(), replications=1)

Result:

> benchmark(z1 <- f1(), z2 <- f2(), z3 <- f3(), z4 <- f4(), replications=1)
        test replications elapsed relative user.self sys.self user.child
1 z1 <- f1()            1   3.197    4.535     3.177    0.008          0
2 z2 <- f2()            1   1.240    1.759     1.235    0.003          0
3 z3 <- f3()            1   2.171    3.079     2.167    0.002          0
4 z4 <- f4()            1   0.705    1.000     0.700    0.004          0

As you can see you should be able to get at least a fourfold speedup by using 
the compiled version of the optimized function.
I would certainly set tol and maxEval to something reasonable initially.
Checking that z1, z2, z3, and z4 are equal is left to you.

Finally, it may even be possible to eliminate the for loop in your function but 
I'll leave that for someone else.

Berend



Re: [R] R process must die - can I save history?

2012-10-02 Thread Bert Gunter
?history

in a fresh R session, to see what might be possible. I'll bet the
answer is "no, you're screwed", though. Nevertheless, maybe Linux
experts can save you.
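For the record, the relevant calls are savehistory()/loadhistory(); they only work from a front end that keeps a history (e.g. the readline console), hence the try() in this sketch:

```r
# Dump the session's command history to a file (a no-op under front ends
# without history support, hence try()):
hfile <- file.path(tempdir(), "rescue.Rhistory")
try(savehistory(file = hfile))

# A later session can then restore it:
try(loadhistory(file = hfile))
```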

May the Force be with you.

-- Bert




-- 

Bert Gunter
Genentech Nonclinical Biostatistics




Re: [R] kmeans cluster analysis. How do I (1) determine probability of cluster membership (2) determine cluster membership for a new subject

2012-10-02 Thread Ranjan Maitra
John,

On Tue, 2 Oct 2012 11:35:12 -0400 John Sorkin
jsor...@grecc.umaryland.edu wrote:

 Windows XP
 R 2.15
  
 I am running a cluster analysis in which I ask for three clusters (see code 
 below). The analysis nicely tells me what cluster each of the subjects in my 
 input dataset belongs to. I would like two pieces of information
 (1) for every subject in my input data set, what is the probability of the 
 subject belonging to each of the three clusters?

K-means provides hard clustering: whichever cluster has the closest mean
gets the assignment.

 (2) given a new subject, someone who was not in my original dataset, how can 
 I determine their cluster assignment?

Look at the distance between the subject and the cluster means: the subject
is assigned to the closest cluster.

If you are looking for probabilistic clustering (under Gaussian
mixture model assumptions), you could use model-based clustering: one R
package is mclust.

Btw, note that kmeans is very sensitive to initialization (as is
mclust): you may want to try several random starts (for kmeans),
at the very least. Use the argument nstart with a huge number.
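Both suggestions in a few lines, illustrated on the built-in iris measurements since the poster's datascaled is not available:

```r
set.seed(1)
datascaled <- scale(iris[, 1:4])            # stand-in for the poster's data
fit <- kmeans(datascaled, centers = 3, nstart = 100)

# assign a new (already scaled) observation to the nearest center
newx <- datascaled[1, ]
d2 <- colSums((t(fit$centers) - newx)^2)    # squared Euclidean distances
as.integer(which.min(d2))                   # its cluster
```

For a point that was in the training data, this reproduces fit$cluster for that point.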

HTH,
Ranjan


 Thanks,
 John
  
 # K-Means Cluster Analysis
 jclusters <- 3
 fit <- kmeans(datascaled, jclusters) # 3 cluster solution
  
 and fit$cluster tells me what cluster each observation in my input dataset 
 belongs to (output truncated for brevity):
  
 > fit$cluster
   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15  16  17 . . . .
   1   1   1   1   3   1   1   1   1   2   1   2   1   1   1   1   1 . . . .
 How do I get the probability of being in cluster 1, cluster 2, and cluster 3
 for a given subject, e.g. datascaled[1,]? How do I get the cluster assignment
 for a new subject?
 Thanks,
 John
 John David Sorkin M.D., Ph.D.
 Chief, Biostatistics and Informatics
 University of Maryland School of Medicine Division of Gerontology
 Baltimore VA Medical Center
 10 North Greene Street
 GRECC (BT/18/GR)
 Baltimore, MD 21201-1524
 (Phone) 410-605-7119
 (Fax) 410-605-7913 (Please call phone number above prior to faxing)
 Confidentiality Statement:
 This email message, including any attachments, is for ...{{dropped:16}}



Re: [R] Integration in R

2012-10-02 Thread Rui Barradas

Hello,

Yes, it's possible to remove the loop. Since the loop is used to compute 
a running product and all we want is the final result, use the 
vectorized behavior of R and a final ?prod().

Speedup: another 2x. And 4x2 == 8, i.e. almost one [decimal] order of magnitude.


lf2 <- function (x) {
    v <- 1
    x1 <- x[1]
    x2 <- x[2]
    x3 <- x[3]
    x4 <- x[4]
    z1 <- exp(x1+x2*dose)
    z2 <- exp(x3+x4*dose)
    psi0 <- 1/((1+z1)*(1+z2))
    psi1 <- z1*psi0
    v <- (psi0^y0)*(psi1^y1)*((1-psi0-psi1)^y2)
    return( prod(v) )
}

lf2.c <- cmpfun(lf2)

Hope this helps,

Rui Barradas





Re: [R] lattice xyplot, get current level

2012-10-02 Thread David Winsemius

On Oct 2, 2012, at 3:59 AM, Christof Kluß wrote:

 Hi
 
 xyplot(y ~ x | subject) plots a separate graph of y against x for each
 level of subject. But I would like to have my own function for each
 level. Something like
 
 xyplot(y ~ x | subject,
   panel = function(x,y) {
 panel.xyplot(x,y)
 
 panel.curve(x,y) {
   # something that depends on the current subject
   ...
 }
   })
 

This seems to be a different question than what you later requested. This 
question would seem to be documented in the 'panel' section of help(xyplot), in 
particular the paragraph beginning with this sentence:

One can also use functions called panel.number and packet.number, representing 
panel order and packet order respectively, inside the panel function (as well 
as the strip function or while interacting with a lattice display using 
trellis.focus etc). 
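A small self-contained sketch of that mechanism (the data and the per-panel line type are invented for illustration):

```r
library(lattice)

dat <- data.frame(x = rep(1:10, 3),
                  y = rnorm(30),
                  subject = rep(c("s1", "s2", "s3"), each = 10))

p <- xyplot(y ~ x | subject, data = dat,
            panel = function(x, y, ...) {
              panel.xyplot(x, y, ...)
              i <- panel.number()                 # index of the panel being drawn
              panel.abline(h = mean(y), lty = i)  # something subject-specific
            })
print(p)   # panels are only drawn (and panel.number() called) on print
```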

-- 

David Winsemius, MD
Alameda, CA, USA



Re: [R] Integration in R

2012-10-02 Thread Berend Hasselman



Wonderful. It certainly does help.
A single nitpick: the v <- 1 at the start of the function can now be removed.

I got a speedup of 7.5 compared to the very first version lf1.

Berend



Re: [R] Integration in R

2012-10-02 Thread Rui Barradas

Hello,

Em 02-10-2012 19:18, Berend Hasselman escreveu:


Wonderful. It certainly does help.
A single nitpick: the v <- 1 at the start of the function can now be removed.

Yes, I thought about removing it but in the end forgot to.


I got a speedup of 7.5 compared to the very first version lf1.

My system is Windows 7, R 2.15.1.

Rui Barradas

> sessionInfo()
R version 2.15.1 (2012-06-22)
Platform: i386-pc-mingw32/i386 (32-bit)

locale:
[1] LC_COLLATE=Portuguese_Portugal.1252 LC_CTYPE=Portuguese_Portugal.1252
[3] LC_MONETARY=Portuguese_Portugal.1252 LC_NUMERIC=C
[5] LC_TIME=Portuguese_Portugal.1252

attached base packages:
[1] compiler  stats graphics  grDevices utils datasets methods
[8] base

other attached packages:
[1] rbenchmark_1.0.0 cubature_1.1-1

loaded via a namespace (and not attached):
[1] fortunes_1.5-0 tools_2.15.1


Berend





Re: [R] kmeans cluster analysis. How do I (1) determine probability of cluster membership (2) determine cluster membership for a new subject

2012-10-02 Thread John Sorkin
Ranjan,
Thank you for your help. What eludes me is how one computes the distance from 
each cluster for each subject. For my first subject, datascaled[1,], I have 
tried to use the following: 
v1 <- sum(fit$centers[1,]*datascaled[1,])
v2 <- sum(fit$centers[2,]*datascaled[1,])
v3 <- sum(fit$centers[2,]*datascaled[1,])
hoping the max(v1,v2,v3) would reproduce the group assignment, i.e. simply 
assign the subject to the group that gives the largest value, but it does not. 
How is the distance to the three clusters computed for each subject?
Thanks,
John 
 



 
 
John David Sorkin M.D., Ph.D.
Chief, Biostatistics and Informatics
University of Maryland School of Medicine Division of Gerontology
Baltimore VA Medical Center
10 North Greene Street
GRECC (BT/18/GR)
Baltimore, MD 21201-1524
(Phone) 410-605-7119
(Fax) 410-605-7913 (Please call phone number above prior to faxing)




Confidentiality Statement:
This email message, including any attachments, is for the sole use of the 
intended recipient(s) and may contain confidential and privileged information.  
Any unauthorized use, disclosure or distribution is prohibited.  If you are not 
the intended recipient, please contact the sender by reply email and destroy 
all copies of the original message. 


Re: [R] Integration in R

2012-10-02 Thread William Dunlap
Another nitpick: don't use return() in the last statement.
It isn't needed, it looks like some other language, and
dropping it saves 8% of the time for the uncompiled code
(the compiler seems to get rid of it).

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com


 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
 Behalf
 Of Berend Hasselman
 Sent: Tuesday, October 02, 2012 11:19 AM
 To: Rui Barradas
 Cc: r-help@r-project.org; Naser Jamil
 Subject: Re: [R] Integration in R
 
 



Re: [R] kmeans cluster analysis. How do I (1) determine probability of cluster membership (2) determine cluster membership for a new subject

2012-10-02 Thread Ranjan Maitra
On Tue, 2 Oct 2012 14:32:12 -0400 John Sorkin
jsor...@grecc.umaryland.edu wrote:

 Ranjan,
 Thank you for your help. What eludes me is how one computes the distance from 
 each cluster for each subject. For my first subject, datascaled[1,], I have 
 tried to use the following: 
 v1 <- sum(fit$centers[1,]*datascaled[1,])
 v2 <- sum(fit$centers[2,]*datascaled[1,])
 v3 <- sum(fit$centers[2,]*datascaled[1,])
 hoping the max(v1,v2,v3) would reproduce the group assignment, i.e. simply 
 assign the subject to the group that gives the largest value, but it does 
 not. How is the distance to the three clusters computed for each subject?
 Thanks,
 John 

Well, it should be:

v <- vector(length = 3)
for (i in 1:3) 
   v[i] <- sum((fit$centers[i, ] - datascaled[1, ])^2)

which.min(v)

should provide the cluster assignment.

Btw, there is a better, more efficient and automated way to do this,
i.e. avoid the loop using matrices and arrays and apply, but I have not
bothered with that here. 
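The loop-free variant alluded to is a one-liner with apply over the rows of the centers matrix (toy data substituted for the poster's unavailable datascaled):

```r
datascaled <- scale(iris[, 1:4])          # stand-in data
fit <- kmeans(datascaled, centers = 3, nstart = 25)

# squared distance from observation 1 to every center, no explicit loop
v <- apply(fit$centers, 1, function(ctr) sum((ctr - datascaled[1, ])^2))
as.integer(which.min(v))                  # cluster assignment for observation 1
```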

Ranjan

-- 
Important Notice: This mailbox is ignored: e-mails are set to be
deleted on receipt. For those needing to send personal or professional
e-mail, please use appropriate addresses.





Re: [R] kmeans cluster analysis. How do I (1) determine probability of cluster membership (2) determine cluster membership for a new subject

2012-10-02 Thread John Sorkin
Thank you!
I just wanted to know how one goes from the values returned by kmeans to a 
distance metric. You have shown me that it is simply the squared distance from 
the centers! Thanks again.
John

 
John David Sorkin M.D., Ph.D.
Chief, Biostatistics and Informatics
University of Maryland School of Medicine Division of Gerontology
Baltimore VA Medical Center
10 North Greene Street
GRECC (BT/18/GR)
Baltimore, MD 21201-1524
(Phone) 410-605-7119
(Fax) 410-605-7913 (Please call phone number above prior to faxing) Ranjan 
Maitra maitra.mbox.igno...@inbox.com 10/2/2012 2:52 PM 
On Tue, 2 Oct 2012 14:32:12 -0400 John Sorkin
jsor...@grecc.umaryland.edu wrote:

 Ranjan,
 Thank you for your help. What eludes me is how one computes the distance from 
 each cluster for each subject. For my first subject, datascaled[1,], I have 
 tried to use the following: 
 v1 <- sum(fit$centers[1,]*datascaled[1,])
 v2 <- sum(fit$centers[2,]*datascaled[1,])
 v3 <- sum(fit$centers[2,]*datascaled[1,])
 hoping the max(v1,v2,v3) would reproduce the group assignment, i.e. simply 
 assign the subject to the group that gives the largest value, but it does 
 not. How is the distance to the three clusters computed for each subject?
 Thanks,
 John 

Well, it should be:

v <- vector(length = 3)
for (i in 1:3) 
   v[i] <- sum((fit$centers[i, ] - datascaled[1, ])^2)

which.min(v)

should provide the cluster assignment.

Btw, there is a better, more efficient and automated way to do this,
i.e. avoid the loop using matrices and arrays and apply, but I have not
bothered with that here. 
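The loop-free version alluded to above might look like the following sketch (with toy stand-ins for the thread's `datascaled` matrix and kmeans result `fit`, so the example is self-contained):

```r
# Toy stand-ins for the thread's objects: 'datascaled' is the scaled data,
# 'fit' a kmeans() result.
set.seed(1)
datascaled <- scale(matrix(rnorm(400), nrow = 100, ncol = 4))
fit <- kmeans(datascaled, centers = 3, nstart = 5)

# Squared Euclidean distance from every subject to every centre, all at once:
d2 <- apply(fit$centers, 1, function(ctr) colSums((t(datascaled) - ctr)^2))
cluster <- max.col(-d2)   # row-wise argmin: nearest centre per subject
# 'cluster' reproduces fit$cluster (up to rare boundary ties)
```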

Ranjan

-- 
Important Notice: This mailbox is ignored: e-mails are set to be
deleted on receipt. For those needing to send personal or professional
e-mail, please use appropriate addresses.






Confidentiality Statement:
This email message, including any attachments, is for the sole use of the 
intended recipient(s) and may contain confidential and privileged information.  
Any unauthorized use, disclosure or distribution is prohibited.  If you are not 
the intended recipient, please contact the sender by reply email and destroy 
all copies of the original message. 
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] R process must die - can I save history?

2012-10-02 Thread Prof Brian Ripley

On 02/10/2012 18:29, Bert Gunter wrote:

?history

in a fresh R session, to see what might be possible. I'll bet the
answer is, "No, you're screwed," though. Nevertheless, maybe Linux
experts can save you.


Maybe not.  On a Unix-alike see ?Signals.  If you can find the pid of 
the R process and it is still running (and not e.g. suspended),


kill -USR1 pid

will save the workspace and history.
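The same rescue can be attempted from a *second* R session on the same machine (a sketch; Unix-alikes only, and the pid 12345 is a placeholder for the real process id, found with e.g. `pgrep -x R` in a shell):

```r
# Send SIGUSR1 to the stuck R session from another R session.
library(tools)
pid <- 12345            # placeholder: replace with the stuck session's pid
# pskill(pid, SIGUSR1)  # equivalent of `kill -USR1 pid`; commented out so
                        # this sketch cannot signal an unrelated process
```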



May the Force be with you.

-- Bert

On Tue, Oct 2, 2012 at 10:17 AM, Mike Miller mbmille...@gmail.com wrote:

I connected from my desktop Linux box to a Linux server using ssh in an
xterm, but that xterm was running in Xvnc.  I'm running R on the server in
that xterm (over ssh).  Something went wrong with Xvnc that has caused it to
hang, probably this bug:

https://bugs.launchpad.net/ubuntu/+source/vnc4/+bug/819473

So I can't get back to that ssh session or to R.  I had done a bunch of work
in R but the command history hasn't been written out.  If I kill R, I assume
the command history is gone.  I wish I could somehow cause R to dump the
command history.  Is there any way to tell the running R process to write
the history somewhere?

Thanks in advance.

Mike

--
Michael B. Miller, Ph.D.
Minnesota Center for Twin and Family Research
Department of Psychology
University of Minnesota

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.







--
Brian D. Ripley,  rip...@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Unexpected behavior with weights in binomial glm()

2012-10-02 Thread Josh Browning
Thank you all for your help and advice.  This wasn't quite the answer
I was looking for, but these concepts make more sense to me now and I
think I should be able to resolve the issues I've been having.

Thanks again!

On Sun, Sep 30, 2012 at 6:26 PM, David Winsemius dwinsem...@comcast.net wrote:

 On Sep 30, 2012, at 4:47 AM, Josh Browning wrote:

 Hi David,

 Yes, I agree that the results are very similar but I don't
 understand why they are not exactly equal given that the data sets are
 identical.

 And yes, this 1% numerical difference is hugely important to me.

 I believe you will find it is not important at all.

 I have another data set (much larger than this toy example) that works
 on the aggregated data (returning a coefficient of about 1) but
 returns the warning about perfect separation on the non-aggregated
 data (and a coefficient of about 1e15).  So, I'd at least like to be
 able to understand where this numerical difference is coming from and,
 preferably, a way to tweak my glm() runs (possibly adjusting the
 numerical precision somehow???) so that this doesn't happen.

 Here. This shows how you can ratchet up your desired precision:

 dAgg = aggregate( d$RESP, by=list(d$RESP, d$INDEP), FUN=length )
 colnames(dAgg) = c("RESP","INDEP","WT")

 glm( RESP ~ INDEP, family=binomial, data=dAgg, weights=WT, control = 
 glm.control(epsilon = 1e-10,
 maxit = 50) )
 glm( RESP ~ INDEP, family=binomial, data=dAgg, weights=WT, control = 
 glm.control(epsilon = 1e-11,
 maxit = 30) )
 glm( RESP ~ INDEP, family=binomial, data=dAgg, weights=WT, control = 
 glm.control(epsilon = 1e-12,
 maxit = 50) )
 glm( RESP ~ INDEP, family=binomial, data=dAgg, weights=WT, control = 
 glm.control(epsilon = 1e-13,
 maxit = 50) )

 ... and finally get a message that _should_ demonstrate to you why I thought 
 asking for greater numerical precision in an extreme case such as this was a 
 fool's errand. The true value is positive infinity, but numerical 
 precision is only 8 bytes, so we could only get to an estimated odds ratio of 
 exp(32.975) = 2.09344e+14, which corresponds to odds of about 200 trillion to 1. 
 Further increases in 'maxit' and decreases in 'epsilon' are vigorously 
 ignored, and rightfully so.

 After you do such analyses long enough you learn that coefficients above 5 
 are suspicious and those above 10 are usually a sign of an error in data 
 preparation.

 (I do not think this is an example of complete separation, since the ranges 
 of the predictors overlap for the two outcomes. But who knows, I have made 
 (many) errors before.)
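A sketch of the point being made here: refit both data layouts with one common, tighter IRLS stopping rule. The intercept (which the data identify) agrees to many digits, while the slope heads toward +Inf in *both* fits under quasi-separation, so its printed value reflects the stopping rule, not the data:

```r
# Same toy data as in the thread below, fit in both layouts with one
# shared, tight convergence control.
d <- data.frame(RESP  = c(rep(1, 5), rep(0, 5)),
                INDEP = c(1, 1, 1, 1, 0, 0, 0, 0, 0, 0))
dAgg <- aggregate(d$RESP, by = list(d$RESP, d$INDEP), FUN = length)
colnames(dAgg) <- c("RESP", "INDEP", "WT")
ctl <- glm.control(epsilon = 1e-12, maxit = 100)
f1 <- glm(RESP ~ INDEP, family = binomial, data = d, control = ctl)
f2 <- glm(RESP ~ INDEP, family = binomial, data = dAgg,
          weights = WT, control = ctl)
rbind(full = coef(f1), aggregated = coef(f2))
```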

 --
 David,



 Josh

 On Sat, Sep 29, 2012 at 7:50 PM, David Winsemius dwinsem...@comcast.net 
 wrote:

 On Sep 29, 2012, at 7:10 AM, Josh Browning wrote:

 Hi useRs,

 I'm experiencing something quite weird with glm() and weights, and
 maybe someone can explain what I'm doing wrong.  I have a dataset
 where each row represents a single case, and I run
 glm(...,family=binomial) and get my coefficients.  However, some of
 my cases have the exact same values for predictor variables, so I
 should be able to aggregate up my data frame and run glm(...,
 family=binomial,weights=wts) and get the same coefficients (maybe
 this is my incorrect assumption, but I can't see why it would be).
 Anyways, here's a minimum working example below:

 d = data.frame( RESP=c(rep(1,5),rep(0,5)), INDEP=c(1,1,1,1,0,0,0,0,0,0) )
 glm( RESP ~ INDEP, family=binomial, data=d )

 Call:  glm(formula = RESP ~ INDEP, family = binomial, data = d)

 Coefficients:
 (Intercept)INDEP
-1.609   21.176

 Degrees of Freedom: 9 Total (i.e. Null);  8 Residual
 Null Deviance:  13.86
 Residual Deviance: 5.407    AIC: 9.407
 dAgg = aggregate( d$RESP, by=list(d$RESP, d$INDEP), FUN=length )
 colnames(dAgg) = c("RESP","INDEP","WT")
 glm( RESP ~ INDEP, family=binomial, data=dAgg, weights=WT )

 Call:  glm(formula = RESP ~ INDEP, family = binomial, data = dAgg,
   weights = WT)

 Coefficients:
 (Intercept)INDEP
-1.609   20.975

 Degrees of Freedom: 2 Total (i.e. Null);  1 Residual
 Null Deviance:  13.86
 Residual Deviance: 5.407    AIC: 9.407

 Those two results look very similar and it is with a data situation that 
 seems somewhat extreme. The concern is for the 1% numerical  difference in 
 the regression coefficient? Am I reading you correctly?

 --
 David Winsemius, MD
 Alameda, CA, USA


 David Winsemius, MD
 Alameda, CA, USA


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Possible error in BCa method for confidence intervals in package 'boot'

2012-10-02 Thread Robert A. LaBudde

I'm using R 2.15.1 on a 64-bit machine with Windows 7 Home Premium.

Sample problem (the screwy subscripted syntax is a relic of editing down a 
more complex script):


 N <- 25
 s <- rlnorm(N, 0, 1)
 require(boot)
Loading required package: boot
 v <- NULL # hold sample variance estimates
 i <- 1
 v[i] <- var(s) # get sample variance
 nReal <- 10
 varf <- function (x,i) { var(x[i]) }
 fabc <- function (x, w) { # weighted average (biased) variance
+   sum(x^2 * w) / sum(w) - (sum(x * w) / sum(w))^2
+ }
 p <- c(.25, .75, .2, .8, .15, .85, .1, .9, .05, .95, .025, .975, 
.005, .995)

 cl <- c(0.5, 0.6, 0.7, 0.8, 0.9, 0.95, 0.99)
 b <- boot(s, varf, R = nReal) # bootstrap
 bv <- NULL # hold bootstrap mean variance estimates
 bias <- NULL # hold bias estimates
 bv[i] <- mean(b$t) # bootstrap mean variance
 bias[i] <- bv[i] - v[i] # bias estimate
 bCI90 <- boot.ci(b, conf = 0.90)
Error in bca.ci(boot.out, conf, index[1L], L = L, t = t.o, t0 = t0.o,  :
  estimated adjustment 'a' is NA
In addition: Warning messages:
1: In norm.inter(t, (1 + c(conf, -conf))/2) :
  extreme order statistics used as endpoints
2: In boot.ci(b, conf = 0.9) :
  bootstrap variances needed for studentized intervals
3: In norm.inter(t, alpha) : extreme order statistics used as endpoints

 nReal <- 25
 b <- boot(s, varf, R = nReal) # bootstrap
 bv[i] <- mean(b$t) # bootstrap mean variance
 bias[i] <- bv[i] - v[i] # bias estimate
 bCI90 <- boot.ci(b, conf = 0.90)
Warning messages:
1: In boot.ci(b, conf = 0.9) :
  bootstrap variances needed for studentized intervals
2: In norm.inter(t, adj.alpha) :
  extreme order statistics used as endpoints

The problem is that doing 10 resamples generates an NA in the 
estimation of the 'acceleration' in the function abc.ci(), but doing 
25 resamples does not. This implies a connection between the number 
of resamples and the 'acceleration' which should not exist. 
('Acceleration' should be obtained from the original sample via 
jackknife as 1/6 the coefficient of skewness.)


Looking at the script for abc.ci(), there is an anomalous reference 
to 'n' in the invocation line, yet 'n' is not an argument, so it must be 
defined more globally before the call. Yet 'n' is defined within the 
script as the length of 'data', which is referred to as the 
'bootstrap' vector in the comments but should be the original sample 
data. This confusion, plus the use of an argument 'eps' with a default of 
0.001/n in the calculations, makes me suspect the programming in the script.


The script apparently works correctly if the number of resamples 
equals or exceeds the number of original data, but not otherwise.
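One workaround, sketched below: compute the jackknife empirical influence values yourself with `empinf()` and pass them to `boot.ci()` via its `L` argument, so the BCa acceleration 'a' is derived from the original sample and cannot depend on the number of bootstrap replicates:

```r
# BCa interval with user-supplied empirical influence values.
library(boot)
set.seed(42)
s <- rlnorm(25, 0, 1)
varf <- function(x, i) var(x[i])
b <- boot(s, varf, R = 25)
# Jackknife empirical influence values from the original data:
L <- empinf(data = s, statistic = varf, stype = "i", type = "jack")
ci <- boot.ci(b, conf = 0.90, type = "bca", L = L)
ci
```

(With R this small the percentile endpoints are still crude, and warnings about extreme order statistics remain; supplying `L` only removes the resample-count dependence from 'a'.)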




Robert A. LaBudde, PhD, PAS, Dpl. ACAFS  e-mail: r...@lcfltd.com
Least Cost Formulations, Ltd.URL: http://lcfltd.com/
824 Timberlake Drive Tel: 757-467-0954
Virginia Beach, VA 23464-3239Fax: 757-467-2947

Vere scire est per causas scire

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Zero Inflated Models - Plotting Results

2012-10-02 Thread lieslpe
Dear SamiC,
I am also attempting to plot my zero inflated model on my data. Did you find
a solution? Does anyone else on this list have a solution?

Thanks,
Liesl 

Message from SamiC Jun 30, 2011:
I am fitting a zero inflated negative binomial model to my data.  I have
pretty much got my selected model and now I want to plot the model against
the original data.

Example.

f8 <- formula(Birds ~ Tide + Fish + Zooplankton + Chlorophylla | StratificationIndex)
Nb8 <- zeroinfl(f8, dist = "negbin", link = "logit", data = ocean)

Tide is a factor, so I assume I have to plot a different graph for each
level of the factor.

What I essentially want to do is plot a graph of each variable against
birds with the fitted line of the model.

I have been using the predict function, but I get the same trend for every
graph and variable.  I was reading that the predict function gives a mean
across all the values (or something to this effect).

Does anyone know how to code for this in R?  From the above model I want to
plot birds~fish (the original data) and then fit the line of the
relationship from the model.
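One way to do this, as a sketch under assumptions: `ocean` and `Nb8` play the roles of the poster's data and fitted model (simulated here so the example runs; the variable names come from the post, the values are made up, and "high" is a hypothetical Tide level). Build a `newdata` grid that varies one predictor while holding the others fixed, then overlay the predicted mean count on the raw data:

```r
library(pscl)
set.seed(1)
# Hypothetical stand-in for the poster's 'ocean' data:
ocean <- data.frame(
  Fish                = runif(200, 0, 10),
  Tide                = factor(sample(c("low", "high"), 200, replace = TRUE)),
  Zooplankton         = runif(200),
  Chlorophylla        = runif(200),
  StratificationIndex = runif(200))
ocean$Birds <- rnbinom(200, mu = exp(0.5 + 0.2 * ocean$Fish), size = 2) *
               rbinom(200, 1, 0.7)                      # add extra zeros
Nb8 <- zeroinfl(Birds ~ Tide + Fish + Zooplankton + Chlorophylla |
                  StratificationIndex, dist = "negbin", data = ocean)

# Vary Fish, hold the other covariates fixed, overlay the fitted mean:
nd <- data.frame(
  Fish                = seq(min(ocean$Fish), max(ocean$Fish), length.out = 100),
  Tide                = factor("high", levels = levels(ocean$Tide)),
  Zooplankton         = mean(ocean$Zooplankton),
  Chlorophylla        = mean(ocean$Chlorophylla),
  StratificationIndex = mean(ocean$StratificationIndex))
nd$fit <- predict(Nb8, newdata = nd, type = "response")
plot(Birds ~ Fish, data = ocean)
lines(fit ~ Fish, data = nd, lwd = 2)
```

Repeating the `nd` block for each level of Tide gives one fitted line per level on the same plot.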




--
View this message in context: 
http://r.789695.n4.nabble.com/Zero-Inflated-Models-Plotting-Results-tp3636373p4644800.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Proposal: Package update log

2012-10-02 Thread Starkweather, Jonathan
I'm relatively new to R and would first like to sincerely thank all those who 
contribute to its development. Thank you. 

I would humbly like to propose a rule which creates a standard (i.e., strongly 
encouraged, mandatory, etc.) for authors to include a `change log' documenting 
specific changes for each update made to their package(s). The more detailed 
the description, the better; and it would be exceptionally useful if the 
document were available from the R-console (similar to the help function). In 
other words, I am suggesting that the `change log' file be included in the 
package(s) and preferably accessible from the R-console. 

I am aware that many packages available on CRAN have a `change log' or `news' 
page. However, not all packages have something like that, and many which do are 
not terribly detailed in conveying what has been updated or changed. 

I am also aware that package authors are not a particularly lazy group, sitting 
around with nothing to do. My proposal would likely add a non-significant 
amount of work to the already very generous (and appreciated) work performed by 
package authors, maintainers, etc. I do, however, believe it would be greatly 
appreciated and beneficial to have more detailed update information available 
from the R-console as some of us (users) update packages daily and are often 
left wondering what exactly has been updated. 

I did not post this to the R-devel list because I consider this proposal more 
of a documentation issue than a development issue. Also, I would like to see 
some discussion of this proposal from a varied pool of stakeholders (e.g., 
users and not just developers, package authors, package maintainers, etc.).  


Respectfully,

Jon Starkweather, PhD
University of North Texas
Research and Statistical Support
http://www.unt.edu/rss/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Integration in R

2012-10-02 Thread Dereje Bacha
Hi

I am facing a problem of restricting an intercept in a system of equations.  
Y1=f(X1,X2,X3)
Y2=f(X1,X2,X4)
 
I want to restrict the intercept of equation 2 to equal the coefficient of X2 
of equation 1. 

Please help
Dereje  



 From: Naser Jamil jamilnase...@gmail.com
To: r-help@r-project.org 
Sent: Tuesday, October 2, 2012 10:23 AM
Subject: [R] Integration in R

Dear R-users,
I am facing a problem with integrating in R a likelihood function of
four parameters. It gives me the result in the end, but it takes
more than half an hour to run. I'm wondering whether there is a more
efficient way to deal with this. The following is my code. I am ready to
provide any further description of my function if you need it to move forward.



library(cubature)
dose <- c(2,3,5)
y0 <- c(2,1,0)
y1 <- c(1,1,1)
y2 <- c(0,1,2)

lf <- function (x) {
    v <- 1
    for (i in 1:length(dose)) {
        psi0 <- 1/((1+exp(x[1]+x[2]*dose[i]))*(1+exp(x[3]+x[4]*dose[i])))
        psi1 <- exp(x[1]+x[2]*dose[i])/((1+exp(x[1]+x[2]*dose[i]))*(1+exp(x[3]+x[4]*dose[i])))
        v <- v*(psi0^y0[i])*(psi1^y1[i])*((1-psi0-psi1)^y2[i])
    }
    return(v)
}

adaptIntegrate(lf, lowerLimit = c(-20, 0,-20,0), upperLimit = c(0,10,0,10))
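A sketch of one easy speed-up: the same integrand with the inner loop removed. Summing on the log scale and exponentiating once avoids repeated products and is usually a little faster and more numerically stable inside adaptIntegrate() (dose, y0, y1, y2 as in the post; repeated here so the snippet stands alone):

```r
dose <- c(2, 3, 5); y0 <- c(2, 1, 0); y1 <- c(1, 1, 1); y2 <- c(0, 1, 2)

# Vectorized over the three doses; algebraically identical to lf() above.
lf2 <- function(x) {
  eta1 <- x[1] + x[2] * dose
  eta2 <- x[3] + x[4] * dose
  psi0 <- 1 / ((1 + exp(eta1)) * (1 + exp(eta2)))
  psi1 <- exp(eta1) * psi0          # exp(eta1)/((1+exp(eta1))(1+exp(eta2)))
  exp(sum(y0 * log(psi0) + y1 * log(psi1) + y2 * log(1 - psi0 - psi1)))
}
lf2(c(-1, 0.5, -1, 0.5))  # same value the loop version returns at this point
```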

-
Thanks for your attention.


Kind regards,

Jamil.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Parametric effects in GAM

2012-10-02 Thread Petar Milin
Hello!
Can anyone give a tip how to plot parametric effects in an Generalized
Additive Model, from mgcv package?

Thanks,
PM
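One approach, as a sketch: plot.gam() in mgcv draws only the smooth terms by default, and setting `all.terms = TRUE` adds the parametric effects (it hands them to termplot()). The data here are simulated just to make the example run:

```r
library(mgcv)
set.seed(1)
dat <- gamSim(1, n = 200, verbose = FALSE)      # example data shipped with mgcv
dat$g <- factor(sample(letters[1:3], 200, replace = TRUE))  # a parametric term
m <- gam(y ~ g + s(x1) + s(x2), data = dat)
plot(m, pages = 1, all.terms = TRUE)  # smooths plus parametric effects
termplot(m, terms = "g", se = TRUE)   # or only the parametric part
```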

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Integration in R

2012-10-02 Thread Berend Hasselman

On 02-10-2012, at 20:50, Dereje Bacha d_ba...@yahoo.com wrote:

 Hi
 
 I am facing a problem of restricting an intercept in a system of equations.  
 Y1=f(X1,X2,X3)
 Y2=f(X1,X2,X4)
  
 I want to restrict the intercept of equation 2 to equal the coefficient of X2 of 
 equation 1. 
 

Please do not hijack a thread/conversation.
Do not reply to a thread with a completely different subject.
Start a new thread for a new subject.

It is completely unclear what you want.

Berend

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Proposal: Package update log

2012-10-02 Thread Greg Snow
There is already support in the packaging system for a NEWS file which
can be accessed within R using the 'news' function.  What would the
changelog that you are proposing contribute or contain beyond what the
NEWS file already does?

Creating and updating NEWS is not mandatory, but is encouraged.
Making the update of NEWS mandatory would probably not change anything
for the authors already using it; for those that don't, it may make
the author less likely to share or update their package rather
than bother with logging the updates (though whether that is a bad
thing or a good thing is a whole different discussion).  How would you
mandate the update log?  Just requiring a change could result in a
lazy author adding a single line of gibberish or a statement like
"updated on this date".  To enforce something more meaningful would
require some form of moderation and place additional burden where
there is already a lot of volunteer work happening.

If a package is on R-forge (or other similar locations) then you can
get a detailed view of the changes between versions.  Packages on CRAN
have previous versions available and there are tools that will let you
compare the differences in detail.

On Tue, Oct 2, 2012 at 11:01 AM, Starkweather, Jonathan
jonathan.starkweat...@unt.edu wrote:
 I'm relatively new to R and would first like to sincerely thank all those who 
 contribute to its development. Thank you.

 I would humbly like to propose a rule which creates a standard (i.e., 
 strongly encouraged, mandatory, etc.) for authors to include a `change log' 
 documenting specific changes for each update made to their package(s). The 
 more detailed the description, the better; and it would be exceptionally 
 useful if the document were available from the R-console (similar to the help 
 function). In other words, I am suggesting that the `change log' file be 
 included in the package(s) and preferably accessible from the R-console.

 I am aware that many packages available on CRAN have a `change log' or `news' 
 page. However, not all packages have something like that, and many which do 
 are not terribly detailed in conveying what has been updated or changed.

 I am also aware that package authors are not a particularly lazy group, 
 sitting around with nothing to do. My proposal would likely add a 
 non-significant amount of work to the already very generous (and appreciated) 
 work performed by package authors, maintainers, etc. I do, however, believe 
 it would be greatly appreciated and beneficial to have more detailed update 
 information available from the R-console as some of us (users) update 
 packages daily and are often left wondering what exactly has been updated.

 I did not post this to the R-devel list because I consider this proposal more 
 of a documentation issue than a development issue. Also, I would like to see 
 some discussion of this proposal from a varied pool of stakeholders (e.g., 
 users and not just developers, package authors, package maintainers, etc.).


 Respectfully,

 Jon Starkweather, PhD
 University of North Texas
 Research and Statistical Support
 http://www.unt.edu/rss/

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.



-- 
Gregory (Greg) L. Snow Ph.D.
538...@gmail.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] ffsave problems

2012-10-02 Thread Lucas Chaparro
Dear R friends.
After having some troubles learning how to create a ffdf object, now I find
myself having problems saving it.

this is the data i´d like to save:
str(DATA)
List of 3
 $ virtual: 'data.frame': 6 obs. of  7 variables:
 .. $ VirtualVmode : chr  "double" "short" "integer" "integer" ...
 .. $ AsIs : logi  FALSE FALSE FALSE FALSE FALSE FALSE
 .. $ VirtualIsMatrix  : logi  FALSE FALSE FALSE FALSE FALSE FALSE
 .. $ PhysicalIsMatrix : logi  FALSE FALSE FALSE FALSE FALSE FALSE
 .. $ PhysicalElementNo: int  1 2 3 4 5 6
 .. $ PhysicalFirstCol : int  1 1 1 1 1 1
 .. $ PhysicalLastCol  : int  1 1 1 1 1 1
 .. - attr(*, "Dim")= int  65640757 6
 .. - attr(*, "Dimorder")= int  1 2
 $ physical: List of 6
 .. $ fecha: list()
 ..  ..- attr(*, "physical")=Class 'ff_pointer' <externalptr>
 ..  .. ..- attr(*, "vmode")= chr "double"
 ..  .. ..- attr(*, "maxlength")= int 65640757
 ..  .. ..- attr(*, "pattern")= chr "clone"
 ..  .. ..- attr(*, "filename")= chr
"C:/Users/ADMIN-~1/AppData/Local/Temp/RtmpqmcfGU/clone10d86b45b04.ff"
 ..  .. ..- attr(*, "pagesize")= int 65536
 ..  .. ..- attr(*, "finalizer")= chr "delete"
 ..  .. ..- attr(*, "finonexit")= logi TRUE
 ..  .. ..- attr(*, "readonly")= logi FALSE
 ..  .. ..- attr(*, "caching")= chr "mmnoflush"
 ..  ..- attr(*, "virtual")= list()
 ..  .. ..- attr(*, "Length")= int 65640757
 ..  .. ..- attr(*, "Symmetric")= logi FALSE
 ..  .. ..- attr(*, "ramclass")= chr "Date"
 ..  .. ..- attr(*, "class")= chr "virtual"
 .. .. - attr(*, "class") =  chr [1:2] "ff_vector" "ff"
 .. $ juliano  : list()
 ..  ..- attr(*, "physical")=Class 'ff_pointer' <externalptr>
 ..  .. ..- attr(*, "vmode")= chr "short"
 ..  .. ..- attr(*, "maxlength")= int 65640757
 ..  .. ..- attr(*, "pattern")= chr "ff"
 ..  .. ..- attr(*, "filename")= chr
"C:/Users/ADMIN-~1/AppData/Local/Temp/RtmpqmcfGU/ff10d829d36fc8.ff"
 ..  .. ..- attr(*, "pagesize")= int 65536
 ..  .. ..- attr(*, "finalizer")= chr "delete"
 ..  .. ..- attr(*, "finonexit")= logi TRUE
 ..  .. ..- attr(*, "readonly")= logi FALSE
 ..  .. ..- attr(*, "caching")= chr "mmnoflush"
 ..  ..- attr(*, "virtual")= list()
 ..  .. ..- attr(*, "Length")= int 65640757
 ..  .. ..- attr(*, "Symmetric")= logi FALSE
 ..  .. ..- attr(*, "class")= chr "virtual"
 .. .. - attr(*, "class") =  chr [1:2] "ff_vector" "ff"
 .. $ altura   : list()
 ..  ..- attr(*, "physical")=Class 'ff_pointer' <externalptr>
 ..  .. ..- attr(*, "vmode")= chr "integer"
 ..  .. ..- attr(*, "maxlength")= int 65640757
 ..  .. ..- attr(*, "pattern")= chr "ff"
 ..  .. ..- attr(*, "filename")= chr
"C:/Users/ADMIN-~1/AppData/Local/Temp/RtmpqmcfGU/ff10d8dcd2eca.ff"
 ..  .. ..- attr(*, "pagesize")= int 65536
 ..  .. ..- attr(*, "finalizer")= chr "delete"
 ..  .. ..- attr(*, "finonexit")= logi TRUE
 ..  .. ..- attr(*, "readonly")= logi FALSE
 ..  .. ..- attr(*, "caching")= chr "mmnoflush"
 ..  ..- attr(*, "virtual")= list()
 ..  .. ..- attr(*, "Length")= int 65640757
 ..  .. ..- attr(*, "Symmetric")= logi FALSE
 ..  .. ..- attr(*, "class")= chr "virtual"
 .. .. - attr(*, "class") =  chr [1:2] "ff_vector" "ff"
 .. $ UTM.E: list()
 ..  ..- attr(*, "physical")=Class 'ff_pointer' <externalptr>
 ..  .. ..- attr(*, "vmode")= chr "integer"
 ..  .. ..- attr(*, "maxlength")= int 65640757
 ..  .. ..- attr(*, "pattern")= chr "ff"
 ..  .. ..- attr(*, "filename")= chr
"C:/Users/ADMIN-~1/AppData/Local/Temp/RtmpqmcfGU/ff10d833f74cfa.ff"
 ..  .. ..- attr(*, "pagesize")= int 65536
 ..  .. ..- attr(*, "finalizer")= chr "delete"
 ..  .. ..- attr(*, "finonexit")= logi TRUE
 ..  .. ..- attr(*, "readonly")= logi FALSE
 ..  .. ..- attr(*, "caching")= chr "mmnoflush"
 ..  ..- attr(*, "virtual")= list()
 ..  .. ..- attr(*, "Length")= int 65640757
 ..  .. ..- attr(*, "Symmetric")= logi FALSE
 ..  .. ..- attr(*, "class")= chr "virtual"
 .. .. - attr(*, "class") =  chr [1:2] "ff_vector" "ff"
 .. $ UTM.N: list()
 ..  ..- attr(*, "physical")=Class 'ff_pointer' <externalptr>
 ..  .. ..- attr(*, "vmode")= chr "integer"
 ..  .. ..- attr(*, "maxlength")= int 65640757
 ..  .. ..- attr(*, "pattern")= chr "ff"
 ..  .. ..- attr(*, "filename")= chr
"C:/Users/ADMIN-~1/AppData/Local/Temp/RtmpqmcfGU/ff10d85124f1a.ff"
 ..  .. ..- attr(*, "pagesize")= int 65536
 ..  .. ..- attr(*, "finalizer")= chr "delete"
 ..  .. ..- attr(*, "finonexit")= logi TRUE
 ..  .. ..- attr(*, "readonly")= logi FALSE
 ..  .. ..- attr(*, "caching")= chr "mmnoflush"
 ..  ..- attr(*, "virtual")= list()
 ..  .. ..- attr(*, "Length")= int 65640757
 ..  .. ..- attr(*, "Symmetric")= logi FALSE
 ..  .. ..- attr(*, "class")= chr "virtual"
 .. .. - attr(*, "class") =  chr [1:2] "ff_vector" "ff"
 .. $ fire_ocur: list()
 ..  ..- attr(*, "physical")=Class 'ff_pointer' <externalptr>
 ..  .. ..- attr(*, "vmode")= chr "byte"
 ..  .. ..- attr(*, "maxlength")= int 65640757
 ..  .. ..- attr(*, "pattern")= chr "ff"
 ..  .. ..- attr(*, "filename")= chr
"C:/Users/ADMIN-~1/AppData/Local/Temp/RtmpqmcfGU/ff10d824ef517f.ff"
 ..  .. ..- attr(*, "pagesize")= int 65536
 ..  .. ..- attr(*, "finalizer")= chr "delete"
 ..  .. ..- attr(*, "finonexit")= logi TRUE
 ..  .. ..- attr(*, "readonly")= logi FALSE
 ..  .. ..- attr(*, "caching")= chr "mmnoflush"
 ..  ..- attr(*, "virtual")= list()
 ..  .. ..- attr(*, "Length")= int 65640757
 ..  .. ..- attr(*, 

Re: [R] Proposal: Package update log

2012-10-02 Thread Brian Diggs

On 10/2/2012 10:01 AM, Starkweather, Jonathan wrote:

I'm relatively new to R and would first like to sincerely thank all
those who contribute to its development. Thank you.

I would humbly like to propose a rule which creates a standard (i.e.,
strongly encouraged, mandatory, etc.) for authors to include a
`change log' documenting specific changes for each update made to
their package(s). The more detailed the description, the better; and
it would be exceptionally useful if the document were available from
the R-console (similar to the help function). In other words, I am
suggesting that the `change log' file be included in the package(s)
and preferably accessible from the R-console.


Let me just address the "accessible from the R-console" part of this. 
The infrastructure to support this already exists; check out the news() 
function. It does require a structured NEWS file, but will make some 
attempts to figure out the structure (see the documentation).
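A sketch of what that user-facing side already looks like ('boot' is just an example package here; news() returns NULL with a warning when a package ships no parseable NEWS):

```r
db <- news(package = "boot")   # the package's full change log, if any
head(db)
# Entries for the installed version only:
if (!is.null(db))
  news(Version == as.character(packageVersion("boot")), package = "boot")
```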



I am aware that many packages available on CRAN have a `change log'
or `news' page. However; not all packages have something like that
and many which do, are not terribly detailed in conveying what has
been updated or changed.


More detail is generally nice. However, I'm not sure what CRAN could 
enforce. A simple check that could be automated is that news() for the 
package and the version declared in the DESCRIPTION file returns a 
non-empty result. But I don't know if that would be a net positive 
(adding another burden for package authors vs giving more detail). Those 
authors that care probably already do so; those that don't will just 
have to work around the check without necessarily adding anything useful.



I am also aware that package authors are not a particularly lazy
group, sitting around with nothing to do. My proposal would likely
add a non-significant amount of work to the already very generous
(and appreciated) work performed by package authors, maintainers,
etc. I do, however, believe it would be greatly appreciated and
beneficial to have more detailed update information available from
the R-console as some of us (users) update packages daily and are
often left wondering what exactly has been updated.

I did not post this to the R-devel list because I consider this
proposal more of a documentation issue than a development issue.
Also, I would like to see some discussion of this proposal from a
varied pool of stakeholders (e.g., users and not just developers,
package authors, package maintainers, etc.).


Respectfully,

Jon Starkweather, PhD University of North Texas Research and
Statistical Support http://www.unt.edu/rss/




--
Brian S. Diggs, PhD
Senior Research Associate, Department of Surgery
Oregon Health  Science University

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Having two different versions of a package in the same R installation

2012-10-02 Thread Søren Højsgaard
Dear list,

I am making some comparisons of two versions of the lme4 package: The CRAN 
version and the R-Forge version. For the moment I have two different R 
installations, each with a different version of lme4. However it would be 
convenient if I could have the same version within the same R installation such 
that I can run a code chunk on one lme4 version and immediately after run the 
same code chunk on another lme4 version. Does anyone know if this is possible?

Best regards
Søren

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Having two different versions of a package in the same R installation

2012-10-02 Thread Greg Snow
So if you have both loaded in the same instance of R, how will R know
which version of lmer or other functions you want to run?

It seems cleanest to me to have the 2 different instances of R running
like you do now.  The other option would be to change all the names
(exported ones anyways) in one version and change that package name as
well.  Then you would have 2 different package names with different
exported function names.
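A third option, sketched below with hypothetical paths: install each build into its own library directory and switch within one session via lib.loc, unloading in between.

```r
cran_lib  <- "~/Rlibs/lme4-cran"     # hypothetical library paths
forge_lib <- "~/Rlibs/lme4-forge"
# One-time setup:
# dir.create(cran_lib, recursive = TRUE); dir.create(forge_lib, recursive = TRUE)
# install.packages("lme4", lib = cran_lib)
# install.packages("lme4", lib = forge_lib,
#                  repos = "http://R-Forge.R-project.org")

# Per comparison run:
# library(lme4, lib.loc = cran_lib)
# ... run the code chunk ...
# detach("package:lme4", unload = TRUE)
# library(lme4, lib.loc = forge_lib)
# ... run the same chunk again ...
```

Unloading a package with compiled code is not always clean (shared objects can linger), which is why two separate R processes remain the safest comparison.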

On Tue, Oct 2, 2012 at 2:18 PM, Søren Højsgaard sor...@math.aau.dk wrote:
 Dear list,

 I am making some comparisons of two versions of the lme4 package: The CRAN 
 version and the R-Forge version. For the moment I have two different R 
 installations, each with a different version of lme4. However it would be 
 convenient if I could have the same version within the same R installation 
 such that I can run a code chunk on one lme4 version and immediately after 
 run the same code chunk on another lme4 version. Does anyone know if this is 
 possible?

 Best regards
 Søren




-- 
Gregory (Greg) L. Snow Ph.D.
538...@gmail.com



Re: [R] Having two different versions of a package in the same R installation

2012-10-02 Thread Søren Højsgaard
I don't know if it would work, but a kludgy attempt would be to install lme4 
from CRAN and rename the lme4 directory in the library to lme4cran; then install 
lme4 from R-Forge and rename that lme4 directory to lme4forge. Then create a 
flexible script that would copy one of the directories to a directory called 
lme4 on the fly. I don't know if it would work, and I wonder whether there 
might be a more elegant way.

Regards
Søren

-Original Message-
From: Greg Snow [mailto:538...@gmail.com] 
Sent: 2. oktober 2012 22:27
To: Søren Højsgaard
Cc: r-help@r-project.org
Subject: Re: [R] Having two different versions of a package in the same R 
installation

So if you have both loaded in the same instance of R, how will R know which 
version of lmer or other functions that you want to run?

It seems cleanest to me to have the 2 different instances of R running like you 
do now.  The other option would be to change all the names (exported ones 
anyways) in one version and change that package name as well.  Then you would 
have 2 different package names with different exported function names.

On Tue, Oct 2, 2012 at 2:18 PM, Søren Højsgaard sor...@math.aau.dk wrote:
 Dear list,

 I am making some comparisons of two versions of the lme4 package: The CRAN 
 version and the R-Forge version. For the moment I have two different R 
 installations, each with a different version of lme4. However it would be 
 convenient if I could have the same version within the same R installation 
 such that I can run a code chunk on one lme4 version and immediately after 
 run the same code chunk on another lme4 version. Does anyone know if this is 
 possible?

 Best regards
 Søren




--
Gregory (Greg) L. Snow Ph.D.
538...@gmail.com



Re: [R] Having two different versions of a package in the same R installation

2012-10-02 Thread Henrik Bengtsson
Install the two versions in two different *libraries* and update
.libPaths() to prioritize one over the other. /Henrik
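
A sketch of this approach (the library paths and the R-Forge repository URL are placeholder assumptions, not from the thread):

```r
# Install each lme4 version into its own private library.
dir.create("~/Rlib-cran",  showWarnings = FALSE)
dir.create("~/Rlib-forge", showWarnings = FALSE)
install.packages("lme4", lib = "~/Rlib-cran")                # CRAN build
install.packages("lme4", lib = "~/Rlib-forge",
                 repos = "http://R-Forge.R-project.org")     # R-Forge build

# Prioritize one library, load lme4, and run the code chunk ...
.libPaths(c("~/Rlib-cran", .libPaths()))
library(lme4)
packageVersion("lme4")

# ... then restart R, prioritize the other library, and rerun the chunk.
# (As noted later in the thread, both versions cannot be *loaded* at once.)
```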

On Tue, Oct 2, 2012 at 1:18 PM, Søren Højsgaard sor...@math.aau.dk wrote:
 Dear list,

 I am making some comparisons of two versions of the lme4 package: The CRAN 
 version and the R-Forge version. For the moment I have two different R 
 installations, each with a different version of lme4. However it would be 
 convenient if I could have the same version within the same R installation 
 such that I can run a code chunk on one lme4 version and immediately after 
 run the same code chunk on another lme4 version. Does anyone know if this is 
 possible?

 Best regards
 Søren




Re: [R] Install of R-2.15.1 for Windows (64 bit) on application server

2012-10-02 Thread dthomas
Thanks, will go ahead






Re: [R] Having two different versions of a package in the same R installation

2012-10-02 Thread Prof Brian Ripley

On 02/10/2012 21:38, Søren Højsgaard wrote:

I don't know if it would work but a kludgy attempt would be to install lme4 
from CRAN, rename the lme4 directory in library to lme4cran; then install lme4 
from R-forge and rename the lme4 directory to lme4forge. Then create a script 
flexible script that would copy one of the directories to a directory called 
lme4 on the fly. I don't know if it would work, but I just wonder if there 
would possibly a more elegant way.


Do you want them installed or loaded?

For installation, simply use different library directories, and adjust 
.libPaths.


You won't succeed in loading them together: the insuperable problem is 
not the R functions (they could be in different environments) but the 
DLLs (you can only load one DLL of a given name).


We've been here with versioned installs, which were abandoned a long 
while ago (not least as we never had versioned namespaces).



Regards
Søren

-Original Message-
From: Greg Snow [mailto:538...@gmail.com]
Sent: 2. oktober 2012 22:27
To: Søren Højsgaard
Cc: r-help@r-project.org
Subject: Re: [R] Having two different versions of a package in the same R 
installation

So if you have both loaded in the same instance of R, how will R know which 
version of lmer or other functions that you want to run?

It seems cleanest to me to have the 2 different instances of R running like you 
do now.  The other option would be to change all the names (exported ones 
anyways) in one version and change that package name as well.  Then you would 
have 2 different package names with different exported function names.

On Tue, Oct 2, 2012 at 2:18 PM, Søren Højsgaard sor...@math.aau.dk wrote:

Dear list,

I am making some comparisons of two versions of the lme4 package: The CRAN 
version and the R-Forge version. For the moment I have two different R 
installations, each with a different version of lme4. However it would be 
convenient if I could have the same version within the same R installation such 
that I can run a code chunk on one lme4 version and immediately after run the 
same code chunk on another lme4 version. Does anyone know if this is possible?

Best regards
Søren





--
Gregory (Greg) L. Snow Ph.D.
538...@gmail.com





--
Brian D. Ripley,  rip...@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK            Fax:  +44 1865 272595



Re: [R] Proposal: Package update log

2012-10-02 Thread Yihui Xie
One other similar location is GitHub, where you can "watch" a
package; this is how I keep track of changes in the packages that
I'm interested in.

Just for the interest of other R package developers, the NEWS file can
be written in Markdown and I have a Makefile
(https://github.com/yihui/knitr/blob/master/Makefile) to convert
Markdown to R's standard NEWS format, e.g.

https://github.com/yihui/knitr/blob/master/NEWS.md
- (via make news)
https://github.com/yihui/knitr/blob/master/NEWS

If you do not belong to the plain-text-is-everything party, you may
consider this format.

Regards,
Yihui
--
Yihui Xie xieyi...@gmail.com
Phone: 515-294-2465 Web: http://yihui.name
Department of Statistics, Iowa State University
2215 Snedecor Hall, Ames, IA


On Tue, Oct 2, 2012 at 2:47 PM, Greg Snow 538...@gmail.com wrote:
 There is already support in the packaging system for a NEWS file which
 can be accessed within R using the 'news' function.  What would the
 changelog that you are proposing contribute or contain beyond what the
 NEWS file already does?

 Creating and updating NEWS is not mandatory, but is encouraged.
 Making the update of NEWS mandatory would probably not change anything
 for the authors already using it, for those that don't it may cause
 the author to be less likely to share or update their package rather
 than bother with logging the updates (though whether that is a bad
 thing or a good thing is a whole different discussion).  How would you
 mandate the update log?  Just requiring a change could result in a
 lazy author adding a single line of gibberish or a statement like
 "updated on this date".  To enforce something more meaningful would
 require some form of moderation and place additional burden where
 there is already a lot of volunteer work happening.

 If a package is on R-forge (or other similar locations) then you can
 get a details view of the changes between versions.  Packages on CRAN
 have previous versions available and there are tools that will let you
 compare the differences in detail.




Re: [R] Having two different versions of a package in the same R installation

2012-10-02 Thread Gabor Grothendieck
On Tue, Oct 2, 2012 at 4:18 PM, Søren Højsgaard sor...@math.aau.dk wrote:
 Dear list,

 I am making some comparisons of two versions of the lme4 package: The CRAN 
 version and the R-Forge version. For the moment I have two different R 
 installations, each with a different version of lme4. However it would be 
 convenient if I could have the same version within the same R installation 
 such that I can run a code chunk on one lme4 version and immediately after 
 run the same code chunk on another lme4 version. Does anyone know if this is 
 possible?


Install the two versions into two different R libraries:

install.packages("my.package", repos = ..., lib = ...) # 1st repo/lib
install.packages("my.package", repos = ..., lib = ...) # 2nd repo/lib

library(my.package, lib.loc = ...) # 1st lib
# use it
detach("package:my.package", unload = TRUE)
# next line may or may not be needed
library.dynam.unload("my.package", system.file(package = "my.package"))
library(my.package, lib.loc = ...) # 2nd lib

-- 
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com



Re: [R] Having two different versions of a package in the same R installation

2012-10-02 Thread Dirk Eddelbuettel

On 2 October 2012 at 20:18, Søren Højsgaard wrote:
| I am making some comparisons of two versions of the lme4 package: The CRAN 
version and the R-Forge version. For the moment I have two different R 
installations, each with a different version of lme4. However it would be 
convenient if I could have the same version within the same R installation such 
that I can run a code chunk on one lme4 version and immediately after run the 
same code chunk on another lme4 version. Does anyone know if this is possible?


If you use different directories both versions can be installed at the same
time. The one being found first when going down .libPaths() is the default,
the other can be loaded by supplying the path to library().

Dirk

-- 
Dirk Eddelbuettel | e...@debian.org | http://dirk.eddelbuettel.com  



Re: [R] R process must die - can I save history?

2012-10-02 Thread steven mosher
thanks Dr. R. this will come in handy in the future as I have a knack for
hanging R.
On Oct 2, 2012 12:01 PM, Prof Brian Ripley rip...@stats.ox.ac.uk wrote:

 On 02/10/2012 18:29, Bert Gunter wrote:

 ?history

 in a fresh R session, to see what might be possible. I'll bet the
 answer is, "No, you're screwed," though. Nevertheless, maybe Linux
 experts can save you.


 Maybe not.  On a Unix-alike see ?Signals.  If you can find the pid of the
 R process and it is still running (and not e.g. suspended),

 kill -USR1 pid

 will save the workspace and history.


 May the Force be with you.

 -- Bert

 On Tue, Oct 2, 2012 at 10:17 AM, Mike Miller mbmille...@gmail.com
 wrote:

 I connected from my desktop Linux box to a Linux server using ssh in an
 xterm, but that xterm was running in Xvnc.  I'm running R on the server
 in
 that xterm (over ssh).  Something went wrong with Xvnc that has caused
 it to
 hang, probably this bug:

  https://bugs.launchpad.net/ubuntu/+source/vnc4/+bug/819473

 So I can't get back to that ssh session or to R.  I had done a bunch of
 work
 in R but the command history hasn't been written out.  If I kill R, I
 assume
 the command history is gone.  I wish I could somehow cause R to dump the
 command history.  Is there any way to tell the running R process to write
 the history somewhere?

 Thanks in advance.

 Mike

 --
 Michael B. Miller, Ph.D.
 Minnesota Center for Twin and Family Research
 Department of Psychology
 University of Minnesota







  --
  Brian D. Ripley,  rip...@stats.ox.ac.uk
  Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
  University of Oxford, Tel:  +44 1865 272861 (self)
  1 South Parks Road, +44 1865 272866 (PA)
  Oxford OX1 3TG, UK            Fax:  +44 1865 272595






Re: [R] Annotate a segmented linear regression plot

2012-10-02 Thread Ben Harrison
On 28 September 2012 16:38, David Winsemius dwinsem...@comcast.net wrote:

 ?text  # should be fairly clear.

Thank you. I was stupid to ask such a trivial question along with a
not-so-trivial one. The second part of the question was probably more
important: is there a way to obtain the location of segments produced
by the segmented package, so that I can annotate them easily?

I have since asked this question of Vito, the package's author. Here
is his response:

#
dear Ben,
here a possible solution,

o is the segmented fit

  r <- o$rangeZ[, 1]
  est.psi <- o$psi[, 2]
  v <- sort(c(r, est.psi))
  xCoord <- rowMeans(cbind(v[-length(v)], v[-1]))

  Z <- o$model[, o$nameUV$Z]

  id <- sapply(xCoord, function(x) which.min(abs(x - Z)))
  yCoord <- broken.line(o)[id]

  plot(o, col = 2:4, res = TRUE)
  text(xCoord, yCoord, labels = formatC(slope(o)[[1]][, 1], digits = 2),
       pos = 4, cex = 1.3)

Play with pos, cex and digits to modify the appearance of the labels.

vito
#

 Please learn to post in plain text.

I hope this is better.

Ben.



[R] add values in one column getting range from other column

2012-10-02 Thread Sapana Lohani
Hi, 

My dataframe has two columns, one with area and the other with percent. How 
can I add the areas that are within a range of percentage?

My dataframe looks like

Area Percent
 456       0
3400      10
  79      25
  56      18
 467       0
  67      67
 839      85
1120       0
3482      85

I want to add the area for values whose percent is 0, 0-25, 25-50, 50-75, >75. 
How can I do that?
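
Not from the thread, but one vectorised way to do this is to bin Percent with cut() and sum Area per bin with tapply(); the break points below assume the ranges are meant as 0, (0,25], (25,50], (50,75], and >75:

```r
df <- data.frame(
  Area    = c(456, 3400, 79, 56, 467, 67, 839, 1120, 3482),
  Percent = c(0, 10, 25, 18, 0, 67, 85, 0, 85)
)
bins <- cut(df$Percent, breaks = c(-Inf, 0, 25, 50, 75, Inf),
            labels = c("0", "0-25", "25-50", "50-75", ">75"))
tapply(df$Area, bins, sum)  # total Area within each Percent range
```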



Re: [R] Proposal: Package update log

2012-10-02 Thread Starkweather, Jonathan
Thanks to all for the responses and suggestions.

I was primarily proposing a more detailed change log for packages on CRAN. To 
my mind, repositories like R-forge host packages more 'raw' than those on CRAN 
(i.e. CRAN seems to me to contain more 'finished' packages which occasionally 
are updated or added-to). Also, some packages on R-forge do not contain any 
information regarding changes/updates [I'm hesitant to offer an example because 
I'm really just a lemming in terms of R community stature...]. 

I guess what I'm saying is that the 
news(package = "yourPackageHere") 
function is not particularly useful currently, because (in my *limited* 
experience) very few packages contain a NEWS file, and those which do often do 
not contain much in the way of description. Perhaps I'm being a bit too 
ambitious here, but I would just like to be able to see what has been changed 
(and why, if possible) each time a package is updated.

It would seem, from my rather rudimentary understanding, that using current 
TeX/LaTeX based tools for the basis of package documentation lends itself to 
having a better, more organized, change log or news file based on the package 
manual table of contents (toc). For instance, it would be great if we had 
something like: 
news(package = "yourPackageHere", function = "functionOfInterest") 
which could display a log of changes/updates sequentially for the named 
function of interest. Admittedly, I have not created a package myself, but I do 
have some experience with LaTeX and it may be as simple as changing the 
preamble to existing TeX file templates or style files. 

In terms of enforcement; yes I agree it would require more work from the 
package authors, as well as managers/moderators of CRAN; but, if we expect each 
package to have a working help file, then why not a (meaningfully) working 
'news' file. 

Respectfully, 


Jon Starkweather, PhD
University of North Texas
Research and Statistical Support
http://www.unt.edu/rss/



[R] Efficient Way to gather data from various files

2012-10-02 Thread Sam Asin
Hello,

Sorry if this question is too simple for this list.  I know I can do it, but
I always read online that when using R one should try to avoid loops and use
vectors.  I am wondering if there is a more R-friendly way to do this than a
for loop.

I have a dataset that has a list of IDs.  Let's call this dataset Master

Each of these IDs has an associated DBF file.  The DBF files each have
the same title, and they are each located in a directory path that
includes, as one of the folder names, the ID.

These DBF files have 2 columns of interest.  One is the run number, the
other is the statistic.  I'm interested in the median and 90th percentile
of the statistic, as well as their corresponding run numbers.  Ultimately,
I want a table that consists of

ID Run_50th Stat_50 Run_90 Stat_90
1AB  5102010 3 144376
1AC  399 6 9

etc.

Where I currently have a dataset that has

ID
1AB
1AC

etc.

And there are several DBF files that are in folders i.e.
folder1/1AC/folder2/blah.dbf

This dbf looks like

run   Stat

1     10
2     10
3     99
4     1000
5     1
6     99
7     1
8     10
9     10
10    10
11    100


I know I could do this with a loop, but I can't see the efficient, R way.
I was hoping that you experienced R programmers could give me some
pointers on the most efficient way to achieve this result.

Sam
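
One loop-free sketch (not from the thread; the path pattern, the Master data frame, and the use of read.dbf() from the foreign package are assumptions based on the description above):

```r
library(foreign)  # provides read.dbf()

summarise_id <- function(id) {
  dbf <- read.dbf(file.path("folder1", id, "folder2", "blah.dbf"))
  # type = 1 returns actually observed values, so the run numbers
  # can be matched back to the quantiles
  q <- quantile(dbf$Stat, c(0.5, 0.9), type = 1)
  data.frame(ID       = id,
             Run_50th = dbf$run[match(q[1], dbf$Stat)],
             Stat_50  = q[1],
             Run_90   = dbf$run[match(q[2], dbf$Stat)],
             Stat_90  = q[2])
}

result <- do.call(rbind, lapply(Master$ID, summarise_id))
```

(lapply is still a loop internally, but it is the idiomatic R form for this kind of per-ID summary.)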




Re: [R] R process must die - can I save history?

2012-10-02 Thread Michael Weylandt
Agreed -- very cool trick. Thanks Prof Ripley

Michael

On Oct 2, 2012, at 9:59 PM, steven mosher mosherste...@gmail.com wrote:

 thanks Dr. R. this will come in handy in the future as I have a knack for
 hanging R.
 On Oct 2, 2012 12:01 PM, Prof Brian Ripley rip...@stats.ox.ac.uk wrote:
 
 On 02/10/2012 18:29, Bert Gunter wrote:
 
 ?history
 
 in a fresh R session, to see what might be possible. I'll bet the
  answer is, "No, you're screwed," though. Nevertheless, maybe Linux
 experts can save you.
 
 Maybe not.  On a Unix-alike see ?Signals.  If you can find the pid of the
 R process and it is still running (and not e.g. suspended),
 
 kill -USR1 pid
 
 will save the workspace and history.
 
 
 May the Force be with you.
 
 -- Bert
 
 On Tue, Oct 2, 2012 at 10:17 AM, Mike Miller mbmille...@gmail.com
 wrote:
 
 I connected from my desktop Linux box to a Linux server using ssh in an
 xterm, but that xterm was running in Xvnc.  I'm running R on the server
 in
 that xterm (over ssh).  Something went wrong with Xvnc that has caused
 it to
 hang, probably this bug:
 
  https://bugs.launchpad.net/ubuntu/+source/vnc4/+bug/819473
 
 So I can't get back to that ssh session or to R.  I had done a bunch of
 work
 in R but the command history hasn't been written out.  If I kill R, I
 assume
 the command history is gone.  I wish I could somehow cause R to dump the
 command history.  Is there any way to tell the running R process to write
 the history somewhere?
 
 Thanks in advance.
 
 Mike
 
 --
 Michael B. Miller, Ph.D.
 Minnesota Center for Twin and Family Research
 Department of Psychology
 University of Minnesota
 
 
  --
  Brian D. Ripley,  rip...@stats.ox.ac.uk
  Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
  University of Oxford, Tel:  +44 1865 272861 (self)
  1 South Parks Road, +44 1865 272866 (PA)
  Oxford OX1 3TG, UK            Fax:  +44 1865 272595
 
 


[R] svyby and make.formula

2012-10-02 Thread Muhuri, Pradip (SAMHSA/CBHSQ)

Hello,

Although my R code for the svymean() and svyquantile() functions works fine, 
I am stuck with the svyby() and make.formula() functions.  I got the 
following error messages:

-  Error: object of type 'closure' is not subsettable   # svyby()
-  Error in xx[[1]] : subscript out of bounds           # make.formula()

A reproducible example is appended below.

I would appreciate if someone could help me.

Thank you in advance.

Pradip Muhuri


Below is a reproducible example ##

setwd("E:/RDATA")

library (survey)

xd1 <- "
dthage ypll_ler ypll_75 xspd2 psu stratum   wt8 mortstat
 NA NA NA 2   1   1 1683.7387 0
 NA NA NA 2   1   1  640.8950 0
 NA NA NA 2   1   1  714.0662 0
 NA NA NA 2   1   1  714.0662 0
 NA NA NA 2   1   1  530.5263 0
 NA NA NA 2   1   1 2205.2863 0
 NA NA NA 2   1  339 1683.7387 0
 NA NA NA 2   1  339  640.8950 0
 NA NA NA 2   1  339  714.0662 0
 NA NA NA 2   1  339  714.0662 0
 NA NA NA 2   1  339  530.5263 0
 NA NA NA 2   1  339 2205.2863 0
 788.817926   0  2   2 1  592.3100  1
 809.291881   0  2   2 1  1014.7387 1
 875.001076   0  2   2 1  853.4763  1
 875.001076   0  2   2 1  505.1475  1
 885.510514   0  2   2 1  1429.5963 1
 788.817926   0  2   2 339  592.3100 1
 809.291881   0  2   2 339 1014.7387 1
 875.001076   0  2   2 339  853.4763 1
 875.001076   0  2   2 339  505.1475 1
 885.510514   0  2   2 339 1429.5963 1
 788.817926   0  2   2 339  592.3100 1
 809.291881   0  2   2 339 1014.7387 1
 875.001076   0  2   2 339  853.4763 1
 875.001076   0  2   2 339  505.1475 1
 885.510514   0  2   2 339 1429.5963 1
"
newdata <- read.table(textConnection(xd1), header = TRUE, as.is = TRUE)
dim  (newdata)


# make the grouping variable (xspd2)
newdata$xspd2 <- factor(newdata$xspd2, levels = c(1, 2),
                        labels = c("SPD", "No SPD"), ordered = TRUE)

nhis <- svydesign(id = ~psu, strata = ~stratum, weights = ~wt8,
                  data = newdata, nest = TRUE)


# mean age at death - nationwide

svymean( ~dthage, data=nhis ,  subset (nhis, mortstat==1)) 

# mean by SPD status
svyby(~dthage, ~xspd2 ,  design=nhis, svymean )

#percentile
svyquantile(~dthage,  data = nhis ,  subset (nhis, mortstat==1), c( 0 , .25 , 
.5 , .75 , 1 )  )

# percentile by SPD status
svyby(~dthage, ~xspd2, desin=nhis, svyquantile,  c( 0 , .25 , .5 , .75 , 1 ),  
keep.var = F)

# mean for each of the 3 variables

vars <- names(nhis) %in% c("dthage", "ypll_ler", "ypll_75")
vars
svymean(make.formula(vars),nhis,subset (nhis, mortstat==1), na.rm=TRUE)








#

Pradip K. Muhuri
Statistician
Substance Abuse & Mental Health Services Administration
The Center for Behavioral Health Statistics and Quality
Division of Population Surveys
1 Choke Cherry Road, Room 2-1071
Rockville, MD 20857

Tel: 240-276-1070
Fax: 240-276-1260
e-mail: pradip.muh...@samhsa.hhs.gov

The Center for Behavioral Health Statistics and Quality values your feedback.  
Please click on the following link to complete a brief customer survey:   
http://cbhsqsurvey.samhsa.gov



Re: [R] svyboxplot - library (survey)

2012-10-02 Thread Muhuri, Pradip (SAMHSA/CBHSQ)
Hi Thomas,

Thank you so much for your help.

Pradip

From: Thomas Lumley [tlum...@uw.edu]
Sent: Monday, October 01, 2012 6:45 PM
To: Muhuri, Pradip (SAMHSA/CBHSQ)
Cc: Anthony Damico; R help
Subject: Re: [R] svyboxplot - library (survey)

The documentation says The grouping variable in svyboxplot, if
present, must be a factor

  -thomas
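
A sketch of the fix implied by that requirement (assuming the poster's tor data frame, which is not shown in full in the thread):

```r
# Convert the grouping variable to a factor *before* building the design,
# so svyboxplot() draws one box per level.
tor$xspd2 <- factor(tor$xspd2, levels = c(1, 2), labels = c("SPD", "No SPD"))
nhis <- svydesign(id = ~psu, strata = ~stratum, weights = ~wt8,
                  data = tor, nest = TRUE)
svyboxplot(dthage ~ xspd2, subset(nhis, mortstat == 1), col = "gray80",
           varwidth = TRUE, ylab = "Age at Death", xlab = "SPD Status")
```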

On Tue, Oct 2, 2012 at 4:28 AM, Muhuri, Pradip (SAMHSA/CBHSQ)
pradip.muh...@samhsa.hhs.gov wrote:
 Dear Anthony,

 Yes, I can follow the example code you have given.  But, do you know from the 
 code shown below (following Thomas Lumley's book "Complex Surveys") why I am 
 getting the boxplot of dthage for just xspd2=1, not xspd2=2?

 My intent is the make this code work so that I can generate similar plots on 
 other continuous variable.

 Any help will be appreciated.

 Thanks,

 Pradip



 
 nhis <- svydesign(id = ~psu, strata = ~stratum, weights = ~wt8,
                   data = tor, nest = TRUE)

 svyboxplot(dthage ~ xspd2, subset(nhis, mortstat == 1), col = "gray80",
            varwidth = TRUE, ylab = "Age at Death",
            xlab = "SPD Status: 1=SPD, 2=No SPD")



 Pradip K. Muhuri, PhD
 Statistician
 Substance Abuse & Mental Health Services Administration
 The Center for Behavioral Health Statistics and Quality
 Division of Population Surveys
 1 Choke Cherry Road, Room 2-1071
 Rockville, MD 20857

 Tel: 240-276-1070
 Fax: 240-276-1260
 e-mail: pradip.muh...@samhsa.hhs.gov

 The Center for Behavioral Health Statistics and Quality values your feedback.  
 Please click on the following link to complete a brief customer survey:   
 http://cbhsqsurvey.samhsa.gov

 From: Anthony Damico [mailto:ajdam...@gmail.com]
 Sent: Monday, October 01, 2012 10:07 AM
 To: Muhuri, Pradip (SAMHSA/CBHSQ)
 Cc: R help
 Subject: Re: [R] svyboxplot - library (survey)

 using a slight modification of the example shown in ?svyboxplot


 # load survey library
 library(survey)

 # load example data
 data(api)

 # create an example svydesign
 dstrat <- svydesign(id = ~1, strata = ~stype, weights = ~pw, data = apistrat,
                     fpc = ~fpc)

 # set the plot window to display 1 plot x 2 plots
 par(mfrow=c(1,2))

 # generate two example boxplots
 svyboxplot(enroll~stype,dstrat,all.outliers=TRUE)
 svyboxplot(enroll~1,dstrat)

 # done



 # alternative: not as nice

 # set the plot window to display 2 plots x 1 plot
 par(mfrow=c(2,1))

 # generate two example boxplots
 svyboxplot(enroll~stype,dstrat,all.outliers=TRUE)
 svyboxplot(enroll~1,dstrat)

 # done






 On Mon, Oct 1, 2012 at 9:50 AM, Muhuri, Pradip (SAMHSA/CBHSQ) 
 pradip.muh...@samhsa.hhs.gov wrote:
 Hello,

 I have used the library (survey) package for boxplots using the following 
 code.

 Could anyone please tell me why I am getting only 1  boxplot instead of 2 
 boxplots (1-SPD,  2-No SPD).

 What changes in the following code would be required to get 2 boxplots in the 
 same plot frame?

 Thanks,

 Pradip

 ###
 nhis <- svydesign(id = ~psu, strata = ~stratum, weights = ~wt8,
                   data = tor, nest = TRUE)

 svyboxplot(dthage ~ xspd2, subset(nhis, mortstat == 1), col = "gray80",
            varwidth = TRUE, ylab = "Age at Death",
            xlab = "SPD Status: 1=SPD, 2=No SPD")


 Pradip K. Muhuri
 Statistician
 Substance Abuse & Mental Health Services Administration
 The Center for Behavioral Health Statistics and Quality
 Division of Population Surveys
 1 Choke Cherry Road, Room 2-1071
 Rockville, MD 20857

 Tel: 240-276-1070
 Fax: 240-276-1260
 e-mail: pradip.muh...@samhsa.hhs.gov

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.





--
Thomas Lumley
Professor of Biostatistics
University of Auckland



Re: [R] R process must die - can I save history? [SOLVED]

2012-10-02 Thread Mike Miller
This is a solution for UNIX/Linux-type OS users and a lot of it is only 
related to R in the sense that it can allow you to reconnect to an R 
session.  I managed to do it and to save history, thanks to helpful 
messages from Ista Zahn and Brian Ripley.


To explain this, I will call my two Linux boxes Desktop and Server.  I 
logged into Desktop, an Ubuntu box, and ran this command:


ps aux | grep ssh

That showed me these ssh processes and a few extraneous things:

USER   PID %CPU %MEMVSZ   RSS TTY  STAT START   TIME COMMAND
mbmiller  3520  0.0  0.0  46944  4668 pts/5S+   Jul25   0:23 ssh -X Server
mbmiller  4602  0.0  0.0  47720  5544 pts/9S+   Aug09   1:07 ssh -X Server
mbmiller 25614  0.0  0.0  45584  3344 pts/11   S+   Sep24   0:00 ssh -X Server

I launched an xterm from which I would try to recapture those ssh 
sessions.  (Spoiler: This worked for every ssh process.)  I was missing 
the reptyr binary, so I installed it like so:


sudo apt-get install reptyr

The reptyr man page told me that I had to do this to allow reptyr to work 
(I could change the 0 back to 1 after finishing):


$ sudo su
root# echo 0 > /proc/sys/kernel/yama/ptrace_scope
root# exit

After that I just ran reptyr for each process, for example:

reptyr 3520

A few lines of text would appear (the last saying "Set the controlling 
tty"), I'd hit enter and it would give me my command prompt from my old 
shell, or the R prompt if R was running in that shell.  I was then able to 
save command histories, etc.  When I tried to exit from R using q("yes") 
it accepted the command, but it did not return my bash prompt.  To deal 
with that, I tried Brian Ripley's recommendation:


I logged into Server via ssh and ran this command:

ps aux | grep R

Which returned this along with some irrelevant stuff:

USER   PID %CPU %MEMVSZ   RSS TTY  STAT START   TIME COMMAND
mbmiller  5156  0.0  0.1 213188  9244 pts/1S+   Aug06   1:00 
/share/apps/R-2.15.1/lib64/R/bin/exec/R -q

I tried the kill command...

kill -USR1 5156

...and that returned my bash prompt immediately in the other xterm.  R was 
done, the .Rhistory looked perfect, and .RData also was there.  I logged 
into Server, went to the appropriate directory, ran R and found that all 
of the objects were there and working correctly.


So that was amazing.  I could reattach to the R session and also kill it 
without losing history and data.  This is a big deal for me because I get 
stuck like that about once a year and it's always a huge pain.


Mike


On Tue, 2 Oct 2012, Ista Zahn wrote (off-list):

If you can find the process ID you can try connecting the process to 
another terminal with reptyr (https://github.com/nelhage/reptyr), and 
then just use savehistory() as usual. You don't say what flavor of Linux 
you're dealing with, but it looks like there are packaged versions for 
at least Ubuntu, Fedora, and Arch.



On Tue, 2 Oct 2012, Prof Brian Ripley wrote:

Maybe not.  On a Unix-alike see ?Signals.  If you can find the pid of 
the R process and it is still running (and not e.g. suspended),


kill -USR1 pid

will save the workspace and history.
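A small base-R complement to this idea (a sketch only, not part of the original exchange; Sys.getpid() and savehistory() are standard base-R functions): record the pid at the start of a session, so the signal can be sent later without hunting through ps output.

```r
# Record this session's process id up front, so that
#   kill -USR1 <pid>
# can be sent from another terminal later if the session hangs.
pid <- Sys.getpid()
cat("R session pid:", pid, "\n")

# History can also be written out explicitly at any time
# (savehistory() only works in interactive sessions):
if (interactive()) savehistory(file = ".Rhistory")
```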




Original query:

On Tue, 2 Oct 2012, Mike Miller wrote:

I connected from my desktop Linux box to a Linux server using ssh in an 
xterm, but that xterm was running in Xvnc.  I'm running R on the server 
in that xterm (over ssh).  Something went wrong with Xvnc that has 
caused it to hang, probably this bug:


https://bugs.launchpad.net/ubuntu/+source/vnc4/+bug/819473

So I can't get back to that ssh session or to R.  I had done a bunch of 
work in R but the command history hasn't been written out.  If I kill R, 
I assume the command history is gone.  I wish I could somehow cause R to 
dump the command history.  Is there any way to tell the running R 
process to write the history somewhere?




Re: [R] Annotate a segmented linear regression plot

2012-10-02 Thread David Winsemius

On Oct 2, 2012, at 3:38 PM, Ben Harrison wrote:

 On 28 September 2012 16:38, David Winsemius dwinsem...@comcast.net wrote:
 
 ?text  # should be fairly clear.
 
 Thank you. I was stupid to ask such a trivial question along with a
 not-so-trivial one. The second part of the question was probably more
 important: is there a way to obtain the location of segments produced
 by the segmented package, so that I can annotate them easily?

I guess it depends on what you mean by the location of segments. It's easy 
enough to return the x-values for the knots:

o$psi[, "Est."]

I do not see a predicted.segmented function, so you might need to construct 
separate predict() estimates from lm fits for values between the pairs of 
x-values if you meant you wanted the y-values for the knots.

-- 
David.

 
 I have since asked this question of Vito, the package's author. Here
 is his response:
 
 #
 dear Ben,
 here a possible solution,
 
 o is the segmented fit
 
  r <- o$rangeZ[,1]
  est.psi <- o$psi[,2]
  v <- sort(c(r, est.psi))
  xCoord <- rowMeans(cbind(v[-length(v)], v[-1]))
 
  Z <- o$model[, o$nameUV$Z]
 
  id <- sapply(xCoord, function(x) which.min(abs(x - Z)))
  yCoord <- broken.line(o)[id]
 
  plot(o, col=2:4, res=TRUE)
  text(xCoord, yCoord, labels=formatC(slope(o)[[1]][,1], digits=2),
 pos=4, cex=1.3)
 
 Play with pos, cex and digits to modify the appearance of the labels.
 
 vito
 #
 
 Please learn to post in plain text.
 
 I hope this is better.
 
 Ben.

David Winsemius, MD
Alameda, CA, USA



Re: [R] add values in one column getting range from other column

2012-10-02 Thread arun
Hi,
I guess this is what you wanted:
dat1 <- read.table(text="
Area Percent
456 0
3400 10
79 25
56 18
467 0
67 67
839 85
1120 0
3482 85
", sep="", header=TRUE)

aggregate(dat1$Percent, list(Area=dat1[,"Area"],
  Range=cut(dat1$Percent, breaks=c(-Inf, 0, 25, 50, 75, 100),
    labels=c("<=0", "0-25", "25-50", "50-75", ">75"))), function(x) x)
#  Area Range  x
#1  456   <=0  0
#2  467   <=0  0
#3 1120   <=0  0
#4   56  0-25 18
#5   79  0-25 25
#6 3400  0-25 10
#7   67 50-75 67
#8  839   >75 85
#9 3482   >75 85


A.K.



- Original Message -
From: Sapana Lohani lohani.sap...@ymail.com
To: R help r-help@r-project.org
Cc: 
Sent: Tuesday, October 2, 2012 5:25 PM
Subject: [R] add values in one column getting range from other column

Hi, 

My dataframe has two columns, one with area and the other with percent. How 
can I add the areas that are within a range of percentage?

My dataframe looks like

Area Percent
456 0
3400 10
79 25
56 18
467 0
67 67
839 85
1120 0
3482 85

I want to add the area for values whose percent is <=0, 0-25, 25-50, 50-75, 
or >75. How can I do that?




Re: [R] Efficient Way to gather data from various files

2012-10-02 Thread Rui Barradas

Hello,

There are more R-friendly ways to do what you want; it seems to me easy 
to avoid loops, but you need to tell us how you know which rows 
correspond to the 50th and 90th quantiles. Maybe this comes from the 
value in some other column?

Give a more complete description and we'll see what can be done.

Hope this helps,

Rui Barradas
Em 03-10-2012 00:33, Sam Asin escreveu:

Hello,

Sorry if this process is too simple for this list.  I know I can do it, but
I always read online about how when using R one should always try to avoid
loops and use vectors.  I am wondering if there exists a more R friendly
way to do this than to use for loops.

I have a dataset that has a list of IDs.  Let's call this dataset Master

Each of these IDs has an associated DBF file.  The DBF files each have
the same title, and they are each located in a directory path that
includes, as one of the folder names, the ID.

These DBF files have 2 columns of interest.  One is the run number the
other is the statistic.  I'm interested in the median and 90th percentile
of the statistic as well as their corresponding run numbers.  Ultimately,
I want a table that consists of

ID Run_50th Stat_50 Run_90 Stat_90
1AB  5102010 3 144376
1AC  399 6 9

etc.

Where I currently have a dataset that has

ID
1AB
1AC

etc.

And there are several DBF files that are in folders i.e.
folder1/1AC/folder2/blah.dbf

This dbf looks like

run   Stat

1     10
2     10
3     99
4     1000
5     1
6     99
7     1
8     10
9     10
10    10
11    100


I know I could do this with a loop, but I can't see the efficient, R way. 
I was hoping that you experienced R programmers could give me some 
pointers on the most efficient way to achieve this result.

Sam



Re: [R] add values in one column getting range from other column

2012-10-02 Thread arun
Hi,

In fact, there is no need for aggregate().
dat1$range <- cut(dat1$Percent, breaks=c(-Inf, 0, 25, 50, 75, 100),
  labels=c("<=0", "0-25", "25-50", "50-75", ">75"))
dat1
#  Area Percent range
#1  456       0   <=0
#2 3400      10  0-25
#3   79      25  0-25
#4   56      18  0-25
#5  467       0   <=0
#6   67      67 50-75
#7  839      85   >75
#8 1120       0   <=0
#9 3482      85   >75
A.K.








Re: [R] add values in one column getting range from other column

2012-10-02 Thread Rui Barradas

Hello,

Maybe this time you've got it wrong, Arun. The OP wants to sum the 
areas, not just label them.

Using your code,


Range <- cut(dat1$Percent, breaks=c(-Inf, 0, 25, 50, 75, 100),
  labels=c("<=0", "0-25", "25-50", "50-75", ">75"))
aggregate(Area ~ Range, data = dat1, FUN = sum)

#  Range Area
#1   <=0 2043
#2  0-25 3535
#3 50-75   67
#4   >75 4321
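For the same per-range totals without a formula interface, tapply() works too — a minimal sketch reusing the dat1 data frame and the Range factor built above:

```r
# sum Area within each level of Range; returns a named numeric vector
tapply(dat1$Area, Range, sum)
```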

Hope this helps,

Rui Barradas




Re: [R] svyby and make.formula

2012-10-02 Thread Anthony Damico
please double-check that you've got all of your parameters correct by
typing ?svymean ?svyby  and ?make.formula before you send questions to
r-help  :)


# you spelled design wrong and probably need to throw out your NA values.  try this

# percentile by SPD status
svyby(~dthage, ~xspd2, design=nhis, svyquantile,  c( 0 , .25 , .5 , .75 , 1
),  keep.var = F, na.rm = TRUE)



# mean for each of the 3 variables

# this returns a logical vector, but make.formula requires a character vector
vars <- names(nhis) %in% c("dthage", "ypll_ler", "ypl_75")
vars
svymean(make.formula(vars),nhis,subset (nhis, mortstat==1), na.rm=TRUE)


# create a character vector instead

# note you also spelled the third variable wrong-- it will break unless you correct that
vars <- c("dthage", "ypll_ler", "ypll_75")

# this statement has two survey design parameters, which won't work.  which one do you want to use?
svymean(make.formula(vars),nhis,subset (nhis, mortstat==1), na.rm=TRUE)

# pick one
svymean(make.formula(vars),nhis, na.rm=TRUE)
svymean(make.formula(vars),subset(nhis, mortstat==1), na.rm=TRUE)
# all of the variables in vars are NA whenever mortstat isn't 1, so they
give the same results



On Tue, Oct 2, 2012 at 7:51 PM, Muhuri, Pradip (SAMHSA/CBHSQ) 
pradip.muh...@samhsa.hhs.gov wrote:


 Hello,

 Although my R code for the svymean() and svyquantile() functions works
 fine, I am stuck with the svyby() and make.formula() functions.  I got
 the following error messages.

 -  Error: object of type 'closure' is not subsettable  # svyby ()
 -  Error in xx[[1]] : subscript out of bounds# make.formula ()

 A reproducible example is appended below.

 I would appreciate if someone could help me.

 Thank you in advance.

 Pradip Muhuri


 Below is a reproducible example
 ##

 setwd("E:/RDATA")

 library (survey)

 xd1 <-
 dthage ypll_ler ypll_75 xspd2 psu stratum   wt8 mortstat
  NA NANA 2   1   1 1683.73870
  NA NANA 2   1   1  640.89500
  NA NANA 2   1   1  714.06620
  NA NANA 2   1   1  714.06620
  NA NANA 2   1   1  530.52630
  NA NANA 2   1   1 2205.28630
  NA NANA 2   1  339 1683.73870
  NA NANA 2   1  339  640.89500
  NA NANA 2   1  339  714.06620
  NA NANA 2   1  339  714.06620
  NA NANA 2   1  339  530.52630
  NA NANA 2   1  339 2205.28630
  788.817926   0  2   2 1  592.3100  1
  809.291881   0  2   2 1  1014.7387 1
  875.001076   0  2   2 1  853.4763  1
  875.001076   0  2   2 1  505.1475  1
  885.510514   0  2   2 1  1429.5963 1
  788.817926   0  2   2 339  592.31001
  809.291881   0  2   2 339 1014.73871
  875.001076   0  2   2 339  853.47631
  875.001076   0  2   2 339  505.14751
  885.510514   0  2   2 339 1429.59631
  788.817926   0  2   2 339  592.31001
  809.291881   0  2   2 339 1014.73871
  875.001076   0  2   2 339  853.47631
  875.001076   0  2   2 339  505.14751
  885.510514   0  2   2 339 1429.59631
 newdata <- read.table(textConnection(xd1), header=TRUE, as.is=TRUE)
 dim(newdata)


 # make the grouping variable (xspd2)
 newdata$xspd2 <- factor(newdata$xspd2, levels=c(1,2), labels=c('SPD', 'No
 SPD'), ordered=TRUE)

 nhis <- svydesign(id=~psu, strat=~stratum, weights=~wt8, data=newdata,
 nest=TRUE)


 # mean age at death - nationwide

 svymean( ~dthage, data=nhis ,  subset (nhis, mortstat==1))

 # mean by SPD status
 svyby(~dthage, ~xspd2 ,  design=nhis, svymean )

 #percentile
 svyquantile(~dthage,  data = nhis ,  subset (nhis, mortstat==1), c( 0 ,
 .25 , .5 , .75 , 1 )  )

 # percentile by SPD status
 svyby(~dthage, ~xspd2, desin=nhis, svyquantile,  c( 0 , .25 , .5 , .75 , 1
 ),  keep.var = F)

 # mean for each of the 3 variables

 vars <- names(nhis) %in% c("dthage", "ypll_ler", "ypl_75")
 vars
 svymean(make.formula(vars),nhis,subset (nhis, mortstat==1), na.rm=TRUE)








 #

 Pradip K. Muhuri
 Statistician
 Substance Abuse & Mental Health Services Administration
 The Center for Behavioral Health Statistics and Quality
 Division of Population Surveys
 1 Choke Cherry Road, Room 2-1071
 Rockville, MD 20857

 Tel: 240-276-1070
 Fax: 240-276-1260
 e-mail: 

Re: [R] Efficient Way to gather data from various files

2012-10-02 Thread Jeff Newmiller
File operations are not vectorizable. About the only thing you can do for the 
iterating through files part might be to use lapply instead of a for loop, but 
that is mostly a style change.

Once you have read the dbf files there will probably be vector functions you 
can use (quantile). Off the top of my head I don't know a function that tells 
you which value corresponds to a particular quantile, but you can probably sort 
the data with order(), find the value whose ecdf is just below your target with 
which.max, and look at the row number of that value.

x <- rnorm(11)
names(x) <- seq(x)
xs <- x[order(x)]
Row90 <- as.numeric(names(xs)[which.max(0.9 <= seq(xs)/length(xs))])
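Putting the pieces together for the original question, a sketch of the lapply-over-files approach; the directory layout, the Master data frame, and the run/Stat column names are placeholders taken from Sam's description, and read.dbf() comes from the foreign package:

```r
library(foreign)  # read.dbf()

# summarise one ID's dbf file: run number and statistic at the
# 50th and 90th percentiles of Stat
summarise_one <- function(id) {
  d <- read.dbf(file.path("folder1", id, "folder2", "blah.dbf"))
  d <- d[order(d$Stat), ]
  # first row whose empirical cdf reaches the target quantile
  pick <- function(p) d[which.max(p <= seq_len(nrow(d)) / nrow(d)), ]
  s50 <- pick(0.5)
  s90 <- pick(0.9)
  data.frame(ID = id, Run_50th = s50$run, Stat_50 = s50$Stat,
             Run_90 = s90$run, Stat_90 = s90$Stat)
}

# one row per ID in the Master dataset
result <- do.call(rbind, lapply(Master$ID, summarise_one))
```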

---
Jeff NewmillerThe .   .  Go Live...
DCN:jdnew...@dcn.davis.ca.usBasics: ##.#.   ##.#.  Live Go...
  Live:   OO#.. Dead: OO#..  Playing
Research Engineer (Solar/BatteriesO.O#.   #.O#.  with
/Software/Embedded Controllers)   .OO#.   .OO#.  rocks...1k
--- 
Sent from my phone. Please excuse my brevity.




Re: [R] svyby and make.formula

2012-10-02 Thread Muhuri, Pradip (SAMHSA/CBHSQ)
Dear Anthony,

Thank you very much for helping me resolve the issues.  I now have all the 
results that I intended to generate.

Pradip Muhuri


Re: [R] help: ks test fit Poisson-ness (D and p) with one sample data

2012-10-02 Thread yang ji

A silly question: I am wondering how to test goodness of fit with the one 
sample below. I have read _Fitting Distributions with R_, but that doesn't 
answer my specific question. I am inclined to use the Kolmogorov-Smirnov D 
and its associated p-value. Much appreciation!
 X20.001  232   93   84  185  336  417  228  
199  2110 1411  612 1913 1314 3015 2516 3517
 3518 2219 3420 1721 3022 5023 6024 9825 6226   
  6227 6728 5429 5830 583110732 7933 5634 6535  
   8936111371063811339 8840 9841 9242 4243 1944 
 545 1146 1547 1548 2449  350  251  052  
453  154  9


  


[R] calculating gelman diagnostic for mice object

2012-10-02 Thread Rachel Cole
I am using -mice- for multiple imputation and would like to use the gelman
diagnostic in -coda- to assess the convergence of my imputations.  However,
gelman.diag requires an mcmc list as input.  van Buuren and
Groothuis-Oudshoorn (2011) recommend running mice step-by-step to assess
convergence (e.g.  imp2 - mice.mids(imp1, maxit = 3, print = FALSE) )  but
this creates mids objects.  How can I convert the mids objects into an mcmc
list?  Or is the step-by-step approach wrong here?

thanks,
Rachel
