Re: [R] significance test interquartile ranges

2012-07-13 Thread Prof Brian Ripley

On 13/07/2012 21:37, Greg Snow wrote:

A permutation test may be appropriate:


Yes, it may, but precisely which one is unclear.  You are testing 
whether the two samples have an identical distribution, whereas I took 
the question to be a test of differences in dispersion, with differences 
in location allowed.


I do not think this can be solved without further assumptions.  E.g 
people often replace the two-sample t-test by the two-sample Wilcoxon 
test as a test of differences in location, not realizing that the latter 
is also sensitive to other aspects of the difference (e.g. both 
dispersion and shape).


I nearly suggested (yesterday) doing the permutation test on differences 
from medians in the two groups.  But really this is off-topic for R-help 
and needs interaction with a knowledgeable statistician to refine the 
question.



1. compute the ratio of the 2 IQR values (or other comparison of interest)
2. combine the data from the 2 samples into 1 pool, then randomly
split into 2 groups (matching sample sizes of original) and compute
the ratio of the IQR values for the 2 new samples.
3. repeat #2 a bunch of times (like for a total of 999 random splits)
and combine with the original value.
4. (optional, but strongly suggested) plot a histogram of all the
ratios and place a reference line of the original ratio on the plot.
5. calculate the proportion of ratios that are as extreme or more
extreme than the original, this is the (approximate) p-value.


I think it is an 'exact' (but random) p-value.



On Fri, Jul 13, 2012 at 5:32 AM, Schaber, Jörg
 wrote:

Hi,

I have two non-normal distributions and use interquartile ranges as a 
dispersion measure.
Now I am looking for a test, which tests whether the interquartile ranges from 
the two distributions are significantly different.
Any idea?

Thanks,

joerg




--
Brian D. Ripley,  rip...@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Arrange two columns into a five variable dataframe

2012-07-13 Thread Brian Diggs

On 7/13/2012 8:37 PM, darnold wrote:

Hi,

I hope that folks can give me some simple approaches to taking the data set
below, which is accumulated in two columns called "long" and "group", then
arrange the data is the "long" column into a data frame containing five
variables: "Group 1", "Group 2", "Group 3", "Group 4", and "Group 5".  I am
hoping for a few different techniques which I can pass on to my students.

Thanks

David Arnold
College of the Redwoods



dput(flies)

structure(list(long = c(40L, 37L, 44L, 47L, 47L, 47L, 68L, 47L,
54L, 61L, 71L, 75L, 89L, 58L, 59L, 62L, 79L, 96L, 58L, 62L, 70L,
72L, 74L, 96L, 75L, 46L, 42L, 65L, 46L, 58L, 42L, 48L, 58L, 50L,
80L, 63L, 65L, 70L, 70L, 72L, 97L, 46L, 56L, 70L, 70L, 72L, 76L,
90L, 76L, 92L, 21L, 40L, 44L, 54L, 36L, 40L, 56L, 60L, 48L, 53L,
60L, 60L, 65L, 68L, 60L, 81L, 81L, 48L, 48L, 56L, 68L, 75L, 81L,
48L, 68L, 35L, 37L, 49L, 46L, 63L, 39L, 46L, 56L, 63L, 65L, 56L,
65L, 70L, 63L, 65L, 70L, 77L, 81L, 86L, 70L, 70L, 77L, 77L, 81L,
77L, 16L, 19L, 19L, 32L, 33L, 33L, 30L, 42L, 42L, 33L, 26L, 30L,
40L, 54L, 34L, 34L, 47L, 47L, 42L, 47L, 54L, 54L, 56L, 60L, 44L
), group = structure(c(5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L,
5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 4L,
4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L,
4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
3L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L), .Label = c("Group 5", "Group 4", "Group 3", "Group 2",
"Group 1"), class = "factor")), .Names = c("long", "group"), row.names =
c(NA,
-125L), class = "data.frame")


Generally I would recommend either the reshape function or the functions 
in the reshape2 package. However, your data doesn't quite have what is 
needed to use those. You are implicitly assuming that the first 
occurring values in each group go together (should be in the same row), 
the second ones, etc.  The reshapes require an explicit indication of 
which variables go together.


The unstack function will work for you and uses the same assumption.

> unstack(flies)
   Group.5 Group.4 Group.3 Group.2 Group.1
1   16  35  21  46  40
2   19  37  40  42  37
3   19  49  44  65  44
4   32  46  54  46  47
5   33  63  36  58  47
6   33  39  40  42  47
7   30  46  56  48  68
8   42  56  60  58  47
9   42  63  48  50  54
10  33  65  53  80  61
11  26  56  60  63  71
12  30  65  60  65  75
13  40  70  65  70  89
14  54  63  68  70  58
15  34  65  60  72  59
16  34  70  81  97  62
17  47  77  81  46  79
18  47  81  48  56  96
19  42  86  48  70  58
20  47  70  56  70  62
21  54  70  68  72  70
22  54  77  75  76  72
23  56  77  81  90  74
24  60  81  48  76  96
25  44  77  68  92  75



--
Brian S. Diggs, PhD
Senior Research Associate, Department of Surgery
Oregon Health & Science University

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Arrange two columns into a five variable dataframe

2012-07-13 Thread darnold
Hi,

I hope that folks can give me some simple approaches to taking the data set
below, which is accumulated in two columns called "long" and "group", then
arrange the data is the "long" column into a data frame containing five
variables: "Group 1", "Group 2", "Group 3", "Group 4", and "Group 5".  I am
hoping for a few different techniques which I can pass on to my students.

Thanks

David Arnold
College of the Redwoods


> dput(flies)
structure(list(long = c(40L, 37L, 44L, 47L, 47L, 47L, 68L, 47L, 
54L, 61L, 71L, 75L, 89L, 58L, 59L, 62L, 79L, 96L, 58L, 62L, 70L, 
72L, 74L, 96L, 75L, 46L, 42L, 65L, 46L, 58L, 42L, 48L, 58L, 50L, 
80L, 63L, 65L, 70L, 70L, 72L, 97L, 46L, 56L, 70L, 70L, 72L, 76L, 
90L, 76L, 92L, 21L, 40L, 44L, 54L, 36L, 40L, 56L, 60L, 48L, 53L, 
60L, 60L, 65L, 68L, 60L, 81L, 81L, 48L, 48L, 56L, 68L, 75L, 81L, 
48L, 68L, 35L, 37L, 49L, 46L, 63L, 39L, 46L, 56L, 63L, 65L, 56L, 
65L, 70L, 63L, 65L, 70L, 77L, 81L, 86L, 70L, 70L, 77L, 77L, 81L, 
77L, 16L, 19L, 19L, 32L, 33L, 33L, 30L, 42L, 42L, 33L, 26L, 30L, 
40L, 54L, 34L, 34L, 47L, 47L, 42L, 47L, 54L, 54L, 56L, 60L, 44L
), group = structure(c(5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 
5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 4L, 
4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 
4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 
3L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L), .Label = c("Group 5", "Group 4", "Group 3", "Group 2", 
"Group 1"), class = "factor")), .Names = c("long", "group"), row.names =
c(NA, 
-125L), class = "data.frame")

--
View this message in context: 
http://r.789695.n4.nabble.com/Arrange-two-columns-into-a-five-variable-dataframe-tp4636503.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Side by side strip charts

2012-07-13 Thread darnold
Very nice suggestion. I am getting some very kind help here.

x1 <- round(rnorm(10,60,3))
x2 <- round(rnorm(10,65,3))
x3 <- round(rnorm(10,70,3))
stripchart(list(sample1=x1,sample2=x2,sample3=x3),
   method="stack",
   pch=4,
   offset=1/2,
   col="blue",
   lwd=2,
   las=1,
   xlim=c(50,80))

axis(1,pos=2.9,labels=FALSE)
axis(1,pos=1.9,labels=FALSE)

http://r.789695.n4.nabble.com/file/n4636502/Rplot.png 

Thanks.

David Arnold
College of the Redwoods

--
View this message in context: 
http://r.789695.n4.nabble.com/Side-by-side-strip-charts-tp4636399p4636502.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] integrating multi-dimensional dat along one dimension

2012-07-13 Thread David L Carlson
set.seed(42)
d <- array(as.integer(round(runif(125)*10, 0)), dim=c(5, 5, 5))
data_int <- apply(d, c(1,2), sum)

-
David L Carlson
Associate Professor of Anthropology
Texas A&M University
College Station, TX 77840-4352


- Original Message - 

From: "Sam B Civ USAF AFMC AFRL Cable/RVBXI"  
To: r-help@r-project.org 
Sent: Friday, July 13, 2012 4:11:28 PM 
Subject: [R] integrating multi-dimensional dat along one dimension 

I just want to integrate a 3D data set along one dimension to obtain a 
2D data set. Something like: 



(given array "d" with dim nx,ny,nz ...) 



data_int<-array(dim=c(nx,ny)) 

for (n in 1:ny) { 

for (m in 1:nx) { 

data_int[m,n]<-sum(d[m,n,]) 

} 

} 



The thing is, given R's facility with integers, it seems that I should 
be able to obtain data_int without the explicit for-loops, but I haven't 
been able to figure out how to do it. Anyone know how? Thanks. 


[[alternative HTML version deleted]] 

__ 
R-help@r-project.org mailing list 
https://stat.ethz.ch/mailman/listinfo/r-help 
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html 
and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Side by side strip charts

2012-07-13 Thread Peter Ehlers

On 2012-07-13 11:37, Rui Barradas wrote:

Hello,

Or maybe the argument 'pos' of axis().


stripchart(list(sample1=x1,sample2=x2,sample3=x3),
 method="stack",
 pch=4,
 offset=1/2,
 col="blue",
 lwd=2,
 las=1,
 xlim=c(53, 77),
 xaxt="n")
axis(1, at = seq(55, 75, by=5), lwd=2)
axis(1, at = seq(55, 75, by=5), pos=1.90, lwd=2)
axis(1, at = seq(55, 75, by=5), pos=2.90, lwd=2)


(I've also added xlim)

Hope this helps,

Rui Barradas


It seemed like a good exercise to try to imitate the plot
posted by the OP (on Nabble) a bit more closely; so here's
my attempt:

  ## x-axis values to print
  myat <- seq(55, 75, 5)

  ## adjust plot margins to accommodate side 4 labels
  par(mar = c(4,2,2,6), oma = rep(1,4))

  ## do the plot without axes or frame
  stripchart(list(sample1=x1,sample2=x2,sample3=x3),
  method = "stack",
  pch = 4,
  offset = 1/2,
  col = "blue",
  lwd = 2,
  xlim = c(53, 77),
  axes = FALSE)

  ## add the axes; tcl=-0.5 is the default; not really needed
  axis(1, at = myat, tcl = -0.5)
  axis(1, at = myat, pos = 1.90, tcl = -0.5)
  axis(1, at = myat, pos = 2.90, tcl = -0.5)

  ## reprint the axes without labels; ticks are upward
  axis(1, at = myat, labels = NA, tcl = 0.5)
  axis(1, at = myat, labels = NA, pos = 1.90, tcl = 0.5)
  axis(1, at = myat, labels = NA, pos = 2.90, tcl = 0.5)

  ## do the right-side axis, labels only
  axis(4, at = (1:3)-0.1,
   labels = paste("Sample",1:3), las = 1, lwd = 0)

  ## extend horizontal axis lines
  abline(h = (1:3)-0.1, lwd = 2)

  ## add the frame; it's in a bit from the outer edges
  ## due to the 'oma=' par setting
  box("figure")

Peter Ehlers



Em 13-07-2012 19:24, John Kane escreveu:

try something like
abline(h=1.9)

John Kane
Kingston ON Canada



-Original Message-
From: dwarnol...@suddenlink.net
Sent: Fri, 13 Jul 2012 09:54:35 -0700 (PDT)
To: r-help@r-project.org
Subject: Re: [R] Side by side strip charts

OK, got this far:

x1 <- round(rnorm(10,60,3))
x2 <- round(rnorm(10,65,3))
x3 <- round(rnorm(10,70,3))
stripchart(list(sample1=x1,sample2=x2,sample3=x3),
 method="stack",
 pch=4,
 offset=1/2,
 col="blue",
 lwd=2,
 las=1)

Any ideas on how to get an axes drawn under each one as in the image?

Thanks.

David Arnold
College of the Redwoods
http://msemac.redwoods.edu/~darnold/index.php

--
View this message in context:
http://r.789695.n4.nabble.com/Side-by-side-strip-charts-tp4636399p4636464.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



FREE 3D EARTH SCREENSAVER - Watch the Earth right on your desktop!

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Species names in distance matrix

2012-07-13 Thread Peter Ehlers

On 2012-07-13 15:55, Katharine Miller wrote:

I have a function that requires a distance matrix of class dist with
species names as row names.  For the life of me, I cannot figure out how to
get dist() to include species names.

I am sure this must be easy, because a lot of packages and functions out
there require dist objects to have row names.  Any help would be MUCH
appreciated.
- Katharine


You should be able to just set the 'Labels' attribute:

  x <- matrix(rnorm(100), nrow=5)
  d <- dist(x)
  attr(d, "Labels") <- LETTERS[1:5]

Alternatively, convert d to a matrix, assign rownames
and convert back to a dist:

  m <- as.matrix(d)
  rownames(m) <- letters[1:5]
  d1 <- as.dist(m)

Peter Ehlers

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Species names in distance matrix

2012-07-13 Thread Katharine Miller
I have a function that requires a distance matrix of class dist with
species names as row names.  For the life of me, I cannot figure out how to
get dist() to include species names.

I am sure this must be easy, because a lot of packages and functions out
there require dist objects to have row names.  Any help would be MUCH
appreciated.
- Katharine

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] significance test interquartile ranges

2012-07-13 Thread Weidong Gu
Sorry, I meant Kolmogorov-Smirnov test.

Thanks Peter for correction.

Weidong

On Fri, Jul 13, 2012 at 4:56 PM, Peter Ehlers  wrote:
> On 2012-07-13 13:33, Weidong Gu wrote:
>>
>> Hi Joerg,
>>
>> Seems Mann-Whitney-Wilcoxon test (ks.test in R) would do the work
>> which tests differences anywhere in two distributions, e.g. tails,
>> interquartiles and center.
>
>
> The ks.test() function refers to the Kolmogorov-Smirnov test,
> not the Wilcoxon test.
>
> The OP might find this link helpful (I haven't used it, and
> beware of mailer linebreaks):
>
>
> http://www.r-statistics.com/2010/02/siegel-tukey-a-non-parametric-test-for-equality-in-variability-r-code/
>
> Peter Ehlers
>
>>
>> Weidong Gu
>>
>>
>> On Fri, Jul 13, 2012 at 7:32 AM, Schaber, Jörg
>>  wrote:
>>>
>>> Hi,
>>>
>>> I have two non-normal distributions and use interquartile ranges as a
>>> dispersion measure.
>>> Now I am looking for a test, which tests whether the interquartile ranges
>>> from the two distributions are significantly different.
>>> Any idea?
>>>
>>> Thanks,
>>>
>>> joerg
>>>
>>>  [[alternative HTML version deleted]]
>>>
>>> __
>>> R-help@r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide
>>> http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>
>>
>> __
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>
>

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] image.plot transparent?

2012-07-13 Thread Peter Ehlers

On 2012-07-13 06:07, Chris82 wrote:

Hello,

hiere is a small reproducible example.

All z.i which are NA should be transparent at the plot, but they are white
colored.


### Example image.plot regular x,y grid ###

x <- seq(2,2.9,0.1)
y <- seq(42,42.9,0.1)

z <- matrix(seq(-5,4.9,0.1),nrow=10)

image.plot(x,y,z)


### overplotting by a irregular grid 

x.i <-
matrix(c(2.434842,2.436714,2.438593,2.440477,2.442368,2.472929,2.474831,2.476739,2.478654,2.480574,2.511019,2.512950,2.514888,2.516832,2.518782,2.549111,2.551072,2.553039,2.555012,2.556992,2.587205,2.589195,2.591192,2.593195,2.595204),nrow=5,byrow=T)
y.i <-
matrix(c(42.24368,42.28684,42.33005,42.37330,42.41660,42.24392,42.28708,42.33028,42.37354,42.41683,42.24415,42.28732,42.33052,42.37378,42.41707,42.24439,42.28756,42.33076,42.37402,42.41732,42.24464,42.28780,42.33101,42.37426,42.41756),nrow=5,byrow=T)


z.i <- matrix(seq(-10,14,1),nrow=5)

z.i[z.i< 0] <- NA

image.plot(x.i,y.i,z.i, col=terrain.colors(20),add=T)


You still haven't told us what package you're using but,
like Professor Ripley, I assume that it's the fields package.
I believe that package uses a default of "white" for
missing values. You can reset that with the argument
'transparent.color' as in:

 image.plot(x.i,y.i,z.i, col=terrain.colors(20), add=TRUE,
transparent.color = "transparent")

Peter Ehlers



--
View this message in context: 
http://r.789695.n4.nabble.com/image-plot-transparent-tp4635976p4636434.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] integrating multi-dimensional dat along one dimension

2012-07-13 Thread Cable, Sam B Civ USAF AFMC AFRL/RVBXI
I just want to integrate a 3D data set along one dimension to obtain a
2D data set.  Something like:

 

(given array "d" with dim nx,ny,nz ...)

 

data_int<-array(dim=c(nx,ny))

for (n in 1:ny) {

  for (m in 1:nx) {

 data_int[m,n]<-sum(d[m,n,])

  }

}

 

The thing is, given R's facility with integers, it seems that I should
be able to obtain data_int without the explicit for-loops, but I haven't
been able to figure out how to do it.  Anyone know how?  Thanks.


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] significance test interquartile ranges

2012-07-13 Thread Peter Ehlers

On 2012-07-13 13:33, Weidong Gu wrote:

Hi Joerg,

Seems Mann-Whitney-Wilcoxon test (ks.test in R) would do the work
which tests differences anywhere in two distributions, e.g. tails,
interquartiles and center.


The ks.test() function refers to the Kolmogorov-Smirnov test,
not the Wilcoxon test.

The OP might find this link helpful (I haven't used it, and
beware of mailer linebreaks):


http://www.r-statistics.com/2010/02/siegel-tukey-a-non-parametric-test-for-equality-in-variability-r-code/

Peter Ehlers



Weidong Gu

On Fri, Jul 13, 2012 at 7:32 AM, Schaber, Jörg
 wrote:

Hi,

I have two non-normal distributions and use interquartile ranges as a 
dispersion measure.
Now I am looking for a test, which tests whether the interquartile ranges from 
the two distributions are significantly different.
Any idea?

Thanks,

joerg

 [[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Fitting data and removing outliers

2012-07-13 Thread David L Carlson
If not linear, then perhaps nlrob() in package robustbase. 

-
David L Carlson
Associate Professor of Anthropology
Texas A&M University
College Station, TX 77840-4352


- Original Message - 

From: "Stephen Sefick"  
To: "Lauren Vogric" , r-help@r-project.org 
Sent: Friday, July 13, 2012 3:15:25 PM 
Subject: Re: [R] Fitting data and removing outliers 

They are due to measurement error, sample of a different population, or 
... ? What is the unusual event? Does it explain something important 
about the system that you are working on? I am not telling you not to 
do what you are doing, but just writing things that I consider when I am 
doing regression modelling. 
FWIW, 

Stephen 

On 07/13/2012 02:26 PM, Lauren Vogric wrote: 
> Yes, they are unusual events that occurred that affected my data. They have 
> no positive affect in shaping a strong model. 
> 
> -Original Message- 
> From: stephen sefick [mailto:ssef...@gmail.com] 
> Sent: Friday, July 13, 2012 3:24 PM 
> To: David L Carlson 
> Cc: Lauren Vogric; r-help@r-project.org 
> Subject: Re: [R] Fitting data and removing outliers 
> 
> Do you have a good reason to throw these points out? 
> 
> On Fri, Jul 13, 2012 at 2:17 PM, David L Carlson  wrote: 
>> I didn't actually see any question in this posting, but instead of removing 
>> the outliers consider using a robust linear model. 
>> 
>> library(MASS) 
>> ?rlm 
>> 
>> The TeachingDemos package has a data set called outliers to show what can 
>> happen when you iteratively remove "outliers" in the way you suggest. 
>> 
>> - 
>> David L Carlson 
>> Associate Professor of Anthropology 
>> Texas A&M University 
>> College Station, TX 77840-4352 
>> 
>> 
>> - Original Message - 
>> 
>> From: "Lauren Vogric"  
>> To: r-help@r-project.org 
>> Sent: Friday, July 13, 2012 1:36:43 PM 
>> Subject: [R] Fitting data and removing outliers 
>> 
>> What I'm trying to do is create best fit line in R for a set of data points 
>> and then remove all the outliers to re-create a best fit. I can't use IQR 
>> because the outliers I have in mind are easily within the range, but way out 
>> of line for the best fit, which is ruining the fit. I'd rather throw out 
>> those points all together. 
>> 
>> Thanks! 
>> 
>> [[alternative HTML version deleted]] 
>> 
>> __ 
>> R-help@r-project.org mailing list 
>> https://stat.ethz.ch/mailman/listinfo/r-help 
>> PLEASE do read the posting guide 
>> http://www.R-project.org/posting-guide.html 
>> and provide commented, minimal, self-contained, reproducible code. 
>> 
>> __ 
>> R-help@r-project.org mailing list 
>> https://stat.ethz.ch/mailman/listinfo/r-help 
>> PLEASE do read the posting guide 
>> http://www.R-project.org/posting-guide.html 
>> and provide commented, minimal, self-contained, reproducible code. 
> 
> 
> -- 
> Stephen Sefick 
> ** 
> Auburn University 
> Biological Sciences 
> 331 Funchess Hall 
> Auburn, Alabama 
> 36849 
> ** 
> sas0...@auburn.edu 
> http://www.auburn.edu/~sas0025 
> ** 
> 
> Let's not spend our time and resources thinking about things that are so 
> little or so large that all they really do for us is puff us up and make us 
> feel like gods. We are mammals, and have not exhausted the annoying little 
> problems of being mammals. 
> 
> -K. Mullis 
> 
> "A big computer, a complex algorithm and a long time does not equal science." 
> 
> -Robert Gentleman 

__ 
R-help@r-project.org mailing list 
https://stat.ethz.ch/mailman/listinfo/r-help 
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html 
and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] significance test interquartile ranges

2012-07-13 Thread Greg Snow
A permutation test may be appropriate:

1. compute the ratio of the 2 IQR values (or other comparison of interest)
2. combine the data from the 2 samples into 1 pool, then randomly
split into 2 groups (matching sample sizes of original) and compute
the ratio of the IQR values for the 2 new samples.
3. repeat #2 a bunch of times (like for a total of 999 random splits)
and combine with the original value.
4. (optional, but strongly suggested) plot a histogram of all the
ratios and place a reference line of the original ratio on the plot.
5. calculate the proportion of ratios that are as extreme or more
extreme than the original, this is the (approximate) p-value.

On Fri, Jul 13, 2012 at 5:32 AM, Schaber, Jörg
 wrote:
> Hi,
>
> I have two non-normal distributions and use interquartile ranges as a 
> dispersion measure.
> Now I am looking for a test, which tests whether the interquartile ranges 
> from the two distributions are significantly different.
> Any idea?
>
> Thanks,
>
> joerg
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.



-- 
Gregory (Greg) L. Snow Ph.D.
538...@gmail.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] significance test interquartile ranges

2012-07-13 Thread Weidong Gu
Hi Joerg,

Seems Mann-Whitney-Wilcoxon test (ks.test in R) would do the work
which tests differences anywhere in two distributions, e.g. tails,
interquartiles and center.

Weidong Gu

On Fri, Jul 13, 2012 at 7:32 AM, Schaber, Jörg
 wrote:
> Hi,
>
> I have two non-normal distributions and use interquartile ranges as a 
> dispersion measure.
> Now I am looking for a test, which tests whether the interquartile ranges 
> from the two distributions are significantly different.
> Any idea?
>
> Thanks,
>
> joerg
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Power analysis for Cox regression with a time-varying covariate

2012-07-13 Thread Greg Snow
For something like this the best (and possibly only reasonable) option
is to use simulation. I have posted on the general steps for using
simulation for power studies in this list and elsewhere before, but
probably never with coxph.

The general steps still hold, but the complicated part here will be to
simulate the data.  I would recommend something along the lines of:

1. generate a value for the censoring time, possibly exponential or
weibull (for simplicity I would make this not dependent on the
covariates if reasonable).
2. generate a value for the covariate for the given time period
(sample function possibly), then generate a survival time for this
covariate value (possibly weibull distribution, or lognormal,
exponential, etc.)  If the survival time is less than the time period
and censoring time then you have an event and a time to the event.  If
the survival time is longer than the censoring time, but not longer
than the time period (for the covariate), then you have censoring and
you can record the time to censoring.  If the survival time is longer
than the time period then you have the row information for that time
period and can move on to the next time period where you will first
randomly choose the covariate value again, then generate another
survival time based on the covariate and given that they have already
survived a given amount.  Continue with this until you have an event
or censoring time for each subject.

On Fri, Jul 13, 2012 at 9:17 AM, Paul Miller  wrote:
> Hello All,
>
> Does anyone know where I can find information about how to do a power 
> analysis for Cox regression with a time-varying covariate using R or  some 
> other readily available software? I've done some searching online but haven't 
> found anything.
>
> Thanks,
>
> Paul
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.



-- 
Gregory (Greg) L. Snow Ph.D.
538...@gmail.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Fitting data and removing outliers

2012-07-13 Thread Stephen Sefick
They are due to measurement error, sample of a different population, or 
... ?  What is the unusual event?  Does it explain something important 
about the system that you are working on?  I am not telling you not to 
do what you are doing, but just writing things that I consider when I am 
doing regression modelling.

FWIW,

Stephen

On 07/13/2012 02:26 PM, Lauren Vogric wrote:

Yes, they are unusual events that occurred that affected my data. They have no 
positive affect in shaping a strong model.

-Original Message-
From: stephen sefick [mailto:ssef...@gmail.com]
Sent: Friday, July 13, 2012 3:24 PM
To: David L Carlson
Cc: Lauren Vogric; r-help@r-project.org
Subject: Re: [R] Fitting data and removing outliers

Do you have a good reason to throw these points out?

On Fri, Jul 13, 2012 at 2:17 PM, David L Carlson  wrote:

I didn't actually see any question in this posting, but instead of removing the 
outliers consider using a robust linear model.

library(MASS)
?rlm

The TeachingDemos package has a data set called outliers to show what can happen when you 
iteratively remove "outliers" in the way you suggest.

-
David L Carlson
Associate Professor of Anthropology
Texas A&M University
College Station, TX 77840-4352


- Original Message -

From: "Lauren Vogric" 
To: r-help@r-project.org
Sent: Friday, July 13, 2012 1:36:43 PM
Subject: [R] Fitting data and removing outliers

What I'm trying to do is create best fit line in R for a set of data points and 
then remove all the outliers to re-create a best fit. I can't use IQR because 
the outliers I have in mind are easily within the range, but way out of line 
for the best fit, which is ruining the fit. I'd rather throw out those points 
all together.

Thanks!

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



--
Stephen Sefick
**
Auburn University
Biological Sciences
331 Funchess Hall
Auburn, Alabama
36849
**
sas0...@auburn.edu
http://www.auburn.edu/~sas0025
**

Let's not spend our time and resources thinking about things that are so little 
or so large that all they really do for us is puff us up and make us feel like 
gods.  We are mammals, and have not exhausted the annoying little problems of 
being mammals.

 -K. Mullis

"A big computer, a complex algorithm and a long time does not equal science."

   -Robert Gentleman


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] R-squared with Intercept set to 0 (zero) for linear regression in R is incorrect

2012-07-13 Thread William Dunlap
S+ and, I assume, R compute r^2 when there is no intercept as
   sum(fitted(M1)^2) / sum(y^2)
where M1 is the fitted model and y the response.

See, for example,
   
http://web.ist.utl.pt/~ist11038/compute/errtheory/,regression/regrthroughorigin.pdf
for the derivation of this formula.

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com

From: Pamela Krone-Davis [mailto:pkrone-da...@csumb.edu]
Sent: Friday, July 13, 2012 11:05 AM
To: William Dunlap
Subject: Re: [R] R-squared with Intercept set to 0 (zero) for linear regression 
in R is incorrect

Thanks William,

I have actually tried a couple of different formulas for determining R^2.  The 
second formula does return a different value and assumes no intercept.  The 
second formula is attached in the image.

However, when I use the manual formula for R^2 that is shown, I get an R^2 
value that matches the second formula when I don't set the intercept.  ISo I 
think the formula I showed works for both cases.

M1 = lm(k[,1]~k[,2] + 0) ## set intercept to 0
M2 = lm(k[,1]~k[,2])

sqerrM2 = (k[,1] - predict(M2))^2
sqtotM2 = (k[,1] - mean(k[,1]))  ^2
sqerrM1 = (k[,1] - predict(M1))^2
sqtotM1 = (k[,1] - mean(k[,1]))  ^2
R2M1 = 1 -  sum(sqerrM1)/sum(sqtotM1) ##  get 0.328   same as excel value
R2M2 = 1 -  sum(sqerrM2)/sum(sqtotM2)## same as Excel 0.408 and as R in 
this case

> R2M1
[1] 0.3284381
> R2M2
[1] 0.4083052

How does R compute the R-squared value?

Thanks
Pam

On Fri, Jul 13, 2012 at 10:38 AM, William Dunlap 
mailto:wdun...@tibco.com>> wrote:
While excluding the intercept may make sense, your formula for r^2 assumes
that there was an intercept (that is why mean(y)  is in your expression for
sqtot).

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com

From: Pamela Krone-Davis 
[mailto:pkrone-da...@csumb.edu]
Sent: Friday, July 13, 2012 10:32 AM
To: William Dunlap
Subject: Re: [R] R-squared with Intercept set to 0 (zero) for linear regression 
in R is incorrect

Hi William,

Thanks for getting back to me.  When I use the values you provided, it would 
not make sense to set an intercept of 0.  For my data, 0 does make sense as an 
intercept.  When I do not set a 0 intercept using your data points, I get the 
same value for R-squared in R and in Excel and manually.

Thanks
Pam

On Fri, Jul 13, 2012 at 10:03 AM, William Dunlap 
mailto:wdun...@tibco.com>> wrote:
What does Excel give for the following data, where the by-hand formula
you gave is obviously wrong?
   > x <- c(1, 2, 3)
   > y <- c(13.1, 11.9, 11.0)
   > M1 <- lm(y~x+0)
   > sqerr <- (y- predict(M1)) ^ 2
   > sqtot <- (y - mean(y)) ^ 2
   > 1 - sum(sqerr)/sum(sqtot)
  [1] -37.38707

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com


> -Original Message-
> From: r-help-boun...@r-project.org 
> [mailto:r-help-boun...@r-project.org] On
> Behalf Of Pamela Krone-Davis
> Sent: Friday, July 13, 2012 9:01 AM
> To: r-help@r-project.org
> Subject: [R] R-squared with Intercept set to 0 (zero) for linear regression 
> in R is
> incorrect
>
> Hi,
>
> I have been using lm in R to do a linear regression and find the slope
> coefficients and value for R-squared.  The R-squared value reported by R
> (R^2 = 0.9558) is very different than the R-squared value when I use the
> same equation in Exce (R^2 = 0.328).  I manually computed R-squared and the
> Excel value is correct.  I show my code for the determination of R^2 in R.
> When I do not set 0 as the intercept, the R^2 value is the same in R and
> Excel.  In both cases the slope coefficient from R and from Excel are
> identical.
>
> k is a data frame with two columns.
>
> M1 = lm(k[,1]~k[,2] + 0) ## set intercept to 0 and get different
> R^2 values in R and Excel
> M2 = lm(k[,1]~k[,2])
> sumM1 = summary(M1)
> sumM2 = summary(M2)## get same value as Excel when intercept is not
> set to 0
>
> Below is what R returns for sumM1:
>
> lm(formula = k[, 1] ~ k[, 2] + 0)
>
> Residuals:
>   Min1QMedian3Q   Max
> -0.057199 -0.015857  0.003793  0.013737  0.056178
>
> Coefficients:
>Estimate Std. Error t value Pr(>|t|)
> k[, 2]  1.050220.04266   24.62   <2e-16 ***
> ---
> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
>
> Residual standard error: 0.02411 on 28 degrees of freedom
> Multiple R-squared: 0.9558, Adjusted R-squared: 0.9543
> F-statistic: 606.2 on 1 and 28 DF,  p-value: < 2.2e-16
>
> Way manual determination was performed.  The value returned coincides with
> the value from Excel:
>
>  trying to figure out why the R^2 for R and Excel are so different.
>  sqerr = (k[,1] - predict(M1))^2
>  sqtot = (k[,1] - mean(k[,1])   ^2
>
>  R2 = 1 -  sum(sqerr)/sum(sqtot) ## for 1D get 0.328   same as
> excel value
>
> I am very puzzled by this.  How does R compute the value for 

[R] LiblineaR: read/write model files?

2012-07-13 Thread Sam Steingold
How do I read/write liblinear models to files?
E.g., if I train a model using the command line interface, I might want
to load it into R to look the histogram of the weights.
Or I might want to train a model in R and then apply it using a command
line interface.
-- 
Sam Steingold (http://sds.podval.org/) on Ubuntu 12.04 (precise) X 11.0.11103000
http://www.childpsy.net/ http://www.PetitionOnline.com/tap12009/
http://openvotingconsortium.org http://www.memritv.org http://pmw.org.il
Volume(Pizza of radius Z and thickness A) = PI * Z * Z * A

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How to find frequent sequences.

2012-07-13 Thread Vineet Shukla
Hi Petr,
   Yes, that's really very helpful.


Petr : Using this interpretation, AB occurs at lines 1,3,4 and not 1,3,5.
Is this correct?
Vineet : Yes , thats right sorry for the typo.



Petr: If some sequence contains several ocurrences of a pattern, for
example,
the sequence
   A, B, A, B
contains AB twice, then it is counted only once?

Vineet : what needs to be done if I would like to count it as many times as
it occurred ?
remove dont call unique function from "unique(embed(rev(x), lpattern))" ?

Rgds,
Vineet



On Fri, Jul 13, 2012 at 3:36 AM, Petr Savicky  wrote:

> On Thu, Jul 12, 2012 at 03:51:54PM -0500, Vineet Shukla wrote:
> > I have independent event sequences for example as follows :
> >
> > Independent event sequence   1 : A , B , C , D
> > Independent event sequence   2 : A, C , B
> > Independent event sequence   3 :D, A, B, X,Y, Z
> > Independent event sequence   4 :C,A,A,B
> > Independent event sequence   5 :B,A,D
> >
> > I want to able to find that most common sequence patters as
> >
> > {A, B }  = > 3
> > from lines 1,3,5.
> >
> > Pls note that A,C,B must not be considered because C comes in between
> > and line 5 also must not be considered because order of A,B is reversed.
>
> Hi.
>
> If i understand correctly, the first sequence contains patterns
>
>   AB, BC, CD.
>
> Using this interpretation, AB occurs at lines 1,3,4 and not 1,3,5.
> Is this correct?
>
> If some sequence contains several ocurrences of a pattern, for example,
> the sequence
>
>A, B, A, B
>
> contains AB twice, then it is counted only once?
>
> If this is correct, then try the following
>
>   # your input list
>   lst <- list(
>   c("A", "B", "C", "D"),
>   c("A", "C", "B"),
>   c("D", "A", "B", "X", "Y", "Z"),
>   c("C", "A", "A", "B"),
>   c("B", "A", "D"))
>
>   # extract unique patterns from a single sequence as rows of a matrix
>   # lpattern is the length of the patterns
>   singleSeq <- function(x, lpattern)
>   {
>   unique(embed(rev(x), lpattern))
>   }
>
>   lst1 <- lapply(lst, singleSeq, lpattern=2)
>   # combine the matrices to a single matrix
>   mat <- do.call(rbind, lst1)
>   # convert the patters to strings
>   pat <- do.call(paste, c(data.frame(mat), sep=""))
>   out <- table(pat)
>   out
>
>   pat
>   AA AB AC AD BA BC BX CA CB CD DA XY YZ
>1  3  1  1  1  1  1  1  1  1  1  1  1
>
>   names(out)[which.max(out)]
>
>   [1] "AB"
>
> Hope this helps.
>
> Petr Savicky.
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Fitting data and removing outliers

2012-07-13 Thread stephen sefick
Do you have a good reason to throw these points out?

On Fri, Jul 13, 2012 at 2:17 PM, David L Carlson  wrote:
> I didn't actually see any question in this posting, but instead of removing 
> the outliers consider using a robust linear model.
>
> library(MASS)
> ?rlm
>
> The TeachingDemos package has a data set called outliers to show what can 
> happen when you iteratively remove "outliers" in the way you suggest.
>
> -
> David L Carlson
> Associate Professor of Anthropology
> Texas A&M University
> College Station, TX 77840-4352
>
>
> - Original Message -
>
> From: "Lauren Vogric" 
> To: r-help@r-project.org
> Sent: Friday, July 13, 2012 1:36:43 PM
> Subject: [R] Fitting data and removing outliers
>
> What I'm trying to do is create best fit line in R for a set of data points 
> and then remove all the outliers to re-create a best fit. I can't use IQR 
> because the outliers I have in mind are easily within the range, but way out 
> of line for the best fit, which is ruining the fit. I'd rather throw out 
> those points all together.
>
> Thanks!
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.



-- 
Stephen Sefick
**
Auburn University
Biological Sciences
331 Funchess Hall
Auburn, Alabama
36849
**
sas0...@auburn.edu
http://www.auburn.edu/~sas0025
**

Let's not spend our time and resources thinking about things that are
so little or so large that all they really do for us is puff us up and
make us feel like gods.  We are mammals, and have not exhausted the
annoying little problems of being mammals.

-K. Mullis

"A big computer, a complex algorithm and a long time does not equal science."

  -Robert Gentleman

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Fitting data and removing outliers

2012-07-13 Thread David L Carlson
I didn't actually see any question in this posting, but instead of removing the 
outliers consider using a robust linear model.

library(MASS)
?rlm

The TeachingDemos package has a data set called outliers to show what can 
happen when you iteratively remove "outliers" in the way you suggest.

-
David L Carlson
Associate Professor of Anthropology
Texas A&M University
College Station, TX 77840-4352


- Original Message - 

From: "Lauren Vogric"  
To: r-help@r-project.org 
Sent: Friday, July 13, 2012 1:36:43 PM 
Subject: [R] Fitting data and removing outliers 

What I'm trying to do is create best fit line in R for a set of data points and 
then remove all the outliers to re-create a best fit. I can't use IQR because 
the outliers I have in mind are easily within the range, but way out of line 
for the best fit, which is ruining the fit. I'd rather throw out those points 
all together. 

Thanks! 

[[alternative HTML version deleted]] 

__ 
R-help@r-project.org mailing list 
https://stat.ethz.ch/mailman/listinfo/r-help 
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html 
and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Side by side strip charts

2012-07-13 Thread Rui Barradas

Hello,

Or maybe the argument 'pos' of axis().


stripchart(list(sample1=x1,sample2=x2,sample3=x3),
   method="stack",
   pch=4,
   offset=1/2,
   col="blue",
   lwd=2,
   las=1,
   xlim=c(53, 77),
   xaxt="n")
axis(1, at = seq(55, 75, by=5), lwd=2)
axis(1, at = seq(55, 75, by=5), pos=1.90, lwd=2)
axis(1, at = seq(55, 75, by=5), pos=2.90, lwd=2)


(I've also added xlim)

Hope this helps,

Rui Barradas

Em 13-07-2012 19:24, John Kane escreveu:

try something like
abline(h=1.9)

John Kane
Kingston ON Canada



-Original Message-
From: dwarnol...@suddenlink.net
Sent: Fri, 13 Jul 2012 09:54:35 -0700 (PDT)
To: r-help@r-project.org
Subject: Re: [R] Side by side strip charts

OK, got this far:

x1 <- round(rnorm(10,60,3))
x2 <- round(rnorm(10,65,3))
x3 <- round(rnorm(10,70,3))
stripchart(list(sample1=x1,sample2=x2,sample3=x3),
method="stack",
pch=4,
offset=1/2,
col="blue",
lwd=2,
las=1)

Any ideas on how to get an axes drawn under each one as in the image?

Thanks.

David Arnold
College of the Redwoods
http://msemac.redwoods.edu/~darnold/index.php

--
View this message in context:
http://r.789695.n4.nabble.com/Side-by-side-strip-charts-tp4636399p4636464.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



FREE 3D EARTH SCREENSAVER - Watch the Earth right on your desktop!

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Fitting data and removing outliers

2012-07-13 Thread Lauren Vogric
What I'm trying to do is create  best fit line in R for a set of  data points 
and then remove all the outliers to re-create a best fit. I can't use IQR 
because the outliers I have in mind are easily within the range, but way out of 
line for the best fit, which is ruining the fit. I'd rather throw out those 
points all together.

Thanks!

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Side by side strip charts

2012-07-13 Thread peter dalgaard

On Jul 13, 2012, at 18:54 , darnold wrote:

> OK, got this far:
> 
> x1 <- round(rnorm(10,60,3))
> x2 <- round(rnorm(10,65,3))
> x3 <- round(rnorm(10,70,3))
> stripchart(list(sample1=x1,sample2=x2,sample3=x3),
>   method="stack",
>   pch=4,
>   offset=1/2,
>   col="blue",
>   lwd=2,
>   las=1)
> 
> Any ideas on how to get an axes drawn under each one as in the image?
> 

I'd expect axis(., pos=something) to be your friend.

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd@cbs.dk  Priv: pda...@gmail.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Side by side strip charts

2012-07-13 Thread John Kane
try something like 
abline(h=1.9)

John Kane
Kingston ON Canada


> -Original Message-
> From: dwarnol...@suddenlink.net
> Sent: Fri, 13 Jul 2012 09:54:35 -0700 (PDT)
> To: r-help@r-project.org
> Subject: Re: [R] Side by side strip charts
> 
> OK, got this far:
> 
> x1 <- round(rnorm(10,60,3))
> x2 <- round(rnorm(10,65,3))
> x3 <- round(rnorm(10,70,3))
> stripchart(list(sample1=x1,sample2=x2,sample3=x3),
>method="stack",
>pch=4,
>offset=1/2,
>col="blue",
>lwd=2,
>las=1)
> 
> Any ideas on how to get an axes drawn under each one as in the image?
> 
> Thanks.
> 
> David Arnold
> College of the Redwoods
> http://msemac.redwoods.edu/~darnold/index.php
> 
> --
> View this message in context:
> http://r.789695.n4.nabble.com/Side-by-side-strip-charts-tp4636399p4636464.html
> Sent from the R help mailing list archive at Nabble.com.
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.


FREE 3D EARTH SCREENSAVER - Watch the Earth right on your desktop!

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] read.table with numeric row names

2012-07-13 Thread peter dalgaard

On Jul 13, 2012, at 17:59 , arun wrote:

> Hi Peter,
> 
> I copied the data from your email and run it again.
> 
> dat1<-read.table(text="
>   2.5  3.6  7.1  7.9
>   100  3  4  23
>   200  3.1  4  3  3
>   300  2.2  3.3  24
>   ",sep="",header=TRUE) 
> 
> dat1
> X2.5 X3.6 X7.1 X7.9
> 100  3.0  4.023
> 200  3.1  4.033
> 300  2.2  3.324
> 
> 
> colnames(dat1)<-gsub("^[X](.*)","\\1",colnames(dat1))
> 
> 
> I am  not sure what happened with your end.  May be you could try
> readtable(, fill=TRUE)
> 

More likely, I need something to filter out oddball characters inserted by 
Nabble or one of the mail agents. Watch this: Cut+paste from arun (22:17), Chi 
(07:09), and arun (07:08)

> dat1<-read.table(text="
+  2.5  3.6  7.1  7.9
+  100  3  4  23
+  200  3.1  4  3  3
+  300  2.2  3.3  24
+  ",sep="",header=TRUE)
> dat1<-read.table(text="
+ 2.5  3.6  7.1  7.9
+ 100  3  4  23
+ 200  3.1  4  3  3
+ 300  2.2  3.3  24
+ ",sep="",header=TRUE)
> dat1<-read.table(text=" 
+  2.5  3.6  7.1  7.9 
+  100  3  4  23 
+  200  3.1  4  3  3 
+  300  2.2  3.3  24 
+  ",sep="",header=TRUE)  
Error in read.table(text = " \n 2.5  3.6  7.1  7.9 \n 100  3  4  23 
\n 200  3.1  4  3  3 \n 300  2.2  3.3  24 \n ",  : 
  more columns than column names

The clue seems to be that the 3rd variant has a trailing space added to each 
line. No big deal, just drove me up the wall for while this afternoon...


> I guess Chi was able to read it as I understood from his email: 
> ("Thanks. It works very good.Chi"
> 
> A.K.
> 
> 
> 
> 
> - Original Message -
> From: peter dalgaard 
> To: arun 
> Cc: kexinz ; R help 
> Sent: Friday, July 13, 2012 10:27 AM
> Subject: Re: [R] read.table with numeric row names
> 
> 
> On Jul 13, 2012, at 04:27 , arun wrote:
> 
>> Hello,
>> 
>> I saw your reply in nabble.  Sorry about that.  I thought the dataset had 
>> only few columns.
>> 
>> #You can read first line of a file using:
>> readLines("foo.txt",n=1)[1]
>> 
>> 
>> #The more generic colname substitution
>> dat1<-read.table(text=" 
>>   2.5  3.6  7.1  7.9 
>>   100  3  4  23 
>>   200  3.1  4  3  3 
>>   300  2.2  3.3  24 
>>   ",sep="",header=TRUE)
> 
> (This didn't survive too well in mail:
> 
>> dat1<-read.table(text=" 
> +  2.5  3.6  7.1  7.9 
> +  100  3  4  23 
> +  200  3.1  4  3  3 
> +  300  2.2  3.3  24 
> +  ",sep="",header=TRUE)  
> Error in read.table(text = " \n 2.5  3.6  7.1  7.9 \n 100  3  4  2
> 3 \n 200  3.1  4  3  3 \n 300  2.2  3.3  24 \n ",  : 
>   more columns than column names
> 
> Not sure exactly what happened there...)
> 
> 
> 
>>   
>> #The code should remove the "X" from the column names (row names?)
>> 
> 
> However, adding check.names=FALSE should be more expedient.
> 
>> colnames(dat1)<-gsub("^[X](.*)","\\1",colnames(dat1))
>> dat1
>>  2.5 3.6 7.1 7.9
>> 100 3.0 4.0   2   3
>> 200 3.1 4.0   3   3
>> 300 2.2 3.3   2   4
>> plot(colMeans(dat1)~as.numeric(names(dat1)),xlab="Column_Name",ylab="Column_Mean")
>> 
>> A.K.
>> 
>> 
>> 
>> 
>> - Original Message -
>> From: kexinz 
>> To: r-help@r-project.org
>> Cc: 
>> Sent: Thursday, July 12, 2012 2:50 PM
>> Subject: [R] read.table with numeric row names
>> 
>> I have a text file like this
>>   2.5  3.6  7.1  7.9
>> 100   3  4   2 3
>> 200   3.1   4  3  3
>> 300   2.2   3.3   2 4
>> 
>> I used "r <- read.table("a.txt", header=T)"
>> The row names becomes X2.5, X3.6... What I need is the row names are
>> numeric, so I can use the row names as numbers on x-axis for plotting. e.g.
>> "plot(colMeans(r)~names(r))", something like this. How to do this?
>> 
>> Thanks.
>> 
>> --
>> View this message in context: 
>> http://r.789695.n4.nabble.com/read-table-with-numeric-row-names-tp4636342.html
>> Sent from the R help mailing list archive at Nabble.com.
>> 
>> __
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>> 
>> 
>> __
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
> 
> -- 
> Peter Dalgaard, Professor
> Center for Statistics, Copenhagen Business School
> Solbjerg Plads 3, 2000 Frederiksberg, Denmark
> Phone: (+45)38153501
> Email: pd@cbs.dk  Priv: pda...@gmail.com

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd@cbs.dk  Priv: pda...@gmail.com

_

Re: [R] R-squared with Intercept set to 0 (zero) for linear regression in R is incorrect

2012-07-13 Thread William Dunlap
While excluding the intercept may make sense, your formula for r^2 assumes
that there was an intercept (that is why mean(y)  is in your expression for
sqtot).

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com

From: Pamela Krone-Davis [mailto:pkrone-da...@csumb.edu]
Sent: Friday, July 13, 2012 10:32 AM
To: William Dunlap
Subject: Re: [R] R-squared with Intercept set to 0 (zero) for linear regression 
in R is incorrect

Hi William,

Thanks for getting back to me.  When I use the values you provided, it would 
not make sense to set an intercept of 0.  For my data, 0 does make sense as an 
intercept.  When I do not set a 0 intercept using your data points, I get the 
same value for R-squared in R and in Excel and manually.

Thanks
Pam


On Fri, Jul 13, 2012 at 10:03 AM, William Dunlap 
mailto:wdun...@tibco.com>> wrote:
What does Excel give for the following data, where the by-hand formula
you gave is obviously wrong?
   > x <- c(1, 2, 3)
   > y <- c(13.1, 11.9, 11.0)
   > M1 <- lm(y~x+0)
   > sqerr <- (y- predict(M1)) ^ 2
   > sqtot <- (y - mean(y)) ^ 2
   > 1 - sum(sqerr)/sum(sqtot)
  [1] -37.38707

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com


> -Original Message-
> From: r-help-boun...@r-project.org 
> [mailto:r-help-boun...@r-project.org] On
> Behalf Of Pamela Krone-Davis
> Sent: Friday, July 13, 2012 9:01 AM
> To: r-help@r-project.org
> Subject: [R] R-squared with Intercept set to 0 (zero) for linear regression 
> in R is
> incorrect
>
> Hi,
>
> I have been using lm in R to do a linear regression and find the slope
> coefficients and value for R-squared.  The R-squared value reported by R
> (R^2 = 0.9558) is very different than the R-squared value when I use the
> same equation in Exce (R^2 = 0.328).  I manually computed R-squared and the
> Excel value is correct.  I show my code for the determination of R^2 in R.
> When I do not set 0 as the intercept, the R^2 value is the same in R and
> Excel.  In both cases the slope coefficient from R and from Excel are
> identical.
>
> k is a data frame with two columns.
>
> M1 = lm(k[,1]~k[,2] + 0) ## set intercept to 0 and get different
> R^2 values in R and Excel
> M2 = lm(k[,1]~k[,2])
> sumM1 = summary(M1)
> sumM2 = summary(M2)## get same value as Excel when intercept is not
> set to 0
>
> Below is what R returns for sumM1:
>
> lm(formula = k[, 1] ~ k[, 2] + 0)
>
> Residuals:
>   Min1QMedian3Q   Max
> -0.057199 -0.015857  0.003793  0.013737  0.056178
>
> Coefficients:
>Estimate Std. Error t value Pr(>|t|)
> k[, 2]  1.050220.04266   24.62   <2e-16 ***
> ---
> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
>
> Residual standard error: 0.02411 on 28 degrees of freedom
> Multiple R-squared: 0.9558, Adjusted R-squared: 0.9543
> F-statistic: 606.2 on 1 and 28 DF,  p-value: < 2.2e-16
>
> Way manual determination was performed.  The value returned coincides with
> the value from Excel:
>
>  trying to figure out why the R^2 for R and Excel are so different.
>  sqerr = (k[,1] - predict(M1))^2
>  sqtot = (k[,1] - mean(k[,1])   ^2
>
>  R2 = 1 -  sum(sqerr)/sum(sqtot) ## for 1D get 0.328   same as
> excel value
>
> I am very puzzled by this.  How does R compute the value for R^2 in this
> case? Did i write the lm incorrectly?
>
> Thanks
> Pam
>
> PS  In case you are interested, the data I am using for hte two columns is
> below.
>
> k[, 1]
> 1]
>  [1] 0.17170228 0.10881539 0.11843669 0.11619201 0.08441067 0.09424441
> 0.04782264 0.09526496 0.11596476 0.10323453 0.06487894 0.08916484
> 0.06358752 0.07945473
> [15] 0.11213532 0.06531185 0.11503484 0.13679548 0.13762677 0.13126827
> 0.12350649 0.12842441 0.13075654 0.15026602 0.14536351 0.07841638
> 0.08419016 0.11995240
> [29] 0.14425678
>
> > k[,2]
>  [1] 0.11 0.10 0.11 0.10 0.10 0.09 0.10 0.09 0.09 0.11 0.09 0.10 0.09 0.10
> 0.09 0.10 0.10 0.10 0.11 0.10 0.11 0.11 0.12 0.13 0.15 0.10 0.09 0.11 0.12
>
>
> --
> Pam Krone-Davis
> Project Research Assistant and Grant Manager
> PO Box 22122
> Carmel, CA 93922
> (831)582-3684 (o)
> (831)324-0391 (h)
>
>   [[alternative HTML version deleted]]



--
Pam Krone-Davis
Project Research Assistant and Grant Manager
PO Box 22122
Carmel, CA 93922
(831)582-3684 (o)
(831)324-0391 (h)

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] R-squared with Intercept set to 0 (zero) for linear regression in R is incorrect

2012-07-13 Thread John Sorkin
Pamela
R squared with a non-zero, and with a zero intercept can be very different as 
the regression line that you get with and without a zero intercept can be very 
different. Have you plotted your data plot(k[,2],k[,1]) to see if a zero 
intercept is reasonable for your data? Have you drawn the regression lines that 
you get from your models and compared the lines to the plots of your data?
John 

>>> Pamela Krone-Davis  7/13/2012 12:00:36 PM >>>
Hi,

I have been using lm in R to do a linear regression and find the slope
coefficients and value for R-squared.  The R-squared value reported by R
(R^2 = 0.9558) is very different than the R-squared value when I use the
same equation in Exce (R^2 = 0.328).  I manually computed R-squared and the
Excel value is correct.  I show my code for the determination of R^2 in R.
When I do not set 0 as the intercept, the R^2 value is the same in R and
Excel.  In both cases the slope coefficient from R and from Excel are
identical.

k is a data frame with two columns.

M1 = lm(k[,1]~k[,2] + 0) ## set intercept to 0 and get different
R^2 values in R and Excel
M2 = lm(k[,1]~k[,2])
sumM1 = summary(M1)
sumM2 = summary(M2)## get same value as Excel when intercept is not
set to 0

Below is what R returns for sumM1:

lm(formula = k[, 1] ~ k[, 2] + 0)

Residuals:
  Min1QMedian3Q   Max
-0.057199 -0.015857  0.003793  0.013737  0.056178

Coefficients:
   Estimate Std. Error t value Pr(>|t|)
k[, 2]  1.050220.04266   24.62   <2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.02411 on 28 degrees of freedom
Multiple R-squared: 0.9558, Adjusted R-squared: 0.9543
F-statistic: 606.2 on 1 and 28 DF,  p-value: < 2.2e-16

Way manual determination was performed.  The value returned coincides with
the value from Excel:

 trying to figure out why the R^2 for R and Excel are so different.
 sqerr = (k[,1] - predict(M1))^2
 sqtot = (k[,1] - mean(k[,1])   ^2

 R2 = 1 -  sum(sqerr)/sum(sqtot) ## for 1D get 0.328   same as
excel value

I am very puzzled by this.  How does R compute the value for R^2 in this
case? Did i write the lm incorrectly?

Thanks
Pam

PS  In case you are interested, the data I am using for hte two columns is
below.

k[, 1]
1]
 [1] 0.17170228 0.10881539 0.11843669 0.11619201 0.08441067 0.09424441
0.04782264 0.09526496 0.11596476 0.10323453 0.06487894 0.08916484
0.06358752 0.07945473
[15] 0.11213532 0.06531185 0.11503484 0.13679548 0.13762677 0.13126827
0.12350649 0.12842441 0.13075654 0.15026602 0.14536351 0.07841638
0.08419016 0.11995240
[29] 0.14425678

> k[,2]
 [1] 0.11 0.10 0.11 0.10 0.10 0.09 0.10 0.09 0.09 0.11 0.09 0.10 0.09 0.10
0.09 0.10 0.10 0.10 0.11 0.10 0.11 0.11 0.12 0.13 0.15 0.10 0.09 0.11 0.12


-- 
Pam Krone-Davis
Project Research Assistant and Grant Manager
PO Box 22122
Carmel, CA 93922
(831)582-3684 (o)
(831)324-0391 (h)

[[alternative HTML version deleted]]


Confidentiality Statement:
This email message, including any attachments, is for th...{{dropped:6}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] LOCFIT

2012-07-13 Thread Ioanna Ioannou

Hello,
I am using the locfit to fit a non parametric glm model to data with a gamma 
distributed response variable. In the parametric glm regression the diagnostics 
were based on the study of the standardized deviance or pearson residuals.  How 
can I estimate the the standardized Pearson residuals for the nonparametric 
model? 
Thanks, Ioanna
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] R-squared with Intercept set to 0 (zero) for linear regression in R is incorrect

2012-07-13 Thread William Dunlap
You might want to look at 
   http://support.microsoft.com/kb/214230
entitled
   Incorrect output is returned when you use the Linear Regression (LINEST) 
function in Excel

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com


> -Original Message-
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On
> Behalf Of William Dunlap
> Sent: Friday, July 13, 2012 10:04 AM
> To: Pamela Krone-Davis; r-help@r-project.org
> Subject: Re: [R] R-squared with Intercept set to 0 (zero) for linear 
> regression in R is
> incorrect
> 
> What does Excel give for the following data, where the by-hand formula
> you gave is obviously wrong?
>> x <- c(1, 2, 3)
>> y <- c(13.1, 11.9, 11.0)
>> M1 <- lm(y~x+0)
>> sqerr <- (y- predict(M1)) ^ 2
>> sqtot <- (y - mean(y)) ^ 2
>> 1 - sum(sqerr)/sum(sqtot)
>   [1] -37.38707
> 
> Bill Dunlap
> Spotfire, TIBCO Software
> wdunlap tibco.com
> 
> 
> > -Original Message-
> > From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On
> > Behalf Of Pamela Krone-Davis
> > Sent: Friday, July 13, 2012 9:01 AM
> > To: r-help@r-project.org
> > Subject: [R] R-squared with Intercept set to 0 (zero) for linear regression 
> > in R is
> > incorrect
> >
> > Hi,
> >
> > I have been using lm in R to do a linear regression and find the slope
> > coefficients and value for R-squared.  The R-squared value reported by R
> > (R^2 = 0.9558) is very different than the R-squared value when I use the
> > same equation in Exce (R^2 = 0.328).  I manually computed R-squared and the
> > Excel value is correct.  I show my code for the determination of R^2 in R.
> > When I do not set 0 as the intercept, the R^2 value is the same in R and
> > Excel.  In both cases the slope coefficient from R and from Excel are
> > identical.
> >
> > k is a data frame with two columns.
> >
> > M1 = lm(k[,1]~k[,2] + 0) ## set intercept to 0 and get different
> > R^2 values in R and Excel
> > M2 = lm(k[,1]~k[,2])
> > sumM1 = summary(M1)
> > sumM2 = summary(M2)## get same value as Excel when intercept is not
> > set to 0
> >
> > Below is what R returns for sumM1:
> >
> > lm(formula = k[, 1] ~ k[, 2] + 0)
> >
> > Residuals:
> >   Min1QMedian3Q   Max
> > -0.057199 -0.015857  0.003793  0.013737  0.056178
> >
> > Coefficients:
> >Estimate Std. Error t value Pr(>|t|)
> > k[, 2]  1.050220.04266   24.62   <2e-16 ***
> > ---
> > Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
> >
> > Residual standard error: 0.02411 on 28 degrees of freedom
> > Multiple R-squared: 0.9558, Adjusted R-squared: 0.9543
> > F-statistic: 606.2 on 1 and 28 DF,  p-value: < 2.2e-16
> >
> > Way manual determination was performed.  The value returned coincides with
> > the value from Excel:
> >
> >  trying to figure out why the R^2 for R and Excel are so different.
> >  sqerr = (k[,1] - predict(M1))^2
> >  sqtot = (k[,1] - mean(k[,1])   ^2
> >
> >  R2 = 1 -  sum(sqerr)/sum(sqtot) ## for 1D get 0.328   same as
> > excel value
> >
> > I am very puzzled by this.  How does R compute the value for R^2 in this
> > case? Did i write the lm incorrectly?
> >
> > Thanks
> > Pam
> >
> > PS  In case you are interested, the data I am using for hte two columns is
> > below.
> >
> > k[, 1]
> > 1]
> >  [1] 0.17170228 0.10881539 0.11843669 0.11619201 0.08441067 0.09424441
> > 0.04782264 0.09526496 0.11596476 0.10323453 0.06487894 0.08916484
> > 0.06358752 0.07945473
> > [15] 0.11213532 0.06531185 0.11503484 0.13679548 0.13762677 0.13126827
> > 0.12350649 0.12842441 0.13075654 0.15026602 0.14536351 0.07841638
> > 0.08419016 0.11995240
> > [29] 0.14425678
> >
> > > k[,2]
> >  [1] 0.11 0.10 0.11 0.10 0.10 0.09 0.10 0.09 0.09 0.11 0.09 0.10 0.09 0.10
> > 0.09 0.10 0.10 0.10 0.11 0.10 0.11 0.11 0.12 0.13 0.15 0.10 0.09 0.11 0.12
> >
> >
> > --
> > Pam Krone-Davis
> > Project Research Assistant and Grant Manager
> > PO Box 22122
> > Carmel, CA 93922
> > (831)582-3684 (o)
> > (831)324-0391 (h)
> >
> > [[alternative HTML version deleted]]
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] R-squared with Intercept set to 0 (zero) for linear regression in R is incorrect

2012-07-13 Thread William Dunlap
What does Excel give for the following data, where the by-hand formula
you gave is obviously wrong?
   > x <- c(1, 2, 3)
   > y <- c(13.1, 11.9, 11.0)
   > M1 <- lm(y~x+0)
   > sqerr <- (y- predict(M1)) ^ 2
   > sqtot <- (y - mean(y)) ^ 2
   > 1 - sum(sqerr)/sum(sqtot)
  [1] -37.38707

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com


> -Original Message-
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On
> Behalf Of Pamela Krone-Davis
> Sent: Friday, July 13, 2012 9:01 AM
> To: r-help@r-project.org
> Subject: [R] R-squared with Intercept set to 0 (zero) for linear regression 
> in R is
> incorrect
> 
> Hi,
> 
> I have been using lm in R to do a linear regression and find the slope
> coefficients and value for R-squared.  The R-squared value reported by R
> (R^2 = 0.9558) is very different than the R-squared value when I use the
> same equation in Exce (R^2 = 0.328).  I manually computed R-squared and the
> Excel value is correct.  I show my code for the determination of R^2 in R.
> When I do not set 0 as the intercept, the R^2 value is the same in R and
> Excel.  In both cases the slope coefficient from R and from Excel are
> identical.
> 
> k is a data frame with two columns.
> 
> M1 = lm(k[,1]~k[,2] + 0) ## set intercept to 0 and get different
> R^2 values in R and Excel
> M2 = lm(k[,1]~k[,2])
> sumM1 = summary(M1)
> sumM2 = summary(M2)## get same value as Excel when intercept is not
> set to 0
> 
> Below is what R returns for sumM1:
> 
> lm(formula = k[, 1] ~ k[, 2] + 0)
> 
> Residuals:
>   Min1QMedian3Q   Max
> -0.057199 -0.015857  0.003793  0.013737  0.056178
> 
> Coefficients:
>Estimate Std. Error t value Pr(>|t|)
> k[, 2]  1.050220.04266   24.62   <2e-16 ***
> ---
> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
> 
> Residual standard error: 0.02411 on 28 degrees of freedom
> Multiple R-squared: 0.9558, Adjusted R-squared: 0.9543
> F-statistic: 606.2 on 1 and 28 DF,  p-value: < 2.2e-16
> 
> Way manual determination was performed.  The value returned coincides with
> the value from Excel:
> 
>  trying to figure out why the R^2 for R and Excel are so different.
>  sqerr = (k[,1] - predict(M1))^2
>  sqtot = (k[,1] - mean(k[,1])   ^2
> 
>  R2 = 1 -  sum(sqerr)/sum(sqtot) ## for 1D get 0.328   same as
> excel value
> 
> I am very puzzled by this.  How does R compute the value for R^2 in this
> case? Did i write the lm incorrectly?
> 
> Thanks
> Pam
> 
> PS  In case you are interested, the data I am using for hte two columns is
> below.
> 
> k[, 1]
> 1]
>  [1] 0.17170228 0.10881539 0.11843669 0.11619201 0.08441067 0.09424441
> 0.04782264 0.09526496 0.11596476 0.10323453 0.06487894 0.08916484
> 0.06358752 0.07945473
> [15] 0.11213532 0.06531185 0.11503484 0.13679548 0.13762677 0.13126827
> 0.12350649 0.12842441 0.13075654 0.15026602 0.14536351 0.07841638
> 0.08419016 0.11995240
> [29] 0.14425678
> 
> > k[,2]
>  [1] 0.11 0.10 0.11 0.10 0.10 0.09 0.10 0.09 0.09 0.11 0.09 0.10 0.09 0.10
> 0.09 0.10 0.10 0.10 0.11 0.10 0.11 0.11 0.12 0.13 0.15 0.10 0.09 0.11 0.12
> 
> 
> --
> Pam Krone-Davis
> Project Research Assistant and Grant Manager
> PO Box 22122
> Carmel, CA 93922
> (831)582-3684 (o)
> (831)324-0391 (h)
> 
>   [[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Running R from the command line: weird problem

2012-07-13 Thread Duncan Murdoch

On 13/07/2012 9:46 AM, Abhi Raghavan wrote:

Hi,

I'm trying to run an R script from the command line and I do it in the
following way:

R CMD BATCH -q  *

*I write the R script from inside a Perl program (on UNIX of course!)
and execute the shell command using the function "system".**What
intrigues me is the strange error that I see in the output file:

Error: embedded nul in string: '\0'
Execution halted


I would guess a likely cause of this is that there really are nuls in 
your something.R file.  Do you have a binary editor that can look at it?


There are lots of other possibilities too, e.g. something is writing out 
of bounds and putting nuls into strings in memory.  Those are likely 
harder to track down, so I'd look in the file first.


Duncan Murdoch



When I copy and paste the commands manually in the R command prompt, I
get exactly what I expect to get. I've never come across this error
before. I did some brief searches on the net but cannot find any
comprehensive on how this problem can be fixed. I would appreciate some
help in the matter.

As a general question, are there any mature Perl modules that help one
interface R with Perl without having to resort to more cumbersome
methods such as using system calls from within?

Many thanks.

Abhi

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Side by side strip charts

2012-07-13 Thread darnold
OK, got this far:

x1 <- round(rnorm(10,60,3))
x2 <- round(rnorm(10,65,3))
x3 <- round(rnorm(10,70,3))
stripchart(list(sample1=x1,sample2=x2,sample3=x3),
   method="stack",
   pch=4,
   offset=1/2,
   col="blue",
   lwd=2,
   las=1)

Any ideas on how to get an axes drawn under each one as in the image?

Thanks.

David Arnold
College of the Redwoods
http://msemac.redwoods.edu/~darnold/index.php

--
View this message in context: 
http://r.789695.n4.nabble.com/Side-by-side-strip-charts-tp4636399p4636464.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How to get all list item to one string variable

2012-07-13 Thread MacQueen, Don
First of all, although the question uses the term "list", the object named
'a' is clearly not a list as R defines the term. It is a vector.

Thus a better example answer is

  a <- c('abc', 'def')
  paste(a , collapse=' ')

and this works for however many elements there are in the vector a.

The example
  a <- list("abc", "def", "ghi")
  paste(a, collapse=" ")
works, but only because the paste() function first converts its arguments
to character strings.


I only mention this because it's important (especially for those new to R)
to understand the distinction between lists and vectors, even if they
occasionally appear to behave the same way. Consider this:

>   a <- list("abc", "def", 3:6,"ghi")

> a
[[1]]
[1] "abc"

[[2]]
[1] "def"

[[3]]
[1] 3 4 5 6

[[4]]
[1] "ghi"


>   paste(a, collapse=" ")
[1] "abc def 3:6 ghi"



And I did not expect that result, but something like this:


> unlist(a)
[1] "abc" "def" "3"   "4"   "5"   "6"   "ghi"

> paste(unlist(a), collapse=' ')
[1] "abc def 3 4 5 6 ghi"





 

-- 
Don MacQueen

Lawrence Livermore National Laboratory
7000 East Ave., L-627
Livermore, CA 94550
925-423-1062





On 7/12/12 5:22 AM, "purushothaman"  wrote:

>hi,
>
>sorry it's not 2 different list all item in same list like this
>
>a[1]="abc"
>a[2]="def"
>...
>output ="abc def ..."
>
>Thanks
>B.Purushothaman
>
>--
>View this message in context:
>http://r.789695.n4.nabble.com/How-to-get-all-list-item-to-one-string-varia
>ble-tp4636283p4636287.html
>Sent from the R help mailing list archive at Nabble.com.
>
>__
>R-help@r-project.org mailing list
>https://stat.ethz.ch/mailman/listinfo/r-help
>PLEASE do read the posting guide
>http://www.R-project.org/posting-guide.html
>and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] R2OpenBUGS quesion

2012-07-13 Thread Uwe Ligges



On 13.07.2012 03:18, elong zhang wrote:

Dear All:

Could anybody help me figure out why I get the Error message below while I
running the example code of bugs() function in R2OpenBUGS packages? I have
tried the code both in Win 7 and Ubuntu 12.04, but they show the same
message. My R version is 2.15.1 in win7 and 2.15.0 in Ubuntu.  Many thanks !



1. Please always send reproducible examples.
2. Which version of OpenBUGS and which version of R2OpenBUGS (for both 
there are recent updates)


Uwe Ligges




Best,

Yilong



schools.sim <- bugs(data, inits, parameters, model.file,+ 
n.chains=3, n.iter=5000)Error in matrix(, n.sims, n.parameters) :

   invalid 'nrow' value (too large or NA)

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Help with R2 OpenBUGs

2012-07-13 Thread Uwe Ligges



On 13.07.2012 12:51, Tom Porteus wrote:

It seems to not be recognising a function you are calling within your model -
taking a quick look you might want to check that table() is a function in
OpenBUGS.

g <- length(table(G))


[This time also CCing the OP]

Additional hint:

For debugging purposes, it is advisable to open the model file in 
OpenBUGS directly until you are sure it is correct, then go back to R 
and automate things 


Uwe Ligges



TP

--
View this message in context: 
http://r.789695.n4.nabble.com/Help-with-R2-OpenBUGs-tp4636412p4636424.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Help with R2 OpenBUGs

2012-07-13 Thread Uwe Ligges



On 13.07.2012 12:51, Tom Porteus wrote:

It seems to not be recognising a function you are calling within your model -
taking a quick look you might want to check that table() is a function in
OpenBUGS.

g <- length(table(G))



Additional hint:

For debugging purposes, it is advisable to open the model file in 
OpenBUGS directly until you are sure it is correct, then go back to R 
and automate things 


Uwe Ligges



TP

--
View this message in context: 
http://r.789695.n4.nabble.com/Help-with-R2-OpenBUGs-tp4636412p4636424.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] which() in subset()

2012-07-13 Thread Charles Stangor
Thanks Dennis and Mike... I'm getting it!!!

Sent from my Android

Rui Barradas  wrote:

>Hello,
>
>To know why, just evaluate the condition, with 't1$' before 'version_1':
>
>which(as.character(t1$version_1) %in% a) != 0
>[1] TRUE TRUE
>
>
>It allways evaluates to TRUE, therefore, subset() returns all rows.
>
>See if this isn't simpler than both of your forms.
>
>v2 <- subset(t1, version_1 %in% a)
>v2
>   id version_1
>1  1 100-1
>2  2 100-2
>
>The trick is to use %in% when doing multiple comparisons. With the 
>vector with length equal to the number of observations on the left hand 
>side.
>
>Hope this helps,
>
>Rui Barradas
>
>Em 13-07-2012 12:12, Charles Stangor escreveu:
>> Why does the subset not work in the which() version below?
>>
>> Thank you
>>
>>
>> v1 <- subset(t1,
>>   version_1==as.character("100-1")
>>   | version_1==as.character("100-2"))
>>
>> a<-c("100-1", "100-2")
>> v1 <- subset(t1, which(a==as.character(version_1)) != 0)
>>
>>  [[alternative HTML version deleted]]
>>
>> __
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] box plot and plot whiskers

2012-07-13 Thread S Ellison
 

> -Original Message-
> I have question concerning box plot and it's whiskers. As I 
> understood from the description of the boxplot() function, if 
> the range value is positive the plot whiskers extend out from 
> the box to the most extreme data points defined by the values 
> of  the IQR  times range (default 1.5).  
... from the box. For a normal distribution (N(mu, sigma) the expected position 
of the whisker ends would be at mu+-4*0.674*sigma (that corresponds to a 
two-tailed 99.3% interval, if I've not lost a factor of two somewhere).

> It suggests that the 
> upper and lower plot whiskers should be more less the same length.
>  What does it mean if they are not? How it's possible? 
The end of each whisker is always a data point in your data set. Data can be 
anywhere.

In small data sets (under 20 per group) the whiskers can vary quite a lot by 
chance; for example try
set.seed(1027)
y <- rnorm(150)
g <- gl(10,15)
boxplot(y~g)


#and note group 2.

In bigger data sets the quantiles are less variable and different whisker 
length, like the different lengths of the box parts, becomes a more reliable 
indicator of asymmetry. 

S Ellison

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Adjusting format of boxplot

2012-07-13 Thread Bert Gunter
?plot.default
## documents the frame.plot arg and
?boxplot
## documents miscellaneous graphical parameters that go into ...

I would agree that the documentation is relatively poor, a legacy of the
original graphics system, which splits up parameter documentation among
par(), plot(), and specific types of plots. I would guess Paull Murrell's
book also has it, but I haven't checked.

-- Bert

On Fri, Jul 13, 2012 at 12:45 AM, Ivan Calandra <
ivan.calan...@u-bourgogne.fr> wrote:

> Hi Peter,
>
> I had never heard of this 'frame' argument and it's a breakthrough for me
> to be finally able to get rid of this frame!
>
> But where is this argument explained? I couldn't find it in plot(),
> boxplot(), bxp() or par().
>
> Thank you for your answer :)
> Ivan
>
> --
> Ivan CALANDRA
> Université de Bourgogne
> UMR CNRS/uB 6282 Biogéosciences
> 6 Boulevard Gabriel
> 21000 Dijon, FRANCE
> +33(0)3.80.39.63.06
> ivan.calan...@u-bourgogne.fr
> http://biogeosciences.u-**bourgogne.fr/calandra
>
> Le 13/07/12 04:48, Peter Ehlers a écrit :
>
>> On 2012-07-12 18:39, David L Carlson wrote:
>>
>>> Sorry about that. I got rid of the box another way and then switched to
>>> using pars=
>>> without making sure it worked. This works:
>>>
>>> flies$group <- factor(flies$group, 5:1) # 1
>>> levels(flies$group) <- paste0("Group ", 5:1) # 2
>>>
>>> oldpar <- par(bty="n")
>>> boxplot(long ~ group,
>>>  data = flies,
>>>  pars=list(las=1, ylim=c(10, 110), xaxt="n"),
>>>  horizontal = TRUE,
>>>  col = "red")
>>>
>>> axis(1, at=seq(10, 110, 20))
>>> par(oldpar)
>>>
>>
>> Or you could add 'frame = FALSE' to the boxplot() call.
>>
>> Peter Ehlers
>>
>>
>>> --**---
>>> David
>>>
>>> --**---
>>> David L Carlson
>>> Associate Professor of Anthropology
>>> Texas A&M University
>>> College Station, TX 77840-4352
>>>
>>>
>>> - Original Message -
>>>
>>> From: "darnold" 
>>> To: r-help@r-project.org
>>> Sent: Thursday, July 12, 2012 7:53:33 PM
>>> Subject: Re: [R] Adjusting format of boxplot
>>>
>>> Added your code:
>>>
>>>
>>> flies <- read.table("example12_1.dat",**header=TRUE,sep="\t")
>>>
>>> flies$group <- factor(flies$group,5:1)
>>>
>>> levels(flies$group) <- paste0("Group ",5:1)
>>>
>>> boxplot(long ~ group,
>>> data = flies,
>>> pars = list(las=1, ylim=c(10,110), xaxt="n", bty="n"),
>>> horizontal = TRUE,
>>> col = "red")
>>>
>>> axis(1,at=seq(10,110,20))
>>>
>>> Almost worked perfectly, except the frame around the plot remains, which
>>> is
>>> strange as you have bty="n".
>>>
>>> http://r.789695.n4.nabble.com/**file/n4636381/Rplot11.png
>>>
>>> David
>>>
>>>
>> __**
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/**listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/**
>> posting-guide.html 
>> and provide commented, minimal, self-contained, reproducible code.
>>
>>
>>
> __**
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/**listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/**
> posting-guide.html 
> and provide commented, minimal, self-contained, reproducible code.
>



-- 

Bert Gunter
Genentech Nonclinical Biostatistics

Internal Contact Info:
Phone: 467-7374
Website:
http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] which() in subset()

2012-07-13 Thread Rui Barradas

Hello,

To know why, just evaluate the condition, with 't1$' before 'version_1':

which(as.character(t1$version_1) %in% a) != 0
[1] TRUE TRUE


It allways evaluates to TRUE, therefore, subset() returns all rows.

See if this isn't simpler than both of your forms.

v2 <- subset(t1, version_1 %in% a)
v2
  id version_1
1  1 100-1
2  2 100-2

The trick is to use %in% when doing multiple comparisons. With the 
vector with length equal to the number of observations on the left hand 
side.


Hope this helps,

Rui Barradas

Em 13-07-2012 12:12, Charles Stangor escreveu:

Why does the subset not work in the which() version below?

Thank you


v1 <- subset(t1,
  version_1==as.character("100-1")
  | version_1==as.character("100-2"))

a<-c("100-1", "100-2")
v1 <- subset(t1, which(a==as.character(version_1)) != 0)

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] R-squared with Intercept set to 0 (zero) for linear regression in R is incorrect

2012-07-13 Thread Pamela Krone-Davis
Hi,

I have been using lm in R to do a linear regression and find the slope
coefficients and value for R-squared.  The R-squared value reported by R
(R^2 = 0.9558) is very different than the R-squared value when I use the
same equation in Exce (R^2 = 0.328).  I manually computed R-squared and the
Excel value is correct.  I show my code for the determination of R^2 in R.
When I do not set 0 as the intercept, the R^2 value is the same in R and
Excel.  In both cases the slope coefficient from R and from Excel are
identical.

k is a data frame with two columns.

M1 = lm(k[,1]~k[,2] + 0) ## set intercept to 0 and get different
R^2 values in R and Excel
M2 = lm(k[,1]~k[,2])
sumM1 = summary(M1)
sumM2 = summary(M2)## get same value as Excel when intercept is not
set to 0

Below is what R returns for sumM1:

lm(formula = k[, 1] ~ k[, 2] + 0)

Residuals:
  Min1QMedian3Q   Max
-0.057199 -0.015857  0.003793  0.013737  0.056178

Coefficients:
   Estimate Std. Error t value Pr(>|t|)
k[, 2]  1.050220.04266   24.62   <2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.02411 on 28 degrees of freedom
Multiple R-squared: 0.9558, Adjusted R-squared: 0.9543
F-statistic: 606.2 on 1 and 28 DF,  p-value: < 2.2e-16

Way manual determination was performed.  The value returned coincides with
the value from Excel:

 trying to figure out why the R^2 for R and Excel are so different.
 sqerr = (k[,1] - predict(M1))^2
 sqtot = (k[,1] - mean(k[,1])   ^2

 R2 = 1 -  sum(sqerr)/sum(sqtot) ## for 1D get 0.328   same as
excel value

I am very puzzled by this.  How does R compute the value for R^2 in this
case? Did i write the lm incorrectly?

Thanks
Pam

PS  In case you are interested, the data I am using for hte two columns is
below.

k[, 1]
1]
 [1] 0.17170228 0.10881539 0.11843669 0.11619201 0.08441067 0.09424441
0.04782264 0.09526496 0.11596476 0.10323453 0.06487894 0.08916484
0.06358752 0.07945473
[15] 0.11213532 0.06531185 0.11503484 0.13679548 0.13762677 0.13126827
0.12350649 0.12842441 0.13075654 0.15026602 0.14536351 0.07841638
0.08419016 0.11995240
[29] 0.14425678

> k[,2]
 [1] 0.11 0.10 0.11 0.10 0.10 0.09 0.10 0.09 0.09 0.11 0.09 0.10 0.09 0.10
0.09 0.10 0.10 0.10 0.11 0.10 0.11 0.11 0.12 0.13 0.15 0.10 0.09 0.11 0.12


-- 
Pam Krone-Davis
Project Research Assistant and Grant Manager
PO Box 22122
Carmel, CA 93922
(831)582-3684 (o)
(831)324-0391 (h)

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] significance test interquartile ranges

2012-07-13 Thread Schaber , Jörg
Hi,

I have two non-normal distributions and use interquartile ranges as a 
dispersion measure.
Now I am looking for a test, which tests whether the interquartile ranges from 
the two distributions are significantly different.
Any idea?

Thanks,

joerg

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Maximum number of patterns and speed in grep

2012-07-13 Thread Gabor Grothendieck
On Fri, Jul 13, 2012 at 9:40 AM, mdvaan  wrote:
> Thanks, I see that it is working in the sample data. My data, however, gives
> me an error message:
>
> data <- strapplyc(text, batch[[l]])
> Error in structure(.External("dotTcl", ..., PACKAGE = "tcltk"), class =
> "tclObj") :
>   [tcl] couldn't compile regular expression pattern: parentheses () not
> balanced.
>
> batch[[l]] is similar to your "re" string except that there is a larger
> variety of characters. I haven't been able to figure out which characters
> are causing trouble here. Any thoughts?
>
> Thank you very much.
>
> Math
...
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

Note part on last line about posting reproducible code.

-- 
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] box plot and plot whiskers

2012-07-13 Thread David L Carlson
The whiskers will not be symmetrical if 1.5*IQR extends beyond the maximum or 
minimum values. In that case, the whisker stops at the maximum or minimum.

-
David L Carlson
Associate Professor of Anthropology
Texas A&M University
College Station, TX 77840-4352


- Original Message - 

From: "Barbara Uszczynska"  
To: r-help@r-project.org 
Sent: Friday, July 13, 2012 7:40:30 AM 
Subject: [R] box plot and plot whiskers 

Dear R users, 

I have question concerning box plot and it's whiskers. As I understood from 
the description of the boxplot() function, if the range value is positive 
the plot whiskers extend out from the box to the most extreme data points 
defined by the values of the IQR times range (default 1.5). It suggests 
that the upper and lower plot whiskers should be more less the same length. 
What does it mean if they are not? How it's possible? I'm using default 
value of the range. 

Would be garateful for any hint. 

Kind regards, 

Barbara 

[[alternative HTML version deleted]] 

__ 
R-help@r-project.org mailing list 
https://stat.ethz.ch/mailman/listinfo/r-help 
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html 
and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] easy way to fit saturated model in sem package?

2012-07-13 Thread luke-tierney

Apologies -- replied to the wrong message.

luke

On Fri, 13 Jul 2012, luke-tier...@uiowa.edu wrote:


They look fine to me.

luke

On Fri, 13 Jul 2012, Joshua Wiley wrote:


Dear John,

Thanks very much for the reply.  Looking at the optimizers, I had
thought that the objectiveML did what I wanted.  I appreciate the
clarification.

I think that multiple imputation is more flexible in some ways because
you can easy create different models for every variable.  At the same
time, if the assumptions hold, FIML is equivalent to multiple
imputation, and considerably more convenient.  Further, I suspect that
in many circumstances, either option is equal to or better than
listwise deletion.

In my case, I am working on some tools primarily for data exploration,
in a SEM context (some characteristics of individual variables and
then covariance/correlation matrices, clustering, etc.) and hoped to
include listwise/pairwise/FIML as options.

I will check out the lavaan package.

Thanks again for your time,

Josh

On Thu, Jul 12, 2012 at 8:20 AM, John Fox  wrote:

Dear Joshua,

If I understand correctly what you want to do, the sem package won't do 
it.

That is, the sem() function won't do what often is called FIML estimation
for models with missing data. I've been thinking about implementing this
feature, and don't think that it would be too difficult, but I can't 
promise

when and if I'll get to it. You might also take a look at the lavaan
package.

As well, I must admit to some skepticism about the FIML estimator, as
opposed to approaches such as multiple imputation of missing data. I 
suspect

that the former is more sensitive than the latter to the assumption of
multinormality.

Best,
 John


John Fox
Senator William McMaster
  Professor of Social Statistics
Department of Sociology
McMaster University
Hamilton, Ontario, Canada
http://socserv.mcmaster.ca/jfox





-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-
project.org] On Behalf Of Joshua Wiley
Sent: July-12-12 2:53 AM
To: r-help@r-project.org
Cc: John Fox
Subject: [R] easy way to fit saturated model in sem package?

Hi,

I am wondering if anyone knows of an easy way to fit a saturated model
using the sem package on raw data?  Say the data were:

mtcars[, c("mpg", "hp", "wt")]

The model would estimate the three means (intercepts) of c("mpg", "hp",
"wt").  The variances of c("mpg", "hp", "wt").  The covariance of mpg
with hp and wt and the covariance of hp with wt.

I am interested in this because I want to obtain the MLE mean vector
and covariance matrix when there is missing data (i.e., the sum of the
case wise likelihoods or so-called full information maximum
likelihood).  Here is exemplary missing data:

dat <- as.matrix(mtcars[, c("mpg", "hp", "wt")])
dat[sample(length(dat), length(dat) * .25)] <- NA dat <-
as.data.frame(dat)

It is not too difficult to write a wrapper that does this in the OpenMx
package because you can easily define paths using vectors and get all
pairwise combinations using:

combn(c("mpg", "hp", "wt"), 2)

but I would prefer to use the sem package, because OpenMx does not work
on 64 bit versions of R for Windows x64 and is not available from CRAN
presently.  Obviously it is not difficult to write out the model, but I
am hoping to bundle this in a function that for some arbitrary data,
will return the FIML estimated covariance (and correlation matrix).
Alternately, if there are any functions/packages that just return FIML
estimates of a covariance matrix from raw data, that would be great
(but googling and using findFn() from the sos package did not turn up
good results).

Thanks!

Josh


--
Joshua Wiley
Ph.D. Student, Health Psychology
Programmer Analyst II, Statistical Consulting Group University of
California, Los Angeles https://joshuawiley.com/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-
guide.html
and provide commented, minimal, self-contained, reproducible code.












--
Luke Tierney
Chair, Statistics and Actuarial Science
Ralph E. Wareham Professor of Mathematical Sciences
University of Iowa  Phone: 319-335-3386
Department of Statistics andFax:   319-335-3017
   Actuarial Science
241 Schaeffer Hall  email:   luke-tier...@uiowa.edu
Iowa City, IA 52242 WWW:  http://www.stat.uiowa.edu

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] easy way to fit saturated model in sem package?

2012-07-13 Thread luke-tierney

They look fine to me.

luke

On Fri, 13 Jul 2012, Joshua Wiley wrote:


Dear John,

Thanks very much for the reply.  Looking at the optimizers, I had
thought that the objectiveML did what I wanted.  I appreciate the
clarification.

I think that multiple imputation is more flexible in some ways because
you can easy create different models for every variable.  At the same
time, if the assumptions hold, FIML is equivalent to multiple
imputation, and considerably more convenient.  Further, I suspect that
in many circumstances, either option is equal to or better than
listwise deletion.

In my case, I am working on some tools primarily for data exploration,
in a SEM context (some characteristics of individual variables and
then covariance/correlation matrices, clustering, etc.) and hoped to
include listwise/pairwise/FIML as options.

I will check out the lavaan package.

Thanks again for your time,

Josh

On Thu, Jul 12, 2012 at 8:20 AM, John Fox  wrote:

Dear Joshua,

If I understand correctly what you want to do, the sem package won't do it.
That is, the sem() function won't do what often is called FIML estimation
for models with missing data. I've been thinking about implementing this
feature, and don't think that it would be too difficult, but I can't promise
when and if I'll get to it. You might also take a look at the lavaan
package.

As well, I must admit to some skepticism about the FIML estimator, as
opposed to approaches such as multiple imputation of missing data. I suspect
that the former is more sensitive than the latter to the assumption of
multinormality.

Best,
 John


John Fox
Senator William McMaster
  Professor of Social Statistics
Department of Sociology
McMaster University
Hamilton, Ontario, Canada
http://socserv.mcmaster.ca/jfox





-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-
project.org] On Behalf Of Joshua Wiley
Sent: July-12-12 2:53 AM
To: r-help@r-project.org
Cc: John Fox
Subject: [R] easy way to fit saturated model in sem package?

Hi,

I am wondering if anyone knows of an easy way to fit a saturated model
using the sem package on raw data?  Say the data were:

mtcars[, c("mpg", "hp", "wt")]

The model would estimate the three means (intercepts) of c("mpg", "hp",
"wt").  The variances of c("mpg", "hp", "wt").  The covariance of mpg
with hp and wt and the covariance of hp with wt.

I am interested in this because I want to obtain the MLE mean vector
and covariance matrix when there is missing data (i.e., the sum of the
case wise likelihoods or so-called full information maximum
likelihood).  Here is exemplary missing data:

dat <- as.matrix(mtcars[, c("mpg", "hp", "wt")])
dat[sample(length(dat), length(dat) * .25)] <- NA dat <-
as.data.frame(dat)

It is not too difficult to write a wrapper that does this in the OpenMx
package because you can easily define paths using vectors and get all
pairwise combinations using:

combn(c("mpg", "hp", "wt"), 2)

but I would prefer to use the sem package, because OpenMx does not work
on 64 bit versions of R for Windows x64 and is not available from CRAN
presently.  Obviously it is not difficult to write out the model, but I
am hoping to bundle this in a function that for some arbitrary data,
will return the FIML estimated covariance (and correlation matrix).
Alternately, if there are any functions/packages that just return FIML
estimates of a covariance matrix from raw data, that would be great
(but googling and using findFn() from the sos package did not turn up
good results).

Thanks!

Josh


--
Joshua Wiley
Ph.D. Student, Health Psychology
Programmer Analyst II, Statistical Consulting Group University of
California, Los Angeles https://joshuawiley.com/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-
guide.html
and provide commented, minimal, self-contained, reproducible code.









--
Luke Tierney
Chair, Statistics and Actuarial Science
Ralph E. Wareham Professor of Mathematical Sciences
University of Iowa  Phone: 319-335-3386
Department of Statistics andFax:   319-335-3017
   Actuarial Science
241 Schaeffer Hall  email:   luke-tier...@uiowa.edu
Iowa City, IA 52242 WWW:  http://www.stat.uiowa.edu

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Running R from the command line: weird problem

2012-07-13 Thread Abhi Raghavan
Hi,

I'm trying to run an R script from the command line and I do it in the 
following way:

R CMD BATCH -q  *

*I write the R script from inside a Perl program (on UNIX of course!) 
and execute the shell command using the function "system".**What 
intrigues me is the strange error that I see in the output file:

Error: embedded nul in string: '\0'
Execution halted

When I copy and paste the commands manually in the R command prompt, I 
get exactly what I expect to get. I've never come across this error 
before. I did some brief searches on the net but cannot find any 
comprehensive on how this problem can be fixed. I would appreciate some 
help in the matter.

As a general question, are there any mature Perl modules that help one 
interface R with Perl without having to resort to more cumbersome 
methods such as using system calls from within?

Many thanks.

Abhi

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Maximum number of patterns and speed in grep

2012-07-13 Thread mdvaan
Thanks, I see that it is working in the sample data. My data, however, gives
me an error message: 

data <- strapplyc(text, batch[[l]]) 
Error in structure(.External("dotTcl", ..., PACKAGE = "tcltk"), class =
"tclObj") : 
  [tcl] couldn't compile regular expression pattern: parentheses () not
balanced.

batch[[l]] is similar to your "re" string except that there is a larger
variety of characters. I haven't been able to figure out which characters
are causing trouble here. Any thoughts?

Thank you very much.

Math 




Gabor Grothendieck wrote
> 
> On Fri, Jul 6, 2012 at 10:45 AM, mdvaan  wrote:
>> Hi,
>>
>> I am using R's grep function to find patterns in vectors of strings. The
>> number of patterns I would like to match is 7,700 (of different sizes). I
>> noticed that I get an error message when I do the following:
>>
>> data <- array()
>> for (j in 1:length(x))
>> {
>> array[j] <- length(grep(paste(patterns[1:7700], collapse = "|"),  x[j],
>> value = T))
>> }
>>
>> When I break this up into 4 chunks of patterns it works:
>>
>> data <- array()
>> for (j in 1:length(x))
>> {
>> array$chunk1[j] <- length(grep(paste(patterns[1:2500], collapse = "|"),
>> x[j], value = T))
>> array$chunk1[j] <- length(grep(paste(patterns[2501:5000], collapse =
>> "|"),
>> x[j], value = T))
>> array$chunk1[j] <- length(grep(paste(patterns[5001:7500], collapse =
>> "|"),
>> x[j], value = T))
>> array$chunk1[j] <- length(grep(paste(patterns[7501:7700], collapse =
>> "|"),
>> x[j], value = T))
>> }
>>
>> My questions: what's the maximum size of the patterns argument in grep?
>> Is
>> there a way to do this faster? It is very slow.
> 
> Try strapplyc in gsubfn and see
>   http://gsubfn.googlecode.com
> for more info.
> 
> # test data
> x <- c("abcd", "z", "dbef")
> 
> # re is regexp with 7700 alternatives
> #  to test with
> g <- expand.grid(letters, letters, letters)
> gp <- do.call("paste0", g)
> gp7700 <- head(gp, 7700)
> re <- paste(gp7700, collapse = "|")
> 
> # grep gives error message
> grep.out <- grep(re, x)
> 
> # strapplyc works
> library(gsubfn)
> which(sapply(strapplyc(x, re), length) > 0)
> 
> 
> -- 
> Statistics & Software Consulting
> GKX Group, GKX Associates Inc.
> tel: 1-877-GKX-GROUP
> email: ggrothendieck at gmail.com
> 
> __
> R-help@ mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
> 

--
View this message in context: 
http://r.789695.n4.nabble.com/Maximum-number-of-patterns-and-speed-in-grep-tp4635613p4636437.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] read.table with numeric row names

2012-07-13 Thread kexinz
Thanks all you guys' help!

--
View this message in context: 
http://r.789695.n4.nabble.com/read-table-with-numeric-row-names-tp4636342p4636446.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Grabbing Indexes of a certain standard deviation

2012-07-13 Thread gcm
Thank you a ton! This is perfect!

From: Jean V Adams [via R] [mailto:ml-node+s789695n4636370...@n4.nabble.com]
Sent: Thursday, July 12, 2012 4:59 PM
To: Lauren Vogric
Subject: Re: Grabbing Indexes of a certain standard deviation

I wrote a little function called first() to help with situations like
this.  It returns a 1 every time an element of a vector is different from
the previous element, and a 0 otherwise.

first <- function(x) {
L <- length(x)
c(1, 1-(x[-1]==x[-L]))
}

sd <- 1
residuals <- c(1, 2.1, 3, 4, 3, 1, 0, -4, -1)
# logical, indicating if the residual exceeds 2 standard deviations
exceed <- abs(residuals) > (2*sd)
# indices of the first exceeding residual of a series
which(first(exceed) & exceed)

Jean


gcm <[hidden email]> wrote on 
07/12/2012 11:07:49 AM:

> I have a graph of residuals and I am attempting to get a list of the
indexes
> of each time the residual is greater than 2 standard deviations or less
than
> -2 standard deviations, but only the first point of the section. And
then
> I'd also need the first point where the point returns to the range
between
> +/- 2 standard deviations.
> So basically if my standard deviation=1 and my residuals=c(1, 2.1, 3, 4,
3,
> 1, 0, -4, -1) I want it to grab the second number, where it exceeds 2
> standard deviations and the 6th where it returns to less than 2 standard
> deviations. Also, it would grab -4 (or residuals[ 8]) and -1
(residuals[9])

[[alternative HTML version deleted]]

__
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


If you reply to this email, your message will be added to the discussion below:
http://r.789695.n4.nabble.com/Grabbing-Indexes-of-a-certain-standard-deviation-tp4636318p4636370.html
To unsubscribe from Grabbing Indexes of a certain standard deviation, click 
here.
NAML


--
View this message in context: 
http://r.789695.n4.nabble.com/Grabbing-Indexes-of-a-certain-standard-deviation-tp4636318p4636429.html
Sent from the R help mailing list archive at Nabble.com.
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] box plot and plot whiskers

2012-07-13 Thread Barbara Uszczynska
Dear R users,

I have question concerning box plot and it's whiskers. As I understood from
the description of the boxplot() function, if the range value is positive
the plot whiskers extend out from the box to the most extreme data points
defined by the values of  the IQR  times range (default 1.5).  It suggests
that the upper and lower plot whiskers should be more less the same length.
 What does it mean if they are not? How it's possible? I'm using default
value of the range.

Would be garateful for any hint.

Kind regards,

Barbara

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Help with R2 OpenBUGs

2012-07-13 Thread Tom Porteus
It seems to not be recognising a function you are calling within your model -
taking a quick look you might want to check that table() is a function in
OpenBUGS.

g <- length(table(G))

TP

--
View this message in context: 
http://r.789695.n4.nabble.com/Help-with-R2-OpenBUGs-tp4636412p4636424.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] prediction interval for predicted y mean at terminal node - rpart method "anova"

2012-07-13 Thread agent dunham
Dear community, 

I'm using rpart and would like to extract 95% prediction interval at each
terminal node from a regression tree (built with rpart, method= "anova") Is
that possible? 

Thanks in advance, u...@host.com as crossp...@host.com


--
View this message in context: 
http://r.789695.n4.nabble.com/prediction-interval-for-predicted-y-mean-at-terminal-node-rpart-method-anova-tp4636425.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] which() in subset()

2012-07-13 Thread Charles Stangor
Why does the subset not work in the which() version below?

Thank you


v1 <- subset(t1,
 version_1==as.character("100-1")
 | version_1==as.character("100-2"))

a<-c("100-1", "100-2")
v1 <- subset(t1, which(a==as.character(version_1)) != 0)

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] easy way to fit saturated model in sem package?

2012-07-13 Thread Joshua Wiley
Dear John,

Thanks very much for the reply.  Looking at the optimizers, I had
thought that the objectiveML did what I wanted.  I appreciate the
clarification.

I think that multiple imputation is more flexible in some ways because
you can easy create different models for every variable.  At the same
time, if the assumptions hold, FIML is equivalent to multiple
imputation, and considerably more convenient.  Further, I suspect that
in many circumstances, either option is equal to or better than
listwise deletion.

In my case, I am working on some tools primarily for data exploration,
in a SEM context (some characteristics of individual variables and
then covariance/correlation matrices, clustering, etc.) and hoped to
include listwise/pairwise/FIML as options.

I will check out the lavaan package.

Thanks again for your time,

Josh

On Thu, Jul 12, 2012 at 8:20 AM, John Fox  wrote:
> Dear Joshua,
>
> If I understand correctly what you want to do, the sem package won't do it.
> That is, the sem() function won't do what often is called FIML estimation
> for models with missing data. I've been thinking about implementing this
> feature, and don't think that it would be too difficult, but I can't promise
> when and if I'll get to it. You might also take a look at the lavaan
> package.
>
> As well, I must admit to some skepticism about the FIML estimator, as
> opposed to approaches such as multiple imputation of missing data. I suspect
> that the former is more sensitive than the latter to the assumption of
> multinormality.
>
> Best,
>  John
>
> 
> John Fox
> Senator William McMaster
>   Professor of Social Statistics
> Department of Sociology
> McMaster University
> Hamilton, Ontario, Canada
> http://socserv.mcmaster.ca/jfox
>
>
>
>
>> -Original Message-
>> From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-
>> project.org] On Behalf Of Joshua Wiley
>> Sent: July-12-12 2:53 AM
>> To: r-help@r-project.org
>> Cc: John Fox
>> Subject: [R] easy way to fit saturated model in sem package?
>>
>> Hi,
>>
>> I am wondering if anyone knows of an easy way to fit a saturated model
>> using the sem package on raw data?  Say the data were:
>>
>> mtcars[, c("mpg", "hp", "wt")]
>>
>> The model would estimate the three means (intercepts) of c("mpg", "hp",
>> "wt").  The variances of c("mpg", "hp", "wt").  The covariance of mpg
>> with hp and wt and the covariance of hp with wt.
>>
>> I am interested in this because I want to obtain the MLE mean vector
>> and covariance matrix when there is missing data (i.e., the sum of the
>> case wise likelihoods or so-called full information maximum
>> likelihood).  Here is exemplary missing data:
>>
>> dat <- as.matrix(mtcars[, c("mpg", "hp", "wt")])
>> dat[sample(length(dat), length(dat) * .25)] <- NA dat <-
>> as.data.frame(dat)
>>
>> It is not too difficult to write a wrapper that does this in the OpenMx
>> package because you can easily define paths using vectors and get all
>> pairwise combinations using:
>>
>> combn(c("mpg", "hp", "wt"), 2)
>>
>> but I would prefer to use the sem package, because OpenMx does not work
>> on 64 bit versions of R for Windows x64 and is not available from CRAN
>> presently.  Obviously it is not difficult to write out the model, but I
>> am hoping to bundle this in a function that for some arbitrary data,
>> will return the FIML estimated covariance (and correlation matrix).
>> Alternately, if there are any functions/packages that just return FIML
>> estimates of a covariance matrix from raw data, that would be great
>> (but googling and using findFn() from the sos package did not turn up
>> good results).
>>
>> Thanks!
>>
>> Josh
>>
>>
>> --
>> Joshua Wiley
>> Ph.D. Student, Health Psychology
>> Programmer Analyst II, Statistical Consulting Group University of
>> California, Los Angeles https://joshuawiley.com/
>>
>> __
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-
>> guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>



-- 
Joshua Wiley
Ph.D. Student, Health Psychology
Programmer Analyst II, Statistical Consulting Group
University of California, Los Angeles
https://joshuawiley.com/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] read.table with numeric row names

2012-07-13 Thread William Dunlap
> BTW, is there an R command to read just the first line of the file?

scan() or readLines() will read as many lines of the file as you want.
Use the file() function to open a "file connection" so a subsequent
read.table() will start where scan() or readLines() finished.  E.g.,

  > tfile <- tempfile()
  > cat(file=tfile, " 2.5  3.6  7.1  7.9 
  +  100  3  4  23 
  +  200  3.1  4  3  3 
  +  300  2.2  3.3  24
  + ") # now tfile looks like your example file
  > read.table(header=TRUE, tfile) # easy way, but not what you want
  X2.5 X3.6 X7.1 X7.9
  100  3.0  4.023
  200  3.1  4.033
  300  2.2  3.324
  > fileConn <- file(tfile, open="r")
  > scan(fileConn, what=0.0, nlines=1)
  Read 4 items
  [1] 2.5 3.6 7.1 7.9
  > read.table(header=FALSE, fileConn)
 V1  V2  V3 V4 V5
  1 100 3.0 4.0  2  3
  2 200 3.1 4.0  3  3
  3 300 2.2 3.3  2  4
  > str(.Last.value)
  'data.frame':   3 obs. of  5 variables:
   $ V1: int  100 200 300
   $ V2: num  3 3.1 2.2
   $ V3: num  4 4 3.3
   $ V4: int  2 3 2
   $ V5: int  3 3 4
   > close(fileConn) # clean up

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com


> -Original Message-
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On
> Behalf Of kexinz
> Sent: Thursday, July 12, 2012 2:59 PM
> To: r-help@r-project.org
> Subject: Re: [R] read.table with numeric row names
> 
> Thanks Yasir, this helps a lot.
> BTW, is there an R command to read just the first line of the file?
> 
> 
> Yasir Kaheil wrote
> >
> > just do this:
> > colnames(r)<-substr(colnames(r),2,nchar(colnames(r)))
> >
> > This will remove the X.
> > Later when you want to use the headed to plot something, cast it as
> > numeric:
> > plot(colMeans(r)~as.numeric(colnames(r)))
> >
> 
> 
> --
> View this message in context: http://r.789695.n4.nabble.com/read-table-with-
> numeric-row-names-tp4636342p4636377.html
> Sent from the R help mailing list archive at Nabble.com.
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] read.table with numeric row names

2012-07-13 Thread jim holtman
Try this:

> x <- read.table(text = " 2.5  3.6  7.1  7.9
+ 100   3  4   2 3
+ 200   3.1   4  3  3
+ 300   2.2   3.3   2 4", header = TRUE, check.names = FALSE)
>
> x
2.5 3.6 7.1 7.9
100 3.0 4.0   2   3
200 3.1 4.0   3   3
300 2.2 3.3   2   4
> names(x)
[1] "2.5" "3.6" "7.1" "7.9"


On Thu, Jul 12, 2012 at 2:50 PM, kexinz  wrote:
> I have a text file like this
>  2.5  3.6  7.1  7.9
> 100   3  4   2 3
> 200   3.1   4  3  3
> 300   2.2   3.3   2 4
>
> I used "r <- read.table("a.txt", header=T)"
> The row names becomes X2.5, X3.6... What I need is the row names are
> numeric, so I can use the row names as numbers on x-axis for plotting. e.g.
> "plot(colMeans(r)~names(r))", something like this. How to do this?
>
> Thanks.
>
> --
> View this message in context: 
> http://r.789695.n4.nabble.com/read-table-with-numeric-row-names-tp4636342.html
> Sent from the R help mailing list archive at Nabble.com.
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.



-- 
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Power analysis for Cox regression with a time-varying covariate

2012-07-13 Thread Paul Miller
Hello All,

Does anyone know where I can find information about how to do a power analysis 
for Cox regression with a time-varying covariate using R or  some other readily 
available software? I've done some searching online but haven't found anything. 

Thanks,

Paul

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Substitute list value

2012-07-13 Thread Duncan Murdoch

On 13/07/2012 9:50 AM, Bert Gunter wrote:

Jessica:

On Fri, Jul 13, 2012 at 1:35 AM, Jessica Streicher  wrote:

> two things:
>
> - R always counts from 1, not from 0
> - listmembers are accessed by using [[ ]] , not [ ]
>

FALSE! -- or at least not clearly stated:

  > x <- list(a=letters[1:3],b=1:4)
> x[[2]]
[1] 1 2 3 4
> x[2]
$b
[1] 1 2 3 4

Note that the first is a vector, the second component of x; while the
second is a list whose only component is the second component of x.


I think Jessica was right, and clear.  List members are accessed using 
[[ ]].  Lists are subsetted using [ ].  Your first example extracts the 
second member of x.  Your second example constructs a new list with a 
subset of the members that are in x.


Duncan Murdoch



-- Bert

>
> try
>
> t1[t==ll[[1]], "v"] <- 99
>
> greetings Jessi
>
>
> On 11.07.2012, at 15:47, Charles Stangor wrote:
>
> > I can't seem to determine how to get the name of a list member to
> > substitute:
> >
> > ll <- list("a1" = "a","a2" = "b")
> >
> > t1[t==ll[0], "v"] <- 99
> >
> > why doesn't this substitute to:
> >
> > t1[t=="a", "v"] <- 99
> >
> > Thank you!
> >
> >
> >
> >
> > --
> > Charles Stangor
> > Professor
> >
> >   [[alternative HTML version deleted]]
> >
> > __
> > R-help@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>





__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] read.table with numeric row names

2012-07-13 Thread peter dalgaard

On Jul 13, 2012, at 04:27 , arun wrote:

> Hello,
> 
> I saw your reply in nabble.  Sorry about that.  I thought the dataset had 
> only few columns.
> 
> #You can read first line of a file using:
> readLines("foo.txt",n=1)[1]
> 
> 
> #The more generic colname substitution
> dat1<-read.table(text=" 
>  2.5  3.6  7.1  7.9 
>  100  3  4  23 
>  200  3.1  4  3  3 
>  300  2.2  3.3  24 
>  ",sep="",header=TRUE)

(This didn't survive too well in mail:

> dat1<-read.table(text=" 
+  2.5  3.6  7.1  7.9 
+  100  3  4  23 
+  200  3.1  4  3  3 
+  300  2.2  3.3  24 
+  ",sep="",header=TRUE)  
Error in read.table(text = " \n 2.5  3.6  7.1  7.9 \n 100  3  4  23 
\n 200  3.1  4  3  3 \n 300  2.2  3.3  24 \n ",  : 
  more columns than column names

Not sure exactly what happened there...)

 

>  
> #The code should remove the "X" from the column names (row names?)
> 

However, adding check.names=FALSE should be more expedient.

> colnames(dat1)<-gsub("^[X](.*)","\\1",colnames(dat1))
> dat1
> 2.5 3.6 7.1 7.9
> 100 3.0 4.0   2   3
> 200 3.1 4.0   3   3
> 300 2.2 3.3   2   4
> plot(colMeans(dat1)~as.numeric(names(dat1)),xlab="Column_Name",ylab="Column_Mean")
> 
> A.K.
> 
> 
> 
> 
> - Original Message -
> From: kexinz 
> To: r-help@r-project.org
> Cc: 
> Sent: Thursday, July 12, 2012 2:50 PM
> Subject: [R] read.table with numeric row names
> 
> I have a text file like this
>  2.5  3.6  7.1  7.9
> 100   3  4   2 3
> 200   3.1   4  3  3
> 300   2.2   3.3   2 4
> 
> I used "r <- read.table("a.txt", header=T)"
> The row names becomes X2.5, X3.6... What I need is the row names are
> numeric, so I can use the row names as numbers on x-axis for plotting. e.g.
> "plot(colMeans(r)~names(r))", something like this. How to do this?
> 
> Thanks.
> 
> --
> View this message in context: 
> http://r.789695.n4.nabble.com/read-table-with-numeric-row-names-tp4636342.html
> Sent from the R help mailing list archive at Nabble.com.
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
> 
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

-- 
Peter Dalgaard, Professor
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd@cbs.dk  Priv: pda...@gmail.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Substitute list value

2012-07-13 Thread Bert Gunter
Jessica:

On Fri, Jul 13, 2012 at 1:35 AM, Jessica Streicher  wrote:

> two things:
>
> - R always counts from 1, not from 0
> - listmembers are accessed by using [[ ]] , not [ ]
>

FALSE! -- or at least not clearly stated:

 > x <- list(a=letters[1:3],b=1:4)
> x[[2]]
[1] 1 2 3 4
> x[2]
$b
[1] 1 2 3 4

Note that the first is a vector, the second component of x; while the
second is a list whose only component is the second component of x.

-- Bert

>
> try
>
> t1[t==ll[[1]], "v"] <- 99
>
> greetings Jessi
>
>
> On 11.07.2012, at 15:47, Charles Stangor wrote:
>
> > I can't seem to determine how to get the name of a list member to
> > substitute:
> >
> > ll <- list("a1" = "a","a2" = "b")
> >
> > t1[t==ll[0], "v"] <- 99
> >
> > why doesn't this substitute to:
> >
> > t1[t=="a", "v"] <- 99
> >
> > Thank you!
> >
> >
> >
> >
> > --
> > Charles Stangor
> > Professor
> >
> >   [[alternative HTML version deleted]]
> >
> > __
> > R-help@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



-- 

Bert Gunter
Genentech Nonclinical Biostatistics

Internal Contact Info:
Phone: 467-7374
Website:
http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] How to Generate index File From Rd file

2012-07-13 Thread purushothaman
Hi,

I need to generate HTML index file from Rd file.

example

Module Class Function   
Description

Csvfunction csvoperations   Module Description
 http://test  Read CSV  Function
Description
   http://test Write CSV
Function
Description


Thanks
B.Purushothaman

--
View this message in context: 
http://r.789695.n4.nabble.com/How-to-Generate-index-File-From-Rd-file-tp4636436.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] image.plot transparent?

2012-07-13 Thread Chris82
Hello,

hiere is a small reproducible example.

All z.i which are NA should be transparent at the plot, but they are white
colored. 


### Example image.plot regular x,y grid ###

x <- seq(2,2.9,0.1)
y <- seq(42,42.9,0.1)

z <- matrix(seq(-5,4.9,0.1),nrow=10)

image.plot(x,y,z)


### overplotting by a irregular grid 

x.i <-
matrix(c(2.434842,2.436714,2.438593,2.440477,2.442368,2.472929,2.474831,2.476739,2.478654,2.480574,2.511019,2.512950,2.514888,2.516832,2.518782,2.549111,2.551072,2.553039,2.555012,2.556992,2.587205,2.589195,2.591192,2.593195,2.595204),nrow=5,byrow=T)
y.i <-
matrix(c(42.24368,42.28684,42.33005,42.37330,42.41660,42.24392,42.28708,42.33028,42.37354,42.41683,42.24415,42.28732,42.33052,42.37378,42.41707,42.24439,42.28756,42.33076,42.37402,42.41732,42.24464,42.28780,42.33101,42.37426,42.41756),nrow=5,byrow=T)


z.i <- matrix(seq(-10,14,1),nrow=5)

z.i[z.i< 0] <- NA 

image.plot(x.i,y.i,z.i, col=terrain.colors(20),add=T)

--
View this message in context: 
http://r.789695.n4.nabble.com/image-plot-transparent-tp4635976p4636434.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] -1.1 - 0.1 + 1.2 is NOT null! Why?

2012-07-13 Thread ollestrat
Someone pointed me to this paper:

http://www.validlab.com/goldberg/paper.pdf

--
View this message in context: 
http://r.789695.n4.nabble.com/1-1-0-1-1-2-is-NOT-null-Why-tp4636053p4636433.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] -1.1 - 0.1 + 1.2 is NOT null! Why?

2012-07-13 Thread ollestrat
Thank you for the explanation. Good to know about the issue how double values
are "constructed" by a bit system. This makes me handling double values with
care in using it in R or aother languages control structures etc. 

Thank you also for the hint concerning the Null vs. Zero vs.. issue. Yes,
the subject title is misleading.. 


--
View this message in context: 
http://r.789695.n4.nabble.com/1-1-0-1-1-2-is-NOT-null-Why-tp4636053p4636432.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Vuong test

2012-07-13 Thread Marc Schwartz

On Jul 13, 2012, at 12:35 AM, sanchez ana wrote:

> Dear All,
> 
> I am using the function vuong from pscl package to compare 2 non nested 
> models NB1
> (negative binomial I ) and Zero-inflated model.
> 
> 
> NB1 <-  glm(, , family = quasipoisson), it is an
> object of class: "glm" "lm"
> zinb <-
> zeroinfl( dist = "negbin") is an object of class: "zeroinfl"
>  
> when applying vuong
> function I get the following:
> vuong(NB1, zinb)
> Error en predprob.glm(m1) : 
>   your object of class glm is unsupported by
> predprob.glmyour object of class lm is unsupported by predprob.glm
> 
> 
> Any help will be really appreciated.



Either you did not properly copy your code above or something is amiss in your 
understanding.

NB1 is NOT a negative binomial model, but is a quasipoisson model. The latter 
is not fit using maximum likelihood, hence the Vuong test, which is a 
likelihood ratio based test, fails.

If you want a negative binomial model, you should be using glm.nb() in V&R's 
MASS package:

require(MASS)
NB1 <- glm.nb(...)

Then you can use the Vuong test from pscl.

If you have not, you might want to read:

  vignette("countreg")

Regards,

Marc Schwartz

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Warning message with read.csv.sql

2012-07-13 Thread Gabor Grothendieck
On Thu, Jul 12, 2012 at 8:17 AM, Bharat Warule  wrote:
> Hello,
>
> I am using read.csv.sql first time for reading the large data file.If I am
> ran this code that showns warning “closing unused connection”.
>
> Is it I am missing any argument from my command or how to comeout from this
> warning?.
>
> R code:-
>
> Library(sqldf)
>
> ip_dir_path <- “D:/BharatWarule/big_data_July”
>
> inFile_sql <- file.path(ip_dir_path, "cypress_modeldev_account_info.csv")
> input_abt <- read.csv.sql(inFile_sql,sql = s)
> Warning message:
> closing unused connection 3
> (D:/BharatWarule/big_data_July/cypress_modeldev_account_info.csv)
>
>
> Thanks,
> Bharat
>
>
> -
> Bharat Warule
> Cypress Analytica ,
> Pune
> --
> View this message in context: 
> http://r.789695.n4.nabble.com/Warning-message-with-read-csv-sql-tp4636285.html
> Sent from the R help mailing list archive at Nabble.com.
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

Read the last line about reproducible code.


-- 
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Merging on Datetime Column

2012-07-13 Thread Rui Barradas

Hello,

Check the structure of what you have, df and newdf. You will see that in 
df dateTime is of class POSIXlt and in newDf newDateTime is of class 
POSIXct.


Solution:

[...]
df$dateTime <- strptime(df$dateTime,"%m/%d/%Y %H:%M")
df$dateTime <- as.POSIXct(df$dateTime)
[...]

Hope this helps,

Rui Barradas

Em 13-07-2012 10:24, vioravis escreveu:

I have the following dataframe with the first column being of type datetime:

dateTime <- c("10/01/2005 0:00",
   "10/01/2005 0:20",
   "10/01/2005 0:40",
   "10/01/2005 1:00",
   "10/01/2005 1:20")
var1 <- c(1,2,3,4,5)
var2 <- c(10,20,30,40,50)
df <- data.frame(dateTime = dateTime, var1 = var1, var2 = var2)
df$dateTime <- strptime(df$dateTime,"%m/%d/%Y %H:%M")

I want to create 10 minute interval data as follows:

minTime <- min(df$dateTime)
maxTime <- max(df$dateTime)
newTime <- seq(minTime,maxTime,600)
newDf <- data.frame(newDateTime = newTime)
newDf <- merge(newDf,df,by.x = "newDateTime",by.y = "dateTime",all.x = TRUE)

The objective here is to create a data frame with values from df for the
datetime in df and NA for the missing ones. However, I am getting the
following data frame with both Var1 and Var2 having all NAs.


newDf

   newDateTime var1 var2
1 2005-10-01 00:00:00   NA   NA
2 2005-10-01 00:10:00   NA   NA
3 2005-10-01 00:20:00   NA   NA
4 2005-10-01 00:30:00   NA   NA
5 2005-10-01 00:40:00   NA   NA
6 2005-10-01 00:50:00   NA   NA
7 2005-10-01 01:00:00   NA   NA
8 2005-10-01 01:10:00   NA   NA
9 2005-10-01 01:20:00   NA   NA

Can someone help me on how to do the merge based on the two datetime
columns?

Thank you.

Ravi






--
View this message in context: 
http://r.789695.n4.nabble.com/Merging-on-Datetime-Column-tp4636417.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Substitute list value

2012-07-13 Thread Charles Stangor
thanks!

On Fri, Jul 13, 2012 at 4:35 AM, Jessica Streicher  wrote:

> two things:
>
> - R always counts from 1, not from 0
> - listmembers are accessed by using [[ ]] , not [ ]
>
> try
>
> t1[t==ll[[1]], "v"] <- 99
>
> greetings Jessi
>
>
> On 11.07.2012, at 15:47, Charles Stangor wrote:
>
> > I can't seem to determine how to get the name of a list member to
> > substitute:
> >
> > ll <- list("a1" = "a","a2" = "b")
> >
> > t1[t==ll[0], "v"] <- 99
> >
> > why doesn't this substitute to:
> >
> > t1[t=="a", "v"] <- 99
> >
> > Thank you!
> >
> >
> >
> >
> > --
> > Charles Stangor
> > Professor
> >
> >   [[alternative HTML version deleted]]
> >
> > __
> > R-help@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
>
>


-- 
Charles Stangor
Professor and Associate Chair

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Loading R scripts stored behind htaccess

2012-07-13 Thread Gaj Stan (BIGCAT)
Hello,

Within my department I would like to share the latest version(s) of my R 
scripts to my colleagues using subversion. This repository is password 
protected (WebDAV) as it should not be accessed by external persons. 
In the majority of my scripts I load scripts using the source() function, but 
this will now return a ''401 Authorization Required'  error.

I've been browsing through the RCurl documentation, since this seems to be the 
package that will allow me to do this and use the following line of code to 
download the script content and load a single script:

  x <- getURL("http://www.example.com/script.R";, userpwd="key:secret") 
  write(x, "script.R")
  source("script.R")

I have two direct questions related to this:
- Can this be done in an easier/neater way? In some cases there are a 
considerable amount of scripts that need to be downloaded (up to 7 or so). If 
not, then this will be converted to a small function. Ideally the scripts 
should be loaded immediately, without any download steps.
- I'm not really in favor of storing user names and passwords in scripts. Does 
anyone know of a way where R asks the user which login credentials to use, 
stores this in an (encrypted) variable and then uses this to access the website?

Best,

  -- Stan

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] many don't know responses

2012-07-13 Thread Iasonas Lamprianou


Dear all, I am analyzing data from a survey question where I have many "Don'k 
know" responses. Respondents could either say "Don't know" or give a rating 
from 0 to 10. I could analyse the "Don't know" responses separately using a 
logistic regression to see who is the "typical" person who doesnt know and then 
on a separate analysis deal with the 0-10 responses. However, I was wondering 
if there are any R packages which can simultaneously do this, so that I do not 
lose much information and keep my sample size large. In effect, I am looking 
for a package which can simultaneously maximize the loglikelihood of the two 
regression equations. Is this doable or am I talking nonsense?

Thank you for your time
Jason

 
Dr. Iasonas Lamprianou
Department of Social and Political Sciences
University of Cyprus




>
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Merging on Datetime Column

2012-07-13 Thread vioravis
I have the following dataframe with the first column being of type datetime:

dateTime <- c("10/01/2005 0:00",
  "10/01/2005 0:20",
  "10/01/2005 0:40",
  "10/01/2005 1:00",
  "10/01/2005 1:20")
var1 <- c(1,2,3,4,5)
var2 <- c(10,20,30,40,50)
df <- data.frame(dateTime = dateTime, var1 = var1, var2 = var2)
df$dateTime <- strptime(df$dateTime,"%m/%d/%Y %H:%M")

I want to create 10 minute interval data as follows:

minTime <- min(df$dateTime)
maxTime <- max(df$dateTime)
newTime <- seq(minTime,maxTime,600)
newDf <- data.frame(newDateTime = newTime)
newDf <- merge(newDf,df,by.x = "newDateTime",by.y = "dateTime",all.x = TRUE)

The objective here is to create a data frame with values from df for the
datetime in df and NA for the missing ones. However, I am getting the
following data frame with both Var1 and Var2 having all NAs.

> newDf
  newDateTime var1 var2
1 2005-10-01 00:00:00   NA   NA
2 2005-10-01 00:10:00   NA   NA
3 2005-10-01 00:20:00   NA   NA
4 2005-10-01 00:30:00   NA   NA
5 2005-10-01 00:40:00   NA   NA
6 2005-10-01 00:50:00   NA   NA
7 2005-10-01 01:00:00   NA   NA
8 2005-10-01 01:10:00   NA   NA
9 2005-10-01 01:20:00   NA   NA

Can someone help me on how to do the merge based on the two datetime
columns?

Thank you.

Ravi






--
View this message in context: 
http://r.789695.n4.nabble.com/Merging-on-Datetime-Column-tp4636417.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Column create and Update using function

2012-07-13 Thread Rantony
Hi,

here i have a Max and Min values
Min <-3
Max <-6
and also a matrix like this,

ABCXYZ PQR
--   ------
2 43
5 48
7 13

In this i need to check each particular column values are between Max and
Min value.
If the coulmn value not coming between Max and Min, then i need to create
another coulmn
by adding column name header with "_QF". and assign a string like "RC" for
that particular row.

For eg:- i need to checkout coulmn "ABC".
Here  2,5,6 are the values we need to checkout with Min,Max values. and here
Min <-3,Max <-6
First need to create a new column called "ABC_QF" with current matrix.

ABCXYZ PQR  ABC_QF
--   ------ ---
2 43   RC
5 48
7 13

Next, for 5 , it coming in between 3 to 6. so nothing to do.

ABCXYZ PQR  ABC_QF
--   ------ ---
2 43   RC
5 48
7 13

Next, for 7 , its not coming in between 3 to 6. so put "RC"

ABCXYZ PQR  ABC_QF
--   ------ ---
2 43   RC
5 48
7 13   RC
---
This is the requirement. i did it using for-loop,it will check each value
and it taking time when bulk of data come.
Any hope to do using lappy,appy kind of functions ? Because at a time
complete coulmn should get update.
Could you please help me urgently ?

- Thanks
Antony. 

--
View this message in context: 
http://r.789695.n4.nabble.com/Column-create-and-Update-using-function-tp4636400.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Help with R2 OpenBUGs

2012-07-13 Thread SuzieK
Hi, I'm currently working on the below codes however whenever I run it in
openbugs it gives an error message saying: unknown type of logical function
error pos 76.  Any help would be appreciated. 

## bugs code
library(R2OpenBUGS)
sink("C:/Users/CCF/Documents/Suzie Work/PTY Project/Waterhole
Correction/ungulate.txt")
cat("
model{

# hyperparameters
# habitat effects for each functional group
g <- length(table(G))
for(i in 1:g){  # number of functional group
for(j in 1:3){  # number of habitat type
mu.h[i,j] ~ dnorm(0,0.0001) I(-5,5)
sigma.h[i,j] ~ dunif(0,5)
tau.h[i,j] <- 1/(sigma.h[i,j]*sigma.h[i,j])
}
}

# detectability
mu.r ~ dnorm(0,0.0001) I(-5,5)
sigma.r ~ dunif(0,5)
tau.r <- 1/(sigma.r*sigma.r)

psi ~ dunif(0,1)# inclusion rate that generates wi

# proportion of number of species among groups
for(i in 1:g){
prop[i] ~ dgamma(1,1)
prob[i] <- prop[i]/sum(prop[])
}

for(i in 1:(S+m)){
r[i] ~ dnorm(mu.r,tau.r) I(-5,5)# generating parameters 
related to
detectability
p[i] <- 1/(1+exp(-(r[i])))  # individual-level detection 
probability
w[i] ~ dbern(psi)   # indicator variable 
whether each species is exposed
to sampling or not
G[i] ~ dcat(prob[1:g])  # group identity
for(h in 1:3){  # habitat effects
habitat.eff[i,h] ~ dnorm(mu.h[G[i],h],tau.h[G[i],h]) I(-5,5)
}
for(j in 1:20){ # fitting process
# ecological process model
lambda[i,j] <-  exp(habitat.eff[i,habitat[j]])
Z[i,j] ~ dpois(lambda[i,j]) # latent abundance of each species at 
each site
at each visit
A[i,j] <- Z[i,j]*w[i]   # latent abundance only for species 
exposed to
sampling
for(v in 1:7){
# detection process model
AY[i,j,v] ~ dbin(p[i],A[i,j])
}
}
# group identity (indicator variable)
G1[i] <- equals(G[i],1)
G2[i] <- equals(G[i],2)
}

for(i in 1:(S+m)){
for(j in 1:20){
A1[i,j] <- A[i,j]*G1[i]
A2[i,j] <- A[i,j]*G2[i]
O[i,j] <- step(A[i,j]-1)# latent occupancy of each 
species for each site
O1[i,j] <- O[i,j]*G1[i]
O2[i,j] <- O[i,j]*G2[i]
}
}

for(j in 1:20){
AB0[j] <- sum(A[,j])
AB1[j] <- sum(A1[,j])
AB2[j] <- sum(A2[,j])
SpR0[j] <- sum(O[,j])
SpR1[j] <- sum(O1[,j])
SpR2[j] <- sum(O2[,j])
}

for(i in 1:3){
for(j in 1:7){
HabAB0[i,j] <- AB0[hab[i,j]]
HabAB1[i,j] <- AB1[hab[i,j]]
HabAB2[i,j] <- AB2[hab[i,j]]
HabSpR0[i,j] <- SpR0[hab[i,j]]
HabSpR1[i,j] <- SpR1[hab[i,j]]
HabSpR2[i,j] <- SpR2[hab[i,j]]
}
HAB0[i] <- mean(HabAB0[i,])
HAB1[i] <- mean(HabAB1[i,])
HAB2[i] <- mean(HabAB2[i,])
HSpR0[i] <- mean(HabSpR0[i,])
HSpR1[i] <- mean(HabSpR1[i,])
HSpR2[i] <- mean(HabSpR2[i,])
}

R <- sum(w[1:(S+m)])# estimating unknown number of species that occupy any
sites

}
",fill=TRUE)
sink()

data <- list("m","S",
"habitat",
"G","g",
#   "date",
"hab",
"AY")
g <- length(table(G))   
inits <- function()list(
mu.h=matrix(rnorm(6),nrow=2),sigma.h=matrix(runif(6),nrow=2),
mu.r=rnorm(1),sigma.r=runif(1),
r=rnorm(S+m),
prop=runif(n=g,min=0,max=7),
Z=array(rpois(n=(S+m)*20,lambda=20),dim=c((S+m),20)),
w=rep(1,(S+m)),psi=runif(1))
parameters <- c(
"mu.h","sigma.h",
"habitat.eff",
"mu.r","sigma.r",
"r",
"R","psi",
"A",
"O",
"HAB0","HAB1","HAB2",
"HSpR0","HSpR1","HSpR2"
)

out <- bugs (data, inits, parameters,
model.file="C:/Users/CCF/Documents/Suzie Work/PTY Project/Waterhole
Correction/ungulate.txt",
OpenBUGS.pgm="C:/Users/CCF/Desktop/OpenBUGS/OpenBUGS321/OpenBUGS.exe",
n.thin=10, n.burnin=100, n.chains=3,n.iter=1000, debug=TRUE)

--
View this message in context: 
http://r.789695.n4.nabble.com/Help-with-R2-OpenBUGs-tp4636412.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Two R sessions on multicore computer seem to inhibit each other ?

2012-07-13 Thread Ulrike Grömping
Thanks to both of you, you are probably right that memory is the limiting
factor.  I have no knowledge about the available memory on the lab machines,
but I will find out and make sure that this is the explanation.

Best,
Ulrike

--
View this message in context: 
http://r.789695.n4.nabble.com/Two-R-sessions-on-multicore-computer-seem-to-inhibit-each-other-tp4636336p4636396.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] plot graph by first letter

2012-07-13 Thread arun


Hello,

This information is new.  Are you able to import data from access?

A.K.

- Original Message -
From: imnew 
To: r-help@r-project.org
Cc: 
Sent: Thursday, July 12, 2012 9:52 PM
Subject: Re: [R] plot graph by first letter


Hi, for my dataset is actually retrieve from a access file not a textfile.thanks

Date: Thu, 12 Jul 2012 11:58:30 -0700
From: ml-node+s789695n4636343...@n4.nabble.com
To: jubil...@live.com.sg
Subject: Re: plot graph by first letter



    Hi,


Try this:

dat1<-read.table(text="

Name                      Age

Angel                        20

Amelia                      20

Bernard                  19

Stephanie              20

Vanessa                  22

Angeline                  23

Camel                      21

",sep="",header=TRUE)

dat2<-dat1[grepl("[A].*",dat1$Name),]

dat2

      Name Age

1    Angel  20

2   Amelia  20

6 Angeline  23


rownames(dat2)<-1:nrow(dat2)

#Now you might be okay to plot.


?plot()

A.K.




- Original Message -

From: imnew <[hidden email]>

To: [hidden email]

Cc: 

Sent: Thursday, July 12, 2012 4:15 AM

Subject: [R] plot graph by first letter


Hi all, may i know is it possible to plot a graph by first letter?

for example: 


Name:                      Age:

Angel                        20

Amelia                      20

Bernard                   19

Stephanie               20

Vanessa                   22

Angeline                  23

Camel                       21


If I want to plot the name started with letter 'A' and their Angel, how is

it possible to plot it?

Thanks.


--

View this message in context: 
http://r.789695.n4.nabble.com/plot-graph-by-first-letter-tp4636266.html
Sent from the R help mailing list archive at Nabble.com.


__

[hidden email] mailing list

https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



__

[hidden email] mailing list

https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


    
    

    

    
    
        If you reply to this email, your message will be added to the 
discussion below:
        
http://r.789695.n4.nabble.com/plot-graph-by-first-letter-tp4636266p4636343.html
    
    
        
        To unsubscribe from plot graph by first letter, click here.

        NAML
                               

--
View this message in context: 
http://r.789695.n4.nabble.com/plot-graph-by-first-letter-tp4636266p4636383.html
Sent from the R help mailing list archive at Nabble.com.
    [[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] how to add a variable to a graph title

2012-07-13 Thread JeffND
Thank you for this useful code!

--
View this message in context: 
http://r.789695.n4.nabble.com/how-to-add-a-variable-to-a-graph-title-tp4636364p4636405.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] tck = 0 for the colorkey?

2012-07-13 Thread Deepayan Sarkar
On Fri, Jul 13, 2012 at 12:21 PM, Martin Ivanov  wrote:
>  Dear R users,
>
> I am struggling with the colorkey on a levelplot lattice graphic.
> I want that no ticks are printed on the colorkey. That is, I want their size 
> tck=0.
> Merely setting tck=0 t in the colorkey parameter does not work. Setting it in 
> the lattice.par.set()
> removes the ticks from the levelplot, but not from the colorkey.

The tick lengths are hard-coded. I will try to change that; meanwhile,
all I can suggest is something like

 levelplot(volcano, colorkey = list(axis.line = list(col = NA)))

which unfortunately also removes the border around the colorkey.

-Deepayan

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Package 'MASS' (polr): Error in svd(X) : infinite or missing values in 'x'

2012-07-13 Thread Jessica Streicher
You could probably make them numeric, like

> v<-c("a","a","b","c")
> f<-factor(v)
> as.numeric(f)
[1] 1 1 2 3

to get a numeric "rock_id", but i wouldn't per se recommend it.  

You should ask someone who knows more about the scientific side of this method 
to tell you how factorial data is properly treated.


On 12.07.2012, at 03:35, Jeremy Little wrote:

> Thanks Jessi,
> 
> your insights are extremely helpful.
> 
> If you would indulge me one more quick question on your script.
> You have written...
> newData<-data.frame(JVeg5=factor(Jdata[,"JVeg5"]),scale(Jdata[,c("Elevation","Lat_Y_pos","Coast_dist","Stream_dist")]))
> 
> I wish to expand this analysis for all other variables in my data matrix, of
> which one is a factor (and therefore cannot be 'scaled').
> 
> Adding these variables to your script...
> newData<-data.frame(JVeg5=factor(Jdata[,"JVeg5"]),scale(Jdata[,c("Elevation",
> "Slope", "Aspect", "Hillshade", "Lat_Y_pos", "Coast_dist", "Coast_SE",
> "Coast_E", "Wind_310", "Stream_dist", "TPI", "Landform", "Rock_Name")]))
> 
> ...returns the error:
> "Error in colMeans(x, na.rm = TRUE) : 'x' must be numeric"
> 
> because "Rock_Name" must be numeric to be scaled.
> 
> I've tried a couple of options for incorporating this factor (Rock_Name)
> into the script without success.
> For example:
> "newData<-data.frame(JVeg5=factor(Jdata[,"JVeg5"],
> Rock_Name=factor(Jdata[,"Rock_Name"]), scale(Jdata[,c("Elevation", "Slope",
> "Aspect", "Hillshade", "Lat_Y_pos", "Coast_dist", "Coast_SE", "Coast_E",
> "Wind_310", "Stream_dist", "TPI", "Landform")]))"
> 
> Do you have a suggestion which might work for this analysis?
> 
> Thank you for your support with this, I really appreciate it.
> 
> kind regards
> 
> 
> 
> 
> 
> --
> View this message in context: 
> http://r.789695.n4.nabble.com/Package-MASS-polr-Error-in-svd-X-infinite-or-missing-values-in-x-tp4635829p4636244.html
> Sent from the R help mailing list archive at Nabble.com.
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How to find frequent sequences.

2012-07-13 Thread Petr Savicky
On Thu, Jul 12, 2012 at 03:51:54PM -0500, Vineet Shukla wrote:
> I have independent event sequences for example as follows :
> 
> Independent event sequence   1 : A , B , C , D
> Independent event sequence   2 : A, C , B
> Independent event sequence   3 :D, A, B, X,Y, Z
> Independent event sequence   4 :C,A,A,B
> Independent event sequence   5 :B,A,D
> 
> I want to able to find that most common sequence patters as
> 
> {A, B }  = > 3
> from lines 1,3,5.
> 
> Pls note that A,C,B must not be considered because C comes in between
> and line 5 also must not be considered because order of A,B is reversed.

Hi.

If i understand correctly, the first sequence contains patterns

  AB, BC, CD.

Using this interpretation, AB occurs at lines 1,3,4 and not 1,3,5.
Is this correct?

If some sequence contains several ocurrences of a pattern, for example,
the sequence

   A, B, A, B

contains AB twice, then it is counted only once?

If this is correct, then try the following

  # your input list
  lst <- list(
  c("A", "B", "C", "D"),
  c("A", "C", "B"),
  c("D", "A", "B", "X", "Y", "Z"),
  c("C", "A", "A", "B"),
  c("B", "A", "D"))

  # extract unique patterns from a single sequence as rows of a matrix 
  # lpattern is the length of the patterns
  singleSeq <- function(x, lpattern)
  {
  unique(embed(rev(x), lpattern))
  }
 
  lst1 <- lapply(lst, singleSeq, lpattern=2)
  # combine the matrices to a single matrix
  mat <- do.call(rbind, lst1)
  # convert the patters to strings
  pat <- do.call(paste, c(data.frame(mat), sep=""))
  out <- table(pat)
  out

  pat
  AA AB AC AD BA BC BX CA CB CD DA XY YZ 
   1  3  1  1  1  1  1  1  1  1  1  1  1 

  names(out)[which.max(out)]

  [1] "AB"

Hope this helps.

Petr Savicky.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Substitute list value

2012-07-13 Thread Jessica Streicher
two things:

- R always counts from 1, not from 0
- listmembers are accessed by using [[ ]] , not [ ]

try 

t1[t==ll[[1]], "v"] <- 99

greetings Jessi


On 11.07.2012, at 15:47, Charles Stangor wrote:

> I can't seem to determine how to get the name of a list member to
> substitute:
> 
> ll <- list("a1" = "a","a2" = "b")
> 
> t1[t==ll[0], "v"] <- 99
> 
> why doesn't this substitute to:
> 
> t1[t=="a", "v"] <- 99
> 
> Thank you!
> 
> 
> 
> 
> -- 
> Charles Stangor
> Professor
> 
>   [[alternative HTML version deleted]]
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] how to add a variable to a graph title

2012-07-13 Thread mlell08
To use variables in mathematical expressions, bquote can be used:

#Term in in .( ) is evaluated
plot(1:10,main=bquote(frac(alpha,.(2+2


-- 
GnuPG Key: 0x7340821E

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Adjusting format of boxplot

2012-07-13 Thread Gerrit Eichner

Hello, Ivan,

you'll find argument frame (actually frame.plot) explained in

?plot.default

 Regards  --  Gerrit



Hi Peter,

I had never heard of this 'frame' argument and it's a breakthrough for me to 
be finally able to get rid of this frame!


But where is this argument explained? I couldn't find it in plot(), 
boxplot(), bxp() or par().


Thank you for your answer :)
Ivan

--
Ivan CALANDRA
Université de Bourgogne
UMR CNRS/uB 6282 Biogéosciences
6 Boulevard Gabriel
21000 Dijon, FRANCE
+33(0)3.80.39.63.06
ivan.calan...@u-bourgogne.fr
http://biogeosciences.u-bourgogne.fr/calandra

Le 13/07/12 04:48, Peter Ehlers a écrit :

On 2012-07-12 18:39, David L Carlson wrote:
Sorry about that. I got rid of the box another way and then switched to 
using pars=

without making sure it worked. This works:

flies$group <- factor(flies$group, 5:1) # 1
levels(flies$group) <- paste0("Group ", 5:1) # 2

oldpar <- par(bty="n")
boxplot(long ~ group,
 data = flies,
 pars=list(las=1, ylim=c(10, 110), xaxt="n"),
 horizontal = TRUE,
 col = "red")

axis(1, at=seq(10, 110, 20))
par(oldpar)


Or you could add 'frame = FALSE' to the boxplot() call.

Peter Ehlers



-
David

-
David L Carlson
Associate Professor of Anthropology
Texas A&M University
College Station, TX 77840-4352


- Original Message -

From: "darnold" 
To: r-help@r-project.org
Sent: Thursday, July 12, 2012 7:53:33 PM
Subject: Re: [R] Adjusting format of boxplot

Added your code:


flies <- read.table("example12_1.dat",header=TRUE,sep="\t")

flies$group <- factor(flies$group,5:1)

levels(flies$group) <- paste0("Group ",5:1)

boxplot(long ~ group,
data = flies,
pars = list(las=1, ylim=c(10,110), xaxt="n", bty="n"),
horizontal = TRUE,
col = "red")

axis(1,at=seq(10,110,20))

Almost worked perfectly, except the frame around the plot remains, which 
is

strange as you have bty="n".

http://r.789695.n4.nabble.com/file/n4636381/Rplot11.png

David



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide 
http://www.R-project.org/posting-guide.html

and provide commented, minimal, self-contained, reproducible code.




__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] minor axis ticks in trellis graphics?

2012-07-13 Thread Martin Ivanov
 Dear R users,

I need to add minor axis ticks to my graph. In traditional R this is easily 
achievable by simply
adding a second axis with the minor ticks. But how to do that in trellis? I am 
already out of ideas.

Any suggestions will be appreciated.

Best regards,

Martin

-
Гражданска отговорност – Цените на компаниите
http://www.sdi.bg/onlineInsurance/?utm_source=gbg&utm_medium=txtLink&utm_content=home

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Adjusting format of boxplot

2012-07-13 Thread Ivan Calandra

Hi Peter,

I had never heard of this 'frame' argument and it's a breakthrough for 
me to be finally able to get rid of this frame!


But where is this argument explained? I couldn't find it in plot(), 
boxplot(), bxp() or par().


Thank you for your answer :)
Ivan

--
Ivan CALANDRA
Université de Bourgogne
UMR CNRS/uB 6282 Biogéosciences
6 Boulevard Gabriel
21000 Dijon, FRANCE
+33(0)3.80.39.63.06
ivan.calan...@u-bourgogne.fr
http://biogeosciences.u-bourgogne.fr/calandra

Le 13/07/12 04:48, Peter Ehlers a écrit :

On 2012-07-12 18:39, David L Carlson wrote:
Sorry about that. I got rid of the box another way and then switched 
to using pars=

without making sure it worked. This works:

flies$group <- factor(flies$group, 5:1) # 1
levels(flies$group) <- paste0("Group ", 5:1) # 2

oldpar <- par(bty="n")
boxplot(long ~ group,
 data = flies,
 pars=list(las=1, ylim=c(10, 110), xaxt="n"),
 horizontal = TRUE,
 col = "red")

axis(1, at=seq(10, 110, 20))
par(oldpar)


Or you could add 'frame = FALSE' to the boxplot() call.

Peter Ehlers



-
David

-
David L Carlson
Associate Professor of Anthropology
Texas A&M University
College Station, TX 77840-4352


- Original Message -

From: "darnold" 
To: r-help@r-project.org
Sent: Thursday, July 12, 2012 7:53:33 PM
Subject: Re: [R] Adjusting format of boxplot

Added your code:


flies <- read.table("example12_1.dat",header=TRUE,sep="\t")

flies$group <- factor(flies$group,5:1)

levels(flies$group) <- paste0("Group ",5:1)

boxplot(long ~ group,
data = flies,
pars = list(las=1, ylim=c(10,110), xaxt="n", bty="n"),
horizontal = TRUE,
col = "red")

axis(1,at=seq(10,110,20))

Almost worked perfectly, except the frame around the plot remains, 
which is

strange as you have bty="n".

http://r.789695.n4.nabble.com/file/n4636381/Rplot11.png

David



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide 
http://www.R-project.org/posting-guide.html

and provide commented, minimal, self-contained, reproducible code.




__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] remove loop which compares row i to row i-1

2012-07-13 Thread Rui Barradas

Hello,

Works unchanged with me.
Yesterday it could have worked for some other reason, like having other 
variables in my environment, which I had, but this time I have started 
anew. Try including


tUnitsmall <- tUnitsort[, cols]

and then use this data.frame to see what happens.

Rui Barradas

Em 12-07-2012 22:43, jcrosbie escreveu:

I'm sorry but I see to be getting an error.Â


data.tmp <- aggregate(MyTo ~ Date + Hour, data = tUnitsort[, cols], max) Error 
in `[.default`(xj, i) : invalid subscript type 'builtin'






  From: Rui Barradas [via R] 
To: jcrosbie 
Sent: Thursday, July 12, 2012 3:12 PM
Subject: Re: remove loop which compares row i to row i-1


Hello,

I've not been following this thread but this seems ndependent from
previous posts. Try the following.



url <- "http://r.789695.n4.nabble.com/file/n4636337/BR3_2011_New.csv";
tUnitsort <- read.csv(url, header=TRUE)

cols <- sapply(c("Date", "Hour", "BlockNumber", "MyTo"), function(x)
         grep(x, names(tUnitsort)))

# This does it
# Use full data.frame 'tUnitsort' if you want
data.tmp <- aggregate(MyTo ~ Date + Hour, data = tUnitsort[, cols], max)
data.tmp <- merge(tUnitsort[, cols], data.tmp, by=c("Date", "Hour"))

# Make it pretty
idx <- grep("MyTo", names(data.tmp))
names(data.tmp)[idx] <- c("MyTo", "NewColumn")

# See it
head(data.tmp, 20)
tail(data.tmp, 20)


Also, you should quote the context. Many, almost all of us do NOT read
the posts on Nabble. And Nabble does have a "quote" button.

Hope this helps,

Rui Barradas

Em 12-07-2012 18:55, jcrosbie escreveu:


Thank you,

I am sorry but I am still trying to figure out how to make the function
work.

I have a column called tUnitsort$BlockNumber which can range from 0 to 6.
I have another two columns with the date and the hour ending for the given
day.
Example

Date        Hour   BlockNumber  MyTo  NewColumn
2011-01 Â  1 Â  Â  Â  Â  Â 2 Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â 140 Â  Â  
140
2011-01 Â  1 Â  Â  Â  Â  Â 1 Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  70 Â  Â  Â  
 140
2011-01 Â  1 Â  Â  Â  Â  Â 0 Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  0 Â  Â  Â  
    140
2011-02 Â  1 Â  Â  Â  Â  Â 2 Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  160 Â  Â  
160
2011-02 Â  1 Â  Â  Â  Â  Â 1 Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  70 Â  Â  Â  
 160
2011-02 Â  1 Â  Â  Â  Â  Â 0 Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  0 Â  Â  Â  
   160
2011-03 Â  1 Â  Â  Â  Â  Â 2 Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  150 Â  Â  
150

I want to create a NewColumn which will place the MyTo number for the
highest block number for the rest blocks in a given hour ending within a
day.


ifelse(tUnitsort[,4]>=tUnitsort[-1,4],tUnitsort[,7],tUnitsort[-1,7])

I am unsure how to refference the element before in this case. Â I thought
the -1 was doing this but I believe I'm wrong now.

http://r.789695.n4.nabble.com/file/n4636337/BR3_2011_New.csv
BR3_2011_New.csv

--
View this message in context: 
http://r.789695.n4.nabble.com/remove-loop-which-compares-row-i-to-row-i-1-tp4635327p4636337.html
Sent from the R help mailing list archive at Nabble.com.

__
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.




If you reply to this email, your message will be added to the discussion below:
http://r.789695.n4.nabble.com/remove-loop-which-compares-row-i-to-row-i-1-tp4635327p4636372.html
To unsubscribe from remove loop which compares row i to row i-1, click here.
NAML

--
View this message in context: 
http://r.789695.n4.nabble.com/remove-loop-which-compares-row-i-to-row-i-1-tp4635327p4636376.html
Sent from the R help mailing list archive at Nabble.com.
[[alternative HTML version deleted]]



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] SVAR Restriction on AB-model

2012-07-13 Thread Pfaff, Bernhard Dr.
Hello Veronica,

what makes you think that this is an error? It is a warning that your specified 
SVAR-model is **just** identified and hence an over-identification test cannot 
be conducted. You can suppress this warning by not asking for an 
over-identification in the first place, by setting lrtest = FALSE in your call 
to SVAR(). See ?SVAR (Arguments and Details sections) and the package's 
vignette.
To your second question, provide zero entries in the respective column of the 
A-matrix except for the main-diagonal element. 

Best,
Bernhard

-Ursprüngliche Nachricht-
Von: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] Im 
Auftrag von vero_acurio
Gesendet: Donnerstag, 12. Juli 2012 16:10
An: r-help@r-project.org
Betreff: [R] SVAR Restriction on AB-model

Hello!

I'm doing a svar and when I make the estimation the next error message
appears:

In SVAR(x, Amat = amat, Bmat = bmat, start = NULL, max.iter = 1000,  :
  The AB-model is just identified. No test possible.

Could you help me to interpret it please.

Also I have the identification assumption that one of my shocks is exogenous 
relative to the contemporaneous values of the other variables in the SVAR, 
could you help me with the construction of the restriction matrices A and B of 
the SVAR model please?

Thanks a lot!

Best Regards,

Veronica

--
View this message in context: 
http://r.789695.n4.nabble.com/SVAR-Restriction-on-AB-model-tp4636306.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
*
Confidentiality Note: The information contained in this ...{{dropped:10}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.