date:20090719

[R] R-package question

2009-07-19 Thread spime


Dear R-users,

Suppose I want to modify and use internal functions of an R-package as my
requirement. By any way is it possible to explore the internal coding
structure of a package and get a list of internal functions? 

thanks.
-- 
View this message in context: 
http://www.nabble.com/R-package-question-tp24554613p24554613.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Zinb for Non-interger data

2009-07-19 Thread JPS2009


Sorry bit of a Newbie question, and I promise I have searched the forum
already, but I'm getting a bit desperate!

I have over-dispersed, zero inflated data, with variance greater than the
mean, suggesting Zero-Inflated Negative Binomial - which I attempted in R
with the pscl package suggested on
http://www.ats.ucla.edu/stat/R/dae/zinbreg.htm

However my data is non-integer with some pesky decimals (i.e. 33.12) and
zinb / pscl doesn't like that - not surprising as zinb is for count data,
normally whole integers etc.

Does anyone know of a different zinb package that will allow non-integers or
and equivalent test/ model to zinb for non-integer data? Or should I try
something else like a quasi-Poisson GLM?

Apologies for the Newbie question! Any help much appreciated!
Thanks!
-- 
View this message in context: 
http://www.nabble.com/Zinb-for-Non-interger-data-tp24550044p24550044.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Problem With Repeated Use Of Load/Save/Close Commands

2009-07-19 Thread Marilyn Rich Short


Duncan,

Thank you very much for your help. Your first work-around solved my 
problem, even if the second one didn't. I'm not sure if it was the 
elimination of the file() or the close() commands, or both,  that did 
it.


You got me going again after a couple of weeks of casting about. My hat 
is off to you!


Thanks,
Rich


- Original Message - 
From: Duncan Murdoch murd...@stats.uwo.ca

To: Marilyn  Rich Short rm.sh...@comcast.net
Cc: r-help@r-project.org
Sent: Saturday, July 18, 2009 8:21 AM
Subject: Re: [R] Problem With Repeated Use Of Load/Save/Close Commands



On 18/07/2009 11:08 AM, Duncan Murdoch wrote:

On 17/07/2009 7:57 PM, Marilyn  Rich Short wrote:

Hello,

I'm having a problem in R with repeated use of the load, save, 
and close commands. I'm getting an error message that reads, Too 
many open files. I'm opening files and closing them (and unlinking 
them), but when I go through that process 509 times, the program 
halts and I get this error message: cannot open the connection 
with warning messages: Too many open files.  I've been working on 
this problem for a couple of weeks and have gleaned a bit of info 
from different internet threads, but no solutions yet.


I'm using Windows XP, SP3 and R 2.9.1.
Here is my session info:

R version 2.9.1 (2009-06-26)
i286-pc-mingw32

locale:
LC_COLLATE=English_United States.1252;LC_CTYPE=English_United
States. 1252;LC_MONETARY=English_United
States. 1252;LC_NUMERIC=C;LC_TIME=English_United States.1252

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base

I'm using the vanilla load of R, with nothing added after booting 
up.

The problem also occurs on my Vista machine as well.

The program below will induce the problem.

   junk   = 1
   PathA= tempdir()
   conX = paste(PathA,junk,sep=\\)
   outJunk  = file(conX, open=wb)
   save(junk, file=outJunk)
   close(outJunk)
   for(i in 1:4000){
   outMIA = file(conX, open=rb)
   load(file=outMIA)
   close(outMIA)
   closeAllConnections()
   unlink(conX)
   rm(outMIA)
   rm(junk)
   cat( i = ,i,sep= )
   gc()
   zzz = showConnections(all=FALSE)
   cat( zzz = ,zzz,\n,sep= )
   }

There is some talk on the internet that some windows systems have a 
limit of 512 files that can be open at one time. Even though I'm 
closing my files each time, something is keeping track of how many 
times I've opened and closed a file in a session. I've talked to 
Microsoft and run a test program in Visual Studio C#, and, at the 
moment, it looks like the problem does not lie in the Microsoft 
arena. The C# program performed a similar task 10,000 times without 
a problem. I'm not totally convinced, but the current evidence says 
to look elsewhere.


I've attached a script that will induce the problem. However, be 
warned that, if you use it, you will have to get out of R after you 
run it. R will no longer be able to access files such as help or 
sessionInfo(). This can be solved by getting out of the R GUI and 
back in.


R E-mails from as far back as 2006 ask for help on the issue. There 
have been several suggestions, but no confirmed solution that I can 
find. You will see my attempt at these suggestions in the script [ 
rm(outMIA); rm(junk); closeAllConnections(); and gc(); after 
close(outMIA);  and unlink(conX);]. For me, this becomes important 
because it limits the total number if iterations in nested do-loops. 
This is where I ran into the problem. The program above will allow 
you to reproduce the problem.


Any suggestion would be greatly appreciated.


I've somewhat localized the problem.  load(file=outMIA) passes the 
outMIA connection to gzcon(), which handles decompression of the 
data. gzcon() re-opens the file, and it looks as though the original 
file handle is lost, because closing either the result of gzcon() or 
the original connection results in only the second file handle being 
closed.


So a second workaround besides the one I sent earlier is just not to 
open or close outMIA.  That is, rewrite it as


junk   = 1
PathA= tempdir()
conX = paste(PathA,junk,sep=\\)
outJunk  = file(conX, open=wb)
save(junk, file=outJunk)
close(outJunk)
for(i in 1:4000){
outMIA = file(conX)
load(file=outMIA)
rm(outMIA)
rm(junk)
cat( i = ,i,sep= )
gc()
zzz = showConnections(all=FALSE)
cat( zzz = ,zzz,\n,sep= )
}

so that load() does the open and close, and things are fine.


Oops, not quite fine.  There's a whole set of warnings...

Duncan Murdoch


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal,

[R] transform(_data,...) using strptime gives an error

2009-07-19 Thread amvds

I have timstamped data like this:

 sd[1:10,]
 Tstamp Density Mesh50 Mesh70 Mesh100 Mesh150 Mesh200
2  2009/02/27 07:0030.50.7   10.721.432.841.6
3  2009/02/27 08:0032.21.6   12.423.334.543.0
4  2009/02/27 09:0032.74.8   13.024.035.143.5
5  2009/02/27 10:0026.70.36.517.628.136.9
6  2009/02/27 11:0026.60.96.617.028.637.9
7  2009/02/27 12:0023.36.33.414.025.534.6
8  2009/02/27 13:0025.21.15.115.427.336.8
9  2009/02/27 14:0028.60.28.719.430.940.0
10 2009/02/27 15:0028.00.68.018.630.239.3
11 2009/02/27 16:0028.30.98.318.930.539.5

The timstamps are character vectors:

 str(sd)
'data.frame':   591 obs. of  7 variables:
 $ Tstamp : chr  2009/02/27 07:00 2009/02/27 08:00 2009/02/27 09:00
2009/02/27 10:00 ...
 $ Density: num  30.5 32.2 32.7 26.7 26.6 23.3 25.2 28.6 28 28.3 ...
 $ Mesh50 : num  0.7 1.6 4.8 0.3 0.9 6.3 1.1 0.2 0.6 0.9 ...
 $ Mesh70 : num  10.7 12.4 13 6.5 6.6 3.4 5.1 8.7 8 8.3 ...
 $ Mesh100: num  21.4 23.3 24 17.6 17 14 15.4 19.4 18.6 18.9 ...
 $ Mesh150: num  32.8 34.5 35.1 28.1 28.6 25.5 27.3 30.9 30.2 30.5 ...
 $ Mesh200: num  41.6 43 43.5 36.9 37.9 34.6 36.8 40 39.3 39.5 ...
 - attr(*, na.action)=Class 'exclude'  Named int [1:58] 1 88 89 90 250
318 319 320 321 322 ...
  .. ..- attr(*, names)= chr [1:58] 1 88 89 90 ...

Trying to transform the timestamped character vector 'in place' using
transform gives this error message:

 sd-transform(sd,Tstamp=strptime(Tstamp,format='%Y/%m/%d %H:%M'))
Error in `[-.data.frame`(`*tmp*`, inx[matched], value = list(Tstamp =
list( :
  replacement element 1 has 9 rows, need 591

Why? It beats me...

I do have a backup of course:

td-strptime(sd$Tstamp,format='%Y/%m/%d %H:%M')
sd-data.frame(Tstamp=td, sd[2:7])

this works fine but is one step more complicated. Something I miss about
transform()?

Thanks in advance,
Alex van der Spek

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] transform(_data,...) using strptime gives an error

2009-07-19 Thread Gabor Grothendieck

strptime produces a POSIXlt result and you likely intended
POSIXct instead:

as.POSIXct(x, format = %Y/%m/%d %H:%M)

for storing in a data frame. Also to avoid time zone
problems down the line you might want to use chron:

library(chron)
as.chron(x, format = %Y/%m/%d %H:%M)

See R News 4/1.

On Sun, Jul 19, 2009 at 6:26 AM, am...@xs4all.nl wrote:
 I have timstamped data like this:

 sd[1:10,]
             Tstamp Density Mesh50 Mesh70 Mesh100 Mesh150 Mesh200
 2  2009/02/27 07:00    30.5    0.7   10.7    21.4    32.8    41.6
 3  2009/02/27 08:00    32.2    1.6   12.4    23.3    34.5    43.0
 4  2009/02/27 09:00    32.7    4.8   13.0    24.0    35.1    43.5
 5  2009/02/27 10:00    26.7    0.3    6.5    17.6    28.1    36.9
 6  2009/02/27 11:00    26.6    0.9    6.6    17.0    28.6    37.9
 7  2009/02/27 12:00    23.3    6.3    3.4    14.0    25.5    34.6
 8  2009/02/27 13:00    25.2    1.1    5.1    15.4    27.3    36.8
 9  2009/02/27 14:00    28.6    0.2    8.7    19.4    30.9    40.0
 10 2009/02/27 15:00    28.0    0.6    8.0    18.6    30.2    39.3
 11 2009/02/27 16:00    28.3    0.9    8.3    18.9    30.5    39.5

 The timstamps are character vectors:

 str(sd)
 'data.frame':   591 obs. of  7 variables:
  $ Tstamp : chr  2009/02/27 07:00 2009/02/27 08:00 2009/02/27 09:00
 2009/02/27 10:00 ...
  $ Density: num  30.5 32.2 32.7 26.7 26.6 23.3 25.2 28.6 28 28.3 ...
  $ Mesh50 : num  0.7 1.6 4.8 0.3 0.9 6.3 1.1 0.2 0.6 0.9 ...
  $ Mesh70 : num  10.7 12.4 13 6.5 6.6 3.4 5.1 8.7 8 8.3 ...
  $ Mesh100: num  21.4 23.3 24 17.6 17 14 15.4 19.4 18.6 18.9 ...
  $ Mesh150: num  32.8 34.5 35.1 28.1 28.6 25.5 27.3 30.9 30.2 30.5 ...
  $ Mesh200: num  41.6 43 43.5 36.9 37.9 34.6 36.8 40 39.3 39.5 ...
  - attr(*, na.action)=Class 'exclude'  Named int [1:58] 1 88 89 90 250
 318 319 320 321 322 ...
  .. ..- attr(*, names)= chr [1:58] 1 88 89 90 ...

 Trying to transform the timestamped character vector 'in place' using
 transform gives this error message:

 sd-transform(sd,Tstamp=strptime(Tstamp,format='%Y/%m/%d %H:%M'))
 Error in `[-.data.frame`(`*tmp*`, inx[matched], value = list(Tstamp =
 list( :
  replacement element 1 has 9 rows, need 591

 Why? It beats me...

 I do have a backup of course:

 td-strptime(sd$Tstamp,format='%Y/%m/%d %H:%M')
 sd-data.frame(Tstamp=td, sd[2:7])

 this works fine but is one step more complicated. Something I miss about
 transform()?

 Thanks in advance,
 Alex van der Spek

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] R-package question

2009-07-19 Thread Duncan Murdoch


spime wrote:

Dear R-users,

Suppose I want to modify and use internal functions of an R-package as my
requirement. By any way is it possible to explore the internal coding
structure of a package and get a list of internal functions? 

  
The best way is to download the source to the package, which will be on 
CRAN if that's where you got the package.


You can see a deparsed version of the R code just by asking to print the 
functions (possibly prefixed with pkgname::: if they aren't exported), 
but the source is always better.


Duncan Murdoch

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Zinb for Non-interger data

2009-07-19 Thread Ted Harding

On 18-Jul-09 17:26:36, JPS2009 wrote:
 Sorry bit of a Newbie question, and I promise I have searched the
 forum already, but I'm getting a bit desperate!
 
 I have over-dispersed, zero inflated data, with variance greater
 than the mean, suggesting Zero-Inflated Negative Binomial - which
 I attempted in R with the pscl package suggested on
 http://www.ats.ucla.edu/stat/R/dae/zinbreg.htm
 
 However my data is non-integer with some pesky decimals (i.e. 33.12)
 and zinb / pscl doesn't like that - not surprising as zinb is for
 count data, normally whole integers etc.
 
 Does anyone know of a different zinb package that will allow
 non-integers or and equivalent test/ model to zinb for non-integer
 data? Or should I try something else like a quasi-Poisson GLM?
 
 Apologies for the Newbie question! Any help much appreciated!
 Thanks!

The presence of decimals suggests that those data values are records
of quantities which ought to be modelled as continuous variables.
For instance, in answer to a survey question How much did you spend
on alcoholic drinks yesterday, the answer would be either a positive
sum of money (with decimals), or zero, depending on whether the
person spent anything at all on alcohol.

So:
With probability p, the amount spent was positive and, conditional
on being positive, has a distribution which can be modelled by a
particular continuous distribution (maybe Log-normal?).

With probability (1-p), the amount spent was zero.

So a correct approach first requires you to face the question of
how to model the positive part of the distribution.

Once you have settled that question, it is then possible to see
whether that particular class of problem is covered by some package
in R, or whether you need to develop an approach yourself.

In any case, if I am barking up the right tree above, neither negative
binomial nor Poisson would, in principle, be correct for such data
since, as you observe, these are intended for count data, not for
data which is essentially continuous.

Hoping this helps,
Ted.


E-Mail: (Ted Harding) ted.hard...@manchester.ac.uk
Fax-to-email: +44 (0)870 094 0861
Date: 19-Jul-09   Time: 12:25:39
-- XFMail --

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Can I use mcnemar.test for 3*3 tables (or is there a bug in the command?)

2009-07-19 Thread David Freedman


There is a function mh_test in the coin package.  

library(coin)
mh_test(tt)

The documentation states, The null hypothesis of independence of row and
column totals is tested. The corresponding test for binary factors x and y
is known as McNemar test. For larger tables, Stuart’s W0 statistic (Stuart,
1955, Agresti, 2002, page 422, also known as Stuart-Maxwell test) is
computed.

hth, david freedman


Tal Galili wrote:
 
 Hello all,
 
 I wish to perform a mcnemar test for a 3 by 3 matrix.
 By running the slandered R command I am getting a result but I am not sure
 I
 am getting the correct one.
 Here is an example code:
 
 (tt -  as.table(t(matrix(c(1,4,1,
 0,5,5,
 3,1,5), ncol = 3
 mcnemar.test(tt, correct=T)
 #And I get:
 McNemar's Chi-squared test
 data:  tt
 McNemar's chi-squared = 7.6667, df = 3, p-value = *0.05343*
 
 
 Now I was wondering if the test I just performed is the correct one.
From looking at the Wikipedia article on mcnemar (
 http://en.wikipedia.org/wiki/McNemar's_test), it is said that:
 The Stuart-Maxwell
 testhttp://ourworld.compuserve.com/homepages/jsuebersax/mcnemar.htm
 is
 different generalization of the McNemar test, used for testing marginal
 homogeneity in a square table with more than two rows/columns
 
From searching for a Stuart-Maxwell
 testhttp://ourworld.compuserve.com/homepages/jsuebersax/mcnemar.htm
 in
 google, I found an algorithm here:
 http://www.m-hikari.com/ams/ams-password-2009/ams-password9-12-2009/abbasiAMS9-12-2009.pdf
 
From running this algorithm I am getting a different P value, here is the
 (somewhat ugly) code I produced for this:
 get.d - function(xx)
 {
   length1 - dim(xx)[1]
   ret1 - margin.table(xx,1) - margin.table(xx,2)
   return(ret1)
 }
 
 get.s - function(xx)
 {
   the.s - xx
   for( i in 1:dim(xx)[1])
   {
 for(j in 1:dim(xx)[2])
 {
   if(i == j)
   {
 the.s[i,j] - margin.table(xx,1)[i] + margin.table(xx,2)[i] -
 2*xx[i,i]
   } else {
 the.s[i,j] - -(xx[i,j] + xx[j,i])
   }
 }
   }
   return(the.s)
 }
 
 chi.statistic - t(get.d(tt)[-3]) %*% solve(get.s(tt)[-3,-3])  %*%
 get.d(tt)[-3]
 paste(the P value:, pchisq(chi.statistic, 2))
 
 #and the result was:
  the P value: 0.268384371053358
 
 
 
 So to summarize my questions:
 1) can I use mcnemar.test for 3*3 (or more) tables ?
 2) if so, what test is being performed (
 Stuart-Maxwellhttp://ourworld.compuserve.com/homepages/jsuebersax/mcnemar.htm)
 ?
 3) Do you have a recommended link to an explanation of the algorithm
 employed?
 
 
 Thanks,
 Tal
 
 
 
 
 
 -- 
 --
 
 
 My contact information:
 Tal Galili
 Phone number: 972-50-3373767
 FaceBook: Tal Galili
 My Blogs:
 http://www.r-statistics.com/
 http://www.talgalili.com
 http://www.biostatistics.co.il
 
   [[alternative HTML version deleted]]
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
 
 

-- 
View this message in context: 
http://www.nabble.com/Can-I-use-%22mcnemar.test%22-for-3*3-tables-%28or-is-there-a-bug-in-the-command-%29-tp24556414p24556693.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Building a big.matrix using foreach

2009-07-19 Thread Jay Emerson

Michael,

If you have a big.matrix, you just want to iterate over the rows.  I'm not
in R and am just making this up on the fly (from a bar in Beijing, if you
believe that):

foreach(i=1:nrow(x),.combine=c) %dopar% f(x[i,])

should work, essentially applying the functin f() to the rows of x?  But
perhaps I misunderstand you.  Please feel free to email me or Mike (
michael.k...@yale.edu) directoy with questions about bigmemory, we are very
interested in applications of it to real problems.

Note that the package foreach uses package iterators, and is very flexible,
in case you need more general iteration in parellel.

Regards,

Jay



Original message:
Hi there!
I have become a big fan of the 'foreach' package allowing me to do a
lot of stuff in parallel. For example, evaluating the function f on
all elements in a vector x is easily accomplished:
foreach(i=1:length(x),.combine=c) %dopar% f(x[i])
Here the .combine=c option tells foreach to combine output using the
c()-function. That is, to return it as a vector.
Today I discovered the 'bigmemory' package, and I would like to
contruct a big.matrix in a parralel fashion row by row. To use foreach
I see no other way than to come up with a substitute for c in the
.combine option. I have checked out the big.matrix manual, but I can't
find a function suitable for just that.
Actually, I wouldn't even know how to do it for a usual matrix. Any clues?
Thanks!
--
Michael Knudsen
micknud...@gmail.com
http://lifeofknudsen.blogspot.com/



-- 
John W. Emerson (Jay)
Associate Professor of Statistics
Department of Statistics
Yale University
http://www.stat.yale.edu/~jay

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Can I use mcnemar.test for 3*3 tables (or is there a bug in the command?)

2009-07-19 Thread Tal Galili

Hello David,Thank you for your answer.

Do you know then what does the mcnemar.test do in the case of a 3*3 table
?
Because the results for the simple example I gave are rather different (P
value of 0.053 VS 0.73)

In case the mcnemar can't really handle a 3*3 matrix (or more), shouldn't
there be an error massage for this case? (if so, who should I turn to, in
order to report this?)

Thanks again,
Tal





On Sun, Jul 19, 2009 at 3:47 PM, David Freedman 3.14da...@gmail.com wrote:


 There is a function mh_test in the coin package.

 library(coin)
 mh_test(tt)

 The documentation states, The null hypothesis of independence of row and
 column totals is tested. The corresponding test for binary factors x and y
 is known as McNemar test. For larger tables, Stuarts W0 statistic (Stuart,
 1955, Agresti, 2002, page 422, also known as Stuart-Maxwell test) is
 computed.

 hth, david freedman


 Tal Galili wrote:
 
  Hello all,
 
  I wish to perform a mcnemar test for a 3 by 3 matrix.
  By running the slandered R command I am getting a result but I am not
 sure
  I
  am getting the correct one.
  Here is an example code:
 
  (tt -  as.table(t(matrix(c(1,4,1,
  0,5,5,
  3,1,5), ncol = 3
  mcnemar.test(tt, correct=T)
  #And I get:
  McNemar's Chi-squared test
  data:  tt
  McNemar's chi-squared = 7.6667, df = 3, p-value = *0.05343*
 
 
  Now I was wondering if the test I just performed is the correct one.
 From looking at the Wikipedia article on mcnemar (
  http://en.wikipedia.org/wiki/McNemar's_test), it is said that:
  The Stuart-Maxwell
  testhttp://ourworld.compuserve.com/homepages/jsuebersax/mcnemar.htm
  is
  different generalization of the McNemar test, used for testing marginal
  homogeneity in a square table with more than two rows/columns
 
 From searching for a Stuart-Maxwell
  testhttp://ourworld.compuserve.com/homepages/jsuebersax/mcnemar.htm
  in
  google, I found an algorithm here:
 
 http://www.m-hikari.com/ams/ams-password-2009/ams-password9-12-2009/abbasiAMS9-12-2009.pdf
 
 From running this algorithm I am getting a different P value, here is the
  (somewhat ugly) code I produced for this:
  get.d - function(xx)
  {
length1 - dim(xx)[1]
ret1 - margin.table(xx,1) - margin.table(xx,2)
return(ret1)
  }
 
  get.s - function(xx)
  {
the.s - xx
for( i in 1:dim(xx)[1])
{
  for(j in 1:dim(xx)[2])
  {
if(i == j)
{
  the.s[i,j] - margin.table(xx,1)[i] + margin.table(xx,2)[i] -
  2*xx[i,i]
} else {
  the.s[i,j] - -(xx[i,j] + xx[j,i])
}
  }
}
return(the.s)
  }
 
  chi.statistic - t(get.d(tt)[-3]) %*% solve(get.s(tt)[-3,-3])  %*%
  get.d(tt)[-3]
  paste(the P value:, pchisq(chi.statistic, 2))
 
  #and the result was:
   the P value: 0.268384371053358
 
 
 
  So to summarize my questions:
  1) can I use mcnemar.test for 3*3 (or more) tables ?
  2) if so, what test is being performed (
  Stuart-Maxwell
 http://ourworld.compuserve.com/homepages/jsuebersax/mcnemar.htm)
  ?
  3) Do you have a recommended link to an explanation of the algorithm
  employed?
 
 
  Thanks,
  Tal
 
 
 
 
 
  --
  --
 
 
  My contact information:
  Tal Galili
  Phone number: 972-50-3373767
  FaceBook: Tal Galili
  My Blogs:
  http://www.r-statistics.com/
  http://www.talgalili.com
  http://www.biostatistics.co.il
 
[[alternative HTML version deleted]]
 
  __
  R-help@r-project.org mailing list
  https://stat.ethz.ch/mailman/listinfo/r-help
  PLEASE do read the posting guide
  http://www.R-project.org/posting-guide.html
  and provide commented, minimal, self-contained, reproducible code.
 
 

 --
 View this message in context:
 http://www.nabble.com/Can-I-use-%22mcnemar.test%22-for-3*3-tables-%28or-is-there-a-bug-in-the-command-%29-tp24556414p24556693.html
 Sent from the R help mailing list archive at Nabble.com.

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




-- 
--


My contact information:
Tal Galili
Phone number: 972-50-3373767
FaceBook: Tal Galili
My Blogs:
http://www.r-statistics.com/
http://www.talgalili.com
http://www.biostatistics.co.il

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Hmisc, Design, summary.Design plot- changing confidence intervals, adding color or decreasing font size

2009-07-19 Thread Frank E Harrell Jr


David Winsemius wrote:


On Jul 18, 2009, at 11:14 AM, Elizabeth Stanny wrote:



Frank,

I had already tried q=c(0.7,0.8,0.9,0.95) as an argument in 
plot.summary.Design and it did not work (i.e., CIs were not plotted).  
Could you point me to an example using plot.summary.Design that uses q 
as argument and has output?  I would greatly appreciate it.


I am encountering an error when using any number of probabilities other 
than 5 and I think it relates to how the confbar col= defaults are set.


plot(s, log=TRUE, at=c(.1,.5,1,1.5,2,4,8), q=c(0.7,0.8,0.9,0.95))
Error in confbar(nbar - (i - is + 1) + 1, effect[i], se[i], q = q, type 
= h,  :

  q and col must have same length

?confbar

 Using the example on the help page with a modified q vector of length 5 
gives output:


 plot(s, log=TRUE, at=c(.1,.5,1,1.5,2,4,8), q = c(0.6, 0.8, 0.95, 0.975, 
0.99))


Experimentation shows that providing a col vector of the same length as 
q also succeeds:


plot(s, log=TRUE, at=c(.1,.5,1,1.5,2,4,8), q=c(0.7,0.8,0.9,0.95), 
col=gray(c(0, 0.25, 0.75, 1)))




Exactly, as per the documentation.  I have just added a sentence to the 
help file to make this more clear.


Thanks
Frank

--
Frank E Harrell Jr   Professor and Chair   School of Medicine
 Department of Biostatistics   Vanderbilt University

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] problem with as.POSIXct and daylight savings time

2009-07-19 Thread Denis Chabot

[was   [R] end of daylight saving time]

Hi,

I got no reply with the previous subject line, probably a bad choice  
of subject on my part, so here it is again.

I read from the help on DateTimeClasses and various posts on this list  
that, quite logically, one needs to specify if DST is active or not  
when time is between 1 and 2 AM on the first Sunday in November (for  
North America in recent years).

This I can do for on date at a time:

a - as.POSIXct(2008-11-02 01:30:00, tz=EST5EDT)  # to get  
automatic use of DST
b - as.POSIXct(2008-11-02 01:30:00, tz=EST)  # to tell T this is  
the second occurrence of 1:30 that day, in ST
difftime(b,a)

Time difference of 1 hours

But why can't I do the following, which appears to be a typical R way  
of doing things, to handle several date-times at once?

c - rep(2008-11-02 01:30:00, 2)
tzone = c(EST5EDT, EST)

as.POSIXct(c, tz=tzone)
Erreur dans strptime(xx, f - %Y-%m-%d %H:%M:%OS, tz = tz) :
  valeur 'tz' incorrecte

???

Thanks,

Denis Chabot

sessionInfo()
R version 2.9.1 Patched (2009-07-09 r48929)
x86_64-apple-darwin9.7.0

locale:
fr_CA.UTF-8/fr_CA.UTF-8/C/C/fr_CA.UTF-8/fr_CA.UTF-8

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base

loaded via a namespace (and not attached):
[1] tools_2.9.1

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Can I use mcnemar.test for 3*3 tables (or is there a bug in the command?)

2009-07-19 Thread Tal Galili

Hello all,

I wish to perform a mcnemar test for a 3 by 3 matrix.
By running the slandered R command I am getting a result but I am not sure I
am getting the correct one.
Here is an example code:

(tt -  as.table(t(matrix(c(1,4,1,
0,5,5,
3,1,5), ncol = 3
mcnemar.test(tt, correct=T)
#And I get:
McNemar's Chi-squared test
data:  tt
McNemar's chi-squared = 7.6667, df = 3, p-value = *0.05343*


Now I was wondering if the test I just performed is the correct one.
From looking at the Wikipedia article on mcnemar (
http://en.wikipedia.org/wiki/McNemar's_test), it is said that:
The Stuart-Maxwell
testhttp://ourworld.compuserve.com/homepages/jsuebersax/mcnemar.htm
is
different generalization of the McNemar test, used for testing marginal
homogeneity in a square table with more than two rows/columns

From searching for a Stuart-Maxwell
testhttp://ourworld.compuserve.com/homepages/jsuebersax/mcnemar.htm
in
google, I found an algorithm here:
http://www.m-hikari.com/ams/ams-password-2009/ams-password9-12-2009/abbasiAMS9-12-2009.pdf

From running this algorithm I am getting a different P value, here is the
(somewhat ugly) code I produced for this:
get.d - function(xx)
{
  length1 - dim(xx)[1]
  ret1 - margin.table(xx,1) - margin.table(xx,2)
  return(ret1)
}

get.s - function(xx)
{
  the.s - xx
  for( i in 1:dim(xx)[1])
  {
for(j in 1:dim(xx)[2])
{
  if(i == j)
  {
the.s[i,j] - margin.table(xx,1)[i] + margin.table(xx,2)[i] -
2*xx[i,i]
  } else {
the.s[i,j] - -(xx[i,j] + xx[j,i])
  }
}
  }
  return(the.s)
}

chi.statistic - t(get.d(tt)[-3]) %*% solve(get.s(tt)[-3,-3])  %*%
get.d(tt)[-3]
paste(the P value:, pchisq(chi.statistic, 2))

#and the result was:
 the P value: 0.268384371053358



So to summarize my questions:
1) can I use mcnemar.test for 3*3 (or more) tables ?
2) if so, what test is being performed (
Stuart-Maxwellhttp://ourworld.compuserve.com/homepages/jsuebersax/mcnemar.htm)
?
3) Do you have a recommended link to an explanation of the algorithm
employed?


Thanks,
Tal





-- 
--


My contact information:
Tal Galili
Phone number: 972-50-3373767
FaceBook: Tal Galili
My Blogs:
http://www.r-statistics.com/
http://www.talgalili.com
http://www.biostatistics.co.il

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Building a big.matrix using foreach

2009-07-19 Thread Michael Kane

Another thing to realize if you are doing this in parallel, f(x[i,]) is
being executed on each of the worker R sessions.  Now, a big.matrix object
is essentially a pointer to an object created in C++ and, a pointer address
space is specific to a process (in this case the master R session).  As a
result, when you call f(x[i,]) in each of the workers, you are trying to use
a pointer that no longer points to the original C++ object.  To get around
this you need to either use a shared.big.matrix or filebacked.big.matrix
object and you need to use the descriptor in the foreach function.

desc = describe(x)
foreach (i=1:nrow(x), .combine=c, .packages='bigmemory') %dopar%
{
  x = attach.big.matrix(desc)
  f(x[i,])
}

The descriptor is just a list of information that is not process specific.
So, in the example, the descriptor is passed to each of the workers.  Then
the worker attaches to the big.matrix object which is shared across R
sessions, and then function is called on the attached big.matrix object.

I may have provided more detail than you were interested.  The thing to take
away is that if you are working with foreach and bigmemory in parallel, you
need to use descriptors to the workers and attach to the big.matrix object.

Thanks,
Mike

On Sun, Jul 19, 2009 at 8:29 AM, Jay Emerson jayemer...@gmail.com wrote:

 Michael,

 If you have a big.matrix, you just want to iterate over the rows.  I'm not
 in R and am just making this up on the fly (from a bar in Beijing, if you
 believe that):

 foreach(i=1:nrow(x),.combine=c) %dopar% f(x[i,])

 should work, essentially applying the functin f() to the rows of x?  But
 perhaps I misunderstand you.  Please feel free to email me or Mike (
 michael.k...@yale.edu) directoy with questions about bigmemory, we are
 very interested in applications of it to real problems.

 Note that the package foreach uses package iterators, and is very flexible,
 in case you need more general iteration in parellel.

 Regards,

 Jay



 Original message:
 Hi there!
 I have become a big fan of the 'foreach' package allowing me to do a
 lot of stuff in parallel. For example, evaluating the function f on
 all elements in a vector x is easily accomplished:
 foreach(i=1:length(x),.combine=c) %dopar% f(x[i])
 Here the .combine=c option tells foreach to combine output using the
 c()-function. That is, to return it as a vector.
 Today I discovered the 'bigmemory' package, and I would like to
 contruct a big.matrix in a parralel fashion row by row. To use foreach
 I see no other way than to come up with a substitute for c in the
 .combine option. I have checked out the big.matrix manual, but I can't
 find a function suitable for just that.
 Actually, I wouldn't even know how to do it for a usual matrix. Any clues?
 Thanks!
 --
 Michael Knudsen
 micknud...@gmail.com
 http://lifeofknudsen.blogspot.com/



 --
 John W. Emerson (Jay)
 Associate Professor of Statistics
 Department of Statistics
 Yale University
 http://www.stat.yale.edu/~jay http://www.stat.yale.edu/%7Ejay


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Hmisc, Design, summary.Design plot- changing confidence intervals, adding color or decreasing font size

2009-07-19 Thread Mike Babyak

Elizabeth Stanny wrote:
 Hi,

 1. I want 95% not 99% confidence intervals in my summary.Design plot
using the Design package. Putting conf.int http://conf.int/=.95 as an
argument in plot does not work.  The  default appears  to be .99 not .95 as
stated in the package Design manual (p. 164).


(hope this works--I haven't posted from the digest before)


It may not be immediately obvious, but the color (col) argument must always
match q in length.  When you change the length of q to display a number of
CIs that is different from the default (5, I think), col is still set to
correspond to that default number of CIs.

So, you also must change col so that it matches q in length.

For example, this will not work because there is no matching col argument:
plot(summymodel,q=.95)

You'll get this error msg:

Error in confbar(nbar - (i - is + 1) + 1, effect[i], se[i], q = q, type =
h,   q and col must have same length

But this will:
plot(summymodel,q=.95,col=2)

If you wanted two sets of CIs, say 95 and 99:
plot(summymodel,q=c(.95,.99),col=c(2,5))


Mike Babyak
Duke Medical Center

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] package X12

2009-07-19 Thread Ben Bolker




Ana Quiterio wrote:
 
 Dear All.
 I can't execute X12GUI() from package x12. I have problems in the step :
 Please choose the x12 binaries.
 Can anyone help me?
 Thanks in advance,
 AnaQ
 
 

 I know barely anything about the package.  I gather from a bit of
googling that X12 is a time-series analysis package from the US
Census Bureau, and that the x12 package provides an R interface.
Have you followed any available installation instructions that
come with the package?  Have you tried contacting the package
author (see help(package=x12) ) ?

  Ben Bolker

-- 
View this message in context: 
http://www.nabble.com/package-X12-tp24548691p24557843.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] problem with as.POSIXct and daylight savings time

2009-07-19 Thread Denis Chabot


[was  [R] end of daylight saving time]

Hi,

I got no reply with the previous subject line, probably a bad choice  
of subject on my part, so here it is again.


I read from the help on DateTimeClasses and various posts on this list  
that, quite logically, one needs to specify if DST is active or not  
when time is between 1 and 2 AM on the first Sunday in November (for  
North America in recent years).


This I can do for on date at a time:

a - as.POSIXct(2008-11-02 01:30:00, tz=EST5EDT)  # to get  
automatic use of DST
b - as.POSIXct(2008-11-02 01:30:00, tz=EST)  # to tell T this is  
the second occurrence of 1:30 that day, in ST

difftime(b,a)

Time difference of 1 hours

But why can't I do the following, which appears to be a typical R way  
of doing things, to handle several date-times at once?


c - rep(2008-11-02 01:30:00, 2)
tzone = c(EST5EDT, EST)

as.POSIXct(c, tz=tzone)
Erreur dans strptime(xx, f - %Y-%m-%d %H:%M:%OS, tz = tz) :
 valeur 'tz' incorrecte

???

Thanks,

Denis Chabot

sessionInfo()
R version 2.9.1 Patched (2009-07-09 r48929)
x86_64-apple-darwin9.7.0

locale:
fr_CA.UTF-8/fr_CA.UTF-8/C/C/fr_CA.UTF-8/fr_CA.UTF-8

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base

loaded via a namespace (and not attached):
[1] tools_2.9.1

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] problem with as.POSIXct and daylight savings time

2009-07-19 Thread Duncan Murdoch


On 19/07/2009 11:23 AM, Denis Chabot wrote:

[was  [R] end of daylight saving time]

Hi,

I got no reply with the previous subject line, probably a bad choice  
of subject on my part, so here it is again.


I read from the help on DateTimeClasses and various posts on this list  
that, quite logically, one needs to specify if DST is active or not  
when time is between 1 and 2 AM on the first Sunday in November (for  
North America in recent years).


This I can do for on date at a time:

a - as.POSIXct(2008-11-02 01:30:00, tz=EST5EDT)  # to get  
automatic use of DST
b - as.POSIXct(2008-11-02 01:30:00, tz=EST)  # to tell T this is  
the second occurrence of 1:30 that day, in ST

difftime(b,a)

Time difference of 1 hours

But why can't I do the following, which appears to be a typical R way  
of doing things, to handle several date-times at once?


c - rep(2008-11-02 01:30:00, 2)
tzone = c(EST5EDT, EST)

as.POSIXct(c, tz=tzone)
Erreur dans strptime(xx, f - %Y-%m-%d %H:%M:%OS, tz = tz) :
  valeur 'tz' incorrecte

???


Objects of the POSIXlt and POSIXct classes don't support multiple time 
zones, so if you specified several time zones on input, how would the 
conversion functions decide which one to use for output?  You'll need to 
write your own wrapper function to make this decision, and do the 
conversions separately for each input timezone.


Why don't those classes support a separate time zone for each entry? 
Presumably because their designer never thought anyone would want to do 
that.


Duncan Murdoch




Thanks,

Denis Chabot

sessionInfo()
R version 2.9.1 Patched (2009-07-09 r48929)
x86_64-apple-darwin9.7.0

locale:
fr_CA.UTF-8/fr_CA.UTF-8/C/C/fr_CA.UTF-8/fr_CA.UTF-8

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base

loaded via a namespace (and not attached):
[1] tools_2.9.1

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] problem with as.POSIXct and daylight savings time

2009-07-19 Thread Denis Chabot


Thank you very much Duncan.

I'll follow your suggestion.

Why do I want to do what the designer did not think anyone would want  
to do? I have data acquisition equipment taking measurements every 15  
min or so for days at a time, and I need to compile all such  
experiments in a master data set. The data acquisition equipment  
automatically switches to DST in spring and back to ST in autumn,  
which I did not disable because it is easier to work with while we are  
running the experiments.


I could use chron to ignore time zones and daylight savings time, but  
this would not be of much help as whether or not I use as.POSIXct or  
chron, there is one day of the year that has 25 h and I need to deal  
with that 25th hour or I'll lose one hour of data!


Denis
Le 09-07-19 à 11:45, Duncan Murdoch a écrit :


On 19/07/2009 11:23 AM, Denis Chabot wrote:

[was  [R] end of daylight saving time]
Hi,
I got no reply with the previous subject line, probably a bad  
choice  of subject on my part, so here it is again.
I read from the help on DateTimeClasses and various posts on this  
list  that, quite logically, one needs to specify if DST is active  
or not  when time is between 1 and 2 AM on the first Sunday in  
November (for  North America in recent years).

This I can do for on date at a time:
a - as.POSIXct(2008-11-02 01:30:00, tz=EST5EDT)  # to get   
automatic use of DST
b - as.POSIXct(2008-11-02 01:30:00, tz=EST)  # to tell T this  
is  the second occurrence of 1:30 that day, in ST

difftime(b,a)
Time difference of 1 hours
But why can't I do the following, which appears to be a typical R  
way  of doing things, to handle several date-times at once?

c - rep(2008-11-02 01:30:00, 2)
tzone = c(EST5EDT, EST)
as.POSIXct(c, tz=tzone)
Erreur dans strptime(xx, f - %Y-%m-%d %H:%M:%OS, tz = tz) :
 valeur 'tz' incorrecte
???


Objects of the POSIXlt and POSIXct classes don't support multiple  
time zones, so if you specified several time zones on input, how  
would the conversion functions decide which one to use for output?   
You'll need to write your own wrapper function to make this  
decision, and do the conversions separately for each input timezone.


Why don't those classes support a separate time zone for each entry?  
Presumably because their designer never thought anyone would want to  
do that.


Duncan Murdoch



Thanks,
Denis Chabot
sessionInfo()
R version 2.9.1 Patched (2009-07-09 r48929)
x86_64-apple-darwin9.7.0
locale:
fr_CA.UTF-8/fr_CA.UTF-8/C/C/fr_CA.UTF-8/fr_CA.UTF-8
attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base
loaded via a namespace (and not attached):
[1] tools_2.9.1
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.




__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] problem with as.POSIXct and daylight savings time

2009-07-19 Thread spencerg

 Have you considered the timeDate package? 


 Spencer

Denis Chabot wrote:

Thank you very much Duncan.

I'll follow your suggestion.

Why do I want to do what the designer did not think anyone would want 
to do? I have data acquisition equipment taking measurements every 15 
min or so for days at a time, and I need to compile all such 
experiments in a master data set. The data acquisition equipment 
automatically switches to DST in spring and back to ST in autumn, 
which I did not disable because it is easier to work with while we are 
running the experiments.


I could use chron to ignore time zones and daylight savings time, but 
this would not be of much help as whether or not I use as.POSIXct or 
chron, there is one day of the year that has 25 h and I need to deal 
with that 25th hour or I'll lose one hour of data!


Denis
Le 09-07-19 à 11:45, Duncan Murdoch a écrit :


On 19/07/2009 11:23 AM, Denis Chabot wrote:

[was [R] end of daylight saving time]
Hi,
I got no reply with the previous subject line, probably a bad 
choice  of subject on my part, so here it is again.
I read from the help on DateTimeClasses and various posts on this 
list  that, quite logically, one needs to specify if DST is active 
or not  when time is between 1 and 2 AM on the first Sunday in 
November (for  North America in recent years).

This I can do for on date at a time:
a - as.POSIXct(2008-11-02 01:30:00, tz=EST5EDT)  # to get  
automatic use of DST
b - as.POSIXct(2008-11-02 01:30:00, tz=EST)  # to tell T this 
is  the second occurrence of 1:30 that day, in ST

difftime(b,a)
Time difference of 1 hours
But why can't I do the following, which appears to be a typical R 
way  of doing things, to handle several date-times at once?

c - rep(2008-11-02 01:30:00, 2)
tzone = c(EST5EDT, EST)
as.POSIXct(c, tz=tzone)
Erreur dans strptime(xx, f - %Y-%m-%d %H:%M:%OS, tz = tz) :
 valeur 'tz' incorrecte
???


Objects of the POSIXlt and POSIXct classes don't support multiple 
time zones, so if you specified several time zones on input, how 
would the conversion functions decide which one to use for output?  
You'll need to write your own wrapper function to make this decision, 
and do the conversions separately for each input timezone.


Why don't those classes support a separate time zone for each entry? 
Presumably because their designer never thought anyone would want to 
do that.


Duncan Murdoch



Thanks,
Denis Chabot
sessionInfo()
R version 2.9.1 Patched (2009-07-09 r48929)
x86_64-apple-darwin9.7.0
locale:
fr_CA.UTF-8/fr_CA.UTF-8/C/C/fr_CA.UTF-8/fr_CA.UTF-8
attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base
loaded via a namespace (and not attached):
[1] tools_2.9.1
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide 
http://www.R-project.org/posting-guide.html

and provide commented, minimal, self-contained, reproducible code.




__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide 
http://www.R-project.org/posting-guide.html

and provide commented, minimal, self-contained, reproducible code.



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] problem with as.POSIXct and daylight savings time

2009-07-19 Thread Denis Chabot

Thanks for the suggestion, Spencer. I will take a look and will report  
to the list if I find this a better solution for my situation. Might  
take a couple of days though.


Denis
Le 09-07-19 à 12:42, spencerg a écrit :


Have you considered the timeDate package?
Spencer

Denis Chabot wrote:

Thank you very much Duncan.

I'll follow your suggestion.

Why do I want to do what the designer did not think anyone would  
want to do? I have data acquisition equipment taking measurements  
every 15 min or so for days at a time, and I need to compile all  
such experiments in a master data set. The data acquisition  
equipment automatically switches to DST in spring and back to ST in  
autumn, which I did not disable because it is easier to work with  
while we are running the experiments.


I could use chron to ignore time zones and daylight savings time,  
but this would not be of much help as whether or not I use  
as.POSIXct or chron, there is one day of the year that has 25 h and  
I need to deal with that 25th hour or I'll lose one hour of data!


Denis
Le 09-07-19 à 11:45, Duncan Murdoch a écrit :


On 19/07/2009 11:23 AM, Denis Chabot wrote:

[was [R] end of daylight saving time]
Hi,
I got no reply with the previous subject line, probably a bad  
choice  of subject on my part, so here it is again.
I read from the help on DateTimeClasses and various posts on this  
list  that, quite logically, one needs to specify if DST is  
active or not  when time is between 1 and 2 AM on the first  
Sunday in November (for  North America in recent years).

This I can do for on date at a time:
a - as.POSIXct(2008-11-02 01:30:00, tz=EST5EDT)  # to get   
automatic use of DST
b - as.POSIXct(2008-11-02 01:30:00, tz=EST)  # to tell T  
this is  the second occurrence of 1:30 that day, in ST

difftime(b,a)
Time difference of 1 hours
But why can't I do the following, which appears to be a typical R  
way  of doing things, to handle several date-times at once?

c - rep(2008-11-02 01:30:00, 2)
tzone = c(EST5EDT, EST)
as.POSIXct(c, tz=tzone)
Erreur dans strptime(xx, f - %Y-%m-%d %H:%M:%OS, tz = tz) :
valeur 'tz' incorrecte
???


Objects of the POSIXlt and POSIXct classes don't support multiple  
time zones, so if you specified several time zones on input, how  
would the conversion functions decide which one to use for  
output?  You'll need to write your own wrapper function to make  
this decision, and do the conversions separately for each input  
timezone.


Why don't those classes support a separate time zone for each  
entry? Presumably because their designer never thought anyone  
would want to do that.


Duncan Murdoch



Thanks,
Denis Chabot
sessionInfo()
R version 2.9.1 Patched (2009-07-09 r48929)
x86_64-apple-darwin9.7.0
locale:
fr_CA.UTF-8/fr_CA.UTF-8/C/C/fr_CA.UTF-8/fr_CA.UTF-8
attached base packages:
[1] stats graphics  grDevices utils datasets  methods
base

loaded via a namespace (and not attached):
[1] tools_2.9.1
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.




__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.





__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Sampling of non-overlapping intervals of variable length

2009-07-19 Thread Hadassa Brunschwig

Hi,

I hope I am not repeating a question which has been posed already.
I am trying to do the following in the most efficient way:
I would like to sample from a finite (large) set of integers n non-overlapping
intervals, where each interval i has a different, set length L_i
(which is the number
of integers in the interval).
I had the idea to sample recursively on a vector with the already
chosen intervals
discarded but that seems to be too complicated.
Any suggestions on that?

Thanks a lot.

Hadassa


-- 
Hadassa Brunschwig
PhD Student
Department of Statistics
The Hebrew University of Jerusalem
http://www.stat.huji.ac.il

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] problem with as.POSIXct and daylight savings time

2009-07-19 Thread Gabor Grothendieck

as.chron.POSIXt has an offset= argument and it may be
a vector:

 args(chron:::as.chron.POSIXt)
function (x, offset = 0, tz = GMT, ...)

On Sun, Jul 19, 2009 at 9:10 AM, Denis Chabotchab...@globetrotter.net wrote:
 [was   [R] end of daylight saving time]

 Hi,

 I got no reply with the previous subject line, probably a bad choice
 of subject on my part, so here it is again.

 I read from the help on DateTimeClasses and various posts on this list
 that, quite logically, one needs to specify if DST is active or not
 when time is between 1 and 2 AM on the first Sunday in November (for
 North America in recent years).

 This I can do for on date at a time:

 a - as.POSIXct(2008-11-02 01:30:00, tz=EST5EDT)  # to get
 automatic use of DST
 b - as.POSIXct(2008-11-02 01:30:00, tz=EST)  # to tell T this is
 the second occurrence of 1:30 that day, in ST
 difftime(b,a)

 Time difference of 1 hours

 But why can't I do the following, which appears to be a typical R way
 of doing things, to handle several date-times at once?

 c - rep(2008-11-02 01:30:00, 2)
 tzone = c(EST5EDT, EST)

 as.POSIXct(c, tz=tzone)
 Erreur dans strptime(xx, f - %Y-%m-%d %H:%M:%OS, tz = tz) :
  valeur 'tz' incorrecte

 ???

 Thanks,

 Denis Chabot

 sessionInfo()
 R version 2.9.1 Patched (2009-07-09 r48929)
 x86_64-apple-darwin9.7.0

 locale:
 fr_CA.UTF-8/fr_CA.UTF-8/C/C/fr_CA.UTF-8/fr_CA.UTF-8

 attached base packages:
 [1] stats     graphics  grDevices utils     datasets  methods   base

 loaded via a namespace (and not attached):
 [1] tools_2.9.1

        [[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] (-8)^(1/3) == NaN?

2009-07-19 Thread Victor Manuel Garcia Guerrero

It' true, but if you type -8^(1/3) returns -2, and if you type -8^1/3 it 
returns -2.6, maybe there are some rules about parenthesis...

regards

Víctor 




De: r-help-boun...@r-project.org en nombre de Dave DeBarr
Enviado el: sáb 18/07/2009 05:04
Para: r-help@r-project.org
Asunto: [R] (-8)^(1/3) == NaN?



Why does the expression (-8)^(1/3) return NaN, instead of -2?

This is not answered by 
http://cran.r-project.org/doc/FAQ/R-FAQ.html#Why-are-powers-of-negative-numbers-wrong_003f

Thanks,
Dave


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] (-8)^(1/3) == NaN?

2009-07-19 Thread jim holtman

It is the order of operator precedence.  Look at what you typed:

-8^1/3 is parsed as -(8^1)/3 = -2.6

On Sun, Jul 19, 2009 at 1:12 PM, Victor Manuel Garcia
Guerrerovmgar...@colmex.mx wrote:
 It' true, but if you type -8^(1/3) returns -2, and if you type -8^1/3 it 
 returns -2.6, maybe there are some rules about parenthesis...

 regards

 Víctor


 

 De: r-help-boun...@r-project.org en nombre de Dave DeBarr
 Enviado el: sáb 18/07/2009 05:04
 Para: r-help@r-project.org
 Asunto: [R] (-8)^(1/3) == NaN?



 Why does the expression (-8)^(1/3) return NaN, instead of -2?

 This is not answered by 
 http://cran.r-project.org/doc/FAQ/R-FAQ.html#Why-are-powers-of-negative-numbers-wrong_003f

 Thanks,
 Dave


        [[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] space in column name

2009-07-19 Thread Farrel Buchinsky

I read a table from Microsoft Access using RODBC. Some of the variables had
a name with a space in it.
R has no problem with it but I do.
I cannot find out how to specify the space

names(alltime)
 [1] IDLVL7  Ref Pv No Ref Pv Name   DOS
Pt Last Name  Pt First Name MRN   CPT   CPT
Desc  DxCd1 DxCd2 DxCd3 DxCd4
[15] DOE

But what do I do if I want to do something such as this
 alltime[grep(MIDDLE EAR EXPLORE,alltime$CPT Desc,]
Error: unexpected symbol in alltime[grep(MIDDLE EAR EXPLORE,alltime$CPT
Desc


Farrel Buchinsky


Sent from Pittsburgh, Pennsylvania, United States

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] trouble using optim for maximalisation of 2-parameter function

2009-07-19 Thread Matthias Kohl


try:

# first argument of llik.nor has to be the parameter
llik.nor-function(theta,x){mu-theta[1];sig-theta[2];-length(x)*log(sqrt(2*pi))-length(x)*log(sig)-(1/(2*sig**2))*sum((x-mu)**2)}

# optim by default does minimization
# setting fnscale = -1 one obtains a maximization problem
optim(c(1,5), llik.nor, x=x, control = list(fnscale = -1))

hth,
Matthias

Anjali Sing schrieb:

Hello, I am having trouble using optim.

I want to maximalise a function to its parameters [kind of like: univariate
maximum likelihood estimation, but i wrote the likelihood function myself
because of data issues ]

When I try to optimize a function for only one parameter there is no
problem:

llik.expo-function(x,lam){(length(x)*log(lam))-(length(x)*log(1-exp(-1*lam*
*cons*)))-lam*sum(x)}

cons-

data-c(.)

expomx-optimize(llik.expo,c(0,100),maximum=TRUE,tol=0.0001,x=data)
expomx


 To optimize to two parameters you can't use optimize, so I tried the
following to test my input:

  
llik.nor-function(x,theta){mu-theta[1];sig-theta[2];-length(x)*log(sqrt(2*pi))-length(x)*log(sig)-(1/(2*sig**2))*sum((x-mu)**2)}
  

x-c(-1,4,6,4,2)
normx-optim(c(1,20),llik.nor)



the output should be close to
parameters: mu=3 and sigma=2.366
[This I calculated by hand to compare with the output]

but in stead of output I get an error:
Error in fn(par, ...) : argument theta is missing, with no default

I don't understand why this happened? I hope someone can help me with this
for I am still a [R]ookie.

Kind regards,
Sonko Lady  [Anjali]

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
  


--
Dr. Matthias Kohl
www.stamats.de

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] space in column name

2009-07-19 Thread cls59



Farrel Buchinsky-3 wrote:
 
 I read a table from Microsoft Access using RODBC. Some of the variables
 had
 a name with a space in it.
 R has no problem with it but I do.
 I cannot find out how to specify the space
 
 names(alltime)
  [1] IDLVL7  Ref Pv No Ref Pv Name   DOS
 Pt Last Name  Pt First Name MRN   CPT  
 CPT
 Desc  DxCd1 DxCd2 DxCd3 DxCd4
 [15] DOE
 
 But what do I do if I want to do something such as this
 alltime[grep(MIDDLE EAR EXPLORE,alltime$CPT Desc,]
 Error: unexpected symbol in alltime[grep(MIDDLE EAR EXPLORE,alltime$CPT
 Desc
 
 
 Farrel Buchinsky
 
 


The following might work:

alltime[grep(MIDDLE EAR EXPLORE,alltime[[ CPT Desc ]] ]

-Charlie

-
Charlie Sharpsteen
Undergraduate
Environmental Resources Engineering
Humboldt State University
-- 
View this message in context: 
http://www.nabble.com/space-in-column-name-tp24559626p24559726.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] space in column name

2009-07-19 Thread cls59




cls59 wrote:
 
 


The following might work:

alltime[grep(MIDDLE EAR EXPLORE,alltime[[ CPT Desc ]] ]

-Charlie



ACK! Terribly sorry about the double post- but I forgot to close the quote.
It should be:

alltime[grep(MIDDLE EAR EXPLORE,alltime[[ CPT Desc ]] ]


Maybe I should wait until AFTER my morning coffee to browse R-help...

-
Charlie Sharpsteen
Undergraduate
Environmental Resources Engineering
Humboldt State University
-- 
View this message in context: 
http://www.nabble.com/space-in-column-name-tp24559626p24559754.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] space in column name

2009-07-19 Thread Farrel Buchinsky

I sifted some more and read about a workaround for the problem. I could
simply rename the columns so that there were no more spaces
names(alltime) -gsub( ,., names(alltime))

 names(alltime)
 [1] IDLVL7  Ref.Pv.No Ref.Pv.Name   DOS
Pt.Last.Name  Pt.First.Name MRN   CPT
CPT.Desc  DxCd1 DxCd2 DxCd3 DxCd4

[15] DOE

Farrel Buchinsky
Google Voice Tel: (412) 567-7870

Sent from Pittsburgh, Pennsylvania, United States

On Sun, Jul 19, 2009 at 14:32, Farrel Buchinsky fjb...@gmail.com wrote:

 I read a table from Microsoft Access using RODBC. Some of the variables had
 a name with a space in it.
 R has no problem with it but I do.
 I cannot find out how to specify the space

 names(alltime)
  [1] IDLVL7  Ref Pv No Ref Pv Name   DOS
   Pt Last Name  Pt First Name MRN   CPT
 CPT Desc  DxCd1 DxCd2 DxCd3 DxCd4

 [15] DOE

 But what do I do if I want to do something such as this
  alltime[grep(MIDDLE EAR EXPLORE,alltime$CPT Desc,]
 Error: unexpected symbol in alltime[grep(MIDDLE EAR EXPLORE,alltime$CPT
 Desc


 Farrel Buchinsky


 Sent from Pittsburgh, Pennsylvania, United States


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Sampling of non-overlapping intervals of variable length

2009-07-19 Thread David Winsemius



On Jul 19, 2009, at 1:05 PM, Hadassa Brunschwig wrote:


Hi,

I hope I am not repeating a question which has been posed already.
I am trying to do the following in the most efficient way:
I would like to sample from a finite (large) set of integers n non- 
overlapping

intervals, where each interval i has a different, set length L_i
(which is the number
of integers in the interval).
I had the idea to sample recursively on a vector with the already
chosen intervals
discarded but that seems to be too complicated.


It might be ridiculously easy if you sampled on an index of a group of  
intervals.
Why not pose the question in the form of example data.frames or other  
classes of R objects? Specification of the desired output would be  
essential. I think further specification of the sampling strategy  
would also help because I am unable to understand what sort of  
probability model you are hoping to apply.



Any suggestions on that?

Thanks a lot.

Hadassa


--
Hadassa Brunschwig
PhD Student
Department of Statistics



David Winsemius, MD
Heritage Laboratories
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] space in column name

2009-07-19 Thread jim holtman

use 'make.names'

 make.names(MIDDLE EAR EXPLORE)
[1] MIDDLE.EAR.EXPLORE


On Sun, Jul 19, 2009 at 2:32 PM, Farrel Buchinskyfjb...@gmail.com wrote:
 I read a table from Microsoft Access using RODBC. Some of the variables had
 a name with a space in it.
 R has no problem with it but I do.
 I cannot find out how to specify the space

 names(alltime)
  [1] ID            LVL7          Ref Pv No     Ref Pv Name   DOS
        Pt Last Name  Pt First Name MRN           CPT           CPT
 Desc      DxCd1         DxCd2         DxCd3         DxCd4
 [15] DOE

 But what do I do if I want to do something such as this
 alltime[grep(MIDDLE EAR EXPLORE,alltime$CPT Desc,]
 Error: unexpected symbol in alltime[grep(MIDDLE EAR EXPLORE,alltime$CPT
 Desc


 Farrel Buchinsky


 Sent from Pittsburgh, Pennsylvania, United States

        [[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] trouble using optim for maximalisation of 2-parameter function

2009-07-19 Thread Anjali Sing

Hello, I am having trouble using optim.

I want to maximalise a function to its parameters [kind of like: univariate
maximum likelihood estimation, but i wrote the likelihood function myself
because of data issues ]

When I try to optimize a function for only one parameter there is no
problem:

llik.expo-function(x,lam){(length(x)*log(lam))-(length(x)*log(1-exp(-1*lam*
*cons*)))-lam*sum(x)}

cons-

data-c(.)

expomx-optimize(llik.expo,c(0,100),maximum=TRUE,tol=0.0001,x=data)
expomx


 To optimize to two parameters you can't use optimize, so I tried the
following to test my input:


llik.nor-function(x,theta){mu-theta[1];sig-theta[2];-length(x)*log(sqrt(2*pi))-length(x)*log(sig)-(1/(2*sig**2))*sum((x-mu)**2)}
 x-c(-1,4,6,4,2)
 normx-optim(c(1,20),llik.nor)

the output should be close to
parameters: mu=3 and sigma=2.366
[This I calculated by hand to compare with the output]

but in stead of output I get an error:
Error in fn(par, ...) : argument theta is missing, with no default

I don't understand why this happened? I hope someone can help me with this
for I am still a [R]ookie.

Kind regards,
Sonko Lady  [Anjali]

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] space in column name

2009-07-19 Thread cls59



Farrel Buchinsky-3 wrote:
 
 I sifted some more and read about a workaround for the problem. I could
 simply rename the columns so that there were no more spaces
 names(alltime) -gsub( ,., names(alltime))
 

That would certainly be a solution. The method I was trying to demonstrate
is that in addition  to using the $ sign syntax, lists and data.frames can
be accessed using strings- much like a hash. In order to do so, you must use
the [[]] or [] methods.

For example, just as you can do the following for a vector:

x - c(1,3,5)

for( i in 1:length(x) ){

  print( x[ i ] )

}

You could do something similar for your data.frame:

for( i in names(alltime) ){

 print( alltime[[ i ]] )

}

The important thing to note is that if you tried to use alltime$i in the
above loop, you would get a bunch of NULLS as there is no component named
'i'.

Hope that helps!

-Charlie

-
Charlie Sharpsteen
Undergraduate
Environmental Resources Engineering
Humboldt State University
-- 
View this message in context: 
http://www.nabble.com/space-in-column-name-tp24559626p24559869.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Sampling of non-overlapping intervals of variable length

2009-07-19 Thread Hadassa Brunschwig

Hi

I am not sure what you mean by sampling an index of a group of
intervals. I will try to give an example:
Let's assume I have a vector 1:100. Let's say I have 10 intervals
of different but known length, say,
c(4,6,11,2,8,14,7,2,18,32). For simulation purposes I have to sample
those 10 intervals 1000 times.
The requirement is, however, that they should be of those lengths and
should not be overlapping.
In short, I would like to obtain a 10x1000 matrix with sampled intervals.

Thanks
Hadassa

On Sun, Jul 19, 2009 at 9:48 PM, David Winsemiusdwinsem...@comcast.net wrote:

 On Jul 19, 2009, at 1:05 PM, Hadassa Brunschwig wrote:

 Hi,

 I hope I am not repeating a question which has been posed already.
 I am trying to do the following in the most efficient way:
 I would like to sample from a finite (large) set of integers n
 non-overlapping
 intervals, where each interval i has a different, set length L_i
 (which is the number
 of integers in the interval).
 I had the idea to sample recursively on a vector with the already
 chosen intervals
 discarded but that seems to be too complicated.

 It might be ridiculously easy if you sampled on an index of a group of
 intervals.
 Why not pose the question in the form of example data.frames or other
 classes of R objects? Specification of the desired output would be
 essential. I think further specification of the sampling strategy would also
 help because I am unable to understand what sort of probability model you
 are hoping to apply.

 Any suggestions on that?

 Thanks a lot.

 Hadassa


 --
 Hadassa Brunschwig
 PhD Student
 Department of Statistics


 David Winsemius, MD
 Heritage Laboratories
 West Hartford, CT





-- 
Hadassa Brunschwig
PhD Student
Department of Statistics
The Hebrew University of Jerusalem
http://www.stat.huji.ac.il

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] abline(v= x) in plot with time formated xaxis not working

2009-07-19 Thread F!!


Sorry for the lack of the plot. It's is just a simple xy plot with xvalues
formated as shown in my post (strptime()).

Anyway, here a working example:



 
 rm(list=ls())
 
 x-c(100*1500:1559, 100*1600:1659)
 x-strptime(x, format=%H%M%S)
 y-c(CO2$conc, CO2$conc[1:36])
 
 plot(x, y, pch=18, type=p, ylab=yfoo, xlab=xfoo, yaxs=r)
 
 abline(h=100, v=0, col=gray60, lty=2)
 text.xvalue-151500
 text.xvalue-strptime(text.xvalue, format=%H%M%S)
 text(text.xvalue, 80, detection threshold, col=gray60, cex=0.8)
 
 event.1-152833
 event.1-strptime(event.1, format=%H%M%S)
 abline(v=event.1, col=red)
 
 

The format for the xvalues works in the case of text positioning, but
doesn't in the case of abline. 

JimHoltman replied this topic via email with a solution:

-as.POSIXct(strptime(y, format=%H%M%S))


I still don't know why this second function is necessary since everything
works without it, except for the abline-thing...

Nevertheless, thanks for all the suggestions!

Fritz
-- 
View this message in context: 
http://www.nabble.com/abline%28v%3D-x%29-in-plot-with-time-formated-xaxis-not-working-tp24505884p24558044.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Can I use mcnemar.test for 3*3 tables (or is there a bug in the command?)

2009-07-19 Thread Charles C. Berry


On Sun, 19 Jul 2009, Tal Galili wrote:


Hello David,Thank you for your answer.

Do you know then what does the mcnemar.test do in the case of a 3*3 table
?


print(mcnemar.test)

will show you what it does.


Because the results for the simple example I gave are rather different (P
value of 0.053 VS 0.73)


The test mcnemar.test() constructs is one of symmetry, which is equivalent 
to marginal homogenity in hierarchical log-linear models as I recall from 
Bishop, Fienberg, and Holland's 1975 opus on count data.


Stuart-Maxwell uses the dispersion matrix of marginal difference.

These are two different tests. I suspect that Stuart-Maxwell is less 
susceptible to continuity issues in very sparse tables, which may account 
for the difference you see here.





In case the mcnemar can't really handle a 3*3 matrix (or more), shouldn't
there be an error massage for this case? (if so, who should I turn to, in
order to report this?)


Well, the code is pretty straightforward and

mcnemar.test(matrix(1:16,4))

returns 11.5495 which is correct.

It looks like there is nothing to report. 3,1,5), ncol = 3


Chuck



Thanks again,
Tal





On Sun, Jul 19, 2009 at 3:47 PM, David Freedman 3.14da...@gmail.com wrote:



There is a function mh_test in the coin package.

library(coin)
mh_test(tt)

The documentation states, The null hypothesis of independence of row and
column totals is tested. The corresponding test for binary factors x and y
is known as McNemar test. For larger tables, Stuart?s W0 statistic (Stuart,
1955, Agresti, 2002, page 422, also known as Stuart-Maxwell test) is
computed.

hth, david freedman


Tal Galili wrote:


Hello all,

I wish to perform a mcnemar test for a 3 by 3 matrix.
By running the slandered R command I am getting a result but I am not

sure

I
am getting the correct one.
Here is an example code:

(tt -  as.table(t(matrix(c(1,4,1,
0,5,5,
3,1,5), ncol = 3
mcnemar.test(tt, correct=T)
#And I get:
McNemar's Chi-squared test
data:  tt
McNemar's chi-squared = 7.6667, df = 3, p-value = *0.05343*


Now I was wondering if the test I just performed is the correct one.

From looking at the Wikipedia article on mcnemar (

http://en.wikipedia.org/wiki/McNemar's_test), it is said that:
The Stuart-Maxwell
testhttp://ourworld.compuserve.com/homepages/jsuebersax/mcnemar.htm
is
different generalization of the McNemar test, used for testing marginal
homogeneity in a square table with more than two rows/columns


From searching for a Stuart-Maxwell

testhttp://ourworld.compuserve.com/homepages/jsuebersax/mcnemar.htm
in
google, I found an algorithm here:


http://www.m-hikari.com/ams/ams-password-2009/ams-password9-12-2009/abbasiAMS9-12-2009.pdf



From running this algorithm I am getting a different P value, here is the

(somewhat ugly) code I produced for this:
get.d - function(xx)
{
  length1 - dim(xx)[1]
  ret1 - margin.table(xx,1) - margin.table(xx,2)
  return(ret1)
}

get.s - function(xx)
{
  the.s - xx
  for( i in 1:dim(xx)[1])
  {
for(j in 1:dim(xx)[2])
{
  if(i == j)
  {
the.s[i,j] - margin.table(xx,1)[i] + margin.table(xx,2)[i] -
2*xx[i,i]
  } else {
the.s[i,j] - -(xx[i,j] + xx[j,i])
  }
}
  }
  return(the.s)
}

chi.statistic - t(get.d(tt)[-3]) %*% solve(get.s(tt)[-3,-3])  %*%
get.d(tt)[-3]
paste(the P value:, pchisq(chi.statistic, 2))

#and the result was:
 the P value: 0.268384371053358



So to summarize my questions:
1) can I use mcnemar.test for 3*3 (or more) tables ?
2) if so, what test is being performed (
Stuart-Maxwell

http://ourworld.compuserve.com/homepages/jsuebersax/mcnemar.htm)

?
3) Do you have a recommended link to an explanation of the algorithm
employed?


Thanks,
Tal





--
--


My contact information:
Tal Galili
Phone number: 972-50-3373767
FaceBook: Tal Galili
My Blogs:
http://www.r-statistics.com/
http://www.talgalili.com
http://www.biostatistics.co.il

  [[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.




--
View this message in context:
http://www.nabble.com/Can-I-use-%22mcnemar.test%22-for-3*3-tables-%28or-is-there-a-bug-in-the-command-%29-tp24556414p24556693.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.





--
--


My contact information:
Tal Galili
Phone number: 972-50-3373767
FaceBook: Tal Galili
My Blogs:

Re: [R] Sampling of non-overlapping intervals of variable length

2009-07-19 Thread David Winsemius



On Jul 19, 2009, at 3:11 PM, Hadassa Brunschwig wrote:


Hi

I am not sure what you mean by sampling an index of a group of
intervals. I will try to give an example:


If you had a dataframe of the following sort:

dfint
start stop
3   7
12  20
40  45
60  72

And you wanted to generate a set of 100 samples with equal probability  
of occurring in any one of those intervals, you might sample first on  
the index of the intervals:


idx=sample(1:4, 100, replace=TRUE)
some sort of appropriate iterative construct
 ### and then sample within the intervals.
sample((dfint[idx,1]):(dfint[idx,2]), 1)
end loop



Let's assume I have a vector 1:100. Let's say I have 10 intervals
of different but known length, say,
c(4,6,11,2,8,14,7,2,18,32). For simulation purposes I have to sample
those 10 intervals 1000 times.
The requirement is, however, that they should be of those lengths and
should not be overlapping.
In short, I would like to obtain a 10x1000 matrix with sampled  
intervals.


I am trying to understand how the vector 1:100 relates to the  
intervals. What do you mean by sample the 10 intervals?  For one  
thing what you have offered are not intervals at all. Are you saying  
(in part at least) that you want equal probabilities of a sampled  
element to come from each of the intervals, however, they might be  
defined? Or do you want the sampling probabilities to vary from  
interval to interval.


I say again, ... a concrete example could do marvels in communicating  
your goals.


Do




Thanks
Hadassa

On Sun, Jul 19, 2009 at 9:48 PM, David Winsemiusdwinsem...@comcast.net 
 wrote:


On Jul 19, 2009, at 1:05 PM, Hadassa Brunschwig wrote:


Hi,

I hope I am not repeating a question which has been posed already.
I am trying to do the following in the most efficient way:
I would like to sample from a finite (large) set of integers n
non-overlapping
intervals, where each interval i has a different, set length L_i
(which is the number
of integers in the interval).
I had the idea to sample recursively on a vector with the already
chosen intervals
discarded but that seems to be too complicated.


It might be ridiculously easy if you sampled on an index of a group  
of

intervals.
Why not pose the question in the form of example data.frames or other
classes of R objects? Specification of the desired output would be
essential. I think further specification of the sampling strategy  
would also
help because I am unable to understand what sort of probability  
model you

are hoping to apply.


Any suggestions on that?

Thanks a lot.

Hadassa


--
Hadassa Brunschwig
PhD Student
Department of Statistics



David Winsemius, MD
Heritage Laboratories
West Hartford, CT






--
Hadassa Brunschwig
PhD Student
Department of Statistics
The Hebrew University of Jerusalem
http://www.stat.huji.ac.il


David Winsemius, MD
Heritage Laboratories
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Sampling of non-overlapping intervals of variable length

2009-07-19 Thread Charles C. Berry


On Sun, 19 Jul 2009, Hadassa Brunschwig wrote:


Hi

I am not sure what you mean by sampling an index of a group of
intervals. I will try to give an example:
Let's assume I have a vector 1:100. Let's say I have 10 intervals
of different but known length, say,
c(4,6,11,2,8,14,7,2,18,32). For simulation purposes I have to sample
those 10 intervals 1000 times.
The requirement is, however, that they should be of those lengths and
should not be overlapping.
In short, I would like to obtain a 10x1000 matrix with sampled intervals.


Something like this:



lens - c(4,6,11,2,8,14,7,2,18,32)
perm.lens - sample(lens)
sort(sample(1e06-sum(lens)+length(lens),length(lens)))+cumsum(c(0,head(perm.lens,-1)))

 [1]  15424 261927 430276 445976 451069 546578 656123 890494 939714 969643




The vector above gives the starting points for the intervals whose lengths 
are perm.lens.


I'd bet every introductory combinatorics book has a problem or example in 
which the expression for the number of ways in which K ordered objects can 
be assigned to I groups consisting of n_i adjacent objects each is 
constructed. The construction is along the lines of the calculation above.


HTH,

Chuck




Thanks
Hadassa

On Sun, Jul 19, 2009 at 9:48 PM, David Winsemiusdwinsem...@comcast.net wrote:


On Jul 19, 2009, at 1:05 PM, Hadassa Brunschwig wrote:


Hi,

I hope I am not repeating a question which has been posed already.
I am trying to do the following in the most efficient way:
I would like to sample from a finite (large) set of integers n
non-overlapping
intervals, where each interval i has a different, set length L_i
(which is the number
of integers in the interval).
I had the idea to sample recursively on a vector with the already
chosen intervals
discarded but that seems to be too complicated.


It might be ridiculously easy if you sampled on an index of a group of
intervals.
Why not pose the question in the form of example data.frames or other
classes of R objects? Specification of the desired output would be
essential. I think further specification of the sampling strategy would also
help because I am unable to understand what sort of probability model you
are hoping to apply.


Any suggestions on that?

Thanks a lot.

Hadassa


--
Hadassa Brunschwig
PhD Student
Department of Statistics



David Winsemius, MD
Heritage Laboratories
West Hartford, CT






--
Hadassa Brunschwig
PhD Student
Department of Statistics
The Hebrew University of Jerusalem
http://www.stat.huji.ac.il

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



Charles C. Berry(858) 534-2098
Dept of Family/Preventive Medicine
E mailto:cbe...@tajo.ucsd.edu   UC San Diego
http://famprevmed.ucsd.edu/faculty/cberry/  La Jolla, San Diego 92093-0901

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] abline(v= x) in plot with time formated xaxis not working

2009-07-19 Thread jim holtman

The reason that you need the 'as.POSIXct' is that the 'abline'
function does not recognize variables with POSIXlt class.

 methods(abline)
no methods were found
Warning message:
In methods(abline) : function 'abline' appears not to be generic
 methods(plot)
 [1] plot.acf*   plot.data.frame*plot.Date*
plot.decomposed.ts* plot.default
 [6] plot.dendrogram*plot.densityplot.ecdf
plot.factor*plot.formula*
[11] plot.hclust*plot.histogram* plot.HoltWinters*
plot.isoreg*plot.lm
[16] plot.medpolish* plot.mlmplot.POSIXct*
plot.POSIXlt*   plot.ppr*
[21] plot.prcomp*plot.princomp*  plot.profile.nls*
plot.sarplot.spec
[26] plot.spec.coherency plot.spec.phase plot.srx.file
plot.stepfunplot.stl*
[31] plot.table* plot.ts plot.tskernel*
plot.TukeyHSD

'plot' knows how to handle POSIXct/lt classes.  You need to at least
convert to POSIXct since the value will be within the range of what
the x-axis is:

 str(event.1)
 POSIXlt[1:9], format: 2009-07-19 15:28:33
 par('usr')
[1] 1248015314.4 1248023025.6 58.8   1036.2
 unclass(as.POSIXct(event.1))
[1] 1248017313
attr(,tzone)
[1] GMT

This shows the extent of the x-axis and then what the value of event.1
will be after conversion to POSIXct.

HTH

On Sun, Jul 19, 2009 at 11:33 AM, F!!p...@students.uni-mainz.de wrote:

 Sorry for the lack of the plot. It's is just a simple xy plot with xvalues
 formated as shown in my post (strptime()).

 Anyway, here a working example:




 rm(list=ls())

 x-c(100*1500:1559, 100*1600:1659)
 x-strptime(x, format=%H%M%S)
 y-c(CO2$conc, CO2$conc[1:36])

 plot(x, y, pch=18, type=p, ylab=yfoo, xlab=xfoo, yaxs=r)

 abline(h=100, v=0, col=gray60, lty=2)
 text.xvalue-151500
 text.xvalue-strptime(text.xvalue, format=%H%M%S)
 text(text.xvalue, 80, detection threshold, col=gray60, cex=0.8)

 event.1-152833
 event.1-strptime(event.1, format=%H%M%S)
 abline(v=event.1, col=red)



 The format for the xvalues works in the case of text positioning, but
 doesn't in the case of abline.

 JimHoltman replied this topic via email with a solution:

 -as.POSIXct(strptime(y, format=%H%M%S))


 I still don't know why this second function is necessary since everything
 works without it, except for the abline-thing...

 Nevertheless, thanks for all the suggestions!

 Fritz
 --
 View this message in context: 
 http://www.nabble.com/abline%28v%3D-x%29-in-plot-with-time-formated-xaxis-not-working-tp24505884p24558044.html
 Sent from the R help mailing list archive at Nabble.com.

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Can I use mcnemar.test for 3*3 tables (or is there a bug in the command?)

2009-07-19 Thread Peter Dalgaard


Charles C. Berry wrote:



The test mcnemar.test() constructs is one of symmetry, which is 
equivalent to marginal homogenity in hierarchical log-linear models as I 
recall from Bishop, Fienberg, and Holland's 1975 opus on count data.


Umm, er... Symmetry in the 3x3 table is a 3DF hypothesis, whereas 
marginal homogeneity has 2DF, so unless I'm missing a fine point in the 
requirement of hierarchical log-linear, I'd say that one implies the 
other, but not the other way around.


E.g., you can easily check that the following two matrices have the same 
homogeneous margins, but only one is symmetric


3 2 1
2 3 2
1 2 3

3 1 2
3 3 1
0 3 3

--
   O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark  Ph:  (+45) 35327918
~~ - (p.dalga...@biostat.ku.dk)  FAX: (+45) 35327907

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] (-8)^(1/3) == NaN?

2009-07-19 Thread Liviu Andronic

On Sun, Jul 19, 2009 at 7:33 PM, jim holtmanjholt...@gmail.com wrote:
 -8^1/3 is parsed as -(8^1)/3 = -2.6


However the following is evaluated as one would expect:
 8^(1/3)
[1] 2
 -8^(1/3)
[1] -2

Perhaps it is parsed in this way:
 -(8^(1/3))
[1] -2

Liviu

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] (-8)^(1/3) == NaN?

2009-07-19 Thread Liviu Andronic

On Sun, Jul 19, 2009 at 12:28 AM, jim holtmanjholt...@gmail.com wrote:
 First of all, read FAQ 7.31 to understand that 1/3 is not
 representable in floating point.  Also a^b is actually exp(log(a) * b)
 and log(-8) is not valid (NaN).


If this is so, why would the following evaluate as expected?
 (-8)^(3)
[1] -512

Liviu

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] svm works but tune.svm give error

2009-07-19 Thread Uwe Ligges




Noah Silverman wrote:

Hello,

I'm using the e1071 library for SVM functions.

I can quickly train an SVM with:
svm(formula = label ~ ., data = testdata)

That works well.

I want to tune the parameters, so I tried:

tune.svm(label ~ ., data=testdata[1:2000, ], gamma=10^(-6:3), cost=10^(1:2))

THIS FAILS WITH AN ERROR:
'names' attribute [199] must be the same length as the vector [184]

I don't understand why this is happening.  



We do not know, since you forgot to specify a reproducible example.
Anyway, is label in your workspace and/or in the testdata?

Uwe Ligges


Presumably, if the data is 
correct for training an SVM, then it should also be correct for the 
tune.svm function.


Can anyone help me solve this one?

Thanks!

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Comparing loadings (next to each other)

2009-07-19 Thread Uwe Ligges




Adam D. I. Kramer wrote:

Dear colleagues,

I've been running some principal components analyses, which generate
tables of loadings that I'm interested in looking at. 
print(f1$rot$load,cutoff=.4) is what I use, and it gives me what I want.


However, I'm now interested in comparing these loadings across a few
data sets. In other words, I would like R to match the loadings on
rownames() and display them next to each other, e.g.:

Loadings:
PC 1PC 2PC 1PC 2PC 1PC 2
col10.400.450.90
col20.800.55 col30.770.700.42

...I could certainly just cbind them together, but then I can't class them
as loadings:


class(x) - loadings



What is x? What is loadings? Where is the reproducible code?

Uwe Ligges




Error in class(x) - loadings :
  cannot coerce type 'closure' to vector of type 'character'

...and I'm very interested in using the cutoff feature of print.loadings.

Any suggestions? I could go in and alter print.loadings myself, but if
there's an easier way let me know.

Many thanks,
--
Adam D. I. Kramer
a...@uoregon.edu
Ph.D. Candidate, Social and Personality Psycholgoy
University of Oregon

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide 
http://www.R-project.org/posting-guide.html

and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] (-8)^(1/3) == NaN?

2009-07-19 Thread jim holtman

If the power that a number is being raised to is integer, then is does
evaluate honoring the unary minus.

 (-2) ^ 5  #integer power
[1] -32
 (-2) ^ 5.1
[1] NaN



-8^(1/3)

is parsed as -(8^(1/3)) according to operator precedence.

On Sun, Jul 19, 2009 at 4:49 PM, Liviu Androniclandronim...@gmail.com wrote:
 On Sun, Jul 19, 2009 at 12:28 AM, jim holtmanjholt...@gmail.com wrote:
 First of all, read FAQ 7.31 to understand that 1/3 is not
 representable in floating point.  Also a^b is actually exp(log(a) * b)
 and log(-8) is not valid (NaN).


 If this is so, why would the following evaluate as expected?
 (-8)^(3)
 [1] -512

 Liviu




-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] ifelse choices in a data.frame?

2009-07-19 Thread Mark Knecht

Hi,
   In my data.frame I  wanted to essentially write

If(Test) c*d else c+d

but that doesn't work. I found I could do it mathematically, but it
seems forced and won't scale well for nested logic. I have two
examples below writing columns e  f, but I don't think the code is
self-documenting as it depends on knowing that Test is a TRUE/FALSE.

   Is there a better way to do the following?

Thanks,
Mark


DF - data.frame(cbind(a=1:4, b=1:2, c=1:8, d=1:16, e=0, f=0))
DF$Test - with(DF, a == b)

DF$e = (DF$c*DF$d) * DF$Test + (DF$c+DF$d) * !DF$Test

DF$f = with(DF, (c*d)*Test + (c+d)*!Test)

DF

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] ifelse choices in a data.frame?

2009-07-19 Thread Duncan Murdoch


On 19/07/2009 5:17 PM, Mark Knecht wrote:

Hi,
   In my data.frame I  wanted to essentially write

If(Test) c*d else c+d

but that doesn't work. I found I could do it mathematically, but it
seems forced and won't scale well for nested logic. I have two
examples below writing columns e  f, but I don't think the code is
self-documenting as it depends on knowing that Test is a TRUE/FALSE.

   Is there a better way to do the following?

Thanks,
Mark


DF - data.frame(cbind(a=1:4, b=1:2, c=1:8, d=1:16, e=0, f=0))
DF$Test - with(DF, a == b)

DF$e = (DF$c*DF$d) * DF$Test + (DF$c+DF$d) * !DF$Test

DF$f = with(DF, (c*d)*Test + (c+d)*!Test)


Use ifelse().

DF$f - with(DF, ifelse(a == b, c*d, c+d))

Duncan Murdoch



DF

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] ifelse choices in a data.frame?

2009-07-19 Thread jim holtman

use ifelse:

 DF - data.frame(cbind(a=1:4, b=1:2, c=1:8, d=1:16, e=0, f=0))
 DF$Test - with(DF, a == b)

 DF$e = (DF$c*DF$d) * DF$Test + (DF$c+DF$d) * !DF$Test

 DF$f = with(DF, (c*d)*Test + (c+d)*!Test)
 #or
 DF$f.1 - ifelse(DF$Test, DF$c * DF$d, DF$c + DF$d)
 head(DF)
  a b c d  e  f  Test f.1
1 1 1 1 1  1  1  TRUE   1
2 2 2 2 2  4  4  TRUE   4
3 3 1 3 3  6  6 FALSE   6
4 4 2 4 4  8  8 FALSE   8
5 1 1 5 5 25 25  TRUE  25
6 2 2 6 6 36 36  TRUE  36



On Sun, Jul 19, 2009 at 5:17 PM, Mark Knechtmarkkne...@gmail.com wrote:
 Hi,
   In my data.frame I  wanted to essentially write

 If(Test) c*d else c+d

 but that doesn't work. I found I could do it mathematically, but it
 seems forced and won't scale well for nested logic. I have two
 examples below writing columns e  f, but I don't think the code is
 self-documenting as it depends on knowing that Test is a TRUE/FALSE.

   Is there a better way to do the following?

 Thanks,
 Mark


 DF - data.frame(cbind(a=1:4, b=1:2, c=1:8, d=1:16, e=0, f=0))
 DF$Test - with(DF, a == b)

 DF$e = (DF$c*DF$d) * DF$Test + (DF$c+DF$d) * !DF$Test

 DF$f = with(DF, (c*d)*Test + (c+d)*!Test)

 DF

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] ifelse choices in a data.frame?

2009-07-19 Thread Daniel Nordlund

 -Original Message-
 From: r-help-boun...@r-project.org 
 [mailto:r-help-boun...@r-project.org] On Behalf Of Mark Knecht
 Sent: Sunday, July 19, 2009 2:18 PM
 To: r-help
 Subject: [R] ifelse choices in a data.frame?

 Hi,
In my data.frame I  wanted to essentially write

 If(Test) c*d else c+d

 but that doesn't work. I found I could do it mathematically, but it
 seems forced and won't scale well for nested logic. I have two
 examples below writing columns e  f, but I don't think the code is
 self-documenting as it depends on knowing that Test is a TRUE/FALSE.

Is there a better way to do the following?

 Thanks,
 Mark

 DF - data.frame(cbind(a=1:4, b=1:2, c=1:8, d=1:16, e=0, f=0))
 DF$Test - with(DF, a == b)

 DF$e = (DF$c*DF$d) * DF$Test + (DF$c+DF$d) * !DF$Test

 DF$f = with(DF, (c*d)*Test + (c+d)*!Test)

 DF

Mark,

Why can't you use ifelse() ?

 DF$g - with(DF, ifelse(Test,c*d,c+d))

Have I missed something in what you are doing?

Dan

Daniel Nordlund
Bothell, WA USA

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] ifelse choices in a data.frame?

2009-07-19 Thread Mark Knecht

That's much better. thanks!

I was sure I'd looked at that but then for some reason stuck with just if.

Again, as a total a newb I appreciate the help.

- Mark

On Sun, Jul 19, 2009 at 2:26 PM, Duncan Murdochmurd...@stats.uwo.ca wrote:
 On 19/07/2009 5:17 PM, Mark Knecht wrote:

 Hi,
   In my data.frame I  wanted to essentially write

 If(Test) c*d else c+d

 but that doesn't work. I found I could do it mathematically, but it
 seems forced and won't scale well for nested logic. I have two
 examples below writing columns e  f, but I don't think the code is
 self-documenting as it depends on knowing that Test is a TRUE/FALSE.

   Is there a better way to do the following?

 Thanks,
 Mark


 DF - data.frame(cbind(a=1:4, b=1:2, c=1:8, d=1:16, e=0, f=0))
 DF$Test - with(DF, a == b)

 DF$e = (DF$c*DF$d) * DF$Test + (DF$c+DF$d) * !DF$Test

 DF$f = with(DF, (c*d)*Test + (c+d)*!Test)

 Use ifelse().

 DF$f - with(DF, ifelse(a == b, c*d, c+d))

 Duncan Murdoch


 DF

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] (-8)^(1/3) == NaN?

2009-07-19 Thread Rolf Turner



On 20/07/2009, at 9:13 AM, jim holtman wrote:


If the power that a number is being raised to is integer, then is does
evaluate honoring the unary minus.


(-2) ^ 5  #integer power

[1] -32

(-2) ^ 5.1

[1] NaN


snip

I was vaguely aware of this ... but it now triggers in my mind the
question of how the ^ function decides when the exponent is an integer.

A bit of experimentation seems to indicate that, e.g., (-2)^x ``works''
if (and only if?) round(x)==x returns TRUE.

Note that (-2)^x may NOT ``work'' in some cases were all.equal(x,round 
(x))

returns TRUE.

Young players should also be aware of the following trap.  It
can happen that n + epsilon ``is an integer'' according to my
rule, but m + epsilon is NOT an integer according to this rule.
Where m and n are both integers.

E.g.:

 eps - 0.4e-15
 x - 5+eps
 x==round(x)
[1] TRUE
 y - 3+eps
 y==round(y)
[1] FALSE

This is of course due to the exigencies of how n and m are represented
in floating point arithmetic.  Not too deep once you're aware of the
problem, but it can still be a ``gotcha'' if one is not alert.

(Be alert.  The world needs more lerts!)

cheers,

Rolf Turner

P. S.  Perhaps young players should be reminded at this point that  
is.integer() is
no help here.  This function tells you about the ***storage mode***  
of its argument.

Only.

R. T.


##
Attention:\ This e-mail message is privileged and confid...{{dropped:9}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] combining 4 vectors into one

2009-07-19 Thread Christopher Desjardins


Hi,
How can I perform the following

y1y2y3y4
1  0   3   2
0  1   2   1
3  2   0   1
0  5   1   0

into ...

 y
 1 0 3 2 0 1 2 1 3 2 0 1 0 5 1 0


Please cc me on reply as I subscribe to the digest.
Thanks!
Chris

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] (-8)^(1/3) == NaN?

2009-07-19 Thread jim holtman

It also works for raising a number to a negative integer:

 (-3)^(-3)
[1] -0.03703704



On Sun, Jul 19, 2009 at 6:23 PM, Rolf Turnerr.tur...@auckland.ac.nz wrote:

 On 20/07/2009, at 9:13 AM, jim holtman wrote:

 If the power that a number is being raised to is integer, then is does
 evaluate honoring the unary minus.

 (-2) ^ 5  #integer power

 [1] -32

 (-2) ^ 5.1

 [1] NaN

        snip

 I was vaguely aware of this ... but it now triggers in my mind the
 question of how the ^ function decides when the exponent is an integer.

 A bit of experimentation seems to indicate that, e.g., (-2)^x ``works''
 if (and only if?) round(x)==x returns TRUE.

 Note that (-2)^x may NOT ``work'' in some cases were all.equal(x,round(x))
 returns TRUE.

 Young players should also be aware of the following trap.  It
 can happen that n + epsilon ``is an integer'' according to my
 rule, but m + epsilon is NOT an integer according to this rule.
 Where m and n are both integers.

 E.g.:

 eps - 0.4e-15
 x - 5+eps
 x==round(x)
 [1] TRUE
 y - 3+eps
 y==round(y)
 [1] FALSE

 This is of course due to the exigencies of how n and m are represented
 in floating point arithmetic.  Not too deep once you're aware of the
 problem, but it can still be a ``gotcha'' if one is not alert.

 (Be alert.  The world needs more lerts!)

        cheers,

                Rolf Turner

 P. S.  Perhaps young players should be reminded at this point that
 is.integer() is
 no help here.  This function tells you about the ***storage mode*** of its
 argument.
 Only.

                R. T.


 ##
 Attention:This e-mail message is privileged and confidential. If you are not
 theintended recipient please delete the message and notify the sender.Any
 views or opinions presented are solely those of the author.

 This e-mail has been scanned and cleared by
 MailMarshalwww.marshalsoftware.com
 ##




-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] combining 4 vectors into one

2009-07-19 Thread jim holtman

Try this:

 x
  y1 y2 y3 y4
1  1  0  3  2
2  0  1  2  1
3  3  2  0  1
4  0  5  1  0
 as.vector(t(x))
 [1] 1 0 3 2 0 1 2 1 3 2 0 1 0 5 1 0




On Sun, Jul 19, 2009 at 7:25 PM, Christopher
Desjardinscddesjard...@gmail.com wrote:
 Hi,
 How can I perform the following

 y1    y2    y3    y4
 1      0       3       2
 0      1       2       1
 3      2       0       1
 0      5       1       0

 into ...

 y
  1 0 3 2 0 1 2 1 3 2 0 1 0 5 1 0


 Please cc me on reply as I subscribe to the digest.
 Thanks!
 Chris

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] combining 4 vectors into one

2009-07-19 Thread Christopher Desjardins

Thanks. That worked.
Chris

On 7/19/09 6:41 PM, jim holtman wrote:
 Try this:


 x
  
y1 y2 y3 y4
 1  1  0  3  2
 2  0  1  2  1
 3  3  2  0  1
 4  0  5  1  0

 as.vector(t(x))
  
   [1] 1 0 3 2 0 1 2 1 3 2 0 1 0 5 1 0




 On Sun, Jul 19, 2009 at 7:25 PM, Christopher
 Desjardinscddesjard...@gmail.com  wrote:

 Hi,
 How can I perform the following

 y1y2y3y4
 1  0   3   2
 0  1   2   1
 3  2   0   1
 0  5   1   0

 into ...

  
 y

   1 0 3 2 0 1 2 1 3 2 0 1 0 5 1 0


 Please cc me on reply as I subscribe to the digest.
 Thanks!
 Chris

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

  






[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Can I use mcnemar.test for 3*3 tables (or is there a bug in the command?)

2009-07-19 Thread Charles C. Berry


On Sun, 19 Jul 2009, Peter Dalgaard wrote:


Charles C. Berry wrote:



 The test mcnemar.test() constructs is one of symmetry, which is equivalent
 to marginal homogenity in hierarchical log-linear models as I recall from
 Bishop, Fienberg, and Holland's 1975 opus on count data.


Umm, er... Symmetry in the 3x3 table is a 3DF hypothesis, whereas marginal 
homogeneity has 2DF, so unless I'm missing a fine point in the requirement of 
hierarchical log-linear, I'd say that one implies the other, but not the 
other way around.



Right, symmetry equals marginal homogenity plus 'quasi-symmetry' - a 
condition on the odds-ratios of a two way table and here that condition 
uses one degree of freedom.


But, representing marginal homogenity in log-linear models gets sticky 
without that quasi-symmetry condition.


---

Taking m_{ij} to be the expected cell frequencies in a two way table, the 
log-linear model for the two way table is


log m_{ij} = \mu + \mu_{1(i)} + \mu_{2(j)} + \mu_{12(ij)}

with side conditions that any of the subscripted \mu terms sums to zero 
over any of its subscripts. In the notation here, \mu is an intercept, 
\mu_1 terms are row effects, \mu_2 terms are column effects, and \mu_{12} 
terms are interactions of the row and columns. The parenthical terms (i), 
(j), or (ij) index the row, column, or cell.


In the case of the 3 x 3 table, there are 1, 2, 2, and 4 degrees of 
freedom respectively for each of the sets of terms in the saturated 
log-linear model.


---

Marginal homogenity says m_{i+} = m_{+i}, all i, taking m_{ij} to be the 
expected cell frequencies and the {i+} notation to indicate summation over 
the missing subscript.


---

Trying to set up a log-linear model for marginal homogeneity would lead 
you to equate the row and column effects:


log m_{ij} = \mu + \mu_{1(i)} + \mu_{1(j)} + \mu_{12(ij)}

but this does not imply marginal homogenity given the side conditions 
unless the \mu_{12(ij)} obey additional constraints which also implies 
symmetry.




E.g., you can easily check that the following two matrices have the same 
homogeneous margins, but only one is symmetric


3 2 1
2 3 2
1 2 3

3 1 2
3 3 1
0 3 3



If you want to represent this last table as

m_{ij} = \exp(\mu + \mu_{1(i)} + \mu_{1(j)} + \mu_{12(ij)})

you cannot get there with the side conditions that are imposed on 
\mu_{12}. You need additional terms.


Chuck



--
  O__   Peter Dalgaard ??ster Farimagsgade 5, Entr.B
 c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark  Ph:  (+45) 35327918
~ ~ ~ ~ ~ ~ ~ ~ ~ ~  - (p.dalga...@biostat.ku.dk)  FAX: (+45) 
~ ~ ~ ~ ~ ~ ~ ~ ~ ~  35327907





Charles C. Berry(858) 534-2098
Dept of Family/Preventive Medicine
E mailto:cbe...@tajo.ucsd.edu   UC San Diego
http://famprevmed.ucsd.edu/faculty/cberry/  La Jolla, San Diego 92093-0901
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Arules questions. I need some help please

2009-07-19 Thread Michael Hahsler


Question 2a)
I am also working with arules package and I have the following problem
let suppose the matrix b like:
b-matrix(c(1,1,1,1,1,1,0,0,1,1,1,1,0,0,1,1,0,1,1,1,1,1,1,1),nrow=6)
rownames(b)=c(T1, T2, T3, T4, T5, T6)
colnames(b)=c(It1, It2, It3, It4)
bt-as(b, transactions)
rules-apriori(bt, parameter = list(maxlen=2))
k-inspect(rules)
k
##we obtain
  lhs  rhs support confidence lift
1 {}= {It4} 1.000  1  1.0
2 {}= {It1} 1.000  1  1.0
3 {It3} = {It2} 0.500  1  1.5
4 {It3} = {It4} 0.500  1  1.0
5 {It3} = {It1} 0.500  1  1.0
6 {It2} = {It4} 0.667  1  1.0
7 {It2} = {It1} 0.667  1  1.0
8 {It4} = {It1} 1.000  1  1.0
9 {It1} = {It4} 1.000  1  1.0

WRITE(rules)
##we obtain
rules support confidence lift
1 {} = {It4} 1 1 1
2 {} = {It1} 1 1 1
3 {It3} = {It2} 0.5 1 1.5
4 {It3} = {It4} 0.5 1 1
5 {It3} = {It1} 0.5 1 1
6 {It2} = {It4} 0.667 1 1
7 {It2} = {It1} 0.667 1 1
8 {It4} = {It1} 1 1 1
9 {It1} = {It4} 1 1 1
I want to convert this in a matrix or data frame and the result should be
something like this
##we obtain
  rules  support  confidence  lift
1 {}={It4}  1.000   11.0
2 {}={It1}  1.000   11.0
3 {It3}={It2}  0.500   11.5
4 {It3}={It4}  0.500   11.0
5 {It3}={It1}  0.500   11.0
6 {It2}={It4}  0.667   11.0
7 {It2}={It1}  0.667   11.0
8 {It4}={It1}  1.000   11.0
9 {It1}={It4}  1.000   11.0



R data.frame(rules = labels(rules), quality(rules))
   rules   support confidence lift
1{} = {It4} 1.000  1  1.0
2{} = {It1} 1.000  1  1.0
3 {It3} = {It2} 0.500  1  1.5
4 {It3} = {It4} 0.500  1  1.0
5 {It3} = {It1} 0.500  1  1.0
6 {It2} = {It4} 0.667  1  1.0
7 {It2} = {It1} 0.667  1  1.0
8 {It4} = {It1} 1.000  1  1.0
9 {It1} = {It4} 1.000  1  1.0



In another hand is it possible to obtain all the rules where lHS=rhs?. In
our last example that means to obtain
  lhs  rhs support confidence lift
8 {It4} = {It1} 1.000  1  1.0
9 {It1} = {It4} 1.000  1  1.0


This is a little more tricky.

We can create a new set of rules with rhs and lhs reversed and then see 
if these reversed rules match some of the original rules.


 rulesRev - new(rules, rhs=lhs(rules), lhs=rhs(rules))
 m - match(rules, rulesRev, nomatch=0)
 inspect(rules[m,])
  lhs  rhs   support confidence lift
1 {It1} = {It4}   1  11
2 {It4} = {It1}   1  11

Maybe there is an easier way. I have to think about it.

-Michael


Thanks again Alberto






--
  Michael Hahsler
  email: mich...@hahsler.net
  web: http://michael.hahsler.net

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Can I use mcnemar.test for 3*3 tables (or is there a bug in the command?)

2009-07-19 Thread David Winsemius



On Jul 19, 2009, at 6:09 PM, Tal Galili wrote:


Hello Charles,
Thank you for the detail reply.

I am still left with the leading question which is: which test  
should I use
when analyzing the 3 by 3 matrix I have? The mcnemar.test or the   
mh_test?

Is the one necessarily better then the other?


Please define better.


(for example for
sparser matrices ?)


That does not help.



What about:
mh_test(as.table(matrix(1:16,4)))
It returns a very significant result:
chi-squared = 11.4098, df = 3, p-value = 0.009704

Where as mcnemar.test(matrix(1:16,4)), didn't:
McNemar's chi-squared = 11.5495, df = 6, p-value = 0.0728

So which one is right ?


And now ... define right.


(from the looks of it, the mh_test is doing much better)


Perhaps from the perspective of a statistically naive reviewer.



Should the strategy be to try and use both methods, and start  
digging when

one doesn't sit well with the other?


I am reminded of Jim Holtam's tag line:  What problem are you trying  
to solve?





Thanks,
Tal



On Sun, Jul 19, 2009 at 10:26 PM, Charles C. Berry cbe...@tajo.ucsd.edu 
wrote:



On Sun, 19 Jul 2009, Tal Galili wrote:

Hello David,Thank you for your answer.


Do you know then what does the mcnemar.test do in the case of a  
3*3

table
?



  print(mcnemar.test)

will show you what it does.

Because the results for the simple example I gave are rather  
different (P

value of 0.053 VS 0.73)



The test mcnemar.test() constructs is one of symmetry, which is  
equivalent
to marginal homogenity in hierarchical log-linear models as I  
recall from

Bishop, Fienberg, and Holland's 1975 opus on count data.

Stuart-Maxwell uses the dispersion matrix of marginal difference.

These are two different tests. I suspect that Stuart-Maxwell is less
susceptible to continuity issues in very sparse tables, which may  
account

for the difference you see here.



In case the mcnemar can't really handle a 3*3 matrix (or more),  
shouldn't
there be an error massage for this case? (if so, who should I turn  
to, in

order to report this?)



Well, the code is pretty straightforward and

  mcnemar.test(matrix(1:16,4))

returns 11.5495 which is correct.

It looks like there is nothing to report. 3,1,5), ncol = 3


Chuck



Thanks again,
Tal





On Sun, Jul 19, 2009 at 3:47 PM, David Freedman  
3.14da...@gmail.com

wrote:



There is a function mh_test in the coin package.

library(coin)
mh_test(tt)

The documentation states, The null hypothesis of independence of  
row and
column totals is tested. The corresponding test for binary  
factors x and

y
is known as McNemar test. For larger tables, Stuarts W0 statistic
(Stuart,
1955, Agresti, 2002, page 422, also known as Stuart-Maxwell test)  
is

computed.

hth, david freedman


Tal Galili wrote:



Hello all,

I wish to perform a mcnemar test for a 3 by 3 matrix.
By running the slandered R command I am getting a result but I  
am not



sure


I
am getting the correct one.
Here is an example code:

(tt -  as.table(t(matrix(c(1,4,1,
  0,5,5,
  3,1,5), ncol = 3
mcnemar.test(tt, correct=T)
#And I get:
  McNemar's Chi-squared test
data:  tt
McNemar's chi-squared = 7.6667, df = 3, p-value = *0.05343*


Now I was wondering if the test I just performed is the correct  
one.



From looking at the Wikipedia article on mcnemar (


http://en.wikipedia.org/wiki/McNemar's_test), it is said that:
The Stuart-Maxwell
testhttp://ourworld.compuserve.com/homepages/jsuebersax/mcnemar.htm 


is
different generalization of the McNemar test, used for testing  
marginal

homogeneity in a square table with more than two rows/columns

From searching for a Stuart-Maxwell


testhttp://ourworld.compuserve.com/homepages/jsuebersax/mcnemar.htm 


in
google, I found an algorithm here:



http://www.m-hikari.com/ams/ams-password-2009/ams-password9-12-2009/abbasiAMS9-12-2009.pdf



From running this algorithm I am getting a different P value,  
here is

the


(somewhat ugly) code I produced for this:
get.d - function(xx)
{
length1 - dim(xx)[1]
ret1 - margin.table(xx,1) - margin.table(xx,2)
return(ret1)
}

get.s - function(xx)
{
the.s - xx
for( i in 1:dim(xx)[1])
{
  for(j in 1:dim(xx)[2])
  {
if(i == j)
{
  the.s[i,j] - margin.table(xx,1)[i] + margin.table(xx,2) 
[i] -

2*xx[i,i]
} else {
  the.s[i,j] - -(xx[i,j] + xx[j,i])
}
  }
}
return(the.s)
}

chi.statistic - t(get.d(tt)[-3]) %*% solve(get.s(tt)[-3,-3])  %*%
get.d(tt)[-3]
paste(the P value:, pchisq(chi.statistic, 2))

#and the result was:
the P value: 0.268384371053358



So to summarize my questions:
1) can I use mcnemar.test for 3*3 (or more) tables ?
2) if so, what test is being performed (
Stuart-Maxwell


http://ourworld.compuserve.com/homepages/jsuebersax/mcnemar.htm)


?
3) Do you have a recommended link to an explanation of the  
algorithm

employed?


Thanks,
Tal





--
--


My contact

Re: [R] Solving two nonlinear equations with two knowns

2009-07-19 Thread yhsu6

I try to use integrate command to get theoretical mean and variance, and then 
solve for a and b. I have an error as follows,
Iteration:  0  ||F(x0)||:  2.7096 
Error in integrate(InverseF1, tau, 1, mu = mu2, sigma = sigma2, a = a,  : 
  non-finite function value

Any suggestion?
 
#R code:
mu2=0.4
sigma2=4
tau=0.75
mean.diff=2
var.ratio=3

InverseF1-function(u,mu,sigma,a,b,tau)
{
qnorm(u,mean=mu,sd=sigma)+a*(abs(u-tau))^b*(utau)
}

InverseF2-function(u,mu,sigma)
{
qnorm(u,mean=mu,sd=sigma)
}

part1-function(u,mu,sigma,a,b,tau)
{
(InverseF2(u,mu,sigma)+a*(abs(u-tau))^b*(utau))^2
}
part2-function(u,mu,sigma)
{
InverseF2(u,mu,sigma)^2
}

parameter- function(cons) {
a-cons[1]
b-cons[2]
f - rep(NA, 2)
EX-integrate(InverseF2,0,tau,mu=mu2,sigma=sigma2)$value+integrate(InverseF1,tau,1,mu=mu2,sigma=sigma2,a=a,b=b,tau=tau)$value
 
EY-integrate(InverseF2,0,1,mu=mu2,sigma=sigma2)$value 
VarX-integrate(part2,0,tau,mu=mu2,sigma=sigma2)$value+integrate(part1,tau,1,mu=mu2,sigma=sigma2,a=a,b=b,tau=tau)$value-EX^2
VarY-integrate(part2,0,1,mu=mu2,sigma=sigma2)$value-EY^2
f[1] - abs(EX - EY) - mean.diff
f[2] - sqrt(VarX) - sqrt(var.ratio) * sqrt(VarY) 
f
} 

library(BB)
c0-c(3,1)
ans1 - dfsane(par=c0, fn=parameter, method=3)

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Index and dummy

2009-07-19 Thread Marujo A.

Dear R-helpers

I have 2 variables
x1=rgamma(6000, 2, 1) and x2=rgamma(6000, 3,2). I have to sort (descending) 
each one and split it into groups. After this each two groups must be merged 
into one until all population becomes one group. A dummy vector must be created 
for each group (8, 4, 2, 1) being equal to 1 if the individual (i) belongs to 
the group and equal to 0, otherwise.

What I have done was:
id=(6000)
x1sort=sort(x1, decreasing=TRUE)
x1g8_1=x1sort[1:750]
x1g8_2=x1sort[751:1500]
x1g8_3=x1sort[1501:2250]
x1g8_4=x1sort[2251:3000]
x1g8_5=x1sort[3001:3750]
x1g8_6=x1sort[3751:4500]
x1g8_7=x1sort[4501:5250]
x1g8_8=x1sort[5251:6000]

x1g4_1=c(x1g8_1, x1g8_2)
x1g4_2=c(x1g8_3, x1g8_4)
x1g4_3=c(x1g8_5, x1g8_6)
x1g4_4=c(x1g8_7, x1g8_8)

x1g2_1=c(x1g4_1, x1g4_2)
x1g2_2=c(x1g4_3, x1g4_4)

x1ng=c(x1g2_1, x1g2_2)

After this I did the dummy vector (the example is for group4)

dum= replace(matrix(0, 4, 1), cbind(4, 1), 0)   # 
matrix of zeros
dummy=lapply(1:4, function(i) replace(dum, cbind(i), 1))# 4 dummy 
vectors
s=split(dummy, 1:4)
ss=rename.vars(s, c(1, 2, 3, 4), c(dx14_1, dx14_2, dx14_3, 
dx14_4))

The problem is when I split into groups each group only identifies 750 
individuals(in the case of x1g8 for instance) only assumes i=1, ..., 750 and I 
need to keep i=1, , 6000. Also my option to dummy vectors don't seem to 
work because I get 4 vectors with the number one (1) in each different variable 
and not only one.

So, I need some help on how should I make to keep i=1, ..., 6000 and how to 
create a dummy vector that assumes only the value one (1) when some I belongs 
to some group.

Thank you so much.
Ana

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Re gression for loop test HELP! URGENT!

2009-07-19 Thread Rbeginner


Hi everyone!
I'm new to R, and I'm stuck on a problem I don't know how to approach.
I have calculated a regression in the form of M ~ D + O + S, and I would
like to take this regression and test it with other samples, 5 at a time(5
meaning 5 set, each consisting M, D, O, and S of a specific date). I assume
I'll need a for loop. Right now, My data of M, D, O, and S are all stored in
separate txt files, but should I just put them into a table or something?
And then how would I calculate the error of how well the test samples fit
the original regression?
This is for my internship, so it's very urgent.
THANKS A LOT!
RBeginner
-- 
View this message in context: 
http://www.nabble.com/Regression-for-loop-test-HELP%21-URGENT%21-tp24562766p24562766.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] HELP: BRUGS/WinBUGS/RBUGS Response is a combination of random variables

2009-07-19 Thread Haoda Fu

Hi,

Is there anyone know if BUGS language allows the combination of variables as 
response

such as
Y[i] - a*X1[i]+b*X2[i]
Y[i] ~ dnorm(c,d)


It seems doesn't work in my model. The problem is between two ##.
The error message is 
  modelCheck(BayesBioMarker.BUGS)
model is syntactically correct
  modelData(paste(BUGS_data.txt,sep=))
data loaded
  modelCompile(numChains=1)
multiple definitions of node bm[1]


===
 model
{
   for(iter in 1:numSubj){
 bmd[iter] ~ dnorm(u[trt[iter]+1],sdbmd.precision);
 ###
 bm[iter] -  inprod(biomarker[iter,1:4], bta[1:4])
 bm[iter] ~ dnorm(bmu[iter], sdbm.precision);
 ###
  bmu[iter] -  (1-delta[trt[iter]+1])*u[trt[iter]+1]+ 
delta[trt[iter]+1]*(u[1]+u[2])/2;
   }
   for(iter in 1:4){
 bta[iter] ~ dnorm(0, 0.01);
   }
   for (iter in 1:2){
delta[iter] ~ dbeta(0.0001,0.0001);
u[iter] ~ dnorm(0,01);
   }
   sdbmd.precision ~ dgamma(0.1,0.1);
   sdbm.precision ~ dgamma(0.1,0.1);
}
===
Data


list(biomarker= structure(.Data= c(-1.6332505600E-01, -9.9820335000E-02, 
4.0692602900E-01, -8.265000E-03, -6.6739468400E-01, -1.9444291700E-01, 
-2.5035248500E-01, -1.592600E-02, 2.5951119500E-01, 3.1943077100E-01, 
2.1915456300E-01, -7.06E-04, -1.1407237740E+00, -4.7155522000E-01, 
-1.4495155440E+00, 1.703600E-02, -1.9105523700E-01, -5.698021E-03, 
-8.4883829700E-01, 4.939000E-03), .Dim=c(5, 4)), bmd=c(-4.944000E-03, 
3.520700E-02, 2.597500E-02, 2.739400E-02, 7.501000E-03), 
trt=c(1.00E+00, 0.00E+00, 0.00E+00, 1.00E+00, 
1.00E+00), numSubj=5.00E+00)

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] what is meaning of the bubbles in boxplots?

2009-07-19 Thread Jie TANG

Hi ,everyone ,
  I draw some boxplot figure with the command boxplot.But in the
figure,there are some bubbles at the top part of the figure.

 Can anyone tell me what the correct meaning of these bubbles?and how to
remove it?

-- 
TANG Jie
Email: totang...@gmail.com
Tel: 0086-2154896104
Shanghai Typhoon Institute,China

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Re gression for loop test HELP! URGENT!

2009-07-19 Thread Daniel Malter

for a beginner, it's probably even easier to it by hand if it is just five
datasets.

bind the 5 datasets together in one dataset and create and index variable (1
to 5) for each of the observations according to the dataset the obersvation
comes from

then run five regressions using

reg1=lm(M~D+O+S,subset=c(index==1))
.
.
.
reg5=lm(M~D+O+S,subset=c(index==5))

and then predict from each regression

predict(reg1,newdata=data.frame(D,O,S))
.
.
.
predict(reg5,newdata=data.frame(D,O,S))

You can then assess how well the prediction from each of the datasets fits
the respective other datasets...

Daniel

-
cuncta stricte discussurus
-

-Ursprüngliche Nachricht-
Von: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] Im
Auftrag von Rbeginner
Gesendet: Sunday, July 19, 2009 9:49 PM
An: r-help@r-project.org
Betreff: [R] Re gression for loop test HELP! URGENT!


Hi everyone!
I'm new to R, and I'm stuck on a problem I don't know how to approach.
I have calculated a regression in the form of M ~ D + O + S, and I would
like to take this regression and test it with other samples, 5 at a time(5
meaning 5 set, each consisting M, D, O, and S of a specific date). I assume
I'll need a for loop. Right now, My data of M, D, O, and S are all stored in
separate txt files, but should I just put them into a table or something?
And then how would I calculate the error of how well the test samples fit
the original regression?
This is for my internship, so it's very urgent.
THANKS A LOT!
RBeginner
--
View this message in context:
http://www.nabble.com/Regression-for-loop-test-HELP%21-URGENT%21-tp24562766p
24562766.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Problem with as.POSIXct on dates object

2009-07-19 Thread Remko Duursma

Dear R-helpers,


I have a problem converting an object made with the 'chron' function
to a POSIXct object:

# Make date based on DOY
dat - chron(dates=232, origin.=c(month=1, day=1, year=2008))

dat
#[1] 08/20/08

# Converting to POSIXct uses current timezone (Sydney):
as.POSIXct(dat)
#[1] 2008-08-20 10:00:00 EST

# Setting GMT timezone has no effect?
as.POSIXct(dat, tz=GMT)
#[1] 2008-08-20 10:00:00 EST

# But to POSIXlt works fine:
as.POSIXlt(dat, tz=GMT)
#[1] 2008-08-20 GMT

Is this behavior expected? If so, can you explain why?

thanks for your help,
Remko



-
Remko Duursma
Post-Doctoral Fellow

Centre for Plants and the Environment
University of Western Sydney
Hawkesbury Campus
Richmond NSW 2753

Dept of Biological Science
Macquarie University
North Ryde NSW 2109
Australia

Mobile: +61 (0)422 096908

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Problem with as.POSIXct on dates object

2009-07-19 Thread Gabor Grothendieck

as.POSIXct.dates does not make use of tz:

 as.POSIXct.dates
function (x, ...)
{
if (inherits(x, dates)) {
z - attr(x, origin)
x - as.numeric(x) * 86400
if (length(z) == 3L  is.numeric(z))
x - x + as.numeric(ISOdate(z[3L], z[1L], z[2L],
0))
return(structure(x, class = c(POSIXt, POSIXct)))
}
else stop(gettextf('%s' is not a \dates\ object,
deparse(substitute(x
}
environment: namespace:base


On Sun, Jul 19, 2009 at 11:30 PM, Remko Duursmaremkoduur...@gmail.com wrote:
 Dear R-helpers,


 I have a problem converting an object made with the 'chron' function
 to a POSIXct object:

 # Make date based on DOY
 dat - chron(dates=232, origin.=c(month=1, day=1, year=2008))

 dat
 #[1] 08/20/08

 # Converting to POSIXct uses current timezone (Sydney):
 as.POSIXct(dat)
 #[1] 2008-08-20 10:00:00 EST

 # Setting GMT timezone has no effect?
 as.POSIXct(dat, tz=GMT)
 #[1] 2008-08-20 10:00:00 EST

 # But to POSIXlt works fine:
 as.POSIXlt(dat, tz=GMT)
 #[1] 2008-08-20 GMT

 Is this behavior expected? If so, can you explain why?

 thanks for your help,
 Remko



 -
 Remko Duursma
 Post-Doctoral Fellow

 Centre for Plants and the Environment
 University of Western Sydney
 Hawkesbury Campus
 Richmond NSW 2753

 Dept of Biological Science
 Macquarie University
 North Ryde NSW 2109
 Australia

 Mobile: +61 (0)422 096908

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Problem with as.POSIXct on dates object

2009-07-19 Thread Remko Duursma

 as.POSIXct.dates does not make use of tz:

Ok, but it is supposed to, right? Or maybe the documentation can be
updated, because ?as.POSIXct does seem to imply the timezone is used
(as it is for other methods of as.POSIXct).

thanks,
Remko




-
Remko Duursma
Post-Doctoral Fellow

Centre for Plants and the Environment
University of Western Sydney
Hawkesbury Campus
Richmond NSW 2753

Dept of Biological Science
Macquarie University
North Ryde NSW 2109
Australia

Mobile: +61 (0)422 096908



On Mon, Jul 20, 2009 at 1:41 PM, Gabor
Grothendieckggrothendi...@gmail.com wrote:
 as.POSIXct.dates does not make use of tz:

 as.POSIXct.dates
 function (x, ...)
 {
    if (inherits(x, dates)) {
        z - attr(x, origin)
        x - as.numeric(x) * 86400
        if (length(z) == 3L  is.numeric(z))
            x - x + as.numeric(ISOdate(z[3L], z[1L], z[2L],
                0))
        return(structure(x, class = c(POSIXt, POSIXct)))
    }
    else stop(gettextf('%s' is not a \dates\ object,
 deparse(substitute(x
 }
 environment: namespace:base


 On Sun, Jul 19, 2009 at 11:30 PM, Remko Duursmaremkoduur...@gmail.com wrote:
 Dear R-helpers,


 I have a problem converting an object made with the 'chron' function
 to a POSIXct object:

 # Make date based on DOY
 dat - chron(dates=232, origin.=c(month=1, day=1, year=2008))

 dat
 #[1] 08/20/08

 # Converting to POSIXct uses current timezone (Sydney):
 as.POSIXct(dat)
 #[1] 2008-08-20 10:00:00 EST

 # Setting GMT timezone has no effect?
 as.POSIXct(dat, tz=GMT)
 #[1] 2008-08-20 10:00:00 EST

 # But to POSIXlt works fine:
 as.POSIXlt(dat, tz=GMT)
 #[1] 2008-08-20 GMT

 Is this behavior expected? If so, can you explain why?

 thanks for your help,
 Remko



 -
 Remko Duursma
 Post-Doctoral Fellow

 Centre for Plants and the Environment
 University of Western Sydney
 Hawkesbury Campus
 Richmond NSW 2753

 Dept of Biological Science
 Macquarie University
 North Ryde NSW 2109
 Australia

 Mobile: +61 (0)422 096908

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Sampling of non-overlapping intervals of variable length

2009-07-19 Thread Hadassa Brunschwig

Thanks Chuck.

Ups, did not think of the problem in that way.
That did exactly what I needed. I have another complication to this problem:
I do not only have one vector of 1:1e^6 but several vectors of
different length, say 5.
Initially, my intervals are distributed over those 5 vectors and the
ranges of those
5 vectors in a specific way (and you might have guessed by now that I would like
to do something like a permutation test). Because I have this
additional level, I guess
I could do something like:

1)Sample the 5 vectors with probabilities proportional to the
frequencies of the intial
intervals on these vectors.
2)For each sampled vector: apply Chucks solution.

?
Thanks a lot.
Hadassa

On Sun, Jul 19, 2009 at 11:13 PM, Charles C. Berrycbe...@tajo.ucsd.edu wrote:
On Sun, 19 Jul 2009, Hadassa Brunschwig wrote:

I am not sure what you mean by sampling an index of a group of
intervals. I will try to give an example:
Let's assume I have a vector 1:100. Let's say I have 10 intervals
of different but known length, say,
c(4,6,11,2,8,14,7,2,18,32). For simulation purposes I have to sample
those 10 intervals 1000 times.
The requirement is, however, that they should be of those lengths and
should not be overlapping.
In short, I would like to obtain a 10x1000 matrix with sampled intervals.

Something like this:

lens - c(4,6,11,2,8,14,7,2,18,32)
perm.lens - sample(lens)

sort(sample(1e06-sum(lens)+length(lens),length(lens)))+cumsum(c(0,head(perm.lens,-1)))

[1] 15424 261927 430276 445976 451069 546578 656123 890494 939714 969643

The vector above gives the starting points for the intervals whose lengths
are perm.lens.

I'd bet every introductory combinatorics book has a problem or example in
which the expression for the number of ways in which K ordered objects can
be assigned to I groups consisting of n_i adjacent objects each is
constructed. The construction is along the lines of the calculation above.

HTH,

Chuck

Thanks
Hadassa

On Sun, Jul 19, 2009 at 9:48 PM, David Winsemiusdwinsem...@comcast.net
wrote:

On Jul 19, 2009, at 1:05 PM, Hadassa Brunschwig wrote:

Hi,

I hope I am not repeating a question which has been posed already.
I am trying to do the following in the most efficient way:
I would like to sample from a finite (large) set of integers n
non-overlapping
intervals, where each interval i has a different, set length L_i
(which is the number
of integers in the interval).
I had the idea to sample recursively on a vector with the already
chosen intervals
discarded but that seems to be too complicated.

It might be ridiculously easy if you sampled on an index of a group of
intervals.
Why not pose the question in the form of example data.frames or other
classes of R objects? Specification of the desired output would be
essential. I think further specification of the sampling strategy would
also
help because I am unable to understand what sort of probability model you
are hoping to apply.

Any suggestions on that?

Thanks a lot.

Hadassa

--
Hadassa Brunschwig
PhD Student
Department of Statistics

David Winsemius, MD
Heritage Laboratories
West Hartford, CT

--
Hadassa Brunschwig
PhD Student
Department of Statistics
The Hebrew University of Jerusalem
http://www.stat.huji.ac.il

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Charles C. Berry (858) 534-2098
Dept of Family/Preventive
Medicine
E mailto:cbe...@tajo.ucsd.edu UC San Diego
http://famprevmed.ucsd.edu/faculty/cberry/ La Jolla, San Diego 92093-0901

--
Hadassa Brunschwig
PhD Student
Department of Statistics
The Hebrew University of Jerusalem
http://www.stat.huji.ac.il

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Re gression for loop test HELP! URGENT!

2009-07-19 Thread Rbeginner


Thanks for the suggestion, but I think I might not have made myself very
clear. I actually have about 2000 sets of M, D, O, and S, so it's probably
not efficient to make each a separate set and then index. I've put
everything into a data frame, so I would like to test how well the
regression fit for each group, which consists of 5 sets of M, D, O, and S.
Since I'll need to test it for about 400 groups, I thought a for loop might
be necessary. 
Any suggestions? I'm just not especially sure how to do the for loop.



Daniel Malter wrote:
 
 for a beginner, it's probably even easier to it by hand if it is just five
 datasets.
 
 bind the 5 datasets together in one dataset and create and index variable
 (1
 to 5) for each of the observations according to the dataset the
 obersvation
 comes from
 
 then run five regressions using
 
 reg1=lm(M~D+O+S,subset=c(index==1))
 .
 .
 .
 reg5=lm(M~D+O+S,subset=c(index==5))
 
 and then predict from each regression
 
 predict(reg1,newdata=data.frame(D,O,S))
 .
 .
 .
 predict(reg5,newdata=data.frame(D,O,S))
 
 You can then assess how well the prediction from each of the datasets fits
 the respective other datasets...
 
 Daniel
 
 -
 cuncta stricte discussurus
 -
 
 -Ursprüngliche Nachricht-
 Von: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] Im
 Auftrag von Rbeginner
 Gesendet: Sunday, July 19, 2009 9:49 PM
 An: r-help@r-project.org
 Betreff: [R] Re gression for loop test HELP! URGENT!
 
 
 Hi everyone!
 I'm new to R, and I'm stuck on a problem I don't know how to approach.
 I have calculated a regression in the form of M ~ D + O + S, and I would
 like to take this regression and test it with other samples, 5 at a time(5
 meaning 5 set, each consisting M, D, O, and S of a specific date). I
 assume
 I'll need a for loop. Right now, My data of M, D, O, and S are all stored
 in
 separate txt files, but should I just put them into a table or something?
 And then how would I calculate the error of how well the test samples fit
 the original regression?
 This is for my internship, so it's very urgent.
 THANKS A LOT!
 RBeginner
 --
 View this message in context:
 http://www.nabble.com/Regression-for-loop-test-HELP%21-URGENT%21-tp24562766p
 24562766.html
 Sent from the R help mailing list archive at Nabble.com.
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
 
 

-- 
View this message in context: 
http://www.nabble.com/Regression-for-loop-test-HELP%21-URGENT%21-tp24562766p24563579.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Sampling of non-overlapping intervals of variable length

2009-07-19 Thread Moshe Olshansky


Another possibility, if the total length of your intervals is small in 
comparison to the big interval is to choose the starting points of all your 
intervals randomly and to dismiss the entire set if some of the intervals 
overlap. Most probably you will not have too many such cases (assuming, as 
stated above, that the total length of all the intervals is a small proportion 
of the length of the big interval).

--- On Mon, 20/7/09, Hadassa Brunschwig hadassa.brunsch...@mail.huji.ac.il 
wrote:

 From: Hadassa Brunschwig hadassa.brunsch...@mail.huji.ac.il
 Subject: Re: [R] Sampling of non-overlapping intervals of variable length
 To: Charles C. Berry cbe...@tajo.ucsd.edu
 Cc: r-help@r-project.org
 Received: Monday, 20 July, 2009, 3:08 PM
 Thanks Chuck.
 
 Ups, did not think of the problem in that way.
 That did exactly what I needed.  I have another
 complication to this problem:
 I do not only have one vector of 1:1e^6 but several vectors
 of
 different length, say 5.
 Initially, my intervals are distributed over those 5
 vectors and the
 ranges of those
 5 vectors in a specific way (and you might have guessed by
 now that I would like
 to do something like a permutation test). Because I have
 this
 additional level, I guess
 I could do something like:
 
 1)Sample the 5 vectors with probabilities proportional to
 the
 frequencies of the intial
 intervals on these vectors.
 2)For each sampled vector: apply Chucks solution.
 
 ?
 Thanks a lot.
 Hadassa
 
 On Sun, Jul 19, 2009 at 11:13 PM, Charles C. Berrycbe...@tajo.ucsd.edu
 wrote:
  On Sun, 19 Jul 2009, Hadassa Brunschwig wrote:
 
  Hi
 
  I am not sure what you mean by sampling an index
 of a group of
  intervals. I will try to give an example:
  Let's assume I have a vector 1:100. Let's say
 I have 10 intervals
  of different but known length, say,
  c(4,6,11,2,8,14,7,2,18,32). For simulation
 purposes I have to sample
  those 10 intervals 1000 times.
  The requirement is, however, that they should be
 of those lengths and
  should not be overlapping.
  In short, I would like to obtain a 10x1000 matrix
 with sampled intervals.
 
  Something like this:
 
 
  lens - c(4,6,11,2,8,14,7,2,18,32)
  perm.lens - sample(lens)
 
 
 sort(sample(1e06-sum(lens)+length(lens),length(lens)))+cumsum(c(0,head(perm.lens,-1)))
 
   [1]  15424 261927 430276 445976 451069 546578
 656123 890494 939714 969643
 
 
  The vector above gives the starting points for the
 intervals whose lengths
  are perm.lens.
 
  I'd bet every introductory combinatorics book has a
 problem or example in
  which the expression for the number of ways in which K
 ordered objects can
  be assigned to I groups consisting of n_i adjacent
 objects each is
  constructed. The construction is along the lines of
 the calculation above.
 
  HTH,
 
  Chuck
 
 
 
  Thanks
  Hadassa
 
  On Sun, Jul 19, 2009 at 9:48 PM, David
 Winsemiusdwinsem...@comcast.net
  wrote:
 
  On Jul 19, 2009, at 1:05 PM, Hadassa
 Brunschwig wrote:
 
  Hi,
 
  I hope I am not repeating a question which
 has been posed already.
  I am trying to do the following in the
 most efficient way:
  I would like to sample from a finite
 (large) set of integers n
  non-overlapping
  intervals, where each interval i has a
 different, set length L_i
  (which is the number
  of integers in the interval).
  I had the idea to sample recursively on a
 vector with the already
  chosen intervals
  discarded but that seems to be too
 complicated.
 
  It might be ridiculously easy if you sampled
 on an index of a group of
  intervals.
  Why not pose the question in the form of
 example data.frames or other
  classes of R objects? Specification of the
 desired output would be
  essential. I think further specification of
 the sampling strategy would
  also
  help because I am unable to understand what
 sort of probability model you
  are hoping to apply.
 
  Any suggestions on that?
 
  Thanks a lot.
 
  Hadassa
 
 
  --
  Hadassa Brunschwig
  PhD Student
  Department of Statistics
 
 
  David Winsemius, MD
  Heritage Laboratories
  West Hartford, CT
 
 
 
 
 
  --
  Hadassa Brunschwig
  PhD Student
  Department of Statistics
  The Hebrew University of Jerusalem
  http://www.stat.huji.ac.il
 
  __
  R-help@r-project.org
 mailing list
  https://stat.ethz.ch/mailman/listinfo/r-help
  PLEASE do read the posting guide
  http://www.R-project.org/posting-guide.html
  and provide commented, minimal, self-contained,
 reproducible code.
 
 
  Charles C. Berry                        
    (858) 534-2098
                                     
        Dept of Family/Preventive
  Medicine
  E mailto:cbe...@tajo.ucsd.edu
               UC San Diego
  http://famprevmed.ucsd.edu/faculty/cberry/  La Jolla,
 San Diego 92093-0901
 
 
 
 
 
 
 -- 
 Hadassa Brunschwig
 PhD Student
 Department of Statistics
 The Hebrew University of Jerusalem
 http://www.stat.huji.ac.il
 
 __

[R] Regression for loop test HELP! URGENT!

2009-07-19 Thread chopin.abacus

Hi everyone!
I'm new to R, and I've sent this message as a non-member, but since it's
pretty urgent, I'm sending it again now I'm on the mailing list (Thanks
Daniel for your suggestion nevertheless).

I have calculated a regression in the form of M ~ D + O + S, and I would
like to take this regression and test it with other samples, 5 sets of M, D,
O, and S at a time(I actually have 2000 sets, so it's probably not efficient
to make each a separate set and then index). Since I'll need to test the
regression for 400 groups, I thought a for loop might be necessary. I've put
everything into a data frame already. Can anyone tell me how to write the
code? I'm especially not sure about how to do the for loop.
And then how would I calculate the error of how well the test samples fit
the original regression?
This is for my internship, so it's very urgent.
THANKS A LOT!
RBeginner

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

72 matches

Mail list logo