Re: [R] Tinn-R user guide (latex sources) available on GitHub
Thank you for doing this! On Wed, Nov 27, 2013 at 11:22 AM, Jose Claudio Faria joseclaudio.fa...@gmail.com wrote: Dear users, The Tinn-R User Guide is written entirely in LaTeX, and the idea behind making it available on GitHub is to gather contributions from multiple users. If you find something that you would like to include or improve, please feel free to make it better. This User Guide has been developed under both OSes, Windows and Linux. Under Windows we have been using Tinn-R as editor and MiKTeX as compiler. Under Linux we have been using Vim (with the LaTeX-Box plugin) as editor and TeX Live as compiler. Link: https://github.com/jcfaria/Tinn-R-User-Guide Regards, -- Jose Claudio Faria Estatistica UESC/DCET/Brasil joseclaudio.faria at gmail.com Telefones: 55(73)3680.5545 - UESC 55(73)9100.7351 - TIM 55(73)8817.6159 - OI __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Joris Meys Statistical consultant Ghent University Faculty of Bioscience Engineering Department of Mathematical Modelling, Statistics and Bio-Informatics tel : +32 9 264 59 87 joris.m...@ugent.be --- Disclaimer : http://helpdesk.ugent.be/e-maildisclaimer.php
Re: [R] Should there be an R-beginners list?
StackOverflow certainly has its merits, although I miss a bit the good ol' Oxford sarcasm gems you find on this list. This said: a beginners' list is a bad, bad idea. The first rule in my classes is RTFI (Read The Fucking Internetzz). Anybody using R should be able to do a basic Google search, and a beginners' list is not going to help them learn that. If beginners make the effort of following the posting guidelines, netiquette or any other guide to getting help on the internet, they can safely use this list. Cheers Joris On Wed, Nov 27, 2013 at 9:47 AM, Rolf Turner r.tur...@auckland.ac.nz wrote: On 11/25/13 09:04, Rich Shepard wrote: On Sun, 24 Nov 2013, Yihui Xie wrote: Mailing lists are good for a smaller group of people, and especially good when more focused on discussions on development (including bug reports). The better place for questions is a web forum. I disagree. Mailing lists push messages to subscribers, while web fora require one to use a browser, log in, then pull messages. Not nearly as convenient. Well expressed Rich. I agree with you completely. cheers, Rolf Turner -- Joris Meys Statistical consultant Ghent University Faculty of Bioscience Engineering Department of Mathematical Modelling, Statistics and Bio-Informatics tel : +32 9 264 59 87 joris.m...@ugent.be --- Disclaimer : http://helpdesk.ugent.be/e-maildisclaimer.php
Re: [R] Problems when Apply a script to a list
Where exactly did you put the sink() statement? I tried it with 1000 dataframes and I have no problem whatsoever. Cheers Joris On Fri, Aug 27, 2010 at 6:56 AM, ev...@aueb.gr wrote: Joris, thank you very much for your help. It is very helpful for me. I still have a problem with the sink stack, although I put sink() after my last printed result. I want to ask you one more question. Do you know how I can have, in the sink output file, a message indicating which dataset of the list the results belong to? When I used for instead of apply I had cat("\n***\nEstimation \n\nDataset Sim : ", i) Thank you in advance Evgenia Joris Meys writes: Answers below. On Thu, Aug 26, 2010 at 11:20 AM, Evgenia ev...@aueb.gr wrote: Dear users, ***I have a function f to simulate data from a model (the example below is used only to show my problems)

f <- function(n, mean1){
  a <- matrix(rnorm(n, mean1, sd = 1), ncol = 5)
  b <- matrix(runif(n), ncol = 5)
  data <- rbind(a, b)
  out <- data
  out
}

*I want to simulate 1000 datasets (here only 5), so I use

S <- list()
for (i in 1:5){ S[[i]] <- f(n = 10, mean1 = 0) }

**I have a very complicated function for the estimation of a model which I want to apply to each one of the above simulated datasets

fun <- function(data){
  data <- as.matrix(data)
  sink('Example.txt', append = TRUE)
  cat("\n***\nEstimation \n\nDataset Sim : ", i)
  d <- data %*% t(data)
  s <- solve(d)
  print(s)
  out <- s
  out
}

results <- list()
for (i in 1:5){ results[[i]] <- fun(data = S[[i]]) }

My questions are: 1) for some datasets the system is computationally singular, and this causes execution of the for loop to stop. That way I only have results up to the point where the problem happens. How can I skip such a step and get results for all the other datasets for which fun is applicable? see ?try, or ?tryCatch.
I'd do something along the lines of

for (i in 1:5){
  tmp <- try(fun(data = S[[i]]))
  results[[i]] <- if (is(tmp, "try-error")) NA else tmp
}

Alternatively, you could also use lapply:

results <- lapply(S, function(x){
  tmp <- try(fun(data = x))
  if (is(tmp, "try-error")) NA else tmp
})

2) After some runs of my program, I receive the error message Error in sink("output.txt") : sink stack is full. How can I solve this problem, as I want to have results of my program for 1000 datasets. That is because you never empty the sink. Add sink() after the last line you want in the file. This flushes the sink buffer to the file. Otherwise R keeps everything in memory, and that gets too full after a while. 3) Is using for the correct way to run my program for a list? See the lapply solution. Thanks a lot Evgenia -- View this message in context: http://r.789695.n4.nabble.com/Problems-when-Apply-a-script-to-a-list-tp2339403p2339403.html Sent from the R help mailing list archive at Nabble.com. -- Joris Meys Statistical consultant Ghent University Faculty of Bioscience Engineering Department of Applied mathematics, biometrics and process control tel : +32 9 264 59 87 joris.m...@ugent.be --- Disclaimer : http://helpdesk.ugent.be/e-maildisclaimer.php
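To make the advice in this thread concrete, here is a small self-contained sketch combining try(), a single sink()/sink() pair, and a cat() label per dataset. The function f_est, the data, and the output file are hypothetical stand-ins for the poster's fun() and Example.txt, not the original code:

```r
# Hypothetical sketch of the try()/sink() advice above: one sink()
# opened before the loop, one sink() closing it afterwards, and a
# cat() label so the output file shows which dataset each result
# belongs to. f_est stands in for the poster's estimation function.
set.seed(1)
S <- c(lapply(1:4, function(i) matrix(rnorm(10), ncol = 5)),
       list(matrix(1, nrow = 2, ncol = 5)))   # last dataset is singular

f_est <- function(data) solve(data %*% t(data))

outfile <- tempfile(fileext = ".txt")
sink(outfile)
results <- lapply(seq_along(S), function(i) {
  cat("\n***\nEstimation\n\nDataset Sim :", i, "\n")
  tmp <- try(f_est(S[[i]]), silent = TRUE)
  if (inherits(tmp, "try-error")) NA else tmp
})
sink()   # closing the sink exactly once keeps the sink stack from filling

results[[5]]   # NA: the singular dataset was skipped instead of aborting
```

The key point is that the sink is opened and closed outside the loop, so the sink stack never grows, while try() turns the "computationally singular" error into an NA entry.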
Re: [R] Change value of a slot of an S4 object within a method.
OK, I admit it will never win a beauty prize, but I won't either, so we go pretty well together. Seriously, I am aware this is about the worst way to solve such a problem. But as I said, rewriting the class definitions wasn't an option any more. I'll definitely take your advice for the next project. Cheers Joris On Wed, Aug 25, 2010 at 9:36 PM, Steve Lianoglou mailinglist.honey...@gmail.com wrote: Howdy, My eyes start to gloss over on their first encounter of `substitute` ... add enough `eval`s and (even) an `expression` (!) to that, and you'll probably see me running for the hills ... but hey, if it makes sense to you, more power to you ;-) Glad you found a fix that works, -steve -- Steve Lianoglou Graduate Student: Computational Systems Biology | Memorial Sloan-Kettering Cancer Center | Weill Medical College of Cornell University Contact Info: http://cbio.mskcc.org/~lianos/contact -- Joris Meys Statistical consultant Ghent University Faculty of Bioscience Engineering Department of Applied mathematics, biometrics and process control tel : +32 9 264 59 87 joris.m...@ugent.be --- Disclaimer : http://helpdesk.ugent.be/e-maildisclaimer.php
Re: [R] Problems when Apply a script to a list
Answers below. On Thu, Aug 26, 2010 at 11:20 AM, Evgenia ev...@aueb.gr wrote: Dear users, ***I have a function f to simulate data from a model (the example below is used only to show my problems)

f <- function(n, mean1){
  a <- matrix(rnorm(n, mean1, sd = 1), ncol = 5)
  b <- matrix(runif(n), ncol = 5)
  data <- rbind(a, b)
  out <- data
  out
}

*I want to simulate 1000 datasets (here only 5), so I use

S <- list()
for (i in 1:5){ S[[i]] <- f(n = 10, mean1 = 0) }

**I have a very complicated function for the estimation of a model which I want to apply to each one of the above simulated datasets

fun <- function(data){
  data <- as.matrix(data)
  sink('Example.txt', append = TRUE)
  cat("\n***\nEstimation \n\nDataset Sim : ", i)
  d <- data %*% t(data)
  s <- solve(d)
  print(s)
  out <- s
  out
}

results <- list()
for (i in 1:5){ results[[i]] <- fun(data = S[[i]]) }

My questions are: 1) for some datasets the system is computationally singular, and this causes execution of the for loop to stop. That way I only have results up to the point where the problem happens. How can I skip such a step and get results for all the other datasets for which fun is applicable? see ?try, or ?tryCatch. I'd do something along the lines of

for (i in 1:5){
  tmp <- try(fun(data = S[[i]]))
  results[[i]] <- if (is(tmp, "try-error")) NA else tmp
}

Alternatively, you could also use lapply:

results <- lapply(S, function(x){
  tmp <- try(fun(data = x))
  if (is(tmp, "try-error")) NA else tmp
})

2) After some runs of my program, I receive the error message Error in sink("output.txt") : sink stack is full. How can I solve this problem, as I want to have results of my program for 1000 datasets. That is because you never empty the sink. Add sink() after the last line you want in the file. This flushes the sink buffer to the file. Otherwise R keeps everything in memory, and that gets too full after a while. 3) Is using for the correct way to run my program for a list? See the lapply solution.
Thanks a lot Evgenia -- View this message in context: http://r.789695.n4.nabble.com/Problems-when-Apply-a-script-to-a-list-tp2339403p2339403.html Sent from the R help mailing list archive at Nabble.com. -- Joris Meys Statistical consultant Ghent University Faculty of Bioscience Engineering Department of Applied mathematics, biometrics and process control tel : +32 9 264 59 87 joris.m...@ugent.be --- Disclaimer : http://helpdesk.ugent.be/e-maildisclaimer.php
[R] Passing arguments between S4 methods fails within a function:bug? example with raster package.
Dear all, This problem came up initially while debugging a function, but it seems to be a more general problem in R. I hope I'm wrong, but I can't find another explanation. Let me illustrate with the raster package. For a RasterLayer object (which inherits from Raster), there is a method xyValues defined with the signature (object=RasterLayer, xy=matrix). There is also a method with signature (object=Raster, xy=vector). The only thing this method does is change xy into a matrix and then pass on to the next method using callGeneric again. Arguments are passed along. Now this all works smoothly, as long as you stay in the global environment:

require(raster)
a <- raster()
a[] <- 1:ncell(a)
origin <- c(-80, 50)
eff.dist <- 10
unlist(xyValues(a, xy = origin, buffer = eff.dist))
[1] 14140 14141 14500 14501

Now let's make a very basic test function:

test <- function(x, orig.point){
  eff.distance <- 10
  p <- unlist(xyValues(x, xy = orig.point, buffer = eff.distance))
  return(p)
}

This gives the following result:

test(a, origin)
Error in .local(object, xy, ...) : object 'eff.distance' not found

Huh? Apparently, eff.distance got lost somewhere in the parse tree (am I saying this correctly?). The funny thing is, when we change origin to a matrix:

origin <- matrix(origin, ncol = 2)
unlist(xyValues(a, xy = origin, buffer = eff.dist))
[1] 14140 14141 14500 14501
test(a, origin)
[1] 14140 14141 14500 14501

it all works again! So something goes wrong with passing the arguments from one method to another using callGeneric. Is this a bug in R, or am I missing something obvious? The relevant code from the raster package:

setMethod("xyValues", signature(object='Raster', xy='vector'),
  function(object, xy, ...) {
    if (length(xy) == 2) {
      callGeneric(object, matrix(xy, ncol=2), ...)
    } else {
      stop('xy coordinates should be a two-column matrix or data.frame, or a vector of two numbers.')
    }
  }
)

setMethod("xyValues", signature(object='RasterLayer', xy='matrix'),
  function(object, xy, method='simple', buffer=NULL, fun=NULL, na.rm=TRUE) {
    if (dim(xy)[2] != 2) {
      stop('xy has wrong dimensions; it should have 2 columns')
    }
    if (! is.null(buffer)) {
      return( .xyvBuf(object, xy, buffer, fun, na.rm=na.rm) )
    }
    if (method=='bilinear') {
      return(.bilinearValue(object, xy))
    } else if (method=='simple') {
      cells <- cellFromXY(object, xy)
      return(.readCells(object, cells))
    } else {
      stop('invalid method argument. Should be simple or bilinear.')
    }
  }
)

-- Joris Meys Statistical consultant Ghent University Faculty of Bioscience Engineering Department of Applied mathematics, biometrics and process control tel : +32 9 264 59 87 joris.m...@ugent.be --- Disclaimer : http://helpdesk.ugent.be/e-maildisclaimer.php
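The vector-to-matrix redispatch pattern described above can be reproduced without the raster package. The sketch below is a hypothetical stand-alone analogue (generic name xyDemo and all classes are illustrative); on current R versions the named argument survives callGeneric() inside a function:

```r
# Minimal stand-alone sketch of the dispatch pattern from the post:
# a vector method coerces xy to a matrix and redispatches via
# callGeneric(), passing extra named arguments through '...'.
library(methods)

setGeneric("xyDemo", function(object, xy, ...) standardGeneric("xyDemo"))

setMethod("xyDemo", signature(object = "numeric", xy = "numeric"),
          function(object, xy, ...) {
            # coerce the vector to a one-row matrix and redispatch
            callGeneric(object, matrix(xy, ncol = 2), ...)
          })

setMethod("xyDemo", signature(object = "numeric", xy = "matrix"),
          function(object, xy, buffer = NULL, ...) {
            list(xy = xy, buffer = buffer)
          })

wrapper <- function(obj, point) {
  local.buffer <- 10          # mirrors eff.distance in the post
  xyDemo(obj, point, buffer = local.buffer)
}

res <- wrapper(1, c(-80, 50))
res$buffer   # 10: the named argument survived the redispatch
```

If this sketch fails inside wrapper() but works at top level, you are seeing the same evaluation problem the post describes.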
[R] Change value of a slot of an S4 object within a method.
Dear all, I have an S4 class with a slot extra, which is a list. I want to be able to add an element called name to that list, containing the object value (which can be a vector, a dataframe or another S4 object). Obviously

setMethod("add.extra", signature = c("PM10Data", "character", "vector"),
  function(object, name, value){
    object@extra[[name]] <- value
  }
)

Contrary to what I would expect, the line

eval(eval(substitute(expression(object@extra[[name]] <- value))))

gives the error: Error in object@extra[[name]] <- value : object 'object' not found. Substitute apparently doesn't work any more in S4 methods... I found a work-around by calling the initializer every time, but this hurts performance. Returning the object is also not an option, as I'd have to remember to assign it each time to the original name. Basically I'm trying to do some call by reference with S4, but don't see how I should do that. How would I be able to tackle this problem in an efficient and elegant way? Thank you in advance Cheers Joris -- Joris Meys Statistical consultant Ghent University Faculty of Bioscience Engineering Department of Applied mathematics, biometrics and process control tel : +32 9 264 59 87 joris.m...@ugent.be --- Disclaimer : http://helpdesk.ugent.be/e-maildisclaimer.php
Re: [R] Change value of a slot of an S4 object within a method.
Hi Steve, thanks for the tip. I'll definitely take a closer look at your solution for implementation in future work. But right now I don't have the time to start rewriting my class definitions. Luckily, I found where exactly things were going wrong. After reading the documentation on evaluation in R, I figured out I have to explicitly specify the environment where substitute should look, namely parent.frame(1). I still don't completely understand why exactly, but it does the job. Thus

eval(eval(substitute(expression(object@extra[[name]] <- value))))

should become

eval(eval(substitute(expression(object@extra[[name]] <- value), env = parent.frame(1))))

Tried it out and it works. Cheers Joris On Wed, Aug 25, 2010 at 6:21 PM, Steve Lianoglou mailinglist.honey...@gmail.com wrote: Hi Joris, On Wed, Aug 25, 2010 at 11:56 AM, Joris Meys jorism...@gmail.com wrote: Dear all, I have an S4 class with a slot extra, which is a list. I want to be able to add an element called name to that list, containing the object value (which can be a vector, a dataframe or another S4 object). Obviously

setMethod("add.extra", signature = c("PM10Data", "character", "vector"),
  function(object, name, value){
    object@extra[[name]] <- value
  }
)

Contrary to what I would expect, the line

eval(eval(substitute(expression(object@extra[[name]] <- value))))

gives the error: Error in object@extra[[name]] <- value : object 'object' not found. Substitute apparently doesn't work any more in S4 methods... I found a work-around by calling the initializer every time, but this hurts performance. Returning the object is also not an option, as I'd have to remember to assign it each time to the original name. Basically I'm trying to do some call by reference with S4, but don't see how I should do that. How would I be able to tackle this problem in an efficient and elegant way? In lots of my own S4 classes I define a slot called .cache, which is an environment, for this exact purpose.
Using this solution for your scenario might look something like this:

setMethod("add.extra", signature = c("PM10Data", "character", "vector"),
  function(object, name, value) {
    object@.cache$extra[[name]] <- value
  })

I'm not sure what your entire problem looks like, but to get your extra list, or a value from it, you could:

setMethod("extra", signature = "PM10Data",
  function(object, name = NULL) {
    if (!is.null(name)) {
      object@.cache$extra[[name]]
    } else {
      object@.cache$extra
    }
  })

... or something like that. The last thing you have to be careful of is that you have to make sure that each new("PM10Data") object initializes its *own* cache:

setClass("PM10Data", representation(..., .cache = 'environment'))

setMethod("initialize", "PM10Data", function(.Object, ..., .cache = new.env()) {
  callNextMethod(.Object, .cache = .cache, ...)
})

Does that make sense? -- Steve Lianoglou Graduate Student: Computational Systems Biology | Memorial Sloan-Kettering Cancer Center | Weill Medical College of Cornell University Contact Info: http://cbio.mskcc.org/~lianos/contact
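The cache-environment pattern suggested above can be run end to end. The sketch below is illustrative only: the class contents (a values slot) are assumptions, and only the .cache mechanics follow the suggestion in the thread:

```r
# Runnable sketch of the .cache-environment pattern described above.
# The 'values' slot is hypothetical; the point is that environments
# have reference semantics, so add.extra() mutates the object in
# place without the caller having to reassign the result.
library(methods)

setClass("PM10Data",
         representation(values = "numeric", .cache = "environment"))

setMethod("initialize", "PM10Data",
          function(.Object, ..., .cache = new.env()) {
            .cache$extra <- list()        # every object gets its own cache
            callNextMethod(.Object, .cache = .cache, ...)
          })

setGeneric("add.extra",
           function(object, name, value) standardGeneric("add.extra"))
setMethod("add.extra", "PM10Data", function(object, name, value) {
  # modifying the environment is visible to the caller immediately
  object@.cache$extra[[name]] <- value
  invisible(object)
})

x <- new("PM10Data", values = c(1, 2, 3))
add.extra(x, "note", "stored by reference")
x@.cache$extra$note   # "stored by reference"
```

Note the custom initialize method: without it, every new("PM10Data") would share the single environment stored in the class prototype.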
[R] strange error : isS4(x) in gamm function (mgcv package). Variable in data-frame not recognized???
Dear all, I run a gamm with the following call:

result <- try(gamm(values ~ s(VM) + s(RH) + s(TT) + s(PP) + RF + weekend + s(day) + s(julday),
                   correlation = corCAR1(form = ~ day | month), data = tmp))

with mgcv version 1.6.2. No stress about the data, the error is not data-related. I get:

Error in isS4(x) : object 'VM' not found

How so? I did define the dataframe to be used, and the dataframe contains a variable VM:

str(tmp)
'data.frame': 4014 obs. of 12 variables:
 $ values : num 73.45 105.45 74.45 41.45 -4.55 ...
 $ dates  : Class 'Date' num [1:4014] 9862 9863 9864 9865 9866 ...
 $ year   : num -5.65 -5.65 -5.65 -5.65 -5.65 ...
 $ day    : num -178 -177 -176 -175 -174 ...
 $ month  : Factor w/ 156 levels "1996-April","1996-August",..: 17 17 17 17 17 17 17 17 17 17 ...
 $ julday : num -2241 -2240 -2239 -2238 -2237 ...
 $ weekend: num -0.289 -0.289 -0.289 0.711 0.711 ...
 $ VM     : num 0.139 -1.451 0.349 0.839 -0.611 ...
 $ RH     : num 55.2 61.4 59.8 64.1 60.7 ...
 $ TT     : num -23.4 -23.6 -19.5 -16.1 -15.3 ...
 $ PP     : num 6.17 4.27 -4.93 -9.23 -2.63 ...
 $ RF     : Ord.factor w/ 3 levels "None"<"2.5mm"<..: 1 1 1 1 1 1 1 1 1 1 ...
 - attr(*, means)= num

Any idea what I'm missing here? Cheers Joris -- Joris Meys Statistical consultant Ghent University Faculty of Bioscience Engineering Department of Applied mathematics, biometrics and process control tel : +32 9 264 59 87 joris.m...@ugent.be --- Disclaimer : http://helpdesk.ugent.be/e-maildisclaimer.php
Re: [R] strange error : isS4(x) in gamm function (mgcv package). Variable in data-frame not recognized???
Dear all, it gets even weirder. After restarting R, the code I used works just fine. The call is generated in a function that I debugged using browser(). The problem is solved, but I have no clue whatsoever how that error came about. It must have something to do with namespaces, but the origin is obscure. I tried to regenerate the error, but didn't succeed. Does somebody have an idea where I should look for a cause? Cheers Joris On Wed, Jul 28, 2010 at 1:16 PM, Joris Meys jorism...@gmail.com wrote: Dear all, I run a gamm with the following call:

result <- try(gamm(values ~ s(VM) + s(RH) + s(TT) + s(PP) + RF + weekend + s(day) + s(julday),
                   correlation = corCAR1(form = ~ day | month), data = tmp))

with mgcv version 1.6.2. No stress about the data, the error is not data-related. I get:

Error in isS4(x) : object 'VM' not found

How so? I did define the dataframe to be used, and the dataframe contains a variable VM:

str(tmp)
'data.frame': 4014 obs. of 12 variables:
 $ values : num 73.45 105.45 74.45 41.45 -4.55 ...
 $ dates  : Class 'Date' num [1:4014] 9862 9863 9864 9865 9866 ...
 $ year   : num -5.65 -5.65 -5.65 -5.65 -5.65 ...
 $ day    : num -178 -177 -176 -175 -174 ...
 $ month  : Factor w/ 156 levels "1996-April","1996-August",..: 17 17 17 17 17 17 17 17 17 17 ...
 $ julday : num -2241 -2240 -2239 -2238 -2237 ...
 $ weekend: num -0.289 -0.289 -0.289 0.711 0.711 ...
 $ VM     : num 0.139 -1.451 0.349 0.839 -0.611 ...
 $ RH     : num 55.2 61.4 59.8 64.1 60.7 ...
 $ TT     : num -23.4 -23.6 -19.5 -16.1 -15.3 ...
 $ PP     : num 6.17 4.27 -4.93 -9.23 -2.63 ...
 $ RF     : Ord.factor w/ 3 levels "None"<"2.5mm"<..: 1 1 1 1 1 1 1 1 1 1 ...
 - attr(*, means)= num

Any idea what I'm missing here?
Cheers Joris -- Joris Meys Statistical consultant Ghent University Faculty of Bioscience Engineering Department of Applied mathematics, biometrics and process control tel : +32 9 264 59 87 joris.m...@ugent.be --- Disclaimer : http://helpdesk.ugent.be/e-maildisclaimer.php -- Joris Meys Statistical consultant Ghent University Faculty of Bioscience Engineering Department of Applied mathematics, biometrics and process control tel : +32 9 264 59 87 joris.m...@ugent.be --- Disclaimer : http://helpdesk.ugent.be/e-maildisclaimer.php
Re: [R] strange error : isS4(x) in gamm function (mgcv package). Variable in data-frame not recognized???
Follow-up: I finally succeeded in more or less reproducing the error. The origin lay in the fact that I accidentally loaded a function while in browser mode debugging that very function. So something went very much wrong with the namespaces. Serves me right... Cheers Joris On Wed, Jul 28, 2010 at 2:27 PM, Joris Meys jorism...@gmail.com wrote: Dear all, it gets even weirder. After restarting R, the code I used works just fine. The call is generated in a function that I debugged using browser(). The problem is solved, but I have no clue whatsoever how that error came about. It must have something to do with namespaces, but the origin is obscure. I tried to regenerate the error, but didn't succeed. Does somebody have an idea where I should look for a cause? Cheers Joris On Wed, Jul 28, 2010 at 1:16 PM, Joris Meys jorism...@gmail.com wrote: Dear all, I run a gamm with the following call:

result <- try(gamm(values ~ s(VM) + s(RH) + s(TT) + s(PP) + RF + weekend + s(day) + s(julday),
                   correlation = corCAR1(form = ~ day | month), data = tmp))

with mgcv version 1.6.2. No stress about the data, the error is not data-related. I get:

Error in isS4(x) : object 'VM' not found

How so? I did define the dataframe to be used, and the dataframe contains a variable VM:

str(tmp)
'data.frame': 4014 obs. of 12 variables:
 $ values : num 73.45 105.45 74.45 41.45 -4.55 ...
 $ dates  : Class 'Date' num [1:4014] 9862 9863 9864 9865 9866 ...
 $ year   : num -5.65 -5.65 -5.65 -5.65 -5.65 ...
 $ day    : num -178 -177 -176 -175 -174 ...
 $ month  : Factor w/ 156 levels "1996-April","1996-August",..: 17 17 17 17 17 17 17 17 17 17 ...
 $ julday : num -2241 -2240 -2239 -2238 -2237 ...
 $ weekend: num -0.289 -0.289 -0.289 0.711 0.711 ...
 $ VM     : num 0.139 -1.451 0.349 0.839 -0.611 ...
 $ RH     : num 55.2 61.4 59.8 64.1 60.7 ...
 $ TT     : num -23.4 -23.6 -19.5 -16.1 -15.3 ...
 $ PP     : num 6.17 4.27 -4.93 -9.23 -2.63 ...
 $ RF     : Ord.factor w/ 3 levels "None"<"2.5mm"<..: 1 1 1 1 1 1 1 1 1 1 ...
 - attr(*, means)= num

Any idea what I'm missing here?
Cheers Joris -- Joris Meys Statistical consultant Ghent University Faculty of Bioscience Engineering Department of Applied mathematics, biometrics and process control tel : +32 9 264 59 87 joris.m...@ugent.be --- Disclaimer : http://helpdesk.ugent.be/e-maildisclaimer.php
Re: [R] R^2 in loess and predict?
I do not agree with your interpretation of the adjusted R^2. The R^2 is no more than the ratio of the explained variance to the total variance, expressed in sums of squares. The adjusted R^2 is adjusted for the degrees of freedom, and can only be used for selection purposes. Its interpretation for the final model is hard, and it is definitely not a measure of how well the model describes the population. For a loess regression this can be calculated as well. But loess is a local regression technique, highly flexible, and highly dependent on the window you use. In these cases, R^2 (or any other goodness-of-fit measure) tells you even less: you can get an R^2 of 1 if you choose the window small enough. If you want to do inference on nonlinear regression techniques, I strongly suggest you use generalized additive models, e.g. from the package mgcv. There you can use the framework of likelihood ratio tests to assess goodness of fit by comparing models. Cheers Joris On Fri, Jul 9, 2010 at 10:42 AM, Ralf B ralf.bie...@gmail.com wrote: Parametric regression produces R^2 as a measure of how well the model predicts the sample and adjusted R^2 as a measure of how well it models the population. What is the equivalent for non-parametric regression (e.g. the loess function)? Ralf -- Joris Meys Statistical consultant Ghent University Faculty of Bioscience Engineering Department of Applied mathematics, biometrics and process control tel : +32 9 264 59 87 joris.m...@ugent.be --- Disclaimer : http://helpdesk.ugent.be/e-maildisclaimer.php
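The point above about the window size can be illustrated numerically. The pseudo-R^2 below (one minus the residual over total sum of squares) is an assumed, illustrative definition of what "R^2 for loess" would mean, not an official loess statistic:

```r
# Illustrative sketch: a sum-of-squares pseudo-R^2 for loess rises as
# the span (the smoothing window) shrinks, as claimed in the reply.
# The pseudo_r2 definition is an assumption for illustration only.
set.seed(42)
x <- seq(0, 10, length.out = 100)
y <- sin(x) + rnorm(100, sd = 0.5)

pseudo_r2 <- function(span) {
  fit <- loess(y ~ x, span = span)
  1 - sum(residuals(fit)^2) / sum((y - mean(y))^2)
}

pseudo_r2(0.75)   # wide window: moderate fit
pseudo_r2(0.10)   # narrow window: higher pseudo-R^2, nearly interpolating
```

Because shrinking the span always reduces the in-sample residuals, the statistic says more about the chosen window than about how well the model would describe the population.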
Re: [R] Non-parametric regression
Just to be correct: gam is mentioned on the page Tal linked to, but it is a semi-parametric approach using maximum likelihood. The advice stays valid, though. Another thing: you detected non-normality, but could you use a Poisson distribution, for example? The framework of generalized linear models and generalized additive models allows you to deal with non-normality of your data. In any case, I suggest you contact a statistician nearby for guidance. Cheers Joris On Fri, Jul 9, 2010 at 10:26 AM, Tal Galili tal.gal...@gmail.com wrote: From reviewing the first Google page of results for Non-parametric regression R, I hope this link will prove useful: http://socserv.mcmaster.ca/jfox/Courses/Oxford-2005/R-nonparametric-regression.html Contact Details: Contact me: tal.gal...@gmail.com | 972-52-7275845 Read me: www.talgalili.com (Hebrew) | www.biostatistics.co.il (Hebrew) | www.r-statistics.com (English) On Fri, Jul 9, 2010 at 11:01 AM, Ralf B ralf.bie...@gmail.com wrote: I have two data sets, each a vector of 1000 numbers, each vector representing a distribution (i.e. 1000 numbers, each of which represents a frequency at one point on a scale between 1 and 1000). For simplification, here is a short version with only 5 points.

a <- c(8, 10, 8, 12, 4)
b <- c(7, 11, 8, 10, 5)

Leaving the obvious discussion about causality aside for a moment, I would like to see how well I can predict b from a using a regression. Since I do not know anything about the distribution type and already discovered non-normality, I cannot use parametric regression or anything GLM for that matter. How should I proceed in using non-parametric regression to model vector a and see how well it predicts b? Perhaps you could extend the given lines into a short example script to give me an idea? Are there any other options?
Best, Ralf -- Joris Meys Statistical consultant Ghent University Faculty of Bioscience Engineering Department of Applied mathematics, biometrics and process control tel : +32 9 264 59 87 joris.m...@ugent.be --- Disclaimer : http://helpdesk.ugent.be/e-maildisclaimer.php
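The GAM suggestion in the reply above can be sketched with mgcv (assumed available, as it ships with R); the data, variable names, and the use of an approximate F-test via anova() are illustrative assumptions, not the poster's analysis:

```r
# Hedged sketch of the GAM route suggested above: fit two nested
# generalized additive models with mgcv and compare their fit with
# an approximate test via anova(), instead of relying on an R^2.
library(mgcv)
set.seed(3)
x1 <- runif(200)
x2 <- runif(200)
y  <- sin(2 * pi * x1) + rnorm(200, sd = 0.3)   # y depends on x1 only

m0 <- gam(y ~ s(x1))
m1 <- gam(y ~ s(x1) + s(x2))
anova(m0, m1, test = "F")   # does adding s(x2) improve the fit?
```

This model-comparison framework is what replaces the sample R^2 when judging whether extra smooth terms are warranted.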
Re: [R] split with list
One solution is to put these unwanted entries to "":

repor$`9853312`[1:2, 2:3] <- ""

Cheers Joris On Fri, Jul 9, 2010 at 12:18 PM, n.via...@libero.it n.via...@libero.it wrote: Dear List, I would like to ask you something concerning a nicer print of the R output. I have a big data frame with the following structure:

CFISCALE RAGSOCB ANNO VAR1 VAR2
9853312  astra   2005  6   45
9853312  astra   2006 78   45
9853312  astra   2007 55   76
9653421  geox    2005 35   89
9653421  geox    2006 24   33
9653421  geox    2007 54   55

The first thing I did was to split my data frame by CFISCALE. The result is that R transformed my data frame into a list. The second step was to transpose each element of my list:

repo <- split(rep, rep$CFISCALE)
repor <- lapply(repo, function(x){ t(x) })

When I print my list the format is the following:

$`9853312`
         1       2       3
CFISCALE 9853312 9853312 9853312
RAGSOCB  astra   astra   astra
ANNO     2005    2006    2007
VAR1     6       78      55
VAR2     45      45      76

Is there a way to remove the first row, I mean 1, 2, 3, and to have just one CFISCALE and RAGSOCB? For the second problem I tried to use unique, but it seems that it doesn't work for lists. So what I would like to get is:

$`9853312`
CFISCALE 9853312
RAGSOCB  astra
ANNO     2005 2006 2007
VAR1     6    78   55
VAR2     45   45   76

This is because I next run xtable on my list in order to get a table in LaTeX, which I would like to be in a nice format. Thanks a lot for your attention!
-- Joris Meys Statistical consultant Ghent University Faculty of Bioscience Engineering Department of Applied mathematics, biometrics and process control tel : +32 9 264 59 87 joris.m...@ugent.be --- Disclaimer : http://helpdesk.ugent.be/e-maildisclaimer.php __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
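[Editor's sketch, not from the thread: one way to get the layout the poster asks for is to keep the constant columns (CFISCALE, RAGSOCB) once per group and transpose only the year-varying columns. The data below are made up to mimic the structure in the question.]

```r
# Hypothetical data mimicking the poster's data frame
rep <- data.frame(
  CFISCALE = c(9853312, 9853312, 9853312, 9653421, 9653421, 9653421),
  RAGSOCB  = c("astra", "astra", "astra", "geox", "geox", "geox"),
  ANNO     = c(2005, 2006, 2007, 2005, 2006, 2007),
  VAR1     = c(6, 78, 55, 35, 24, 54),
  VAR2     = c(45, 45, 76, 89, 33, 55)
)
repor <- lapply(split(rep, rep$CFISCALE), function(x) {
  body <- t(x[, c("ANNO", "VAR1", "VAR2")])  # transpose only varying columns
  colnames(body) <- NULL                     # drops the 1, 2, 3 header
  list(CFISCALE = x$CFISCALE[1],             # constants kept once
       RAGSOCB  = x$RAGSOCB[1],
       values   = body)
})
repor[["9853312"]]
```

Each list element can then be fed to xtable piecewise (header line plus values matrix).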
Re: [R] random sample from arrays
Could you elaborate? Both x <- 1:4 set <- matrix(nrow = 50, ncol = 11) for(i in c(1:11)){ set[,i] <- sample(x,50) print(c(i, "-", set), quote = FALSE) } and x <- 1:4 set <- matrix(nrow = 50, ncol = 11) for(i in c(1:50)){ set[i,] <- sample(x,11) print(c(i, "-", set), quote = FALSE) } run perfectly fine on my computer. Cheers On Fri, Jul 9, 2010 at 3:10 PM, Assa Yeroslaviz fry...@gmail.com wrote: Hi Joris, I guess I did it wrong again, but your example didn't work either. I still get the error message, but replicate works just fine. I can even replicate the whole array lines. THX Assa On Thu, Jul 8, 2010 at 15:20, Joris Meys jorism...@gmail.com wrote: Don't know what exactly you're trying to do, but you make a matrix with 11 columns and 50 rows, then treat it as a vector. On top of that, you try to fill 50 rows/columns with 50 values. Of course that doesn't work. Did you check the warning messages when running the code? Either do : for(i in c(1:11)){ set[,i] <- sample(x,50) print(c(i, "-", set), quote = FALSE) } or for(i in c(1:50)){ set[i,] <- sample(x,11) print(c(i, "-", set), quote = FALSE) } Or just forget about the loop altogether and do : set <- replicate(11, sample(x,50)) or set <- t(replicate(50, sample(x,11))) cheers On Thu, Jul 8, 2010 at 8:04 AM, Assa Yeroslaviz fry...@gmail.com wrote: Hello R users, I'm trying to extract random samples from a big array I have. I have a data frame of over 40k lines and would like to produce around 50 random samples of around 200 lines each from this array. 
this is the matrix ID xxx_1c xxx__2c xxx__3c xxx__4c xxx__5T xxx__6T xxx__7T xxx__8T yyy_1c yyy_1c _2c 1 A_512 2.150295 2.681759 2.177138 2.142790 2.115344 2.013047 2.115634 2.189372 1.643328 1.563523 2 A_134 12.832488 12.596373 12.882581 12.987091 11.956149 11.994779 11.650336 11.995504 13.024494 12.776322 3 A_152 2.063276 2.160961 2.067549 2.059732 2.656416 2.075775 2.033982 2.111937 1.606340 1.548940 4 A_163 9.570761 10.448615 9.432859 9.732615 10.354234 10.993279 9.160038 9.104121 10.079177 9.828757 5 A_184 3.574271 4.680859 4.517047 4.047096 3.623668 3.021356 3.559434 3.156093 4.308437 4.045098 6 A_199 7.593952 7.454087 7.513013 7.449552 7.345718 7.367068 7.410085 7.022582 7.668616 7.953706 ... I tried to do it with a for loop: genelist <- read.delim("/user/R/raw_data.txt") rownames(genelist) <- genelist[,1] genes <- rownames(genelist) x <- 1:4 set <- matrix(nrow = 50, ncol = 11) for(i in c(1:50)){ set[i] <- sample(x,50) print(c(i, "-", set), quote = FALSE) } which basically does the trick, but I just can't save the results outside the loop. After having the random sets of lines, it wasn't a problem to extract the lines from the arrays using subset. genSet1 <- sample(x,50) random1 <- genes %in% genSet1 subsetGenelist <- subset(genelist, random1) Is there a different way of creating these random vectors, or of saving the loop results outside the loop so I can work with them? Thanks a lot Assa [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. 
-- Joris Meys Statistical consultant Ghent University Faculty of Bioscience Engineering Department of Applied mathematics, biometrics and process control tel : +32 9 264 59 87 joris.m...@ugent.be --- Disclaimer : http://helpdesk.ugent.be/e-maildisclaimer.php __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
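[Editor's sketch of what the poster actually describes: drawing 50 random samples of 200 rows each from a large data frame. Rows can be sampled by index, so no cell-by-cell loop is needed. The data frame below is a made-up stand-in.]

```r
# Stand-in for the poster's ~40k-line data frame
set.seed(1)
genelist <- data.frame(ID    = paste0("A_", 1:40000),
                       value = rnorm(40000))
# 50 independent samples of 200 rows each, sampled by row index
samples <- lapply(1:50, function(i)
  genelist[sample(nrow(genelist), 200), ])
length(samples)      # number of sampled data frames
nrow(samples[[1]])   # rows per sample
```

Keeping the samples in a list (instead of printing inside a loop) is what makes them available after the loop finishes.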
Re: [R] relation in aggregated data
Depending on the data and the research question, a meta-analytic approach might be appropriate. You can see every campaign as a study. See the package metafor for an example. You can only draw very general conclusions, but at least your inference will be closer to correct. Cheers Joris On Thu, Jul 8, 2010 at 10:03 AM, Petr PIKAL petr.pi...@precheza.cz wrote: Thank you Actually when I do this myself I always try to make day or week averages if possible. However this was done by one of my colleagues and basically the aggregation was done on the basis of campaigns. There are 4 to 6 campaigns per year, and sometimes there is an apparent relationship in the aggregated data and sometimes there is not. My opinion is that I can not say much about exact relations until I have other clues, like expected underlying laws of physics. Thanks again Best regards Petr Joris Meys jorism...@gmail.com napsal dne 07.07.2010 17:33:55: Your examples are pretty extreme... Combining 120 data points into 4 points is of course never going to give a result. Try : fac <- rep(1:8, each=15) xprum <- tapply(x, fac, mean) yprum <- tapply(y, fac, mean) plot(xprum, yprum) The relation is not obvious, but visible. Yes, you lose information. Yes, your hypothesis changes. But in the case you describe, averaging the x-values for every day (so you get an average linked to 1 y value) seems like a possibility, given you take that into account when formulating the hypothesis. Optimally, you should take the standard error of the average into account for the analysis, but this is complicated, often not done, and in most cases ignoring this issue does not influence the results to the extent that it becomes important. YMMV Cheers On Wed, Jul 7, 2010 at 4:24 PM, Petr PIKAL petr.pi...@precheza.cz wrote: Dear all My question is more on statistics than on R, however it can be demonstrated by R. It is about the pros and cons of trying to find a relationship in aggregated data. 
I can have two variables which can be related, and I measure them regularly during some time (let's say a year), but I can not measure them at the same time (e.g. I can not measure x and the respective value of y; usually I have 3 or more values of x and only one value of y per day). I can make aggregated values (let's say quarterly). My questions are: 1. Is such an approach sound? Can I use it? 2. What could be the problems? 3. Is there any other method to inspect variables which can be related but which you can not directly measure at the same time? My opinion is that it is not very sound to inspect aggregated values, and there can be many traps, especially if there are only a few aggregated values. Below you can see my examples. If you have some opinion on this issue, please let me know. Best regards Petr Let us have a relation x/y set.seed(555) x <- rnorm(120) y <- 5*x+3+rnorm(120) plot(x, y) As you can see, there is a clear relation which can be seen from the plot. Now I make a factor for aggregation. fac <- rep(1:4, each=30) xprum <- tapply(x, fac, mean) yprum <- tapply(y, fac, mean) plot(xprum, yprum) The relationship is completely gone. Now let us make other fake data xn <- runif(120)*rep(1:4, each=30) yn <- runif(120)*rep(1:4, each=30) plot(xn, yn) There is no visible relation; xn and yn are independent but related to the aggregation factor. xprumn <- tapply(xn, fac, mean) yprumn <- tapply(yn, fac, mean) plot(xprumn, yprumn) Here you can see a perfect relation which is only due to the aggregation factor. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. 
-- Joris Meys Statistical consultant Ghent University Faculty of Bioscience Engineering Department of Applied mathematics, biometrics and process control tel : +32 9 264 59 87 joris.m...@ugent.be --- Disclaimer : http://helpdesk.ugent.be/e-maildisclaimer.php __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
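[Editor's sketch of the meta-analytic idea suggested in the thread: treat each campaign as its own "study", estimate the relation within each campaign, then pool the per-campaign estimates with inverse-variance weights. The metafor package does this properly via rma(); the base-R version below illustrates the mechanics on made-up data with a true slope of 5.]

```r
set.seed(42)
campaign <- rep(1:6, each = 20)         # six hypothetical campaigns
x <- rnorm(120)
y <- 5 * x + 3 + rnorm(120)
# slope estimate and its standard error within each campaign
per_campaign <- lapply(split(data.frame(x, y), campaign), function(d)
  summary(lm(y ~ x, data = d))$coefficients["x", c("Estimate", "Std. Error")])
est <- sapply(per_campaign, `[[`, "Estimate")
se  <- sapply(per_campaign, `[[`, "Std. Error")
w   <- 1 / se^2
pooled <- sum(w * est) / sum(w)         # fixed-effect pooled slope
pooled                                  # close to the true slope of 5
```

Unlike averaging x and y first, this keeps the within-campaign relation intact and only then combines the evidence across campaigns.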
Re: [R] package installation for Windows 7
Hi, I am running Windows 7 and R 2.11.1, and everything is installing just fine for me. Did you install R in the Program Files folder? If so, uninstall and try to re-install R in another folder (e.g. c:\R\R2.11.1\ like on my computer). I noticed in the past that the access control of Windows treats the Program Files folder differently from other folders. Cheers Joris On Thu, Jul 8, 2010 at 1:15 PM, Dave Bickel davidbickel.com+re...@gmail.com wrote: Neither biocLite nor the GUI menus can install packages on my system. Here is relevant output: version _ platform i386-pc-mingw32 arch i386 os mingw32 system i386, mingw32 status major 2 minor 11.1 year 2010 month 05 day 31 svn rev 52157 language R version.string R version 2.11.1 (2010-05-31) source("http://bioconductor.org/biocLite.R") BioC_mirror = http://www.bioconductor.org Change using chooseBioCmirror(). biocLite("ALL") Using R version 2.11.1, biocinstall version 2.6.7. Installing Bioconductor version 2.6 packages: [1] ALL Please wait... Warning in install.packages(pkgs = pkgs, repos = repos, ...) : argument 'lib' is missing: using '\Users\dbickel/R/win-library/2.11' Error in if (ok) { : missing value where TRUE/FALSE needed source("http://bioconductor.org/biocLite.R") BioC_mirror = http://www.bioconductor.org Change using chooseBioCmirror(). Warning messages: 1: In safeSource() : Redefining ‘biocinstall’ 2: In safeSource() : Redefining ‘biocinstallPkgGroups’ 3: In safeSource() : Redefining ‘biocinstallRepos’ biocLite() Using R version 2.11.1, biocinstall version 2.6.7. Installing Bioconductor version 2.6 packages: [1] affy affydata affyPLM affyQCReport [5] annaffy annotate Biobase biomaRt [9] Biostrings DynDoc gcrma genefilter [13] geneplotter GenomicRanges hgu95av2.db limma [17] marray multtest vsn xtable Please wait... Warning in install.packages(pkgs = pkgs, repos = repos, ...) 
: argument 'lib' is missing: using '\Users\dbickel/R/win-library/2.11' Error in if (ok) { : missing value where TRUE/FALSE needed biocLite() Using R version 2.11.1, biocinstall version 2.6.7. Installing Bioconductor version 2.6 packages: [1] affy affydata affyPLM affyQCReport [5] annaffy annotate Biobase biomaRt [9] Biostrings DynDoc gcrma genefilter [13] geneplotter GenomicRanges hgu95av2.db limma [17] marray multtest vsn xtable Please wait... Warning in install.packages(pkgs = pkgs, repos = repos, ...) : argument 'lib' is missing: using '\Users\dbickel/R/win-library/2.11' Error in if (ok) { : missing value where TRUE/FALSE needed library(Biobase) Error in library(Biobase) : there is no package called 'Biobase' biocLite("Biobase") Using R version 2.11.1, biocinstall version 2.6.7. Installing Bioconductor version 2.6 packages: [1] Biobase Please wait... Warning in install.packages(pkgs = pkgs, repos = repos, ...) : argument 'lib' is missing: using '\Users\dbickel/R/win-library/2.11' Error in if (ok) { : missing value where TRUE/FALSE needed source("http://bioconductor.org/biocLite.R") BioC_mirror = http://www.bioconductor.org Change using chooseBioCmirror(). Warning messages: 1: In safeSource() : Redefining ‘biocinstall’ 2: In safeSource() : Redefining ‘biocinstallPkgGroups’ 3: In safeSource() : Redefining ‘biocinstallRepos’ biocLite("Biobase") Using R version 2.11.1, biocinstall version 2.6.7. Installing Bioconductor version 2.6 packages: [1] Biobase Please wait... Warning in install.packages(pkgs = pkgs, repos = repos, ...) : argument 'lib' is missing: using '\Users\dbickel/R/win-library/2.11' Error in if (ok) { : missing value where TRUE/FALSE needed utils:::menuInstallLocal() # Install package(s) from local zip files... Error in if (ok) { : missing value where TRUE/FALSE needed utils:::menuInstallPkgs() # Install package(s)... 
--- Please select a CRAN mirror for use in this session --- Error in if (ok) { : missing value where TRUE/FALSE needed I would appreciate any assistance. David -- David R. Bickel, PhD Associate Professor Ottawa Institute of Systems Biology Biochem., Micro. and I. Department Mathematics and Statistics Department University of Ottawa 451 Smyth Road Ottawa, Ontario K1H 8M5 http://www.statomics.com Office Tel: (613) 562-5800 ext. 8670 Office Fax: (613) 562-5185 Office Room: RGN 4510F (Follow the signs to the elevator, and take it to the fourth floor. Turn left and go all the way to the end of the hall, and enter the door to the OISB area.) Lab Tel.: (613) 562-5800 ext. 8304 Lab Room: RGN 4501T __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Joris Meys Statistical consultant Ghent University Faculty of Bioscience Engineering Department of Applied mathematics, biometrics and process control tel : +32 9
Re: [R] how to determine order of arima bt acf and pacf
Did you read these already? http://www.statoek.wiso.uni-goettingen.de/veranstaltungen/zeitreihen/sommer03/ts_r_intro.pdf http://www.barigozzi.eu/ARIMA.pdf Cheers Joris On Thu, Jul 8, 2010 at 11:45 AM, vijaysheegi vijay.she...@gmail.com wrote: Hi R community, I am new to R. I have some doubts on ACF and PACF. Applying acf and pacf to a data frame or table, how do I determine the ARIMA orders that suit my example? Please assist me on this, and also please give me a link for the same so that I can try to understand the solution. Thanks in advance vijaysheegi student -- View this message in context: http://r.789695.n4.nabble.com/how-to-determine-order-of-arima-bt-acf-and-pacf-tp2282053p2282053.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Joris Meys Statistical consultant Ghent University Faculty of Bioscience Engineering Department of Applied mathematics, biometrics and process control tel : +32 9 264 59 87 joris.m...@ugent.be --- Disclaimer : http://helpdesk.ugent.be/e-maildisclaimer.php __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
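[Editor's sketch, not from the thread: the usual workflow the references above describe is to inspect the ACF/PACF of the series and read off candidate orders. A simulated AR(2) series makes the pattern concrete.]

```r
set.seed(123)
# Simulate a stationary AR(2) process: y_t = 0.6*y_{t-1} + 0.3*y_{t-2} + e_t
y <- arima.sim(model = list(ar = c(0.6, 0.3)), n = 500)
acf(y)     # tails off slowly, as expected for an AR process
pacf(y)    # roughly cuts off after lag 2, suggesting AR(2)
fit <- arima(y, order = c(2, 0, 0))
coef(fit)  # ar1, ar2 and the intercept
```

For an MA(q) process the roles reverse: the ACF cuts off after lag q while the PACF tails off.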
Re: [R] strsplit(dia ma, \\b) splits characterwise
I guess this is expected behaviour, although counterintuitive. \b represents an empty string indicating a word boundary, but it is coerced to character and is thus simply the empty string. This means the output you get is the same as strsplit("dia ma", "", perl=TRUE) [[1]] [1] "d" "i" "a" " " "m" "a" I'd use the separating character as split in strsplit, e.g. strsplit("dia ma", "\\s") [[1]] [1] "dia" "ma" If you need the space in the list as well, you'll have to work around it, I guess. test <- as.vector(gregexpr("\\b", "dia ma", perl=TRUE)[[1]]) test [1] 1 4 5 7 apply(embed(test,2), 1, function(x) substr("dia ma", x[2], x[1]-1)) [1] "dia" " " "ma" It would be nice if special characters like \b were recognized by strsplit as well, though. Cheers Joris On Thu, Jul 8, 2010 at 10:15 AM, Suharto Anggono Suharto Anggono suharto_angg...@yahoo.com wrote: \b is a word boundary. But, unexpectedly, strsplit("dia ma", "\\b") splits character by character. strsplit("dia ma", "\\b") [[1]] [1] "d" "i" "a" " " "m" "a" strsplit("dia ma", "\\b", perl=TRUE) [[1]] [1] "d" "i" "a" " " "m" "a" How can that be? This is the output of 'gregexpr'. gregexpr("\\b", "dia ma") [[1]] [1] 1 2 3 4 5 6 attr(,"match.length") [1] 0 0 0 0 0 0 gregexpr("\\b", "dia ma", perl=TRUE) [[1]] [1] 1 4 5 7 attr(,"match.length") [1] 0 0 0 0 The output from gregexpr("\\b", "dia ma", perl=TRUE) is what I expect. I expect 'strsplit' to split at those points. This is in Windows. R was installed from binary. sessionInfo() R version 2.11.1 (2010-05-31) i386-pc-mingw32 locale: [1] LC_COLLATE=English_United States.1252 [2] LC_CTYPE=English_United States.1252 [3] LC_MONETARY=English_United States.1252 [4] LC_NUMERIC=C [5] LC_TIME=English_United States.1252 attached base packages: [1] stats graphics grDevices utils datasets methods base R 2.8.1 shows the same 'strsplit' behavior, but the behavior of default 'gregexpr' (i.e. perl=FALSE) is different. 
strsplit("dia ma", "\\b") [[1]] [1] "d" "i" "a" " " "m" "a" strsplit("dia ma", "\\b", perl=TRUE) [[1]] [1] "d" "i" "a" " " "m" "a" gregexpr("\\b", "dia ma") [[1]] [1] 1 4 5 7 attr(,"match.length") [1] 0 0 0 0 gregexpr("\\b", "dia ma", perl=TRUE) [[1]] [1] 1 4 5 7 attr(,"match.length") [1] 0 0 0 0 sessionInfo() R version 2.8.1 (2008-12-22) i386-pc-mingw32 locale: LC_COLLATE=English_United States.1252;LC_CTYPE=English_United States.1252;LC_MONETARY=English_United States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252 attached base packages: [1] stats graphics grDevices utils datasets methods base __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Joris Meys Statistical consultant Ghent University Faculty of Bioscience Engineering Department of Applied mathematics, biometrics and process control tel : +32 9 264 59 87 joris.m...@ugent.be --- Disclaimer : http://helpdesk.ugent.be/e-maildisclaimer.php __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
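[Editor's sketch of an alternative to the gregexpr/embed/substr workaround above: instead of splitting on the zero-width \b, extract the alternating word and non-word runs directly with regmatches(), which keeps the space as its own token.]

```r
x <- "dia ma"
# \w+ matches word runs, \W+ matches the separators between them
out <- regmatches(x, gregexpr("\\w+|\\W+", x))[[1]]
out   # "dia" " " "ma"
```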
Re: [R] package installation for Windows 7
As far as my experience goes, there is no need whatsoever to run R as an administrator to install any package, including BioConductor, provided you stay away from the Program Files folder. Cheers Joris On Thu, Jul 8, 2010 at 2:55 PM, Duncan Murdoch murdoch.dun...@gmail.com wrote: On 08/07/2010 7:15 AM, Dave Bickel wrote: Neither biocLite nor the GUI menus can install packages on my system. This is probably a permissions problem. Are you running R as an administrator when you try the install? If not, you won't be able to install to the default location, but you should be able to set up your own library somewhere else, and install there. The error message should be fixed; we shouldn't give Error in if (ok) { : missing value where TRUE/FALSE needed but I doubt if that's the source of your problem. BTW, in future Bioconductor questions should go to their list. In this case it looks as though it's not a BioC problem, but that just means there's no need to list all the biocLite.R material at the start: short simple bug reports are easier to deal with. Duncan Murdoch Here is relevant output: version _ platform i386-pc-mingw32 arch i386 os mingw32 system i386, mingw32 status major 2 minor 11.1 year 2010 month 05 day 31 svn rev 52157 language R version.string R version 2.11.1 (2010-05-31) source("http://bioconductor.org/biocLite.R") BioC_mirror = http://www.bioconductor.org Change using chooseBioCmirror(). biocLite("ALL") Using R version 2.11.1, biocinstall version 2.6.7. Installing Bioconductor version 2.6 packages: [1] ALL Please wait... Warning in install.packages(pkgs = pkgs, repos = repos, ...) : argument 'lib' is missing: using '\Users\dbickel/R/win-library/2.11' Error in if (ok) { : missing value where TRUE/FALSE needed source("http://bioconductor.org/biocLite.R") BioC_mirror = http://www.bioconductor.org Change using chooseBioCmirror(). 
Warning messages: 1: In safeSource() : Redefining ‘biocinstall’ 2: In safeSource() : Redefining ‘biocinstallPkgGroups’ 3: In safeSource() : Redefining ‘biocinstallRepos’ biocLite() Using R version 2.11.1, biocinstall version 2.6.7. Installing Bioconductor version 2.6 packages: [1] affy affydata affyPLM affyQCReport [5] annaffy annotate Biobase biomaRt [9] Biostrings DynDoc gcrma genefilter [13] geneplotter GenomicRanges hgu95av2.db limma [17] marray multtest vsn xtable Please wait... Warning in install.packages(pkgs = pkgs, repos = repos, ...) : argument 'lib' is missing: using '\Users\dbickel/R/win-library/2.11' Error in if (ok) { : missing value where TRUE/FALSE needed biocLite() Using R version 2.11.1, biocinstall version 2.6.7. Installing Bioconductor version 2.6 packages: [1] affy affydata affyPLM affyQCReport [5] annaffy annotate Biobase biomaRt [9] Biostrings DynDoc gcrma genefilter [13] geneplotter GenomicRanges hgu95av2.db limma [17] marray multtest vsn xtable Please wait... Warning in install.packages(pkgs = pkgs, repos = repos, ...) : argument 'lib' is missing: using '\Users\dbickel/R/win-library/2.11' Error in if (ok) { : missing value where TRUE/FALSE needed library(Biobase) Error in library(Biobase) : there is no package called 'Biobase' biocLite("Biobase") Using R version 2.11.1, biocinstall version 2.6.7. Installing Bioconductor version 2.6 packages: [1] Biobase Please wait... Warning in install.packages(pkgs = pkgs, repos = repos, ...) : argument 'lib' is missing: using '\Users\dbickel/R/win-library/2.11' Error in if (ok) { : missing value where TRUE/FALSE needed source("http://bioconductor.org/biocLite.R") BioC_mirror = http://www.bioconductor.org Change using chooseBioCmirror(). Warning messages: 1: In safeSource() : Redefining ‘biocinstall’ 2: In safeSource() : Redefining ‘biocinstallPkgGroups’ 3: In safeSource() : Redefining ‘biocinstallRepos’ biocLite("Biobase") Using R version 2.11.1, biocinstall version 2.6.7. 
Installing Bioconductor version 2.6 packages: [1] Biobase Please wait... Warning in install.packages(pkgs = pkgs, repos = repos, ...) : argument 'lib' is missing: using '\Users\dbickel/R/win-library/2.11' Error in if (ok) { : missing value where TRUE/FALSE needed utils:::menuInstallLocal() # Install package(s) from local zip files... Error in if (ok) { : missing value where TRUE/FALSE needed utils:::menuInstallPkgs() # Install package(s)... --- Please select a CRAN mirror for use in this session --- Error in if (ok) { : missing value where TRUE/FALSE needed I would appreciate any assistance. David __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Joris Meys Statistical consultant Ghent University Faculty of Bioscience Engineering Department of Applied
Re: [R] random sample from arrays
Don't know what exactly you're trying to do, but you make a matrix with 11 columns and 50 rows, then treat it as a vector. On top of that, you try to fill 50 rows/columns with 50 values. Of course that doesn't work. Did you check the warning messages when running the code? Either do : for(i in c(1:11)){ set[,i] <- sample(x,50) print(c(i, "-", set), quote = FALSE) } or for(i in c(1:50)){ set[i,] <- sample(x,11) print(c(i, "-", set), quote = FALSE) } Or just forget about the loop altogether and do : set <- replicate(11, sample(x,50)) or set <- t(replicate(50, sample(x,11))) cheers On Thu, Jul 8, 2010 at 8:04 AM, Assa Yeroslaviz fry...@gmail.com wrote: Hello R users, I'm trying to extract random samples from a big array I have. I have a data frame of over 40k lines and would like to produce around 50 random samples of around 200 lines each from this array. this is the matrix ID xxx_1c xxx__2c xxx__3c xxx__4c xxx__5T xxx__6T xxx__7T xxx__8T yyy_1c yyy_1c _2c 1 A_512 2.150295 2.681759 2.177138 2.142790 2.115344 2.013047 2.115634 2.189372 1.643328 1.563523 2 A_134 12.832488 12.596373 12.882581 12.987091 11.956149 11.994779 11.650336 11.995504 13.024494 12.776322 3 A_152 2.063276 2.160961 2.067549 2.059732 2.656416 2.075775 2.033982 2.111937 1.606340 1.548940 4 A_163 9.570761 10.448615 9.432859 9.732615 10.354234 10.993279 9.160038 9.104121 10.079177 9.828757 5 A_184 3.574271 4.680859 4.517047 4.047096 3.623668 3.021356 3.559434 3.156093 4.308437 4.045098 6 A_199 7.593952 7.454087 7.513013 7.449552 7.345718 7.367068 7.410085 7.022582 7.668616 7.953706 ... I tried to do it with a for loop: genelist <- read.delim("/user/R/raw_data.txt") rownames(genelist) <- genelist[,1] genes <- rownames(genelist) x <- 1:4 set <- matrix(nrow = 50, ncol = 11) for(i in c(1:50)){ set[i] <- sample(x,50) print(c(i, "-", set), quote = FALSE) } which basically does the trick, but I just can't save the results outside the loop. 
After having the random sets of lines, it wasn't a problem to extract the lines from the arrays using subset. genSet1 <- sample(x,50) random1 <- genes %in% genSet1 subsetGenelist <- subset(genelist, random1) Is there a different way of creating these random vectors, or of saving the loop results outside the loop so I can work with them? Thanks a lot Assa [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Joris Meys Statistical consultant Ghent University Faculty of Bioscience Engineering Department of Applied mathematics, biometrics and process control tel : +32 9 264 59 87 joris.m...@ugent.be --- Disclaimer : http://helpdesk.ugent.be/e-maildisclaimer.php __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Using nlm or optim
Without data I can't check, but try : mle(nll, start=list(c=0.01, z=2.1, s=200), fixed=list(V=Var, M=Mean)) With a random dataset I get : Mean <- rnorm(136) Var <- 1 + rnorm(136)^2 mle(nll, start=list(c=0.01, z=2.1, s=200), fixed=list(V=Var, M=Mean)) Error in optim(start, f, method = method, hessian = TRUE, ...) : initial value in 'vmmin' is not finite This might be just a data problem, but again, I'm not sure. Cheers Joris On Thu, Jul 8, 2010 at 3:11 AM, Anita Narwani anitanarw...@gmail.com wrote: Hello, I am trying to use nlm to estimate the parameters that minimize the following function: Predict <- function(M,c,z){ + v = c*M^z + return(v) + } M is a variable and c and z are parameters to be estimated. I then write the negative loglikelihood function assuming normal errors: nll <- function(M,V,c,z,s){ n <- length(Mean) logl <- -.5*n*log(2*pi) - .5*n*log(s) - (1/(2*s))*sum((V-Predict(Mean,c,z))^2) return(-logl) } When I put the Mean and Variance (variables with 136 observations) into this function, and estimates for c, z, and s, it outputs the estimate for the normal negative loglikelihood given the data, so I know that this works. However, I am unable to use mle to estimate the parameters c, z, and s. I do not know how or where the data, i.e. Mean (M) and Variance (V), should enter into the mle function. I have tried variations on mle(nll, start=list(c=0.01, z=2.1, s=200)) including mle(nll, start=list(M=Mean, V=Var, c=0.01, z=2.1, s=200)) I keep getting errors and am quite certain that I just have a syntax error in the script because I don't know how to enter the variables into MLE. Thanks for your help, Anita. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. 
-- Joris Meys Statistical consultant Ghent University Faculty of Bioscience Engineering Department of Applied mathematics, biometrics and process control tel : +32 9 264 59 87 joris.m...@ugent.be --- Disclaimer : http://helpdesk.ugent.be/e-maildisclaimer.php __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
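[Editor's sketch, not the thread's exact code: another way to wire the data into stats4::mle is to let the negative log-likelihood close over the data vectors, so that mle only sees the parameters to be estimated. The data below are made up; names follow the thread.]

```r
library(stats4)
set.seed(1)
Mean <- abs(rnorm(136)) + 0.5                # stand-in for the real data
Var  <- 2 * Mean^1.8 + rnorm(136, sd = 0.2)
# negative log-likelihood: only parameters are arguments,
# Mean and Var are taken from the enclosing environment
nll <- function(c, z, s) {
  v <- c * Mean^z                            # the power-function model
  n <- length(Mean)
  -(-.5 * n * log(2 * pi) - .5 * n * log(s) -
      (1 / (2 * s)) * sum((Var - v)^2))
}
# bounded optimizer keeps s (the variance parameter) positive
fit <- mle(nll, start = list(c = 1, z = 1, s = 1),
           method = "L-BFGS-B",
           lower = c(c = 1e-3, z = 1e-3, s = 1e-3))
coef(fit)
```

The `fixed` argument of mle is meant for parameters held constant during optimization, not for passing data, which is why the attempts quoted above fail.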
Re: [R] package installation for Windows 7
On Thu, Jul 8, 2010 at 3:39 PM, Duncan Murdoch murdoch.dun...@gmail.com wrote: On 08/07/2010 9:26 AM, David Bickel wrote: Yes, the User into which I logged in before launching RGui is an Administrator. Correct, the problem is not limited to Bioconductor packages. On Windows 7, it's not enough to have the user be an administrator, you need to run programs specifically with administrative rights. You do this by right-clicking on the icon and choosing Run as administrator. The reason for this is that Program Files is a protected directory, and changes can only be made by programs that get administrator rights (which is not the same as running a program in an administrator account). Without those administrator rights, R cannot write to the default folder for the packages, as that one is also inside Program Files. Changing Rprofile.site also becomes a hassle, as do so many other tweaks. Actually, the only programs that really belong there are Microsoft-related applications. If there's no strict need to place a program there, I stay away from that folder as far as possible. Vista has the same problem, by the way, and obviously the same solutions. Cheers Joris -- Joris Meys Statistical consultant Ghent University Faculty of Bioscience Engineering Department of Applied mathematics, biometrics and process control tel : +32 9 264 59 87 joris.m...@ugent.be --- Disclaimer : http://helpdesk.ugent.be/e-maildisclaimer.php __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
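[Editor's sketch of the workaround discussed in this thread: keep a personal package library outside the protected Program Files tree so no administrator rights are needed. The path below is purely illustrative; adjust it to your own setup.]

```r
# Create a user-writable library and put it first on the search path
lib <- file.path(Sys.getenv("HOME"), "R", "my-library")
dir.create(lib, recursive = TRUE, showWarnings = FALSE)
.libPaths(c(lib, .libPaths()))   # new installs land in `lib` first
.libPaths()[1]
# install.packages("Biobase", lib = lib)  # then install as usual
```

Putting the `.libPaths()` call in a startup file such as .Rprofile makes the personal library available in every session.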
Re: [R] Different goodness of fit tests leads to contradictory conclusions
The first two options are GOF tests for ungrouped data; the latter two can only be used for grouped data. In my experience, the G^2 test is more influenced by this than the X^2 test (it gives -wrongly- significant deviations more easily when used for ungrouped data). If you started from binary data, you can only use the Hosmer tests. Cheers Joris On Wed, Jul 7, 2010 at 4:00 PM, Kiyoshi Sasaki skiyoshi2...@yahoo.com wrote: I am trying to test goodness of fit for my logistic regression using several options as shown below: the Hosmer-Lemeshow test (whose function I borrowed from a previous post), the Hosmer-le Cessie omnibus lack-of-fit test (also borrowed from a previous post), the Pearson chi-square test, and the deviance test. All the tests, except the deviance test, produced p-values well above 0.05. Would anyone please teach me why the deviance test produces such a small p-value (0.001687886), suggesting lack of fit, while the other tests suggest good fit? Did I do something wrong? Thank you for your time and help! 
Kiyoshi mod.fit <- glm(formula = no.NA$repcnd ~ no.NA$svl, family = binomial(link = logit), data = no.NA, na.action = na.exclude, control = list(epsilon = 0.0001, maxit = 50, trace = F)) # Option 1: Hosmer-Lemeshow test mod.fit <- glm(formula = no.NA$repcnd ~ no.NA$svl, family = binomial(link = logit), data = no.NA, na.action = na.exclude, control = list(epsilon = 0.0001, maxit = 50, trace = F)) hosmerlem <- function (y, yhat, g = 10) { cutyhat <- cut(yhat, breaks = quantile(yhat, probs = seq(0, 1, 1/g)), include.lowest = T) obs <- xtabs(cbind(1 - y, y) ~ cutyhat) expect <- xtabs(cbind(1 - yhat, yhat) ~ cutyhat) chisq <- sum((obs - expect)^2/expect) P <- 1 - pchisq(chisq, g - 2) c("X^2" = chisq, Df = g - 2, "P(Chi)" = P) } hosmerlem(no.NA$repcnd, fitted(mod.fit)) X^2 Df P(Chi) 7.8320107 8.000 0.4500497 # Option 2 - Hosmer-le Cessie omnibus lack of fit test: library(Design) lrm.GOF <- lrm(formula = no.NA$repcnd ~ no.NA$svl, data = no.NA, y = T, x = T) resid(lrm.GOF, type = "gof") Sum of squared errors Expected value|H0 SD Z P 48.3487115 48.3017399 0.1060826 0.4427829 0.6579228 # Option 3 - Pearson chi-square p-value: pp <- sum(resid(lrm.GOF, typ = "pearson")^2) 1-pchisq(pp, mod.fit$df.resid) [1] 0.4623282 # Option 4 - Deviance (G^2) test: 1-pchisq(mod.fit$deviance, mod.fit$df.resid) [1] 0.001687886 [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. 
-- Joris Meys Statistical consultant Ghent University Faculty of Bioscience Engineering Department of Applied mathematics, biometrics and process control tel : +32 9 264 59 87 joris.m...@ugent.be --- Disclaimer : http://helpdesk.ugent.be/e-maildisclaimer.php __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
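Joris's point about ungrouped data can be illustrated with a small simulation (hypothetical data, not Kiyoshi's): even when the logistic model is exactly right, the residual deviance of ungrouped binary responses is not chi-squared distributed, so the G^2 "p-value" is not trustworthy.

```r
# Sketch with simulated data: the model below is exactly correct, yet the
# deviance-based p-value for ungrouped binary responses is meaningless,
# because the residual deviance is not chi-squared distributed in this case.
set.seed(1)
x <- rnorm(200)
y <- rbinom(200, 1, plogis(-1 + 2 * x))   # true model: logit(p) = -1 + 2x
fit <- glm(y ~ x, family = binomial)
p.dev <- 1 - pchisq(deviance(fit), df.residual(fit))
p.dev  # do not interpret this as a goodness-of-fit p-value
```

With grouped data (replicated covariate patterns) the deviance is approximately chi-squared, which is the distinction drawn above.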
Re: [R] relation in aggregated data
Your examples are pretty extreme... Combining 120 data points into 4 points is of course never going to give a result. Try:

fac <- rep(1:8, each = 15)
xprum <- tapply(x, fac, mean)
yprum <- tapply(y, fac, mean)
plot(xprum, yprum)

The relation is not obvious, but it is visible. Yes, you lose information. Yes, your hypothesis changes. But in the case you describe, averaging the x-values for every day (so you get one average linked to one y value) seems like a possibility, provided you take that into account when formulating the hypothesis. Optimally you should also take the standard error of the averages into account in the analysis, but this is complicated, often not done, and in most cases ignoring it does not influence the results to the extent that it becomes important. YMMV Cheers On Wed, Jul 7, 2010 at 4:24 PM, Petr PIKAL petr.pi...@precheza.cz wrote: Dear all My question is more about statistics than about R, but it can be demonstrated in R. It is about the pros and cons of trying to find a relationship in aggregated data. I have two variables which may be related, and I measure them regularly over some period (say a year), but I cannot measure them at the same time (e.g. I cannot measure x and the respective value of y; usually I have 3 or more values of x and only one value of y per day). I can make aggregated values (say quarterly). My questions are: 1. Is such an approach sound? Can I use it? 2. What could the problems be? 3. Is there any other method to inspect variables which may be related but which you cannot measure at the same time? My opinion is that it is not very sound to inspect aggregated values, and that there can be many traps, especially if there are only a few aggregated values. Below you can see my examples. If you have an opinion on this issue, please let me know. Best regards Petr

Let us have a relation x/y:

set.seed(555)
x <- rnorm(120)
y <- 5*x + 3 + rnorm(120)
plot(x, y)

As you can see, there is a clear relation visible in the plot.
Now I make a factor for aggregation:

fac <- rep(1:4, each = 30)
xprum <- tapply(x, fac, mean)
yprum <- tapply(y, fac, mean)
plot(xprum, yprum)

The relationship is completely gone. Now let us make other, fake data:

xn <- runif(120)*rep(1:4, each = 30)
yn <- runif(120)*rep(1:4, each = 30)
plot(xn, yn)

There is no visible relation; xn and yn are independent but both related to the aggregation factor.

xprumn <- tapply(xn, fac, mean)
yprumn <- tapply(yn, fac, mean)
plot(xprumn, yprumn)

Here you can see a perfect relation which is due only to the aggregation factor. -- Joris Meys Statistical consultant Ghent University Faculty of Bioscience Engineering Department of Applied mathematics, biometrics and process control tel : +32 9 264 59 87 joris.m...@ugent.be
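The two effects Petr describes (a real relation washed out, and a spurious one created) can also be checked numerically rather than by eye; a minimal sketch along the lines of his simulation and Joris's 8-group suggestion:

```r
# Correlation of the raw data versus coarser and finer aggregates.
set.seed(555)
x <- rnorm(120)
y <- 5 * x + 3 + rnorm(120)
cor(x, y)                                    # strong relation in the raw data
fac4 <- rep(1:4, each = 30)                  # 4 coarse aggregates
cor(tapply(x, fac4, mean), tapply(y, fac4, mean))
fac8 <- rep(1:8, each = 15)                  # 8 finer aggregates
cor(tapply(x, fac8, mean), tapply(y, fac8, mean))
```

With only 4 (or even 8) aggregated points the correlation estimate is extremely unstable, which is exactly why the picture changes so radically between the plots.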
Re: [R] problems with write.table, involving loops paste statement
To be precise: according to my testing, everything is written to every folder, so each folder ends up holding only the last chunk. There is absolutely no need whatsoever to use l_ply here. And in any case, take as much as possible outside the loop, like the library() call and the computation of max.col. The following is _a_ solution - not the most optimal, but as close as feasible to your code:

A_var_df <- data.frame(index = 1:length(seq(1.0, -0.9, by = -0.1)),
                       from = seq(1.0, -0.9, by = -0.1),
                       to = seq(0.9, -1.0, by = -0.1))
# I make some data and make sure I can adjust the number of dirs and the steps
Dchr1 <- matrix(rep(1:100, each = 10), ncol = 100)
dirs <- 20
max.col <- ncol(Dchr1)
steps <- ceiling(max.col/dirs)
cols <- seq(1, max.col, by = steps)
for (i in 1:length(A_var_df[,1])) {
  k <- cols[i]
  print(as.data.frame(Dchr1[, k:min(k + steps, max.col)]))
  print(paste("./A_", A_var_df[i,1], "/k.csv", sep = ""))
  # I use print here just to show which data frame goes to which directory;
  # you can reconstruct the write.table call yourself.
}

Cheers Joris On Wed, Jul 7, 2010 at 9:32 PM, CC turtysm...@gmail.com wrote: Hi! I want to write portions of my data (3573 columns at a time) to the twenty folders I have available, titled A_1 to A_20, such that the first 3573 columns go to folder A_1, the next 3573 to folder A_2, and so on. The code below ensures that the data is written into all 20 folders, but only the last iteration of the loop (the last 3573 columns) is being written into ALL of the folders (A_1 to A_20), rather than in the sequential order that I would like. How can I fix this? Thank you!
Code:

A_var_df <- data.frame(index = 1:length(seq(1.0, -0.9, by = -0.1)),
                       from = seq(1.0, -0.9, by = -0.1),
                       to = seq(0.9, -1.0, by = -0.1))
for (i in 1:length(A_var_df[,1])) {
  library(plyr)
  max.col <- ncol(Dchr1)
  l_ply(seq(1, max.col, by = 3573), function(k)
    write.table(as.data.frame(Dchr1[, k:min(k + 3572, max.col)]),
                paste("./A_", A_var_df[i,1], "/k.csv", sep = ""),
                sep = ",", row.names = FALSE, quote = FALSE)
  )
}

-- Thanks, CC -- Joris Meys Statistical consultant Ghent University Faculty of Bioscience Engineering Department of Applied mathematics, biometrics and process control tel : +32 9 264 59 87 joris.m...@ugent.be
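The underlying bug is easy to see without touching the file system: for every i, the inner l_ply runs over all chunks, so folder A_i receives every chunk, each overwriting the previous one under the same name. A sketch of the intended one-chunk-per-directory pairing, with a toy matrix and hypothetical sizes (the real data uses 3573-column chunks):

```r
# Toy stand-in for the real Dchr1: each directory index i gets exactly
# one contiguous block of columns.
Dchr1  <- matrix(seq_len(10 * 100), nrow = 10)   # 10 x 100 toy matrix
dirs   <- 20
steps  <- ceiling(ncol(Dchr1) / dirs)            # columns per directory
starts <- seq(1, ncol(Dchr1), by = steps)
for (i in seq_along(starts)) {
  k <- starts[i]
  chunk <- Dchr1[, k:min(k + steps - 1, ncol(Dchr1)), drop = FALSE]
  fname <- paste("./A_", i, "/chunk.csv", sep = "")
  # write.table(as.data.frame(chunk), fname, sep = ",",
  #             row.names = FALSE, quote = FALSE)
}
```

Note that `k + steps - 1` avoids the one-column overlap that `k + steps` would produce.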
Re: [R] when all I have is a contingency table....
See Ted's answer for the histogram (I'd go with barplot()). For most statistical procedures there is a weighted version (e.g. weighted.mean() for the mean), and your counts are valid weights for most procedures. Cheers Joris On Wed, Jul 7, 2010 at 10:39 PM, Andrei Zorine zoav1...@gmail.com wrote: Hello, I just need a hint here: suppose I have no raw data, only a frequency table, and I want to run basic statistical procedures on it: a histogram, descriptive statistics, etc. How do I do this with R? For example, how do I plot a histogram for this table, for a sample of size 60?

Value Count
    1    10
    2     8
    3    12
    4     9
    5    14
    6     7

Thanks, A.Z. -- Joris Meys Statistical consultant Ghent University Faculty of Bioscience Engineering Department of Applied mathematics, biometrics and process control tel : +32 9 264 59 87 joris.m...@ugent.be
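For the table in the question, a minimal sketch of both suggestions (barplot of the counts, and counts used as weights):

```r
# Frequency table from the question (values 1..6, 60 observations total).
tab <- data.frame(value = 1:6, count = c(10, 8, 12, 9, 14, 7))
barplot(tab$count, names.arg = tab$value)     # bar chart of the frequencies
m <- weighted.mean(tab$value, w = tab$count)  # mean of the 60 observations
# or expand the table back into raw data when a procedure has no weights:
raw <- rep(tab$value, tab$count)
c(m, mean(raw))  # both 3.5
```

Expanding with rep() is the general fallback: any procedure that expects raw data can then be applied directly.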
Re: [R] acf
Your question is a bit confusing. acfresidfit is an object whose origin we don't know. With your test file I arrive at the first set of correlations (but with integer lag headings):

residfit <- read.table("fileree2_test_out.txt")
acf(residfit)
acfresid <- acf(residfit)
acfresid
Autocorrelations of series 'residfit', by lag
     0      1      2      3      4      5      6      7      8      9     10     11     12     13     14     15     16
 1.000 -0.015  0.010  0.099  0.048 -0.014 -0.039 -0.019  0.040  0.018  0.042  0.078 -0.029  0.028 -0.016 -0.021 -0.109
    17     18     19     20     21     22     23     24     25
 0.000 -0.038 -0.006  0.015 -0.032 -0.002  0.014 -0.226 -0.030

Could you please check where the object acfresidfit comes from and how you generated it? Cheers Joris On Tue, Jul 6, 2010 at 9:47 AM, nuncio m nunci...@gmail.com wrote: Hi list, I have the following code to compute the acf of a time series: acfresid <- acf(residfit), where residfit is the series. When I type acfresid at the prompt the following is displayed:

Autocorrelations of series 'residfit', by lag
0.0000 0.0833 0.1667 0.2500 0.3333 0.4167 0.5000 0.5833 0.6667 0.7500 0.8333
 1.000 -0.015  0.010  0.099  0.048 -0.014 -0.039 -0.019  0.040  0.018  0.042
0.9167 1.0000 1.0833 1.1667 1.2500 1.3333 1.4167 1.5000 1.5833 1.6667 1.7500
 0.078 -0.029  0.028 -0.016 -0.021 -0.109  0.000 -0.038 -0.006  0.015 -0.032
1.8333 1.9167 2.0000 2.0833
-0.002  0.014 -0.226 -0.030

residfit is a time series object at monthly interval (0.0833). Here I understand R computed the correlation at lags 0 to 2 years. What is surprising to me is that if I type acfresidfit at the prompt the following is displayed:

Autocorrelations of series 'residfit', by lag
     0      1      2      3      4      5      6      7      8      9     10
 1.000 -0.004  0.011  0.041 -0.056  0.019 -0.052 -0.027 -0.008 -0.012 -0.034
    11     12     13     14     15     16     17     18     19     20     21
 0.024 -0.005  0.006 -0.045  0.031 -0.035 -0.011 -0.021 -0.020 -0.010 -0.007
    22     23     24     25
-0.038  0.017  0.051  0.038

From the headers I understand both are autocorrelations computed at the same lags, but the correlations are different. Where am I going wrong, and which is the correct one?
The file residfit is also attached (filename: fileree2_test_out.txt). Thanks nuncio -- Nuncio.M Research Scientist National Center for Antarctic and Ocean research Head land Sada Vasco da Gamma Goa-403804 -- Joris Meys Statistical consultant Ghent University Faculty of Bioscience Engineering Department of Applied mathematics, biometrics and process control tel : +32 9 264 59 87 joris.m...@ugent.be
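The difference in lag headings (0, 1, 2, ... versus 0, 0.0833, ...) is purely a matter of the tsp attribute of the input; the correlations themselves do not depend on it. A small sketch with simulated data:

```r
# Same data, once as a plain vector and once as a monthly ts object:
# only the lag labels differ, the autocorrelations are identical.
set.seed(42)
z  <- rnorm(120)
a1 <- acf(z, plot = FALSE)                      # plain vector: lags 0, 1, 2, ...
a2 <- acf(ts(z, frequency = 12), plot = FALSE)  # monthly series: lags in years
c(a1$lag[2], a2$lag[2])                         # 1 versus 1/12 = 0.0833
all.equal(as.vector(a1$acf), as.vector(a2$acf)) # TRUE: same correlations
```

So if two acf objects report different correlations, as in the question, they must have been computed from different data, which is why Joris asks where acfresidfit comes from.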
Re: [R] timeseries
Difficult to guess why, but I reckon you should use ts() instead of as.ts(); otherwise, set the tsp attribute correctly. E.g.:

x <- cumsum(1 + round(rnorm(20), 2))
as.ts(x)
Time Series:
Start = 1
End = 20
Frequency = 1
 [1]  0.87  3.51  4.08  4.20  3.25  4.63  6.30  6.89  9.28  9.93 10.19  9.97 10.20 11.51 11.52 11.53 12.97 14.04 17.02 18.47

tseries <- ts(x, frequency = 12, start = c(2004, 3))
tseries
        Jan   Feb   Mar   Apr   May   Jun   Jul   Aug   Sep   Oct   Nov   Dec
2004                0.87  3.51  4.08  4.20  3.25  4.63  6.30  6.89  9.28  9.93
2005  10.19  9.97 10.20 11.51 11.52 11.53 12.97 14.04 17.02 18.47

tsp(x) <- c(2004 + 2/12, 2005.75, 12)
as.ts(x)
        Jan   Feb   Mar   Apr   May   Jun   Jul   Aug   Sep   Oct   Nov   Dec
2004                0.87  3.51  4.08  4.20  3.25  4.63  6.30  6.89  9.28  9.93
2005  10.19  9.97 10.20 11.51 11.52 11.53 12.97 14.04 17.02 18.47

See ?ts to get the options right. I suggest using the function ts() rather than assigning the tsp attribute. Cheers Joris On Mon, Jul 5, 2010 at 9:35 AM, nuncio m nunci...@gmail.com wrote: Dear useRs, I am trying to construct a time series using the as.ts function. Surprisingly, when I plot the data, the x axis does not show the time in years; however, if I use ts(data), the time in years is shown on the x axis. Why do the two commands give different results? Thanks nuncio -- Nuncio.M Research Scientist National Center for Antarctic and Ocean research Head land Sada Vasco da Gamma Goa-403804
-- Joris Meys Statistical consultant Ghent University Faculty of Bioscience Engineering Department of Applied mathematics, biometrics and process control tel : +32 9 264 59 87 joris.m...@ugent.be --- Disclaimer : http://helpdesk.ugent.be/e-maildisclaimer.php __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] calculation on series with different time-steps
Look at ?ifelse and ?abs, e.g.:

data_frame$new_column_in_dataframe <- ifelse(
  stage$julian_day == baro$julian_day & abs(stage$time - baro$hour) <= 30,
  stage$stage.cm - baro$pressure,
  NA
)

Cheers Joris On Thu, Jul 1, 2010 at 10:09 PM, Jeana Lee jeana@colorado.edu wrote: Hello, I have two series, one with stream stage measurements every 5 minutes, and the other with barometric pressure measurements every hour. I want to subtract each barometric pressure measurement from the 12 stage measurements closest in time to it (6 stage measurements on either side of the hour). I want to do something like the following, but I don't know the syntax. If the Julian day of the stage measurement is equal to the Julian day of the pressure measurement, AND the absolute value of the difference between the time of the stage measurement and the hour of the pressure measurement is less than or equal to 30 minutes, then subtract the pressure measurement from the stage measurement (and put it in a new column in the stage data frame). if (stage$julian_day == baro$julian_day and |stage$time - baro$hour| <= 30) then (stage$stage.cm - baro$pressure) Can you help me? Thanks, JL -- Joris Meys Statistical consultant Ghent University Faculty of Bioscience Engineering Department of Applied mathematics, biometrics and process control tel : +32 9 264 59 87 joris.m...@ugent.be
Re: [R] Help With ANOVA
We're missing the Samp1 etc. needed to be able to test the code. Where did you get the other p-value? Cheers Joris On Tue, Jul 6, 2010 at 3:08 PM, Amit Patel amitrh...@yahoo.co.uk wrote: Hi, I needed some help with ANOVA. I have a problem with my ANOVA analysis: I have a dataset with a known ANOVA p-value, but I cannot seem to re-create it in R. I have created a list (zzzanova) which contains 1) intensity values, 2) group number (6 different groups), and 3) sample number (54 different samples); this is created by the script in Appendix 1. I then conduct the ANOVA with the command

zzz.aov <- aov(Intensity ~ Group, data = zzzanova)

I get a p-value of Pr(>F) 0.9483218. The expected p-value is 0.00490, so I feel I may be using ANOVA incorrectly or have put in a wrong formula. I am trying to do an ANOVA analysis across all 6 groups. Is there something wrong with my formula? I think I have made a mistake in the formula rather than anything else.

APPENDIX 1

datalist <- c(-4.60517, -4.60517, -4.60517, -4.60517, -4.60517, -4.60517, -4.60517,
              3.003749, -4.60517, 2.045314, 2.482557, -4.60517, -4.60517, -4.60517,
              -4.60517, 1.592743, -4.60517, -4.60517, 0.91328, -4.60517, -4.60517,
              1.827744, 2.457795, 0.355075, -4.60517, 2.39127, 2.016987, 2.319903,
              1.146683, -4.60517, -4.60517, -4.60517, 1.846162, -4.60517, 2.121427,
              1.973118, -4.60517, 2.251568, -4.60517, 2.270724, 0.70338, 0.963816,
              -4.60517, 0.023703, -4.60517, 2.043382, 1.070586, 2.768289, 1.085169,
              0.959334, -0.02428, -4.60517, 1.371895, 1.533227)

zzzanova <- structure(list(
  Intensity = c(t(Samp1), t(Samp2), t(Samp3), t(Samp4)),
  Group = structure(c(1,1,1,1,1,1,1,1,1, 2,2,2,2,2,2,2,2, 3,3,3,3,3,3,3,3,3,
                      4,4,4,4,4,4,4,4,4,4, 5,5,5,5,5,5,5,5,5, 6,6,6,6,6,6,6,6,6),
                    .Label = c("Group1", "Group2", "Group3", "Group4", "Group5", "Group6"),
                    class = "factor"),
  Sample = structure(c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18,
                       19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34,
                       35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50,
                       51, 52, 53, 54))),
  .Names = c("Intensity", "Group", "Sample"),
  row.names = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20,
                21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38,
                39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54),
  class = "data.frame")

-- Joris Meys Statistical consultant Ghent University Faculty of Bioscience Engineering Department of Applied mathematics, biometrics and process control tel : +32 9 264 59 87 joris.m...@ugent.be
Re: [R] Help With ANOVA
I still can't reproduce your example. The aov output gives me the following:

anova(aov(Intensity ~ Group, data = zzzanova))
Analysis of Variance Table

Response: Intensity
          Df Sum Sq Mean Sq F value  Pr(>F)
Group      5  98.85  19.771  2.1469 0.07576 .
Residuals 48 442.03   9.209
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Next to that, I noticed you have truncated data, which has implications for the analysis as well. If you use a Kruskal-Wallis test, the p-value becomes larger:

kruskal.test(Intensity ~ Group, data = zzzanova)

        Kruskal-Wallis rank sum test

data:  Intensity by Group
Kruskal-Wallis chi-squared = 6.6955, df = 5, p-value = 0.2443

which is to be expected, as you have almost 50% truncated data. So a p-value of 0.005 seems very wrong to me. Cheers Joris On Tue, Jul 6, 2010 at 7:11 PM, Amit Patel amitrh...@yahoo.co.uk wrote: Hi Joris, sorry, I had a misprint in the appendix code in the last email:

datalist <- c(-4.60517, -4.60517, -4.60517, -4.60517, -4.60517, -4.60517, -4.60517,
              3.003749, -4.60517, 2.045314, 2.482557, -4.60517, -4.60517, -4.60517,
              -4.60517, 1.592743, -4.60517, -4.60517, 0.91328, -4.60517, -4.60517,
              1.827744, 2.457795, 0.355075, -4.60517, 2.39127, 2.016987, 2.319903,
              1.146683, -4.60517, -4.60517, -4.60517, 1.846162, -4.60517, 2.121427,
              1.973118, -4.60517, 2.251568, -4.60517, 2.270724, 0.70338, 0.963816,
              -4.60517, 0.023703, -4.60517, 2.043382, 1.070586, 2.768289, 1.085169,
              0.959334, -0.02428, -4.60517, 1.371895, 1.533227)

zzzanova <- structure(list(
  Intensity = datalist,
  Group = structure(c(1,1,1,1,1,1,1,1,1, 2,2,2,2,2,2,2,2, 3,3,3,3,3,3,3,3,3,
                      4,4,4,4,4,4,4,4,4,4, 5,5,5,5,5,5,5,5,5, 6,6,6,6,6,6,6,6,6),
                    .Label = c("Group1", "Group2", "Group3", "Group4", "Group5", "Group6"),
                    class = "factor"),
  Sample = structure(c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18,
                       19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34,
                       35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50,
                       51, 52, 53, 54))),
  .Names = c("Intensity", "Group", "Sample"),
  row.names = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20,
                21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38,
                39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54),
  class = "data.frame")

Thanks for your reply
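The truncation issue Joris raises generalizes beyond this thread; a hedged sketch on simulated data (note that -4.60517 is log(0.01), i.e. a detection limit on the log scale):

```r
# Simulated example of the caveat: a mass of observations stuck at a floor
# value breaks the normality assumption behind aov(); comparing against a
# rank-based test is a useful sanity check.
set.seed(7)
g <- gl(3, 20)                       # 3 groups of 20
v <- rnorm(60, mean = as.numeric(g)) # group means 1, 2, 3
v[v < 0.5] <- 0.5                    # truncate at a "detection limit"
anova(aov(v ~ g))                    # parametric one-way ANOVA
kruskal.test(v ~ g)                  # rank-based alternative
```

When the two tests disagree badly, the assumption violation, not the smaller p-value, should get the attention.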
Re: [R] PCA and Regression
PCA components are orthogonal by definition, so no, that doesn't make sense at all. Do yourself a favor and get a book on multivariate data analysis. Two books come to mind: obviously the one by Hair and colleagues, called Multivariate Data Analysis, easily found in a university library (or on the internet); the second is An R and S-Plus Companion to Multivariate Analysis by Everitt. You have less chance of finding that one in the library, but it is for sale online. This is a nice introduction as well: https://netfiles.uiuc.edu/miguez/www/Teaching/MultivariateRGGobi.pdf Never use a chainsaw without reading the manual. Cheers Joris On Tue, Jul 6, 2010 at 9:07 PM, Marino Taussig De Bodonia, Agnese agnese.marin...@imperial.ac.uk wrote: Hello, I am currently analyzing responses to questionnaires about general attitudes. I have performed a PCA on my data and have retained two principal components. Now I would like to use the scores of both principal components in a multiple regression. I would like to know if it makes sense to use the scores of one principal component to explain the variance in the scores of another principal component: lm(scores of principal component 1 ~ scores of principal component 2 + age + gender + etc.). Please could you let me know if this is statistically sound? Thank you in advance for your time, Agnese
-- Joris Meys Statistical consultant Ghent University Faculty of Bioscience Engineering Department of Applied mathematics, biometrics and process control tel : +32 9 264 59 87 joris.m...@ugent.be --- Disclaimer : http://helpdesk.ugent.be/e-maildisclaimer.php __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
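Joris's orthogonality point can be verified in a few lines (simulated data; prcomp from base stats): scores on distinct principal components are uncorrelated by construction, so one set of scores has, by design, nothing to explain about the other.

```r
# PC scores from prcomp() are centered and mutually orthogonal, hence
# their correlation is zero up to floating-point noise.
set.seed(1)
dat <- matrix(rnorm(100 * 4), ncol = 4)
pr  <- prcomp(dat)
cor(pr$x[, 1], pr$x[, 2])   # zero up to floating-point noise
```

Regressing PC1 scores on age, gender, and other external covariates is a different (and legitimate) question; it is only the PC2-explains-PC1 part that is vacuous.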
[R] S4 classes and debugging - Is there a summary?
Dear all, I'm getting more and more frustrated with the whole S4 thing, and I'm looking for a more advanced summary of how to deal with it. Often I get error messages that don't make sense at all, or the code is not doing what I think it should do. Far too often, inspecting the code requires me to go to the source, which doesn't make it easy to find the bit of code you're interested in. Getting the code with getAnywhere() doesn't always work. debug() doesn't work. Using trace() and browser() is not an option, as I can't find the correct method. E.g.:

library(raster)
getAnywhere(xyValues)
A single object matching 'xyValues' was found
It was found in the following places
  package:raster
  namespace:raster
with value

standardGeneric for "xyValues" defined from package "raster"

function (object, xy, ...)
standardGeneric("xyValues")
<environment: 0x04daf14c>
Methods may be defined for arguments: object, xy
Use showMethods("xyValues") for currently available ones.

showMethods("xyValues")
Function: xyValues (package raster)
object="Raster", xy="data.frame"
object="Raster", xy="SpatialPoints"
object="Raster", xy="vector"
object="RasterLayer", xy="matrix"
object="RasterStackBrick", xy="matrix"

And now...? Is there an overview that actually explains how to get the information you're looking for without strolling through the complete source? Sorry if I sound frustrated, but this is costing me huge amounts of time, to the extent that I'd rather write a custom function than use one from an S4 package if I'm not absolutely sure about what it does and how it achieves it.
Cheers Joris -- Joris Meys Statistical consultant Ghent University Faculty of Bioscience Engineering Department of Applied mathematics, biometrics and process control tel : +32 9 264 59 87 joris.m...@ugent.be --- Disclaimer : http://helpdesk.ugent.be/e-maildisclaimer.php __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] S4 classes and debugging - Is there a summary?
Correction: trace() of course works when specifying the signature, so e.g.

trace(xyValues, tracer = browser, signature = c(object = "RasterLayer", xy = "matrix"))

allows me to browse through the function. I should have specified the signature; I overlooked that. Still, if there's a manual on S4 that anybody would like to mention, please go ahead. I find bits and pieces on the internet, but I haven't found an overview yet that deals with all the peculiarities of these classes. Cheers Joris -- Joris Meys Statistical consultant Ghent University Faculty of Bioscience Engineering Department of Applied mathematics, biometrics and process control tel : +32 9 264 59 87 joris.m...@ugent.be
Re: [R] S4 classes and debugging - Is there a summary?
On Fri, Jul 2, 2010 at 4:01 PM, Martin Morgan mtmor...@fhcrc.org wrote: selectMethod(xyValues, c('RasterLayer', 'matrix')) would be my choice. Thanks, that's a discovery. I guess I misunderstood the help pages on that one. I don't really have the right experience, but Chambers' 2008 Software for Data Analysis... and Gentleman's 2008 R Programming for Bioinformatics... books would be where I'd start. ?Methods and ?Classes are, I think, under-used. I did read the posting guide ;-) but I have to admit the help pages are not always that clear, which is one of the reasons why S4 gets me frustrated so easily. I also have Chambers' book, and Gentleman's book is here at the department as well. I'll go through the relevant chapters again. Thanks Joris -- Joris Meys Statistical consultant Ghent University Faculty of Bioscience Engineering Department of Applied mathematics, biometrics and process control tel : +32 9 264 59 87 joris.m...@ugent.be
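For readers without the raster package at hand, the showMethods()/selectMethod() machinery discussed in this thread can be explored with a self-contained toy class; all class, generic, and slot names below are made up for illustration:

```r
library(methods)

# A tiny S4 class with one slot, one generic, and one method.
setClass("Grid", representation(cells = "numeric"))
setGeneric("cellValues", function(object, xy, ...) standardGeneric("cellValues"))
setMethod("cellValues", signature(object = "Grid", xy = "numeric"),
          function(object, xy, ...) object@cells[xy])

showMethods("cellValues")    # lists the available signatures
m <- selectMethod("cellValues", signature("Grid", "numeric"))
m                            # the actual method body, ready for inspection

g <- new("Grid", cells = c(10, 20, 30))
cellValues(g, 2)  # 20
```

Once selectMethod() has handed you the method, trace() with the same signature (as in the correction above) drops you into it with browser().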
Re: [R] left end or right end
First of all, read the posting guide carefully: http://www.R-project.org/posting-guide.html Your question is far from clear. When you say that the lengths of P and Q are different, do you mean the length of the data or the difference between start and end? That makes a world of difference. Regarding the statistical test, that depends on what your data represent. Is it possible, for example, for P to fall close to both the left and the right end of Q? You should also specify which test you want to use; then people on the list will be able to tell you whether it is available in R. You can of course construct your own test with the tools R provides, but again, this requires a lot more information. Next to that, the list is actually not intended for statistical advice, but for advice regarding R code. Maybe somebody will join in with some statistical guidance, but if you don't know what to do, you had better consult a statistician at your department. Cheers Joris On Thu, Jul 1, 2010 at 1:53 PM, ravikumar sukumar ravikumarsuku...@gmail.com wrote: Dear all, I am a biologist. I have two sets of distances, P(start1, end1) and Q(start2, end2). I want to know whether P falls close to the right end or the left end of Q. P and Q are of different lengths for each data point, and there is more than one pair of P and Q. Is there any test or function in R to reach a statistically sound conclusion? Thanks for all, Suku
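For what it's worth, the "which end of Q is P near" part can be quantified before any testing. A minimal sketch, where the function name rel_pos and the midpoint summary are hypothetical choices of mine, not anything from the thread:

```r
# Hypothetical sketch: position of the midpoint of P relative to Q.
# Values near 0 mean "at the left end of Q", near 1 mean "at the right end".
rel_pos <- function(start1, end1, start2, end2) {
  mid_P <- (start1 + end1) / 2
  (mid_P - start2) / (end2 - start2)
}
rel_pos(8, 10, 0, 10)   # 0.9, i.e. near the right end of Q
```

Applied over all pairs, such a score would at least turn the question into something a statistician can test.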
Re: [R] run R
Welcome to a new world. You're using the _programming language_ R with thousands of packages, and first of all you should read the introduction material thoroughly : http://cran.r-project.org/doc/manuals/R-intro.pdf http://cran.r-project.org/doc/contrib/Owen-TheRGuide.pdf Regarding your question : check the help file of source : ?source If you have a script, say foo.R, in a folder c:\bar, do : source("c:/bar/foo.R") Mind the slashes instead of the backslashes. I assume you're on a Windows system... Cheers Joris On Wed, Jun 30, 2010 at 3:08 PM, jorge.conr...@cptec.inpe.br wrote: Hi, I'm starting to use the R package and I have some .R scripts. How can I run these .R scripts? Conrado __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Joris Meys Statistical consultant Ghent University Faculty of Bioscience Engineering Department of Applied mathematics, biometrics and process control tel : +32 9 264 59 87 joris.m...@ugent.be --- Disclaimer : http://helpdesk.ugent.be/e-maildisclaimer.php __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
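A self-contained illustration of the source() advice, using a temporary file in place of the hypothetical c:/bar/foo.R:

```r
# Write a tiny script to a temporary file and run it with source().
tmp <- tempfile(fileext = ".R")
writeLines('msg <- "hello from the sourced script"', tmp)
source(tmp)   # evaluates the script in the current workspace
msg           # the object created by the script is now available
```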
[R] Use of processor by R 32bit on a 64bit machine
Dear all, I've recently purchased a new 64bit system with an Intel i7 quadcore processor. As I understood (maybe wrongly) that to date the 32bit version of R is more stable than the 64bit, I installed the 32bit version and have been happily using it ever since. Now I'm running a whole lot of models, which goes smoothly, and I thought out of curiosity to check how much of the processor I'm using. I would have thought I used 25% (being one core), as on my old dual core R uses 50% of the total processor capacity. Funny enough, it turns out that R is currently using only 12-13% of my CPU, which is about half of what I expected. Did I miss something somewhere? Should I change some settings? I'm running on Windows 7 Enterprise. I looked around already, but I have the feeling I overlooked something. Cheers Joris sessionInfo() R version 2.10.1 (2009-12-14) i386-pc-mingw32 locale: [1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252 LC_MONETARY=English_United States.1252 [4] LC_NUMERIC=C LC_TIME=English_United States.1252 attached base packages: [1] grDevices datasets splines graphics stats tcltk utils methods base other attached packages: [1] svSocket_0.9-48 TinnR_1.0.3 R2HTML_2.0.0 Hmisc_3.7-0 survival_2.35-7 loaded via a namespace (and not attached): [1] cluster_1.12.3 grid_2.10.1 lattice_0.18-3 svMisc_0.9-57 tools_2.10.1 -- Joris Meys Statistical consultant Ghent University Faculty of Bioscience Engineering Department of Applied mathematics, biometrics and process control tel : +32 9 264 59 87 joris.m...@ugent.be --- Disclaimer : http://helpdesk.ugent.be/e-maildisclaimer.php __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Use of processor by R 32bit on a 64bit machine
*slaps forehead* Thanks. So out it goes, that hyperthreading. Who invented hyperthreading on a quad-core anyway? Cheers Joris 2010/6/29 Uwe Ligges lig...@statistik.tu-dortmund.de: On 29.06.2010 15:30, Joris Meys wrote: Dear all, I've recently purchased a new 64bit system with an intel i7 quadcore processor. As I understood (maybe wrongly) that to date the 32bit version of R is more stable than the 64bit, I installed the 32bit version and am happily using it ever since. Now I'm running a whole lot of models, which goes smoothly, and I thought out of curiosity to check how much processor I'm using. I would have thought I used 25% (being one core), as on my old dual core R uses 50% of the total processor capacity. Funny, it turns out that R is currently using only 12-13% of my cpu, which is about half of what I expected. An Intel Core i7 Quadcore has 8 virtual cores since it supports hyperthreading. R uses one of these virtual cores. Note that 2 virtual cores won't be twice as fast since they are running on the same physical core. Hence this is expected. Uwe Ligges Did I miss something somewhere? Should I change some settings? I'm running on a Windows 7 enterprise. I looked around already, but I have the feeling I overlooked something. 
Re: [R] (New) Popularity of R, SAS, SPSS, Stata...
Dear Robert, I've tried to access that link, but to no avail. It seems the server r4stats.com is down, as it doesn't respond. This link got me to the site : http://sites.google.com/site/r4statistics/popularity Cheers Joris On Mon, Jun 28, 2010 at 3:52 PM, Muenchen, Robert A (Bob) muenc...@utk.edu wrote: Greetings Listserv Readers, At http://r4stats.com/popularity I have added plots, data, and/or discussion of: 1. Scholarly impact of each package across the years 2. The number of subscribers to some of the listservs 3. How popular each package is among Google searches across the years 4. Survey results from a Rexer Analytics poll 5. Survey results from a KDnuggets poll 6. A rudimentary analysis of the software skills that employers are seeking Thanks very much to all the folks who helped on this project including: John Fox, Marc Schwartz, Duncan Murdoch, Martin Weiss, John (Jiangtang) HU, Andre Wielki, Kjetil Halvorsen, Dario Solari, Joris Meys, Keo Ormsby, Karl Rexer, and Gregory Piatetsky-Shapiro. If anyone can think of other angles, please let me know. Cheers, Bob = Bob Muenchen (pronounced Min'-chen), Manager Research Computing Support Voice: (865) 974-5230 Email: muenc...@utk.edu Web: http://oit.utk.edu/research, News: http://oit.utk.edu/research/news.php Feedback: http://oit.utk.edu/feedback/ = __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Different standard errors from R and other software
If I understand correctly from their website, discrete choice models are mostly generalized linear models with the common link functions for discrete data? Apart from a few names I didn't recognize, all analyses seem quite standard to me. So I wonder why you would write the log-likelihood yourself for techniques that are implemented in R. Unless I missed something pretty important, or you want to do a specific analysis that wasn't clear to me, you should take a closer look at the possibilities in R for generalized linear (mixed) modelling and so on. Binary choice translates to a simple glm with a logit link. Multinomial choice can be done with e.g. multinom() from the nnet package. Ordered choice can be done with polr() from the MASS package. Nice ones to look at are the packages mgcv, or gamm4 in the case of big datasets. They offer very flexible models that can include random terms, specific variance-covariance structures and non-linear relations in the form of splines. Apologies if this is all obvious and known to you. In that case you might want to specify what exactly it is you are comparing and how exactly you calculated it yourself. Cheers Joris On Fri, Jun 25, 2010 at 11:47 PM, Min Chen chenmin0...@gmail.com wrote: Hi all, Sorry to bother you. I'm estimating a discrete choice model in R using the maxBFGS command. Since I wrote the log-likelihood myself, in order to double check, I ran the same model in Limdep. It turns out that the coefficient estimates are quite close; however, the standard errors are very different. I also computed the Hessian and outer product of the gradients in R using the numDeriv package, but the results are still very different from those in Limdep. Is it the routine to compute the inverse Hessian that causes the difference? Thank you very much! Best wishes. Min -- Min Chen Ph.D.
Candidate Department of Agricultural, Food, and Resource Economics 125 Cook Hall Michigan State University [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Joris Meys Statistical consultant Ghent University Faculty of Bioscience Engineering Department of Applied mathematics, biometrics and process control tel : +32 9 264 59 87 joris.m...@ugent.be --- Disclaimer : http://helpdesk.ugent.be/e-maildisclaimer.php __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
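The three model types mentioned above can be sketched on toy data like this (nnet and MASS ship with a standard R installation; the data here are simulated, not from the original question):

```r
set.seed(1)
x <- rnorm(100)

# binary choice: logistic regression via glm()
y_bin <- rbinom(100, 1, plogis(x))
fit_bin <- glm(y_bin ~ x, family = binomial(link = "logit"))

# multinomial choice via nnet::multinom()
library(nnet)
y_mult <- factor(sample(c("a", "b", "c"), 100, replace = TRUE))
fit_mult <- multinom(y_mult ~ x, trace = FALSE)

# ordered choice via MASS::polr() (response must be an ordered factor)
library(MASS)
y_ord <- cut(x + rnorm(100), 3, ordered_result = TRUE)
fit_ord <- polr(y_ord ~ x)

summary(fit_bin)$coefficients   # estimates with standard errors
```

The standard errors these summaries report come from the usual observed-information calculation, which is the natural baseline to compare against a hand-rolled Hessian.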
Re: [R] Simple qqplot question
Sorry, missed the two-variable thing. Go with the lm solution then, and you can tweak the plot yourself (the confidence intervals are easily obtained via predict(lm.object, interval = "prediction") ). The function qq.plot uses robust regression, but in your case normal regression will do. Regarding the shapes : this just indicates both tails are shorter than expected, so you have a kurtosis smaller than 3 (or a negative excess kurtosis, depending on whether you do the correction or not) Cheers Joris On Fri, Jun 25, 2010 at 4:10 AM, Ralf B ralf.bie...@gmail.com wrote: Short rep: I have two distributions, data and data2, each built from about 3 million data points; they appear similar when looking at densities and histograms. I plotted qqplots for further eye-balling: qqplot(data, data2, xlab = 1, ylab = 2) and get an almost perfect diagonal line, which means they are in fact very alike. Now I tried to check normality using qqnorm -- and I think I am doing something wrong here: qqnorm(data, main = "Q-Q normality plot for 1") qqnorm(data2, main = "Q-Q normality plot for 2") I am getting perfect S-shaped curves (??) for both distributions. Am I missing something here? [ASCII sketch of the S-shaped Q-Q curve omitted] Thanks, Ralf -- Joris Meys Statistical consultant Ghent University Faculty of Bioscience Engineering Department of Applied mathematics, biometrics and process control tel : +32 9 264 59 87 joris.m...@ugent.be --- Disclaimer : http://helpdesk.ugent.be/e-maildisclaimer.php __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
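To see the short-tail interpretation in action, compare a normal sample with a uniform one (simulated data, nothing from the original 3-million-point sets):

```r
set.seed(42)
z <- rnorm(1000)   # normal: qqnorm() gives a near-straight line
u <- runif(1000)   # short tails: qqnorm() gives the S shape
qq_z <- qqnorm(z, plot.it = FALSE)
qq_u <- qqnorm(u, plot.it = FALSE)
# the straighter the plot, the higher the correlation between sample
# and theoretical quantiles
cor(qq_z$x, qq_z$y)
cor(qq_u$x, qq_u$y)
```

Adding qqline() to each plot makes the deviation from the reference line even easier to see by eye.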
Re: [R] Wilcoxon signed rank test and its requirements
As a remark on your histogram : use fewer breaks! This histogram tells you nothing. An interesting function is ?density , e.g. : x <- rnorm(250) hist(x, freq = FALSE) lines(density(x), col = "red") See also these slides, a very nice and short introduction to graphics in R : http://csg.sph.umich.edu/docs/R/graphics-1.pdf 2010/6/25 Atte Tenkanen atte...@utu.fi: Is there anything for me? There is a lot of data, n=2418, but there are also a lot of ties. My sample n≈250-300 You should think about the central limit theorem. Actually, you can just use a t-test to compare means, as with those sample sizes the mean is almost certainly normally distributed. I would like to test whether the mean of the sample differs significantly from the population mean. According to probability theory, this will be the case in 5% of the cases if you repeat your sampling infinitely. But as David asked: why on earth do you want to test that? cheers Joris -- Joris Meys Statistical consultant Ghent University Faculty of Bioscience Engineering Department of Applied mathematics, biometrics and process control tel : +32 9 264 59 87 joris.m...@ugent.be --- Disclaimer : http://helpdesk.ugent.be/e-maildisclaimer.php __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Euclidean Distance Matrix Analysis (EDMA) in R?
I've been looking around myself, but I couldn't find any. Maybe somebody will chime in to direct you to the correct places. I also checked the papers, and it seems not too hard to implement. If I find some time, I'll take a look at it next week. For the other two gentlemen, check: http://www.getahead.psu.edu/PDF/EuclideanDistanceMatrixAnalysis.pdf http://www.getahead.psu.edu/PDF/no.1.pdf Cheers Joris On Fri, Jun 25, 2010 at 8:30 AM, gokhanocakoglu ocako...@uludag.edu.tr wrote: In fact, Euclidean Distance Matrix Analysis (EDMA) of form is a coordinate free approach to the analysis of form using landmark data which was developed by Subhash Lele and Joan Richstmeier. They also developed a computer program (http://www.getahead.psu.edu/comment/edma.asp) that allow to perform several techniques including EDMA I-II but I wonder is there a package or code available in R to perform EDMA... thanks -- View this message in context: http://r.789695.n4.nabble.com/Euclidean-Distance-Matrix-Analysis-EDMA-in-R-tp2266797p2268018.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Joris Meys Statistical consultant Ghent University Faculty of Bioscience Engineering Department of Applied mathematics, biometrics and process control tel : +32 9 264 59 87 joris.m...@ugent.be --- Disclaimer : http://helpdesk.ugent.be/e-maildisclaimer.php __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Optimizing given two vectors of data
Optim uses vectors of _parameters_, not of data. You add a (likelihood) function, give initial values of the parameters, and get the optimized parameters back. See ?optim and the examples therein. It contains an example for optimization using multiple data columns. Cheers Joris On Fri, Jun 25, 2010 at 8:12 AM, confusedSoul ruchir_402...@infosys.com wrote: I am trying to estimate an Arrhenius-exponential model in R. I have one vector of data containing failure times, and another containing corresponding temperatures. I am trying to optimize a maximum likelihood function given BOTH these vectors. However, the optim command takes only one such vector parameter. How can I pass both vectors into the function? -- View this message in context: http://r.789695.n4.nabble.com/Optimizing-given-two-vectors-of-data-tp2268002p2268002.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Joris Meys Statistical consultant Ghent University Faculty of Bioscience Engineering Department of Applied mathematics, biometrics and process control tel : +32 9 264 59 87 joris.m...@ugent.be --- Disclaimer : http://helpdesk.ugent.be/e-maildisclaimer.php __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
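A minimal sketch of the point made above: the data vectors go in through optim()'s '...' argument, while the first argument of the objective function stays the parameter vector. Simulated straight-line data stand in here; an Arrhenius likelihood would follow exactly the same pattern.

```r
# objective takes the parameter vector first, then the data as named args
sse <- function(par, x, y) sum((y - par[1] - par[2] * x)^2)

set.seed(1)
x <- runif(50)
y <- 2 + 3 * x + rnorm(50, sd = 0.1)

# x and y are forwarded to sse() via optim's '...'
fit <- optim(c(0, 0), sse, x = x, y = y)
fit$par   # close to c(2, 3)
```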
Re: [R] i want create script
Please read the posting guide : http://www.R-project.org/posting-guide.html Your question is very vague. One could assume you're completely new to R and want the commands to read a csv file (see ?read.csv) and to write out a table (e.g. ?write.table to write your predicted data in a text format). My guess is you want to run this script in the shell without having to open R, similar to a Perl script. For this, take a look at: http://cran.r-project.org/doc/manuals/R-intro.html#Scripting-with-R http://projects.uabgrid.uab.edu/r-group/wiki/CommandLineProcessing Cheers Joris On Fri, Jun 25, 2010 at 8:26 AM, vijaysheegi vijay.she...@gmail.com wrote: Hi R community, I want to create a script which will take a .csv table as input, do some prediction, and write the output to some file. The input is an Excel sheet containing some tables of data; the output should be a table of predicted data. Will someone help me in this regard? Thanks in advance. I am using R on Windows. Please advise on the procedure to create an Rscript. Regards - Vijay Research student Bangalore India -- View this message in context: http://r.789695.n4.nabble.com/i-want-create-script-tp2268011p2268011.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Joris Meys Statistical consultant Ghent University Faculty of Bioscience Engineering Department of Applied mathematics, biometrics and process control tel : +32 9 264 59 87 joris.m...@ugent.be --- Disclaimer : http://helpdesk.ugent.be/e-maildisclaimer.php __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
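A sketch of that non-interactive workflow; the file names foo.R and out.csv are placeholders, and it assumes Rscript is on the PATH:

```shell
# write a small script that fits a model and saves a table of predictions
cat > foo.R <<'EOF'
dat  <- data.frame(x = 1:5, y = c(2.1, 3.9, 6.2, 8.0, 9.9))
pred <- predict(lm(y ~ x, data = dat))
write.csv(data.frame(pred), "out.csv", row.names = FALSE)
EOF

# run it from the shell, no interactive R session needed
Rscript foo.R          # older alternative: R CMD BATCH foo.R
```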
Re: [R] Wilcoxon signed rank test and its requirements
2010/6/25 Frank E Harrell Jr f.harr...@vanderbilt.edu: The central limit theorem doesn't help. It just addresses type I error, not power. Frank I don't think I stated otherwise. I am aware of the fact that the Wilcoxon has an asymptotic relative efficiency greater than 1 compared to the t-test in the case of skewed distributions. Apologies if I caused more confusion. The problem with the Wilcoxon is twofold, as far as I understood this data correctly : - there are quite a few ties - the Wilcoxon assumes under the null that the distributions are the same, not only the location. The influence of unequal variances and/or shapes of the distribution is enhanced in the case of unequal sample sizes. The central limit theorem means that : - the t-test will do correct inference in the presence of ties - unequal variances can be taken into account using the modified (Welch) t-test, both in the case of equal and unequal sample sizes For these reasons, I would personally use the t-test for comparing two samples from the described population. Your mileage may vary. Cheers Joris On 06/25/2010 04:29 AM, Joris Meys wrote: As a remark on your histogram : use fewer breaks! This histogram tells you nothing. An interesting function is ?density , e.g. : x <- rnorm(250) hist(x, freq = FALSE) lines(density(x), col = "red") See also these slides, a very nice and short introduction to graphics in R : http://csg.sph.umich.edu/docs/R/graphics-1.pdf 2010/6/25 Atte Tenkanen atte...@utu.fi: Is there anything for me? There is a lot of data, n=2418, but there are also a lot of ties. My sample n≈250-300 You should think about the central limit theorem. Actually, you can just use a t-test to compare means, as with those sample sizes the mean is almost certainly normally distributed. I would like to test whether the mean of the sample differs significantly from the population mean. According to probability theory, this will be the case in 5% of the cases if you repeat your sampling infinitely.
But as David asked: why on earth do you want to test that? cheers Joris -- Frank E Harrell Jr Professor and ChairmanSchool of Medicine Department of Biostatistics Vanderbilt University -- Joris Meys Statistical consultant Ghent University Faculty of Bioscience Engineering Department of Applied mathematics, biometrics and process control tel : +32 9 264 59 87 joris.m...@ugent.be --- Disclaimer : http://helpdesk.ugent.be/e-maildisclaimer.php __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
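The point about ties can be checked directly: on a small, heavily tied sample, wilcox.test() cannot compute an exact p-value and warns before falling back on a normal approximation, while t.test() is unaffected (simulated data, not the thread's n=2418 sample):

```r
set.seed(7)
x <- sample(1:5, 40, replace = TRUE)    # small sample, many ties

tt <- t.test(x, mu = 3)                        # unaffected by ties
wt <- suppressWarnings(wilcox.test(x, mu = 3)) # warns about exact p with ties
c(t = tt$p.value, wilcoxon = wt$p.value)
```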
Re: [R] Euclidean Distance Matrix Analysis (EDMA) in R?
Thanks for the link, very interesting book. Yet I couldn't find the part about EDMA. It would have surprised me anyway, as the input of multidimensional scaling is one matrix with Euclidean distances between your observations, whereas in EDMA the data consist of a number of distance matrices. Quite a different thing, if you ask me. Neither cmdscale nor isoMDS nor its derived functions (e.g. metaMDS in the vegan package) are going to be of any help. Now I come to think of it, vegan has a procrustes function, but I'm not sure if it is generalized enough to be of use in EDMA. Cheers Joris On Fri, Jun 25, 2010 at 7:42 PM, Kjetil Halvorsen kjetilbrinchmannhalvor...@gmail.com wrote: There is a freely downloadable and very relevant (readable) book at https://ccrma.stanford.edu/~dattorro/mybook.html Convex Optimization and Euclidean Distance Geometry, and it indeed names EDMA as a form of multidimensional scaling (or maybe the other way around). You should have a look at the code for multidimensional scaling in R. Kjetil On Fri, Jun 25, 2010 at 6:25 AM, gokhanocakoglu ocako...@uludag.edu.tr wrote: thanks for your interest Joris Gokhan OCAKOGLU Uludag University Faculty of Medicine Department of Biostatistics http://www20.uludag.edu.tr/~biostat/ocakoglui.htm -- View this message in context: http://r.789695.n4.nabble.com/Euclidean-Distance-Matrix-Analysis-EDMA-in-R-tp2266797p2268257.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Average 2 Columns when possible, or return available value
Just want to add that if you want to clean out the NA rows in a matrix or data frame, take a look at ?complete.cases. Can be handy with big datasets. I got curious, so I just ran the code given here on a big dataset, before and after removing NA rows. I have to be honest, this is surely an illustration of the power of rowMeans. I'm amazed myself. DF <- data.frame( A=rep(DF$A,1), B=rep(DF$B,1) ) system.time(apply(DF,1,mean,na.rm=TRUE)) user system elapsed 13.26 0.06 13.46 system.time(matrix(rowMeans(DF, na.rm=TRUE), ncol=1)) user system elapsed 0.03 0.00 0.03 system.time(t(as.matrix(aggregate(t(as.matrix(DF)),list(rep(1:1,each=2)),mean, + na.rm=TRUE)[,-1])) + ) Timing stopped at: 227.84 1.03 249.31 (I got impatient and pressed escape) DF <- DF[complete.cases(DF),] system.time(apply(DF,1,mean,na.rm=TRUE)) user system elapsed 0.39 0.00 0.39 system.time(matrix(rowMeans(DF, na.rm=TRUE), ncol=1)) user system elapsed 0.01 0.00 0.02 system.time(t(as.matrix(aggregate(t(as.matrix(DF)),list(rep(1:1,each=2)),mean, + na.rm=TRUE)[,-1])) + ) user system elapsed 10.01 0.07 13.40 Cheers Joris On Sat, Jun 26, 2010 at 1:08 AM, emorway emor...@engr.colostate.edu wrote: Forum, Using the following data: DF <- read.table(textConnection("A B 22.60 NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA 102.00 NA 19.20 NA 19.20 NA NA NA NA NA NA NA 11.80 NA 7.62 NA NA NA NA NA NA NA NA NA NA NA 75.00 NA NA NA 18.30 18.2 NA NA NA NA 8.44 NA 18.00 NA NA NA 12.90 NA"), header = TRUE) closeAllConnections() The second column is a duplicate reading of the first column, and when two values are available, I would like to average columns 1 and 2 (example code below). But if there is only one reading, I would like to retain it, but I haven't found a good way to exclude NA's using the following code: t(as.matrix(aggregate(t(as.matrix(DF)),list(rep(1:1,each=2)),mean)[,-1])) Currently, row 24 is the only row with a returned value.
I'd like the result to return column A if it is the only available value, and average where possible. Of course, if both columns are NA, NA is the only possible result. The result I'm after would look like this (row 24 is an avg): 22.60 NA NA NA NA NA NA NA 102.00 19.20 19.20 NA NA NA 11.80 7.62 NA NA NA NA NA 75.00 NA 18.25 NA NA 8.44 18.00 NA 12.90 This is a small example from a much larger data frame, so if you're wondering what the deal is with list(), that will come into play for the larger problem I'm trying to solve. Respectfully, Eric -- View this message in context: http://r.789695.n4.nabble.com/Average-2-Columns-when-possible-or-return-available-value-tp2269049p2269049.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Joris Meys Statistical consultant Ghent University Faculty of Bioscience Engineering Department of Applied mathematics, biometrics and process control tel : +32 9 264 59 87 joris.m...@ugent.be --- Disclaimer : http://helpdesk.ugent.be/e-maildisclaimer.php __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
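On a toy version of Eric's data, rowMeans() with na.rm = TRUE does exactly what he asks for: average where both readings exist, keep the single reading otherwise. One caveat worth noting: an all-NA row yields NaN rather than NA.

```r
DF <- data.frame(A = c(22.60, NA, 18.30, 8.44),
                 B = c(NA,    NA, 18.20, NA))
res <- rowMeans(DF, na.rm = TRUE)
res   # single value kept, pair averaged, all-NA row becomes NaN
```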
Re: [R] All a column to a data frame with a specific condition
merge(sum_plan_a, data, by = "plan_a") Cheers Joris On Sat, Jun 26, 2010 at 2:27 AM, Yi liuyi.fe...@gmail.com wrote: Hi, folks, Please first look at the code: plan_a=c('apple','orange','apple','apple','pear','bread') plan_b=c('bread','bread','orange','bread','bread','yogurt') value=1:6 data=data.frame(plan_a,plan_b,value) library(plyr) library(reshape) mm=melt(data, id=c('plan_a','plan_b')) sum_plan_a=cast(mm,plan_a~variable,sum) ### I would like to add a new column to the data.frame named 'data', with the same sum of value for the same type of plan_a ### The result should come up like this: plan_a plan_b value sum_plan_a 1 apple bread 1 8 2 orange bread 2 2 3 apple orange 3 8 4 apple bread 4 8 5 pear bread 5 5 6 bread yogurt 6 6 Any tips? Thank you. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Joris Meys Statistical consultant Ghent University Faculty of Bioscience Engineering Department of Applied mathematics, biometrics and process control tel : +32 9 264 59 87 joris.m...@ugent.be --- Disclaimer : http://helpdesk.ugent.be/e-maildisclaimer.php __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
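An alternative sketch that needs neither reshape nor merge: base R's ave() computes the per-group sums and recycles them back onto the original rows in one step.

```r
plan_a <- c('apple','orange','apple','apple','pear','bread')
plan_b <- c('bread','bread','orange','bread','bread','yogurt')
value  <- 1:6
data   <- data.frame(plan_a, plan_b, value)

# sum of 'value' within each level of plan_a, aligned with the rows
data$sum_plan_a <- ave(data$value, data$plan_a, FUN = sum)
data$sum_plan_a   # 8 2 8 8 5 6
```

Unlike merge(), ave() preserves the original row order, which matches the expected output in the question.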
Re: [R] Popularity of R, SAS, SPSS, Stata...
I had taken the opposite tack with Google Trends by subtracting keywords like: SAS -shoes -airlines -sonar... but never got as good results as that beautiful X code for search. When you see the end-of-semester panic bumps in traffic, you know you're nailing it! I have to eat those words already. The R code for search that showed a peak every December did not have quotes around it, so it was searching for those three words, not the complete phrase. When you add the quotes, the peaks vanish. Don't swallow! You're looking through search terms, not through web pages. R code for regression, regression code R etc. are all valid searches, no quotation marks needed. http://www.google.com/insights/search/#q=code%20for%20r%2Ccode%20for%20SAS%2Ccode%20for%20SPSS%2Ccode%20for%20matlabcmpt=q This one is nice too. You can see that the bump in the autumn semester for R is replacing the one for Matlab. Then in the spring semester Matlab stays high but R drops. And both the US and India always have a very large search index, whereas the rest of the world barely registers. Which leads me to the conclusion that: 1) the results are probably coming from google.com, excluding local versions, and 2) in the US (and India), statistics is mainly taught in the autumn semester. Given the fact that daylight has a beneficial effect on emotional well-being, the unpopularity of statistics is likely caused by unfortunate scheduling. Forget Excel. Google rocks! ;-) Cheers Joris Once you go the phrase route, you gain precision but end up with zero counts on various phrases. I avoided that by combining them with + to get enough to plot.
The resulting graph shows SAS dominant until mid-2006, when SPSS takes the top position, followed by R, SAS, Stata in order: http://www.google.com/insights/search/#q=%22r%20code%20for%22%2B%22r%20manual%22%2B%22r%20tutorial%22%2B%22r%20graph%22%2C%22sas%20code%20for%22%2B%22sas%20manual%22%2B%22sas%20tutorial%22%2B%22sas%20graph%22%2C%22spss%20code%20for%22%2B%22spss%20manual%22%2B%22spss%20tutorial%22%2B%22spss%20graph%22%2C%22stata%20code%20for%22%2B%22stata%20manual%22%2B%22stata%20tutorial%22%2B%22stata%20graph%22%2C%22s-plus%20code%20for%22%2B%22s-plus%20manual%22%2Bs-plus%20tutorial%22%2B%22s-plus%20graph%22cmpt=q This might be a good one to add to http://r4stats.com/popularity Bob I see that there's a car, the R Code Mustang, that adding 'for' gets rid of. Thanks for getting me back on a topic that I had given up on! Bob -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Joris Meys Sent: Thursday, June 24, 2010 7:56 PM To: Dario Solari Cc: r-help@r-project.org Subject: Re: [R] Popularity of R, SAS, SPSS, Stata... Nice idea, but quite sensitive to search terms, if you compare your result on ... code with ... code for: http://www.google.com/insights/search/#q=r%20code%20for%2Csas%20code%20for%2Cspss%20code%20forcmpt=q On Thu, Jun 24, 2010 at 10:48 PM, Dario Solari dario.sol...@gmail.com wrote: First: excuse my English. My opinion: a useful source for measuring popularity can be Google Insights for Search - http://www.google.com/insights/search/# Every person using software like R, SAS, SPSS needs first to learn it. So he probably makes a web search for a manual, a tutorial, a guide. One can measure the share of this kind of search query. These kinds of results can be useful to determine trends in popularity.
Example 1: R tutorial/manual/guide, SAS tutorial/manual/guide, SPSS tutorial/manual/guide http://www.google.com/insights/search/#q=%22r%20tutorial%22%2B%22r%20manual%22%2B%22r%20guide%22%2B%22r%20vignette%22%2C%22spss%20tutorial%22%2B%22spss%20manual%22%2B%22spss%20guide%22%2C%22sas%20tutorial%22%2B%22sas%20manual%22%2B%22sas%20guide%22&cmpt=q Example 2: R software, SAS software, SPSS software http://www.google.com/insights/search/#q=%22r%20software%22%2C%22spss%20software%22%2C%22sas%20software%22&cmpt=q Example 3: R code, SAS code, SPSS code http://www.google.com/insights/search/#q=%22r%20code%22%2C%22spss%20code%22%2C%22sas%20code%22&cmpt=q Example 4: R graph, SAS graph, SPSS graph http://www.google.com/insights/search/#q=%22r%20graph%22%2C%22spss%20graph%22%2C%22sas%20graph%22&cmpt=q Example 5: R regression, SAS regression, SPSS regression http://www.google.com/insights/search/#q=%22r%20regression%22%2C%22spss%20regression%22%2C%22sas%20regression%22&cmpt=q Some examples are cross-software (learning needs: Example 1); others can be biased by the traditional use of that software (in SPSS you usually don't manipulate graphs, I think) __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Joris Meys
Re: [R] count data with a specific range
see ?levels, e.g.:

x <- rnorm(10)
y <- cut(x, c(-10, 0, 10))
levels(y) <- c("-10-0", "0-10")

cheers Joris On Thu, Jun 24, 2010 at 4:14 AM, Yi liuyi.fe...@gmail.com wrote: Yeap. It works. Just to make the result more beautiful. One more question. The interval is shown as (0,10]. Is there a way to change it into the format 0-10? Thanks. On Wed, Jun 23, 2010 at 6:12 PM, Joris Meys jorism...@gmail.com wrote: see ?cut Cheers Joris On Thu, Jun 24, 2010 at 2:57 AM, Yi liuyi.fe...@gmail.com wrote: I would like to prepare the data for barplot. But I only have the data frame now.

x1 <- rnorm(10, mean = 2)
x2 <- rnorm(20, mean = -1)
x3 <- rnorm(15, mean = 3)
data <- data.frame(x1, x2, x3)

Is there a way to put the data within specific ranges? The expected result is as follows: range x1 x2 x3 -10-0 2 5 1 (# points in this range) 0-10 7 9 6 ... I know the table function but I do not know how to deal with the range issue. Thanks in advance. Yi [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Joris Meys Statistical consultant Ghent University Faculty of Bioscience Engineering Department of Applied mathematics, biometrics and process control tel : +32 9 264 59 87 joris.m...@ugent.be --- Disclaimer : http://helpdesk.ugent.be/e-maildisclaimer.php -- Joris Meys Statistical consultant Ghent University Faculty of Bioscience Engineering Department of Applied mathematics, biometrics and process control tel : +32 9 264 59 87 joris.m...@ugent.be --- Disclaimer : http://helpdesk.ugent.be/e-maildisclaimer.php __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
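[Editor's note: to go from cut() straight to the count table the original poster asked for, here is a minimal sketch; the labels argument of cut() names the intervals directly, so no separate levels()<- step is needed. The break points and labels are just illustrative.]

```r
set.seed(1)
x1 <- rnorm(10, mean = 2)
# cut() bins the values; labels= names the bins in the desired "lo-hi" format
y <- cut(x1, breaks = c(-10, 0, 10), labels = c("-10-0", "0-10"))
# table() then counts how many points fall in each range, ready for barplot()
table(y)
```

The same two lines applied to each of x1, x2, x3 (e.g. via sapply) give the columns of the wanted range-by-variable table.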
Re: [R] Question on WLS (gls vs lm)
Isn't that exactly what you would expect when using a _generalized_ least squares compared to a normal least squares? GLS is not the same as WLS. http://www.aiaccess.net/English/Glossaries/GlosMod/e_gm_least_squares_generalized.htm Cheers Joris On Thu, Jun 24, 2010 at 9:16 AM, Stats Wolf stats.w...@gmail.com wrote: Hi all, I understand that gls() uses generalized least squares, but I thought that maybe the optimum weights from gls might be used as weights in lm (as shown below), but apparently this is not the case. See:

library(nlme)
f1 <- gls(Petal.Width ~ Species / Petal.Length, data = iris,
          weights = varIdent(form = ~ 1 | Species))
aa <- attributes(summary(f1)$modelStruct$varStruct)$weights
f2 <- lm(Petal.Width ~ Species / Petal.Length, data = iris, weights = aa)
summary(f1)$tTable; summary(f2)

So, the two models with the very same weights do differ (in terms of standard errors). Could you please explain why? Are these different types of weights? Many thanks in advance, Stats Wolf __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Joris Meys Statistical consultant Ghent University Faculty of Bioscience Engineering Department of Applied mathematics, biometrics and process control tel : +32 9 264 59 87 joris.m...@ugent.be --- Disclaimer : http://helpdesk.ugent.be/e-maildisclaimer.php __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Wilcoxon signed rank test and its requirements
One way of looking at it is doing a sign test after subtraction of the mean. For symmetrical data sets the mean equals the median, so you expect about as many values above the mean as below it. There is a sign test somewhere in one of the packages, but it's easily done using binom.test as well:

set.seed(12345)
x1 <- rnorm(100)
x2 <- rpois(100, 2)
binom.test(sum(x1 - mean(x1) > 0), length(x1))

Exact binomial test data: sum(x1 - mean(x1) > 0) and length(x1) number of successes = 56, number of trials = 100, p-value = 0.2713 alternative hypothesis: true probability of success is not equal to 0.5 95 percent confidence interval: 0.4571875 0.6591640 sample estimates: probability of success 0.56

binom.test(sum(x2 - mean(x2) > 0), length(x2))

Exact binomial test data: sum(x2 - mean(x2) > 0) and length(x2) number of successes = 37, number of trials = 100, p-value = 0.01203 alternative hypothesis: true probability of success is not equal to 0.5 95 percent confidence interval: 0.2755666 0.4723516 sample estimates: probability of success 0.37

Cheers Joris On Thu, Jun 24, 2010 at 4:16 AM, Atte Tenkanen atte...@utu.fi wrote: PS. Maybe I can somehow try to transform the data and check it, for example, using the skewness function of the timeDate package? Thanks. What I have had to ask is: how do you test that the data is symmetric enough? If it is not, is it ok to use some data transformation? when it is said: "The Wilcoxon signed rank test does not assume that the data are sampled from a Gaussian distribution. However it does assume that the data are distributed symmetrically around the median. If the distribution is asymmetrical, the P value will not tell you much about whether the median is different than the hypothetical value." On Wed, Jun 23, 2010 at 10:27 PM, Atte Tenkanen atte...@utu.fi wrote: Hi all, I have a distribution, and take a sample of it.
Then I compare that sample with the mean of the population, like here in a Wilcoxon signed rank test with continuity correction:

wilcox.test(Sample, mu = mean(All), alt = "two.sided")

Wilcoxon signed rank test with continuity correction data: AlphaNoteOnsetDists V = 63855, p-value = 0.0002093 alternative hypothesis: true location is not equal to 0.4115136

wilcox.test(Sample, mu = mean(All), alt = "greater")

Wilcoxon signed rank test with continuity correction data: AlphaNoteOnsetDists V = 63855, p-value = 0.0001047 alternative hypothesis: true location is greater than 0.4115136

What assumptions are needed for the population? Wikipedia says: "The Wilcoxon signed-rank test is a _non-parametric_ statistical hypothesis test for..." and it also talks about the assumptions. What can we say according to these results? The p-value for "less" is 0.999. That the p-values for "less" and "greater" seem to sum up to one, and that the p-value of "greater" is half of that for "two.sided". You shouldn't ask what we can say. You should ask yourself "What was the question, and is this test giving me an answer to that question?" Cheers Joris -- Joris Meys Statistical consultant Ghent University Faculty of Bioscience Engineering Department of Applied mathematics, biometrics and process control tel : +32 9 264 59 87 joris.m...@ugent.be --- Disclaimer : http://helpdesk.ugent.be/e-maildisclaimer.php -- Joris Meys Statistical consultant Ghent University Faculty of Bioscience Engineering Department of Applied mathematics, biometrics and process control tel : +32 9 264 59 87 joris.m...@ugent.be --- Disclaimer : http://helpdesk.ugent.be/e-maildisclaimer.php __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Question on WLS (gls vs lm)
Indeed, WLS is a special case of GLS, where the error covariance matrix is a diagonal matrix. OLS is a special case of GLS, where the error is considered homoscedastic and all weights are equal to 1. And I now realize that varIdent() indeed makes a diagonal covariance matrix, so the results should in fact be the same. Sorry for missing that one. A closer inspection shows that the results don't differ too much. The fitting method differs between the two functions; lm.wfit uses the QR decomposition, whereas gls() uses restricted maximum likelihood. In Asymptopia, they should give the same result. Cheers Joris On Thu, Jun 24, 2010 at 12:54 PM, Stats Wolf stats.w...@gmail.com wrote: Thanks for the reply. Yes, they do differ, but does not gls() with the weights argument (correlation being unchanged) make the special version of GLS, as this sentence from the page you provided says: "The method leading to this result is called Generalized Least Squares estimation (GLS), of which WLS is just a special case"? Best, Stats Wolf On Thu, Jun 24, 2010 at 12:49 PM, Joris Meys jorism...@gmail.com wrote: Isn't that exactly what you would expect when using a _generalized_ least squares compared to a normal least squares? GLS is not the same as WLS. http://www.aiaccess.net/English/Glossaries/GlosMod/e_gm_least_squares_generalized.htm Cheers Joris On Thu, Jun 24, 2010 at 9:16 AM, Stats Wolf stats.w...@gmail.com wrote: Hi all, I understand that gls() uses generalized least squares, but I thought that maybe the optimum weights from gls might be used as weights in lm (as shown below), but apparently this is not the case. See:

library(nlme)
f1 <- gls(Petal.Width ~ Species / Petal.Length, data = iris,
          weights = varIdent(form = ~ 1 | Species))
aa <- attributes(summary(f1)$modelStruct$varStruct)$weights
f2 <- lm(Petal.Width ~ Species / Petal.Length, data = iris, weights = aa)
summary(f1)$tTable; summary(f2)

So, the two models with the very same weights do differ (in terms of standard errors).
Could you please explain why? Are these different types of weights? Many thanks in advance, Stats Wolf __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Joris Meys Statistical consultant Ghent University Faculty of Bioscience Engineering Department of Applied mathematics, biometrics and process control tel : +32 9 264 59 87 joris.m...@ugent.be --- Disclaimer : http://helpdesk.ugent.be/e-maildisclaimer.php -- Joris Meys Statistical consultant Ghent University Faculty of Bioscience Engineering Department of Applied mathematics, biometrics and process control tel : +32 9 264 59 87 joris.m...@ugent.be --- Disclaimer : http://helpdesk.ugent.be/e-maildisclaimer.php __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] ?to calculate sth for groups defined between points in one variable (string), / value separating/ spliting variable into groups by i.e. between start, NA, NA, stop1, start2, NA, stop2
On Thu, Jun 24, 2010 at 1:18 PM, Eugeniusz Kaluza eugeniusz.kal...@polsl.pl wrote: Dear useRs, Thanks for any advice

# I do not know where to find examples of how to mark groups
# based on signal occurrence in an additional variable: cf. variable c2.
# How to perform different calculations for groups defined by (split by the occurrence of) characteristic data in c2?
# First example of simple data
# example 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17
c0 <- rbind(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17)
c0
c1 <- rbind(10, 20, 30, 40, 50, 10, 60, 20, 30, 40, 50, 30, 10, 0, NA, 20, 10.3444)
c1
c2 <- rbind(NA, "Start1", NA, NA, "Stop1", NA, NA, NA, NA, NA, NA, "Start2", NA, NA, NA, NA, "Stop2")
c2
C.df <- data.frame(cbind(c0, c1, c2))
colnames(C.df) <- c("c0", "c1", "c2")
C.df
# preparation of a form for explaining the needed result (the next 3 lines are not needed indeed, they are only to explain how to obtain the final result)
c3 <- rbind(NA, "Start1", "Start1", "Start1", "Start1", "Start2", "Start2", "Start2", "Start2", "Start2", "Start2", "Start2", "Start2", "Start2", "Start2", "Start2", "Start2")
c4 <- rbind(NA, "Stop1", "Stop1", "Stop1", "Stop1", "Stop2", "Stop2", "Stop2", "Stop2", "Stop2", "Stop2", "Stop2", "Stop2", "Stop2", "Stop2", "Stop2", "Stop2")
C.df <- data.frame(cbind(c0, c1, c2, c3, c4))
colnames(C.df) <- c("c0", "c1", "c2", "c3", "c4")
C.df$c5 <- paste(C.df$c3, C.df$c4, sep = "-")
C.df

Now this is something I don't get. The list Start2-Stop2 starts way before Start2, actually at Stop1. Sure that's what you want? I took the liberty of showing how to get the data between start and stop for every entry, and how to apply functions to it. If you don't get the code, look at ?lapply ?apply ?grep I also adjusted your example, as you caused all variables to be factors by using cbind inside the data.frame function. Never do this unless you're really sure you have to. But I can't think of a case where that would be beneficial... ...
C.df <- data.frame(c0, c1, c2)
C.df
# find positions
Start <- grep("Start", C.df$c2)
Stop <- grep("Stop", C.df$c2)
# create indices
idx <- apply(cbind(Start, Stop), 1, function(i) i[1]:i[2])
names(idx) <- paste("Start", 1:length(Start), "-Stop", 1:length(Start), sep = "")
# Apply the function summary and get a list back named by the interval.
out <- lapply(idx, function(i) summary(C.df[i, 1:2]))
out

If you really need to start Start2 right after Stop1, you can use a similar approach. Cheers Joris # NEEDED RESULTS # needed result # for Start1-Stop1: mean(20,30,40,50) # for Start2-Stop2: mean(c(10,60,20,30,40,50,30,10,0,NA,20,10.3444), na.rm=T) #mean: c1 c3 c4 c5 20 Start1 Stop1 Start1-Stop1 25.48585 Start2 Stop2 Start2-Stop2 #sum # for Start1-Stop1: sum(20,30,40,50) # for Start2-Stop2: sum(c(10,60,20,30,40,50,30,10,0,NA,20,10.3444), na.rm=T) #sum: c1 c3 c4 c5 140 Start1 Stop1 Start1-Stop1 280.3444 Start2 Stop2 Start2-Stop2 # for Start1-Stop1: max(20,30,40,50) # for Start2-Stop2: max(c(10,60,20,30,40,50,30,10,0,NA,20,10.3444), na.rm=T) #max: c1 c3 c4 c5 50 Start1 Stop1 Start1-Stop1 60 Start2 Stop2 Start2-Stop2 # place of max in Start1-Stop1: 4th element in group Start1-Stop1 # place of max in Start2-Stop2: 2nd element in group Start2-Stop2 c0 c3 c4 c5 4 Start1 Stop1 Start1-Stop1 2 Start2 Stop2 Start2-Stop2 Thanks for any suggestion, Kaluza [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
-- Joris Meys Statistical consultant Ghent University Faculty of Bioscience Engineering Department of Applied mathematics, biometrics and process control tel : +32 9 264 59 87 joris.m...@ugent.be --- Disclaimer : http://helpdesk.ugent.be/e-maildisclaimer.php __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Question on WLS (gls vs lm)
Thanks! I was getting confused as well; Wolf really had a point. I have to admit that this is all a bit counterintuitive. I would expect those weights to be the inverse of the variances, as GLS uses the inverse of the variance-covariance matrix. I finally found the needed information in the help files (?nlme::varWeights), after you pointed out the problem and after strolling through quite a part of the help files... Does this mean that the GLS algorithm uses 1/sd as weights (can't imagine that...), or is this one of the things in R that just are like that? Cheers Joris On Thu, Jun 24, 2010 at 3:20 PM, Viechtbauer Wolfgang (STAT) wolfgang.viechtba...@stat.unimaas.nl wrote: The weights in 'aa' are the inverse standard deviations. But you want to use the inverse variances as the weights:

aa <- (attributes(summary(f1)$modelStruct$varStruct)$weights)^2

And then the results are essentially identical. Best, -- Wolfgang Viechtbauer http://www.wvbauer.com/ Department of Methodology and Statistics Tel: +31 (0)43 388-2277 School for Public Health and Primary Care Office Location: Maastricht University, P.O. Box 616 Room B2.01 (second floor) 6200 MD Maastricht, The Netherlands Debyeplein 1 (Randwyck) -----Original Message----- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Stats Wolf Sent: Thursday, June 24, 2010 15:00 To: Joris Meys Cc: r-help@r-project.org Subject: Re: [R] Question on WLS (gls vs lm) Indeed, they should give the same results, and hence I was worried to see that the results were not the same. Suffice it to look at the standard errors and p-values: they do differ, and the differences are not really that small. Thanks, Stats Wolf On Thu, Jun 24, 2010 at 2:39 PM, Joris Meys jorism...@gmail.com wrote: Indeed, WLS is a special case of GLS, where the error covariance matrix is a diagonal matrix. OLS is a special case of GLS, where the error is considered homoscedastic and all weights are equal to 1.
And I now realize that varIdent() indeed makes a diagonal covariance matrix, so the results should in fact be the same. Sorry for missing that one. A closer inspection shows that the results don't differ too much. The fitting method differs between the two functions; lm.wfit uses the QR decomposition, whereas gls() uses restricted maximum likelihood. In Asymptopia, they should give the same result. Cheers Joris On Thu, Jun 24, 2010 at 12:54 PM, Stats Wolf stats.w...@gmail.com wrote: Thanks for the reply. Yes, they do differ, but does not gls() with the weights argument (correlation being unchanged) make the special version of GLS, as this sentence from the page you provided says: "The method leading to this result is called Generalized Least Squares estimation (GLS), of which WLS is just a special case"? Best, Stats Wolf On Thu, Jun 24, 2010 at 12:49 PM, Joris Meys jorism...@gmail.com wrote: Isn't that exactly what you would expect when using a _generalized_ least squares compared to a normal least squares? GLS is not the same as WLS. http://www.aiaccess.net/English/Glossaries/GlosMod/e_gm_least_squares_generalized.htm Cheers Joris On Thu, Jun 24, 2010 at 9:16 AM, Stats Wolf stats.w...@gmail.com wrote: Hi all, I understand that gls() uses generalized least squares, but I thought that maybe the optimum weights from gls might be used as weights in lm (as shown below), but apparently this is not the case. See:

library(nlme)
f1 <- gls(Petal.Width ~ Species / Petal.Length, data = iris,
          weights = varIdent(form = ~ 1 | Species))
aa <- attributes(summary(f1)$modelStruct$varStruct)$weights
f2 <- lm(Petal.Width ~ Species / Petal.Length, data = iris, weights = aa)
summary(f1)$tTable; summary(f2)

So, the two models with the very same weights do differ (in terms of standard errors).
Many thanks in advance, Stats Wolf __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Joris Meys Statistical consultant Ghent University Faculty of Bioscience Engineering Department of Applied mathematics, biometrics and process control tel : +32 9 264 59 87 joris.m...@ugent.be --- Disclaimer : http://helpdesk.ugent.be/e-maildisclaimer.php -- Joris Meys Statistical consultant Ghent University Faculty of Bioscience Engineering Department of Applied mathematics, biometrics and process control tel : +32 9 264 59 87 joris.m...@ugent.be --- Disclaimer : http://helpdesk.ugent.be/e-maildisclaimer.php __ R-help@r-project.org mailing list https
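[Editor's note: a minimal sketch pulling this thread's resolution together, assuming nlme is installed (iris ships with R). Squaring the varIdent weights turns inverse standard deviations into the inverse variances that lm() expects, after which the two fits should agree up to the REML-vs-QR fitting difference discussed above.]

```r
library(nlme)

# GLS fit with a separate residual variance per species
f1 <- gls(Petal.Width ~ Species / Petal.Length, data = iris,
          weights = varIdent(form = ~ 1 | Species))

# varWeights are inverse standard deviations; square them for lm()
aa <- attributes(summary(f1)$modelStruct$varStruct)$weights^2
f2 <- lm(Petal.Width ~ Species / Petal.Length, data = iris, weights = aa)

# standard errors side by side: essentially identical
cbind(gls = sqrt(diag(vcov(f1))), lm = sqrt(diag(vcov(f2))))
```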
Re: [R] How to say if error
An old-fashioned and I guess also advised-against method would be to use try() itself, e.g.:

set.seed(1)
x <- rnorm(1:10)
y <- letters[1:10]
z <- rnorm(1:10)
for (i in list(x, y, z)) {
  cc <- try(sum(i), silent = TRUE)
  if (is(cc, "try-error")) next
  print(cc)
}

Put silent=FALSE if you want to see the error messages. See also ?try (and ?is) Cheers Joris On Thu, Jun 24, 2010 at 3:34 PM, Duncan Murdoch murdoch.dun...@gmail.com wrote: On 24/06/2010 7:06 AM, Paul Chatfield wrote: I've had a look at the conditions in base and I can't get the ones I've looked at to work, but it is all new to me. For example, I can work the examples for tryCatch, but it won't print a "finally" message for me when I apply it to my model. Even if I could get this to work, I think it would still cause a break, e.g.

for (j in 1:10) {tryCatch(ifelse(j==5, stop(j), j), finally=print("oh dear"))}

Thanks for the suggestion though - any others? I think you don't want to use finally, which is just code that's guaranteed to be executed at the end. You want to catch the errors and continue. For example,

for (j in 1:10) {
  tryCatch(ifelse(j==5, stop(j), print(j)),
           error=function(e) {print("caught error"); print(e)})
}

Duncan Murdoch __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Joris Meys Statistical consultant Ghent University Faculty of Bioscience Engineering Department of Applied mathematics, biometrics and process control tel : +32 9 264 59 87 joris.m...@ugent.be --- Disclaimer : http://helpdesk.ugent.be/e-maildisclaimer.php __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] If-error-resume-next
http://r.789695.n4.nabble.com/How-to-say-if-error-td2266619.html#a2266619 Cheers On Thu, Jun 24, 2010 at 4:30 PM, Christofer Bogaso bogaso.christo...@gmail.com wrote: In VBA there is a syntax "On Error Resume Next" which sometimes acts as very beneficial for ignoring errors. Is there any similar kind of function/syntax here in R as well? Thanks for your attention [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Joris Meys Statistical consultant Ghent University Faculty of Bioscience Engineering Department of Applied mathematics, biometrics and process control tel : +32 9 264 59 87 joris.m...@ugent.be --- Disclaimer : http://helpdesk.ugent.be/e-maildisclaimer.php __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Averaging half hourly data to hourly
See ?aggregate Cheers J On Thu, Jun 24, 2010 at 10:45 AM, Jennifer Wright j.wright...@sms.ed.ac.uk wrote: Hi all, I have some time-series data in half hourly time steps and I want to be able to either average or sum the two half hours together into hours. Any help greatly appreciated. Thanks, Jenny -- The University of Edinburgh is a charitable body, registered in Scotland, with registration number SC005336. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Joris Meys Statistical consultant Ghent University Faculty of Bioscience Engineering Department of Applied mathematics, biometrics and process control tel : +32 9 264 59 87 joris.m...@ugent.be --- Disclaimer : http://helpdesk.ugent.be/e-maildisclaimer.php __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
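[Editor's note: a minimal sketch of the ?aggregate approach, assuming the timestamps are POSIXct; the column names and times here are invented for illustration. Truncating each timestamp to its hour gives the grouping variable; swap FUN = mean for FUN = sum to total the two half hours instead.]

```r
# fake half-hourly series: 4 hours at 30-minute steps
times <- seq(as.POSIXct("2010-06-24 00:00", tz = "UTC"),
             by = "30 min", length.out = 8)
dat <- data.frame(time = times, value = rnorm(8))

# group by the hour each reading falls in, then average the half hours
hourly <- aggregate(dat["value"],
                    by = list(hour = format(dat$time, "%Y-%m-%d %H:00")),
                    FUN = mean)
hourly
```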
Re: [R] How to say if error
You could do that using the options, e.g.:

set.seed(1)
x <- rnorm(1:10)
y <- letters[1:10]
z <- rnorm(1:10)
warn <- getOption("warn")
options(warn = 2)
for (i in list(x, y, z)) {
  cc <- try(mean(i), silent = TRUE)
  if (is(cc, "try-error")) next
  print(cc)
}
options(warn = warn)

see ?options under "warn" Cheers Joris On Thu, Jun 24, 2010 at 5:12 PM, Paul Chatfield p.s.chatfi...@reading.ac.uk wrote: On a similar issue, how can you detect a warning in a loop - e.g. the following gives a warning, so I'd like to set up code to recognise that and then carry on in a loop

x <- rnorm(2); y <- c(1, 0)
ff <- glm(y/23 ~ x, family = binomial)

so this would be incorporated into a loop that might be

x <- rnorm(10); y <- rep(c(1, 0), 5)
for (i in 1:10) {
  ee <- glm(y ~ x, family = binomial)
  ff <- glm(y/23 ~ x, family = binomial)
}

from which I would recognise the warning in ff and not those in ee, saving results from ee and not from ff. The last bit would be easy, adding a line if(there_is_a_warning_message) {newvector <- NA} else {use results}, but how do you detect the warning message? Thanks all for your feedback so far, Paul -- View this message in context: http://r.789695.n4.nabble.com/How-to-say-if-error-tp2266619p2267140.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Joris Meys Statistical consultant Ghent University Faculty of Bioscience Engineering Department of Applied mathematics, biometrics and process control tel : +32 9 264 59 87 joris.m...@ugent.be --- Disclaimer : http://helpdesk.ugent.be/e-maildisclaimer.php __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
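[Editor's note: an alternative sketch that catches the warning itself instead of promoting all warnings to errors with options(warn=2), so a clean fit is still returned. The helper name fit_or_na is invented for illustration; withCallingHandlers and invokeRestart("muffleWarning") are base R condition handling.]

```r
# Evaluate an expression; return NA if it raised a warning, its value otherwise.
fit_or_na <- function(expr) {
  warned <- FALSE
  res <- withCallingHandlers(expr,
    warning = function(w) {
      warned <<- TRUE                  # remember that a warning fired
      invokeRestart("muffleWarning")   # suppress it and keep going
    })
  if (warned) NA else res
}

x <- rnorm(10); y <- rep(c(1, 0), 5)
ee <- fit_or_na(glm(y ~ x, family = binomial))      # kept: fits cleanly
ff <- fit_or_na(glm(y / 23 ~ x, family = binomial)) # NA: non-integer successes warning
```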
Re: [R] PD: ?to calculate sth for groups defined between points in one variable (string), / value separating/ spliting variable into groups by i.e. between start, NA, NA, stop1, start2, NA, stop2
Same trick:

c0 <- rbind(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17)
c0
c1 <- rbind(10, 20, 30, 40, 50, 10, 60, 20, 30, 40, 50, 30, 10, 0, NA, 20, 10.3444)
c1
c2 <- rbind(NA, "A", NA, NA, "B", NA, NA, NA, NA, NA, NA, "C", NA, NA, NA, NA, "D")
c2
C.df <- data.frame(c0, c1, c2)
C.df
pos <- which(!is.na(C.df$c2))
idx <- sapply(2:length(pos), function(i) pos[i-1]:(pos[i]-1))
names(idx) <- sapply(2:length(pos), function(i) paste(C.df$c2[pos[i-1]], "-", C.df$c2[pos[i]]))
out <- lapply(idx, function(i) summary(C.df[i, 1:2]))
out

2010/6/24 Eugeniusz Kałuża eugeniusz.kal...@polsl.pl: Dear useRs, Thanks for the advice from Joris Meys. Now I will try to think how to make it work for a less specific case, to make the problem more general. Then the result should be displayed for every group between non-empty strings in c2, i.e. not only the result for: #mean: c1 c3 c4 c5 20 Start1 Stop1 Start1-Stop1 25.48585 Start2 Stop2 Start2-Stop2 but also for every group created by the space between the two closest strings in c2, which contains only series of NA, NA, NA, separated from time to time by one string, i.e.: #mean: c1 c3 c4 c5 20 Start1 Stop1 Start1-Stop1 .. Stop1 Start2 Stop1-Start2 25.48585 Start2 Stop2 Start2-Stop2 i.e. to rewrite this maybe for another, simpler version of the command, but also for every group created by the space between the two closest strings in c2 that contains only series of NA, NA, NA, separated from time to time by one string: A, NA, NA, NA, NA, B, NA, NA, NA, C, NA, NA, NA, NA, D, NA, NA, i.e.: #mean: c1 c3 c4 c5 20 A B A-B .. B C B-C 25.48585 C D C-D ... Looking for a more general method (function), grouping between these letters in c2, I will now try to study the solution proposed by Joris Meys. Thanks for the immediate answer, Kaluza -----Original Message----- From: Joris Meys [mailto:jorism...@gmail.com] Sent: Thu 2010-06-24 15:14 To: Eugeniusz Kałuża Cc: r-help@r-project.org Subject: Re: [R] ?to calculate sth for groups defined between points in one variable (string), / value separating/ spliting variable into groups by i.e.
between start, NA, NA, stop1, start2, NA, stop2 On Thu, Jun 24, 2010 at 1:18 PM, Eugeniusz Kaluza eugeniusz.kal...@polsl.pl wrote: Dear useRs, Thanks for any advice

# I do not know where to find examples of how to mark groups
# based on signal occurrence in an additional variable: cf. variable c2.
# How to perform different calculations for groups defined by (split by the occurrence of) characteristic data in c2?
# First example of simple data
# example 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17
c0 <- rbind(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17)
c0
c1 <- rbind(10, 20, 30, 40, 50, 10, 60, 20, 30, 40, 50, 30, 10, 0, NA, 20, 10.3444)
c1
c2 <- rbind(NA, "Start1", NA, NA, "Stop1", NA, NA, NA, NA, NA, NA, "Start2", NA, NA, NA, NA, "Stop2")
c2
C.df <- data.frame(cbind(c0, c1, c2))
colnames(C.df) <- c("c0", "c1", "c2")
C.df
# preparation of a form for explaining the needed result (the next 3 lines are not needed indeed, they are only to explain how to obtain the final result)
c3 <- rbind(NA, "Start1", "Start1", "Start1", "Start1", "Start2", "Start2", "Start2", "Start2", "Start2", "Start2", "Start2", "Start2", "Start2", "Start2", "Start2", "Start2")
c4 <- rbind(NA, "Stop1", "Stop1", "Stop1", "Stop1", "Stop2", "Stop2", "Stop2", "Stop2", "Stop2", "Stop2", "Stop2", "Stop2", "Stop2", "Stop2", "Stop2", "Stop2")
C.df <- data.frame(cbind(c0, c1, c2, c3, c4))
colnames(C.df) <- c("c0", "c1", "c2", "c3", "c4")
C.df$c5 <- paste(C.df$c3, C.df$c4, sep = "-")
C.df

Now this is something I don't get. The list Start2-Stop2 starts way before Start2, actually at Stop1. Sure that's what you want? I took the liberty of showing how to get the data between start and stop for every entry, and how to apply functions to it. If you don't get the code, look at ?lapply ?apply ?grep I also adjusted your example, as you caused all variables to be factors by using cbind inside the data.frame function. Never do this unless you're really sure you have to. But I can't think of a case where that would be beneficial... ...
C.df <- data.frame(c0, c1, c2)
C.df
# find positions
Start <- grep("Start", C.df$c2)
Stop <- grep("Stop", C.df$c2)
# create indices
idx <- apply(cbind(Start, Stop), 1, function(i) i[1]:i[2])
names(idx) <- paste("Start", 1:length(Start), "-Stop", 1:length(Start), sep = "")
# Apply the function summary and get a list back named by the interval.
out <- lapply(idx, function(i) summary(C.df[i, 1:2]))
out

If you really need to start Start2 right after Stop1, you can use a similar approach. Cheers Joris # NEEDED RESULTS # needed result # for Start1-Stop1: mean(20,30,40,50) # for Start2-Stop2: mean(c(10,60,20,30,40,50,30,10,0,NA,20,10.3444), na.rm=T) #mean: c1 c3 c4 c5 20 Start1 Stop1 Start1-Stop1 25.48585
Re: [R] Wilcoxon signed rank test and its requirements
I do agree that one should not rely solely on sources like Wikipedia and GraphPad, although they contain a lot of valuable information. This said, it is not too difficult to illustrate why, in the case of the one-sample signed rank test, the differences should not be too far away from symmetrical. It just needs some reflection on how the statistic is calculated. If you have an asymmetrical distribution, you have a lot of small differences with a negative sign and a lot of large differences with a positive sign if you test against the median or mean. Hence the sum of ranks for one side will be higher than for the other, leading eventually to a significant result. An extreme example:

set.seed(100)
y <- rnorm(100, 1, 2)^2
wilcox.test(y, mu = median(y))

Wilcoxon signed rank test with continuity correction data: y V = 3240.5, p-value = 0.01396 alternative hypothesis: true location is not equal to 1.829867

wilcox.test(y, mu = mean(y))

Wilcoxon signed rank test with continuity correction data: y V = 1763, p-value = 0.008837 alternative hypothesis: true location is not equal to 5.137409

Which brings us to the question of what location is actually tested in the Wilcoxon test. For the measure of location to be the mean (or median), one has to assume that the distribution of the differences is rather symmetrical, which implies your data have to be distributed somewhat symmetrically. The test is robust against violations of this -implicit- assumption, but in more extreme cases skewness does matter. Cheers Joris On Thu, Jun 24, 2010 at 7:40 PM, David Winsemius dwinsem...@comcast.net wrote: You are being misled. Simply finding a statement on a statistics software website, even one as reputable as GraphPad (???), does not mean that it is necessarily true. My understanding (confirmed by reviewing Nonparametric Statistical Methods for Complete and Censored Data by M. M. Desu and Damaraju Raghavarao) is that the Wilcoxon signed-rank test does not require that the underlying distributions be symmetric.
The above quotation is highly inaccurate. -- David. -- Joris Meys Statistical consultant Ghent University Faculty of Bioscience Engineering Department of Applied mathematics, biometrics and process control tel : +32 9 264 59 87 joris.m...@ugent.be --- Disclaimer : http://helpdesk.ugent.be/e-maildisclaimer.php __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Wilcoxon signed rank test and its requirements
On Fri, Jun 25, 2010 at 12:17 AM, David Winsemius dwinsem...@comcast.net wrote: On Jun 24, 2010, at 6:09 PM, Joris Meys wrote: I do agree that one should not rely solely on sources like Wikipedia and GraphPad, although they contain a lot of valuable information. This said, it is not too difficult to illustrate why, in the case of the one-sample signed rank test, That is a key point. I was assuming that you were using the paired-sample version of the WSRT and I may have been misleading the OP. For the one-sample situation, the assumption of symmetry is needed, but for the paired-sampling version of the test, the location shift becomes the tested hypothesis, and no assumptions about the form of the hypothesis are made except that they be the same. I believe you mean the form of the distributions. The assumption that the distributions of both samples are the same (or similar, it is a robust test) implies that the differences x_i - y_i are more or less symmetrically distributed. The key point here is that we're not talking about the distribution of the populations/samples (as done in the OP) but about the distribution of the differences. I may not have been clear enough on that one. Cheers Joris Any consideration of median or mean (which will be the same in the case of symmetric distributions) gets lost in the paired test case. -- David. the differences should not be too far from symmetrical. It just needs some reflection on how the statistic is calculated. If you have an asymmetrical distribution, you have a lot of small differences with a negative sign and a lot of large differences with a positive sign if you test against the median or mean. Hence the sum of ranks for one side will be higher than for the other, eventually leading to a significant result.
An extreme example:

set.seed(100)
y <- rnorm(100, 1, 2)^2
wilcox.test(y, mu = median(y))
        Wilcoxon signed rank test with continuity correction
data: y
V = 3240.5, p-value = 0.01396
alternative hypothesis: true location is not equal to 1.829867
wilcox.test(y, mu = mean(y))
        Wilcoxon signed rank test with continuity correction
data: y
V = 1763, p-value = 0.008837
alternative hypothesis: true location is not equal to 5.137409

Which brings us to the question of which location is actually tested in the Wilcoxon test. For the measure of location to be the mean (or median), one has to assume that the distribution of the differences is rather symmetrical, which implies your data has to be distributed somewhat symmetrically. The test is robust against violations of this -implicit- assumption, but in more extreme cases skewness does matter. Cheers Joris On Thu, Jun 24, 2010 at 7:40 PM, David Winsemius dwinsem...@comcast.net wrote: You are being misled. Simply finding a statement on a statistics software website, even one as reputable as GraphPad (???), does not mean that it is necessarily true. My understanding (confirmed reviewing Nonparametric statistical methods for complete and censored data by M. M. Desu and Damaraju Raghavarao) is that the Wilcoxon signed-rank test does not require that the underlying distributions be symmetric. The above quotation is highly inaccurate. -- David.
-- Joris Meys Statistical consultant Ghent University Faculty of Bioscience Engineering Department of Applied mathematics, biometrics and process control tel : +32 9 264 59 87 joris.m...@ugent.be --- Disclaimer : http://helpdesk.ugent.be/e-maildisclaimer.php __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] help, bifurcation diagram efficiency
Basically, don't write loops. Think vectors, matrices, ... The R Inferno by Patrick Burns contains a lot of valuable information on optimizing code: http://lib.stat.cmu.edu/S/Spoetry/Tutor/R_inferno.pdf Cheers Joris On Thu, Jun 24, 2010 at 7:51 PM, Tyler Massaro tyler.mass...@gmail.com wrote: Hello all - This code will run, but it bogs down my computer when I run it for finer and finer time increments and more generations. I was wondering if there is a better way to write my loops so that this wouldn't happen. Thanks! -Tyler

#
# Bifurcation diagram
# Using Braaksma system of equations
# We have however used a Fourier analysis
# to get a forcing function similar to
# cardiac action potential...
#
require(odesolve)

# We get this s_of_t function from Maple ws
s_of_t = function(t) {
  (1/10) * (((1/2) + (1/2) * sin((1/4)*pi*t) / abs(sin((1/4)*pi*t))) *
    (6.588315815*sin((1/4)*pi*t) - 1.697435362*sin((1/2)*pi*t) -
     1.570296922*sin((3/4)*pi*t) + 0.3247901958*sin(pi*t) +
     0.7962749105*sin((5/4)*pi*t) + 0.07812230515*sin((3/2)*pi*t) -
     0.3424877143*sin((7/4)*pi*t) - 0.1148306748*sin(2*pi*t) +
     0.1063696962*sin((9/4)*pi*t) + 0.02812403009*sin((5/2)*pi*t)))
}

ModBraaksma = function(t, n, p) {
  dx.dt = (1/0.01) * (n[2] - ((1/2)*n[1]^2 + (1/3)*n[1]^3))
  dy.dt = -(n[1] + p["alpha"]) + 0.032 * s_of_t(t)
  list(c(dx.dt, dy.dt))
}

initial.values = c(0.1, -0.02)
alphamin = 0.01
alphamax = 0.02
alphas = seq(alphamin, alphamax, by = 0.1)
TimeInterval = 100
times = seq(0.001, TimeInterval, by = 0.001)

plot(1, xlim = c(alphamin, alphamax), ylim = c(0, 0.3), type = "n",
     xlab = "Values of alpha", ylab = "Approximate loop size for a limit cycle",
     main = "Bifurcation Diagram")

for (i in 1:length(alphas)) {
  params = c(alpha = alphas[i])
  out = lsoda(initial.values, times, ModBraaksma, params)
  X = out[,2]
  Y = out[,3]
  for (j in 200:length(times)) {
    if (abs(X[j]) < 0.001) {
      points(alphas[i], Y[j], pch = ".")
    }
  }
}

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Joris Meys Statistical consultant Ghent University Faculty of Bioscience Engineering Department of Applied mathematics, biometrics and process control tel : +32 9 264 59 87 joris.m...@ugent.be --- Disclaimer : http://helpdesk.ugent.be/e-maildisclaimer.php __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
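As a footnote to the vectorization advice above, a minimal sketch of how the quoted inner j-loop could be replaced (X, Y, alphas and i as in the quoted code; the short vectors here are synthetic stand-ins, not the model output):

# Instead of testing each time step in a loop, select all qualifying
# indices at once and plot them with a single points() call.
X <- c(0.5, 0.0005, -0.0002, 0.3)   # synthetic stand-in for out[, 2]
Y <- c(0.10, 0.20, 0.25, 0.30)      # synthetic stand-in for out[, 3]
idx <- which(abs(X) < 0.001)        # all j with |X[j]| < 0.001
# in the real code, additionally restrict to the post-transient part:
# idx <- idx[idx >= 200]
# points(rep(alphas[i], length(idx)), Y[idx], pch = ".")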
Re: [R] Simple qqplot question
Also take a look at qq.plot in the package car. It gives you exactly what you want. Cheers Joris On Fri, Jun 25, 2010 at 12:55 AM, Ralf B ralf.bie...@gmail.com wrote: More details... I have two distributions which are very similar. I have plotted density plots already for the two distributions. In addition, I created a qqplot that shows an almost straight line. What I want is a line that represents the ideal case in which the two distributions match perfectly. I would use this line to see how much the errors diverge at different stages of the plot. Ralf On Thu, Jun 24, 2010 at 5:56 PM, stephen sefick ssef...@gmail.com wrote: You are going to have to define the question a little better. Also, please provide a reproducible example. On Thu, Jun 24, 2010 at 4:44 PM, Ralf B ralf.bie...@gmail.com wrote: I am a beginner in R, so please don't step on me if this is too simple. I have two data sets datax and datay for which I created a qqplot qqplot(datax,datay) but now I want a line that indicates the perfect match so that I can see how much the plot diverges from the ideal. This ideal however is not normal, so I think qqnorm and qqline cannot be applied. Perhaps you can help? Ralf __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Stephen Sefick | Auburn University | | Department of Biological Sciences | | 331 Funchess Hall | | Auburn, Alabama | | 36849 | |___| | sas0...@auburn.edu | | http://www.auburn.edu/~sas0025 | |___| Let's not spend our time and resources thinking about things that are so little or so large that all they really do for us is puff us up and make us feel like gods. We are mammals, and have not exhausted the annoying little problems of being mammals. -K.
Mullis __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Joris Meys Statistical consultant Ghent University Faculty of Bioscience Engineering Department of Applied mathematics, biometrics and process control tel : +32 9 264 59 87 joris.m...@ugent.be --- Disclaimer : http://helpdesk.ugent.be/e-maildisclaimer.php __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
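For the "ideal case" line asked about above, a minimal sketch (assuming the two samples are on the same scale, so the ideal is the identity line y = x; the data here are hypothetical stand-ins for datax and datay):

set.seed(1)
datax <- rnorm(200)            # stand-in for the OP's first sample
datay <- rnorm(200, 0, 1.1)    # stand-in for the second sample
qqplot(datax, datay)
abline(0, 1)  # the line y = x: where the points would fall if the
              # two distributions matched perfectly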
Re: [R] Install package automatically if not there?
I've never seen R Perl'd this nicely before. On Thu, Jun 24, 2010 at 10:32 PM, Albert-Jan Roskam fo...@yahoo.com wrote: require(pkg) || install.packages(pkg) Cheers!! Albert-Jan ~~ All right, but apart from the sanitation, the medicine, education, wine, public order, irrigation, roads, a fresh water system, and public health, what have the Romans ever done for us? ~~ --- On Thu, 6/24/10, Ralf B ralf.bie...@gmail.com wrote: From: Ralf B ralf.bie...@gmail.com Subject: [R] Install package automatically if not there? To: r-help@r-project.org r-help@r-project.org Date: Thursday, June 24, 2010, 9:25 PM Hi fans, is it possible for a script to check if a library has been installed? I want to automatically install it if it is missing, to avoid scripts crashing when running on a new machine... Ralf __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Joris Meys Statistical consultant Ghent University Faculty of Bioscience Engineering Department of Applied mathematics, biometrics and process control tel : +32 9 264 59 87 joris.m...@ugent.be --- Disclaimer : http://helpdesk.ugent.be/e-maildisclaimer.php __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
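The one-liner above works because require() returns FALSE for a missing package. A slightly more explicit variant (a sketch only: "stats" is a placeholder for the package you actually want, and requireNamespace() is the check available in more recent R versions):

pkg <- "stats"  # placeholder package name
if (!requireNamespace(pkg, quietly = TRUE)) {
  install.packages(pkg)  # only runs when the package is missing
}
library(pkg, character.only = TRUE)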
Re: [R] Install package automatically if not there?
If you want to make changes more permanent, you should take a look at the Rprofile.site file. This one gets loaded by R at the startup of the console. You can set the CRAN mirror there too, if you want. Cheers Joris On Fri, Jun 25, 2010 at 1:32 AM, Joshua Wiley jwiley.ps...@gmail.com wrote: Hello Ralf, Glad it works for you. As far as avoiding any prompting if packages are out-of-date; I am not sure. It honestly seems like a risky idea to me to have old packages being overwritten without a user knowing. However, I added a few lines of code here to set the CRAN mirror, and revert it back to whatever it was when the function ends, to make it as unobtrusive as possible.

load.fun <- function(x) {
  old.repos <- getOption("repos")
  on.exit(options(repos = old.repos)) # this resets the repos option when the function exits
  new.repos <- old.repos
  new.repos["CRAN"] <- "http://cran.stat.ucla.edu" # set your favorite CRAN mirror here
  options(repos = new.repos)
  x <- as.character(substitute(x))
  if (isTRUE(x %in% .packages(all.available = TRUE))) {
    eval(parse(text = paste("require(", x, ")", sep = "")))
  } else {
    update.packages() # recommended before installing so that dependencies are the latest version
    eval(parse(text = paste("install.packages('", x, "')", sep = "")))
    eval(parse(text = paste("require(", x, ")", sep = "")))
  }
}

Best regards, Josh On Thu, Jun 24, 2010 at 3:34 PM, Ralf B ralf.bie...@gmail.com wrote: Thanks, works great. By the way, is there a way to go even further and avoid prompting (e.g. picking a particular server by default and updating without further prompt)? I could understand if such a feature does not exist for security reasons... Thanks again, Ralf On Thu, Jun 24, 2010 at 5:12 PM, Joshua Wiley jwiley.ps...@gmail.com wrote: On Thu, Jun 24, 2010 at 1:51 PM, Joshua Wiley jwiley.ps...@gmail.com wrote: Hello Ralf, This is a little function that you may find helpful. If the package is already installed, it just loads it; otherwise it updates the existing packages and then installs the required package.
As in require(), 'x' does not need to be quoted.

load.fun <- function(x) {
  x <- as.character(substitute(x))
  if (isTRUE(x %in% .packages(all.available = TRUE))) {
    eval(parse(text = paste("require(", x, ")", sep = "")))
  } else {
    update.packages() # recommended before installing so that dependencies are the latest version
    eval(parse(text = paste("install.packages('", x, "')", sep = "")))
  }
}

I miscopied the last line; it should be

###
load.fun <- function(x) {
  x <- as.character(substitute(x))
  if (isTRUE(x %in% .packages(all.available = TRUE))) {
    eval(parse(text = paste("require(", x, ")", sep = "")))
  } else {
    update.packages() # recommended before installing so that dependencies are the latest version
    eval(parse(text = paste("install.packages('", x, "')", sep = "")))
    eval(parse(text = paste("require(", x, ")", sep = "")))
  }
}
###

HTH, Josh On Thu, Jun 24, 2010 at 12:25 PM, Ralf B ralf.bie...@gmail.com wrote: Hi fans, is it possible for a script to check if a library has been installed? I want to automatically install it if it is missing, to avoid scripts crashing when running on a new machine... Ralf __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Joshua Wiley Ph.D. Student Health Psychology University of California, Los Angeles __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
-- Joris Meys Statistical consultant Ghent University Faculty of Bioscience Engineering Department of Applied mathematics, biometrics and process control tel : +32 9 264 59 87 joris.m...@ugent.be --- Disclaimer : http://helpdesk.ugent.be/e-maildisclaimer.php __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
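To make the Rprofile.site suggestion above concrete, a hedged sketch of what such an entry might look like (the file location varies by installation, and the mirror URL is just an example):

# Lines like these in Rprofile.site (e.g. under R_HOME/etc/) run at
# console startup: pre-set a CRAN mirror so install.packages() never
# has to prompt for one.
local({
  r <- getOption("repos")
  r["CRAN"] <- "http://cran.stat.ucla.edu"  # example mirror
  options(repos = r)
})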
Re: [R] Popularity of R, SAS, SPSS, Stata...
Nice idea, but quite sensitive to search terms, if you compare your result on ... code with ... code for: http://www.google.com/insights/search/#q=r%20code%20for%2Csas%20code%20for%2Cspss%20code%20forcmpt=q On Thu, Jun 24, 2010 at 10:48 PM, Dario Solari dario.sol...@gmail.com wrote: First: excuse my English. My opinion: a useful source for measuring popularity can be Google Insights for Search - http://www.google.com/insights/search/# Every person using software like R, SAS or SPSS needs first to learn it. So probably he makes a web search for a manual, a tutorial, a guide. One can measure the share of this kind of search query. This kind of result can be useful to determine trends of popularity. Example 1: R tutorial/manual/guide, SAS tutorial/manual/guide, SPSS tutorial/manual/guide http://www.google.com/insights/search/#q=%22r%20tutorial%22%2B%22r%20manual%22%2B%22r%20guide%22%2B%22r%20vignette%22%2C%22spss%20tutorial%22%2B%22spss%20manual%22%2B%22spss%20guide%22%2C%22sas%20tutorial%22%2B%22sas%20manual%22%2B%22sas%20guide%22cmpt=q Example 2: R software, SAS software, SPSS software http://www.google.com/insights/search/#q=%22r%20software%22%2C%22spss%20software%22%2C%22sas%20software%22cmpt=q Example 3: R code, SAS code, SPSS code http://www.google.com/insights/search/#q=%22r%20code%22%2C%22spss%20code%22%2C%22sas%20code%22cmpt=q Example 4: R graph, SAS graph, SPSS graph http://www.google.com/insights/search/#q=%22r%20graph%22%2C%22spss%20graph%22%2C%22sas%20graph%22cmpt=q Example 5: R regression, SAS regression, SPSS regression http://www.google.com/insights/search/#q=%22r%20regression%22%2C%22spss%20regression%22%2C%22sas%20regression%22cmpt=q Some examples are cross-software (learning needs - Example 1), others can be biased by the traditional use of that software (in SPSS usually you don't manipulate graphs, I think) __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Joris Meys Statistical consultant Ghent University Faculty of Bioscience Engineering Department of Applied mathematics, biometrics and process control tel : +32 9 264 59 87 joris.m...@ugent.be --- Disclaimer : http://helpdesk.ugent.be/e-maildisclaimer.php __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Popularity of R, SAS, SPSS, Stata...
dangit, tab in the way... On Fri, Jun 25, 2010 at 1:56 AM, Joris Meys jorism...@gmail.com wrote: Nice idea, but quite sensitive to search terms, if you compare your result on ... code with ... code for: http://www.google.com/insights/search/#q=r%20code%20for%2Csas%20code%20for%2Cspss%20code%20forcmpt=q This one actually shows quite nicely when students start panicking about their projects. Cheers Joris On Thu, Jun 24, 2010 at 10:48 PM, Dario Solari dario.sol...@gmail.com wrote: First: excuse my English. My opinion: a useful source for measuring popularity can be Google Insights for Search - http://www.google.com/insights/search/# Every person using software like R, SAS or SPSS needs first to learn it. So probably he makes a web search for a manual, a tutorial, a guide. One can measure the share of this kind of search query. This kind of result can be useful to determine trends of popularity. Example 1: R tutorial/manual/guide, SAS tutorial/manual/guide, SPSS tutorial/manual/guide http://www.google.com/insights/search/#q=%22r%20tutorial%22%2B%22r%20manual%22%2B%22r%20guide%22%2B%22r%20vignette%22%2C%22spss%20tutorial%22%2B%22spss%20manual%22%2B%22spss%20guide%22%2C%22sas%20tutorial%22%2B%22sas%20manual%22%2B%22sas%20guide%22cmpt=q Example 2: R software, SAS software, SPSS software http://www.google.com/insights/search/#q=%22r%20software%22%2C%22spss%20software%22%2C%22sas%20software%22cmpt=q Example 3: R code, SAS code, SPSS code http://www.google.com/insights/search/#q=%22r%20code%22%2C%22spss%20code%22%2C%22sas%20code%22cmpt=q Example 4: R graph, SAS graph, SPSS graph http://www.google.com/insights/search/#q=%22r%20graph%22%2C%22spss%20graph%22%2C%22sas%20graph%22cmpt=q Example 5: R regression, SAS regression, SPSS regression http://www.google.com/insights/search/#q=%22r%20regression%22%2C%22spss%20regression%22%2C%22sas%20regression%22cmpt=q Some examples are cross-software (learning needs - Example 1), others can be biased by the traditional use of that software
(in SPSS usually you don't manipulate graphs, I think) __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Joris Meys Statistical consultant Ghent University Faculty of Bioscience Engineering Department of Applied mathematics, biometrics and process control tel : +32 9 264 59 87 joris.m...@ugent.be --- Disclaimer : http://helpdesk.ugent.be/e-maildisclaimer.php __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] write a loop for tallies
It would help if you placed r <- 0; s <- 0 etc. outside the loop. The same goes for cat(...). And get rid of the sum(r), sum(s) and so on; that's doing nothing (r, s, ... are single numbers). This said: see Peter Langfelder's response. Cheers Joris

# see ?table for a better approach
r <- 0
s <- 0
t <- 0
u <- 0
v <- 0
a <- for (i in 1:length(n)) {
  ifelse(n[i] == 3000, r <- r+1,
  ifelse(n[i] == 4000, s <- r+1,
  ifelse(n[i] == 5000, t <- r+1,
  ifelse(n[i] == 6000, u <- r+1,
  ifelse(n[i] == 7000, v <- r+1, NA)))))
}
cat("r = ", r, "\n")
cat("s = ", s, "\n")
cat("t = ", t, "\n")
cat("u = ", u, "\n")
cat("v = ", v, "\n")

-- Joris Meys Statistical consultant Ghent University Faculty of Bioscience Engineering Department of Applied mathematics, biometrics and process control tel : +32 9 264 59 87 joris.m...@ugent.be --- Disclaimer : http://helpdesk.ugent.be/e-maildisclaimer.php __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
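The ?table hint in the comment above can be spelled out; a sketch with a made-up n (wrapping n in factor() with explicit levels makes unseen values show up as zero counts):

n <- c(3000, 4000, 4000, 7000, 3000, 5000)  # example data
tab <- table(factor(n, levels = c(3000, 4000, 5000, 6000, 7000)))
tab  # one count per level; the unseen 6000 is reported as 0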
Re: [R] save scores from sem
I should have specified: lavaan is not familiar to me. I'm also not familiar enough with the sem package to tell you how to obtain the scores, but all information regarding the fit is in the object. With that, it shouldn't be too difficult to get the scores you want. This paper might give you some more information, in case you didn't know it yet: http://socserv.mcmaster.ca/jfox/Misc/sem/SEM-paper.pdf On a side note, sem with a single latent variable might be seen as a factor analysis with one component, but definitely not as a PCA. A PCA is constructed based on the total variance, rendering an orthogonal space with as many dimensions as there are variables. Not so for a FA, as the matrix used to calculate the eigenvectors and eigenvalues is a reduced matrix, in essence only taking into account part of the variation in the data for calculation of the loadings. This makes PCA absolutely defined, but FA only up to a rotation. On a second side note, using the saved scores in some subsequent analysis is in my view completely against the idea behind sem. Structural equation modelling combines those observed variables exactly to be able to take the variation on the combined latent variable into account. If you use those latent variables as input in a second analysis, you lose the information regarding the variation. Cheers Joris On Wed, Jun 23, 2010 at 9:53 AM, Steve Powell st...@promente.net wrote: Dear Joris, thanks for your reply - it is the sem case which interests me. Suppose for example I use sem to construct a CFA for a set of variables, with a single latent variable; then this could be equivalent to a PCA with a single component, couldn't it? From the PCA I could save the scores as new variables; is there an equivalent with sem? This would be particularly useful if e.g. in sem I let some of the errors covary and then wanted to use the saved scores in some subsequent analysis.
By the way, lavaan is at cran.r-project.org/web/packages/lavaan/index.html Best Wishes Steve www.promente.org | skype stevepowell99 | +387 61 215 997 On Tue, Jun 22, 2010 at 7:08 PM, Joris Meys jorism...@gmail.com wrote: PCA and factor analysis are implemented in the core R distribution, no extra packages needed. When using princomp, the scores are in the object:

pr.c <- princomp(USArrests, cor = TRUE)
pr.c$scores # gives you the scores

See ?princomp. When using factanal, you can ask for regression scores or Bartlett scores. See ?factanal. Mind you, you will get different -i.e. more correct- results than those obtained by SPSS. I don't understand what you mean by scores in the context of structural equation modelling. Lavaan is unknown to me. Cheers Joris On Tue, Jun 22, 2010 at 3:11 PM, Steve Powell st...@promente.net wrote: Dear expeRts, sorry for such a newbie question - in PCA/factor analysis e.g. in SPSS it is possible to save scores from the factors. Is it analogously possible to save the implied scores from the latent variables in a measurement model or structural model, e.g. using the sem or lavaan packages, to use in further analyses? Best wishes Steve Powell www.promente.org | skype stevepowell99 | +387 61 215 997 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
-- Joris Meys Statistical consultant Ghent University Faculty of Bioscience Engineering Department of Applied mathematics, biometrics and process control tel : +32 9 264 59 87 joris.m...@ugent.be --- Disclaimer : http://helpdesk.ugent.be/e-maildisclaimer.php __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
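To make the scores discussion above concrete, a small sketch on the built-in USArrests data (illustrative only; a single factor on four variables is a toy model, not a recommendation):

pr.c <- princomp(USArrests, cor = TRUE)
head(pr.c$scores)  # principal component scores, one row per state
fa <- factanal(USArrests, factors = 1, scores = "regression")
head(fa$scores)    # regression scores for the single factor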
Re: [R] (help) This is an R workspace memory processing question
http://lmgtfy.com/?q=R+large+data+sets Cheers Joris On Wed, Jun 23, 2010 at 6:38 AM, 나여나 dllm...@hanmail.net wrote: Hi all! This is an R workspace memory processing question. Is there a method by which R can process 10GB of data in 500MB units? my work environment : R version : 2.11.1 OS : WinXP Pro sp3 Thanks and best regards. Park, Young-Ju from Korea. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Joris Meys Statistical consultant Ghent University Faculty of Bioscience Engineering Department of Applied mathematics, biometrics and process control tel : +32 9 264 59 87 joris.m...@ugent.be --- Disclaimer : http://helpdesk.ugent.be/e-maildisclaimer.php __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] question about a program
Most of the computation time is spent in the function qmvnorm. You can win a little bit by optimizing the code, but the gain is relatively small. You can also decrease the interval used to evaluate qmvnorm to win some speed there. As you look for the upper tail, there is no need to evaluate the function at negative numbers. E.g.:

pair_kFWER2 <- function(m, alpha, rho, k, pvaluessort) {
  library(mvtnorm)
  cc_z <- numeric(m)
  Var <- matrix(c(1, rho, rho, 1), nrow = 2, ncol = 2, byrow = T)
  lpart <- 1:k       # first part
  hpart <- (k+1):m   # second part
  # make first part of the cc_z
  cc_z[lpart] <- qmvnorm((k*(k-1))/(m*(m-1))*alpha, tail = "upper",
                         interval = c(0, 4), sigma = Var)$quantile
  # make second part of the cc_z
  cc_z[hpart] <- sapply(hpart, function(i) {
    qmvnorm((k*(k-1))/((m-i+k)*(m-i+k-1))*alpha, interval = c(0, 4),
            tail = "upper", sigma = Var)$quantile
  })
  crit_cons <- 1 - pnorm(cc_z)
  # tests whether the values of crit_cons[i] are smaller than the
  # values pvaluessort[i] and returns a vector saying at which
  # positions that does not occur; tail takes the last position.
  # I guess that's what you wanted.
  out <- tail(which(!(crit_cons < pvaluessort)), 1)
  return(out)
}

system.time(replicate(100, pair_kFWER(m, alpha, rho, k, pvaluessort)))
   user  system elapsed
   5.97    0.04    6.04
system.time(replicate(100, pair_kFWER2(m, alpha, rho, k, pvaluessort)))
   user  system elapsed
   4.43    0.00    4.44

Check whether the function works correctly. It gives the same result as the other one in my tests. Only in the case where your function returns 0 does the modified one return integer(0). Cheers Joris On Wed, Jun 23, 2010 at 2:21 PM, li li hannah@gmail.com wrote: Dear all, I have the following program for a multiple comparison procedure. There are two functions for the two steps. The first step is to calculate the critical values, while the second step is the actual procedure [see below: program with two functions]. This works fine. However, I want to put them into one function for the convenience of later use [see below: program with one function].
Somehow the big function works extremely slowly. For example I chose

m <- 10
rho <- 0.1
k <- 2
alpha <- 0.05
pvaluessort <- sort(1 - pnorm(rnorm(10)))

Is there anything that I did wrong? Thank you! Hannah

## Program with two functions
## first step
library(mvtnorm)
cc_f <- function(m, rho, k, alpha) {
  cc_z <- numeric(m)
  var <- matrix(c(1, rho, rho, 1), nrow = 2, ncol = 2, byrow = T)
  for (i in 1:m) {
    if (i <= k) {
      cc_z[i] <- qmvnorm((k*(k-1))/(m*(m-1))*alpha, tail = "upper", sigma = var)$quantile
    } else {
      cc_z[i] <- qmvnorm((k*(k-1))/((m-i+k)*(m-i+k-1))*alpha, tail = "upper", sigma = var)$quantile
    }
  }
  cc <- 1 - pnorm(cc_z)
  return(cc)
}

## second step
pair_kFWER <- function(m, crit_cons, pvaluessort) {
  k <- 0
  while ((k < m) && (crit_cons[m-k] < pvaluessort[m-k])) {
    k <- k + 1
  }
  return(m - k)
}
###

## Program with one function
pair_kFWER <- function(m, alpha, rho, k, pvaluessort) {
  library(mvtnorm)
  cc_z <- numeric(m)
  var <- matrix(c(1, rho, rho, 1), nrow = 2, ncol = 2, byrow = T)
  for (i in 1:m) {
    if (i <= k) {
      cc_z[i] <- qmvnorm((k*(k-1))/(m*(m-1))*alpha, tail = "upper", sigma = var)$quantile
    } else {
      cc_z[i] <- qmvnorm((k*(k-1))/((m-i+k)*(m-i+k-1))*alpha, tail = "upper", sigma = var)$quantile
    }
  }
  crit_cons <- 1 - pnorm(cc_z)
  k <- 0
  while ((k < m) && (crit_cons[m-k] < pvaluessort[m-k])) {
    k <- k + 1
  }
  return(m - k)
}
#

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Joris Meys Statistical consultant Ghent University Faculty of Bioscience Engineering Department of Applied mathematics, biometrics and process control tel : +32 9 264 59 87 joris.m...@ugent.be --- Disclaimer : http://helpdesk.ugent.be/e-maildisclaimer.php __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Comparing distributions
A qqplot would indeed help. ?ks.test for more formal testing, but be aware: you should also think about what you call similar distributions. See the following example:

set.seed(12345)
x1 <- c(rnorm(100), rnorm(150, 3.3, 0.7))
x2 <- c(rnorm(140, 1, 1.2), rnorm(110, 3.3, 0.6))
x3 <- c(rnorm(140, 2, 1.2), rnorm(110, 4.3, 0.6))

d1 <- density(x1)
d2 <- density(x2)
d3 <- density(x3)

xlim <- 1.2 * c(min(x1, x2, x3), max(x1, x2, x3))
ylim <- 1.2 * c(0, max(d1$y, d2$y, d3$y))

op <- par(mfrow = c(1, 3))
plot(d1, xlim = xlim, ylim = ylim)
lines(d2, col = "red")
lines(d3, col = "blue")
qqplot(x1, x2)
qqplot(x2, x3)
par(op)

# formal testing
ks.test(x1, x2)
ks.test(x2, x3)

# relocate x3
x3b <- x3 - mean(x3 - x2)
x3c <- x3 - mean(x3 - x1)

# formal testing
ks.test(x2, x3b)
ks.test(x1, x3c)

# test location
t.test(x2 - x1)
t.test(x3 - x2)
t.test(x3 - x1)

Cheers Joris

On Wed, Jun 23, 2010 at 9:33 PM, Ralf B ralf.bie...@gmail.com wrote: I am trying to do something in R and would appreciate a push in the right direction. I hope some of you experts can help. I have two distributions obtained from 1 datapoints each (about 1 datapoints each, non-normal with multi-modal shape (when eye-balling densities), but other than that I know little about their distribution). When plotting the two distributions together I can see that the two densities are alike, with a certain distance to each other (e.g. 50 units on the X axis). I tried to plot a simplified picture of the density plot below:

[ASCII sketch: two similar multi-modal density curves, drawn with * and +, the + curve shifted to the right along the X axis]

What I would like to do is to formally test their similarity, or otherwise measure it more reliably than just showing and discussing a plot. Is there a general approach other than using a Mann-Whitney test, which is very strict and seems to assume a perfect match? Is there a test that takes in a certain 'band' (e.g. 50, 100, 150 units on X), or are there any other similarity measures that could give me a statistic about how close these two distributions are to each other?
All I can say from eye-balling is that they seem to follow each other, and it appears that one distribution is shifted by an amount from the other. Any ideas? Ralf __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Joris Meys Statistical consultant Ghent University Faculty of Bioscience Engineering Department of Applied mathematics, biometrics and process control tel : +32 9 264 59 87 joris.m...@ugent.be --- Disclaimer : http://helpdesk.ugent.be/e-maildisclaimer.php __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Wilcoxon signed rank test and its requirements
On Wed, Jun 23, 2010 at 10:27 PM, Atte Tenkanen atte...@utu.fi wrote: Hi all, I have a distribution, and take a sample of it. Then I compare that sample with the mean of the population, like here in the Wilcoxon signed rank test with continuity correction:

wilcox.test(Sample, mu = mean(All), alt = "two.sided")

Wilcoxon signed rank test with continuity correction
data: AlphaNoteOnsetDists
V = 63855, p-value = 0.0002093
alternative hypothesis: true location is not equal to 0.4115136

wilcox.test(Sample, mu = mean(All), alt = "greater")

Wilcoxon signed rank test with continuity correction
data: AlphaNoteOnsetDists
V = 63855, p-value = 0.0001047
alternative hypothesis: true location is greater than 0.4115136

What assumptions are needed for the population? Wikipedia says: "The Wilcoxon signed-rank test is a _non-parametric_ statistical hypothesis test for..." It also talks about the assumptions. What can we say according to these results? The p-value for "less" is 0.999.

That the p-values for "less" and "greater" seem to sum up to one, and that the p-value for "greater" is half of that for "two.sided". You shouldn't ask what we can say. You should ask yourself: "What was the question, and is this test giving me an answer to that question?" Cheers Joris -- Joris Meys Statistical consultant Ghent University Faculty of Bioscience Engineering Department of Applied mathematics, biometrics and process control tel : +32 9 264 59 87 joris.m...@ugent.be --- Disclaimer : http://helpdesk.ugent.be/e-maildisclaimer.php __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
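[Editor's note] The relationship Joris points out between the one-sided and two-sided p-values is easy to check on simulated data; a minimal sketch (the sample below is made up for illustration, it is not Atte's data):

```r
set.seed(42)
x <- rnorm(30, mean = 0.3)   # hypothetical sample, shifted away from 0

p.two  <- wilcox.test(x, mu = 0, alternative = "two.sided")$p.value
p.gr   <- wilcox.test(x, mu = 0, alternative = "greater")$p.value
p.less <- wilcox.test(x, mu = 0, alternative = "less")$p.value

# the two one-sided p-values sum to (essentially) one, and the
# two-sided p-value is twice the smaller one-sided p-value
c(p.gr + p.less, p.two / min(p.gr, p.less))
```

For the exact test the sum slightly exceeds 1 (by P(V = v) at the observed statistic), which is why it is only "essentially" one.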
Re: [R] Probabilities from survfit.coxph:
On Wed, Jun 23, 2010 at 9:03 PM, Parminder Mankoo pkmanko...@gmail.com wrote: Hello: In the example below (or for censored data) using survfit.coxph, can anyone point me to a link or a pdf as to how the probabilities appearing in bold under summary(pred$surv) are calculated? Do these represent a cumulative probability distribution in time (not including censored time)?

Predicted survival. Hence the column name "survival"... Cheers Joris

Thanks very much, parmee

fit <- coxph(Surv(futime, fustat) ~ age, data = ovarian)
pred <- survfit(fit, newdata = data.frame(age = 60))
summary(pred)

time n.risk n.event survival std.err lower 95% CI upper 95% CI
  59     26       1    0.978  0.0240        0.932        1.000
 115     25       1    0.952  0.0390        0.878        1.000
 156     24       1    0.917  0.0556        0.814        1.000
 268     23       1    0.880  0.0704        0.752        1.000
 329     22       1    0.818  0.0884        0.662        1.000
 353     21       1    0.760  0.0991        0.588        0.981
 365     20       1    0.698  0.1079        0.516        0.945
 431     17       1    0.623  0.1187        0.429        0.905
 464     15       1    0.549  0.1248        0.352        0.858
 475     14       1    0.480  0.1267        0.286        0.805
 563     12       1    0.382  0.1332        0.193        0.757
 638     11       1    0.297  0.1292        0.127        0.697

[[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Joris Meys Statistical consultant Ghent University Faculty of Bioscience Engineering Department of Applied mathematics, biometrics and process control tel : +32 9 264 59 87 joris.m...@ugent.be --- Disclaimer : http://helpdesk.ugent.be/e-maildisclaimer.php __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] list operation
Another variation on the same theme:

lst <- list(m = c('a','b','c'), n = c('c','a'), l = c('a','bc'))
set <- c('a','c')

f <- function(lst, set) sapply(lst, function(x) sum(set %in% x) == length(set))

i <- f(lst, set)
names(i[i])

Doesn't serve anybody but keeps my mind fresh. For long lists, you might benefit from first calculating the length of set, to avoid having to do that n times for a list of length n. Cheers Joris

On Wed, Jun 23, 2010 at 11:01 PM, Peter Alspach peter.alsp...@plantandfood.co.nz wrote: Tena koe Yu One possibility:

lst[sapply(lst, function(x) length(x[x %in% c('a','c')]) == 2)]

HTH ... Peter Alspach

-----Original Message----- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Yuan Jian Sent: Thursday, 24 June 2010 1:35 a.m. To: r-help@r-project.org Subject: [R] list operation

Hi, it seems a simple problem, but I can not find a clear way. I have a list:

lst <- list(m = c('a','b','c'), n = c('c','a'), l = c('a','bc'))
> lst
$m
[1] "a" "b" "c"
$n
[1] "c" "a"
$l
[1] "a"  "bc"

How can I get the list elements that include a given subset? For example, for the given subset {'a','c'}, the answer should be 'm' and 'n'. Thanks, Yu

[[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Joris Meys Statistical consultant Ghent University Faculty of Bioscience Engineering Department of Applied mathematics, biometrics and process control tel : +32 9 264 59 87 joris.m...@ugent.be --- Disclaimer : http://helpdesk.ugent.be/e-maildisclaimer.php __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
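[Editor's note] Joris's remark about precomputing length(set) can be made concrete; a sketch (the function name contains_set is mine, not from the thread):

```r
lst <- list(m = c('a','b','c'), n = c('c','a'), l = c('a','bc'))
set <- c('a','c')

contains_set <- function(lst, set) {
  n <- length(set)          # computed once instead of once per list element
  sapply(lst, function(x) sum(set %in% x) == n)
}

i <- contains_set(lst, set)
names(i[i])                 # "m" "n"
```

An equally readable variant that skips the counting altogether is sapply(lst, function(x) all(set %in% x)).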
Re: [R] Beginning Eigen System question.
Which other Linear Algebra system, and which function did you use in R? Cheers Joris

On Thu, Jun 24, 2010 at 12:32 AM, rkevinbur...@charter.net wrote: Forgive me if I misunderstand a basic eigensystem, but when I present the following matrix to most any other Linear Algebra system:

1 3 1
1 2 2
1 1 3

I get an answer like:

//$values
//[1] 5.00e+00 1.00e+00 -5.536207e-16
//$vectors
//          [,1]       [,2]       [,3]
//[1,] 0.5773503 -0.8451543 -0.9428090
//[2,] 0.5773503 -0.1690309  0.2357023
//[3,] 0.5773503  0.5070926  0.2357023

But R gives me:

//$values
//[1] 5.00e+00 1.00e+00 -5.536207e-16
//$vectors
//           [,1]       [,2]       [,3]
//[1,] -0.5773503 -0.8451543 -0.9428090
//[2,] -0.5773503 -0.1690309  0.2357023
//[3,] -0.5773503  0.5070926  0.2357023

The only difference seems to be the sign on the first eigenvector. What am I missing? Kevin __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Joris Meys Statistical consultant Ghent University Faculty of Bioscience Engineering Department of Applied mathematics, biometrics and process control tel : +32 9 264 59 87 joris.m...@ugent.be --- Disclaimer : http://helpdesk.ugent.be/e-maildisclaimer.php __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Beginning Eigen System question.
On Thu, Jun 24, 2010 at 12:41 AM, Joris Meys jorism...@gmail.com wrote: Which other Linear Algebra system, and which function did you use in R? Cheers Joris

Never mind, of course you used eigen()... Eigenvectors are only determined up to a constant. If I'm not mistaken (but check the help files on it), R normalizes the eigenvectors (and so does your other Linear Algebra system). This leaves the eigenvectors defined only up to the sign. Cheers Joris

On Thu, Jun 24, 2010 at 12:32 AM, rkevinbur...@charter.net wrote: Forgive me if I misunderstand a basic eigensystem, but when I present the following matrix to most any other Linear Algebra system:

1 3 1
1 2 2
1 1 3

I get an answer like:

//$values
//[1] 5.00e+00 1.00e+00 -5.536207e-16
//$vectors
//          [,1]       [,2]       [,3]
//[1,] 0.5773503 -0.8451543 -0.9428090
//[2,] 0.5773503 -0.1690309  0.2357023
//[3,] 0.5773503  0.5070926  0.2357023

But R gives me:

//$values
//[1] 5.00e+00 1.00e+00 -5.536207e-16
//$vectors
//           [,1]       [,2]       [,3]
//[1,] -0.5773503 -0.8451543 -0.9428090
//[2,] -0.5773503 -0.1690309  0.2357023
//[3,] -0.5773503  0.5070926  0.2357023

The only difference seems to be the sign on the first eigenvector. What am I missing? Kevin __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
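[Editor's note] The sign ambiguity Joris describes can be verified directly: eigen() returns unit-length eigenvectors, and if v is an eigenvector for lambda, then so is -v. A sketch with Kevin's matrix:

```r
A <- matrix(c(1, 3, 1,
              1, 2, 2,
              1, 1, 3), nrow = 3, byrow = TRUE)
e <- eigen(A)

# every returned eigenvector has unit length
colSums(e$vectors^2)

# flipping the sign still satisfies A v = lambda v,
# so both sign conventions are equally valid answers
v <- e$vectors[, 1]
all.equal(as.vector(A %*% (-v)), e$values[1] * (-v))
```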
-- Joris Meys Statistical consultant Ghent University Faculty of Bioscience Engineering Department of Applied mathematics, biometrics and process control tel : +32 9 264 59 87 joris.m...@ugent.be --- Disclaimer : http://helpdesk.ugent.be/e-maildisclaimer.php __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] problem to building R (datasets)
Why compile from source? 2.11.1 installs fine on XP from the binary, so that's the more obvious solution. Cheers Joris

On Thu, Jun 24, 2010 at 12:39 AM, Geun Seop Lee clarmas...@gmail.com wrote: Dear all, While I was trying to build the R source, I found an error in the datasets package (there was no error before that):

../../../library/datasets/R/datasets is unchanged
Error in dir.create(Rdatadir, showWarnings = FALSE) :
  file name conversion problem
Calls: <Anonymous> -> <Anonymous> -> dir.create
Execution halted
make[2]: *** [all] Error 1
make[1]: *** [R] Error 1
make: *** [all] Error 2

And it was caused by

@$(ECHO) "tools:::data2LazyLoadDB(\"$(pkg)\", compress=3)" | \
  R_DEFAULT_PACKAGES=NULL LC_ALL=C $(R_EXE) > /dev/null

in the Makefile.win in the src/datasets directory. I am using Windows XP and tried to compile the 2.11.1 version. I can't imagine how I can solve this problem. Any hints or suggestions will be appreciated. Thank you. Lee.

[[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Joris Meys Statistical consultant Ghent University Faculty of Bioscience Engineering Department of Applied mathematics, biometrics and process control tel : +32 9 264 59 87 joris.m...@ugent.be --- Disclaimer : http://helpdesk.ugent.be/e-maildisclaimer.php __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] count data with a specific range
See ?cut. Cheers Joris

On Thu, Jun 24, 2010 at 2:57 AM, Yi liuyi.fe...@gmail.com wrote: I would like to prepare the data for a barplot, but I only have the data frame now.

x1 <- rnorm(10, mean = 2)
x2 <- rnorm(20, mean = -1)
x3 <- rnorm(15, mean = 3)
data <- data.frame(x1, x2, x3)

Is there a way to put the data within a specific range? The expected result is as follows:

range   x1 x2 x3
-10-0    2  5  1   (# points in this range)
0-10     7  9  6
...

I know the table function, but I do not know how to deal with the range issue. Thanks in advance. Yi

[[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Joris Meys Statistical consultant Ghent University Faculty of Bioscience Engineering Department of Applied mathematics, biometrics and process control tel : +32 9 264 59 87 joris.m...@ugent.be --- Disclaimer : http://helpdesk.ugent.be/e-maildisclaimer.php __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
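[Editor's note] A sketch of how ?cut combines with table() to produce the counts Yi asked for (the breaks are illustrative; since the three vectors have unequal lengths, a list is more natural than a data frame here):

```r
set.seed(1)
x1 <- rnorm(10, mean = 2)
x2 <- rnorm(20, mean = -1)
x3 <- rnorm(15, mean = 3)

breaks <- c(-10, 0, 10)
counts <- sapply(list(x1 = x1, x2 = x2, x3 = x3),
                 function(x) table(cut(x, breaks = breaks)))
counts                      # one row per range, one column per variable
barplot(counts, beside = TRUE, legend.text = rownames(counts))
```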
Re: [R] {Spam?} Re: mgcv, testing gamm vs lme, which degrees of freedom?
On Mon, Jun 21, 2010 at 10:01 PM, Bert Gunter gunter.ber...@gene.com wrote: 1. This discussion probably belongs on the sig-mixed-models list.

Correct, but I'll keep it here for archive purposes.

2. Your claim is incorrect, I think. The structure of the random errors = model covariance can be parameterized in various ways, and one can try to test the significance of nested parameterizations (for a particular fixed-effects parameterization). Whether you can do so meaningfully, especially in the gamm context, is another issue, but if you check e.g. Bates and Pinheiro, anova for different random-effects parameterizations is advocated, and it is implemented in the anova.lme() nlme function.

It is a matter of debate and interpretation. Mind you, I never said it can't be done. I just wanted to point out that in most cases, it shouldn't be done. As somebody else noted, both Wood and Pinheiro/Bates suggest testing the random components _if the fixed effects are the same_. Yet, even then it gets difficult to formulate the correct hypothesis. In the example of Carlo, H0 is actually not correct:

1) The 7 extra random components are not really 7 extra random components, as they are definitely not independent, but actually all part of the same spline fit.

2) According to my understanding, the degrees of freedom for a likelihood are determined by the number of parameters fitted, not the number of variables in the model. The parameters linked to the random effect are part of the variance structure, and the number of parameters to be fitted in this variance structure does not depend on the number of knots, but on the number of smooths. Indeed, in the gamm model the variance of the parameters b for the random effect is considered to be equal for all b's related to the same smooth.
3) It appears to me that you violate the restriction both Pinheiro/Bates and Wood put on the testing of models with LR: you can only do so if none of the variance parameters are restricted to the edge of the feasible parameter space. Again, as I see it, the model with fewer knots implies the assumption that some variance components are zero. Hence, you cannot use a LR test to compare both models. anova.lme won't carry out the test anyway:

library(mgcv)
library(nlme)

set.seed(123)
x  <- runif(100, 1, 10)                            # regressor
b0 <- rep(rnorm(10, mean = 1, sd = 2), each = 10)  # random intercept
id <- rep(1:10, each = 10)                         # identifier
y  <- b0 + x - 0.1 * x^3 + rnorm(100, 0, 1)        # dependent variable

f1 <- gamm(y ~ s(x, k = 3, bs = "cr"), random = list(id = ~1), method = "ML")
f2 <- gamm(y ~ s(x, k = 10, bs = "cr"), random = list(id = ~1), method = "ML")

> anova(f1$lme, f2$lme)
       Model df      AIC      BIC    logLik
f1$lme     1  5 466.6232 479.6491 -228.3116
f2$lme     2  5 347.6293 360.6551 -168.8146

If you change a random term not related to the splines, it does carry out the test. In this case, you can test the random effects as explained by Pinheiro/Bates.

f3 <- gamm(y ~ s(x, k = 10, bs = "cr"), method = "ML")

> anova(f2$lme, f3$lme)
       Model df      AIC      BIC    logLik   Test  L.Ratio p-value
f2$lme     1  5 347.6293 360.6551 -168.8146
f3$lme     2  4 446.2704 456.6910 -219.1352 1 vs 2 100.6411  <.0001

As an illustration, the variance of the random effects:

> f1$lme
...
Random effects:
 Formula: ~Xr.1 - 1 | g.1
            Xr.1
StdDev: 163.8058

 Formula: ~1 | id %in% g.1
        (Intercept) Residual
StdDev:    1.686341 2.063269

Number of Observations: 100
Number of Groups:
        g.1 id %in% g.1
          1          10

> f2$lme
...
Random effects:
 Formula: ~Xr.1 - 1 | g.1
 Structure: pdIdnot
           Xr.11    Xr.12    Xr.13    Xr.14    Xr.15    Xr.16    Xr.17    Xr.18
StdDev: 2.242349 2.242349 2.242349 2.242349 2.242349 2.242349 2.242349 2.242349
...

Clearly, the SD for all parameters b is exactly the same, and hence there is only one such parameter in the likelihood function. So both models won't differ in df when using the ML estimation.
This does not mean that the b-coefficients for the random effects cannot be obtained. They are just not part of the minimization in the likelihood function. Or, as Wood said about random effects: you don't estimate them, although you might want to predict them. Cheers Joris -- Joris Meys Statistical consultant Ghent University Faculty of Bioscience Engineering Department of Applied mathematics, biometrics and process control tel : +32 9 264 59 87 joris.m...@ugent.be --- Disclaimer : http://helpdesk.ugent.be/e-maildisclaimer.php __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Popularity of R, SAS, SPSS, Stata...
Hehe, You do have a point in not calling R a statistical language. It is indeed far more than that; Yet, I don't agree that statistics is done by stuffy professors. Wished it was so, but alas, last time I looked at my paycheck I had to conclude that I might be stuffy, but I'm far from being paid as a professor... Cheers Joris On Tue, Jun 22, 2010 at 11:34 AM, Patrick Burns pbu...@pburns.seanet.com wrote: I'll expand my statement slightly. Yes, Peter, you are the archetypical stuffy professor. The truth hurts. By any reasonable metric that I've thought of my company name is at least one-third statistics, from which a common (and I think correct) inference would be that I'm not anti-statistics. There are two aspects of why I think that R should not be called a statistical program: marketing and reality. Marketing Identifying with the most dreaded experience in university is not so good for sales. (Reducing stuffiness might reduce the root problem here.) Reality R really is used for more than statistics. Almost all of my use of R is outside the realm of statistics. Maybe the field of statistics should have claim on a lot of that, but as of now that isn't the case. A Fusion R's real competition is not SAS or SPSS, but Excel. As Brian has pointed out before, the vast majority of statistics is actually done in Excel. Is Excel a statistics program? I don't think many people think that -- neither statisticians nor non-statisticians. Pat On 21/06/2010 10:32, Joris Meys wrote: On Mon, Jun 21, 2010 at 11:15 AM, Patrick Burns pbu...@pburns.seanet.com wrote: (Statistics is what stuffy professors do, I just look at my data and try to figure out what it means.) Often those stuffy professors have a reason to do so. When they want an objective view on the data for example, or an objective measure of the significance of a hypothesis. But you're right, who cares about objectiveness these days? It doesn't sell you a paper, does it? 
Cheers Joris -- Patrick Burns pbu...@pburns.seanet.com http://www.burns-stat.com (home of 'Some hints for the R beginner' and 'The R Inferno') -- Joris Meys Statistical consultant Ghent University Faculty of Bioscience Engineering Department of Applied mathematics, biometrics and process control tel : +32 9 264 59 87 joris.m...@ugent.be --- Disclaimer : http://helpdesk.ugent.be/e-maildisclaimer.php __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] {Spam?} RE: {Spam?} RE: {Spam?} Re: mgcv, testing gamm vs lme, which degrees of freedom?
Hi Carlo, there is no possible way you can compare both models using a classical statistical framework, be it ML, REML or otherwise. The assumptions are violated. Regarding the df, see my previous mail. In your case, I'd resort to the AIC/BIC criteria, and if prediction is the main focus, compare the predictive power of both models using a cross-validation approach. Wood also suggests in his book an MCMC approach for more difficult comparisons. Cheers Joris

On Tue, Jun 22, 2010 at 1:31 AM, Carlo Fezzi c.fe...@uea.ac.uk wrote: Hi Christos, thanks for your kind reply, I agree entirely with your interpretation. In the first model comparison, however, anova does seem to work according to our interpretation, i.e. the df are equal in the two models. My intuition is that the anova command does a fixed-effect test rather than a random-effect one. These are the results I get:

> anova(f1$lme, f2$lme)
       Model df      AIC      BIC    logLik
f1$lme     1  5 466.6232 479.6491 -228.3116
f2$lme     2  5 347.6293 360.6551 -168.8146

Hence I was not sure our interpretation was correct. On your second point I totally agree about the appeal of GAMs... However, I am working in a specific application where the quadratic function is the established benchmark, and I think that testing against it will show even more strongly the appeal of a gamm approach. Any idea which bases could work? Finally, thanks for the tip regarding gamm4; unfortunately I need to fit a bivariate smooth, so I cannot use it. Best wishes, Carlo -- Joris Meys Statistical consultant Ghent University Faculty of Bioscience Engineering Department of Applied mathematics, biometrics and process control tel : +32 9 264 59 87 joris.m...@ugent.be --- Disclaimer : http://helpdesk.ugent.be/e-maildisclaimer.php __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Project
Presuming you're talking about Perl, I might be able to help with specific issues, but read the posting guide before you actually call upon our help: http://www.R-project.org/posting-guide.html One book I can definitely recommend is Phil Spector's Data Manipulation with R: http://www.springer.com/statistics/computanional+statistics/book/978-0-387-74730-9 It covers most of what you need to know about data structures in R, as they're quite different from Perl's. Pay attention to vectorization and the use of indices in R, as both concepts differ substantially from Perl. In essence, avoid for-loops as much as you can. One other source you might look at is the R Inferno by Patrick Burns: http://lib.stat.cmu.edu/S/Spoetry/Tutor/R_inferno.pdf Cheers Joris

On Tue, Jun 22, 2010 at 4:35 AM, Colton Hunt coltonhu...@gmail.com wrote: I am an intern in Virginia Beach and I am currently working on a project that involves converting a script of Pearl to R. The code takes one input text file and outputs six text files. If you think you can help, I will be able to provide more detailed information. Thank you for your time! Colton Hunt __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Joris Meys Statistical consultant Ghent University Faculty of Bioscience Engineering Department of Applied mathematics, biometrics and process control tel : +32 9 264 59 87 joris.m...@ugent.be --- Disclaimer : http://helpdesk.ugent.be/e-maildisclaimer.php __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] How do you install R package?
open the R console. type: ?install.packages. press enter. read. say Doh, that's easy. do what you read. cheers Joris On Tue, Jun 22, 2010 at 5:38 PM, Amy Li a...@ucsd.edu wrote: Hi I'm a new user. I need to install some new packages. Can someone show me? Amy [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Joris Meys Statistical consultant Ghent University Faculty of Bioscience Engineering Department of Applied mathematics, biometrics and process control tel : +32 9 264 59 87 joris.m...@ugent.be --- Disclaimer : http://helpdesk.ugent.be/e-maildisclaimer.php __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] replication time series using permutations of the each y values
Read the posting guide, please. What's your data structure? That's quite important. As I see it, I can easily get a matrix with what you want by:

x1 <- rep(c("a", "b"), each = 3)
x2 <- rep(c("c", "d", "f"), times = 2)
x3 <- rep("g", 6)
x4 <- rep("h", 6)
result <- rbind(x1, x2, x3, x4)

But that's probably not what you want. Cheers Joris

2010/6/22 Josué Polanco jom...@gmail.com: Dear All, I'm trying to create this: I have these data (a, b, c, etc. are numbers):

1 a b
2 c d f
3 g
4 h
... N

I'm trying to create all the time-series permutations (length = N), taking into account the possibilities for each value. For example (N = 4):

1 a
2 c
3 g
4 h

next,

1 b
2 c
3 g
4 h

next,

1 a
2 d
3 g
4 h

and so on. For this example, there are 2*3 different (possible) time series, but I need to do, e.g., N = 100 and 2^14 different (possible) time series. I am wondering if somebody could give me a hint (suggestion) on how to do this using a function in R. Thank you so much. Best regards -- Josue Polanco __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Joris Meys Statistical consultant Ghent University Faculty of Bioscience Engineering Department of Applied mathematics, biometrics and process control tel : +32 9 264 59 87 joris.m...@ugent.be --- Disclaimer : http://helpdesk.ugent.be/e-maildisclaimer.php __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
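[Editor's note] For the actual question, generating every series obtainable by picking one alternative per time point, expand.grid() does the combinatorial work; a sketch with the poster's N = 4 example (the list names t1..t4 are mine):

```r
alternatives <- list(t1 = c("a", "b"),
                     t2 = c("c", "d", "f"),
                     t3 = "g",
                     t4 = "h")

# one row per possible series: 2 * 3 * 1 * 1 = 6 rows
series <- expand.grid(alternatives, stringsAsFactors = FALSE)
series
```

For 2^14 series of length 100 this is still only 16384 rows, so it remains feasible; for much larger problems, generate the rows one at a time instead of materializing the whole grid.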
Re: [R] Duplicate dates in zoo objects
Try this:

> x.Date <- as.Date("2003-02-01") + c(1, 3, 7, 7, 14) - 1
> x <- zoo(1:5, x.Date)
> x
2003-02-01 2003-02-03 2003-02-07 2003-02-07 2003-02-14
         1          2          3          4          5
> i <- match(unique(index(x)), rev(index(x)))
> x[i]
2003-02-01 2003-02-03 2003-02-07 2003-02-14
         1          2          4          5

Cheers Joris

On Tue, Jun 22, 2010 at 4:35 PM, Research risk2...@ath.forthnet.gr wrote: Hello, I have a zoo time series read from an Excel file which has some dates the same, such as the following example:

02/10/1995 4925.5
30/10/1995 4915.9
23/01/1996 4963.5
23/01/1996 5009.2
04/03/1996 5031.9  # here
04/03/1996 5006.5  # here
03/04/1996 5069.2
03/05/1996 5103.7
31/05/1996 5107.1
01/07/1996 5153.1
02/08/1996 5151.7

Is there a simple way to keep the last price of the ones that have the same dates?

04/03/1996 5031.9
04/03/1996 5006.5

i.e., keep only the 04/03/1996 5006.5 price and discard the previous one... Is there an implicit function that does that, or do I need some sort of recursive algorithm? You can try a solution on this example (for convenience):

x.Date <- as.Date("2003-02-01") + c(1, 3, 7, 7, 14) - 1
x <- zoo(rnorm(5), x.Date)

The zoo object has 2 prices with the same date. Many thanks in advance, Costas

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Joris Meys Statistical consultant Ghent University Faculty of Bioscience Engineering Department of Applied mathematics, biometrics and process control tel : +32 9 264 59 87 joris.m...@ugent.be --- Disclaimer : http://helpdesk.ugent.be/e-maildisclaimer.php __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Duplicate dates in zoo objects
Oops, I was too fast. I meant:

i <- length(x) - match(unique(index(x)), rev(index(x))) + 1

(one has to reverse the indices again). But in any case, the aggregate way is indeed far cleaner. Thanks for pointing that out. Cheers Joris

On Tue, Jun 22, 2010 at 7:24 PM, Achim Zeileis achim.zeil...@uibk.ac.at wrote: On Tue, 22 Jun 2010, Joris Meys wrote:

Try this:
> x.Date <- as.Date("2003-02-01") + c(1, 3, 7, 7, 14) - 1
> x <- zoo(1:5, x.Date)
> x
2003-02-01 2003-02-03 2003-02-07 2003-02-07 2003-02-14
         1          2          3          4          5
> i <- match(unique(index(x)), rev(index(x)))
> x[i]
2003-02-01 2003-02-03 2003-02-07 2003-02-14
         1          2          4          5

Note that this happens to do the right thing in this particular toy example but not more generally. Simply adding a single observation will make it fail. The aggregate() solution I posted previously is preferred:

R> x.Date <- as.Date("2003-02-01") + c(1, 3, 7, 7, 14, 15) - 1
R> x <- zoo(1:6, x.Date)
Warning message:
In zoo(1:6, x.Date) : some methods for zoo objects do not work if the
  index entries in 'order.by' are not unique
R> x
2003-02-01 2003-02-03 2003-02-07 2003-02-07 2003-02-14 2003-02-15
         1          2          3          4          5          6
R> aggregate(x, time(x), tail, 1)
2003-02-01 2003-02-03 2003-02-07 2003-02-14 2003-02-15
         1          2          4          5          6
R> i <- match(unique(index(x)), rev(index(x)))
R> x[i]
2003-02-01 2003-02-03 2003-02-07 2003-02-14 2003-02-15
         1          2          3          5          6

Best, Z

Cheers Joris

On Tue, Jun 22, 2010 at 4:35 PM, Research risk2...@ath.forthnet.gr wrote: Hello, I have a zoo time series read from an Excel file which has some dates the same, such as the following example:

02/10/1995 4925.5
30/10/1995 4915.9
23/01/1996 4963.5
23/01/1996 5009.2
04/03/1996 5031.9  # here
04/03/1996 5006.5  # here
03/04/1996 5069.2
03/05/1996 5103.7
31/05/1996 5107.1
01/07/1996 5153.1
02/08/1996 5151.7

Is there a simple way to keep the last price of the ones that have the same dates?

04/03/1996 5031.9
04/03/1996 5006.5

i.e., keep only the 04/03/1996 5006.5 price and discard the previous one...
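[Editor's note] The keep-the-last-observation logic also has a compact base-R spelling via duplicated(..., fromLast = TRUE), shown here on plain vectors so it runs without zoo; the same logical index can be applied to a zoo series as x[!duplicated(index(x), fromLast = TRUE)]:

```r
dates  <- as.Date("2003-02-01") + c(1, 3, 7, 7, 14, 15) - 1
values <- 1:6

# TRUE for the last occurrence of each date, FALSE for earlier duplicates
keep <- !duplicated(dates, fromLast = TRUE)
values[keep]   # 1 2 4 5 6
dates[keep]
```

This matches the aggregate(x, time(x), tail, 1) result from the thread on the same data.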
   Is there a built-in function that does this, or do I need some sort of
   recursive algorithm? You can try a solution on this example (for
   convenience):
  
   x.Date <- as.Date("2003-02-01") + c(1, 3, 7, 7, 14) - 1
   x <- zoo(rnorm(5), x.Date)
  
   The zoo object has 2 prices with the same date.
  
   Many thanks in advance,
   Costas

--
Joris Meys
Statistical consultant
Ghent University
Faculty of Bioscience Engineering
Department of Applied mathematics, biometrics and process control
tel : +32 9 264 59 87
joris.m...@ugent.be
---
Disclaimer : http://helpdesk.ugent.be/e-maildisclaimer.php
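A side note for readers of this archived thread: besides aggregate(), the same "keep the last observation per date" result can be had with duplicated() in one indexing step. A sketch (not from the original thread; assumes the zoo package is installed):

```r
library(zoo)

x.Date <- as.Date("2003-02-01") + c(1, 3, 7, 7, 14, 15) - 1
x <- zoo(1:6, x.Date)

# keep only the last entry for each duplicated index value
x.last <- x[!duplicated(index(x), fromLast = TRUE)]
x.last
```

Unlike the match()-based attempt, this stays correct when extra observations are added, because duplicated() marks every earlier copy of a repeated date.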
Re: [R] save scores from sem
PCA and factor analysis are implemented in the core R distribution; no extra packages needed. When using princomp(), the scores are stored in the fitted object:

pr.c <- princomp(USArrests, cor = TRUE)
pr.c$scores  # gives you the scores

See ?princomp. When using factanal(), you can ask for regression scores or Bartlett scores; see ?factanal. Mind you, you will get different (i.e. more correct) results than those obtained by SPSS. I don't understand what you mean by scores in the context of structural equation modelling; lavaan is unknown to me.

Cheers
Joris

On Tue, Jun 22, 2010 at 3:11 PM, Steve Powell st...@promente.net wrote:

 Dear expeRts,

 Sorry for such a newbie question - in PCA/factor analysis, e.g. in SPSS,
 it is possible to save the scores from the factors. Is it analogously
 possible to save the implied scores from the latent variables in a
 measurement model or structural model, e.g. using the sem or lavaan
 packages, to use in further analyses?

 Best wishes
 Steve Powell
 www.promente.org | skype stevepowell99 | +387 61 215 997

--
Joris Meys
Statistical consultant
Ghent University
Faculty of Bioscience Engineering
Department of Applied mathematics, biometrics and process control
tel : +32 9 264 59 87
joris.m...@ugent.be
---
Disclaimer : http://helpdesk.ugent.be/e-maildisclaimer.php
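To illustrate the factanal() scores mentioned above, a minimal sketch on a built-in dataset (the variable selection here is arbitrary, chosen only so that a 2-factor model is estimable; it is not from the original thread):

```r
# factor analysis with regression scores on a built-in dataset
vars <- mtcars[, c("mpg", "disp", "hp", "drat", "wt", "qsec")]
fa <- factanal(vars, factors = 2, scores = "regression")

head(fa$scores)  # one row per observation, one column per factor
```

Bartlett scores are obtained the same way with scores = "Bartlett"; see ?factanal for the difference between the two.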
Re: [R] glm
On Tue, Jun 22, 2010 at 1:00 AM, Samuel Okoye samu...@yahoo.com wrote:

 Hi,

 I have the following data

 data1 <- data.frame(count = c(0, 1, 1, 2, 4, 5, 13, 16, 14),
                     weeks = 1:9,
                     treat = c(rep("1mg", 3), rep("5mg", 3), rep("10mg", 3)))

 and I am using library(splines) to fit

 glm.m <- glm(count ~ bs(weeks) + as.factor(treat), family = poisson, data = data1)

 and I am interested in predicting the count variable for weeks 10, 11 and
 12 with treat 10mg and 15mg.

Bad luck for you.

newdat <- data.frame(weeks = rep(10:12, each = 2),
                     treat = rep(c("5mg", "10mg"), times = 3))
preds <- predict(glm.m, type = "response", newdata = newdat, se.fit = TRUE)
cbind(newdat, preds)

gives, as expected:

Warning message:
In bs(weeks, degree = 3L, knots = numeric(0), Boundary.knots = c(1L, :
  some 'x' values beyond boundary knots may cause ill-conditioned bases

  weeks treat       fit    se.fit residual.scale
1    10   5mg  5.934881  5.205426              1
2    10  10mg 12.041639  9.514347              1
3    11   5mg  4.345165  6.924663              1
4    11  10mg  8.816168 15.805171              1
5    12   5mg  2.781063  8.123436              1
6    12  10mg  5.642667 18.221007              1

Watch the standard errors on the predicted values. No, you shouldn't predict outside your data space, especially when using splines. And if you're interested in 15mg, well, you shouldn't put treatment in as a factor to start with.

Cheers
Joris

--
Joris Meys
Statistical consultant
Ghent University
Faculty of Bioscience Engineering
Department of Applied mathematics, biometrics and process control
tel : +32 9 264 59 87
joris.m...@ugent.be
---
Disclaimer : http://helpdesk.ugent.be/e-maildisclaimer.php
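The alternative hinted at in the last sentence can be sketched as follows: treating dose as a numeric covariate puts 15 mg on the scale of the predictor, so prediction there is interpolation in dose (though still extrapolation in weeks, so the caveat about standard errors stands). This is an illustration, not code from the thread, and the dose values are read off the factor labels:

```r
data1 <- data.frame(count = c(0, 1, 1, 2, 4, 5, 13, 16, 14),
                    weeks = 1:9,
                    dose  = rep(c(1, 5, 10), each = 3))

# dose as a numeric covariate instead of a factor
glm.num <- glm(count ~ weeks + dose, family = poisson, data = data1)

newdat <- data.frame(weeks = rep(10:12, each = 2),
                     dose  = rep(c(10, 15), times = 3))
preds <- predict(glm.num, newdata = newdat, type = "response", se.fit = TRUE)
cbind(newdat, fit = preds$fit, se = preds$se.fit)
```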
Re: [R] Verify the linear regression model used in R ( fundamental theory)
It's normally always specified, except in the case of least-squares linear regression (lm), where it is considered obvious.

Cheers
Joris

On Wed, Jun 23, 2010 at 1:46 AM, Yi liuyi.fe...@gmail.com wrote:

 Hi, folks,

 As I understand it, the least-squares estimate (second-moment assumption)
 and the method of maximum likelihood (full distribution assumption) are
 used for linear regression. I do ?lm, but the help file does not tell me
 the model employed in R. The book 'Introductory Statistics with R'
 indicates that R estimates the parameters using the method of least
 squares; however, it assumes the error is iid N(0, sigma^2). Am I correct?

 Is there any general way (like RSiteSearch()) to find out what the model
 (theory) is for a specific function? Let's say, how to find out the
 assumptions and the model used for rlm.

 Thanks
 Yi

--
Joris Meys
Statistical consultant
Ghent University
Faculty of Bioscience Engineering
Department of Applied mathematics, biometrics and process control
tel : +32 9 264 59 87
joris.m...@ugent.be
---
Disclaimer : http://helpdesk.ugent.be/e-maildisclaimer.php
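The least-squares claim is easy to verify numerically (a sketch, not from the thread): the coefficients lm() returns match the closed-form solution of the normal equations, beta = (X'X)^(-1) X'y.

```r
set.seed(42)
x <- 1:20
y <- 2 + 3 * x + rnorm(20)

fit <- lm(y ~ x)

# closed-form least-squares solution via the normal equations
X <- cbind(1, x)                      # design matrix with intercept column
beta <- solve(t(X) %*% X, t(X) %*% y)

all.equal(unname(coef(fit)), as.vector(beta))  # TRUE
```

Note that the N(0, sigma^2) assumption is not needed to compute these estimates; it only enters when constructing the t-tests and confidence intervals that summary() reports.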
Re: [R] How to 'understand' R functions besides reading R codes
There is:

1) read the help files
2) read the vignette (see ?vignette)
3) look it up at www.rseek.org
4) check Google
5) look at the references mentioned in the help files (or do this first if you're not familiar with the method)
6) ask here

If still nothing: use another function/method.

Cheers
Joris

On Wed, Jun 23, 2010 at 2:29 AM, Yi liuyi.fe...@gmail.com wrote:

 Apologies for not being clearer earlier. I would like to ask again. Thanks
 to Joris and Mark Leeds for responding.

 Two examples:

 1. The function 'var'. In R, it is the sum of squares divided by (n-1),
 not by n. (I know this from R class.)

 2. The function 'lm'. In R, the residual variance is the residual sum of
 squares divided by (n-2), not by n, the same as in the least-squares
 estimate, even though the assumptions follow those of the method of
 maximum likelihood. (I know this from looking it up in the book
 'Introductory Statistics with R'.)

 But not all functions are explained in books. I am wondering whether
 there is a general way to figure out how an R function works and what the
 underlying theory is, besides looking at the code (typing var or lm at
 the R prompt), because R code is too difficult for me to understand.

 Thanks
 Yi

--
Joris Meys
Statistical consultant
Ghent University
Faculty of Bioscience Engineering
Department of Applied mathematics, biometrics and process control
tel : +32 9 264 59 87
joris.m...@ugent.be
---
Disclaimer : http://helpdesk.ugent.be/e-maildisclaimer.php
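The two examples in the question are easy to check directly (a quick sketch): the var() denominator can be verified in one line, and typing a function's name without parentheses prints its source.

```r
x <- c(2, 4, 4, 4, 5, 5, 7, 9)
n <- length(x)

# var() divides by n - 1 (the unbiased estimator), not by n
all.equal(var(x), sum((x - mean(x))^2) / (n - 1))  # TRUE

# inspect a function's definition
var                # prints the R source of var()
getAnywhere("var") # also finds functions hidden in namespaces
```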
Re: [R] Mahalanobis distance
Say X is your data matrix with the variables; then you could do:

X <- matrix(rnorm(2100), 300, 7)
S <- var(X)
D <- as.dist(apply(X, 1, function(i) mahalanobis(X, i, S)))

(note that mahalanobis() returns squared distances).

Cheers
Joris

On Tue, Jun 22, 2010 at 11:41 PM, yoo hoo freesuccess2...@yahoo.com wrote:

 I am a new R user. I have a question about Mahalanobis distance. I have
 300 rows and 7 columns: the columns are different measurements, and the
 300 rows are genes. Since the genes fall into 4 categories, I used dist()
 with Euclidean distance and cmdscale() to make an MDS plot, but found
 that Mahalanobis distance may be better. How do I use mahalanobis() to
 generate a similar dist object that I can use for an MDS plot?

 A second question: should I calculate the mean of every category for
 every measurement first and build a 4x4 distance matrix, or should I
 calculate pairwise distances first and then find the category means,
 since I only care about the relative positions of the 4 categories in the
 MDS plot?

 Thank you very much.

--
Joris Meys
Statistical consultant
Ghent University
Faculty of Bioscience Engineering
Department of Applied mathematics, biometrics and process control
tel : +32 9 264 59 87
joris.m...@ugent.be
---
Disclaimer : http://helpdesk.ugent.be/e-maildisclaimer.php
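Continuing toward the poster's MDS goal, a sketch with simulated data (not from the thread; note the sqrt(), since mahalanobis() returns squared distances):

```r
set.seed(1)
X <- matrix(rnorm(2100), 300, 7)
S <- var(X)

# 300 x 300 matrix of squared Mahalanobis distances, row i vs all rows
D2 <- apply(X, 1, function(i) mahalanobis(X, i, S))
D  <- as.dist(sqrt(D2))

# classical multidimensional scaling on those distances
mds <- cmdscale(D, k = 2)
plot(mds, xlab = "MDS 1", ylab = "MDS 2")
```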
Re: [R] compute coefficient of determination (R-squared) for GLM (maximum likelihood)
It is not available for a reason. The correct way would be to use lm() instead, if possible; that function reports an R-squared in the summary. In the case of glm, and if you're absolutely sure about what you're doing, you can use one of the approximations that is used when looking at prediction only, realizing very well that you can't use R-squared to compare models with a different number of variables, and that R-squared doesn't mean what you think it does when using a link function:

x <- 1:100
y <- 1:100 + rnorm(100)
mod <- glm(y ~ x)

# possibility 1
R2 <- cor(y, predict(mod))^2

# possibility 2
R2 <- 1 - sum((y - predict(mod))^2) / sum((y - mean(y))^2)

In the case where you use a link function, you should work on the transformed scale: transform the values of y, and use predict(mod, type = "link") for a correct estimate.

Cheers
Joris

On Mon, Jun 21, 2010 at 12:00 AM, elaine kuo elaine.kuo...@gmail.com wrote:

 Dear all,

 I want to compute the coefficient of determination (R-squared) to
 complement AIC for model selection of a multivariable GLM. However, this
 is not a built-in feature of glm, nor is it available through reviewing
 the questions in the R-help archive.

 Please kindly help, and thanks a lot.

 Elaine

--
Joris Meys
Statistical consultant
Ghent University
Faculty of Bioscience Engineering
Department of Applied mathematics, biometrics and process control
tel : +32 9 264 59 87
joris.m...@ugent.be
---
Disclaimer : http://helpdesk.ugent.be/e-maildisclaimer.php
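A likelihood-based alternative not shown above is McFadden's pseudo-R-squared, computed from the fitted and null deviances. A sketch on simulated Poisson data (with the usual caveat that pseudo-R-squared values are not comparable to the OLS R-squared):

```r
set.seed(7)
x <- 1:100
y <- rpois(100, lambda = exp(0.02 * x))

mod <- glm(y ~ x, family = poisson)

# McFadden's pseudo-R2: 1 - residual deviance / null deviance
pseudoR2 <- 1 - mod$deviance / mod$null.deviance
pseudoR2
```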
Re: [R] Popularity of R, SAS, SPSS, Stata...
On Mon, Jun 21, 2010 at 11:15 AM, Patrick Burns pbu...@pburns.seanet.com wrote: (Statistics is what stuffy professors do, I just look at my data and try to figure out what it means.) Often those stuffy professors have a reason to do so. When they want an objective view on the data for example, or an objective measure of the significance of a hypothesis. But you're right, who cares about objectiveness these days? It doesn't sell you a paper, does it? Cheers Joris -- Joris Meys Statistical consultant Ghent University Faculty of Bioscience Engineering Department of Applied mathematics, biometrics and process control tel : +32 9 264 59 87 joris.m...@ugent.be --- Disclaimer : http://helpdesk.ugent.be/e-maildisclaimer.php __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Replacing elements of a list over a certain threshold
Magic code:

example[example > threshold] <- threshold

Cheers
Joris

On Mon, Jun 21, 2010 at 12:34 PM, Jim Hargreaves ja...@ipec.co.uk wrote:

 Dear List,

 I have a list of length ~1000 filled with numerics. I need to replace the
 elements of this list that are above a certain numerical threshold with
 the value of the threshold, e.g.

 example <- list(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1)
 threshold <- 5

 # magic code goes here

 example
 # list(1, 2, 3, 4, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 4, 3, 2, 1)

 I have written a crude script that achieves this but it's very slow. Is
 there a way to do this using some R function?

 Crude script: http://pastebin.com/3KSfi8nD

--
Joris Meys
Statistical consultant
Ghent University
Faculty of Bioscience Engineering
Department of Applied mathematics, biometrics and process control
tel : +32 9 264 59 87
joris.m...@ugent.be
---
Disclaimer : http://helpdesk.ugent.be/e-maildisclaimer.php
Re: [R] Replacing elements of a list over a certain threshold
You shouldn't use sapply/lapply for this; use the indices instead:

set.seed(1)
r <- round(runif(10, 1, 10))
threshold <- 5
head(r)
[1] 3 4 6 9 3 9
system.time(r[r > threshold] <- threshold)
   user  system elapsed
      0       0       0
head(r)
[1] 3 4 5 5 3 5

On Mon, Jun 21, 2010 at 2:03 PM, Christos Argyropoulos argch...@hotmail.com wrote:

 Hi,

 You could use sapply/lapply for such operations:

 r <- round(runif(10, 1, 10))
 head(r)
 [1] 3 7 6 3 2 8
 filt <- function(x, thres) ifelse(x < thres, x, thres)
 system.time(r2 <- sapply(r, filt, thres = 5))
    user  system elapsed
    3.36    0.00    3.66
 head(r2)
 [1] 3 5 5 3 2 5

 To return a list, replace sapply with lapply.

 Christos

 Date: Mon, 21 Jun 2010 11:34:01 +0100
 From: ja...@ipec.co.uk
 To: r-help@r-project.org
 Subject: [R] Replacing elements of a list over a certain threshold

  Dear List,

  I have a list of length ~1000 filled with numerics. I need to replace
  the elements of this list that are above a certain numerical threshold
  with the value of the threshold, e.g.

  example <- list(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1)
  threshold <- 5

  # magic code goes here

  example
  # list(1, 2, 3, 4, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 4, 3, 2, 1)

  I have written a crude script that achieves this but it's very slow. Is
  there a way to do this using some R function?

  Crude script: http://pastebin.com/3KSfi8nD
--
Joris Meys
Statistical consultant
Ghent University
Faculty of Bioscience Engineering
Department of Applied mathematics, biometrics and process control
tel : +32 9 264 59 87
joris.m...@ugent.be
---
Disclaimer : http://helpdesk.ugent.be/e-maildisclaimer.php
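For completeness, a sketch (not from the original thread) of the same capping with the vectorised pmin(), applied to the poster's list via unlist():

```r
threshold <- 5
example <- list(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1)

# pmin() caps every element at the threshold in one vectorised call
capped <- pmin(unlist(example), threshold)
capped
# [1] 1 2 3 4 5 5 5 5 5 5 5 5 5 5 5 4 3 2 1

# wrap back into a list only if one is really needed
as.list(capped)
```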