date:20110927

[R] Does replacing some values of a zoo object by NA reduce it's size ?

2011-09-27 Thread Ashim Kapoor

Dear R-helpers,

Please have a look at the following. f1 is the same as f2 except that it has
some values replaced by NA. But it's corresponding file is slightly bigger
than the file containing f2. Could someone please tell me if this is an
anomaly ?

 load(file1)
 ls()
[1] f1
 load(file2)
 ls()
[1] f1 f2


 dput(f1)
structure(c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, NA, NA, NA, 2, 2,
2, NA, NA, NA, NA), .Dim = c(10L, 2L), .Dimnames = list(NULL,
c(v1, l1)), index = 1:10, class = zoo)
 dput(f2)
structure(c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 2, 2, 2, 2, 2, 2,
2, 2, 2, 2), .Dim = c(10L, 2L), .Dimnames = list(NULL, c(v1,
l1)), index = 1:10, class = zoo)

-rw-r--r-- 1 ashimkapoor ashimkapoor 192 2011-09-27 11:08 file1
-rw-r--r-- 1 ashimkapoor ashimkapoor 179 2011-09-27 11:08 file2

Best Regards,
Ashim.

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] two-way anova help

2011-09-27 Thread Indrajit Sengupta

Hi Paul,

There should not be any problem. Here is how I visualize the data table looks 
like:

Obs Male_typeÂ Â  
 Female_typeÂ  
 Response 
1 1 1 34 
2 1 1 44 
3 1 2 38 
â¦ â¦ â¦ â¦ 
If your data frame has the above structure, R will automatically understand 
that there is replication. Your model form will remain the same.

Regards,
Indrajit



From: Austin Paul austi...@usc.edu

Cc: r-help@r-project.org r-help@r-project.org
Sent: Tuesday, September 27, 2011 10:57 AM
Subject: Re: [R] two-way anova help


Hi,

Yes.Â  As I explained, the three male and three female types were crossed in 
all combinations (9 ways).Â  For each of the 9 types, I have 5 replicate tanks 
(45 total tanks).Â  And from each of the 45 tanks I have 50 observations for 
size.Â  So the 5 replicates are somehow nested within the two-way 
interaction?Â  If there was just 1 tank for each of the 9 crosses, yes, it 
would be very easy to code the two-way anova.Â  It may still be very easy, but 
I'm not quite sure how to account for the replicate tanks.Â  

Hope this makes more sense. 

Austin




On Mon, Sep 26, 2011 at 10:21 PM, Indrajit Sengupta indra_cali...@yahoo.com 
wrote:

Can you explain what do you mean by 5 replicate tanks?


Doing a two way anova is very simple in R. You would need to fit a linear 
model (lm function). 



Eg.:


 model - lm(y ~ male + female + male:female, data =)




Regards,
Indrajit






 From: Austin Paul austi...@usc.edu
To: r-help@r-project.org
Sent: Tuesday, September 27, 2011 6:13 AM
Subject: [R] two-way anova help


Hello,

I am having some trouble coding a two-way anova due to replicated
treatments.

I have a factorial design with three male parents and three female parents.
They were mated in all combinations and their babies were grown out and
measured for size.Â  50 babies were measured for each of the 9 crosses.Â  If I
stopped here, I would have no troubles.Â  But I also have 5 replicate tanks
for each of the 9 crosses.Â  My question is how to I code in the 5 replicate
tanks per treatment?

Thanks,
Austin

Â Â Â  [[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] two-way anova help

2011-09-27 Thread Indrajit Sengupta

Somehow the table didn't come in the email as I had visualized. But I hope you 
are able to understand the columns. They are 


1. Obs number
2. Male Type
3. Female Type
4. Response variable

Male type can take three values 1, 2 and 3
Female type can take three values 1, 2 and 3

Note that to consider the above variables as factors and not covariates - you 
need to apply as.factor to those variables in R. 


Then you can run fit the linear model with or without interaction term and 
study the ANOVA table.

Hope this helps.


Regards,
Indrajit





To: Austin Paul austi...@usc.edu
Cc: r-help@r-project.org r-help@r-project.org
Sent: Tuesday, September 27, 2011 12:10 PM
Subject: Re: [R] two-way anova help

Hi Paul,

There should not be any problem. Here is how I visualize the data table looks 
like:

Obs Male_typeÂ Â  
Female_typeÂ  
Response 
1 1 1 34 
2 1 1 44 
3 1 2 38 
â¦ â¦ â¦ â¦ 
If your data frame has the above structure, R will automatically understand 
that there is replication. Your model form will remain the same.

Regards,
Indrajit



From: Austin Paul austi...@usc.edu

Cc: r-help@r-project.org r-help@r-project.org
Sent: Tuesday, September 27, 2011 10:57 AM
Subject: Re: [R] two-way anova help


Hi,

Yes.Â  As I explained, the three male and three female types were crossed in 
all combinations (9 ways).Â  For each of the 9 types, I have 5 replicate tanks 
(45 total tanks).Â  And from each of the 45 tanks I have 50 observations for 
size.Â  So the 5 replicates are somehow nested within the two-way 
interaction?Â  If there was just 1 tank for each of the 9 crosses, yes, it 
would be very easy to code the two-way anova.Â  It may still be very easy, but 
I'm not quite sure how to account for the replicate tanks.Â  

Hope this makes more sense. 

Austin




On Mon, Sep 26, 2011 at 10:21 PM, Indrajit Sengupta indra_cali...@yahoo.com 
wrote:

Can you explain what do you mean by 5 replicate tanks?


Doing a two way anova is very simple in R. You would need to fit a linear 
model (lm function). 



Eg.:


 model - lm(y ~ male + female + male:female, data =)




Regards,
Indrajit






 From: Austin Paul austi...@usc.edu
To: r-help@r-project.org
Sent: Tuesday, September 27, 2011 6:13 AM
Subject: [R] two-way anova help


Hello,

I am having some trouble coding a two-way anova due to replicated
treatments.

I have a factorial design with three male parents and three female parents.
They were mated in all combinations and their babies were grown out and
measured for size.Â  50 babies were measured for each of the 9 crosses.Â  If I
stopped here, I would have no troubles.Â  But I also have 5 replicate tanks
for each of the 9 crosses.Â  My question is how to I code in the 5 replicate
tanks per treatment?

Thanks,
Austin

Â Â Â  [[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



Â Â Â  [[alternative HTML version deleted]]


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] two-way anova help

2011-09-27 Thread Bert Gunter

Tanks are the experimental unit, fishes within tanks are repeated measures
of the treatment, there is no nesting of replicates. You can analyze the
45 x 50 individual data values by repeated measures (mixed effects models)
or by summarizing the 50 measurements per tank=treatment into a single value
and then doing exactly what Indrajit said with the 45 summarized values.

-- Bert

On Mon, Sep 26, 2011 at 10:27 PM, Austin Paul austi...@usc.edu wrote:

 Hi,

 Yes.  As I explained, the three male and three female types were crossed in
 all combinations (9 ways).  For each of the 9 types, I have *5 replicate
 tanks* (45 total tanks).  And from each of the 45 tanks I have 50
 observations for size.  So the 5 replicates are somehow nested within the
 two-way interaction?  If there was just 1 tank for each of the 9 crosses,
 yes, it would be very easy to code the two-way anova.  It may still be very
 easy, but I'm not quite sure how to account for the replicate tanks.

 Hope this makes more sense.

 Austin



 On Mon, Sep 26, 2011 at 10:21 PM, Indrajit Sengupta 
 indra_cali...@yahoo.com
  wrote:

  Can you explain what do you mean by 5 replicate tanks?
 
  Doing a two way anova is very simple in R. You would need to fit a linear
  model (lm function).
 
  Eg.:
 
   model - lm(y ~ male + female + male:female, data =)
 
 
  Regards,
  Indrajit
 
 
  --
  *From:* Austin Paul austi...@usc.edu
  *To:* r-help@r-project.org
  *Sent:* Tuesday, September 27, 2011 6:13 AM
  *Subject:* [R] two-way anova help
 
  Hello,
 
  I am having some trouble coding a two-way anova due to replicated
  treatments.
 
  I have a factorial design with three male parents and three female
 parents.
  They were mated in all combinations and their babies were grown out and
  measured for size.  50 babies were measured for each of the 9 crosses.
  If
  I
  stopped here, I would have no troubles.  But I also have 5 replicate
 tanks
  for each of the 9 crosses.  My question is how to I code in the 5
 replicate
  tanks per treatment?
 
  Thanks,
  Austin
 
  [[alternative HTML version deleted]]
 
  __
  R-help@r-project.org mailing list
  https://stat.ethz.ch/mailman/listinfo/r-help
  PLEASE do read the posting guide
  http://www.R-project.org/posting-guide.html
  and provide commented, minimal, self-contained, reproducible code.
 
 
 

[[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Package for Neural Network Classification Quality

2011-09-27 Thread Claudia Beleites


Alejandro,


Hi, somebody knows about one R-package which i can evaluate quality
(recall, precision, accuracy, etc) of Neural network classification
with more than 2 classes. I found ROCT package,
http://cran.r-project.org/web/packages/ROCR/index.html, but it only
workes with binary classifications,

I guess that is because strictly these measures are defined for binary
problems (though I expand them to multi-class situations by using
class-A ./. not-class-A binary measures which comes quite naturally for
my classes).

In case you need something that takes soft or fuzzy class measures: I
put my ideas about that into package softclassval and would much
appreciate feedback.

Best,

Claudia




Best regards,

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



--
Claudia Beleites
Spectroscopy/Imaging
Institute of Photonic Technology
Albert-Einstein-Str. 9
07745 Jena
Germany

email: claudia.belei...@ipht-jena.de
phone: +49 3641 206-133
fax:   +49 2641 206-399

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] two-way anova help

2011-09-27 Thread Austin Paul

Hi Indrajit and Bert,

I really appeciate your help.  I have coded as you mentioned, but I feel
like I am losing a lot of data because I am not accounting for the replicate
tanks (what if the 5 replicates of the same cross type vary more among tanks
than within?).  Below is what my data would look like (if it comes through)

   obs  male female rep response 1 1 1 1 34 2 1 1 1 44 . . . . . . . . . .
51 1 1 2 37 . . . . . . . . . . 251 1 2 1 42

So if I understand correctly, the five replicate tanks of each cross cannot
be treated as technical replicates?  They are not exactly repeated measures
in the sense they are different individuals in different replicate tanks.
If I pool all 250 observations for each cross, instead of treating it as
5x50 observations, I feel like I am losing a lot of information.

Austin


On Mon, Sep 26, 2011 at 11:40 PM, Indrajit Sengupta indra_cali...@yahoo.com
 wrote:

  Hi Paul,

 There should not be any problem. Here is how I visualize the data table
 looks like:

  Obs Male_type
 Female_type
 Response 1 1 1 34 2 1 1 44 3 1 2 38    
 If your data frame has the above structure, R will automatically understand
 that there is replication. Your model form will remain the same.

 Regards,
 Indrajit

   --
 *From:* Austin Paul austi...@usc.edu
 *To:* Indrajit Sengupta indra_cali...@yahoo.com
 *Cc:* r-help@r-project.org r-help@r-project.org
 *Sent:* Tuesday, September 27, 2011 10:57 AM
 *Subject:* Re: [R] two-way anova help

 Hi,

 Yes.  As I explained, the three male and three female types were crossed in
 all combinations (9 ways).  For each of the 9 types, I have *5 replicate
 tanks* (45 total tanks).  And from each of the 45 tanks I have 50
 observations for size.  So the 5 replicates are somehow nested within the
 two-way interaction?  If there was just 1 tank for each of the 9 crosses,
 yes, it would be very easy to code the two-way anova.  It may still be very
 easy, but I'm not quite sure how to account for the replicate tanks.

 Hope this makes more sense.

 Austin



 On Mon, Sep 26, 2011 at 10:21 PM, Indrajit Sengupta 
 indra_cali...@yahoo.com wrote:

  Can you explain what do you mean by 5 replicate tanks?

 Doing a two way anova is very simple in R. You would need to fit a linear
 model (lm function).

 Eg.:

  model - lm(y ~ male + female + male:female, data =)


 Regards,
 Indrajit


  --
 *From:* Austin Paul austi...@usc.edu
 *To:* r-help@r-project.org
 *Sent:* Tuesday, September 27, 2011 6:13 AM
 *Subject:* [R] two-way anova help

 Hello,

 I am having some trouble coding a two-way anova due to replicated
 treatments.

 I have a factorial design with three male parents and three female parents.
 They were mated in all combinations and their babies were grown out and
 measured for size.  50 babies were measured for each of the 9 crosses.  If
 I
 stopped here, I would have no troubles.  But I also have 5 replicate tanks
 for each of the 9 crosses.  My question is how to I code in the 5 replicate
 tanks per treatment?

 Thanks,
 Austin

 [[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.htmlhttp://www.r-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.







[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] How to terminate R program if found any execution error

2011-09-27 Thread Allan Engelhardt

The original poster may also be interested in options(error) to 
capture the 'any execution error' requirement.  From the examples in 
help(options):


## Not run: ## on error, terminate the R session with error status 66
options(error = quote(q(no, status=66, runLast=FALSE)))
stop(test it)


Allan

On 27/09/2011 06:22, Duncan Murdoch wrote:

On 11-09-27 12:20 AM, arunkumar wrote:

Hi

   I want to terminate R process if there are any execution error.

a=a
b=10
c=try(a/b)

if(class(c)[1]==try-error)
{
stop(Wrong Input Value)
}

d=c*c



if c fails then it should terminate the process.
Please can anyone help


Keep the try(a/b), but replace the conditional with

if (inherits(c, try-error)) q(no)

See ?q if you want to set an error status to be returned, or do want 
to save the workspace, etc.


(And do use inherits() rather than comparing to a particular entry: 
your code will probably work in this example, but it's not the right 
way to test the class of something, and some day try-error might not 
be the first entry.)


Duncan Murdoch

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide 
http://www.R-project.org/posting-guide.html

and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Does replacing some values of a zoo object by NA reduce it's size ?

2011-09-27 Thread Allan Engelhardt

It is not an anomaly.  The object is the same size 
(object.size(f1)==object.size(f2)); its file representation is different.


On 27/09/2011 07:00, Ashim Kapoor wrote:

Dear R-helpers,

Please have a look at the following. f1 is the same as f2 except that it has
some values replaced by NA. But it's corresponding file is slightly bigger
than the file containing f2. Could someone please tell me if this is an
anomaly ?


load(file1)
ls()

[1] f1

load(file2)
ls()

[1] f1 f2

dput(f1)

structure(c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, NA, NA, NA, 2, 2,
2, NA, NA, NA, NA), .Dim = c(10L, 2L), .Dimnames = list(NULL,
 c(v1, l1)), index = 1:10, class = zoo)

dput(f2)

structure(c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 2, 2, 2, 2, 2, 2,
2, 2, 2, 2), .Dim = c(10L, 2L), .Dimnames = list(NULL, c(v1,
l1)), index = 1:10, class = zoo)

-rw-r--r-- 1 ashimkapoor ashimkapoor 192 2011-09-27 11:08 file1
-rw-r--r-- 1 ashimkapoor ashimkapoor 179 2011-09-27 11:08 file2

Best Regards,
Ashim.

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] regression with ordered arguments

2011-09-27 Thread Francesco Sarracino

Dear R listers,

I am trying to be a new R user, but life is not that easy.
My problem is the following one: let's assume to have 3 outcome variables
(y1, y2, y3) and 3 explanatory ones (x1, x2, x3).
How can I run the following three separate regressions without having to
repeat the lm command three times?

fit.1 - lm(y1 ~ x1)
fit.2 - lm(y2 ~ x2)
fit.3 - lm(y3 ~ x3)


Both the y and x variables have been generated extracting random numbers
from uniform distributions using a command such as:

y1 - runif(100, min = 0, max = 1)

I went to several introductory manuals, the manual R for stata users,
econometrics in R, Introductory statistics with R and several blogs and help
files, but I didn't find an answer to my question.
can you please help me? In Stata I wouldn't have any problem  in running
this as a loop, but I really can't figure out how to do that with R.
Thanks in advance for all your help.
Best,
f.

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Odp: regression with ordered arguments

2011-09-27 Thread Petr PIKAL

Hi

 Dear R listers,
 
 I am trying to be a new R user, but life is not that easy.
 My problem is the following one: let's assume to have 3 outcome 
variables
 (y1, y2, y3) and 3 explanatory ones (x1, x2, x3).
 How can I run the following three separate regressions without having to
 repeat the lm command three times?
 
 fit.1 - lm(y1 ~ x1)
 fit.2 - lm(y2 ~ x2)
 fit.3 - lm(y3 ~ x3)
 
 
 Both the y and x variables have been generated extracting random numbers
 from uniform distributions using a command such as:
 
 y1 - runif(100, min = 0, max = 1)
 
 I went to several introductory manuals, the manual R for stata users,
 econometrics in R, Introductory statistics with R and several blogs and 
help
 files, but I didn't find an answer to my question.
 can you please help me? In Stata I wouldn't have any problem  in running
 this as a loop, but I really can't figure out how to do that with R.

You can construct loop with naming through paste, numbers and get in R too 
but you will find your life much easier to use R powerfull list 
operations.

Insted of

y1 - runif(100, min = 0, max = 1)
...

lll - vector(mode=list, length=3)
lll - lapply(1, function(x) runif(100, min = 0, max = 1))

you can use probably mapply for doing your regression.
Or you can easily access part of the list by loop

for (i in 1:3) lm(lll[[i]]~xx[[i]])

(if you have your x's in list xx)

Regards
Petr

 Thanks in advance for all your help.
 Best,
 f.
 
[[alternative HTML version deleted]]
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide 
http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Mahalanobis Distance

2011-09-27 Thread david.jessop

Hi

One thought would be to fit say a GARCH model to your historical data series, 
divide the returns by the sigma estimates and then repeat.  This would have the 
advantage of getting the data to be closer to the same scale. 

Regards

David  

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of jorgeA
Sent: 26 September 2011 21:07
To: r-help@r-project.org
Subject: Re: [R] Mahalanobis Distance

Hello David,

Thank you for the help anyway. Well answering your question However, I wonder 
how much value there is to computing the Mahalanobis distance with two 
variables that are measured on such different scales?:

These two variables are subseries of the same time series. What I'm doing is 
using one method of forecasting time series that searches in the past of the 
time series, similar subseries to be an input of a forecasting function. I'm 
testing several distance measures, and one of that is the mahalanobis
distance.   

But right now I'm stuck with this problem

Best regards,

Jorge Aikes Junior
Universidade Estadual do Oeste do Paraná - Brazil.

--
View this message in context: 
http://r.789695.n4.nabble.com/Mahalanobis-Distance-tp3844960p3845168.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Issued by UBS AG or affiliates to professional investors...{{dropped:30}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] creating a vector automatically containing numbers 1, 100, 10000, 1000000, ...

2011-09-27 Thread Marion Wenty

hello,

i am looking for a way to get a vector containing the numbers:

1,100,1,100,...

each element should have two ceros more than the one before it until the
number

1e+200

without using a loop.

thank you very much in advance for your help!

marion

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] creating a vector automatically containing numbers 1, 100, 10000, 1000000, ...

2011-09-27 Thread Barry Rowlingson

On Tue, Sep 27, 2011 at 9:33 AM, Marion Wenty marion.we...@gmail.com wrote:
 hello,

 i am looking for a way to get a vector containing the numbers:

 1,100,1,100,...

 each element should have two ceros more than the one before it until the
 number

 1e+200

 without using a loop.

 10^(seq(0,200,by=2)

Barry

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Auto size of plots.

2011-09-27 Thread Joel

Hi

I'm wondering if there is anyway to auto set the margin of a plot. Right now
I'm setting it with par(mar=...) so that all the text for a bar shows up
inside the plot. But I don't want to change par(mar=) for every new plot I
make. 
So is there any way to tell R to increase the margin automatic as long as
something is outside its borders?

//Joel

--
View this message in context: 
http://r.789695.n4.nabble.com/Auto-size-of-plots-tp3846560p3846560.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] two-way anova help

2011-09-27 Thread Indrajit Sengupta

Hi Paul,

Let me ask you one question - when you say tank does it indicate some kind of 
chamber or area in which all the 9 pairs are tested / treated / observed? If so 
- then each of the tanks would constitute a sort of block or a homogeneous 
group. Your data would look like this:

ObsÂ Â Â  TankÂ Â Â  Male_typeÂ Â Â  Female_typeÂ Â Â  Response
1Â Â Â  1Â Â Â  1Â Â Â  1Â Â Â  43
2Â Â Â  2Â Â Â  1Â Â Â  1Â Â Â  52
3Â Â Â  3Â Â Â  1Â Â Â  1Â Â Â  48
4Â Â Â  4Â Â Â  1Â Â Â  1Â Â Â  58
5Â Â Â  5Â Â Â  1Â Â Â  1Â Â Â  55
6Â Â Â  1Â Â Â  1Â Â Â  2Â Â Â  52
7Â Â Â  2Â Â Â  1Â Â Â  2Â Â Â  58
8Â Â Â  3Â Â Â  1Â Â Â  2Â Â Â  46
9Â Â Â  4Â Â Â  1Â Â Â  2Â Â Â  51
10Â Â Â  5Â Â Â  1Â Â Â  2Â Â Â  54
11Â Â Â  1Â Â Â  1Â Â Â  3Â Â Â  40
12Â Â Â  2Â Â Â  1Â Â Â  3Â Â Â  49
13Â Â Â  3Â Â Â  1Â Â Â  3Â Â Â  57
14Â Â Â  4Â Â Â  1Â Â Â  3Â Â Â  49
15Â Â Â  5Â Â Â  1Â Â Â  3Â Â Â  40
â¦Â Â Â  â¦Â Â Â  â¦Â Â Â  â¦Â Â Â  â¦

The only reason you would code it in the above way is if you believe there is 
some kind of effect of tank on the outcome.
Then you can add the block effect / tank effect in your model and test its 
significance in a 3 - way ANOVA design.

Your initial model will be:

model1 - lm(y ~ male + female + tank + male:female, data = dataset)

If the model says - your tank effect is not significant you can remove the 
term and look at the reduced form of the model:

model2 - lm(y ~ male + female + male:female, data = dataset)

Let me know if this is making sense to you.

Regards,
Indrajit



From: Austin Paul austi...@usc.edu

Cc: r-help@r-project.org r-help@r-project.org
Sent: Tuesday, September 27, 2011 12:30 PM
Subject: Re: [R] two-way anova help


Hi Indrajit and Bert,
Â 
I really appeciate your help.Â  I have coded as you mentioned, but I feel like 
I am losing a lot of data because I am not accounting for the replicate tanks 
(what if the 5 replicates of the same cross type vary more among tanks than 
within?).Â  Below is what my data would look like (if it comes through)
Â 
obsÂ  male female rep response 
1 1 1 1 34 
2 1 1 1 44 
. . . . . 
. . . . . 
51 1 1 2 37 
. . . . . 
. . . . . 
251 1 2 1 42 
Â 
So if I understand correctly, the five replicate tanks of each cross cannot be 
treated as technical replicates?Â  They are not exactly repeated measures in 
the sense they are different individuals in different replicate tanks.Â  If I 
pool all 250 observations for each cross, instead of treating it as 5x50 
observations, I feel like I am losing a lot of information.
Â 
Austin
Â  
Â 
On Mon, Sep 26, 2011 at 11:40 PM, Indrajit Sengupta indra_cali...@yahoo.com 
wrote:

Hi Paul,


There should not be any problem. Here is how I visualize the data table looks 
like:


Obs Male_typeÂ Â  
 Female_typeÂ  
 Response 
1 1 1 34 
2 1 1 44 
3 1 2 38 
â¦ â¦ â¦ â¦ 
If your data frame has the above structure, R will automatically understand 
that there is replication. Your model form will remain the same.


Regards,
Indrajit




 From: Austin Paul austi...@usc.edu

Cc: r-help@r-project.org r-help@r-project.org
Sent: Tuesday, September 27, 2011 10:57 AM
Subject: Re: [R] two-way anova help
 


Hi,

Yes.Â  As I explained, the three male and three female types were crossed in 
all combinations (9 ways).Â  For each of the 9 types, I have 5 replicate tanks 
(45 total tanks).Â  And from each of the 45 tanks I have 50 observations for 
size.Â  So the 5 replicates are somehow nested within the two-way 
interaction?Â  If there was just 1 tank for each of the 9 crosses, yes, it 
would be very easy to code the two-way anova.Â  It may still be very easy, but 
I'm not quite sure how to account for the replicate tanks.Â  

Hope this makes more sense. 

Austin




On Mon, Sep 26, 2011 at 10:21 PM, Indrajit Sengupta indra_cali...@yahoo.com 
wrote:

Can you explain what do you mean by 5 replicate tanks?


Doing a two way anova is very simple in R. You would need to fit a linear 
model (lm function). 



Eg.:


 model - lm(y ~ male + female + male:female, data =)




Regards,
Indrajit






 From: Austin Paul austi...@usc.edu
To: r-help@r-project.org
Sent: Tuesday, September 27, 2011 6:13 AM
Subject: [R] two-way anova help
 

Hello,

I am having some trouble coding a two-way anova due to replicated
treatments.

I have a factorial design with three male parents and three female parents.
They were mated in all combinations and their babies were grown out and
measured for size.Â  50 babies were measured for each of the 9 crosses.Â  If I
stopped here, I would have no troubles.Â  But I also have 5 replicate tanks
for each of the 9 crosses.Â  My question is how to I code in the 5 replicate
tanks per treatment?

Thanks,
Austin

Â Â Â  [[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide

Re: [R] creating a vector automatically containing numbers 1, 100, 10000, 1000000, ...

2011-09-27 Thread Marion Wenty

hello barry,
thank you very much!
i had thought of the seq function but only in regard to normal differences.
marion

2011/9/27 Barry Rowlingson b.rowling...@lancaster.ac.uk

 On Tue, Sep 27, 2011 at 9:33 AM, Marion Wenty marion.we...@gmail.com
 wrote:
  hello,
 
  i am looking for a way to get a vector containing the numbers:
 
  1,100,1,100,...
 
  each element should have two ceros more than the one before it until the
  number
 
  1e+200
 
  without using a loop.

  10^(seq(0,200,by=2)

 Barry


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] two-way anova help

2011-09-27 Thread S Ellison

 
Austin,

As you describe it, 'tank' is indeed nested within 'cross' - no tank contains 
the offspring from more than one cross. 

How you treat that depends on whether the tank effect is fixed or random. I 
would _guess_ it's essentially a nuisance effect; if it were the principal 
effect of interest, you'd not have needed to do multiple parent crosses to find 
out about it. If it's random and appreciable, it affects inference about the 
parentage, roughly because the random tank effect increases the uncertainty of 
the cross means. The lm model for tank nested in the cross groups would 
probably look something like

response~male*female+male:female:tank

although if tanks were numbered 1-45 you'd get away without the male:female: 
prefix, at least in lm - the important thing is that lm is told that each 
male:female cross has _different_ tanks. That should give you an anova table 
with the tank effect appearing as the last row but one. 
The classical treatment of that with tank as a random effect would be to 
recalculate F and p for the parent effects and interaction by calculating F 
from the ratio of their mean squares to the tank mean square if the tank mean 
square is significant.. If not you would have the usual soul-searching about 
whether to compare with the residual MS or to re-fit the model without the tank 
effect. (the results ought to be pretty similar either way unless the tank 
effect is close to significance - in which case I'd check against it anyway and 
then think about the answer).

If tank is a fixed effect, you can use the table as is.

A somewhat simpler way to treat the tank-as-random case in R (simoper for you 
and I, that is - the maths involved is not simpler but fortunately Douglas 
Bates did it all for us) would be to treat it as a mixed effects model, using 
lme from the nlme package.  Assuming tank is numbered 1-45 or otherwise 
identified as something like tankID-factor(male:femal:tank) to make sure lme 
knows the tanks are different for each cross, that would look like

library(nlme)
mod - lme(response~male*female, random=~1|tank, ...)

anova(mod) would then give you a test for the parent effects and interactions, 
taking into account the estimated between-tank variance whether it's large or 
not.
and
summary(mod) will tell you the between-tank component of variance.

Finally, if you think the tank effect is probably real and certainly if it's 
large, taking the tank means and doing two-way anova on the resulting 45 mans 
should be pretty much the same as the more complicated stuff above.


Caveat: I'm a chemist. Taking a chemist's advice about statistics is about as 
sensible as taking a statistician's advice about chemistry - the mileage can 
vary.

S Ellison



 -Original Message-
 From: r-help-boun...@r-project.org 
 [mailto:r-help-boun...@r-project.org] On Behalf Of Austin Paul
 Sent: 27 September 2011 06:28
 To: Indrajit Sengupta
 Cc: r-help@r-project.org
 Subject: Re: [R] two-way anova help
 
 Hi,
 
 Yes.  As I explained, the three male and three female types 
 were crossed in all combinations (9 ways).  For each of the 9 
 types, I have *5 replicate
 tanks* (45 total tanks).  And from each of the 45 tanks I 
 have 50 observations for size.  So the 5 replicates are 
 somehow nested within the two-way interaction?  If there was 
 just 1 tank for each of the 9 crosses, yes, it would be very 
 easy to code the two-way anova.  It may still be very easy, 
 but I'm not quite sure how to account for the replicate tanks.
 
 Hope this makes more sense.
 
 Austin
 
 
 
 On Mon, Sep 26, 2011 at 10:21 PM, Indrajit Sengupta 
 indra_cali...@yahoo.com
  wrote:
 
  Can you explain what do you mean by 5 replicate tanks?
 
  Doing a two way anova is very simple in R. You would need to fit a 
  linear model (lm function).
 
  Eg.:
 
   model - lm(y ~ male + female + male:female, data =)
 
 
  Regards,
  Indrajit
 
 
  --
  *From:* Austin Paul austi...@usc.edu
  *To:* r-help@r-project.org
  *Sent:* Tuesday, September 27, 2011 6:13 AM
  *Subject:* [R] two-way anova help
 
  Hello,
 
  I am having some trouble coding a two-way anova due to replicated 
  treatments.
 
  I have a factorial design with three male parents and three 
 female parents.
  They were mated in all combinations and their babies were grown out 
  and measured for size.  50 babies were measured for each of the 9 
  crosses.  If I stopped here, I would have no troubles.  But I also 
  have 5 replicate tanks for each of the 9 crosses.  My 
 question is how 
  to I code in the 5 replicate tanks per treatment?
 
  Thanks,
  Austin
 
  [[alternative HTML version deleted]]
 
  __
  R-help@r-project.org mailing list
  https://stat.ethz.ch/mailman/listinfo/r-help
  PLEASE do read the posting guide
  http://www.R-project.org/posting-guide.html
  and provide commented, minimal, self-contained, reproducible code.
 
 
 
 
   [[alternative

[R] Workflow for binary classification problem using svm via e1071 package

2011-09-27 Thread Jokel Meyer

Dear R-list!
I am using the e1071 package in R to solve a binary classification problem
in a dataset of round 180 predictor variables (blood metabolites) of two
groups of subjects (patients and healthy controls). I am confused regarding
the correct way to assess the classification accuracy of the trained svm.
(A) The svm command allows to specificy via the 'cross=k' parameter to
specify a k-fold crossvalidation. This results in k values for
classification accuracy and their corresponding mean. (B) On the other hand
most textbooks and tutorials I was browsing, recommend separating the data
into a training and a test-set and then to assess prediction accuarcy by
checking the accuracy of the trained svm when applied to the test-set.
I am not sure whether (A) and (B) would be alternative ways to assess
prediction accuracy? Or is option (A) only the accuracy of the svm when
applied to the test set and therefore I should implement option (B) after I
used option (A)?
So would it be the correct way to use first (A) then do grid-search (via the
tune command) to find the best hyperparameters and then test the trained svm
by applying it to the test set? And in case I use a linear kernel instead of
RBF, I guess I do not need to run grid-search as there are no
hyperparameters to estimate?

BEst,
Jokel

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Coercing a character zoo to a numeric

2011-09-27 Thread Ashim Kapoor

Dear R-helpers,

It seems to me that a character zoo cannot be coerced to a numeric zoo.
Below is a minimal example. Can someone tell me what I have done wrong?

 z-zoo(1:4,order.by=1:4)
 coredata(z)-as.character(coredata(z))
 str(z)
zoo series from 1 to 4
  Data: chr [1:4] 1 2 3 4
  Index:  int [1:4] 1 2 3 4
 coredata(z)-as.numeric(coredata(z))
 str(z)
zoo series from 1 to 4
  Data: chr [1:4] 1 2 3 4
  Index:  int [1:4] 1 2 3 4


Many thanks,
Ashim

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] sm.regression , one-covariate graph and regression line width

2011-09-27 Thread Giulio Di Giovanni


Hi all,
I'm using sm library. In the two-dimensional plot of the analysis with one 
covariate, I'm able to modify several graphic parameters, either specifying 
them directly in the sm.regression() function, either through sm.options().The 
only graphical parameter i cannot modify is the width of the smoothed 
regression line. I can change its color, style, but not the width (which I must 
increase). The only result I was able to obtain is after specifying  lwd in 
par(), which increases the width of not only the regression line , but of 
everything (axes, points, ect ect), which is definitely too much.
par(lwd=3)sm.regression(a, b, col=red, lty=2)
does anybody have a suggestion? Do I have to tweak the function code?
Thanks in advance,
D
  
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] change rownames

2011-09-27 Thread Charlotte . Ndiribe

Hi All,

How do I add com to the row names of a data frame, without
deleting the plot numbers?
Thanks

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] regression with ordered arguments

2011-09-27 Thread Francesco Sarracino

Dear Petr,

thank you so much for your quick reply. I was sure that there were some
smart ways to address my issue. I went through it and took some time to look
at the help for lapply and mapply.
However, some doubts still remain. Following your example, I did:
lll -vector(mode = list, length = 3)
mmm -vector(mode = list, length = 3)

yyy - lapply(lll, function(x) runif(100, min =0, max = 1))
xxx - lapply(mmm, function(x) runif(100, min =0, max = 1))

but then I get stucking again. It's not clear to me how to pass a lm command
to mapply. I tried to give a look at lapply and sapply, but I did not manage
to go much further.
It would be of big help if you could give me some more hints on this or if
you could provide me with some references. I am sorry, but I find the help
files quite cryptic. Is there a manual or some other source that you would
advice me where I could find some more example on how to deal with similar
issues?

Thank you very much for your precious support,
f.


On 27 September 2011 10:08, Petr PIKAL petr.pi...@precheza.cz wrote:

 Hi

  Dear R listers,
 
  I am trying to be a new R user, but life is not that easy.
  My problem is the following one: let's assume to have 3 outcome
 variables
  (y1, y2, y3) and 3 explanatory ones (x1, x2, x3).
  How can I run the following three separate regressions without having to
  repeat the lm command three times?
 
  fit.1 - lm(y1 ~ x1)
  fit.2 - lm(y2 ~ x2)
  fit.3 - lm(y3 ~ x3)
 
 
  Both the y and x variables have been generated extracting random numbers
  from uniform distributions using a command such as:
 
  y1 - runif(100, min = 0, max = 1)
 
  I went to several introductory manuals, the manual R for stata users,
  econometrics in R, Introductory statistics with R and several blogs and
 help
  files, but I didn't find an answer to my question.
  can you please help me? In Stata I wouldn't have any problem  in running
  this as a loop, but I really can't figure out how to do that with R.

 You can construct loop with naming through paste, numbers and get in R too
 but you will find your life much easier to use R powerfull list
 operations.

 Insted of

 y1 - runif(100, min = 0, max = 1)
 ...

 lll - vector(mode=list, length=3)
 lll - lapply(1, function(x) runif(100, min = 0, max = 1))

 you can use probably mapply for doing your regression.
 Or you can easily access part of the list by loop

 for (i in 1:3) lm(lll[[i]]~xx[[i]])

 (if you have your x's in list xx)

 Regards
 Petr

  Thanks in advance for all your help.
  Best,
  f.
 
 [[alternative HTML version deleted]]
 
  __
  R-help@r-project.org mailing list
  https://stat.ethz.ch/mailman/listinfo/r-help
  PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
  and provide commented, minimal, self-contained, reproducible code.



[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Coercing a character zoo to a numeric

2011-09-27 Thread R. Michael Weylandt

Yes, once made into a character zoo, the core data is marked to be of mode
character and most attempts to modify involve implicit coercion to that
mode.

The following however works:

library(zoo)
z - zoo(1:4, order.by=1:4)
str(z)

z.Str - z
coredata(z.Str) - as.character(coredata(z))
str(z.Str)

z.Num - z.Str
mode(z.Num) - numeric
str(z.Num)

However, I prefer to use this sort of line of code:

z.Num - zoo(as.double(z.Str), index(z.Str))

finding it a little more transparent.

Hope this helps,
Michael Weylandt

On Tue, Sep 27, 2011 at 5:56 AM, Ashim Kapoor ashimkap...@gmail.com wrote:

 Dear R-helpers,

 It seems to me that a character zoo cannot be coerced to a numeric zoo.
 Below is a minimal example. Can someone tell me what I have done wrong?

  z-zoo(1:4,order.by=1:4)
  coredata(z)-as.character(coredata(z))
  str(z)
 zoo series from 1 to 4
  Data: chr [1:4] 1 2 3 4
  Index:  int [1:4] 1 2 3 4
  coredata(z)-as.numeric(coredata(z))
  str(z)
 zoo series from 1 to 4
  Data: chr [1:4] 1 2 3 4
  Index:  int [1:4] 1 2 3 4


 Many thanks,
 Ashim

[[alternative HTML version deleted]]


 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.



[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Coercing a character zoo to a numeric

2011-09-27 Thread Ashim Kapoor

Dear Michael,

I don't think this is mentioned in the zoo FAQ. May I ask where you read
this? Some references ?

Thank you for your help,
Ashim

On Tue, Sep 27, 2011 at 4:06 PM, R. Michael Weylandt 
michael.weyla...@gmail.com wrote:

 Yes, once made into a character zoo, the core data is marked to be of mode
 character and most attempts to modify involve implicit coercion to that
 mode.

 The following however works:

 library(zoo)

 z - zoo(1:4, order.by=1:4)
 str(z)

 z.Str - z
 coredata(z.Str) - as.character(coredata(z))
 str(z.Str)

 z.Num - z.Str
 mode(z.Num) - numeric
 str(z.Num)

 However, I prefer to use this sort of line of code:

 z.Num - zoo(as.double(z.Str), index(z.Str))

 finding it a little more transparent.

 Hope this helps,
 Michael Weylandt

 On Tue, Sep 27, 2011 at 5:56 AM, Ashim Kapoor ashimkap...@gmail.comwrote:

 Dear R-helpers,

 It seems to me that a character zoo cannot be coerced to a numeric zoo.
 Below is a minimal example. Can someone tell me what I have done wrong?

  z-zoo(1:4,order.by=1:4)
  coredata(z)-as.character(coredata(z))
  str(z)
 zoo series from 1 to 4
  Data: chr [1:4] 1 2 3 4
  Index:  int [1:4] 1 2 3 4
  coredata(z)-as.numeric(coredata(z))
  str(z)
 zoo series from 1 to 4
  Data: chr [1:4] 1 2 3 4
  Index:  int [1:4] 1 2 3 4


 Many thanks,
 Ashim

[[alternative HTML version deleted]]


 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Pearson chi-square test

2011-09-27 Thread Michael Haenlein

Dear all,

I have some trouble understanding the chisq.test function.
Take the following example:

set.seed(1)
A - cut(runif(100),c(0.0, 0.35, 0.50, 0.65, 1.00), labels=FALSE)
B - cut(runif(100),c(0.0, 0.25, 0.40, 0.75, 1.00), labels=FALSE)
C - cut(runif(100),c(0.0, 0.25, 0.50, 0.80, 1.00), labels=FALSE)
x - table(A,B)
y - table(A,C)

When I calculate the test statistic by hand I get a value of approximately
75.9:
http://en.wikipedia.org/wiki/Pearson's_chi-square_test#Calculating_the_test-statistic
sum((x-y)^2/y)

But when I do chisq.test(x,y) I get a value of 12.2 while chisq.test(y,x)
gives a value of 10.3.

I understand that I must be doing something wrong here, but I'm not sure
what.

Thanks,

Michael

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Coercing a character zoo to a numeric

2011-09-27 Thread R. Michael Weylandt

It's just a guess from playing around with a few things. The author of the
package is on this list and he could both confirm that I'm right and say why
exactly it is implemented like this.

My hunch is that it ultimately comes from the fact that

coredata(z) != z

Consider this:

x = letters[1:5]
x2 - x1 - x

mode(x)

x1[1:5] - 1:5 # Modify the values of x1 without changing the mode
print(x1)

x2 - 1:5 # Replace x2
print(x2)

Michael Weylandt

On Tue, Sep 27, 2011 at 6:41 AM, Ashim Kapoor ashimkap...@gmail.com wrote:

 Dear Michael,

 I don't think this is mentioned in the zoo FAQ. May I ask where you read
 this? Some references ?

 Thank you for your help,
 Ashim


 On Tue, Sep 27, 2011 at 4:06 PM, R. Michael Weylandt 
 michael.weyla...@gmail.com wrote:

 Yes, once made into a character zoo, the core data is marked to be of mode
 character and most attempts to modify involve implicit coercion to that
 mode.

 The following however works:

 library(zoo)

 z - zoo(1:4, order.by=1:4)
 str(z)

 z.Str - z
 coredata(z.Str) - as.character(coredata(z))
 str(z.Str)

 z.Num - z.Str
 mode(z.Num) - numeric
 str(z.Num)

 However, I prefer to use this sort of line of code:

 z.Num - zoo(as.double(z.Str), index(z.Str))

 finding it a little more transparent.

 Hope this helps,
 Michael Weylandt

 On Tue, Sep 27, 2011 at 5:56 AM, Ashim Kapoor ashimkap...@gmail.comwrote:

 Dear R-helpers,

 It seems to me that a character zoo cannot be coerced to a numeric zoo.
 Below is a minimal example. Can someone tell me what I have done wrong?

  z-zoo(1:4,order.by=1:4)
  coredata(z)-as.character(coredata(z))
  str(z)
 zoo series from 1 to 4
  Data: chr [1:4] 1 2 3 4
  Index:  int [1:4] 1 2 3 4
  coredata(z)-as.numeric(coredata(z))
  str(z)
 zoo series from 1 to 4
  Data: chr [1:4] 1 2 3 4
  Index:  int [1:4] 1 2 3 4


 Many thanks,
 Ashim

[[alternative HTML version deleted]]


 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.





[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Coercing a character zoo to a numeric

2011-09-27 Thread Ashim Kapoor

Dear Micheal,

Thank you. So to make a zoo which has factors in into a numeric,we have to
go from factor to character to numeric. The coredata goes fine from factor
to character. In the numeric conversion we need a new zoo object.

Best,
Ashim

On Tue, Sep 27, 2011 at 4:21 PM, R. Michael Weylandt 
michael.weyla...@gmail.com wrote:

 It's just a guess from playing around with a few things. The author of the
 package is on this list and he could both confirm that I'm right and say why
 exactly it is implemented like this.

 My hunch is that it ultimately comes from the fact that

 coredata(z) != z

 Consider this:

 x = letters[1:5]
 x2 - x1 - x

 mode(x)

 x1[1:5] - 1:5 # Modify the values of x1 without changing the mode
 print(x1)

 x2 - 1:5 # Replace x2
 print(x2)

 Michael Weylandt


 On Tue, Sep 27, 2011 at 6:41 AM, Ashim Kapoor ashimkap...@gmail.comwrote:

 Dear Michael,

 I don't think this is mentioned in the zoo FAQ. May I ask where you read
 this? Some references ?

 Thank you for your help,
 Ashim


 On Tue, Sep 27, 2011 at 4:06 PM, R. Michael Weylandt 
 michael.weyla...@gmail.com wrote:

 Yes, once made into a character zoo, the core data is marked to be of
 mode character and most attempts to modify involve implicit coercion to
 that mode.

 The following however works:

 library(zoo)

 z - zoo(1:4, order.by=1:4)
 str(z)

 z.Str - z
 coredata(z.Str) - as.character(coredata(z))
 str(z.Str)

 z.Num - z.Str
 mode(z.Num) - numeric
 str(z.Num)

 However, I prefer to use this sort of line of code:

 z.Num - zoo(as.double(z.Str), index(z.Str))

 finding it a little more transparent.

 Hope this helps,
 Michael Weylandt

 On Tue, Sep 27, 2011 at 5:56 AM, Ashim Kapoor ashimkap...@gmail.comwrote:

 Dear R-helpers,

 It seems to me that a character zoo cannot be coerced to a numeric zoo.
 Below is a minimal example. Can someone tell me what I have done wrong?

  z-zoo(1:4,order.by=1:4)
  coredata(z)-as.character(coredata(z))
  str(z)
 zoo series from 1 to 4
  Data: chr [1:4] 1 2 3 4
  Index:  int [1:4] 1 2 3 4
  coredata(z)-as.numeric(coredata(z))
  str(z)
 zoo series from 1 to 4
  Data: chr [1:4] 1 2 3 4
  Index:  int [1:4] 1 2 3 4


 Many thanks,
 Ashim

[[alternative HTML version deleted]]


 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.






[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] A very big matrix inside a function

2011-09-27 Thread Francisco Fabuel

Dear listers:

As a part of a intermediate process, I need to use and modify a very big
matrix (some 3x3) inside the body of a function. If the matrix is
defined in the function, R shows a error message Cannot allocate vector of
size 6.7 Gb. But if I define the matrix before the function is called (as
soon as the dimension can be calculated) R can allocate it. The problem is
that although my function can use the matrix, cannot modify it. The schema
is the following:


# OPTION 1
# Function definition
Myfunction-function(parameters, ...) {
...
Verybigmatrix-matrix(0, n, n) # Matrix definition inside the function
...
}
# Main process
...
Value.returned-Myfunction(parameters, ...) # R issues an error message 
Cannot allocate vector of size 6.7 Gb
...



# OPTION 2
# Function definition...
Myfunction-function(parameters, ...) {
...
Verybigmatrix[index1, index2]-newvalue # Issues an error message  Cannot
allocate vector of size 6.7 Gb 
# It assignation is made with `-` error is issued too
...
}
# Main process
...
Verybigmatrix-matrix(0, n, n) # Definition before Myfunction is called
doesn't issue an error
...
Value.returned-Myfunction(parameters, ...) # R sees Verybigmatrix but if I
try to modify it issues an error message  Cannot allocate vector of size
6.7 Gb
...


Characteristics of the machine I'm working on:
Operating system: Redhat Enterprise Linux 6
Kernel and CPU: Linux 2.6.32-131.6.1.el6.x86_64 on x86_64
Processor information: Intel(R) Xeon(TM) CPU 3.60GHz, 4 cores
Memory: 8 GB

Thanks in advance for your help.

Francisco

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Question concerning Box.test

2011-09-27 Thread Samir Benzerfa

Hi everyone,

 

I've got a question concerning the function Box.test for testing
autocorrelation in my data.

 

My data consist of (daily) returns of several stocks over time (first
row=time, all other rows=stock returns). I intend to perform a Box-Ljung
test for my returns (for each stock). Since I have about 3000 stocks in my
list, I'm not able to perform the test individually for each stock.
Unfortunately the Box.test only works for univariate series. My goal is to
get a list with every p-value (from the output) of the 3000 tests (that is a
list with 3000 p-values). Any hint how to do this? I tried to do this with
the function mapply, but it didn't work.

 

Many thanks in advance  best regards

S.B.

 


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Capturing the error information

2011-09-27 Thread arunkumar1111


Hi,

I want to capture the error information without throwing  it out.

if i use try command it throws and also stores. and not able to get the
error if i use tryCatch command.


a=a
b=10
c=try(a/b)

Please help



--
View this message in context: 
http://r.789695.n4.nabble.com/Capturing-the-error-information-tp3846624p3846624.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Problem with zoo::window()

2011-09-27 Thread Agustin Lobo


I have the following time series:

 class(CCasadesz2)
[1] zoo
 setmanes - cut(time(CCasadesz2),breaks=weeks)
 CCasadeswz - aggregate(CCasadesz2,sum,by=setmanes)
 class(CCasadeswz)
[1] zoo
 summary(CCasadeswz)
 Index  CCasadeswz
 2009-01-12 00:00:00: 1   Min.   :   4.0
 2009-01-19 00:00:00: 1   1st Qu.: 150.0
 2009-01-26 00:00:00: 1   Median : 268.0
 2009-02-02 00:00:00: 1   Mean   : 316.6
 2009-02-09 00:00:00: 1   3rd Qu.: 387.5
 2009-02-23 00:00:00: 1   Max.   :1435.0
 (Other):93

 summary(time(CCasadeswz))
shows dates for 2009 and 2010, and

 plot(CCasadeswz)
correctly displays the data for 2009 and 2010.

But when I try to select year 2010:

 CCasadeswz2010 = window(CCasadeswz,start=as.Date(2010-01-01), 
end=as.Date(2010-12-31))

Warning messages:
1: In which(in.index  all.indexes = start  all.indexes = end) :
  Incompatible methods (Ops.factor, Ops.Date) for =
2: In which(in.index  all.indexes = start  all.indexes = end) :
  Incompatible methods (Ops.factor, Ops.Date) for =

 summary(CCasadeswz2010)
 Index
 2009-01-12 00:00:00:0
 2009-01-19 00:00:00:0
 2009-01-26 00:00:00:0
 2009-02-02 00:00:00:0
 2009-02-09 00:00:00:0
 2009-02-16 00:00:00:0
 (Other):0

(it seems no data have been selected!)

 plot(CCasadeswz2010)
Error in plot.window(xlim = xlim, ylim = ylim, log = log, yaxs = 
pars$yaxs) :

  need finite 'ylim' values
Calls: plot ... boxplot - boxplot.default - do.call - bxp - plot.window
In addition: Warning messages:
1: In min(x) : no non-missing arguments to min; returning Inf
2: In max(x) : no non-missing arguments to max; returning -Inf
3: In min(x) : no non-missing arguments to min; returning Inf
4: In max(x) : no non-missing arguments to max; returning -Inf
5: In is.na(rows) : is.na() applied to non-(list or vector) of type 'NULL'
Error in plot.window(xlim = xlim, ylim = ylim, log = log, yaxs = 
pars$yaxs) :

  need finite 'ylim' values
Calls: plot ... boxplot - boxplot.default - do.call - bxp - plot.window

why do I get the warning in window()? My command follows what is 
mentioned in the manual:

R window(z, start = as.Date(2005-02-15), end = as.Date(2005-02-28))

Thanks
Agus
--
Dr. Agustin Lobo
Institut de Ciencies de la Terra Jaume Almera (CSIC)
Lluis Sole Sabaris S/N
08028 Barcelona
Spain
Tel. +34 934095410
Fax. +34 934110012
email: agustin.l...@ictja.csic.es

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Coercing a character zoo to a numeric

2011-09-27 Thread R. Michael Weylandt

Yes, that sounds right.

Michael

PS -- If you are interested, the code zoo:::`coredata-.zoo` contains the
line

x[] - value

confirming my hunch about the old mode being inherited unless a coercion to
a more general one is needed.

On Tue, Sep 27, 2011 at 7:01 AM, Ashim Kapoor ashimkap...@gmail.com wrote:

 Dear Micheal,

 Thank you. So to make a zoo which has factors in into a numeric,we have to
 go from factor to character to numeric. The coredata goes fine from factor
 to character. In the numeric conversion we need a new zoo object.

 Best,
 Ashim


 On Tue, Sep 27, 2011 at 4:21 PM, R. Michael Weylandt 
 michael.weyla...@gmail.com wrote:

 It's just a guess from playing around with a few things. The author of the
 package is on this list and he could both confirm that I'm right and say why
 exactly it is implemented like this.

 My hunch is that it ultimately comes from the fact that

 coredata(z) != z

 Consider this:

 x = letters[1:5]
 x2 - x1 - x

 mode(x)

 x1[1:5] - 1:5 # Modify the values of x1 without changing the mode
 print(x1)

 x2 - 1:5 # Replace x2
 print(x2)

 Michael Weylandt


 On Tue, Sep 27, 2011 at 6:41 AM, Ashim Kapoor ashimkap...@gmail.comwrote:

 Dear Michael,

 I don't think this is mentioned in the zoo FAQ. May I ask where you read
 this? Some references ?

 Thank you for your help,
 Ashim


 On Tue, Sep 27, 2011 at 4:06 PM, R. Michael Weylandt 
 michael.weyla...@gmail.com wrote:

 Yes, once made into a character zoo, the core data is marked to be of
 mode character and most attempts to modify involve implicit coercion to
 that mode.

 The following however works:

 library(zoo)

 z - zoo(1:4, order.by=1:4)
 str(z)

 z.Str - z
 coredata(z.Str) - as.character(coredata(z))
 str(z.Str)

 z.Num - z.Str
 mode(z.Num) - numeric
 str(z.Num)

 However, I prefer to use this sort of line of code:

 z.Num - zoo(as.double(z.Str), index(z.Str))

 finding it a little more transparent.

 Hope this helps,
 Michael Weylandt

 On Tue, Sep 27, 2011 at 5:56 AM, Ashim Kapoor ashimkap...@gmail.comwrote:

 Dear R-helpers,

 It seems to me that a character zoo cannot be coerced to a numeric zoo.
 Below is a minimal example. Can someone tell me what I have done wrong?

  z-zoo(1:4,order.by=1:4)
  coredata(z)-as.character(coredata(z))
  str(z)
 zoo series from 1 to 4
  Data: chr [1:4] 1 2 3 4
  Index:  int [1:4] 1 2 3 4
  coredata(z)-as.numeric(coredata(z))
  str(z)
 zoo series from 1 to 4
  Data: chr [1:4] 1 2 3 4
  Index:  int [1:4] 1 2 3 4


 Many thanks,
 Ashim

[[alternative HTML version deleted]]


 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.







[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Question concerning Box.test

2011-09-27 Thread R. Michael Weylandt

Did you try regular apply? If you have univariate input, there's no reason
to use the multivariate mapply. Or more generally:

apply(P[-1,],1,function(p) Box.test(p)$p.value)

Michael

On Tue, Sep 27, 2011 at 4:45 AM, Samir Benzerfa benze...@gmx.ch wrote:

 Hi everyone,



 I've got a question concerning the function Box.test for testing
 autocorrelation in my data.



 My data consist of (daily) returns of several stocks over time (first
 row=time, all other rows=stock returns). I intend to perform a Box-Ljung
 test for my returns (for each stock). Since I have about 3000 stocks in my
 list, I'm not able to perform the test individually for each stock.
 Unfortunately the Box.test only works for univariate series. My goal is to
 get a list with every p-value (from the output) of the 3000 tests (that is
 a
 list with 3000 p-values). Any hint how to do this? I tried to do this with
 the function mapply, but it didn't work.



 Many thanks in advance  best regards

 S.B.




[[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Intersection between circle and line?

2011-09-27 Thread Chris82

Thanks!

In principle it is a mathematical problem. I have already found some
solutions with google.
I thought there is maybe an existing R function or package which handel this
stuff. But it seems not to be.
sos package gives me no suitable information for that.

With best regards



--
View this message in context: 
http://r.789695.n4.nabble.com/Intersection-between-circle-and-line-tp3844063p3846846.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Pearson chi-square test

2011-09-27 Thread Meyners, Michael

Not sure what you want to test here with two matrices, but reading the manual 
helps here as well: 

y   a vector; ignored if x is a matrix.

x and y are matrices in your example, so it comes as no surprise that you get 
different results. On top of that, your manual calculation is not correct if 
you want to test whether two samples come from the same distribution (so don't 
be surprised if R still gives a different value...).

HTH, Michael

 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-
 project.org] On Behalf Of Michael Haenlein
 Sent: Tuesday, September 27, 2011 12:45
 To: r-help@r-project.org
 Subject: [R] Pearson chi-square test
 
 Dear all,
 
 I have some trouble understanding the chisq.test function.
 Take the following example:
 
 set.seed(1)
 A - cut(runif(100),c(0.0, 0.35, 0.50, 0.65, 1.00), labels=FALSE)
 B - cut(runif(100),c(0.0, 0.25, 0.40, 0.75, 1.00), labels=FALSE)
 C - cut(runif(100),c(0.0, 0.25, 0.50, 0.80, 1.00), labels=FALSE)
 x - table(A,B)
 y - table(A,C)
 
 When I calculate the test statistic by hand I get a value of
 approximately
 75.9:
 http://en.wikipedia.org/wiki/Pearson's_chi-
 square_test#Calculating_the_test-statistic
 sum((x-y)^2/y)
 
 But when I do chisq.test(x,y) I get a value of 12.2 while
 chisq.test(y,x)
 gives a value of 10.3.
 
 I understand that I must be doing something wrong here, but I'm not
 sure
 what.
 
 Thanks,
 
 Michael
 
   [[alternative HTML version deleted]]
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-
 guide.html
 and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Changing colour in barchart

2011-09-27 Thread shikantaza

Thanks for your help - very useful!

Martyn.

--
View this message in context: 
http://r.789695.n4.nabble.com/Changing-colour-in-barchart-tp3843209p3846866.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Capturing the error information

2011-09-27 Thread Duncan Murdoch


On 11-09-27 5:39 AM, arunkumar wrote:


Hi,

I want to capture the error information without throwing  it out.

if i use try command it throws and also stores. and not able to get the
error if i use tryCatch command.


a=a
b=10
c=try(a/b)


That's how.  c has now captured the error information.  It did not throw 
the error, it just printed it.  If you don't want it to print, see ?try.


Duncan Murdoch




Please help



--
View this message in context: 
http://r.789695.n4.nabble.com/Capturing-the-error-information-tp3846624p3846624.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Problem with zoo::window()

2011-09-27 Thread Gabor Grothendieck

On Tue, Sep 27, 2011 at 5:23 AM, Agustin Lobo
agustin.l...@ictja.csic.es wrote:
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


Provide the output of dput(CCasadesz2) or if that is very large try to
cut it down to make the example minimal.

-- 
Statistics  Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Coercing a character zoo to a numeric

2011-09-27 Thread Gabor Grothendieck

On Tue, Sep 27, 2011 at 5:56 AM, Ashim Kapoor ashimkap...@gmail.com wrote:
 Dear R-helpers,

 It seems to me that a character zoo cannot be coerced to a numeric zoo.
 Below is a minimal example. Can someone tell me what I have done wrong?

 z-zoo(1:4,order.by=1:4)
 coredata(z)-as.character(coredata(z))
 str(z)
 ‘zoo’ series from 1 to 4
  Data: chr [1:4] 1 2 3 4
  Index:  int [1:4] 1 2 3 4
 coredata(z)-as.numeric(coredata(z))
 str(z)
 ‘zoo’ series from 1 to 4
  Data: chr [1:4] 1 2 3 4
  Index:  int [1:4] 1 2 3 4



See ?zoo where it says that the zoo object may be a numeric vector,
matrix or a factor.

Thus character is not supported although I suspect that a number of
operations continue to work anyways -- although evidently not that
one.

This seems to result in a numeric zoo object:

aggregate(z, identity, as.numeric)

although these sorts of computations with zoo objects that strictly
speaking are not legal are, of course, not officially supported.

-- 
Statistics  Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] regression with ordered arguments

2011-09-27 Thread Petr PIKAL

Hi Francesco

 Dear Petr,
 
 thank you so much for your quick reply. I was sure that there were some 
 smart ways to address my issue. I went through it and took some time to 
 look at the help for lapply and mapply.
 However, some doubts still remain. Following your example, I did:
 lll -vector(mode = list, length = 3)
 mmm -vector(mode = list, length = 3) 
 
 yyy - lapply(lll, function(x) runif(100, min =0, max = 1))
 xxx - lapply(mmm, function(x) runif(100, min =0, max = 1))
 
 but then I get stucking again. It's not clear to me how to pass a lm 
 command to mapply. I tried to give a look at lapply and sapply, but I 
did 
 not manage to go much further.
 It would be of big help if you could give me some more hints on this or 
if
 you could provide me with some references. I am sorry, but I find the 
help
 files quite cryptic. Is there a manual or some other source that you 
would
 advice me where I could find some more example on how to deal with 
similar issues?

You can use for cycle

for (i in 1:3) lll[[i]] -lm(yyy[[i]]~xxx[[i]])

put result of lm to list object lll then

lapply(lll, summary)

gives you summary of each lm function, if you want to use mapply

mmm-mapply(function(x,y) lm(y~x), xxx, yyy, SIMPLIFY=F)

you can see structure of any object by
str(mmm)

Both results shall be similar, for this case I believe that for loop is 
easier to understand.

Regards
Petr



 
 Thank you very much for your precious support,
 f.
 

 On 27 September 2011 10:08, Petr PIKAL petr.pi...@precheza.cz wrote:
 Hi
 
  Dear R listers,
 
  I am trying to be a new R user, but life is not that easy.
  My problem is the following one: let's assume to have 3 outcome
 variables
  (y1, y2, y3) and 3 explanatory ones (x1, x2, x3).
  How can I run the following three separate regressions without having 
to
  repeat the lm command three times?
 
  fit.1 - lm(y1 ~ x1)
  fit.2 - lm(y2 ~ x2)
  fit.3 - lm(y3 ~ x3)
 
 
  Both the y and x variables have been generated extracting random 
numbers
  from uniform distributions using a command such as:
 
  y1 - runif(100, min = 0, max = 1)
 
  I went to several introductory manuals, the manual R for stata users,
  econometrics in R, Introductory statistics with R and several blogs 
and
 help
  files, but I didn't find an answer to my question.
  can you please help me? In Stata I wouldn't have any problem  in 
running
  this as a loop, but I really can't figure out how to do that with R.

 You can construct loop with naming through paste, numbers and get in R 
too
 but you will find your life much easier to use R powerfull list
 operations.
 
 Insted of
 
 y1 - runif(100, min = 0, max = 1)
 ...
 
 lll - vector(mode=list, length=3)
 lll - lapply(1, function(x) runif(100, min = 0, max = 1))
 
 you can use probably mapply for doing your regression.
 Or you can easily access part of the list by loop
 
 for (i in 1:3) lm(lll[[i]]~xx[[i]])
 
 (if you have your x's in list xx)
 
 Regards
 Petr
 
  Thanks in advance for all your help.
  Best,
  f.
 
 [[alternative HTML version deleted]]
 
  __
  R-help@r-project.org mailing list
  https://stat.ethz.ch/mailman/listinfo/r-help
  PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
  and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] wgrib2 for windows 7

2011-09-27 Thread Bhaskaran

Hi all,

I want to install wgrib2 on my windows 7 laptop.  It appears that a
precompiled version exists and works for many.   I managed to download a
precompiled version and placed it in a directory and added this directory to
the PATH.

However, when execute R-command, I get the error suggesting that wgrib2 is
not recognized as either internal or external command.

Any suggestions?

Thanks

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] ANOVA define as factor or not

2011-09-27 Thread HronnE

Hi all

This is probably a simple problem but somehow I am having much trouble with
finding a solution, so I seek your help!

I have a data-set with continuous response variables. The explanatory
variably is 4xpH treatments (so 8.08, 7.94, 7.81 and 7.71)  so  also
continuous and not technically factorial. 

However I have decided to do Anova's (as well as regression) to explore the
effect of pH.

What I don't understand is why the anova done in such a way:

summary(aov(BioMass~pH))

... gives me completely different p-values if I define the pH as factor or
not. And what would be the correct approach?

Help on this subject would be much appreciated!

Hronn


--
View this message in context: 
http://r.789695.n4.nabble.com/ANOVA-define-as-factor-or-not-tp3846861p3846861.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] model selection using logistf package

2011-09-27 Thread mael

Hi everyone,

I'm wondering how to select the best model when using logistf? AIC does
not work neither does anova. I tried fitting a glm model but got the
separation warning message so I tried using the logistf package but as I
stepwise simplify the model I don't know if the simplification is motivated
or not... Can anyone explain to me how I should approach this problem? I
have data with y=0 and 1 and three x variables for 328 observations.

Thanks!

--
View this message in context: 
http://r.789695.n4.nabble.com/model-selection-using-logistf-package-tp3846958p3846958.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] binomial logistic regression question

2011-09-27 Thread majesty

Dear subscribers, 

I am looking for a function which would allow me to model the dependent
variable as the number of successes in a series of Bernoulli trials. My data
looks like this

ID  TRIALS SUCCESSESS   INDEP1 INDEP2 INDEP3
1  00.273 0.055  0.156
2  98170  74   0.123 0.456  0.789
3  14548630   0.124 0.235  0.007
4  14714949   0.888 0.357  0.321
5  60585  11   0.484 0.235  0.235
6  19895343   0.295 0.123  0.856

I want to find out how independent variables influence the number of
successful trials (dependent variable)
I had tried to use glm formula 

regression.glm - glm( SUCCESSESS ~ INDEP1 + INDEP2 + INDEP3, data = data,
family = binomial, weights= TRIALS) 

but got the following: Error in eval(expr, envir, enclos) : y values must be
0 = y = 1
but there y has to be between 0 and 1 (which does make sense, but that's why
I am using weights?)

I would be grateful for any hints/suggestions on how to proceed.
Cheers, 
Juta


--
View this message in context: 
http://r.789695.n4.nabble.com/binomial-logistic-regression-question-tp3846954p3846954.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] model selection using logistf package

2011-09-27 Thread Frank Harrell

Stepwise variable selection without heavy penalization is invalid.
Frank

mael wrote:
 
 Hi everyone,
 
 I'm wondering how to select the best model when using logistf? AIC does
 not work neither does anova. I tried fitting a glm model but got the
 separation warning message so I tried using the logistf package but as I
 stepwise simplify the model I don't know if the simplification is
 motivated or not... Can anyone explain to me how I should approach this
 problem? I have data with y=0 and 1 and three x variables for 328
 observations.
 
 Thanks!
 


-
Frank Harrell
Department of Biostatistics, Vanderbilt University
--
View this message in context: 
http://r.789695.n4.nabble.com/model-selection-using-logistf-package-tp3846958p3847087.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Problem with zoo::window()

2011-09-27 Thread Gabor Grothendieck

On Tue, Sep 27, 2011 at 5:23 AM, Agustin Lobo
agustin.l...@ictja.csic.es wrote:
 I have the following time series:

 class(CCasadesz2)
 [1] zoo
 setmanes - cut(time(CCasadesz2),breaks=weeks)
 CCasadeswz - aggregate(CCasadesz2,sum,by=setmanes)

cut produces a factor, not a Date. The by= argument in
aggregate.zoo should not be a factor but rather should define the new
index. If you want the new index to be a Date class then by= should
define a Date vector.  See ?aggregate.zoo .

Try:

setmanes - as.Date(format(setmanes))

-- 
Statistics  Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] regression with ordered arguments

2011-09-27 Thread Francesco Sarracino

That's cool!

it works :-)))

for me (as a stata user) these are quite basic things and I didn't find them
anywhere for what concerns R. I can't figure out why.
Really, thank you so much,
f.


On 27 September 2011 14:20, Petr PIKAL petr.pi...@precheza.cz wrote:

 Hi Francesco

  Dear Petr,
 
  thank you so much for your quick reply. I was sure that there were some
  smart ways to address my issue. I went through it and took some time to
  look at the help for lapply and mapply.
  However, some doubts still remain. Following your example, I did:
  lll -vector(mode = list, length = 3)
  mmm -vector(mode = list, length = 3)
 
  yyy - lapply(lll, function(x) runif(100, min =0, max = 1))
  xxx - lapply(mmm, function(x) runif(100, min =0, max = 1))
 
  but then I get stucking again. It's not clear to me how to pass a lm
  command to mapply. I tried to give a look at lapply and sapply, but I
 did
  not manage to go much further.
  It would be of big help if you could give me some more hints on this or
 if
  you could provide me with some references. I am sorry, but I find the
 help
  files quite cryptic. Is there a manual or some other source that you
 would
  advice me where I could find some more example on how to deal with
 similar issues?

 You can use for cycle

 for (i in 1:3) lll[[i]] -lm(yyy[[i]]~xxx[[i]])

 put result of lm to list object lll then

 lapply(lll, summary)

 gives you summary of each lm function, if you want to use mapply

 mmm-mapply(function(x,y) lm(y~x), xxx, yyy, SIMPLIFY=F)

 you can see structure of any object by
 str(mmm)

 Both results shall be similar, for this case I believe that for loop is
 easier to understand.

 Regards
 Petr



 
  Thank you very much for your precious support,
  f.
 

  On 27 September 2011 10:08, Petr PIKAL petr.pi...@precheza.cz wrote:
  Hi
 
   Dear R listers,
  
   I am trying to be a new R user, but life is not that easy.
   My problem is the following one: let's assume to have 3 outcome
  variables
   (y1, y2, y3) and 3 explanatory ones (x1, x2, x3).
   How can I run the following three separate regressions without having
 to
   repeat the lm command three times?
  
   fit.1 - lm(y1 ~ x1)
   fit.2 - lm(y2 ~ x2)
   fit.3 - lm(y3 ~ x3)
  
  
   Both the y and x variables have been generated extracting random
 numbers
   from uniform distributions using a command such as:
  
   y1 - runif(100, min = 0, max = 1)
  
   I went to several introductory manuals, the manual R for stata users,
   econometrics in R, Introductory statistics with R and several blogs
 and
  help
   files, but I didn't find an answer to my question.
   can you please help me? In Stata I wouldn't have any problem  in
 running
   this as a loop, but I really can't figure out how to do that with R.

  You can construct loop with naming through paste, numbers and get in R
 too
  but you will find your life much easier to use R powerfull list
  operations.
 
  Insted of
 
  y1 - runif(100, min = 0, max = 1)
  ...
 
  lll - vector(mode=list, length=3)
  lll - lapply(1, function(x) runif(100, min = 0, max = 1))
 
  you can use probably mapply for doing your regression.
  Or you can easily access part of the list by loop
 
  for (i in 1:3) lm(lll[[i]]~xx[[i]])
 
  (if you have your x's in list xx)
 
  Regards
  Petr
 
   Thanks in advance for all your help.
   Best,
   f.
  
  [[alternative HTML version deleted]]
  
   __
   R-help@r-project.org mailing list
   https://stat.ethz.ch/mailman/listinfo/r-help
   PLEASE do read the posting guide
  http://www.R-project.org/posting-guide.html
   and provide commented, minimal, self-contained, reproducible code.



[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] binomial logistic regression question

2011-09-27 Thread Patrick Breheny


On 09/27/2011 07:53 AM, majesty wrote:

Dear subscribers,

I am looking for a function which would allow me to model the dependent
variable as the number of successes in a series of Bernoulli trials. My data
looks like this

ID  TRIALS SUCCESSESS   INDEP1 INDEP2 INDEP3
1  00.273 0.055  0.156
2  98170  74   0.123 0.456  0.789
3  14548630   0.124 0.235  0.007
4  14714949   0.888 0.357  0.321
5  60585  11   0.484 0.235  0.235
6  19895343   0.295 0.123  0.856

I want to find out how independent variables influence the number of
successful trials (dependent variable)
I had tried to use glm formula

regression.glm- glm( SUCCESSESS ~ INDEP1 + INDEP2 + INDEP3, data = data,
family = binomial, weights= TRIALS)


Try

regression.glm- glm(cbind(SUCCESSESS,TRIALS) ~ INDEP1 + INDEP2 + 
INDEP3, data = data,family = binomial)


as is specified in the details of ?glm.

--
Patrick Breheny
Assistant Professor
Department of Biostatistics
Department of Statistics
University of Kentucky

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Mahalanobis Distance

2011-09-27 Thread David L Carlson

David Cross is correct. Your covariance matrix is singular because you have
more columns (15) than rows (2). David actually produced the covariance
matrix for cbind(s.1, s.2) which is not singular, but you are using
rbind(s.1, s.2). Try cor(rbind(s.1, s.2)) and you will see that the
correlations are all +/-1.

--
David L Carlson
Associate Professor of Anthropology
Texas AM University
College Station, TX 77843-4352



-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On
Behalf Of david.jes...@ubs.com
Sent: Tuesday, September 27, 2011 3:20 AM
To: thetrooperm...@gmail.com; r-help@r-project.org
Subject: Re: [R] Mahalanobis Distance

Hi

One thought would be to fit say a GARCH model to your historical data
series, divide the returns by the sigma estimates and then repeat.  This
would have the advantage of getting the data to be closer to the same scale.


Regards

David  

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On
Behalf Of jorgeA
Sent: 26 September 2011 21:07
To: r-help@r-project.org
Subject: Re: [R] Mahalanobis Distance

Hello David,

Thank you for the help anyway. Well answering your question However, I
wonder how much value there is to computing the Mahalanobis distance with
two variables that are measured on such different scales?:

These two variables are subseries of the same time series. What I'm doing is
using one method of forecasting time series that searches in the past of the
time series, similar subseries to be an input of a forecasting function. I'm
testing several distance measures, and one of that is the mahalanobis
distance.   

But right now I'm stuck with this problem

Best regards,

Jorge Aikes Junior
Universidade Estadual do Oeste do Paraná - Brazil.

--
View this message in context:
http://r.789695.n4.nabble.com/Mahalanobis-Distance-tp3844960p3845168.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Issued by UBS AG or affiliates to professional =\ invest...{{dropped:10}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Matrix and list indices

2011-09-27 Thread fernando.cabrera

Hi guys,

I am trying to replace all elements of earth that are equal to zero with their 
corresponding elements in mars. I can do the replace with a bunch of for-loops, 
but I don't think this is the R way of doing things.

my_list - list( earth=array(c(0,0,45,0,0,45,0,45),dim=c(2,2,2)), 
mars=array(c(8:1),dim=c(2,2,2)))
my_list
for (i in c(1:2)) {
for (j in c(1:2)) {
for (k in c(1:2)) {
if (my_list$earth[i,j,k] == 0) {

my_list$earth[i,j,k] - my_list$mars[i,j,k]
}
}
}
}
my_list

Do you guys have any suggestions for getting rid of the ugly for-loops?

Many thanks,

Fernando Álvarez

Nordea e-Markets
Strandgade 3
PO Box 850
DK-0900 Copenhagen C
Denmark
Tel.: +45 33 33 32 67
Mobile: +45 61 55 27 54

This transmission is intended solely for the person or entity to whom it is 
addressed. It may contain privileged and confidential information. If you are 
not the intended recipient, please be notified that any dissemination, 
distribution or copying is strictly prohibited. If you have received this 
transmission by mistake, please let us know and then delete it from your system.

P Please consider the impact on the environment before printing this e-mail.


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Counting similar rows

2011-09-27 Thread Metronome123

Startsituation: 

structure(c(1, 1, 0, 1, 0, 1, 0, 0, 1, 0, 1, 1, 0, 0, 1, 0, 1, 
0, 1, 1), .Dim = 4:5, .Dimnames = structure(list(subject = c(s1, 
s2, s3, s4), class = c(c1, c2, c3, c4, c5)), .Names =
c(subject, 
class)), class = c(xtabs, table), call = xtabs(formula = ~subject + 
class, data = ia))

I want a count of subjects that match the same classes in subject_cnt and a
count of the number of classes in class_cnt The result of this example
should be:

structure(list(subject_cnt = c(2L, 1L, 1L), class_cnt = c(3L, 
2L, 3L), c1 = c(1L, 1L, 0L), c2 = c(0L, 1L, 0L), c3 = c(1L, 0L, 
1L), c4 = c(0L, 0L, 1L), c5 = c(1L, 0L, 1L)), .Names = c(subject_cnt, 
class_cnt, c1, c2, c3, c4, c5), class = data.frame, row.names
= c(NA, 
-3L))

How can I achieve this in R, without complicated loops? 

PS. Note that the number of classes and subjects are in real quite big.

Cheers,


Lars



--
View this message in context: 
http://r.789695.n4.nabble.com/Counting-similar-rows-tp3847051p3847051.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] How does the survfit.coxph calculate the survival value?

2011-09-27 Thread koshihaku

Sorry for a long question.

I am in an urgent using R to fit the cox proportional model. My data is a
data frame including 100 individuals which are software stress tests. 
There are time-to-failure and six covariates meaning the system resource
(every 4 minutes).
Is it possible to use R to fit the Cox model?

I used *coxph* to estimated the coefficients for both time-independent and
time-dependent covariates.
And next I need to calculated the survival and hazard rate for cox model
with time-independent and that with time-dependent covariates.

I see many people use *survfit.coxph *to get the survival.
Could anybody please tell me how is the survival value calculated? For
example, what kind of procedure.



Any suggestion is a great help!

--
View this message in context: 
http://r.789695.n4.nabble.com/How-does-the-survfit-coxph-calculate-the-survival-value-tp3847179p3847179.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] M-Splines design matrix

2011-09-27 Thread Tigisti G

Hi

Can any one inform me an r package used to fit Cubic M-splines or r functions 
used to have design and penalty matrices in fitting cubic  M-splines please?


Thanks,
Tigist
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Matrix and list indices

2011-09-27 Thread R. Michael Weylandt

On Tue, Sep 27, 2011 at 9:43 AM, R. Michael Weylandt 
michael.weyla...@gmail.com wrote:

 Untested, I believe this should work, though you might need to modify for
 floating point funny business in testing the equalities:

 my_list - list( earth=array(c(0,0,45,0,0,45,0,45),dim=c(2,2,2)),
 mars=array(c(8:1),dim=c(2,2,2)))
 my_list$earth[my_list$earth==0] - my_list$mars[my_list$earth==0]

 Hope this helps,

 Michael Weylandt

 On Tue, Sep 27, 2011 at 8:49 AM, fernando.cabr...@nordea.com wrote:

 Hi guys,

 I am trying to replace all elements of earth that are equal to zero with
 their corresponding elements in mars. I can do the replace with a bunch of
 for-loops, but I don't think this is the R way of doing things.

 my_list - list( earth=array(c(0,0,45,0,0,45,0,45),dim=c(2,2,2)),
 mars=array(c(8:1),dim=c(2,2,2)))
 my_list
 for (i in c(1:2)) {
for (j in c(1:2)) {
for (k in c(1:2)) {
if (my_list$earth[i,j,k] ==
 0) {

  my_list$earth[i,j,k] - my_list$mars[i,j,k]
}
}
}
 }
 my_list

 Do you guys have any suggestions for getting rid of the ugly for-loops?

 Many thanks,

 Fernando Álvarez

 Nordea e-Markets
 Strandgade 3
 PO Box 850
 DK-0900 Copenhagen C
 Denmark
 Tel.: +45 33 33 32 67
 Mobile: +45 61 55 27 54

 This transmission is intended solely for the person or entity to whom it
 is addressed. It may contain privileged and confidential information. If you
 are not the intended recipient, please be notified that any dissemination,
 distribution or copying is strictly prohibited. If you have received this
 transmission by mistake, please let us know and then delete it from your
 system.

 P Please consider the impact on the environment before printing this
 e-mail.


[[alternative HTML version deleted]]


 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] change rownames

2011-09-27 Thread R. Michael Weylandt

Your data frame does not seem to have come through with the formatting.
Could you send a plain text example (easily created using dput() ) of what
you have now and what you want?

Michael

On Tue, Sep 27, 2011 at 7:24 AM, charlotte.ndir...@unil.ch wrote:

 Hi,

 I already know  row.names(comm)[1:5]- paste(com, 1:5,
 sep=)
 But, my data looks like e.g.
 plots sp1  sp2
 1 0.2  0.5
 2
 4
 7
 10
 I would like to add com in front of the plot numbers
 without changing them?

 Thank you

 - Original Message -
 From: R. Michael Weylandt michael.weyla...@gmail.com
 To: charlotte.ndir...@unil.ch
 Subject: Re: [R] change rownames
 Date: Tue, 27 Sep 2011 06:39:18 -0400

  I have no idea what you mean by plot numbers, but consider
  this:
 
  x - data.frame(a = 1:5, b = sin(1:5) , c = letters[1:5])
  print(x)
  rownames(x) - paste(rownames(x),com)
  print(x)
 
  Hope this helps,
 
  Michael Weylandt
 
  On Tue, Sep 27, 2011 at 6:33 AM,
  charlotte.ndir...@unil.ch wrote:
 
   Hi All,
  
   How do I add com to the row names of a data frame,
   without deleting the plot numbers?
   Thanks
  
   __
   R-help@r-project.org mailing list
   https://stat.ethz.ch/mailman/listinfo/r-help
   PLEASE do read the posting guide
   http://www.R-project.org/posting-guide.html
   and provide commented, minimal, self-contained,
  reproducible code. 
 


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Problem about rbpspline-selection of knots

2011-09-27 Thread Elif Keten

Hi,

My problem is that i will use rbpspline{SpatialExtremes} to fit a penalized
spline with radial basis functions to my data
but i have two predictor variables and i cannot select knots.
In R help there is an example such that



n - 200
x - runif(n)
fun - function(x) sin(3 * pi * x)
y - fun(x) + rnorm(n, 0, sqrt(0.4))
knots - quantile(x, prob = 1:(n/4) / (n/4 + 1))
fitted - rbpspline(y, x, knots = knots, degree = 3)
fitted
plot(x, y)
lines(fitted, col = 2)

In my case,

n-200
x1-runif(n)
x2-runif(n)
fun-function(x1,x2) 2*x1+x2^2
y - fun(x1,x2) + rnorm(n, 0, sqrt(0.4))

 but at this point i do not know how to select knots.
I want to take cross products of quantiles of x1 and x2.
In R help it says that knots is defined as a vector that gives the
coordinates of the knots.
How can i give the coordinates of two dimensional knots as a vector?



Thanks in advance

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] ANOVA define as factor or not

2011-09-27 Thread Jean V Adams

HronnE wrote on 09/27/2011 06:27:38 AM:
 
 Hi all
 
 This is probably a simple problem but somehow I am having much trouble 
with
 finding a solution, so I seek your help!
 
 I have a data-set with continuous response variables. The explanatory
 variably is 4xpH treatments (so 8.08, 7.94, 7.81 and 7.71)  so  also
 continuous and not technically factorial. 
 
 However I have decided to do Anova's (as well as regression) to explore 
the
 effect of pH.
 
 What I don't understand is why the anova done in such a way:
 
 summary(aov(BioMass~pH))
 
 ... gives me completely different p-values if I define the pH as factor 
or
 not. And what would be the correct approach?
 
 Help on this subject would be much appreciated!
 
 Hronn


If pH is a numeric variable (see ?class) then your formula is instructing 
aov() to include it as a linear term with one degree of freedom, just like 
lm() does if you're fitting a regression.  If you want to treat pH as a 
categorical variable, then you must make sure that it is included in the 
model as a factor.  Then you will see that it uses three degrees of 
freedom.

BioMass - rnorm(100)
pH - sample(c(8.08, 7.94, 7.81, 7.71), size=100, replace=TRUE)
pH.f - as.factor(pH)

summary(aov(BioMass ~ pH))
summary(lm(BioMass ~ pH))

summary(aov(BioMass ~ pH.f))
summary(lm(BioMass ~ pH.f))


Jean
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Pearson chi-square test

2011-09-27 Thread Meyners, Michael

Just for completeness: the manual calculation you'd want is most likely

sum((x-y)^2  / (x+y))

(that's one you can find on the Wikipedia link you provided). To get the same 
from chisq.test, try something like 

chisq.test(data.frame(x,y)[,c(3,6)])

(there are surely smarter ways, but at least it works here). Note that 
something like 

chisq.test(as.vector(x), as.vector(y)) 

will give a different test, i.e. based on a contingency table of x cross y).
M. 

 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-
 project.org] On Behalf Of Meyners, Michael
 Sent: Tuesday, September 27, 2011 13:28
 To: Michael Haenlein; r-help@r-project.org
 Subject: Re: [R] Pearson chi-square test
 
 Not sure what you want to test here with two matrices, but reading the
 manual helps here as well:
 
 y a vector; ignored if x is a matrix.
 
 x and y are matrices in your example, so it comes as no surprise that
 you get different results. On top of that, your manual calculation is
 not correct if you want to test whether two samples come from the same
 distribution (so don't be surprised if R still gives a different
 value...).
 
 HTH, Michael
 
  -Original Message-
  From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-
  project.org] On Behalf Of Michael Haenlein
  Sent: Tuesday, September 27, 2011 12:45
  To: r-help@r-project.org
  Subject: [R] Pearson chi-square test
 
  Dear all,
 
  I have some trouble understanding the chisq.test function.
  Take the following example:
 
  set.seed(1)
  A - cut(runif(100),c(0.0, 0.35, 0.50, 0.65, 1.00), labels=FALSE)
  B - cut(runif(100),c(0.0, 0.25, 0.40, 0.75, 1.00), labels=FALSE)
  C - cut(runif(100),c(0.0, 0.25, 0.50, 0.80, 1.00), labels=FALSE)
  x - table(A,B)
  y - table(A,C)
 
  When I calculate the test statistic by hand I get a value of
  approximately
  75.9:
  http://en.wikipedia.org/wiki/Pearson's_chi-
  square_test#Calculating_the_test-statistic
  sum((x-y)^2/y)
 
  But when I do chisq.test(x,y) I get a value of 12.2 while
  chisq.test(y,x)
  gives a value of 10.3.
 
  I understand that I must be doing something wrong here, but I'm not
  sure
  what.
 
  Thanks,
 
  Michael
 
  [[alternative HTML version deleted]]
 
  __
  R-help@r-project.org mailing list
  https://stat.ethz.ch/mailman/listinfo/r-help
  PLEASE do read the posting guide http://www.R-project.org/posting-
  guide.html
  and provide commented, minimal, self-contained, reproducible code.
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-
 guide.html
 and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Keep consecutive year observations (remove gap's) in panel data (dataframes). Difficulties in using lag(). Package plm.

2011-09-27 Thread Cecilia Carmo

Hi everyone.

I have two questions. Ive found some other questions and answers similar to
these but they didnt solve my problem.

 

Im working with a panel of firm/years observations (see my reproducible
example). Im using the plm package.

My panel not only is unbalanced but also have some gaps in years. 

 

#reproducible example

data1-data.frame(year=c(2001,2002,2003,2004,2005,2001,2002,2004,2005,2001,2
002,2003,2005),

firm=c(1,1,1,1,1,2,2,2,2,3,3,3,3),x=c(11,22,32,25,26,47,85,98,101,14,87,56,1
4))

data1

#load package plm and format data

data2-plm.data(data1,index=c(firm,year))

 

First I want to keep for each firm the longest serie of consecutive years.

 

So I want a dataframe like this (keeping years 2001 and 2002 in firm 2)

 

   year firm  x

1  20011 11

2  20021 22

3  20031 32

4  20041 25

5  20051 26

6  20012 47

7  20022 85

8  20013 14

9  20023 87

10 20033 56

 

Or like this (keeping years 2004 and 2005 in firm 2)

   year firm   x

1  20011  11

2  20021  22

3  20031  32

4  20041  25

5  20051  26

6  20042  98

7  20052 101

8  20013  14

9  20023  87

10 20033  56

 

Second, I need to create a new variable that is the lagged value of x. I've
done 

newdata1-transform(data1,y=lag(x,1))

But it doesn't work.

 

I also need to create a new variable that is the opposite of lag(). I've
done

newdata2-transform(data1,z=lag(x,-1))

But, of course, it doesn't work neither.

 

Thank you for all your help.

 

Cecília Carmo 

(Universidade de Aveiro  Portugal)


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Pearson chi-square test

2011-09-27 Thread Michael Haenlein

Dear Michael,



Thanks very much for your answers!



The purpose of my analysis is to test whether the contingency table x is
different from the contingency table y.

Or, to put it differently, whether there is a significant difference between
the joint distribution AB and AC.



Based on your answer I'm wondering whether the best way to do this is really
a chisq.test?

Or is there probably a different function or package I should use
altogether?



Thanks,



Michael







-Original Message-
From: Meyners, Michael [mailto:meyner...@pg.com]
Sent: Dienstag, 27. September 2011 17:00
To: Michael Haenlein; r-help@r-project.org
Subject: RE: [R] Pearson chi-square test



Just for completeness: the manual calculation you'd want is most likely



sum((x-y)^2  / (x+y))



(that's one you can find on the Wikipedia link you provided). To get the
same from chisq.test, try something like



chisq.test(data.frame(x,y)[,c(3,6)])



(there are surely smarter ways, but at least it works here). Note that
something like



chisq.test(as.vector(x), as.vector(y))



will give a different test, i.e. based on a contingency table of x cross y).

M.



 -Original Message-

 From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-

 project.org] On Behalf Of Meyners, Michael

 Sent: Tuesday, September 27, 2011 13:28

 To: Michael Haenlein; r-help@r-project.org

 Subject: Re: [R] Pearson chi-square test



 Not sure what you want to test here with two matrices, but reading the

 manual helps here as well:



 y   a vector; ignored if x is a matrix.



 x and y are matrices in your example, so it comes as no surprise that

 you get different results. On top of that, your manual calculation is

 not correct if you want to test whether two samples come from the same

 distribution (so don't be surprised if R still gives a different

 value...).



 HTH, Michael



  -Original Message-

  From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-

  project.org] On Behalf Of Michael Haenlein

  Sent: Tuesday, September 27, 2011 12:45

  To: r-help@r-project.org

  Subject: [R] Pearson chi-square test

 

  Dear all,

 

  I have some trouble understanding the chisq.test function.

  Take the following example:

 

  set.seed(1)

  A - cut(runif(100),c(0.0, 0.35, 0.50, 0.65, 1.00), labels=FALSE)

  B - cut(runif(100),c(0.0, 0.25, 0.40, 0.75, 1.00), labels=FALSE)

  C - cut(runif(100),c(0.0, 0.25, 0.50, 0.80, 1.00), labels=FALSE)

  x - table(A,B)

  y - table(A,C)

 

  When I calculate the test statistic by hand I get a value of

  approximately

  75.9:

  http://en.wikipedia.org/wiki/Pearson's_chi-

  square_test#Calculating_the_test-statistic

  sum((x-y)^2/y)

 

  But when I do chisq.test(x,y) I get a value of 12.2 while

  chisq.test(y,x)

  gives a value of 10.3.

 

  I understand that I must be doing something wrong here, but I'm not

  sure

  what.

 

  Thanks,

 

  Michael

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] binomial logistic regression question

2011-09-27 Thread Göran Broström

On Tue, Sep 27, 2011 at 2:59 PM, Patrick Breheny patrick.breh...@uky.eduwrote:

 On 09/27/2011 07:53 AM, majesty wrote:

 Dear subscribers,

 I am looking for a function which would allow me to model the dependent
 variable as the number of successes in a series of Bernoulli trials. My
 data
 looks like this

 ID  TRIALS SUCCESSESS   INDEP1 INDEP2 INDEP3
 1  00.273 0.055  0.156
 2  98170  74   0.123 0.456  0.789
 3  14548630   0.124 0.235  0.007
 4  14714949   0.888 0.357  0.321
 5  60585  11   0.484 0.235  0.235
 6  19895343   0.295 0.123  0.856

 I want to find out how independent variables influence the number of
 successful trials (dependent variable)
 I had tried to use glm formula

 regression.glm- glm( SUCCESSESS ~ INDEP1 + INDEP2 + INDEP3, data = data,
 family = binomial, weights= TRIALS)


 Try

 regression.glm- glm(cbind(SUCCESSESS,TRIALS) ~ INDEP1 + INDEP2 + INDEP3,
 data = data,family = binomial)

 That should be

regression.glm -  glm(cbind(SUCCESSESS,TRIALS - SUCCESSESS) ~ INDEP1 +
INDEP2 + INDEP3, data = data,family = binomial)

i.e., the second column in the response is the number of failures.

as is specified in the details of ?glm.

 --
 Patrick Breheny
 Assistant Professor
 Department of Biostatistics
 Department of Statistics
 University of Kentucky


 __**
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/**listinfo/r-helphttps://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/**
 posting-guide.html http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




-- 
GÃ¶ran BrostrÃ¶m

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] SVM accuracy question

2011-09-27 Thread Riccardo G-Mail


Il 27/09/11 01:58, R. Michael Weylandt ha scritto:

Why exactly do you want to stabilize your results?

If it's in preparation for publication/classroom demo/etc., certainly
resetting the seed before each run (and hence getting the same sample()
output) will make your results exactly reproducible. However, if you are
looking for a clearer picture of the true efficacy of your svm and
there's no real underlying order to the data set (i.e., not a time
series), then a straight sample() seems better to me.

I'm not particularly well read on the svm literature, but it sounds like
you are worried by widely varying performance of the svm itself. If
that's the case, it seems (to me at least) that there are certain data
points that are strongly informative and it might be a more interesting
question to look into which ones those are.

I guess my answer, as a total non-savant in the field, is that it
depends on your goal: repeated runs with sample will give you more
information about the strength of the svm while setting the seed will
give you reproducibility. Importance sampling might be of interest,
particularly if it could be tied to the information content of each data
point, and a quick skim of the MC variance reduction literature might
just provide some fun insights.

I'm not entirely sure how you mean to bootstrap the act of setting the
seed (a randomly set seed seems to be the same as not setting a seed at
all) but that might give you a nice middle ground.

Sorry this can't be of more help,

Michael

On Mon, Sep 26, 2011 at 6:32 PM, Riccardo G-Mail ric.rom...@gmail.com
mailto:ric.rom...@gmail.com wrote:

Hi, I'm working with support vector machine for the classification
purpose, and I have a problem about the accuracy of prediction.

I divided my data set in train (1/3 of enteire data set) and test
(2/3 of data set) using the sample function. Each time I perform
the svm model I obtain different result, according with the result
of the sample function. I would like to stabilize the
performance of my analysis. To do this I used the set.seed
function. Is there a better way to do this? Should I perform a
bootstrap on my work-flow (sample and svm)?

Here is an example of my workflow:
### not to run
index - 1:nrow(myData)
set.seed(23)
testindex - sample(index, trunc(length(index)/3))
testset - myData[testindex, ]
trainset - myData[-testindex, ]

tune.svm()
svm.model - svm(Factor ~ ., data = myData, cost = from tune.svm,
 gamma = from tune.svm, cross= 10, subset= testset)
summary(svm.model)
predict(svm.model, testset)

Best
Riccardo


R-help@r-project.org mailto:R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/__listinfo/r-help
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/__posting-guide.html
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Thanks for your suggestion, I'm agree with you about the uselessness of 
set.seed inside a bootstrap; the idea of bootstrap exclude the set.seed. 
In my mind the bootstrap could allow me to understand the distribution 
of the prediction accuracy of the model. My doubt stems from the fact 
that I'm not a statistician.


Best

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Counting similar rows

2011-09-27 Thread Jean V Adams

Metronome123 wrote on 09/27/2011 07:24:50 AM:
 
 Startsituation: 
 
 structure(c(1, 1, 0, 1, 0, 1, 0, 0, 1, 0, 1, 1, 0, 0, 1, 0, 1, 
 0, 1, 1), .Dim = 4:5, .Dimnames = structure(list(subject = c(s1, 
 s2, s3, s4), class = c(c1, c2, c3, c4, c5)), .Names =
 c(subject, 
 class)), class = c(xtabs, table), call = xtabs(formula = ~subject 
+ 
 class, data = ia))
 
 I want a count of subjects that match the same classes in subject_cnt 
and a
 count of the number of classes in class_cnt The result of this example
 should be:
 
 structure(list(subject_cnt = c(2L, 1L, 1L), class_cnt = c(3L, 
 2L, 3L), c1 = c(1L, 1L, 0L), c2 = c(0L, 1L, 0L), c3 = c(1L, 0L, 
 1L), c4 = c(0L, 0L, 1L), c5 = c(1L, 0L, 1L)), .Names = c(subject_cnt, 
 class_cnt, c1, c2, c3, c4, c5), class = data.frame, 
row.names
 = c(NA, 
 -3L))
 
 How can I achieve this in R, without complicated loops? 
 
 PS. Note that the number of classes and subjects are in real quite big.
 
 Cheers,
 
 
 Lars


Try this:

xt - structure(c(1, 1, 0, 1, 0, 1, 0, 0, 1, 0, 1, 1, 0, 0, 1, 0, 1, 0, 1, 
1), 
.Dim = 4:5, 
.Dimnames = structure(list(subject = c(s1, s2, s3, s4), 
class = c(c1, c2, c3, c4, c5)), 
.Names = c(subject, class)), 
class = c(xtabs, table)) 
df - as.data.frame(unclass(xt))

dfu - unique(df)
class_cnt - apply(dfu, 1, sum)
subject_cnt - tabulate(match(apply(df, 1, paste, collapse=-), 
apply(dfu, 1, paste, collapse=-)))

data.frame(subject_cnt, class_cnt, dfu)


Jean
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Pearson chi-square test

2011-09-27 Thread Meyners, Michael

I suspect that the chisquare-test might not be appropriate, as you have 
constraints (same number of observations for A in both contingency tables). I 
further suspect that there is no test readily available for that, but I might 
be wrong. Maybe randomization tests could help here, but it would require a bit 
of thinking AND programming to accomplish that. chisq.test might give you an 
approximate solution, but I can't say how good this will be (and it might also 
depend on the data, btw).
Best, Michael


From: Michael Haenlein
Sent: Tuesday, September 27, 2011 17:05
To: r-help@r-project.org
Cc: Meyners, Michael
Subject: RE: [R] Pearson chi-square test

Dear Michael,
 
Thanks very much for your answers!
 
The purpose of my analysis is to test whether the contingency table x is 
different from the contingency table y.
Or, to put it differently, whether there is a significant difference between 
the joint distribution AB and AC.
 
Based on your answer I'm wondering whether the best way to do this is really a 
chisq.test?
Or is there probably a different function or package I should use altogether?
 
Thanks,
 
Michael
 
 
 
-Original Message-
From: Meyners, Michael 
Sent: Dienstag, 27. September 2011 17:00
To: Michael Haenlein; r-help@r-project.org
Subject: RE: [R] Pearson chi-square test
 
Just for completeness: the manual calculation you'd want is most likely
 
sum((x-y)^2  / (x+y))
 
(that's one you can find on the Wikipedia link you provided). To get the same 
from chisq.test, try something like 
 
chisq.test(data.frame(x,y)[,c(3,6)])
 
(there are surely smarter ways, but at least it works here). Note that 
something like 
 
chisq.test(as.vector(x), as.vector(y)) 
 
will give a different test, i.e. based on a contingency table of x cross y).
M. 
 
 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-
 project.org] On Behalf Of Meyners, Michael
 Sent: Tuesday, September 27, 2011 13:28
 To: Michael Haenlein; r-help@r-project.org
 Subject: Re: [R] Pearson chi-square test
 
 Not sure what you want to test here with two matrices, but reading the
 manual helps here as well:
 
 y   a vector; ignored if x is a matrix.
 
 x and y are matrices in your example, so it comes as no surprise that
 you get different results. On top of that, your manual calculation is
 not correct if you want to test whether two samples come from the same
 distribution (so don't be surprised if R still gives a different
 value...).
 
 HTH, Michael
 
  -Original Message-
  From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-
  project.org] On Behalf Of Michael Haenlein
  Sent: Tuesday, September 27, 2011 12:45
  To: r-help@r-project.org
  Subject: [R] Pearson chi-square test
 
  Dear all,
 
  I have some trouble understanding the chisq.test function.
  Take the following example:
 
  set.seed(1)
  A - cut(runif(100),c(0.0, 0.35, 0.50, 0.65, 1.00), labels=FALSE)
  B - cut(runif(100),c(0.0, 0.25, 0.40, 0.75, 1.00), labels=FALSE)
  C - cut(runif(100),c(0.0, 0.25, 0.50, 0.80, 1.00), labels=FALSE)
  x - table(A,B)
  y - table(A,C)
 
  When I calculate the test statistic by hand I get a value of
  approximately
  75.9:
  http://en.wikipedia.org/wiki/Pearson's_chi-
  square_test#Calculating_the_test-statistic
  sum((x-y)^2/y)
 
  But when I do chisq.test(x,y) I get a value of 12.2 while
  chisq.test(y,x)
  gives a value of 10.3.
 
  I understand that I must be doing something wrong here, but I'm not
  sure
  what.
 
  Thanks,
 
  Michael

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] ANOVA define as factor or not

2011-09-27 Thread HronnE

Thank you very much for a quick reply!

I had not realized the degrees of freedom changed. It was my lack of
understanding of the aov function.

I will continue defining the pH as factor for the ANOVA's. 

Cheers,

Hronn



-
-- 
Hrönn Egilsdóttir
PhD Student
Marine Research Institute
Skúlagata 4
121 Reykjavík
ICELAND
--
View this message in context: 
http://r.789695.n4.nabble.com/ANOVA-define-as-factor-or-not-tp3846861p3847781.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] wgrib2 for windows 7

2011-09-27 Thread Uwe Ligges




On 27.09.2011 14:06, Bhaskaran wrote:

Hi all,

I want to install wgrib2


I guess you will have to explain what wgrib2 is and why it is related to R.

Uwe Ligges


on my windows 7 laptop.  It appears that a
precompiled version exists and works for many.   I managed to download a
precompiled version and placed it in a directory and added this directory to
the PATH.

However, when execute R-command, I get the error suggesting that wgrib2 is
not recognized as either internal or external command.

Any suggestions?

Thanks

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Question concerning Box.test

2011-09-27 Thread R. Michael Weylandt

Send this again using dput() to give a plain text output and I'll look at
it.

Also, I think you should probably look into the difference between a row and
a column.

Michael

On Tue, Sep 27, 2011 at 11:48 AM, Samir Benzerfa benze...@gmx.ch wrote:

 Many thanks for your hint. I tried regular apply now. However, it still
 doesnt work. Function apply works fine with other regular functions like
 sum or mean. But for the function Box.test(x,) it gives me the following
 error message: 

 ** **

 *Error in Box.test() : *

 *  x is not a vector or univariate time series*

 * *

 For simplicity, I tried to do the test with a simple 2x20 Matrix for 2
 stocks (see below), but it still does not work. It works well if I do the
 test individually for each row à Box.test(x[,1],) and Box.test(x[,2],)**
 **

 ** **

BANK.ABC   ABC.MATERIAL

  1  0.00.0

 2  0.00.0

 3  0.00.0

 4  0.003181659   -0.008194479

  5 -0.006386799   -0.008352074

 6  0.0280287240.008352074

 7 -0.0153476920.004116566

 8 -0.0159100020.016086820

 9  0.0032289700.019305155

 10 -0.013062473   -0.011479818

 11  0.00.0

 12 -0.038090050   -0.011791525

 13  0.021189299   -0.008042720

 14 -0.003460532   -0.008194479

 15 -0.010550182   -0.012589127

 16  0.0174438900.016705694

 17  0.0101396310.0

 18 -0.017090.012120633

 19  0.0102999570.023271342

 20  0.0   -0.007619397

 ** **

 Any other hints? My goal is to do the Box.test for each row (for each
 stock) separately. So I want R to take each row one by one and perform the
 test.

 ** **

 ** **

 *Von:* R. Michael Weylandt [mailto:michael.weyla...@gmail.com]
 *Gesendet:* Dienstag, 27. September 2011 13:12
 *An:* Samir Benzerfa
 *Cc:* r-help@r-project.org
 *Betreff:* Re: [R] Question concerning Box.test

 ** **

 Did you try regular apply? If you have univariate input, there's no reason
 to use the multivariate mapply. Or more generally:

 apply(P[-1,],1,function(p) Box.test(p)$p.value)

 Michael

 On Tue, Sep 27, 2011 at 4:45 AM, Samir Benzerfa benze...@gmx.ch wrote:**
 **

 Hi everyone,



 I've got a question concerning the function Box.test for testing
 autocorrelation in my data.



 My data consist of (daily) returns of several stocks over time (first
 row=time, all other rows=stock returns). I intend to perform a Box-Ljung
 test for my returns (for each stock). Since I have about 3000 stocks in my
 list, I'm not able to perform the test individually for each stock.
 Unfortunately the Box.test only works for univariate series. My goal is to
 get a list with every p-value (from the output) of the 3000 tests (that is
 a
 list with 3000 p-values). Any hint how to do this? I tried to do this with
 the function mapply, but it didn't work.



 Many thanks in advance  best regards

 S.B.




[[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

 ** **


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] searching several subsequences in a single string sequence

2011-09-27 Thread Marcelo Araya

Hi all

 

I am analyzing bird song element sequences. I would like to know how can I
get how many times a given subsequence is found in single string sequence.  

 

 

For example:

 

If I have this single sequence:

 

ABCABAABABABCAB

 

I am looking for the subsequence ABC. Want I need to get here is that the
subsequence is found twice. 

 

Any idea how can I do this?

 

Thanks in advance

 

 

Marcelo Araya-Salas

Ph.D.  Student

Avian Communication and Evolution Lab

Department of Biology

New Mexico State University

Lab: 575-646-4863

 


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] searching several subsequences in a single string sequence

2011-09-27 Thread Ivan Calandra


Hi Marcelo,

Try this:
x - ABCABAABABABCAB
length(gregexpr(pattern=ABC, x)[[1]])

See ?gregexpr for more details (though I admit that it is not easy to 
understand this help page)


HTH,
Ivan

Le 9/27/2011 18:51, Marcelo Araya a écrit :

Hi all



I am analyzing bird song element sequences. I would like to know how can I
get how many times a given subsequence is found in single string sequence.





For example:



If I have this single sequence:



ABCABAABABABCAB



I am looking for the subsequence ABC. Want I need to get here is that the
subsequence is found twice.



Any idea how can I do this?



Thanks in advance





Marcelo Araya-Salas

Ph.D.  Student

Avian Communication and Evolution Lab

Department of Biology

New Mexico State University

Lab: 575-646-4863




[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



--
Ivan CALANDRA
PhD Student
University of Hamburg
Biozentrum Grindel und Zoologisches Museum
Dept. Mammalogy
Martin-Luther-King-Platz 3
D-20146 Hamburg, GERMANY
+49(0)40 42838 6231
ivan.calan...@uni-hamburg.de

**
http://www.for771.uni-bonn.de
http://webapp5.rrz.uni-hamburg.de/mammals/eng/1525_8_1.php

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] searching several subsequences in a single string sequence

2011-09-27 Thread Barry Rowlingson

On Tue, Sep 27, 2011 at 5:51 PM, Marcelo Araya marcelo...@gmail.com wrote:
 Hi all



 I am analyzing bird song element sequences. I would like to know how can I
 get how many times a given subsequence is found in single string sequence.





 For example:



 If I have this single sequence:



 ABCABAABABABCAB



 I am looking for the subsequence ABC. Want I need to get here is that the
 subsequence is found twice.



 Any idea how can I do this?


 gregexpr will return the position and length of multiple matches. And
you can feed it a vector. So:


  songs=c(ABCABAABABABCAB,ABACAB,ABABCABCBC)
  gregexpr(m,songs)
[[1]]
[1]  1 11
attr(,match.length)
[1] 3 3

[[2]]
[1] -1
attr(,match.length)
[1] -1

[[3]]
[1] 3 6
attr(,match.length)
[1] 3 3

- in the first item, it was found at posn 1 and 11
 - in the second it wasnt found at all
 - in the third, it was found at posn 3 and 6

 so just do some apply-ing to the returned list and get the length of
each element. Job done!

Barry

PS bonus points for spotting the hidden prog-rock song title.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Adding axis to an ellipse: ellipse package

2011-09-27 Thread Antoine

Dear list members,

This might be a silly question but I just can't figure it out. I am using
the ellipse package on covariance matrices. I would simply like to plot my
ellipses WITH its two axis ploted as well. These axis represents the 2 eigen
vectors of my matrix and it is important that I can graphically show them.
Is there an easy way to do so?

Many thanks,
Antoine

--
View this message in context: 
http://r.789695.n4.nabble.com/Adding-axis-to-an-ellipse-ellipse-package-tp3847954p3847954.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] merger two 3-d scatter plot

2011-09-27 Thread Uwe Ligges

If you are talking about the scatterplot3d package, and we assume your 
data is in data.frame called dat:



library(scatterplot3d)
s3d - scatterplot3d(dat$obs30, dat$Cases, dat$RANK1)
s3d$points3d(dat$obs30, dat$Cases, dat$RANK2, col=2)

Uwe Ligges



On 26.09.2011 22:08, XINLI LI wrote:


Dear David and R groups:

  I have the data as follows, I want to plot the Rank1 ~ obs30*Cases and
Rank2 ~ obs30*Cases on the same plot as one 3-D scatter plot, how to do
that? Any help is highly appreciated.
//



ID obs30 Cases RANK1 RANK2
1 0.03175 63 82 81
2 0.0 34 1 34
3 0.0 36 2 41
4 0.0 54 3 26
5 0.0 22 4 42
6 0.00746 134 39 32
7 0.0 2 5 53
8 0.01190 168 46 31
9 0.03012 166 78 86
10 0.00775 129 43 37
11 0.01290 155 51 43
12 0.00459 218 24 6
13 0.04348 23 92 73
14 0.02198 182 66 71
15 0.01546 194 60 62
16 0.01370 73 47 40
17 0.00424 236 23 2
18 0.00735 136 31 19
19 0.03030 66 86 85
20 0.03030 33 65 58
21 0.02273 132 59 59
22 0.02439 123 68 68
23 0.08333 24 83 78
24 0.01266 79 50 45
25 0.01024 293 28 4
26 0.00926 108 29 14
27 0.03750 160 85 95
28 0.01290 155 55 55
29 0.00935 107 30 20
30 0.04598 87 89 94
31 0.01087 92 41 36
32 0.0 2 6 50
33 0.01695 118 42 23
34 0.04918 61 88 92

# my test codes


print(scatterplot3d(RANK1 ~ obs30 + Cases, type = h, angle = plot.angle,
color = rgb(0,0,0,0.5),
   pch = 20, cex.symbols=2, col.axis=gray, col.grid=gray))

#

print(scatterplot3d(RANK2 ~ obs30 + Cases, type = h, angle = plot.angle,
color = rgb(0,0,0,0.5),
   pch = 20, cex.symbols=2, col.axis=gray, col.grid=gray))

On Mon, Sep 26, 2011 at 1:33 PM, David Winsemiusdwinsem...@comcast.netwrote:



On Sep 26, 2011, at 3:06 PM, XINLI LI wrote:

Dear R groups:


  I have the data as follows, I want to plot the Rank1 ~ obs30*Cases and
Rank2 ~ obs30*Cases on the same plot as one 3-D scatter plot, how to do
that? Any help is highly appreciated.



There is no Method in that data and your code throws an error.



ID obs30 Cases RANK1 RANK2
1 0.03175 63 82 81
2 0.0 34 1 34
3 0.0 36 2 41
4 0.0 54 3 26
5 0.0 22 4 42
6 0.00746 134 39 32
7 0.0 2 5 53
8 0.01190 168 46 31
9 0.03012 166 78 86
10 0.00775 129 43 37
11 0.01290 155 51 43
12 0.00459 218 24 6
13 0.04348 23 92 73
14 0.02198 182 66 71
15 0.01546 194 60 62
16 0.01370 73 47 40
17 0.00424 236 23 2
18 0.00735 136 31 19
19 0.03030 66 86 85
20 0.03030 33 65 58
21 0.02273 132 59 59
22 0.02439 123 68 68
23 0.08333 24 83 78
24 0.01266 79 50 45
25 0.01024 293 28 4
26 0.00926 108 29 14
27 0.03750 160 85 95
28 0.01290 155 55 55
29 0.00935 107 30 20
30 0.04598 87 89 94
31 0.01087 92 41 36
32 0.0 2 6 50
33 0.01695 118 42 23
34 0.04918 61 88 92

I used the following code for one plot:
print(cloud(RANK1 ~ Cases * obs30, data = xll, groups = Method, screen =
list(z=20, x= -70),
perspective = FALSE, key = list(title = Ranking HOSP, x = 100, y =
100,  corner= c(0, 0), border = TRUE,
points = Rows(trellis.par.get(**superpose.symbol), 1:2

#
##**##**
##**
print(cloud(RANK2 ~ Cases * obs30, data = xll, groups = Method, screen =
list(z=20, x= -70),
perspective = FALSE, key = list(title = Ranking HOSP, x = 100, y =
100,  corner= c(0, 0), border = TRUE,
points = Rows(trellis.par.get(**superpose.symbol), 1:2


[[alternative HTML version deleted]]

__**
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/**listinfo/r-helphttps://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/**
posting-guide.htmlhttp://www.r-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



David Winsemius, MD
West Hartford, CT




[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Counting similar rows

2011-09-27 Thread Metronome123

Jean: Thanks! 

Works great!


Lars
Op 27 sep. 2011 (w39), om 17:22 heeft Jean V Adams [via R] het volgende 
geschreven:

df - as.data.frame(unclass(xt)) 

dfu - unique(df) 
class_cnt - apply(dfu, 1, sum) 
subject_cnt - tabulate(match(apply(df, 1, paste, collapse=-), 
apply(dfu, 1, paste, collapse=-))) 

data.frame(subject_cnt, class_cnt, dfu) 



--
View this message in context: 
http://r.789695.n4.nabble.com/Counting-similar-rows-tp3847051p3847973.html
Sent from the R help mailing list archive at Nabble.com.
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] survival analysis: interval censored data

2011-09-27 Thread Ruth Arias



halo david

when I use type= 'interval'

Call: survfit(formula = Surv(ingreso, fecha, estado, type = interval) ~ 
    categoria)

    categoria=C 
 time n.risk n.event survival std.err lower 95% CI upper 95% CI
 2004  95.00   13.14    0.862  0.0354    0.795    0.934
 2007  31.86    7.19    0.667  0.0695    0.544    0.818
 2008   1.67    1.67    0.000
 NaN   NA   NA

    categoria=E 
 time n.risk n.event survival std.err lower 95% CI upper 95% CI
 2004  112.0   18.47    0.835  0.0351    0.769    0.907
 2005   40.5    1.06    0.813  0.0401    0.738    0.896
 2007   37.5    7.46    0.651  0.0620    0.540    0.785

and when I use just 

Call: survfit(formula = Surv(ingreso, fecha, estado) ~ categoria)

    categoria=C 
 time n.risk n.event entered censored survival std.err lower 95% CI upper 95% CI
 2006 63   5   0   23    0.921  0.0341    0.856    0.990
 2007 35   2  30    1    0.868  0.0483    0.778    0.968
 2008 62   5   1   
 3    0.798  0.0536    0.700    0.910
 2009 55   4   0    5    0.740  0.0570    0.636    0.861
 2010 46   5   0   41    0.660  0.0611    0.550    0.791

    categoria=E 
 time n.risk n.event entered censored survival std.err lower 95% CI upper 95% CI
 2005
 71   7   3    0    0.901  0.0354    0.835    0.973
 2006 67   2   0   22    0.875  0.0391    0.801    0.955
 2007 43   1  36    1    0.854  0.0432    0.774    0.943
 2008 77   5  
 0    8    0.799  0.0469    0.712    0.896
 2009 64   4   1    3    0.749  0.0502    0.657    0.854
 2010 58   7   0   51    0.658  0.0545    0.560    0.774

and I don t know why when I use type = interval it does not survival calculed 
for very year


regards




De: Daniel Malter dan...@umd.edu
Para: r-help@r-project.org
Enviado: martes 27 de septiembre de 2011 7:06
Asunto: Re: [R] survival analysis: interval censored data

Please adhere to the posting guide (i.e., provide a sample of self contained
code and provide the error message). And what does but it is not working
mean? Is there an error code?


rueu wrote:
 

 hello:
 my data looks like:
 time1  time2   event  catagoria
 
 2004    2006        1            C
 2004    2005        0            C
 
 2005    2010        1            E
 
 2007    2009        1            C
 
 2006    2007        0            E
 
 2008    2010        0            C
 

 2008    2010        1            E
 ...
 
 
 and the census interval is 1 year
 
 I have tried  this 
 
 surara-survfit(Surv(time1,time2,event,type=interval)~categoria)
 
 but it is not working
     [[alternative HTML version deleted]]
 
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal,
 self-contained, reproducible code.
 

--
View this message in context: 
http://r.789695.n4.nabble.com/survival-analysis-interval-censored-data-tp3845269p3846205.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] searching several subsequences in a single string sequence

2011-09-27 Thread Jean V Adams

Barry Rowlingson wrote on 09/27/2011 12:06:21 PM:
 
 On Tue, Sep 27, 2011 at 5:51 PM, Marcelo Araya marcelo...@gmail.com 
wrote:
  Hi all
 
 
 
  I am analyzing bird song element sequences. I would like to know how 
can I
  get how many times a given subsequence is found in single string 
sequence.
 
 
 
 
 
  For example:
 
 
 
  If I have this single sequence:
 
 
 
  ABCABAABABABCAB
 
 
 
  I am looking for the subsequence ABC. Want I need to get here is 
that the
  subsequence is found twice.
 
 
 
  Any idea how can I do this?
 
 
  gregexpr will return the position and length of multiple matches. And
 you can feed it a vector. So:
 
 
   songs=c(ABCABAABABABCAB,ABACAB,ABABCABCBC)
   gregexpr(m,songs)
 [[1]]
 [1]  1 11
 attr(,match.length)
 [1] 3 3
 
 [[2]]
 [1] -1
 attr(,match.length)
 [1] -1
 
 [[3]]
 [1] 3 6
 attr(,match.length)
 [1] 3 3
 
 - in the first item, it was found at posn 1 and 11
  - in the second it wasnt found at all
  - in the third, it was found at posn 3 and 6
 
  so just do some apply-ing to the returned list and get the length of
 each element. Job done!
 
 Barry
 
 PS bonus points for spotting the hidden prog-rock song title.


For example,

songs - c(ABCABAABABABCAB, ABACAB, ABABCABCBC)
counts - gregexpr(ABC, songs)
sapply(counts, length)

Jean

P.S.  1981 Genesis album!
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Mahalanobis Distance

2011-09-27 Thread jorgeA

Hello David(s),

First of all, thank you for your help.

I was running some tests, and I wish to know if I have correctly understood
your explanation. Well, when I use rbind(), I get the variables binded by
row, and when I use cbind() I get the variables binded by column.

The dist() function, as the help says, computes and returns the distance
matrix computed by using the specified distance measure to compute the
distances between the rows of a data matrix, so, in that case I use rbind()
(as the help example does).

The mahalanobis() function help says returns the squared Mahalanobis
distance of all rows in x and the vector mu = center with respect to Sigma =
cov., so, here again, the calculations are done by row. Using cbind() I get
one result for each row like this:

mahalanobis(testeCbind, center = colMeans(testeCbind), cov=var(testeCbind))

I get as result 15 values (the number of rows).

With dist(), using euclidean and rbind() I get only one value (because is
calculated by row). 

Thinking on that way, mahalanobis distance is not so aproprietad for my kind
of input data. Am I correct? Or is there a way to make the calculation of
mahalanobis of all points and get only one value as the result of how
distante the variables (subseries) are?

Thank you all again.

Best regars,
Jorge Aikes Junior

--
View this message in context: 
http://r.789695.n4.nabble.com/Mahalanobis-Distance-tp3844960p3848247.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] M-Splines design matrix

2011-09-27 Thread Spencer Graves

library(sos)
???'M-spline'


   This produces ZERO matches.  If you want a nonnegative fit and 
you don't necessarily need M-splines, there are many ways to do that, 
some of which can be found with the sos package (which also includes a 
vignette).


   There are many different packages offering spline capabilities, 
but none seem to mention M-Splines.  From the Wikipedia article, it 
appears that M-splines are standard B-splines with a fit constrained to 
be non-negative.  The construction of a B-spline basis matrix is 
described in section 3.5 of Functional Data Analysis with R and Matlab 
(Ramsay, Hooker and Graves, 2009, Springer).  The fda package includes 
script files to recreate all but one of the 76 figures in the book.  The 
following code (patterned after section 3.5) produces a B-spline matrix:


tst - create.bspline.basis()
predict(tst, seq(0, 1, .1))


   Hope this helps.
   Spencer Graves


On 9/27/2011 5:25 AM, Tigisti G wrote:
 Hi

 Can any one inform me an r package used to fit Cubic M-splines or r functions 
 used to have design and penalty matrices in fitting cubic  M-splines please?


 Thanks,
 Tigist
   [[alternative HTML version deleted]]



 __
 R-help@r-project.org  mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guidehttp://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


-- 
Spencer Graves, PE, PhD
President and Chief Technology Officer
Structure Inspection and Monitoring, Inc.
751 Emerson Ct.
San José, CA 95126
ph:  408-655-4567
web:  www.structuremonitoring.com



[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] my R query

2011-09-27 Thread David Winsemius



On Sep 27, 2011, at 1:47 PM, arindam fadikar wrote:


I have made a level plot in R of a variable using the lattice package.
This grid corresponds to South Asia. I am only interested in viewing
the values of this variable (aerosol optical depth) for certain
countries in South Asia. I have a dummy variable that takes the value
1 for the countries I am interested in and 0 otherwise. Is it possible
for me to colour this part of the grid black or any other colour?

Here is my R code:

levelplot(aod ~ longitude + latitude | factor(day), data = aod_Jan,
 aspect=iso, contour = TRUE, layout=c(1,1))


This was cross-posted on StackOverflow where the user name was  
ridhima.  I posted a response there about 45 minutes ago. Cross- 
posting at short intervals is deprecated on r-help. Obviously when one  
has not gotten an answer in some reasonable period of time, say 48  
hours, it would make sense to repost in on a different list.


--

David Winsemius, MD
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] normalizing a negative binomial distribution and/or incorporating variance structures in a GAMM

2011-09-27 Thread Ben Bolker

Meredith Jantzen mjantzen at uwo.ca writes:

   Hello everyone, Apologies in advance, as this is partially a stats
 question and partially an R question.  I have been using a GAM to
 model the activity level of bats going into and coming out from a
 forested edge.  I had eight microphones set up in a line transect at
 each of eight sites, and I am hoping to construct a model for each
 of 7 species. 

  My count data has a reverse J-shaped skew and is overdispersed with
 a fair amount of zeros, and I haven't found any transformations that
 will completely normalize it (I've tried square roots and logs). 
 Meanwhile, the variance in call numbers  varies between sites and
 between microphones.  I wanted to use a GAMM to incorporate varComb
 and varIdent, but these can only be applied on data with a gaussian
 distribution. 

 Are there any packages I should be looking into that I don't know
 about that will apply a variance structure on a negative binomial
 distribution?  Or is there some transformation that I should be
 using that will solve my normality issues?  I've been searching the
 R-help boards, everything in Zuur and Woods, but I haven't found an
 answer yet. 


  I'm not entirely clear about this, but this question and the
previous question that Simon Wood answered (about neg binom and
GAMM) suggest to me that you might be going in slightly the
wrong direction.

  If your data are non-normally distributed, your choices are
typically (1) pick an alternative family of distributions to
characterize the variation (e.g. neg binomial or ZINB), (2) use
some form of robust estimation (e.g. rlm in the MASS package),
or (3) try to find a transformation of the data that makes
the data normal (and/or homoscedastic, and/or linear with respect
to the predictor variables). Among ecologists #3 is the classical
approach and #1 is the most common modern approach. Combining
#1 and #3 doesn't make that much sense to me.

  One doesn't necessarily expect the variance to be constant
in a negative binomial model; are the *standardized* residuals
heteroscedastic?  (i.e. does the boxplot of residuals(m,type=pearson)
vs site, microphone, or site*microphone combination look
funky?)

  It's not absolutely clear whether you need zero-inflation 
explicitly or not. There are tests for zero-inflation and
overdispersion (see ref below), but I don't know of any
that are implemented in R ... your choices seem to be

* negative binomial in mgcv:gam, without zero-inflation;
* ZINB in pscl, without the sophisticated GAM machinery
  of mgcv (but you can use spline terms via splines::ns(v,n)
where v is the predictor variable and n is the number of
knots -- it just won't do all the slick automatic complexity
selection that mgcv does)
* it looks like the COZIGAM package will do zero-inflated
GAMs, but it doesn't do negative binomials ...


@article{deng_score_2005,
title = {Score tests for zero-inflation and 
   over-dispersion in generalized linear models},
volume = {15},
url = {http://www3.stat.sinica.edu.tw/statistica/j15n1/j15n115/j15n115.html},
journal = {Statistica Sinica},
author = {Deng, D. and Paul, {S.R.}},
year = {2005},
pages = {257–276}
}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] How can I check a package is installed or not?

2011-09-27 Thread Fabrice Tourre

Dear list,

How can I detect a package is installed or not? If not, then install it.

For example, in a script, I want to check the package DESeq is
installed or not. If not, then I will using this script to install it.

source(http://www.bioconductor.org/biocLite.R;)
biocLite(DESeq)

The pseudo script would be like this:

try:
library(DESeq)
catch:
source(http://www.bioconductor.org/biocLite.R;)
biocLite(DESeq)

Thank you

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] inset one map on top of another map

2011-09-27 Thread Jean V Adams

I want to overlay a small inset map on top of another map, but I can't 
figure out how to do it. 
For example, here are two different maps:

# map 1 - Ohio
map(state, region= ohio)

# map 2 - US with Ohio darkened
map(state)
map(state, region=ohio, fill=T, add=T)

I would like to add map 2 as a small inset in the corner of map 1. 
I have tried:

map(state, region= ohio)
par(new=TRUE, mar=c(3, 3, 15, 15))
map(state)
map(state, region=ohio, fill=T, add=T)

but this seems to erase map 1 and replace it with a full size version of 
map 2.

I can successfully overlay an unrelated plot using similar code:

map(state, region= ohio)
par(new=TRUE, mar=c(3, 3, 15, 15))
plot(1:10, 1:10)

So, there must be something about the maps() function that I'm tripping 
over.

Any suggestions?

I am using R for Windows 2.13.0
and the maps package version 2.1-5.

Jean


`·.,,  (((º   `·.,,  (((º   `·.,,  (((º

Jean V. Adams
Statistician
U.S. Geological Survey
Great Lakes Science Center
223 East Steinfest Road
Antigo, WI 54409  USA

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Data import

2011-09-27 Thread B77S

I see what you mean.
Sorry and thanks for pointing that out to me Ben.


bbolker wrote:
 
 B77S bps0002 at auburn.edu writes:
 
 
 I have never used that function, but I know that with read.csv() you can
 do
 the following to select only the columns you want:
 
 chosen_vars - read.csv(Workbook1.csv, header=T)[c(variable1,
 variable3)]
 
 
 
   This is not actually selectively importing: it's importing the
 whole thing and *then* selecting the relevant columns.
   If the original poster is trying to avoid importing the whole
 data set because (for example) it's got a gigantic number of
 columns and will be very slow and/or tax their system, then this
 idiom won't help.
 
   Ben Bolker
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
 


--
View this message in context: 
http://r.789695.n4.nabble.com/Data-import-tp3842196p3848802.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] How can I check a package is installed or not?

2011-09-27 Thread David Winsemius



On Sep 27, 2011, at 3:19 PM, Fabrice Tourre wrote:


Dear list,

How can I detect a package is installed or not? If not, then install  
it.


For example, in a script, I want to check the package DESeq is
installed or not. If not, then I will using this script to install it.

source(http://www.bioconductor.org/biocLite.R;)
biocLite(DESeq)

The pseudo script would be like this:

try:
   library(


Perhaps:

if ( require(DESeq) ) {  do-stuff } else {
   source(http://www.bioconductor.org/biocLite.R;)
   biocLite(DESeq) }

'require' reports success or failure with a logical and loads the  
package if successful. You need to add logical.return=TRUE to a  
'library' call to get avoid getting a character value back.



catch:
   source(http://www.bioconductor.org/biocLite.R;)
   biocLite(DESeq)

Thank you

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


David Winsemius, MD
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] compare proportions

2011-09-27 Thread array chip

Hi, I have a seemingly simple proportional test.  here is the question I am 
trying to answer:
 
There is a test running each day in the lab, the test comes out as
either positive or negative. So at the end of each month, we can calculate a
positive rate in that month as the proportion of positive test results. The
data look like:
 
Month      # positive   # total tests    positive rate
January  24 205  11.7%
February    31    234      13.2%
March 26   227 11.5%
:
:
:
August 42    241 17.4%
 
The total # of positive before August is 182, and the total # of tests
before August is 1526. It appears that from January to July, the positive rate
is between 11% to 13%, the rate in August is up around 17%. So the question is
whether is up in August is statistically significant?
 
I can think of 3 ways to do this test:
 
1.1. Use binom.test(), set “p” as the average positive
rate between January and July (=182/1526):
 
 binom.test(42,241,182/1526)
 
    Exact binomial test
 
data:  42 and 241 
number of successes = 42, number
of trials = 241, p-value = 0.0125
alternative hypothesis: true
probability of success is not equal to 0.1192661 
95 percent confidence interval:
 0.1285821 0.2281769 
sample estimates:
probability of success 
 0.1742739
 
2. 2. Use prop.test(), where I compare the average
positive rate between January  July with the positive rate in August:
 
 prop.test(c(182,42),c(1526,241))
 
    2-sample test for equality of
proportions with continuity correction
 
data:  c(182, 42) out of c(1526, 241) 
X-squared = 5.203, df = 1,
p-value = 0.02255
alternative hypothesis:
two.sided 
95 percent confidence interval:
 -0.107988625 -0.002026982 
sample estimates:
   prop 1    prop 2 
0.1192661 0.1742739
3.   
2. 3. Use prop.test(), where I compare the average
monthly positive rate between January  July with the positive rate in
August. The average monthly # of positives is 182/7=26, the average monthly $
of total tests is 1526/7=216:
 
 prop.test(c(26,42),c(218,241))
 
    2-sample test for equality of
proportions with continuity correction
 
data:  c(26, 42) out of c(218, 241) 
X-squared = 2.3258, df = 1,
p-value = 0.1272
alternative hypothesis:
two.sided 
95 percent confidence interval:
 -0.12375569  0.01374008 
sample estimates:
   prop 1    prop 2 
0.1192661 0.1742739
 
As you can see that the method 3 gave insignificant p value compared to
method 1  2. While I understand each method is testing different hypothesis,
but for the question I am trying to answer (does August have higher positive
rate compare to earlier months?), which method is more relevant?
 
Thanks for any suggestion,
 
John

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] remove NaN from element in a vector in a list

2011-09-27 Thread Ben qant

Hello,

What is the best way to turn a matrix into a list removing NaN's? I'm new to
R...

Start:

 mt = matrix(c(1,4,NaN,5,3,6),2,3)
 mt
 [,1] [,2] [,3]
[1,]1  NaN3
[2,]456

Desired result:

 lst
[[1]]
[1] 1 3

[[2]]
[1] 4 5 6


Thanks!

Ben

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] problem with switch function across R versions 2.10 and 2.13

2011-09-27 Thread jose romero

Hello,

The following piece of code works fine in R.2.10 (ubuntu):

switch(distr,
        normal    = {if (is.infinite(param[desv]))
                 n - c(n,La desv. estándar no puede ser Inf.)
                   if (param[desv]0)
                 n - c(n,La desv. estándar no puede ser 0.)
                   },
        expo    = {if (param[tasa]=0)
                 n - c(n,La tasa debe ser 0.) },
        gamma   = {if (param[s]=0)
                     n - c(n,El parametro s debe ser 0.)
                   if (param[a]=0)
                 n - c(n,El parámetro a debe ser 0.) },
        unifc   = {if (param[min]param[max])
                     n - c(n,Min. no puede ser mayor que Max.)},
        binom   = {if (param[p]0|param[p]1)
                     n - c(n,p debe estar entre 0 y 1)
                     if (param[n]0|!all.equal(param[n],
                        trunc(param[n])))
                        n - c(n,Valor de n no entero o negativo.)
                     if (param[n]max(x))
                        n - c(n,paste(El valor máximo de,nombre,
                     no puede exceder el valor máximo de n.))
                   },
        geom    = {if (param[p]0|param[p]1)
                      n - c(n,p debe estar entre 0 y 1) },
        pois    = {if (param[tasa]0)
                      n - c(n,tasa no puede ser negativa)},
        unifd   = {if (param[min]param[max])
                      n - c(n,minimo no puede ser mayor que maximo) },       
           ,
        stop(paste(Distribución,distr,no reconocida.)) )

However, using R2.13 (under Windows), I get the following error:
Error en .local(x, distr, ...) : 
  duplicado de interuptores por defecto: '' y 'stop(paste...'

The 2.13 help on the function says the following:

In the case of no match, if there is a unnamed element of ... its value is 
returned.  (If there is more than one such argument 
an error is returned.  Before R 2.13.0 the first one would have
been used.)

So it appears to be an issue of the pre or post 2.13 version being used.  
However, i don't see how the above code has more than one unamed element and 
thus how this error is produced.  Can anyone enlighten me on this error?

Thanks in advance,

josé loreto romero palma
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] remove NaN from element in a vector in a list

2011-09-27 Thread R. Michael Weylandt

Try this:

alply(mt, 1, function(x) as.numeric(na.omit(x)))

The as.numeric() addition may be necessary to strip the extra attributes
na.omit() wants to add.

Michael

On Tue, Sep 27, 2011 at 4:02 PM, Ben qant ccqu...@gmail.com wrote:

 Hello,

 What is the best way to turn a matrix into a list removing NaN's? I'm new
 to
 R...

 Start:

  mt = matrix(c(1,4,NaN,5,3,6),2,3)
  mt
 [,1] [,2] [,3]
 [1,]1  NaN3
 [2,]456

 Desired result:

  lst
 [[1]]
 [1] 1 3

 [[2]]
 [1] 4 5 6


 Thanks!

 Ben

[[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] remove NaN from element in a vector in a list

2011-09-27 Thread R. Michael Weylandt

alply is from the plyr package. You'll need to call that if its not already
loaded.

M

On Tue, Sep 27, 2011 at 4:07 PM, R. Michael Weylandt 
michael.weyla...@gmail.com wrote:

 Try this:

 alply(mt, 1, function(x) as.numeric(na.omit(x)))

 The as.numeric() addition may be necessary to strip the extra attributes
 na.omit() wants to add.

 Michael


 On Tue, Sep 27, 2011 at 4:02 PM, Ben qant ccqu...@gmail.com wrote:

 Hello,

 What is the best way to turn a matrix into a list removing NaN's? I'm new
 to
 R...

 Start:

  mt = matrix(c(1,4,NaN,5,3,6),2,3)
  mt
 [,1] [,2] [,3]
 [1,]1  NaN3
 [2,]456

 Desired result:

  lst
 [[1]]
 [1] 1 3

 [[2]]
 [1] 4 5 6


 Thanks!

 Ben

[[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] How can I check a package is installed or not?

2011-09-27 Thread Duncan Murdoch


On 11-09-27 3:19 PM, Fabrice Tourre wrote:

Dear list,

How can I detect a package is installed or not? If not, then install it.


Here's one way.  This might be slow if you have all of CRAN installed, 
but it's quick enough if you only have a few dozen packages:


installed - rownames(installed.packages())

DESeq %in% installed

Another way is to use

if (!require(DESeq)) ...

but that will give warnings if DESeq is not there.

Duncan Murdoch



For example, in a script, I want to check the package DESeq is
installed or not. If not, then I will using this script to install it.

source(http://www.bioconductor.org/biocLite.R;)
biocLite(DESeq)

The pseudo script would be like this:

try:
 library(DESeq)
catch:
 source(http://www.bioconductor.org/biocLite.R;)
 biocLite(DESeq)

Thank you

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] remove NaN from element in a vector in a list

2011-09-27 Thread Ben qant

Excellent! Thank you!

ben

On Tue, Sep 27, 2011 at 2:07 PM, R. Michael Weylandt 
michael.weyla...@gmail.com wrote:

 alply is from the plyr package. You'll need to call that if its not already
 loaded.

 M


 On Tue, Sep 27, 2011 at 4:07 PM, R. Michael Weylandt 
 michael.weyla...@gmail.com wrote:

 Try this:

 alply(mt, 1, function(x) as.numeric(na.omit(x)))

 The as.numeric() addition may be necessary to strip the extra attributes
 na.omit() wants to add.

 Michael


 On Tue, Sep 27, 2011 at 4:02 PM, Ben qant ccqu...@gmail.com wrote:

 Hello,

 What is the best way to turn a matrix into a list removing NaN's? I'm new
 to
 R...

 Start:

  mt = matrix(c(1,4,NaN,5,3,6),2,3)
  mt
 [,1] [,2] [,3]
 [1,]1  NaN3
 [2,]456

 Desired result:

  lst
 [[1]]
 [1] 1 3

 [[2]]
 [1] 4 5 6


 Thanks!

 Ben

[[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.





[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] problem with switch function across R versions 2.10 and 2.13

2011-09-27 Thread Duncan Murdoch


On 11-09-27 4:03 PM, jose romero wrote:

Hello,

The following piece of code works fine in R.2.10 (ubuntu):

switch(distr,
 normal= {if (is.infinite(param[desv]))
  n- c(n,La desv. estándar no puede ser Inf.)
if (param[desv]0)
  n- c(n,La desv. estándar no puede ser0.)
},
 expo= {if (param[tasa]=0)
  n- c(n,La tasa debe ser0.) },
 gamma   = {if (param[s]=0)
  n- c(n,El parametro s debe ser0.)
if (param[a]=0)
  n- c(n,El parámetro a debe ser0.) },
 unifc   = {if (param[min]param[max])
  n- c(n,Min. no puede ser mayor que Max.)},
 binom   = {if (param[p]0|param[p]1)
  n- c(n,p debe estar entre 0 y 1)
  if (param[n]0|!all.equal(param[n],
 trunc(param[n])))
 n- c(n,Valor de n no entero o negativo.)
  if (param[n]max(x))
 n- c(n,paste(El valor máximo de,nombre,
  no puede exceder el valor máximo de n.))
},
 geom= {if (param[p]0|param[p]1)
   n- c(n,p debe estar entre 0 y 1) },
 pois= {if (param[tasa]0)
   n- c(n,tasa no puede ser negativa)},
 unifd   = {if (param[min]param[max])
   n- c(n,minimo no puede ser mayor que maximo) },   
   ,


You have an extra comma beyond the end of this line.

Duncan Murdoch


 stop(paste(Distribución,distr,no reconocida.)) )

However, using R2.13 (under Windows), I get the following error:
Error en .local(x, distr, ...) :
   duplicado de interuptores por defecto: '' y 'stop(paste...'

The 2.13 help on the function says the following:

In the case of no match, if there is a unnamed element of ... its value is 
returned.  (If there is more than one such argument
an error is returned.  Before R 2.13.0 the first one would have
been used.)

So it appears to be an issue of the pre or post 2.13 version being used.  
However, i don't see how the above code has more than one unamed element and 
thus how this error is produced.  Can anyone enlighten me on this error?

Thanks in advance,

josé loreto romero palma
[[alternative HTML version deleted]]




__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] remove NaN from element in a vector in a list

2011-09-27 Thread David Winsemius



On Sep 27, 2011, at 4:02 PM, Ben qant wrote:


Hello,

What is the best way to turn a matrix into a list removing NaN's?  
I'm new to

R...

Start:


mt = matrix(c(1,4,NaN,5,3,6),2,3)
mt

[,1] [,2] [,3]
[1,]1  NaN3
[2,]456


 apply(mt, 1, function(x) x[!is.nan(x)] )
[[1]]
[1] 1 3

[[2]]
[1] 4 5 6

The function is.finite would also remove infinities as well as the NaNs.




Desired result:


lst

[[1]]
[1] 1 3

[[2]]
[1] 4 5 6


Thanks!



David Winsemius, MD
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] How can I check a package is installed or not?

2011-09-27 Thread Fabrice Tourre

Thank you for your sharing

I got it.

On Tue, Sep 27, 2011 at 9:44 PM, David Winsemius dwinsem...@comcast.net wrote:

 On Sep 27, 2011, at 3:19 PM, Fabrice Tourre wrote:

 Dear list,

 How can I detect a package is installed or not? If not, then install it.

 For example, in a script, I want to check the package DESeq is
 installed or not. If not, then I will using this script to install it.

 source(http://www.bioconductor.org/biocLite.R;)
 biocLite(DESeq)

 The pseudo script would be like this:

 try:
   library(

 Perhaps:

 if ( require(DESeq) ) {  do-stuff } else {
               source(http://www.bioconductor.org/biocLite.R;)
               biocLite(DESeq) }

 'require' reports success or failure with a logical and loads the package if
 successful. You need to add logical.return=TRUE to a 'library' call to get
 avoid getting a character value back.

 catch:
   source(http://www.bioconductor.org/biocLite.R;)
   biocLite(DESeq)

 Thank you

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

 David Winsemius, MD
 West Hartford, CT



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] remove NaN from element in a vector in a list

2011-09-27 Thread Hadley Wickham

 apply(mt, 1, function(x) x[!is.nan(x)] )
 [[1]]
 [1] 1 3

 [[2]]
 [1] 4 5 6

You need to be a little careful with apply:

 mt2 - matrix(c(1,4,2,5,3,6),2,3)
 apply(mt2, 1, function(x) x[!is.nan(x)] )
 [,1] [,2]
[1,]14
[2,]25
[3,]36

Depending on the input you will get a list or matrix as output.

Hadley

-- 
Assistant Professor / Dobelman Family Junior Chair
Department of Statistics / Rice University
http://had.co.nz/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] How can I check a package is installed or not?

2011-09-27 Thread Robin Cura

Hi

I'm using this :)

if (is.element('DESeq', installed.packages()[,1]) == FALSE) {
install.packages('DESeq') }

Robin

2011/9/27 Fabrice Tourre fabrice.c...@gmail.com

 Dear list,

 How can I detect a package is installed or not? If not, then install it.

 For example, in a script, I want to check the package DESeq is
 installed or not. If not, then I will using this script to install it.

 source(http://www.bioconductor.org/biocLite.R;)
 biocLite(DESeq)

 The pseudo script would be like this:

 try:
library(DESeq)
 catch:
source(http://www.bioconductor.org/biocLite.R;)
biocLite(DESeq)

 Thank you

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Installing packages in R for UBUNTU

2011-09-27 Thread gbrenes

Hi.

We downloaded R 2.13.1 for UBUNTU.  We try to install several packages:
car, maps, maptools, raster, and we found the following warning or error
message:


Error:

**buffer overflow detected ***=/user/lib/R/bin/exec/R terminated***

We also found the usual error message that the packages are programmed in
an earlier version.

Do any of you know what could we do in order to install these packages
successfully?  Do we need to download a previous version of R for UBUNTU?

Thanks


Gilbert

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Is there a latex summary function in the quantreg package for just 1 tau?

2011-09-27 Thread Tal Galili

Hello dear R help members,

I wish to get a nice LaTeX table for a rq object.
Trying to use the functions I found so far wouldn't work.  I can
start opening the functions up, but I am wondering if I had missed some
function which is the one I should be using.

Here is an example session for a bunch of possible errors:

(Thanks)


data(stackloss)
y - stack.loss
x - stack.x
rq_object - rq(y ~ x, tau = .5)
rq_object_summary - summary(rq_object)

latex(rq_object)
# Error in UseMethod(latex) :
#   no applicable method for 'latex' applied to an object of class rq
latex(rq_object_summary)
# Error in UseMethod(latex) :
#   no applicable method for 'latex' applied to an object of class
summary.rq
latex.summary.rqs(rq_object_summary)
# Error in x$tau : $ operator is invalid for atomic vectors
summary.rqs(rq_object)
# Error in xi$coefficients[, i] : incorrect number of dimensions
summary.rqs(rq_object_summary)
# Error in xi$residuals[, i] : incorrect number of dimensions



Contact
Details:---
Contact me: tal.gal...@gmail.com |  972-52-7275845
Read me: www.talgalili.com (Hebrew) | www.biostatistics.co.il (Hebrew) |
www.r-statistics.com (English)
--

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Installing packages in R for UBUNTU

2011-09-27 Thread Henri-Paul Indiogine

Hi!

2011/9/27  gbre...@ssc.wisc.edu:
 We downloaded R 2.13.1 for UBUNTU.  We try to install several packages:
 car, maps, maptools, raster, and we found the following warning or error
 message:

I suggest to use apt-get (synaptic) as much as possible to install R
packages and then use install.packages() for those that are not in the
Ubuntu repositories.

Henri-Paul

-- 
Henri-Paul Indiogine

Curriculum  Instruction
Texas AM University
TutorFind Learning Centre

Email: hindiog...@gmail.com
Skype: hindiogine
Website: http://people.cehd.tamu.edu/~sindiogine

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Mahalanobis Distance

2011-09-27 Thread David L Carlson

Since you are only looking at the distance between two points, they must
fall on a line so no matter how many values you have for each point, their
dimension is still 1. Mahalanobis distance is a way of measuring distance in
multivariate space when the variables (columns) are correlated with one
another.  In this case, Euclidian distance (which assumes each dimension is
orthogonal to all the others) is inappropriate. With two points and one
dimension, all distance measures are effectively equivalent since they can
be converted to one another by multiplying by an appropriate constant.


-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On
Behalf Of jorgeA
Sent: Tuesday, September 27, 2011 12:08 PM
To: r-help@r-project.org
Subject: Re: [R] Mahalanobis Distance

Hello David(s),

First of all, thank you for your help.

I was running some tests, and I wish to know if I have correctly understood
your explanation. Well, when I use rbind(), I get the variables binded by
row, and when I use cbind() I get the variables binded by column.

The dist() function, as the help says, computes and returns the distance
matrix computed by using the specified distance measure to compute the
distances between the rows of a data matrix, so, in that case I use rbind()
(as the help example does).

The mahalanobis() function help says returns the squared Mahalanobis
distance of all rows in x and the vector mu = center with respect to Sigma =
cov., so, here again, the calculations are done by row. Using cbind() I get
one result for each row like this:

mahalanobis(testeCbind, center = colMeans(testeCbind), cov=var(testeCbind))

I get as result 15 values (the number of rows).

With dist(), using euclidean and rbind() I get only one value (because is
calculated by row). 

Thinking on that way, mahalanobis distance is not so aproprietad for my kind
of input data. Am I correct? Or is there a way to make the calculation of
mahalanobis of all points and get only one value as the result of how
distante the variables (subseries) are?

Thank you all again.

Best regars,
Jorge Aikes Junior

--
View this message in context:
http://r.789695.n4.nabble.com/Mahalanobis-Distance-tp3844960p3848247.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] searching several subsequences in a single string sequence

2011-09-27 Thread Barry Rowlingson

On Tue, Sep 27, 2011 at 6:15 PM, Jean V Adams jvad...@usgs.gov wrote:

 For example,

 songs - c(ABCABAABABABCAB, ABACAB, ABABCABCBC)
 counts - gregexpr(ABC, songs)
 sapply(counts, length)

 That will still return '1' for the case where its not found, because of the -1.

sapply(counts,function(x){sum(x0)})

 will return 0.

 you might get faster performance if you do 'fixed=TRUE' on the
gregexpr call as well.

 P.S.  1981 Genesis album!

 +1 bonus.

Barry

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

1 2 >

1 - 100 of 117 matches

Mail list logo