[R] SNPRelate package error

2013-01-22 Thread sun-ye
Dear,


I am using the R package SNPRelate but I found an error when I run the
following command. Do you know what might be the problem? Thanks in
advance.


 vcf.fn <- system.file("extdata", "str.vcf", package="SNPRelate")
 snpgdsVCF2GDS(vcf.fn, "test.gds")


Start snpgdsVCF2GDS ...
Extracting bi-allelic and polymorhpic SNPs.
Scanning ...
file: D:/Program Files/R/R-2.14.2/library/SNPRelate/extdata/str.vcf
Error in scan.vcf.marker(fn, method) :
  The file (D:/Program Files/R/R-2.14.2/library/SNPRelate/extdata/str.vcf) has 
different numbers of columns.


Best regards


--
Dr. Ye SUN
Key Laboratory of Plant Resources Conservation and Sustainable Utilization
South China Botanical Garden, Chinese Academy of Sciences
Xingke Road 723,Tianhe District, Guangzhou 510650, PR China





__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] simple reshape

2013-01-22 Thread Troels Ring

Dear friends - this is a very simple question -  I have a data frame
'data.frame':   87 obs. of  3 variables:
 $ ID   : int  1 1 1 2 2 2 3 3 3 4 ...
 $ prep : num  1.18 1.38 1.34 1.93 2.38 2.24 1.17 1.13 1.21 1.89 ...
 $ postp: num  0.63 0.71 0.75 1.01 1.12 1.07 0.87 0.64 0.7 0.8 ...

- 29 persons (ID) each measured three times before and after an 
intervention: prep and postp -

I need data rearranged like

ID time val
1  1    prep
1  2    postp
1  1
1  2
1  1
1  2
I cannot make reshape or stack do the trick.

I'm on windows 7
R version 2.15.2 (2012-10-26)

Best wishes
Troels Ring, Nephrology
Aalborg, Denmark



Re: [R] SNPRelate package error

2013-01-22 Thread Pascal Oettli

Hello,

Why do you think it is a package error?

The error message says that the file [...] has different numbers of 
columns. Please check that file first.
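
One quick way to check is to count the fields on each line. This is only a minimal sketch, assuming a tab-delimited file; the small temporary file below is made up for illustration and stands in for the real VCF, and the names (tmp, body, n_fields, bad) are hypothetical:

```r
# Hypothetical stand-in for the real file: the first data line lacks a field
tmp <- tempfile(fileext = ".vcf")
writeLines(c("##fileformat=VCFv4.0",
             "#CHROM\tPOS\tID\tREF\tALT",
             "1\t100\trs1\tA",
             "1\t200\trs2\tG\tT"), tmp)

lines <- readLines(tmp)
body  <- lines[!grepl("^##", lines)]   # drop the "##" meta-information lines

# Number of tab-separated fields on each remaining line
n_fields <- lengths(strsplit(body, "\t", fixed = TRUE))
print(table(n_fields))

# Lines whose field count differs from the most common count
expected <- as.integer(names(which.max(table(n_fields))))
bad <- which(n_fields != expected)
print(body[bad])
```

Note that strsplit() drops a trailing empty field, so a line ending in a tab would be counted one field short.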


Regards,
Pascal




[R] Simple use of dcast (reshape2 package)

2013-01-22 Thread Patrick Connolly
Suppose I have a small dataframe

> aa
     Target Eaten ID
50      TPP     0  1
51      TPP     1  2
52      TPP     3  3
53      TPP     1  4
54      TPP     2  5
50.1    GPA     9  1
51.1    GPA    11  2
52.1    GPA     8  3
53.1    GPA     8  4
54.1    GPA    10  5

And I want to reshape it into 

  ID TPP GPA
1  1   0   9
2  2   1  11
3  3   3   8
4  4   1   8
5  5   2  10

I realise that dcast function in the reshape2 package can handle much
more complicated tasks than that, but I can't make it do a simple one.

If I simply tried 

> dcast(aa, ... ~ Target)
Using ID as value column: use value.var to override.
Aggregation function missing: defaulting to length
  Eaten GPA TPP
1     0   0   1
2     1   0   2
3     2   0   1
4     3   0   1
5     8   2   0
6     9   1   0
7    10   1   0
8    11   1   0
As per the help file, it's giving counts of the numbers in the Eaten
column since that's the default fun.aggregate value.

My questions are: what fun.aggregate would work?  Alternatively, can
value.var be set to something useful?

TIA

-- 
~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.   
   ___Patrick Connolly   
 {~._.~}   Great minds discuss ideas
 _( Y )_ Average minds discuss events 
(:_~*~_:)  Small minds discuss people  
 (_)-(_)  . Eleanor Roosevelt
  
~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.



[R] How to align group based on the common values of two columns in r

2013-01-22 Thread Tammy Ma

Hi,

I ran into this problem:

I have the following feature data frame:


  Feature OS
        4  2
        4  1
        4  3
        1  2
        4  1


What I want to do is automatically create one more column called Group:

  Feature OS Group
        4  2     1
        4  1     2
        4  3     3
        1  2     4
        4  1     2



I don't want to use ifelse, because there are so many combinations of Feature
and OS that I cannot even count them. I just want something that automatically
creates a group indicator based on the distinct combinations of Feature and OS.

Thanks for your help.


Kind regards,
Tammy


  


Re: [R] Ellipse in PCA with parameters a and b defined.

2013-01-22 Thread mary
Ok...
So, in my model a is built from the standard deviation of the first
principal component and b from that of the second, so my x and y should be:
 PCA$scores[, 1], PCA$scores[, 2]
but this way I do not get a single confidence ellipse defined by my parameters;
I get many ellipses instead.

Thanks 
Mary





Re: [R] How to align group based on the common values of two columns in r

2013-01-22 Thread Gerrit Eichner

Hi, Tammy,

maybe you find something interesting looking at

?interaction

and/or try (with df being your data frame)

df$Group <- as.integer( with( df, interaction( Feature, OS)[, drop = TRUE]))
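
As a small worked sketch on the example data from the question (the column name Group2 and the variable key are made up here for illustration): the group numbers from interaction() follow the order of the factor levels, so they identify the combinations correctly but need not match the 1, 2, 3, 4, 2 numbering shown in the question. If numbering by order of first appearance matters, match() against the unique keys is one possible alternative.

```r
df <- data.frame(Feature = c(4, 4, 4, 1, 4), OS = c(2, 1, 3, 2, 1))

# Group ids via interaction(): one id per combination, numbered by level order
df$Group <- as.integer(with(df, interaction(Feature, OS, drop = TRUE)))

# Alternative: number the combinations in order of first appearance
key <- paste(df$Feature, df$OS, sep = ".")
df$Group2 <- match(key, unique(key))

print(df)
```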


 HtH  --  Gerrit



Re: [R] Simple use of dcast (reshape2 package)

2013-01-22 Thread D. Rizopoulos
you could try the following:

# read the example data; the first column becomes the row names
DF <- read.table(textConnection(
"     Target Eaten ID
50      TPP     0  1
51      TPP     1  2
52      TPP     3  3
53      TPP     1  4
54      TPP     2  5
50.1    GPA     9  1
51.1    GPA    11  2
52.1    GPA     8  3
53.1    GPA     8  4
54.1    GPA    10  5"), header = TRUE)

# one row per ID, one column per Target
newDF <- as.data.frame(with(DF, tapply(Eaten, list(ID, Target), c)))
newDF$ID <- unique(DF$ID)
newDF


I hope it helps.

Best,
Dimitris





-- 
Dimitris Rizopoulos
Assistant Professor
Department of Biostatistics
Erasmus University Medical Center

Address: PO Box 2040, 3000 CA Rotterdam, the Netherlands
Tel: +31/(0)10/7043478
Fax: +31/(0)10/7043014
Web: http://www.erasmusmc.nl/biostatistiek/


Re: [R] simple reshape

2013-01-22 Thread Jim Lemon



Hi Troels,
With a bit of extra processing I think rep_n_stack (prettyR) will do 
what you want:


# rep_n_stack comes from the prettyR package
library(prettyR)
# fake some data
tr.df<-data.frame(ID=rep(1:29,each=3),prep=runif(87,1,3),postp=runif(87,0.5,1.5))
# add a repeat number
tr.df$repno<-rep(1:3,29)
# get the reshaped data frame
trlong.df<-rep_n_stack(tr.df,to.stack=2:3,
 stack.names=c("prepost","value"))
# reorder it
trlong.df[order(trlong.df$ID,trlong.df$repno),]

   ID repno prepost     value
1   1     1    prep 2.9158693
88  1     1   postp 0.9932342
2   1     2    prep 1.2852817
89  1     2   postp 0.8187234
3   1     3    prep 2.5771902
90  1     3   postp 1.0033936
4   2     1    prep 2.2969320
91  2     1   postp 0.6837140
5   2     2    prep 1.3083553
92  2     2   postp 1.4537096
6   2     3    prep 2.8654184
93  2     3   postp 1.0880881
...
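
If installing prettyR is not an option, the same long layout can probably be built with base R alone. The following is only a sketch under that assumption, not the rep_n_stack implementation; it uses the same made-up data and column names as above, with set.seed added purely for reproducibility:

```r
set.seed(1)  # only so that the fake data are reproducible
tr.df <- data.frame(ID = rep(1:29, each = 3),
                    prep = runif(87, 1, 3),
                    postp = runif(87, 0.5, 1.5))
tr.df$repno <- rep(1:3, 29)

# Stack the two measurement columns on top of each other
trlong.df <- data.frame(ID      = rep(tr.df$ID, 2),
                        repno   = rep(tr.df$repno, 2),
                        prepost = rep(c("prep", "postp"), each = nrow(tr.df)),
                        value   = c(tr.df$prep, tr.df$postp))

# Sort so that each person's prep/postp pairs sit next to each other
trlong.df <- trlong.df[order(trlong.df$ID, trlong.df$repno), ]
print(head(trlong.df))
```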

Jim



Re: [R] Simple use of dcast (reshape2 package)

2013-01-22 Thread Gerrit Eichner

Hi, Patrick,

I think (with reshape from the stats package)

reshape( aa, idvar = "ID", v.names = "Eaten", timevar = "Target",
 direction = "wide")

does the trick (followed by renaming the columns of the resulting data 
frame).
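
Spelled out on the example data, including the renaming step, this might look as follows. It is a sketch only: the data frame is retyped by hand without the original row names, and the name wide is made up for illustration.

```r
aa <- data.frame(Target = rep(c("TPP", "GPA"), each = 5),
                 Eaten  = c(0, 1, 3, 1, 2, 9, 11, 8, 8, 10),
                 ID     = rep(1:5, 2),
                 stringsAsFactors = FALSE)

wide <- reshape(aa, idvar = "ID", v.names = "Eaten", timevar = "Target",
                direction = "wide")

# reshape() names the new columns Eaten.TPP and Eaten.GPA; strip the prefix
names(wide) <- sub("^Eaten\\.", "", names(wide))
rownames(wide) <- NULL
print(wide)
```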


 Hth  --  Gerrit




[R] Concatenate two lists, list by list

2013-01-22 Thread Alaios
Dear all,
I would like to concatenate the lists below

str(Part2$dataset)
List of 3
 $ : num [1:16001] 0 0 0 0 0 0 0 0 0 0 ...
 $ : num [1:16001] 0 0 0 0 0 0 0 0 0 0 ...
 $ : num [1:16001] 0 0 0 0 0 0 0 0 0 0 ...



str(Part1$dataset)
List of 3
 $ : num [1:16001] 0 0 0 0 0 0 0 0 0 0 ...
 $ : num [1:16001] 0 0 0 0 0 0 0 0 0 0 ...
 $ : num [1:16001] 0 0 0 0 0 0 0 0 0 0 ...


I tried concatenating those with:


 str(cbind(Part1$dataset,Part2$dataset))
List of 6
 $ : num [1:16001] 0 0 0 0 0 0 0 0 0 0 ...
 $ : num [1:16001] 0 0 0 0 0 0 0 0 0 0 ...
 $ : num [1:16001] 0 0 0 0 0 0 0 0 0 0 ...
 $ : num [1:16001] 0 0 0 0 0 0 0 0 0 0 ...
 $ : num [1:16001] 0 0 0 0 0 0 0 0 0 0 ...
 $ : num [1:16001] 0 0 0 0 0 0 0 0 0 0 ...
 - attr(*, "dim")= int [1:2] 3 2


but I want something different: to concatenate them element by element, 
so that I end up with something looking like this

str(concatenatedLists)

List of 3
 $ : num [1:32002] 0 0 0 0 0 0 0 0 0 0 ...
 $ : num [1:32002] 0 0 0 0 0 0 0 0 0 0 ...
 $ : num [1:32002] 0 0 0 0 0 0 0 0 0 0 ...
 - attr(*, dim)= int [1:2] 3 2


Is there anything that can do that in R?

Regards
Alex


[R] FactoMineR

2013-01-22 Thread Dániel Kehl
Dear Users,

I installed R Commander and the FactoMineR plug-in. Everything is fine, I can 
see the new menu,
I can import datasets, but if I want to use any of the items in the FactoMineR 
menu, I get the following error:

Error in get(".activeDataSet") : object '.activeDataSet' not found

even if there is an active dataset (if there is none, all the menu items are 
grey of course).

I have R version 2.15.2 using Windows 7 but experienced the same on other 
machines.

Please let me know if you have any idea!

Thanks a lot

daniel


Re: [R] Concatenate two lists, list by list

2013-01-22 Thread PIKAL Petr
Hi

Maybe you could use mapply

mapply(c, Part1$dataset,Part2$dataset)
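
One detail to watch, shown here on small made-up stand-ins for Part1$dataset and Part2$dataset (the names part1, part2, as_matrix and as_list are hypothetical): with its default SIMPLIFY = TRUE, mapply() collapses results of equal length into a matrix, so to get back a list of three longer vectors it seems safer to pass SIMPLIFY = FALSE (or to use Map()).

```r
# Small stand-ins for Part1$dataset and Part2$dataset
part1 <- list(c(1, 2), c(3, 4), c(5, 6))
part2 <- list(c(10, 20), c(30, 40), c(50, 60))

# Default SIMPLIFY = TRUE: equal-length results are bound into a matrix
as_matrix <- mapply(c, part1, part2)

# SIMPLIFY = FALSE keeps a list with one concatenated vector per element
as_list <- mapply(c, part1, part2, SIMPLIFY = FALSE)
str(as_list)
```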

Regards
Petr



[R] New book announcement: R and Data Mining - Examples and Case Studies

2013-01-22 Thread Yanchang Zhao
R and Data Mining: Examples and Case Studies
Author: Yanchang Zhao
Publisher: Academic Press, Elsevier
Publish date: December 2012
ISBN: 978-0-12-396963-7
Length: 256 pages
URL: http://www.rdatamining.com/books/rdm

This book introduces the use of R for data mining, with examples and
case studies. It contains 1) examples on decision trees, random
forest, regression, clustering, outlier detection, time series
analysis, association rules, text mining and social network analysis;
and 2) three real-world case studies.

Table of Contents and Abstracts:
http://www.rdatamining.com/books/rdm/toc

R Code and Data for the book:
http://www.rdatamining.com/books/rdm/code

Sample pages on Google Books:
http://books.google.com.au/books?id=FEOh08LBD9UC&printsec=frontcover&source=gbs_ge_summary_r&cad=0#v=onepage&q&f=false

Buy the book on Amazon:
http://www.amazon.com/Data-Mining-Examples-Case-Studies/dp/0123969638



Re: [R] Regex for ^ (the caret symbol)?

2013-01-22 Thread S Ellison
 

 -Original Message-
  So what is the special behavior of the ^ symbol when not at 
 the beginning of the string that occurs when it is not escaped?
 
 I think it retains its meaning as an assertion that it occurs 
 at the beginning of the line, and so a pattern like a^b 
 could never match anything.  

... unless a or b are newlines and you are matching multi-line expressions, 
when ^ and $ match before and after line breaks as well as beginning and end of 
string.
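
A few calls that illustrate the behaviour described above. This is a sketch using perl = TRUE throughout so that the PCRE rules apply; the variable names are made up for illustration.

```r
x <- "a^b"

# Unescaped: ^ is still an anchor, so the pattern cannot match mid-string
unescaped <- grepl("a^b", x, perl = TRUE)

# Escaped, or matched as a fixed string: ^ is a literal caret
escaped <- grepl("a\\^b", x, perl = TRUE)
literal <- grepl("a^b", x, fixed = TRUE)

# Multi-line mode (?m): ^ also matches just after a newline
y <- "a\nb"
multiline  <- grepl("(?m)a\\n^b", y, perl = TRUE)
singleline <- grepl("a\\n^b", y, perl = TRUE)

print(c(unescaped = unescaped, escaped = escaped, literal = literal,
        multiline = multiline, singleline = singleline))
```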

S Ellison



[R] Approximating discrete distribution by continuous distribution

2013-01-22 Thread Michael Haenlein
Dear all,

I have a discrete distribution showing how age is distributed across a
population using a certain set of bands:

Age <- matrix(c(74045062, 71978405, 122718362, 40489415), ncol=1,
dimnames=list(c("<18", "18-34", "35-64", "65+"),c()))
Age_dist <- Age/sum(Age)

For example I know that 23.94% of all people are between 0-18 years, 23.28%
between 18-34 years and so forth.

I would like to find a continuous approximation of this discrete
distribution in order to estimate the probability that a person is for
example 16 years old.

Is there some automatic way in R through which this can be done? I tried a
Kernel density estimation of the histogram but this does not seem to
provide what I'm looking for.

Thanks very much for your help,

Michael


Michael Haenlein
Associate Professor of Marketing
ESCP Europe
Paris, France



Re: [R] Approximating discrete distribution by continuous distribution

2013-01-22 Thread Barry Rowlingson
On Tue, Jan 22, 2013 at 11:49 AM, Michael Haenlein
haenl...@escpeurope.eu wrote:

 I would like to find a continuous approximation of this discrete
 distribution in order to estimate the probability that a person is for
 example 16 years old.

 Given that people age continuously (and continually...), you sound
like you are trying to replace one discrete distribution with another
(discretised by year).

 A continuous distribution would give you, for example, the
probability that a person is between 16.0 and 16.1 years old.

Barry



Re: [R] Approximating discrete distribution by continuous distribution

2013-01-22 Thread Prof Brian Ripley



This is not really an R question, but a statistics one.  It is almost 
guesswork: if for example these were drivers in the UK, the answer is 0. 
 So you need to supply some information about the shape of the 
distribution of <18 year olds.


You have estimates of the cumulative distribution function at c(0, 18, 
35, 65, Inf) (or some better upper limit).  You want to interpolate it. 
 You could use linear interpolation (approx[fun]) or a monotone spline 
interpolation (spline[fun]) or any other interpolation method which 
meets your needs.  But whatever you use, you will be supplying a lot of 
information not actually in your data.
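
To make the interpolation idea concrete, here is a minimal sketch using the counts from the question. The upper age limit of 100 is an assumption made only so that the last band has a finite end, and the names (counts, breaks, cdf, F_lin, F_spl) are hypothetical. Both results depend entirely on the chosen interpolation, which is the point made above.

```r
counts <- c(74045062, 71978405, 122718362, 40489415)
breaks <- c(0, 18, 35, 65, 100)   # 100 is an assumed upper age limit
cdf    <- c(0, cumsum(counts) / sum(counts))

# Two interpolated CDFs: piecewise linear and monotone (Hyman) spline
F_lin <- approxfun(breaks, cdf)
F_spl <- splinefun(breaks, cdf, method = "hyman")

# P(16 <= age < 17) under each interpolation
p16_lin <- F_lin(17) - F_lin(16)
p16_spl <- F_spl(17) - F_spl(16)
print(c(linear = p16_lin, spline = p16_spl))
```

With linear interpolation every single year within the first band simply receives 1/18 of that band's share.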








--
Brian D. Ripley,                  rip...@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595



Re: [R] FactoMineR

2013-01-22 Thread John Fox
Dear Daniel,

There were changes to the new version 1.9-3 of the Rcmdr so that it conforms to 
CRAN policies. These changes can break plug-ins that haven't been modified for 
compatibility. 

One change is that the environment in which the Rcmdr stores state information 
is no longer put on the search path. That's apparently preventing the 
FactoMineR plug-in from finding the active data set. The solution is for the 
author to replace get(".activeDataSet") with something like 
get(getRcmdr(".activeDataSet")). I'll correspond with the package author to 
suggest this.
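
The lookup problem can be illustrated without the Rcmdr at all. The sketch below uses plain base R; state_env and get_state() are hypothetical stand-ins for the Rcmdr's private environment and for getRcmdr(), not the package's actual code.

```r
# Hypothetical stand-in for the environment holding the Rcmdr's state
state_env <- new.env()
assign(".activeDataSet", "mydata", envir = state_env)

# Hypothetical accessor playing the role of getRcmdr()
get_state <- function(name) get(name, envir = state_env)

mydata <- data.frame(x = 1:3)

# state_env is not on the search path, so a plain lookup finds nothing
visible <- exists(".activeDataSet")

# Going through the accessor yields the data set's name, then the data set
active <- get(get_state(".activeDataSet"))
print(visible)
print(active)
```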

I apologize for the difficulties introduced by these changes.

John


John Fox
Sen. William McMaster Prof. of Social Statistics
Department of Sociology
McMaster University
Hamilton, Ontario, Canada
http://socserv.mcmaster.ca/jfox/



Re: [R] FactoMineR

2013-01-22 Thread Dániel Kehl
Dear John,

great news, thank you for your kind answer and quick response. I am sure that 
the author is going to do his best as well.

Another good example of why I love R! :)

Have a nice day,

daniel



Re: [R] Simple use of dcast (reshape2 package)

2013-01-22 Thread Ista Zahn
Hi,

ID is not the value column. Your casting call should be

dcast(aa, ... ~ Target, value.var = "Eaten")

Best,
Ista



Re: [R] applying a formula from text

2013-01-22 Thread IlyaNovikov
Dear Arun,

Thank you very much.
Yours,
Ilya

On Sun, Jan 20, 2013 at 9:18 PM, arun kirshna [via R] 
ml-node+s789695n4656104...@n4.nabble.com wrote:


 Dear Ilya,

 Please check these links.


 http://stackoverflow.com/questions/4556524/whats-the-way-to-learn-r

 http://www.r-bloggers.com/learn-to-use-r-for-free-with-coursera/
 http://stackoverflow.com/questions/192369/books-for-learning-the-r-language

 You may also benefit from "An Introduction to R", available from
 http://cran.r-project.org/manuals.html.

 A.K.


 - Original Message -
 From: IlyaNovikov [hidden email]
 To: [hidden email]
 Cc:
 Sent: Sunday, January 20, 2013 1:21 AM
 Subject: Re: [R] applying a formula from text

 Dear Arun,
 I am a novice in R, but some friends of mine who have used R for a long time
 were not able to help me. Thank you very much.
 As for your question about why I need this, I think there can be situations
 where the condition that I have to apply depends on the data.
 Maybe you can recommend a good text for learning to program in R.
 Thank you again.
 Ilya Novikov

 On Sat, Jan 19, 2013 at 8:02 PM, arun kirshna [via R] 
 [hidden email] wrote:

  HI,
  Not sure why you need to do this:
   s- x5
   h(1,eval(parse(text=s)))
  #[1] FALSE
  A.K.
 

 



 --
 Sincerely,
 Ilya Novikov








-- 
Sincerely,
Ilya Novikov




--
View this message in context: 
http://r.789695.n4.nabble.com/applying-a-formula-from-text-tp4656045p4656238.html
Sent from the R help mailing list archive at Nabble.com.
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] How to remove the vertical space between two graps

2013-01-22 Thread Purna chander
Hi,

I have created a barplot using the following code.

a <- c(11,23,15,34,42,31)
m <- matrix(a,nrow=2)
m[2,] <- (-1)*m[2,]

par(mar=c(4,4,4,0))
barplot(m[2,],horiz=T)

par(mar=c(4,0,4,2))
barplot(m[1,],horiz=T,col="black")

and the plot obtained is shown in plot1.tiff.

I do not want to see the gap (vertical space) between the two graphs.
How can I remove it?


I also tried to achieve my goal in a single plot, using
this code:

a <- c(11,23,15,34,42,31)
m <- matrix(a,nrow=2)
m[2,] <- (-1)*m[2,]

barplot(m,horiz=T,beside=T)

and the plot obtained is shown in plot2.tiff.

In this second attempt I am able to place the bars next to each other
using the beside=T argument. However, it fails when I use the beside=F
argument (which gives plot3.tiff).

Can you suggest how to achieve my goal (similar to plot2 but with no
vertical space)?

Regards,
Purna
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Erro message in glmmADMB

2013-01-22 Thread peixotop
Hello everybody,

I am using glmmADMB and when I run some models, I receive the following
message:


Error in glmmadmb(eumencells ~ 1 + (1 | owners), data = pred3, family =
nbinom,  :
The function maximizer failed (couldn't find STD file)
In addition: Lost warning messages:
running command 'C:\Windows\system32\cmd.exe /c
C:/Users/helenametal/Documents/R/win-library/2.15/glmmADMB/bin/windows32/glmmadmb.exe
-maxfn 500 -maxph 5 -noinit -shess' had status 1

Does anyone know what this means and why it happens?

Thanks a lot!

Maria




--
View this message in context: 
http://r.789695.n4.nabble.com/Erro-message-in-glmmADMB-tp4656253.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Remove my adress from mailing list

2013-01-22 Thread M. Maurice
Hello!

I would like my email address to be removed from the R-help mailing list.

Thanks!
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] solve equations

2013-01-22 Thread paladini

Hello !
I have a question that is more mathematical than statistical. I have a 
formula:  P=R*T/(v-b) -a/(sqrt(T)*V*(V+b)) and I want to solve the 
equation for V , in terms of V= . Is this possible with R, or do I have 
to use another program, perhaps Octave?


 thanking you in anticipation

Claudia

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] solve equations

2013-01-22 Thread Berend Hasselman

On 22-01-2013, at 14:20, paladini palad...@beuth-hochschule.de wrote:

 Hello !
 I have a question that is more mathematical than statistical. I have a formula:  
 P=R*T/(v-b) -a/(sqrt(T)*V*(V+b)) and I want to solve the equation for V , in 
 terms of V= . Is this possible with R, or do I have to use another program, 
 perhaps Octave?
 

Have a look at uniroot.
Since this is a single equation there is probably no need to look at more 
high-powered alternatives such as packages nleqslv or BB.
And since you can rewrite your formula as a quadratic, you can also solve it with 
a simple function.
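
A minimal sketch of the uniroot approach, under stated assumptions. If the lowercase v in the formula is a separate known quantity, the equation is quadratic in V as noted above; the sketch instead assumes v is a typo for V, in which case the equation is a cubic in V and a numerical root finder is convenient. All constants below are illustrative placeholders, not values from the original post, and Tk stands in for T to avoid masking R's shorthand for TRUE.

```r
# Illustrative constants (assumptions, not from the thread)
R  <- 8.314     # gas constant
Tk <- 300       # temperature
P  <- 1e5       # pressure
a  <- 6.46      # hypothetical attraction parameter
b  <- 2.97e-5   # hypothetical co-volume parameter

# Residual of P = R*T/(V-b) - a/(sqrt(T)*V*(V+b)); a root in V solves the equation
f <- function(V) R * Tk / (V - b) - a / (sqrt(Tk) * V * (V + b)) - P

# Bracket the gas-like root around the ideal-gas volume R*T/P.
# uniroot's tol is an absolute tolerance on V, so it is tightened here
# because V itself is a small number.
V0  <- R * Tk / P
sol <- uniroot(f, interval = c(0.5 * V0, 2 * V0), tol = 1e-12)
sol$root
```

uniroot needs an interval on which f changes sign, so the bracket has to be chosen with some knowledge of where the physically meaningful root lies.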

Berend


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Percentiles with R for a big data.frame

2013-01-22 Thread Simonas Kecorius
Hey Duncan,

I have no idea either what formula OpenOffice uses for quantiles. I
checked one data series of 24 values, calculating the quantiles with both
OpenOffice and R, and the results are identical. The problem arises when I
try to implement the quantile calculation in this form:
dat2 <- with(dat1,aggregate(cbind(dat1[,1:71]),by=list(newID),quantiles,0.1,type=4))
This code does not generate an error, but I guess it does not give the right
result either.
So my question is:
how can I calculate quantiles for a big data.frame in R (71 columns and
288 rows)? I need to take 24 rows, calculate the quantiles, then take the next
24 rows, and so on, for all 71 columns.

Thanks in advance.




2013/1/22 Duncan Murdoch murdoch.dun...@gmail.com

 On 13-01-21 6:41 PM, Simonas Kecorius wrote:

 Dear R users,

 I came up to a problem dealing with percentiles in R.

 From my previous questions: I do have a big data.frame, with lots of
 columns and rows. The following command enables me to calculate means for
 all data frame.

 dat1$newID <- rep(1:(nrow(dat1)/12),each=12) #if nrow(dat1)/12 is integer

 dat2 <- with(dat1,aggregate(cbind(dat1[,1:71]),by=list(newID),mean))

 What I need is to calculate percentiles for each group (there are 12
 values in a group). I tried the following:

 duomenai <- with(dat1,aggregate(cbind(dat1[,1:71]),by=list(newID),quantiles,0.1,type=4))


 You didn't define quantiles, so that won't work.  Assuming that's a typo,
 and you meant quantile...



 First, is this syntax right?
 Secondly, I tried to calculate percentiles using OpenOffice and there is
 disagreement between the values. If I do the calculation for a single row of
 numbers, then the R and OpenOffice numbers coincide, but for a data.frame it
 seems that something goes wrong.


 There are lots of different formulas for empirical quantiles.  The ones
 available in R are described in the ?quantile help topic.  What formula
 does OpenOffice use?

 Duncan Murdoch




-- 
Simonas Kecorius

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Ellipse in PCA with parameters a and bdefined.

2013-01-22 Thread David L Carlson
Try x <- mean(PCA$scores[,1]) and y <- mean(PCA$scores[,2]) which should be
the same as x <- 0, y <- 0 within rounding error.
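
A small sketch of that suggestion, assuming PCA was produced by princomp(). The built-in USArrests data is only a stand-in for the poster's data, and taking a and b as one standard deviation of each component is an assumption about what was intended.

```r
PCA <- princomp(USArrests, cor = TRUE)   # stand-in data, not the poster's
sc  <- PCA$scores[, 1:2]

# Centre of the ellipse: the score means, which are zero up to rounding error
x0 <- mean(sc[, 1])
y0 <- mean(sc[, 2])

# Semi-axes: one standard deviation along PC1 and PC2 (an assumed choice)
a <- sd(sc[, 1])
b <- sd(sc[, 2])

# A single ellipse drawn from its parametric form
theta <- seq(0, 2 * pi, length.out = 200)
plot(sc, asp = 1, xlab = "PC1", ylab = "PC2")
lines(x0 + a * cos(theta), y0 + b * sin(theta), col = "red")
```

Using one centre (rather than each observation's scores as x and y) gives one ellipse instead of many. A one-standard-deviation ellipse is not a confidence region at a stated level; under a bivariate normal assumption, multiplying a and b by sqrt(qchisq(0.95, 2)) would give an approximate 95% region.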

--
David L Carlson
Associate Professor of Anthropology
Texas A&M University
College Station, TX 77843-4352

 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-project.org]
 On Behalf Of mary
 Sent: Tuesday, January 22, 2013 3:40 AM
 To: r-help@r-project.org
 Subject: Re: [R] Ellipse in PCA with parameters a and bdefined.
 
 Ok...
 So, in my model a is built using the standard deviation of the first
 principal component and b using that of the second, so my x and y should
 be:
  PCA$scores[, 1], PCA$scores[, 2]
 but this way I do not get a single confidence region based on my
 parameters, but many ellipses.
 
 Thanks
 Mary
 
 
 

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Approximating discrete distribution by continuous distribution

2013-01-22 Thread peter dalgaard

On Jan 22, 2013, at 13:45 , Prof Brian Ripley wrote:

 On 22/01/2013 11:49, Michael Haenlein wrote:
 Dear all,
 
 I have a discrete distribution showing how age is distributed across a
 population using a certain set of bands:
 
 Age <- matrix(c(74045062, 71978405, 122718362, 40489415), ncol=1,
 dimnames=list(c("<18", "18-34", "35-64", "65+"),c()))
 Age_dist <- Age/sum(Age)
 
 For example I know that 23.94% of all people are between 0-18 years, 23.28%
 between 18-34 years and so forth.
 
 I would like to find a continuous approximation of this discrete
 distribution in order to estimate the probability that a person is for
 example 16 years old.
 
 Is there some automatic way in R through which this can be done? I tried a
 Kernel density estimation of the histogram but this does not seem to
 provide what I'm looking for.
 
 This is not really an R question, but a statistics one.  It is almost 
 guesswork: if for example these were drivers in the UK, the answer is 0.  So 
 you need to supply some information about the shape of the distribution of 
 <18 year olds.
 
 You have estimates of the cumulative distribution function at c(0, 18, 35, 
 65, Inf) (or some better upper limit).  You want to interpolate it.  You 
 could use linear interpolation (approx[fun]) or a monotone spline 
 interpolation (spline[fun]) or any other interpolation method which meets 
 your needs.  But whatever you use, you will be supplying a lot of information 
 not actually in your data.


Agreed. The linear interpolation method is sometimes described as the sum 
polygon, and sort of assumes that there is a uniform distribution of ages in 
each range. I.e., the number of 16 year olds would be 1/18 of the 0-17 y.o. 
However, I'd feel somewhat uneasy about doing this with such wide age-bands.

There is also the option of fitting a standard distribution like the Weibull to 
the data and using that. The mle() function should do this if you write out the 
log-likelihood using something like 

dmultinom(Age, prob=diff(pweibull(c(0,18,35,65,Inf), shape, scale)), log=TRUE)

With a quarter of a billion observations, the fit might be less than perfect, 
but on the other hand, extracting more than two parameters from four data 
points sounds a bit ominous.
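
Both ideas can be sketched as follows. This is only an illustration under assumptions: the upper age limit of 100 is made up, the first band is read as ages 0 to 17, and the Weibull fit uses optim() on log-parameters as a simple substitute for the mle() call mentioned above.

```r
Age <- c("<18" = 74045062, "18-34" = 71978405,
         "35-64" = 122718362, "65+" = 40489415)
breaks <- c(0, 18, 35, 65, 100)   # 100 is an assumed upper age limit

# Sum polygon: linear interpolation of the cumulative distribution,
# i.e. ages are treated as uniform within each band
cdf <- approxfun(breaks, c(0, cumsum(Age) / sum(Age)))
p16_linear <- cdf(17) - cdf(16)   # P(16 <= age < 17)

# Weibull fitted to the binned counts via the multinomial log-likelihood.
# Optimising over log(shape) and log(scale) keeps both parameters positive.
negll <- function(par) {
  p <- diff(pweibull(c(0, 18, 35, 65, Inf),
                     shape = exp(par[1]), scale = exp(par[2])))
  -dmultinom(Age, prob = p, log = TRUE)
}
fit   <- optim(c(log(1.5), log(45)), negll)   # starting values are guesses
shape <- exp(fit$par[1])
scale <- exp(fit$par[2])
p16_weibull <- pweibull(17, shape, scale) - pweibull(16, shape, scale)

c(linear = p16_linear, weibull = p16_weibull)
```

The two numbers will generally differ, which illustrates the point made above: either answer depends heavily on assumptions that are not in the four data points.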

-- 
Peter Dalgaard, Professor
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd@cbs.dk  Priv: pda...@gmail.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How to remove the vertical space between two graps

2013-01-22 Thread David L Carlson
Most attachments are automatically stripped from r-help so we cannot see
your results. You may be able to get what you want with pyramid.plot in
package plotrix by changing the default values (it is designed for
population pyramids) or you may be able to get there using the layout()
function in base graphics before your plotting commands. Alternatively you
can set up a plot and then use the polygon() function to place the
rectangles where you want them.
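
A rough sketch of the layout() route, using the data from the question. It assumes the goal is two horizontal bar charts that meet at a common zero line; the margin values and the use of xaxs = "i" are one possible choice, not the only one.

```r
a <- c(11, 23, 15, 34, 42, 31)
m <- matrix(a, nrow = 2)

layout(matrix(1:2, nrow = 1))                 # two panels side by side

# Left panel: second row drawn leftwards, no right margin.
# xaxs = "i" stops R from padding the x-axis range, so zero sits on the edge.
old <- par(mar = c(4, 4, 4, 0), xaxs = "i")
mid_left <- barplot(-m[2, ], horiz = TRUE, xlim = c(-max(m), 0))

# Right panel: first row drawn rightwards, no left margin
par(mar = c(4, 0, 4, 2))
mid_right <- barplot(m[1, ], horiz = TRUE, xlim = c(0, max(m)), col = "black")

par(old)                                      # restore the changed settings
layout(1)
```

Because the inner margins are zero and both x-axes end exactly at zero, the two sets of bars touch in the middle of the figure.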

--
David L Carlson
Associate Professor of Anthropology
Texas A&M University
College Station, TX 77843-4352


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Help with interpolation

2013-01-22 Thread Jessica Streicher
Next time, please provide sample data in a form we can easily read in (see 
?dput, for example).

If I understand this correctly:

yourData <- read.table(header=TRUE, text="
date days rate
1996_01_02  15 5.74590
1996_01_02  50 5.67332
1996_01_02  78 5.60888
1996_01_02 169 5.47376
1996_01_02 260 5.35267
1996_01_02 351 5.27619

1996_01_03  14 5.74740
1996_01_03  49 5.67226
1996_01_03  77 5.60371
1996_01_03 168 5.47058
1996_01_03 259 5.34662
1996_01_03 350 5.26630
")

results <- sapply(unique(yourData$date), function(thisDate){
    subSet <- yourData[yourData$date==thisDate,]
    appr <- approx(subSet$days, subSet$rate, xout=seq(0, 360, by=30))
    rates <- appr$y
    names(rates) <- appr$x
    rates
})
colnames(results) <- unique(yourData$date)

This gives 13 results per date though, and it can't interpolate the first and 
last value. If you need those values that are not in-between, try spline 
instead of approx (you never specified how you wanted to interpolate).
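
To illustrate the difference at the ends, here is a small sketch for the first date only. The days and rate values are read from the posted table; the choice of a natural spline and of rule = 2 are assumptions about what might be acceptable, not something the original poster asked for.

```r
days <- c(15, 50, 78, 169, 260, 351)
rate <- c(5.74590, 5.67332, 5.60888, 5.47376, 5.35267, 5.27619)
maturities <- seq(0, 360, by = 30)

# Linear interpolation: NA outside the observed range [15, 351]
lin <- approx(days, rate, xout = maturities)$y

# Linear interpolation with rule = 2: the end values are held constant
lin2 <- approx(days, rate, xout = maturities, rule = 2)$y

# Natural cubic spline: extrapolates beyond the observed range
spl <- spline(days, rate, xout = maturities, method = "natural")$y

data.frame(maturities, lin, lin2, spl)
```

Any value outside the observed maturities is an extrapolation, so whether holding the end value constant or extending a spline is more sensible depends on the application.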

On 17.01.2013, at 15:50, beanbandit wrote:

 Hi guys,
 
 I need to interpolate values for the zero-coupon yield curve. The following data
 are given:
 
 
 
 date        days  rate
 
 1996 01 02    15  5.74590
 1996 01 02    50  5.67332
 1996 01 02    78  5.60888
 1996 01 02   169  5.47376
 1996 01 02   260  5.35267
 1996 01 02   351  5.27619
 
 1996 01 03    14  5.74740
 1996 01 03    49  5.67226
 1996 01 03    77  5.60371
 1996 01 03   168  5.47058
 1996 01 03   259  5.34662
 1996 01 03   350  5.26630
 
 For every day I have to interpolate 10 values, for example for maturities of
 30, 60 or 90 days. I have to interpolate data for a one-year period, 10
 interpolated values a day, which equals 3600 values. 
 
 What's the easiest way to implement this in R?
 
 Please help!
 
 
 
 

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Ellipse in PCA with parameters a and bdefined.

2013-01-22 Thread mary
Thank you David,
that was my first idea, but I don't know whether it is right, statistically
speaking!





--
View this message in context: 
http://r.789695.n4.nabble.com/Ellipse-in-PCA-with-parameters-a-and-b-defined-tp4656215p4656274.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Percentiles with R for a big data.frame

2013-01-22 Thread David Winsemius


On Jan 22, 2013, at 5:58 AM, Simonas Kecorius wrote:


Hey Duncan,

I have no idea either what formula OpenOffice uses for quantiles. I
checked one data series of 24 values, calculating the quantiles with both
OpenOffice and R, and the results are identical. The problem arises when I
try to implement the quantile calculation in this form:
dat2 <- with(dat1,aggregate(cbind(dat1[,1:71]),by=list(newID),quantiles,0.1,type=4))
This code does not generate an error, but I guess it does not give the right
result either.


You guess? What result and what is right?


So my question is:
how can I calculate quantiles for a big data.frame in R (71 columns and
288 rows)? I need to take 24 rows, calculate the quantiles, then take the next
24 rows, and so on, for all 71 columns.



You have already been told that you are misspelling the name of the R  
function.


The other open question in my mind is whether you were hoping for  
something other than a single quantile (in this case the 10th  
percentile), or perhaps wanted the quantiles that would divide your  
data into deciles?


If you want to do the calculation within groups then the second  
argument to `aggregate` must specify the grouping. By design  
`aggregate` will apply the function on all columns.
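
A minimal sketch of that, assuming blocks of 24 consecutive rows form the groups. The random data frame is only a stand-in for the 288 x 71 data described in the question.

```r
set.seed(1)
dat1 <- as.data.frame(matrix(rnorm(288 * 71), nrow = 288))   # stand-in data

# Grouping variable: 1 for rows 1-24, 2 for rows 25-48, and so on
newID <- rep(seq_len(nrow(dat1) / 24), each = 24)

# One value per group and column: the 10th percentile.
# probs and type are passed through aggregate() to quantile().
q10 <- aggregate(dat1, by = list(newID = newID),
                 FUN = quantile, probs = 0.1, type = 4)
dim(q10)   # 12 groups; newID plus 71 variables

# Several quantiles at once: each result column is then a matrix
deciles <- aggregate(dat1[, 1:2], by = list(newID = newID),
                     FUN = quantile, probs = seq(0.1, 0.9, by = 0.1), type = 4)
```

Note the function name is quantile, and that newID is kept outside dat1 here so that it is not itself aggregated as a data column.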

--
David.




David Winsemius, MD
Alameda, CA, USA

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] A smart way to use $ in data frame

2013-01-22 Thread Yuan, Rebecca
Hello Greg,

Thanks very much!

This helps!

Cheers,

Rebecca

From: Greg Snow [mailto:538...@gmail.com]
Sent: Friday, January 18, 2013 5:17 PM
To: Yuan, Rebecca
Cc: R help
Subject: Re: [R] A smart way to use $ in data frame

The important thing to understand is that $ is a shortcut for [[ and you are 
moving into the realm where a shortcut is the longest distance between 2 points 
(see fortune(312)).

So your code can be something like:

state <- 'oldstate'
balance <- 'oldbalance'
dataa[[balance]][ dataa[[state]]=='AR' ]

You may also benefit from learning to use tools like with and subset 
(though subset has its own complications when used inside of other functions) 
or grep and match to find the columns of interest.
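
As a sketch of how this generalises, the column names can be passed to a small function as character strings. The helper name balance_for is hypothetical, only the two relevant columns of each data frame are reproduced, and the NA marks a balance that is missing in the posted table.

```r
dataa <- data.frame(newstate   = c("AR", "VA", "WA", "AR"),
                    newbalance = c(1170, 4565, 2726, 2700))
datab <- data.frame(oldstate   = c("AR", "WA", "VA", "AR"),
                    oldbalance = c(1234, NA, 2345, 5673))   # NA: missing in the post

# Hypothetical helper: [[ accepts a column name held in a variable, $ does not
balance_for <- function(df, state_col, balance_col, state = "AR") {
  df[[balance_col]][df[[state_col]] == state]
}

balance_for(dataa, "newstate", "newbalance")   # 1170 2700
balance_for(datab, "oldstate", "oldbalance")   # 1234 5673
```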

On Fri, Jan 18, 2013 at 12:40 PM, Yuan, Rebecca 
rebecca.y...@bankofamerica.com wrote:
Hello all,

I have a data frame dataa:

    newdate newstate newid newbalance newaccounts
1 31DEC2001       AR     1       1170          61
2 31DEC2001       VA     2       4565          54
3 31DEC2001       WA     3       2726          35
4 31DEC2001       AR     3       2700          35

The following gives me the balance of state AR:

dataa$newbalance[dataa$newstate == 'AR']
1170
2700

Now, I have another data frame datab. It is very similar to dataa, 
except that the names of the columns are different, and the order of the columns 
is different:

oldstate olddate oldbalance oldid oldaccounts
1 AR   31DEC20121234 7  40
2 WA 31DEC2012 3  30
3 VA   31DEC20122345 5  23
3 AR   31DEC20125673 5  23

datab$oldbalance[datab$oldstate== 'AR' ]
1234
5673

Could I have a way to quote

data$balance[data$state == 'AR']

in general, where balance=oldbalance, state=oldstate when data=dataa, and 
balance = newbalance, state = newstate when data=datab ?

Thanks very much!

Cheers,

Rebecca


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] plot two time series with different length and different starting point in one figure.

2013-01-22 Thread Yuan, Rebecca
Hello,

I do have two different time series A and B, they are different in length and 
starting point. A starts in Jan, 2012 and ends in Dec, 2012 and B starts in 
March, 2012 and ends in Nov, 2012.

How can I plot those two series A and B in the same plot? I.E., from Jan. 2012 
- Feb, 2012, it would have one data point from A and from Mar, 2012-Nov, 2012, 
it would have two data points from A and B, and in December 2012, it would have 
one data point from A.

Thanks very much!

Cheers,

Rebecca



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] plot two time series with different length and different starting point in one figure.

2013-01-22 Thread PIKAL Petr
Hi

 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-
 project.org] On Behalf Of Yuan, Rebecca
 Sent: Tuesday, January 22, 2013 4:07 PM
 To: R help
 Subject: [R] plot two time series with different length and different
 starting point in one figure.
 
 Hello,
 
 I do have two different time series A and B, they are different in
 length and starting point. A starts in Jan, 2012 and ends in Dec, 2012
 and B starts in March, 2012 and ends in Nov, 2012.
 
 How can I plot those two series A and B in the same plot? I.E., from
 Jan. 2012 - Feb, 2012, it would have one data point from A and from
 Mar, 2012-Nov, 2012, it would have two data points from A and B, and in
 December 2012, it would have one data point from A.

Merge those 2 series.

?merge
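
A minimal sketch of that idea, assuming both series are data frames with a date column called time and a value column called balance. These column names and the numbers are made up for illustration.

```r
# Stand-in data: A runs Jan-Dec 2012, B runs Mar-Nov 2012
A <- data.frame(time = seq(as.Date("2012-01-01"), by = "month", length.out = 12),
                balance = 100 + 1:12)
B <- data.frame(time = seq(as.Date("2012-03-01"), by = "month", length.out = 9),
                balance = 90 + 1:9)

# Outer join on the shared time column; the suffixes keep the two
# identically named value columns apart, and all = TRUE keeps months
# that appear in only one series (filled with NA for the other)
AB <- merge(A, B, by = "time", all = TRUE, suffixes = c(".A", ".B"))

plot(AB$time, AB$balance.A, type = "b",
     ylim = range(c(AB$balance.A, AB$balance.B), na.rm = TRUE),
     xlab = "Month", ylab = "Balance")
lines(AB$time, AB$balance.B, type = "b", col = "red")
legend("topleft", legend = c("A", "B"), col = c("black", "red"),
       lty = 1, pch = 1)
```

The by argument names the column to match on (the time column in both data frames), not the value column.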

Regards
Petr

 

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] plot two time series with different length and different starting point in one figure.

2013-01-22 Thread Yuan, Rebecca
Hello Petr,

Because the two time series have the same column names, I get an error message like this:



 m1 <- merge(A, B, by.x = "time", by.y = "balance")
Error in fix.by(by.x, x) : 'by' must specify uniquely valid column(s)


I want to plot A and B in one plot to compare the difference between them.

Any other thoughts?

Thanks,

Rebecca



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Erro message in glmmADMB

2013-01-22 Thread Ben Bolker
peixotop peixotop at leuphana.de writes:


 I am using glmmADMB and when I run some models, I receive the following
 message:
 
 Error in glmmadmb(eumencells ~ 1 + (1 | owners), data = pred3, family =
 nbinom,  :
 The function maximizer failed (couldn't find STD file)
 In addition: Lost warning messages:
 running command 'C:\Windows\system32\cmd.exe /c
 C:/Users/helenametal/Documents/R/win-library/2.15/glmmADMB/bin/windows32/glmmadmb.exe
 -maxfn 500 -maxph 5 -noinit -shess' had status 1

  Sorry, this is not nearly enough information  for diagnosis.
This message just means that *something* went wrong during the
optimization step (I do appreciate that it would be good to
improve the error messages, although there may not be that
much more information available).

  Please (1) follow-up to r-sig-mixed-mod...@r-project.org
and (2) give more complete information on the full model you
ran, contents of pred3, etc. (see e.g. http://tinyurl.com/reproducible-000)

Here's a minimal example that shows that a model of the
form you present *could* work:
 
pred3 <- data.frame(owners=rep(letters[1:20],each=20))
set.seed(1001)
u <- rnorm(20,sd=2)
pred3$eumencells <- rnbinom(nrow(pred3),mu=exp(1.5+u),size=2)
library(glmmADMB)
glmmadmb(eumencells ~ 1 + (1|owners),family="nbinom",data=pred3)

-- although it doesn't work very well -- it essentially estimates
the random effects as zero, lumps the among-owner variance into
the NB variance, and mis-estimates the intercept.  I don't blame
glmmADMB for this, though, it's a small data set and a tough problem.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Assistant

2013-01-22 Thread Adelabu Ahmmed
Good-day Sir,

I am an R user trying to estimate the parameters of a beta distribution for a 
particular dataset, but I get this error, which is not clear to me: (Initial 
value in vmmin is not finite)
 beta.fit <- fitdistr(data,densfun=dbeta,shape1=value , shape2=value)
 Kindly assist.
 Expecting your reply.
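
For reference, a hedged sketch of one way such a fit is commonly set up with fitdistr() from the MASS package. The message means the optimizer could not evaluate a finite log-likelihood at the starting values; with a beta density this commonly happens when some observations are 0, 1 or outside (0, 1), or when the starting shapes are not sensible. The simulated data and the moment-based starting values below are assumptions, not the poster's data.

```r
library(MASS)

set.seed(42)
x <- rbeta(200, shape1 = 2, shape2 = 5)   # simulated stand-in data

# The beta log-density is only finite strictly inside (0, 1)
stopifnot(all(x > 0 & x < 1))

# Method-of-moments starting values for the two shape parameters
m <- mean(x)
v <- var(x)
k <- m * (1 - m) / v - 1
start <- list(shape1 = m * k, shape2 = (1 - m) * k)

# Starting values go in a named list; lower bounds keep the shapes positive
beta.fit <- fitdistr(x, densfun = "beta", start = start,
                     lower = c(0.001, 0.001))
beta.fit
```

If the real data contain exact zeros or ones, they would need to be handled (or a different model considered) before a beta fit can work.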

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Simple use of dcast (reshape2 package)

2013-01-22 Thread arun
Hi,

This could be done with ?aggregate()
res <- aggregate(aa$Eaten,by=list(ID=aa$ID),FUN=function(x) x)
res1 <- data.frame(ID=res[,1],data.frame(res[[2]]))
 names(res1)[2:3] <- unique(aa$Target)
 res1
#  ID TPP GPA
#1  1   0   9
#2  2   1  11
#3  3   3   8
#4  4   1   8
#5  5   2  10
A.K.
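
For comparison, a sketch of two other routes to the same table: base reshape(), and dcast() with the value column named explicitly. The data frame is retyped from the question below (row names omitted), and the dcast call only runs if reshape2 is installed.

```r
aa <- data.frame(Target = rep(c("TPP", "GPA"), each = 5),
                 Eaten  = c(0, 1, 3, 1, 2, 9, 11, 8, 8, 10),
                 ID     = rep(1:5, times = 2))

# Base R: one row per ID, one Eaten column per Target
wide <- reshape(aa, idvar = "ID", timevar = "Target", direction = "wide")
names(wide) <- sub("^Eaten\\.", "", names(wide))   # Eaten.TPP -> TPP, etc.
wide

# reshape2: put ID on the left of the formula and name the value column,
# so that no aggregation function is needed
if (requireNamespace("reshape2", quietly = TRUE)) {
  print(reshape2::dcast(aa, ID ~ Target, value.var = "Eaten"))
}
```

With `... ~ Target`, both Eaten and ID end up on the row side of the formula, which is why dcast falls back to counting; naming ID as the row variable and Eaten as value.var avoids that.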




- Original Message -
From: Patrick Connolly p_conno...@slingshot.co.nz
To: R-help r-help@r-project.org
Cc: 
Sent: Tuesday, January 22, 2013 4:23 AM
Subject: [R] Simple use of dcast (reshape2 package)

Suppose I have a small dataframe

 aa
     Target Eaten ID
50      TPP     0  1
51      TPP     1  2
52      TPP     3  3
53      TPP     1  4
54      TPP     2  5
50.1    GPA     9  1
51.1    GPA    11  2
52.1    GPA     8  3
53.1    GPA     8  4
54.1    GPA    10  5

And I want to reshape it into 

  ID TPP GPA
1  1   0   9
2  2   1  11
3  3   3   8
4  4   1   8
5  5   2  10

I realise that dcast function in the reshape2 package can handle much
more complicated tasks than that, but I can't make it do a simple one.

If I simply tried 

 dcast(aa, ... ~ Target)
Using ID as value column: use value.var to override.
Aggregation function missing: defaulting to length
  Eaten GPA TPP
1     0   0   1
2     1   0   2
3     2   0   1
4     3   0   1
5     8   2   0
6     9   1   0
7    10   1   0
8    11   1   0

As per the help file, it's giving counts of the numbers in the Eaten
column since that's the default fun.aggregate value.

My questions are: what fun.aggregate would work?  Alternatively, can
value.var be set to something useful?

TIA

-- 
~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.  
   ___    Patrick Connolly  
{~._.~}                   Great minds discuss ideas    
_( Y )_               Average minds discuss events 
(:_~*~_:)                  Small minds discuss people  
(_)-(_)                            . Eleanor Roosevelt
      
~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Remove my adress from mailing list

2013-01-22 Thread arun
HI,

Please check the link:
https://stat.ethz.ch/mailman/listinfo/r-help


At the end, there is an option to unsubscribe:
To unsubscribe from R-help, get a password reminder,
or change your subscription options enter your subscription
email address:
Hope it helps:

A.K.



- Original Message -
From: M. Maurice m.mauric...@yahoo.de
To: R-help@r-project.org R-help@r-project.org
Cc: 
Sent: Tuesday, January 22, 2013 7:13 AM
Subject: [R] Remove my adress from mailing list

Hello!

I would like my email address to be removed from the R-help mailing list.

Thanks!
    [[alternative HTML version deleted]]


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How to align group based on the common values of two columns in r

2013-01-22 Thread arun
Hi,

I am not sure about the logic behind the creation of the groups, in particular how you 
want to assign the group number to a particular combination of Feature and OS.
One possible way would be:
 dat1$Group<-paste(dat1[,1],dat1[,2],sep="")
 dat1
#  Feature OS Group
#1   4  2    42
#2   4  1    41
#3   4  3    43
#4   1  2    12
#5   4  1    41
A.K.
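
If the group should be a running number in order of first appearance, as in the desired output, one possible sketch is to build a key per row and look it up among the unique keys. The `"_"` separator and the name `key` are assumptions for illustration; the separator only guards against ambiguous pastes such as 1 and 12 versus 11 and 2.

```r
# Example data from the post
dat1 <- data.frame(Feature = c(4, 4, 4, 1, 4), OS = c(2, 1, 3, 2, 1))

# One key per Feature/OS combination
key <- paste(dat1$Feature, dat1$OS, sep = "_")

# Position of each key among the distinct keys, in order of first appearance
dat1$Group <- match(key, unique(key))
print(dat1)
```

This needs no ifelse chain and should scale to any number of combinations.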






- Original Message -
From: Tammy Ma metal_lical...@live.com
To: r-help@r-project.org r-help@r-project.org
Cc: 
Sent: Tuesday, January 22, 2013 4:28 AM
Subject: [R] How to align group based on the common values of two columns in r


HI,

I met this problem:

I have the feature data frame:


   Feature     OS
     4              2
     4              1
     4              3
     1              2
     4              1


What I want to do is to automatically create one more column called Group:

   Feature     OS      Group
     4              2         1
     4              1         2
     4              3         3
     1              2         4
     4              1         2



I don't want to use ifelse, because I have so many combinations of Feature and OS that I 
cannot even count them.  I just want something that automatically creates a group 
indicator based on the different combinations of Feature and OS.

Thanks for your help.


Kind regards,
Tammy


                          
    [[alternative HTML version deleted]]


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] simple reshape

2013-01-22 Thread arun
Hi,

You could also do this by:
set.seed(15)
tr.df<-data.frame(ID=rep(1:29,each=3),prep=runif(87,1,3),postp=runif(87,0.5,1.5))
tr.df$time<-1:87
res<- reshape(tr.df, varying=2:3, v.names="value", 
times=c("prep","postp"),idvar="time",timevar="prepost",direction="long")
res<-res[order(res$ID,res$time),]
 row.names(res)<-1:nrow(res)
 head(res,4)
#  ID time prepost value
#1  1    1    prep 2.2042281
#2  1    1   postp 1.3553657
#3  1    2    prep 1.3900879
#4  1    2   postp 0.8674933
A.K.
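
For comparison, the same long layout can be sketched without any reshaping function by interleaving the two columns directly. The data are simulated as in the replies, and the names `repno`, `time` and `val` are assumptions chosen to match the layout Troels asked for (time 1 = prep, time 2 = postp).

```r
set.seed(15)
tr.df <- data.frame(ID = rep(1:29, each = 3),
                    prep = runif(87, 1, 3),
                    postp = runif(87, 0.5, 1.5))
tr.df$repno <- rep(1:3, times = 29)  # which of the three measurements

# t() puts prep/postp in rows, so as.vector() reads prep1, postp1, prep2, ...
long <- data.frame(
  ID    = rep(tr.df$ID, each = 2),
  repno = rep(tr.df$repno, each = 2),
  time  = rep(1:2, times = nrow(tr.df)),
  val   = as.vector(t(as.matrix(tr.df[, c("prep", "postp")])))
)
head(long, 4)
```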



- Original Message -
From: Jim Lemon j...@bitwrit.com.au
To: Troels Ring tr...@gvdnet.dk
Cc: r-help@r-project.org
Sent: Tuesday, January 22, 2013 4:46 AM
Subject: Re: [R] simple reshape

On 01/22/2013 07:19 PM, Troels Ring wrote:
 Dear friends - this is a very simple question - I have a data frame
 'data.frame': 87 obs. of 3 variables:
 $ ID : int 1 1 1 2 2 2 3 3 3 4 ...
 $ prep : num 1.18 1.38 1.34 1.93 2.38 2.24 1.17 1.13 1.21 1.89 ...
 $ postp: num 0.63 0.71 0.75 1.01 1.12 1.07 0.87 0.64 0.7 0.8 ...

 - 29 persons (ID) each measured three times before and after an
 intervention: prep and postp -
 I need data rearranged like

 ID time val
 1 1 prep
 1 2 postp
 1 1
 1 2
 1 1
 1 2
 I cannot make reshape or stack do the trick.

Hi Troels,
With a bit of extra processing I think rep_n_stack (prettyR) will do 
what you want:

library(prettyR)
# fake some data
tr.df<-data.frame(ID=rep(1:29,each=3),prep=runif(87,1,3),postp=runif(87,0.5,1.5))
# add a repeat number
tr.df$repno<-rep(1:3,29)
# get the reshaped data frame
trlong.df<-rep_n_stack(tr.df,to.stack=2:3,
  stack.names=c("prepost","value"))
# reorder it
trlong.df[order(trlong.df$ID,trlong.df$repno),]

     ID repno prepost     value
1    1     1    prep 2.9158693
88   1     1   postp 0.9932342
2    1     2    prep 1.2852817
89   1     2   postp 0.8187234
3    1     3    prep 2.5771902
90   1     3   postp 1.0033936
4    2     1    prep 2.2969320
91   2     1   postp 0.6837140
5    2     2    prep 1.3083553
92   2     2   postp 1.4537096
6    2     3    prep 2.8654184
93   2     3   postp 1.0880881
...

Jim


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Assistant

2013-01-22 Thread Jessica Streicher
You're not giving people much to work with. I googled the error, and it seems 
to come from the call to optim and likely has to do with bad starting 
parameters.

That said, the documentation of fitdistr doesn't suggest it even supports 
"dbeta"; only "beta" is mentioned.

On 22.01.2013, at 17:07, Adelabu Ahmmed wrote:

 Good-day Sir,
 
 I am R.Language users but am try to  estimate parameter of beta distribution 
 particular dataset but give this error, which is not clear to me: (Initial 
 value in vmmin is not finite)
 beta.fit <- fitdistr(data,densfun=dbeta,shape1=value , shape2=value)
 kindly assist.
 expecting your reply:
 

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Assistant

2013-01-22 Thread Rui Barradas

Hello,

You are calling the function incorrectly. In the case of a beta fit, 
densfun should be the quoted string "beta" and the initial parameter 
values are elements of a named list. Like this:



library(MASS)

x <- rbeta(1000, shape1 = 2, shape2 = 0.5)
fitdistr(x, densfun = "beta", start = list(shape1 = 1, shape2 = 1))


As for your error, I only got it if the data clearly can not fit a beta.

y <- rgamma(1000, shape = 2, rate = 0.5)
fitdistr(y, densfun = "beta", start = list(shape1 = 1, shape2 = 1))
Error in optim(x = c(6.19809706003757, 2.32632108817696, 
3.60844436009277,  :

  initial value in 'vmmin' is not finite


So revise the way you call fitdistr and then, if the error persists, 
revise the parametric distribution to be fitted.
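
A plausible reason for the message, sketched below under the assumption that the optimiser minimises a negative log-likelihood: the beta density is zero outside [0, 1], so any observation outside that interval makes the starting objective infinite. The helper `negloglik` is hypothetical and only mimics that objective; it is not the code MASS uses internally.

```r
set.seed(42)
x <- rbeta(1000, shape1 = 2, shape2 = 0.5)   # lies inside [0, 1]
y <- rgamma(1000, shape = 2, rate = 0.5)     # many values above 1

# Hypothetical stand-in for the objective: negative beta log-likelihood
negloglik <- function(par, data) {
  -sum(dbeta(data, shape1 = par[1], shape2 = par[2], log = TRUE))
}

range(y)                          # shows values outside [0, 1]
negloglik(c(1, 1), x)             # finite starting value
negloglik(c(1, 1), y)             # Inf: the optimiser cannot start here
```

Checking `range(data)` before fitting, and rescaling or choosing another distribution when values fall outside [0, 1], may therefore be a quick first diagnostic.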



Hope this helps,

Rui Barradas

Em 22-01-2013 16:07, Adelabu Ahmmed escreveu:

Good-day Sir,

I am R.Language users but am try to  estimate parameter of beta distribution particular 
dataset but give this error, which is not clear to me: (Initial value in 
vmmin is not finite)
  beta.fit <- fitdistr(data,densfun=dbeta,shape1=value , shape2=value)
  kindly assist.
  expecting your reply:



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] c(), rbind and cbind functions - why type of resulting object is double

2013-01-22 Thread Lourdes Peña Castillo
Hello Everyone,

I am using R 2.15.0 and I came across this behaviour. I was wondering
why I don't get an integer vector or an integer matrix with the following
code:

> z <- c(1, 2:0, 3, 4:8)
> typeof(z)
[1] "double"

> z <- rbind(1, 2:0, 3, 4:8)
Warning message:
In rbind(1, 2:0, 3, 4:8) :
  number of columns of result is not a multiple of vector length (arg 2)
> typeof(z)
[1] "double"

> z <- matrix(c(1, 2:0, 3, 4:8), nrow = 5)
> typeof(z)
[1] "double"


Shouldn't the type be integer? According to the online help, if everything is
integer the output should be integer.
But if I do this, I get an integer matrix.

> z <- matrix(1:20, nrow = 5)
> typeof(z)
[1] "integer"

Thanks!

Lourdes

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] c(), rbind and cbind functions - why type of resulting object is double

2013-01-22 Thread William Dunlap
> I was wondering
> why I don't get an integer vector or an integer matrix with the following
> code:
> > z <- c(1, 2:0, 3, 4:8)
> > typeof(z)
> [1] "double"

It is because the literals 1 and 3 have type double.  Append L to make
them literal integers.
   > typeof(c(1L, 2:0, 3L, 4:8))
   [1] "integer"
The colon function (:) returns an integer vector if it can do so without giving
a numerically incorrect answer.

   > typeof(1.0:3.0)
   [1] "integer"
   > typeof(1.5:3.5)
   [1] "double"
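
The coercion rule can be checked step by step. The short sketch below uses hypothetical object names (`mixed`, `ints`, `z`) and also shows `as.integer()` as one way to get an integer matrix when the literals cannot be rewritten with the L suffix.

```r
# A bare numeric literal is double; the L suffix makes it integer
typeof(1)
typeof(1L)

# c() promotes everything to the "highest" type present
mixed <- c(1, 2:0, 3, 4:8)       # contains double literals -> double
ints  <- c(1L, 2:0, 3L, 4:8)     # all integer -> integer
typeof(mixed)
typeof(ints)

# Explicit conversion before building the matrix
z <- matrix(as.integer(mixed), nrow = 5)
typeof(z)
```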

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] c(), rbind and cbind functions - why type of resulting object is double

2013-01-22 Thread Patrick Burns

One place that talks about what Bill says is:

http://www.burns-stat.com/documents/tutorials/impatient-r/more-r-key-objects/more-r-numbers/

Pat

On 22/01/2013 17:35, William Dunlap wrote:

>> I was wondering
>> why I don't get an integer vector or and integer matrix with the following
>> code:
>>
>> z <- c(1, 2:0, 3, 4:8)
>> typeof(z)
>>
>> [1] "double"
>
> It is because the literals 1 and 3 have type double.  Append L to make
> them literal integers.
> typeof(c(1L, 2:0, 3L, 4:8))
>    [1] "integer"
> The colon function (:) returns an integer vector if it can do so without giving
> a numerically incorrect answer.
>
> typeof(1.0:3.0)
>    [1] "integer"
> typeof(1.5:3.5)
>    [1] "double"

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com





--
Patrick Burns
pbu...@pburns.seanet.com
twitter: @portfolioprobe
http://www.portfolioprobe.com/blog
http://www.burns-stat.com
(home of 'Impatient R' and 'The R Inferno')

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] change confidence interval line length in barplot2 (plotrix package)

2013-01-22 Thread Martin Batholdy
Hi,

is there any way to change the width of the horizontal line of confidence 
intervals
in the barplot2 function in the plotrix package (independent of the width of 
the bars)?


example code:

library(plotrix)
# Example with confidence intervals and grid
hh <- t(VADeaths)[, 1]
mybarcol <- "gray20"
ci.l <- hh * 0.85
ci.u <- hh * 1.15
mp <- barplot2(hh, beside = TRUE,
        col = c("lightblue", "mistyrose",
                "lightcyan", "lavender"),
        legend = colnames(VADeaths), ylim = c(0, 20),
        main = "Death Rates in Virginia", font.main = 4,
        sub = "Faked 95 percent error bars", col.sub = mybarcol,
        cex.names = 1.5, plot.ci = TRUE, ci.l = ci.l, ci.u = ci.u)



thanks!
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] plot two time series with different length and different starting point in one figure.

2013-01-22 Thread David Winsemius

On Jan 22, 2013, at 7:07 AM, Yuan, Rebecca wrote:

 Hello,
 
 I do have two different time series A and B, they are different in length and 
 starting point. A starts in Jan, 2012 and ends in Dec, 2012 and B starts in 
 March, 2012 and ends in Nov, 2012.
 
 How can I plot those two series A and B in the same plot? I.E., from Jan. 
 2012 - Feb, 2012, it would have one data point from A and from Mar, 2012-Nov, 
 2012, it would have two data points from A and B, and in December 2012, it 
 would have one data point from A.

You could set the xlim argument to c( min(timeA, timeB), max(timeA, timeB) ) in 
the `plot` of either of the series and then use `lines` for the other series, 
perhaps with a different color argument.
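
The suggestion above can be sketched as follows. The monthly dates and the values are simulated stand-ins for Rebecca's series, and `timeA`, `timeB`, `xr` and `yr` are names assumed for illustration.

```r
# Simulated stand-ins: A covers Jan-Dec 2012, B covers Mar-Nov 2012
timeA <- seq(as.Date("2012-01-01"), as.Date("2012-12-01"), by = "month")
timeB <- seq(as.Date("2012-03-01"), as.Date("2012-11-01"), by = "month")
set.seed(1)
A <- rnorm(length(timeA), mean = 10)
B <- rnorm(length(timeB), mean = 12)

# Axis limits that cover both series
xr <- range(c(timeA, timeB))
yr <- range(c(A, B))

# First series sets up the axes, second is added to the same plot
plot(timeA, A, type = "o", col = "blue", xlim = xr, ylim = yr,
     xlab = "", ylab = "value")
lines(timeB, B, type = "o", col = "red")
legend("topleft", legend = c("A", "B"), col = c("blue", "red"), lty = 1, pch = 1)
```

Because each series is drawn against its own dates, the months covered only by A still appear, which is what `par(new = TRUE)` with two index-based plots tends to lose.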

-- 

David Winsemius
Alameda, CA, USA

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] plot two time series with different length and different starting point in one figure.

2013-01-22 Thread Yuan, Rebecca
Hello Arun,

This helps me get the data into a date type. A new question has come up: 
since the dates are not exactly the same in the two data sets, there are some NA 
values in the merged data set, such as

2012-09-28       NA          NA 5400726 14861715970
2012-09-30  5035606 14832837436      NA          NA

Does R have a function to convert the dates to a format such as "Sep,2012", 
so that when I merge the two, they will not have those NA values?

Thanks,

Rebecca
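
One possible approach, sketched here with the two rows shown above: derive a year-month key with `format()` and merge on that key instead of on the exact date. The column names `acct`, `baln` and `month` and the suffixes are assumptions for illustration.

```r
# The two rows from the example: same month, different day
A <- data.frame(date = as.Date("2012-09-30"), acct = 5035606, baln = 14832837436)
B <- data.frame(date = as.Date("2012-09-28"), acct = 5400726, baln = 14861715970)

# Year-month key, e.g. "2012-09" (use "%b,%Y" for a label like "Sep,2012")
A$month <- format(A$date, "%Y-%m")
B$month <- format(B$date, "%Y-%m")

# Merge on the month key; suffixes keep the two sources apart
m <- merge(A, B, by = "month", all = TRUE, suffixes = c(".A", ".B"))
print(m)
```

This assumes at most one observation per month in each data set; with several per month the merge would produce every pairing within the month.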

-Original Message-
From: arun [mailto:smartpink...@yahoo.com] 
Sent: Tuesday, January 22, 2013 2:15 PM
To: Yuan, Rebecca
Cc: R help
Subject: Re: [R] plot two time series with different length and different 
starting point in one figure.

Hi Rebecca,

Assuming that 'raw_data' is a data.frame with raw_time as its first column,
you could convert raw_time to Date format with:

 as.Date("28FEB2002",format="%d%B%Y")
#[1] "2002-02-28"

In your data, it should be:
raw_data$raw_time<- as.Date(raw_data$raw_time,format="%d%B%Y")

Could you just dput() a few lines of your dataset if this is not working?
Tx.

A.K.




- Original Message -
From: Yuan, Rebecca rebecca.y...@bankofamerica.com
To: 'arun' smartpink...@yahoo.com
Cc: 
Sent: Tuesday, January 22, 2013 2:08 PM
Subject: RE: [R] plot two time series with different length and different 
starting point in one figure.

Hello Arun,

My data shows that I do not have a date type of data:

summary(raw_data)
      raw_time      raw_acct         raw_baln
28FEB2002:  1   Min.   : 61714   Min.   :117079835
28FEB2003:  1   1st Qu.: 75587   1st Qu.:158035150
28FEB2005:  1   Median :100234   Median :206906298
28FEB2006:  1   Mean   : 96058   Mean   :210550369
28FEB2007:  1   3rd Qu.:116908   3rd Qu.:263623782
28FEB2009:  1   Max.   :121853   Max.   :325290870
(Other)  :127                                      


How could I transfer the raw_time column to a date format, such as

summary(dateA)
        Min.      1st Qu.       Median         Mean      3rd Qu.         Max. 
2012-01-01 2012-04-01 2012-07-01 2012-07-01 2012-09-30 2012-12-31


Thanks very much!

Cheers,

Rebecca

-Original Message-
From: arun [mailto:smartpink...@yahoo.com]
Sent: Tuesday, January 22, 2013 12:39 PM
To: Yuan, Rebecca
Cc: R help; Petr PIKAL
Subject: Re: [R] plot two time series with different length and different 
starting point in one figure.

Hi,

You could also try this:
dateA<-seq.Date(as.Date("1jan2012",format="%d%b%Y"),as.Date("31Dec2012",format="%d%b%Y"),by="day")
dateB<-seq.Date(as.Date("1Mar2012",format="%d%b%Y"),as.Date("30Nov2012",format="%d%b%Y"),by="day")
set.seed(15)
 A<-data.frame(dateA,value=sample(1:300,366,replace=TRUE))
 set.seed(25)
 B<-data.frame(dateB,value=sample(1:300,275,replace=TRUE))
library(xts)
Anew<-as.xts(A[,-1],order.by=A[,1])
 Bnew<-as.xts(B[,-1],order.by=B[,1])
 res<-merge(Anew,Bnew)
res1<-res[complete.cases(res),]
library(zoo)
plot.zoo(res1)
plot.zoo(res)

A.K.




- Original Message -
From: Yuan, Rebecca rebecca.y...@bankofamerica.com
To: 'PIKAL Petr' petr.pi...@precheza.cz
Cc: R help r-help@r-project.org
Sent: Tuesday, January 22, 2013 10:36 AM
Subject: Re: [R] plot two time series with different length and different 
starting point in one figure.

Hello Petr,

As the time series have the same column names, I got the error message like:



 m1<-merge(A, B, by.x = "time", by.y = "balance")
Error in fix.by(by.x, x) : 'by' must specify uniquely valid column(s)


To plot A and B in one plot is to compare the difference between them...

Any other thoughts?

Thanks,

Rebecca
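
The error probably arises because `by.x` and `by.y` must each name the key column in their own data frame, whereas here the key in both is the time column. A small sketch with made-up numbers, assuming both data frames have columns called `time` and `balance`:

```r
# Made-up stand-ins: A covers periods 1-12, B covers periods 3-11
A <- data.frame(time = 1:12, balance = 101:112)
B <- data.frame(time = 3:11, balance = 203:211)

# Same key column in both; all = TRUE keeps periods present in only one series
m1 <- merge(A, B, by = "time", all = TRUE, suffixes = c(".A", ".B"))
print(m1)

# Both series against the common time axis; NA gaps are simply not drawn
matplot(m1$time, m1[, c("balance.A", "balance.B")], type = "o",
        pch = 1, lty = 1, xlab = "time", ylab = "balance")
```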


-Original Message-
From: PIKAL Petr [mailto:petr.pi...@precheza.cz]
Sent: Tuesday, January 22, 2013 10:28 AM
To: Yuan, Rebecca; R help
Subject: RE: plot two time series with different length and different starting 
point in one figure.

Hi

 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-bounces@r- 
 project.org] On Behalf Of Yuan, Rebecca
 Sent: Tuesday, January 22, 2013 4:07 PM
 To: R help
 Subject: [R] plot two time series with different length and different 
 starting point in one figure.
 
 Hello,
 
 I do have two different time series A and B, they are different in 
 length and starting point. A starts in Jan, 2012 and ends in Dec, 2012 
 and B starts in March, 2012 and ends in Nov, 2012.
 
 How can I plot those two series A and B in the same plot? I.E., from 
 Jan. 2012 - Feb, 2012, it would have one data point from A and from 
 Mar, 2012-Nov, 2012, it would have two data points from A and B, and 
 in December 2012, it would have one data point from A.

Merge those 2 series.

?merge

Regards
Petr

 
 Thanks very much!
 
 Cheers,
 
 Rebecca
 
 
 --
 This message, and any attachments, is for the \  in...{{dropped:23}}

__
R-help@r-project.org mailing list

Re: [R] change confidence interval line length in barplot2 (plotrix package)

2013-01-22 Thread Rolf Turner



There does not appear to be any such function as barplot2 in
the current version (3.4-5) of the plotrix package.  Moreover
I can find no reference to such a function in the NEWS for
plotrix.

cheers,

Rolf Turner
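
As far as I know, barplot2 is provided by the gplots package rather than plotrix. Independently of that, the cap width of error bars can be controlled in base graphics by drawing them with `arrows()`, whose `length` argument is given in inches and does not depend on the bar width. A minimal sketch using the data from the question; the value 0.05 is an arbitrary choice for illustration.

```r
hh <- t(VADeaths)[, 1]
ci.l <- hh * 0.85
ci.u <- hh * 1.15

# barplot() returns the bar midpoints on the x axis
mp <- barplot(hh, ylim = c(0, 20),
              col = c("lightblue", "mistyrose", "lightcyan", "lavender"),
              main = "Death Rates in Virginia")

# angle = 90 and code = 3 give flat caps at both ends; length sets the cap size
arrows(x0 = mp, y0 = ci.l, x1 = mp, y1 = ci.u,
       angle = 90, code = 3, length = 0.05)
```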

On 01/23/2013 07:28 AM, Martin Batholdy wrote:

Hi,

is there any way to change the width of the horizontal line of confidence 
intervals
in the barplot2 function in the plotrix package (independent of the width of 
the bars)?


example code:

library(plotrix)
# Example with confidence intervals and grid
hh <- t(VADeaths)[, 1]
mybarcol <- "gray20"
ci.l <- hh * 0.85
ci.u <- hh * 1.15
mp <- barplot2(hh, beside = TRUE,
 col = c("lightblue", "mistyrose",
 "lightcyan", "lavender"),
 legend = colnames(VADeaths), ylim = c(0, 20),
 main = "Death Rates in Virginia", font.main = 4,
 sub = "Faked 95 percent error bars", col.sub = mybarcol,
 cex.names = 1.5, plot.ci = TRUE, ci.l = ci.l, ci.u = ci.u)



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] plot two time series with different length and different starting point in one figure.

2013-01-22 Thread Yuan, Rebecca
Hello David,

If I use plot with the following code:

plot(A, type = "o", col = plot_colors[plotcolor], axes = FALSE, 
ann = FALSE)
par(new=TRUE)
plot(B, type = "o", col = plot_colors[plotcolor+1], axes = 
FALSE, ann = FALSE)
box()

I will have the two series in one plot, but they are only from March,2012 to 
Nov, 2012, the nonoverlapping months are dropped out...

I know in Matlab that I can specify the x axis such as 

Plot(timeofA, A)
Hold on;
Plot(timeofB, B)

to get them in the same figure, but in R, I do not know how to do it.

Thanks,

Rebecca

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] How to assign time series to a vector with one leap year

2013-01-22 Thread Janesh Devkota
Hello All,

I am trying to do the time series analysis in R and I want to assign a
vector as a time series. The data I provided is hourly. The data is from
Jan 1 2008 to Dec 31 2009. How can I assign the data such that the first
year is leap year and second is not ?

airtemp <- read.csv("airtemp.csv",header=T,sep="")

aw <- ts(airtemp,start=2008,frequency=8784,end=2009)

I assigned frequency as 8784 because 2008 year will have 8784 hourly data
points and 2009 has 8760 data points. The total data points are 17544

The data can be found on
https://www.dropbox.com/s/03z74632v1f3g1e/airtemp.csv

I apologize if this is very trivial to some of you.

Thanks.
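
A `ts` object has one fixed frequency, so it cannot give 8784 points to 2008 and 8760 to 2009. One possible alternative, sketched below with simulated values in place of the real file, is to index the data by an explicit hourly time stamp. The names `hours`, `temp` and `airtemp_df`, and the UTC time zone, are assumptions for illustration.

```r
# Hourly time stamps from 1 Jan 2008 00:00 to 31 Dec 2009 23:00 (UTC assumed)
hours <- seq(as.POSIXct("2008-01-01 00:00", tz = "UTC"),
             as.POSIXct("2009-12-31 23:00", tz = "UTC"),
             by = "hour")
length(hours)   # 8784 + 8760 hourly stamps

# Simulated temperatures standing in for the values read from airtemp.csv
set.seed(1)
temp <- rnorm(length(hours), mean = 15, sd = 5)
airtemp_df <- data.frame(time = hours, temp = temp)

# Points per calendar year: the leap year gets the extra 24 hours
table(format(airtemp_df$time, "%Y"))

# If a ts object is still wanted, a daily cycle is a fixed frequency that works
aw <- ts(temp, frequency = 24)
```

Packages such as zoo or xts accept such a time index directly through their `order.by` argument.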

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] plot two time series with different length and different starting point in one figure.

2013-01-22 Thread David Winsemius

On Jan 22, 2013, at 11:42 AM, Yuan, Rebecca wrote:

 Hello David,
 
 If I use plot with the following code:
 
   plot(A, type = "o", col = plot_colors[plotcolor], axes = FALSE, 
 ann = FALSE)
   par(new=TRUE)
   plot(B, type = "o", col = plot_colors[plotcolor+1], axes = 
 FALSE, ann = FALSE)
   box()
 
 I will have the two series in one plot, but they are only from March,2012 to 
 Nov, 2012, the nonoverlapping months are dropped out...
 
 I know in Matlab that I can specify the x axis such as 
 
 Plot(timeofA, A)
 Hold on;
 Plot(timeofB, B)
 
 to get them in the same figure, but in R, I do not know how to do it.

As I said before, you need to use the xlim argument to `plot`. If you 
insist on using `plot` twice then you will need to use 'xlim=' twice, although 
I thought it would be easier to use `plot` first and `lines` second.

-- 
David.
 
 Thanks,
 
 Rebecca
 
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Creating a Data Frame from an XML

2013-01-22 Thread Adam Gabbert
Hello,

I'm attempting to read information from an XML into a data frame in R using
the XML package. I am unable to get the data into a data frame as I would
like.  I have some sample code below.

*XML Code:*

Header...

Data I want in a data frame:

   <data>
  <row BRAND="GMC" NUM="1" YEAR="1999" VALUE="1" />
  <row BRAND="FORD" NUM="1" YEAR="2000" VALUE="12000" />
  <row BRAND="GMC" NUM="1" YEAR="2001" VALUE="12500" />
  <row BRAND="FORD" NUM="1" YEAR="2002" VALUE="13000" />
  <row BRAND="GMC" NUM="1" YEAR="2003" VALUE="14000" />
  <row BRAND="FORD" NUM="1" YEAR="2004" VALUE="17000" />
  <row BRAND="GMC" NUM="1" YEAR="2005" VALUE="15000" />
  <row BRAND="GMC" NUM="1" YEAR="1967" VALUE="PRICLESS" />
  <row BRAND="FORD" NUM="1" YEAR="2007" VALUE="17500" />
  <row BRAND="GMC" NUM="1" YEAR="2008" VALUE="22000" />
   </data>

*R Code:*

doc <- xmlInternalTreeParse("Sample2.xml")
top <- xmlRoot(doc)
xmlName(top)
names(top)
art <- top[["row"]]
art

*Output:*

> art
<row BRAND="GMC" NUM="1" YEAR="1999" VALUE="1"/>


This is where I am having difficulties.  I am unable to access additional
rows (i.e. <row BRAND="GMC" NUM="1" YEAR="1967" VALUE="PRICLESS" />),
and I am unable to access the individual entries to actually create the
data frame.  The data frame I would like is as follows:

BRAND  NUM  YEAR  VALUE
GMC    1    1999  1
FORD   2    2000  12000
GMC    1    2001  12500
etc

Any help or suggestions would be appreciated.  Conversely, my eventual goal
would be to take a data frame and write it into an XML in the previously
shown format.

Thank you

AG
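
One way this might be approached with the XML package is to collect the attributes of every row node via XPath and transpose the result. The sketch below is an illustration, not a tested solution for the real file: it parses a three-row inline copy of the sample rather than Sample2.xml, and the helper names `rows_to_df` and `df_to_xml` are hypothetical. It only runs the example if the XML package is installed.

```r
# Inline sample mirroring the layout in the post (three of the rows)
sample_xml <- '<data>
  <row BRAND="GMC" NUM="1" YEAR="1999" VALUE="1" />
  <row BRAND="FORD" NUM="1" YEAR="2000" VALUE="12000" />
  <row BRAND="GMC" NUM="1" YEAR="1967" VALUE="PRICLESS" />
</data>'

# Hypothetical helper: attributes of all <row> nodes -> data frame
rows_to_df <- function(xml_text) {
  doc <- XML::xmlParse(xml_text, asText = TRUE)
  # one column of attribute values per <row> node
  attrs <- XML::xpathSApply(doc, "//row", XML::xmlAttrs)
  as.data.frame(t(attrs), stringsAsFactors = FALSE)
}

# Hypothetical helper: data frame -> <data><row .../>...</data>
df_to_xml <- function(df) {
  root <- XML::newXMLNode("data")
  for (i in seq_len(nrow(df))) {
    XML::newXMLNode("row", attrs = unlist(df[i, ]), parent = root)
  }
  root
}

if (requireNamespace("XML", quietly = TRUE)) {
  cars <- rows_to_df(sample_xml)
  print(cars)
  cat(XML::saveXML(df_to_xml(cars)), "\n")
}
```

All columns come back as character because attributes are text (and VALUE contains a non-numeric entry); numeric columns such as YEAR could be converted afterwards with `as.integer()`.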

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Introduction and help request

2013-01-22 Thread Ross Tinsley
Hello all

I am a researcher in the field of tourism and have just recently installed R64 
and RStudio onto my Mac (running the latest OS). I have run into some problems 
installing additional packages. I have looked through the General FAQs and Mac 
FAQs but haven't been able to find a solution.

I have downloaded the various packages I need from CRAN sources and while some 
have successfully installed others have not. I have been following the 
instructions on the Mac FAQ to unzip and install the downloaded packages using 
the command line but the results seem to indicate an error (they are installed 
but then don't work properly and so are subsequently uninstalled). It happens 
on more than one so that's why I thought it might be something generic I am 
doing. Here is a copy of the command line results:

Rosss-MacBook-Pro:~ rosstinsley$ R CMD INSTALL 
/private/var/folders/ld/3f2dl80154z47_864skpt2_8gn/T/Rtmp0ittcT/downloaded_packages/Hmisc_3.10-1.tar.gz
 
* installing to library 
‘/Library/Frameworks/R.framework/Versions/2.15/Resources/library’
* installing *source* package ‘Hmisc’ ...
** package ‘Hmisc’ successfully unpacked and MD5 sums checked
** libs
*** arch - i386
sh: make: command not found
ERROR: compilation failed for package ‘Hmisc’
* removing 
‘/Library/Frameworks/R.framework/Versions/2.15/Resources/library/Hmisc’
Rosss-MacBook-Pro:~ rosstinsley$ R CMD INSTALL 
/private/var/folders/ld/3f2dl80154z47_864skpt2_8gn/T/Rtmp0ittcT/downloaded_packages/acepack_1.3-3.2.tar.gz
 
* installing to library 
‘/Library/Frameworks/R.framework/Versions/2.15/Resources/library’
* installing *source* package ‘acepack’ ...
** package ‘acepack’ successfully unpacked and MD5 sums checked
** libs
*** arch - i386
sh: make: command not found
ERROR: compilation failed for package ‘acepack’
* removing 
‘/Library/Frameworks/R.framework/Versions/2.15/Resources/library/acepack’
Rosss-MacBook-Pro:~ rosstinsley$ R CMD INSTALL 
/private/var/folders/ld/3f2dl80154z47_864skpt2_8gn/T/Rtmp0ittcT/downloaded_packages/arm_1.6-01.02.tar.gz
 
* installing to library 
‘/Library/Frameworks/R.framework/Versions/2.15/Resources/library’
ERROR: dependency ‘lme4’ is not available for package ‘arm’
* removing ‘/Library/Frameworks/R.framework/Versions/2.15/Resources/library/arm’
Rosss-MacBook-Pro:~ rosstinsley$ R CMD INSTALL 
/private/var/folders/ld/3f2dl80154z47_864skpt2_8gn/T/Rtmp0ittcT/downloaded_packages/chron_2.3-43.tar.gz
 
* installing to library 
‘/Library/Frameworks/R.framework/Versions/2.15/Resources/library’
* installing *source* package ‘chron’ ...
** package ‘chron’ successfully unpacked and MD5 sums checked
** libs
*** arch - i386
sh: make: command not found
ERROR: compilation failed for package ‘chron’
* removing 
‘/Library/Frameworks/R.framework/Versions/2.15/Resources/library/chron’


Thank you for any help

Ross


[R] plot.mob() fails with cut() error 'breaks' are not unique

2013-01-22 Thread Jason Musil
DeaR all,

I am using mob() for model based partitioning, with a dichotomous variable 
(participant's correct/incorrect response to a test item) regressed onto a 
continuous predictor related to a given property of the test item. Although 
this variable is continuous, the value of this variable for many items in this 
particular analysis is 0. The partitioning criterion is self-reported ability 
in a related area.

 mob1 <- mob(
   correct ~ circular.mean | srp.dimension,
   control = mob_control(alpha = .001),
   model = glinearModel,
   family = binomial()
 )

 plot(mob1)

Error in cut.default(x, breaks = breaks, include.lowest = TRUE) : 
  'breaks' are not unique

The same persists if I specify either a desired number of breaks, or explicit 
breakpoints (e.g. breaks=3 or breaks=c(-0.1,0.1,0.5)). I guess this is to do 
with the funny distribution of the predictor variable, but I'm not sure what to 
do about it.

Many thanks and apologies if this doesn't fit the mailing list---it is my first 
posting!
Jason Musil
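
One plausible cause, offered only as an assumption since I have not checked the plot.mob source: if the plotting code derives its break points from quantiles of the predictor, a variable that is mostly 0 yields repeated break values, and cut() rejects those. The sketch below reproduces the same error message in base R with a made-up predictor and shows that de-duplicating the breaks avoids it.

```r
# Hypothetical predictor: mostly zeros, like the one described above.
x <- c(rep(0, 8), 0.2, 0.5)

# Quantile-based break points collapse onto 0 when most values are 0.
brks <- quantile(x, probs = seq(0, 1, by = 0.25))
print(brks)

# cut() refuses duplicated break points; capture the error message.
res <- tryCatch(cut(x, breaks = brks, include.lowest = TRUE),
                error = function(e) conditionMessage(e))
print(res)

# De-duplicating the breaks lets cut() succeed, at the cost of fewer bins.
binned <- cut(x, breaks = unique(brks), include.lowest = TRUE)
print(table(binned))
```

If this is indeed what happens, adding a tiny amount of jitter to the predictor for plotting purposes only might be another workaround.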



Re: [R] How to align group based on the common values of two columns in r

2013-01-22 Thread arun
Hi,
You could also try:

dat1 <- read.table(text="
 Feature    OS
    4  2
    4  1
    4  3
    1  2
    4  1
", sep="", header=TRUE)
dat1$Group <- as.numeric(factor(Reduce(paste0, dat1)))
A.K.
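
A variant of the approach above, sketched under the assumption that the groups should be numbered in order of first appearance, as in the desired output quoted below. The factor() version numbers groups in sorted order of the pasted key, and pasting without a separator could merge distinct pairs (for instance Feature 1 with OS 12 and Feature 11 with OS 2). The key column name and the "_" separator are my own choices.

```r
dat1 <- data.frame(Feature = c(4, 4, 4, 1, 4),
                   OS      = c(2, 1, 3, 2, 1))

# Build an unambiguous key per row; the separator keeps 1/12 apart from 11/2.
key <- paste(dat1$Feature, dat1$OS, sep = "_")

# match() against the unique keys numbers groups by first appearance.
dat1$Group <- match(key, unique(key))
print(dat1)
```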

- Original Message -
From: Tammy Ma metal_lical...@live.com
To: r-help@r-project.org r-help@r-project.org
Cc: 
Sent: Tuesday, January 22, 2013 4:28 AM
Subject: [R] How to align group based on the common values of two columns in r


HI,

I met this problem:

I have the feature data frame:


   Feature     OS
     4              2
     4              1
     4              3
     1              2
     4              1


What I want to do is to automatically create one more column called Group:

   Feature     OS      Group
     4              2         1
     4              1         2
     4              3         3
     1              2         4
     4              1         2



I don't want ifelse(), because I have so many combinations of feature and OS 
that I cannot even count them.  I just want something that automatically creates 
a group indicator based on the different combinations of feature and OS.

Thanks for your help.


Kind regards,
Tammy


                          


Re: [R] plot two time series with different length and different starting point in one figure.

2013-01-22 Thread arun
Hi,

dateA <- seq.Date(as.Date("1jan2012", format="%d%b%Y"), as.Date("31Dec2012", format="%d%b%Y"), by="day")
dateB <- seq.Date(as.Date("1Mar2012", format="%d%b%Y"), as.Date("30Nov2012", format="%d%b%Y"), by="day")
set.seed(15)
A <- data.frame(dateA, value=sample(1:300, 366, replace=TRUE))
set.seed(25)
B <- data.frame(dateB, value=sample(1:300, 275, replace=TRUE))
res <- merge(A, B, by.x="dateA", by.y="dateB") # it works


A.K.
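
Note that merge() without all = TRUE keeps only the dates present in both series, so January, February and December would be dropped. A minimal base R sketch that keeps every date and draws both series on one set of axes follows; the column name date, the suffixes and the colours are my own choices, and the data are simulated as above.

```r
dateA <- seq(as.Date("2012-01-01"), as.Date("2012-12-31"), by = "day")
dateB <- seq(as.Date("2012-03-01"), as.Date("2012-11-30"), by = "day")
set.seed(15)
A <- data.frame(date = dateA, value = sample(1:300, length(dateA), replace = TRUE))
set.seed(25)
B <- data.frame(date = dateB, value = sample(1:300, length(dateB), replace = TRUE))

# all = TRUE keeps dates found in either series; B is NA where it has no data.
res <- merge(A, B, by = "date", all = TRUE, suffixes = c(".A", ".B"))

# Draw A first to set up the axes, then add B; NA stretches are simply not drawn.
plot(res$date, res$value.A, type = "l", col = "blue",
     xlab = "Date", ylab = "Value")
lines(res$date, res$value.B, col = "red")
legend("topright", legend = c("A", "B"), col = c("blue", "red"), lty = 1)
```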



- Original Message -
From: Yuan, Rebecca rebecca.y...@bankofamerica.com
To: 'PIKAL Petr' petr.pi...@precheza.cz
Cc: R help r-help@r-project.org
Sent: Tuesday, January 22, 2013 10:36 AM
Subject: Re: [R] plot two time series with different length and different 
starting point in one figure.

Hello Petr,

As the time series have the same column names, I got the error message like:



 m1 <- merge(A, B, by.x = "time", by.y = "balance")
Error in fix.by(by.x, x) : 'by' must specify uniquely valid column(s)


To plot A and B in one plot is to compare the difference between them...

Any other thoughts?

Thanks,

Rebecca


-Original Message-
From: PIKAL Petr [mailto:petr.pi...@precheza.cz] 
Sent: Tuesday, January 22, 2013 10:28 AM
To: Yuan, Rebecca; R help
Subject: RE: plot two time series with different length and different starting 
point in one figure.

Hi

 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-bounces@r- 
 project.org] On Behalf Of Yuan, Rebecca
 Sent: Tuesday, January 22, 2013 4:07 PM
 To: R help
 Subject: [R] plot two time series with different length and different 
 starting point in one figure.
 
 Hello,
 
 I do have two different time series A and B, they are different in 
 length and starting point. A starts in Jan, 2012 and ends in Dec, 2012 
 and B starts in March, 2012 and ends in Nov, 2012.
 
 How can I plot those two series A and B in the same plot? I.E., from 
 Jan. 2012 - Feb, 2012, it would have one data point from A and from 
 Mar, 2012-Nov, 2012, it would have two data points from A and B, and 
 in December 2012, it would have one data point from A.

Merge those 2 series.

?merge

Regards
Petr

 
 Thanks very much!
 
 Cheers,
 
 Rebecca
 
 


Re: [R] plot two time series with different length and different starting point in one figure.

2013-01-22 Thread arun
Hi,

You could also try this:
dateA <- seq.Date(as.Date("1jan2012", format="%d%b%Y"), as.Date("31Dec2012", format="%d%b%Y"), by="day")
dateB <- seq.Date(as.Date("1Mar2012", format="%d%b%Y"), as.Date("30Nov2012", format="%d%b%Y"), by="day")
set.seed(15)
A <- data.frame(dateA, value=sample(1:300, 366, replace=TRUE))
set.seed(25)
B <- data.frame(dateB, value=sample(1:300, 275, replace=TRUE))
library(xts)
Anew <- as.xts(A[,-1], order.by=A[,1])
Bnew <- as.xts(B[,-1], order.by=B[,1])
res <- merge(Anew, Bnew)
res1 <- res[complete.cases(res),]
library(zoo)
plot.zoo(res1)
plot.zoo(res)

A.K.






Re: [R] fdHess function

2013-01-22 Thread Douglas Bates
Your question is better addressed to the R-help@R-project.org mailing list,
which I am copying on this reply.

You are confusing a statistical concept, the Fisher Information matrix,
with a numerical concept, the Hessian matrix of a scalar function of a
vector argument.

The Fisher information matrix is the Hessian matrix of a particular
function at its optimum and I have forgotten whether that function is the
log-likelihood or negative twice the log-likelihood or ...  Rather than get
it wrong I am sending a copy of this reply to the list where many of the
readers will be able to answer you more reliably than I can.


On Tue, Jan 22, 2013 at 1:22 PM, Marcos Coque Jr mcoqu...@yahoo.com.br wrote:

 Dear Bates,

 I am using the fdHess function for R language.
 And I have a question.

 What is the relationship with the Hessian and Fisher Information in your
 function?
 Because I think that Fisher Information = -Hessian, but I found the opposite
 in your function.
 Maybe I have something wrong...

 Thanks,

 Marcos




Re: [R] plot two time series with different length and different starting point in one figure.

2013-01-22 Thread arun
Hi Rebecca,

Assuming that 'raw_data' is a data.frame with raw_time as its first column,
you could convert raw_time to Date format with:

 as.Date("28FEB2002", format="%d%B%Y")
#[1] "2002-02-28"

In your data, it should be:
raw_data$raw_time <- as.Date(raw_data$raw_time, format="%d%B%Y")

Could you just dput() a few lines of your dataset if this is not working?
Tx.

A.K.




- Original Message -
From: Yuan, Rebecca rebecca.y...@bankofamerica.com
To: 'arun' smartpink...@yahoo.com
Cc: 
Sent: Tuesday, January 22, 2013 2:08 PM
Subject: RE: [R] plot two time series with different length and different 
starting point in one figure.

Hello Arun,

My data shows that I do not have a date type of data:

summary(raw_data)
      raw_time      raw_acct         raw_baln        
28FEB2002:  1   Min.   : 61714   Min.   :117079835  
28FEB2003:  1   1st Qu.: 75587   1st Qu.:158035150  
28FEB2005:  1   Median :100234   Median :206906298  
28FEB2006:  1   Mean   : 96058   Mean   :210550369  
28FEB2007:  1   3rd Qu.:116908   3rd Qu.:263623782  
28FEB2009:  1   Max.   :121853   Max.   :325290870  
(Other)  :127                                      


How could I transfer the raw_time column to a date format, such as

summary(dateA)
        Min.      1st Qu.       Median         Mean      3rd Qu.         Max. 
2012-01-01 2012-04-01 2012-07-01 2012-07-01 2012-09-30 2012-12-31


Thanks very much!

Cheers,

Rebecca

Re: [R] change confidence interval line length in barplot2 (plotrix package)

2013-01-22 Thread David Winsemius

On Jan 22, 2013, at 10:28 AM, Martin Batholdy wrote:

 Hi,
 
 is there any way to change the width of the horizontal line of confidence 
 intervals
 in the barplot2 function in the plotrix package (independent of the width of 
 the bars)?
 
 
 example code:
 
 library(plotrix)
 # Example with confidence intervals and grid
 hh <- t(VADeaths)[, 1]
 mybarcol <- "gray20"
 ci.l <- hh * 0.85
 ci.u <- hh * 1.15
 mp <- barplot2(hh, beside = TRUE,
         col = c("lightblue", "mistyrose",
                 "lightcyan", "lavender"),
         legend = colnames(VADeaths), ylim = c(0, 20),
         main = "Death Rates in Virginia", font.main = 4,
         sub = "Faked 95 percent error bars", col.sub = mybarcol,
         cex.names = 1.5, plot.ci = TRUE, ci.l = ci.l, ci.u = ci.u)

When I did an sos::findFn("barplot2") search to locate the real `barplot2`, I 
also noted in the same package (gplots) a function named `ooplot`. It calls 
itself an extension of barplot2 and has a ci.lwd argument. Might save you the 
time of doing what I thought might be needed, hacking the code.

-- 
David Winsemius
Alameda, CA, USA



Re: [R] change confidence interval line length in barplot2 (plotrix package)

2013-01-22 Thread Martin Batholdy
Ok, I have to apologize,
I confused the packages.

It's the function barplot2 from the gplots package!


  It calls itself an extension of barplot2 and has a ci.lwd argument. Might 
 save you the time of doing what I thought might be needed, hacking the code.

Unfortunately ci.lwd controls the thickness of the line but not the horizontal 
width.
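
One workaround that does not depend on barplot2 at all, sketched in base graphics: draw the bars with barplot() and add the intervals with arrows(), whose length argument sets the width of the horizontal caps (in inches) independently of the bar width. The value 0.05 is arbitrary. Some versions of gplots::barplot2 may also offer a ci.width argument; that is an assumption on my part and worth checking in ?barplot2.

```r
hh   <- t(VADeaths)[, 1]
ci.l <- hh * 0.85
ci.u <- hh * 1.15

# barplot() returns the x positions of the bar midpoints.
mp <- barplot(hh, col = c("lightblue", "mistyrose", "lightcyan", "lavender"),
              ylim = c(0, 20), main = "Death Rates in Virginia",
              sub = "Faked 95 percent error bars")

# angle = 90 with code = 3 draws a flat cap at both ends of each segment;
# 'length' is the cap size in inches, so it does not scale with the bars.
arrows(mp, ci.l, mp, ci.u, angle = 90, code = 3, length = 0.05)
```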





Re: [R] Introduction and help request

2013-01-22 Thread Berend Hasselman

On 22-01-2013, at 19:20, Ross Tinsley rtins...@htmi.ch wrote:

 Hello all
 
 I am a researcher in the field of tourism and have just recently installed 
 R64 and RStudio onto my Mac (running latest OS). I am ran into some problems 
 installing additional packages. I have looked through the General FAQs and 
 Mac FAQS but haven't been able to find a solution.
 
 I have downloaded the various packages I need from CRAN sources and while 
 some have successfully installed others have not. I have been following the 
 instructions on the Mac FAQ to unzip and install the downloaded packages 
 using the command line but the results seem to indicate an error (they are 
 installed but then don't work properly and so are subsequently uninstalled). 
 It happens on more than one so that's why I thought it might be something 
 generic I am doing. Here is a copy of the command line results:
 
 Rosss-MacBook-Pro:~ rosstinsley$ R CMD INSTALL 
 /private/var/folders/ld/3f2dl80154z47_864skpt2_8gn/T/Rtmp0ittcT/downloaded_packages/Hmisc_3.10-1.tar.gz
  
 * installing to library 
 ‘/Library/Frameworks/R.framework/Versions/2.15/Resources/library’
 * installing *source* package ‘Hmisc’ ...
 ** package ‘Hmisc’ successfully unpacked and MD5 sums checked
 ** libs
 *** arch - i386
 sh: make: command not found
 ERROR: compilation failed for package ‘Hmisc’
 * removing 
 ‘/Library/Frameworks/R.framework/Versions/2.15/Resources/library/Hmisc’
 Rosss-MacBook-Pro:~ rosstinsley$ R CMD INSTALL 
 /private/var/folders/ld/3f2dl80154z47_864skpt2_8gn/T/Rtmp0ittcT/downloaded_packages/acepack_1.3-3.2.tar.gz
  
 * installing to library 
 ‘/Library/Frameworks/R.framework/Versions/2.15/Resources/library’
 * installing *source* package ‘acepack’ ...
 ** package ‘acepack’ successfully unpacked and MD5 sums checked
 ** libs
 *** arch - i386
 sh: make: command not found
 ERROR: compilation failed for package ‘acepack’
 * removing 
 ‘/Library/Frameworks/R.framework/Versions/2.15/Resources/library/acepack’
 Rosss-MacBook-Pro:~ rosstinsley$ R CMD INSTALL 
 /private/var/folders/ld/3f2dl80154z47_864skpt2_8gn/T/Rtmp0ittcT/downloaded_packages/arm_1.6-01.02.tar.gz
  
 * installing to library 
 ‘/Library/Frameworks/R.framework/Versions/2.15/Resources/library’
 ERROR: dependency ‘lme4’ is not available for package ‘arm’
 * removing 
 ‘/Library/Frameworks/R.framework/Versions/2.15/Resources/library/arm’
 Rosss-MacBook-Pro:~ rosstinsley$ R CMD INSTALL 
 /private/var/folders/ld/3f2dl80154z47_864skpt2_8gn/T/Rtmp0ittcT/downloaded_packages/chron_2.3-43.tar.gz
  
 * installing to library 
 ‘/Library/Frameworks/R.framework/Versions/2.15/Resources/library’
 * installing *source* package ‘chron’ ...
 ** package ‘chron’ successfully unpacked and MD5 sums checked
 ** libs
 *** arch - i386
 sh: make: command not found
 ERROR: compilation failed for package ‘chron’
 * removing 
 ‘/Library/Frameworks/R.framework/Versions/2.15/Resources/library/chron’


1. This belongs on the R-SIG-Mac mailing list

2. Why don't you use the R.app GUI to install the binary versions of the 
required packages? Much easier.

3. The message "sh: make: command not found" means that you don't have make 
installed.
Most likely you don't have the other required tools installed either.
If you use the R.app GUI you don't really need all those tools.

Advice: use R.app to install and if needed get the Xcode tools but only if you 
intend to compile your own packages.

Berend
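
The same advice can also be followed from the R prompt rather than the GUI menus. This is a rough sketch, assuming a working internet connection and a CRAN mirror; the package names are the ones from the log above. On a Mac, install.packages() normally fetches prebuilt binaries and resolves dependencies such as lme4, so make is not needed.

```r
# Is a 'make' executable on the PATH? Only needed for building from source.
has_make <- unname(nzchar(Sys.which("make")))
print(has_make)

pkgs <- c("Hmisc", "acepack", "arm", "chron")

# Only ask for the packages that are not installed yet.
to_install <- setdiff(pkgs, rownames(installed.packages()))
print(to_install)

# Guarded so that nothing is downloaded when the script runs unattended.
if (length(to_install) > 0 && interactive()) {
  install.packages(to_install, dependencies = TRUE)
}
```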



Re: [R] fdHess function

2013-01-22 Thread Mark Leeds
Hi Doug: I was just looking at this coincidentally. When theta is a vector, the
Fisher Information I_{theta} is the negative expectation of the second
derivatives of the log likelihood, so it's a matrix.  In other words,
I_theta = -E(partial^2/partial theta^2 log f(X; theta)), where theta is a vector.

But even though the Fisher Information has a seemingly nice formula (and
this is where my confusion arose when I was dealing with this, and why
I'm looking at it right now; I have a short document that I wrote to myself
explaining it, so if anyone wants it, email me individually; it's nothing
earth shattering!), in many cases taking that expectation is not easy, so the
Fisher Information is approximated by its empirical counterpart, which is
obtained by summing each of the elements in the matrix over the n observations
and then dividing each of the elements in the matrix by n.
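
Written out, and assuming the usual regularity conditions and a density f(x; theta) for a single observation (notation of my own choosing, not taken from fdHess), the two quantities described above would read:

```latex
I(\theta) = -\,\mathrm{E}\!\left[\frac{\partial^2 \log f(X;\theta)}{\partial\theta\,\partial\theta^{\top}}\right],
\qquad
\hat{I}_n(\hat\theta) = -\,\frac{1}{n}\sum_{i=1}^{n}
\left.\frac{\partial^2 \log f(x_i;\theta)}{\partial\theta\,\partial\theta^{\top}}\right|_{\theta=\hat\theta}
```

So a Hessian computed numerically on the log-likelihood has the opposite sign to the information, while one computed on the negative log-likelihood has the same sign.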















Re: [R] fdHess function

2013-01-22 Thread Mark Leeds
I neglected to mention that, once you get either I_theta or some empirical
estimate
of it, you then invert it to get an estimate of the asymptotic covariance
matrix of the
MLE.
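
A small numerical illustration of the sign question and of the inversion step, offered as a sketch rather than a statement about how anyone should use fdHess: it simulates a normal sample (my own toy example), evaluates fdHess from nlme at the closed-form MLE for both the log-likelihood and its negative, and inverts the latter.

```r
library(nlme)  # fdHess() lives in nlme, which ships with standard R installs

set.seed(1)
x <- rnorm(200, mean = 5, sd = 2)   # simulated data, purely illustrative
n <- length(x)

# Closed-form MLEs for a normal sample (sigma uses divisor n, not n - 1).
mu_hat    <- mean(x)
sigma_hat <- sqrt(mean((x - mu_hat)^2))
theta_hat <- c(mu_hat, sigma_hat)

loglik <- function(p) sum(dnorm(x, mean = p[1], sd = p[2], log = TRUE))
negll  <- function(p) -loglik(p)

H_ll  <- fdHess(theta_hat, loglik)$Hessian  # negative definite at the maximum
H_nll <- fdHess(theta_hat, negll)$Hessian   # positive definite at the minimum

obs_info <- H_nll            # observed information for the whole sample
vcov_hat <- solve(obs_info)  # approximate covariance of (mu_hat, sigma_hat)

print(obs_info)
print(c(n / sigma_hat^2, 2 * n / sigma_hat^2))  # analytic diagonal, for comparison
```

Whether "Fisher Information = -Hessian" or "= +Hessian" holds therefore depends only on whether the function handed to fdHess is the log-likelihood or its negative.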




Re: [R] plot two time series with different length and different starting point in one figure.

2013-01-22 Thread arun
Hi Rebecca,

In the previous email, 
  res <- merge(Anew, Bnew)
head(res)
#   Anew Bnew
#2012-01-01  181   NA
#2012-01-02   59   NA
#2012-01-03  290   NA
#2012-01-04  196   NA
#2012-01-05  111   NA
#2012-01-06  297   NA
 
plot.zoo(res) # NA values in Bnew are simply not drawn (I guess the same 
# would happen for any NA in Anew)

If you want to remove the rows containing NA,
use na.omit() or complete.cases(), as I did in the previous email.

Could you dput() an example dataset?

A.K.
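
Regarding the month-end mismatch raised in the message quoted below (28 Sep in one set, 30 Sep in the other), one option is to merge on a year-month key instead of the exact date. This is only a sketch: it assumes each series has a single observation per month, the column names are hypothetical, and only the two September figures are taken from the quoted output; the August rows are made up.

```r
A <- data.frame(date = as.Date(c("2012-08-31", "2012-09-30")),
                acct = c(5001000, 5035606))   # August value is made up
B <- data.frame(date = as.Date(c("2012-08-30", "2012-09-28")),
                acct = c(5390000, 5400726))   # August value is made up

# A "YYYY-MM" key ignores the day of the month and does not depend on locale.
A$month <- format(A$date, "%Y-%m")
B$month <- format(B$date, "%Y-%m")

res <- merge(A[, c("month", "acct")], B[, c("month", "acct")],
             by = "month", all = TRUE, suffixes = c(".A", ".B"))
print(res)
```

For display, format(date, "%b,%Y") would give labels like "Sep,2012", but those depend on the locale and do not sort chronologically, so I would keep the "YYYY-MM" key for merging.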
 




- Original Message -
From: Yuan, Rebecca rebecca.y...@bankofamerica.com
To: 'arun' smartpink...@yahoo.com
Cc: R help r-help@r-project.org
Sent: Tuesday, January 22, 2013 2:38 PM
Subject: RE: [R] plot two time series with different length and different 
starting point in one figure.

Hello Arun,

This helps me get the data into date type. A new question comes up: since the 
dates are not exactly the same in the two data sets, there are some NA values 
in the merged data set, such as

2012-09-28       NA          NA    5400726 14861715970
2012-09-30  5035606 14832837436         NA          NA

Does R have a function to convert the dates to a format such as Sep,2012, so 
that when I merge the two sets they will not have those NA values?

Thanks,

Rebecca

-Original Message-
From: arun [mailto:smartpink...@yahoo.com] 
Sent: Tuesday, January 22, 2013 2:15 PM
To: Yuan, Rebecca
Cc: R help
Subject: Re: [R] plot two time series with different length and different 
starting point in one figure.

Hi Rebecca,

Assuming that 'raw_data' is data.frame with first column as raw_time:
You could convert the raw_time to date format by 

 as.Date(28FEB2002,format=%d%B%Y)
#[1] 2002-02-28

In your data, it should  be:
raw_data$raw_time- as.Date(raw_time,format=%d%B%Y)

Could you just dput() a few lines of your dataset if this is not working?
Tx.

A.K.




- Original Message -
From: Yuan, Rebecca rebecca.y...@bankofamerica.com
To: 'arun' smartpink...@yahoo.com
Cc: 
Sent: Tuesday, January 22, 2013 2:08 PM
Subject: RE: [R] plot two time series with different length and different 
starting point in one figure.

Hello Arun,

My data shows that I do not have a date type of data:

summary(raw_data)
      raw_time      raw_acct         raw_baln
28FEB2002:  1   Min.   : 61714   Min.   :117079835
28FEB2003:  1   1st Qu.: 75587   1st Qu.:158035150
28FEB2005:  1   Median :100234   Median :206906298
28FEB2006:  1   Mean   : 96058   Mean   :210550369
28FEB2007:  1   3rd Qu.:116908   3rd Qu.:263623782
28FEB2009:  1   Max.   :121853   Max.   :325290870
(Other)  :127                                      


How could I transfer the raw_time column to a date format, such as

summary(dateA)
        Min.      1st Qu.       Median         Mean      3rd Qu.         Max. 
2012-01-01 2012-04-01 2012-07-01 2012-07-01 2012-09-30 2012-12-31


Thanks very much!

Cheers,

Rebecca

-Original Message-
From: arun [mailto:smartpink...@yahoo.com]
Sent: Tuesday, January 22, 2013 12:39 PM
To: Yuan, Rebecca
Cc: R help; Petr PIKAL
Subject: Re: [R] plot two time series with different length and different 
starting point in one figure.

Hi,

You could also try this:
dateA <- seq.Date(as.Date("1jan2012", format="%d%b%Y"), as.Date("31Dec2012", format="%d%b%Y"), by="day")
dateB <- seq.Date(as.Date("1Mar2012", format="%d%b%Y"), as.Date("30Nov2012", format="%d%b%Y"), by="day")
set.seed(15)
A <- data.frame(dateA, value=sample(1:300, 366, replace=TRUE))
set.seed(25)
B <- data.frame(dateB, value=sample(1:300, 275, replace=TRUE))
library(xts)
Anew <- as.xts(A[,-1], order.by=A[,1])
Bnew <- as.xts(B[,-1], order.by=B[,1])
res <- merge(Anew, Bnew)
res1 <- res[complete.cases(res),]   # keep only the dates present in both series
library(zoo)
plot.zoo(res1)
plot.zoo(res)

A.K.




- Original Message -
From: Yuan, Rebecca rebecca.y...@bankofamerica.com
To: 'PIKAL Petr' petr.pi...@precheza.cz
Cc: R help r-help@r-project.org
Sent: Tuesday, January 22, 2013 10:36 AM
Subject: Re: [R] plot two time series with different length and different 
starting point in one figure.

Hello Petr,

As the time series have the same column names, I got an error message like:

 m1 <- merge(A, B, by.x = "time", by.y = "balance")
Error in fix.by(by.x, x) : 'by' must specify uniquely valid column(s)


To plot A and B in one plot is to compare the difference between them...

Any other thoughts?

Thanks,

Rebecca


-Original Message-
From: PIKAL Petr [mailto:petr.pi...@precheza.cz]
Sent: Tuesday, January 22, 2013 10:28 AM
To: Yuan, Rebecca; R help
Subject: RE: plot two time series with different length and different starting 
point in one figure.

Hi

 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-bounces@r- 
 project.org] On Behalf Of Yuan, Rebecca
 Sent: Tuesday, January 22, 2013 4:07 PM
 To: R help
 Subject: [R] plot two time series with different length and different 
 starting point in one 

Re: [R] user units in plotrix

2013-01-22 Thread Greg Snow
If you want to convert between different units using base graphics then
look at the grconvertX and grconvertY functions (in the graphics package).
 These functions will convert from/to user coordinates, inches, device,
figure, and plot coordinates.  So you could use grconvertX to  find out
what user value on the x scale to give to draw.circle that would then
generate a circle with a given size in inches, or relative to the device,
figure, or plotting region.
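
As a rough sketch of that conversion (base graphics only, so symbols() stands in for draw.circle here; the device size, axis ranges and the half-inch radius are arbitrary assumptions):

```r
# Draw to a null PDF device so the sketch runs without creating a file.
pdf(NULL, width = 7, height = 5)
plot(c(0, 1), c(0, 1e6), type = "n", xlab = "x", ylab = "y")

# Half an inch expressed in user coordinates along each axis.
r_x <- diff(grconvertX(c(0, 0.5), from = "inches", to = "user"))
r_y <- diff(grconvertY(c(0, 0.5), from = "inches", to = "user"))

# symbols() with inches = FALSE takes the circle radius in x-axis user units,
# so r_x gives a circle with a radius of half an inch on the device.
symbols(0.5, 5e5, circles = r_x, inches = FALSE, add = TRUE)
invisible(dev.off())
```

The same r_x value could then be tried as the radius for draw.circle; which axis that function scales by should be checked against the plotrix help page.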


On Sun, Jan 20, 2013 at 2:59 PM, Murat Tasan mmu...@gmail.com wrote:

 hi all - i'm having some difficulty figuring out how to convert
 between user units (which i can't find a definition for in the
 plotrix package) and either (a) device units (e.g. inches with PDFs)
 or (b) user coordinates along any particular axis.

 as an example, suppose i set up a PDF device with inches, the device
 has both outer and inner margins, and the plot region has drastically
 different x and y coordinate ranges (e.g. xlim = c(0, 1), ylim = c(0,
 SOME_VERY_LARGE_NUMBER)).

 now i'd like to draw.circle(...) but i can't figure out what units the
 radius argument takes.
 user units doesn't appear to be inches in this case, and if it
 corresponds to user coordinates, i don't know which axis' scaling is
 to be used as the reference.

 ideally, one would be able to specify the radius in user coordinates
 while specifying _which_ axis to use as the standard (e.g. an axis =
 "y" or axis = "x" argument).

 getFigCtr(...) can help in figuring this out, but its argument takes
 the relative position of the figure region, rather than the plot
 region, which is more apt for properly placing shapes.

 i know the grid package has extensive unit conversion code, but i'm
 trying to update a series of figures using only base graphics...

 i can't seem to find a rigorous definition of user units anywhere in
 the plotrix package.
 anyone know of where i can find this info?

 cheers,

 -m





-- 
Gregory (Greg) L. Snow Ph.D.
538...@gmail.com

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] how to give a lengend in symbols functions

2013-01-22 Thread Greg Snow
I don't see a symbols function in the gtools package, do you mean the
symbols function in the graphics package?

If so, there is not a simple legend or key function to create the legend
(the number of possible options would make it more complicated than
building the legend by hand).  You will need to construct the legend by
hand.  You can use the symbols function to add the example symbols to the
legend and the text function to add the explanatory text.  The functions
grconvertX, grconvertY, strheight, and strwidth will help with deciding
where to place the text and symbols.
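
A minimal sketch of such a hand-built legend follows. The data, the scaling factor and the legend positions are all made up for illustration; the radii are scaled explicitly and drawn with inches = FALSE so that the legend circles use the same scale as the data circles.

```r
pdf(NULL)                                  # null device, no file is written
set.seed(1)
x <- runif(20); y <- runif(20)
z <- runif(20, 1, 10)                      # made-up magnitudes

# Explicit scaling: a value of 10 gets a radius of 0.03 x-axis units.
scale <- 0.03 / sqrt(10)
plot(x, y, type = "n", xlim = c(0, 1.3), ylim = c(0, 1), asp = 1)
symbols(x, y, circles = scale * sqrt(z), inches = FALSE,
        bg = "lightgreen", add = TRUE)

# Legend: three reference values drawn in the empty strip on the right.
ref <- c(1, 5, 10)
lx <- rep(1.15, 3)
ly <- c(0.80, 0.65, 0.50)
symbols(lx, ly, circles = scale * sqrt(ref), inches = FALSE,
        bg = "lightgreen", add = TRUE)
text(lx + 0.06, ly, labels = ref, adj = 0)
invisible(dev.off())
```

With inches = 0.1, as in the original call, symbols() rescales the radii by itself, so the legend circles would have to be drawn in the same call as the data to share that scale.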


On Mon, Jan 21, 2013 at 6:37 PM, Jie Tang totang...@gmail.com wrote:

 hi Rusers

   I am trying to use symbols in gtools package
 symbols(data1,data3,circles=data1/data3,inches=0.1,bg="lightgreen")

 Now I want to give a legend to tell the reader the meaning or magnitude of
 these circles.
 How can I add this information to a symbols plot, just like legend in plot?
  thank you.
 --
 TANG Jie
 Email: totang...@gmail.com
 Tel: 0086-2154896104
 Shanghai Typhoon Institute,China





-- 
Gregory (Greg) L. Snow Ph.D.
538...@gmail.com



[R] Create a Data Frame from an XML

2013-01-22 Thread Adam Gabbert
 Hello,

I'm attempting to read information from an XML into a data frame in R using
the XML package. I am unable to get the data into a data frame as I would
like.  I have some sample code below.

*XML Code:*

Header...

Data I want in a data frame:

   <data>
  <row BRAND="GMC" NUM="1" YEAR="1999" VALUE="1" />
  <row BRAND="FORD" NUM="1" YEAR="2000" VALUE="12000" />
  <row BRAND="GMC" NUM="1" YEAR="2001" VALUE="12500" />
  <row BRAND="FORD" NUM="1" YEAR="2002" VALUE="13000" />
  <row BRAND="GMC" NUM="1" YEAR="2003" VALUE="14000" />
  <row BRAND="FORD" NUM="1" YEAR="2004" VALUE="17000" />
  <row BRAND="GMC" NUM="1" YEAR="2005" VALUE="15000" />
  <row BRAND="GMC" NUM="1" YEAR="1967" VALUE="PRICLESS" />
  <row BRAND="FORD" NUM="1" YEAR="2007" VALUE="17500" />
  <row BRAND="GMC" NUM="1" YEAR="2008" VALUE="22000" />
   </data>

*R Code:*

doc <- xmlInternalTreeParse("Sample2.xml")
top <- xmlRoot(doc)
xmlName(top)
names(top)
art <- top[["row"]]
art

*Output:*

> art
<row BRAND="GMC" NUM="1" YEAR="1999" VALUE="1"/>


This is where I am having difficulties.  I am unable to access additional
rows (i.e.  <row BRAND="GMC" NUM="1" YEAR="1967" VALUE="PRICLESS" />)

and I am unable to access the individual entries to actually create the
data frame.  The data frame I would like is as follows:

BRAND  NUM  YEAR  VALUE
GMC    1    1999  1
FORD   2    2000  12000
GMC    1    2001  12500
etc

Any help or suggestions would be appreciated.  Conversely, my eventual goal
would be to take a data frame and write it into an XML in the previously
shown format.

Thank you

AG
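
One way this could be approached with the XML package is sketched below. It assumes the rows can be reached with the XPath expression //row; the inline string is a shortened stand-in for Sample2.xml, whose header is not shown in the post.

```r
library(XML)

# Inline copy of a few rows; a real file would be read with xmlParse("Sample2.xml").
txt <- '<data>
  <row BRAND="GMC" NUM="1" YEAR="1999" VALUE="1" />
  <row BRAND="FORD" NUM="1" YEAR="2000" VALUE="12000" />
  <row BRAND="GMC" NUM="1" YEAR="1967" VALUE="PRICLESS" />
</data>'
doc <- xmlParse(txt, asText = TRUE)

# One named character vector of attributes per <row>, stacked into a matrix.
rows <- getNodeSet(doc, "//row")
df <- as.data.frame(do.call(rbind, lapply(rows, xmlAttrs)),
                    stringsAsFactors = FALSE)
df$YEAR <- as.integer(df$YEAR)   # VALUE stays character because of "PRICLESS"

# Going the other way: rebuild <data><row .../></data> from the data frame.
out <- newXMLNode("data")
for (i in seq_len(nrow(df))) {
  newXMLNode("row", attrs = unlist(lapply(df[i, ], as.character)), parent = out)
}
cat(saveXML(out))
```

All attributes arrive as character strings, so numeric columns need an explicit conversion, and a column such as VALUE with a non-numeric entry has to stay character or be cleaned first.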



Re: [R] change confidence interval line length in barplot2 (plotrix package)

2013-01-22 Thread David L Carlson
Maybe a fortunate mistake. If you use the base graphics barplot(), you can
use plotCI() in plotrix to add the confidence intervals with control over
the width of the horizontal ends of the bars (if needed, the defaults are
much narrower):

out <- barplot(hh, beside = TRUE,
   col = c("lightblue", "mistyrose", "lightcyan", "lavender"),
   legend = colnames(VADeaths), ylim = c(0, 20),
   main = "Death Rates in Virginia", font.main = 4,
   sub = "Faked 95 percent error bars", col.sub = mybarcol,
   cex.names = 1.5)
plotCI(out, hh, pch="", gap=0, ui=ci.u, li=ci.l, add=TRUE)

--
David L Carlson
Associate Professor of Anthropology
Texas A&M University
College Station, TX 77843-4352

 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-
 project.org] On Behalf Of Martin Batholdy
 Sent: Tuesday, January 22, 2013 2:42 PM
 To: r-help@r-project.org
 Subject: Re: [R] change confidence interval line length in barplot2
 (plotrix package)
 
 Ok, I have to apologize,
 I confused the packages.
 
 It's the function barplot2 from the gplots package!
 
 
   It calls itself an extension of barplot2 and has a ci.lwd argument.
 Might save you the time of doing what I thought might be needed,
 hacking the code.
 
 Unfortunately ci.lwd controls the thickness of the line but not the
 horizontal width.
 
 
 
 On Jan 22, 2013, at 21:24 , David Winsemius dwinsem...@comcast.net
 wrote:
 
 
  On Jan 22, 2013, at 10:28 AM, Martin Batholdy wrote:
 
  Hi,
 
  is there any way to change the width of the horizontal line of
 confidence intervals
  in the barplot2 function in the plotrix package (independent of the
 width of the bars)?
 
 
  example code:
 
  library(plotrix)
  # Example with confidence intervals and grid
  hh <- t(VADeaths)[, 1]
  mybarcol <- "gray20"
  ci.l <- hh * 0.85
  ci.u <- hh * 1.15
  mp <- barplot2(hh, beside = TRUE,
      col = c("lightblue", "mistyrose",
      "lightcyan", "lavender"),
      legend = colnames(VADeaths), ylim = c(0, 20),
      main = "Death Rates in Virginia", font.main = 4,
      sub = "Faked 95 percent error bars", col.sub = mybarcol,
      cex.names = 1.5, plot.ci = TRUE, ci.l = ci.l, ci.u = ci.u)
 
  When I did an sos::findFn("barplot2") search to locate the real
 `barplot2` I also noted in the same package (gplots) a function named
 `ooplot`. It calls itself an extension of barplot2 and has a ci.lwd
 argument. Might save you the time of doing what I thought might be
 needed, hacking the code.
 
  --
  David Winsemius
  Alameda, CA, USA
 
 


Re: [R] change confidence interval line length in barplot2 (plotrix package)

2013-01-22 Thread Marc Schwartz
On Jan 22, 2013, at 2:41 PM, Martin Batholdy batho...@googlemail.com wrote:

 Ok, I have to apologize,
 I confused the packages.
 
 It's the function barplot2 from the gplots package!
 
 
 It calls itself an extension of barplot2 and has a ci.lwd argument. Might 
 save you the time of doing what I thought might be needed, hacking the code.
 
 Unfortunately ci.lwd controls the thickness of the line but not the 
 horizontal width.


barplot2() in gplots uses a hard coded width for the CI's, which is 50% of the 
bar width, so it is a consistent proportion.

You could hack the code or simply use base graphics barplot() along with either 
?segments or perhaps more easily, ?arrows, which would give you more 
flexibility.

Compare:

mp <- barplot(1:5)
arrows(mp, 1:5 + 0.5, mp, 1:5 - 0.5, code = 3, angle = 90, length = 0.1)

with:

mp <- barplot(1:5)
arrows(mp, 1:5 + 0.5, mp, 1:5 - 0.5, code = 3, angle = 90, length = 0.25)

where the 'length' argument to arrows() defines the width of the upper and 
lower boundary lines.

There are a fair number of other functions around that can add CI's to plots as 
well and a search of the archives should bear fruit.
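
Applied to the data from the original question, the same idea might look like the sketch below. The lower and upper limits are the faked 85% and 115% values from the example code, and length = 0.05 is an arbitrary choice.

```r
pdf(NULL)                        # null device, no file is written
hh <- t(VADeaths)[, 1]
ci.l <- hh * 0.85
ci.u <- hh * 1.15

mp <- barplot(hh, ylim = c(0, 20))
# code = 3 draws a head at both ends; angle = 90 turns the heads into flat caps.
# length is given in inches, so the cap size does not depend on the bar width.
arrows(mp, ci.l, mp, ci.u, code = 3, angle = 90, length = 0.05)
invisible(dev.off())
```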

Regards,

Marc Schwartz


 


Re: [R] plot two time series with different length and different starting point in one figure.

2013-01-22 Thread arun
HI Rebecca,
Try this:

dateA <- seq.Date(as.Date("28JAN2012", format="%d%B%Y"), as.Date("28DEC2012", format="%d%B%Y"), by="month")
dateB <- seq.Date(as.Date("30JAN2012", format="%d%B%Y"), as.Date("30DEC2012", format="%d%B%Y"), by="month")
set.seed(15)
A <- data.frame(dateA, value=cumsum(sample(1:50, 12, replace=TRUE)))
set.seed(25)
B <- data.frame(dateB, value=cumsum(sample(1:72, 12, replace=TRUE)))
B[,1] <- as.Date(gsub("\\d+$", "28", B[,1]))
B[,1][duplicated(B[,1], fromLast=TRUE)] <- as.Date(gsub("(.*-).*(-.*)", "\\102\\2", B[,1][duplicated(B[,1], fromLast=TRUE)]))
# this step may not be needed in your data.  In the month of March, there were
# two values
library(xts)
Anew <- as.xts(A[,-1], order.by=A[,1])
Bnew <- as.xts(B[,-1], order.by=B[,1])
res <- merge(Anew, Bnew)
library(zoo)
plot.zoo(res)
A.K.



- Original Message -
From: Yuan, Rebecca rebecca.y...@bankofamerica.com
To: 'arun' smartpink...@yahoo.com
Cc: 
Sent: Tuesday, January 22, 2013 3:53 PM
Subject: RE: [R] plot two time series with different length and different 
starting point in one figure.

Hello Arun,

I do not want to remove those NA values because they are the monthly data but 
recorded as the last calendar date in A and last business date in B.

I tried to use 

raw_time <- substr(raw_time, 3, 9)
raw_time <- as.Date(raw_time, format="%d%B%Y")

to cutoff the date and leave the month and year in raw_time, and then convert 
it to a valid date type of data, but I failed.

Is there a way that I can present

2012-09-28       NA          NA    5400726 14861715970 
2012-09-30  5035606 14832837436         NA          NA

into something like

2012-09-30  5035606 14832837436    5400726 14861715970

By converting 2012-09-28 to the last calendar date as of 2012-09-30 then B will 
be recorded at the last business date of the month, and will not have any NA 
values.

Dput() gives me

 dput(tail(res))
structure(c(121, NA, 111, 111, 120, 119, 309, NA, 313, 307, 30, 313, 130, 130,
NA, 130, 130, 130, 309, 313, NA, 309, 310, 315), class = c("xts", "zoo"),
.indexCLASS = "Date", .indexTZ = "", tclass = "Date", tzone = "", index =
structure(c(134, 134, 134, 135, 135, 135), tzone = "", tclass = "Date"), .Dim =
c(6L, 4L), .Dimnames = list(NULL, c("raw_acct", "raw_baln", "raw_acct.1",
"raw_baln.1")))

Thanks very much!

Cheers,

Rebecca

-Original Message-
From: arun [mailto:smartpink...@yahoo.com] 
Sent: Tuesday, January 22, 2013 3:41 PM
To: Yuan, Rebecca
Cc: R help
Subject: Re: [R] plot two time series with different length and different 
starting point in one figure.

Hi Rebecca,

In the previous email,
 res <- merge(Anew, Bnew)
head(res)
#           Anew Bnew
#2012-01-01  181   NA
#2012-01-02   59   NA
#2012-01-03  290   NA
#2012-01-04  196   NA
#2012-01-05  111   NA
#2012-01-06  297   NA

plot.zoo(res) # skips the NA values in Bnew (I guess NA values in Anew would
# likewise be left out of the plot)

If you want to remove the NA rows, use
na.omit() or complete.cases(), as I did in the previous email.

Could you dput() an example dataset?

A.K.
 





Re: [R] plot two time series with different length and different starting point in one figure.

2013-01-22 Thread Yuan, Rebecca
Hello Arun,

Thanks very much! In this way, it works! I convert both A and B to the same day 
of the month, and therefore there is no NA shown for different last business 
day and last calendar day of the month.

You are very helpful!

Cheers,

Rebecca


[R] tapply and functions with more than one objects

2013-01-22 Thread Dominic Roye
Hello,

How can I use a custom function in tapply that takes more than one variable?

I mean, sum(x) only needs one object, but when I have a function
function(x,y) with more arguments, how do I indicate where the other
variables are?


I hope someone can help me. Thank you!!

Best regards,

Dominic
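
Two common patterns, sketched with made-up vectors x, w and g: a constant extra argument can be passed by name after FUN, while a second variable that varies row by row can be handled by grouping the row indices instead of the values.

```r
x <- c(1, 2, 3, 4, 5, 6)
w <- c(1, 1, 2, 2, 1, 3)              # a second variable, same length as x
g <- c("a", "a", "a", "b", "b", "b")  # grouping variable

# Case 1: the extra argument is a constant; pass it by name after FUN.
med <- tapply(x, g, quantile, probs = 0.5)

# Case 2: the extra argument varies by row; group the row indices and let
# the function pick out the matching pieces of x and w.
wm <- tapply(seq_along(x), g, function(i) weighted.mean(x[i], w[i]))
wm
```

mapply() over split(x, g) and split(w, g) would be another option for the second case.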



[R] Adding a line to barchart

2013-01-22 Thread Jonathan Greenberg
R-helpers:

I need a quick help with the following graph (I'm a lattice newbie):

require(lattice)
npp=1:5
names(npp)=c("A","B","C","D","E")
barchart(npp,origin=0,box.width=1)

# What I want to do is add a single vertical line positioned at x = 2 that
# lies over the bars (say, using a dotted line).  How do I go about doing
# this?

--j
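
One possible approach is a custom panel function that first draws the bars and then the line on top. This is only a sketch; the dotted line type follows the wording of the question.

```r
library(lattice)
npp <- 1:5
names(npp) <- c("A", "B", "C", "D", "E")

p <- barchart(npp, origin = 0, box.width = 1,
              panel = function(...) {
                panel.barchart(...)                  # draw the bars first
                panel.abline(v = 2, lty = "dotted")  # then the line on top
              })

pdf(NULL)   # null device so that printing does not create a file
print(p)
invisible(dev.off())
```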

-- 
Jonathan A. Greenberg, PhD
Assistant Professor
Global Environmental Analysis and Remote Sensing (GEARS) Laboratory
Department of Geography and Geographic Information Science
University of Illinois at Urbana-Champaign
607 South Mathews Avenue, MC 150
Urbana, IL 61801
Phone: 217-300-1924
http://www.geog.illinois.edu/~jgrn/
AIM: jgrn307, MSN: jgrn...@hotmail.com, Gchat: jgrn307, Skype: jgrn3007



[R] density of hist(freq = FALSE) inversely affected by data magnitude

2013-01-22 Thread J Toll
Hi,

I have a couple of observations, a question or two, and perhaps a
suggestion related to the plotting of density on the y-axis within the
hist() function when freq=FALSE.  I was using the function and trying
to develop an intuitive understanding of what the density is telling
me.  After reading through this fairly helpful post:

http://stats.stackexchange.com/questions/17258/odd-problem-with-a-histogram-in-r-with-a-relative-frequency-axis

I finally realized that in the case where freq = FALSE, the y-axis
isn't really telling me the density.  It's actually indicating the
density multiplied by the bin size.  I assume this is for the case
where the bins may be of non-regular size.

from hist.default:

dens <- counts/(n * diff(breaks))

So the count in each bin is divided by the total number of
observations (n) multiplied by the size of the bin.  The problem, as I
see it, is that the density ends up being scaled by the size of the
bins, which is inversely proportional to the magnitude of the data.
Therefore the magnitude of the data is directly affecting the density,
which seems problematic.

For example*:

set.seed()
x <- runif(100)
y <- x / 1000

par(mfrow = c(2, 1))
hist(x, prob = TRUE)
hist(y, prob = TRUE)

From this example, you see that the density for the y histogram is
1000 times larger, simply because the y data is 1000 times smaller.
Again, that seems problematic.  It seems to me, that the density
should be unit-less, but here it's affected by the magnitude of the
data.

So, my question is, why is density calculated this way?

For the case where all the bins are of the same size, I would think
density should simply be calculated as:

dens <- counts / n

Of course, that might be somewhat misleading for the case where the
bin sizes vary.  So then why not calculate density as:

dens <- counts / (n * diff(breaks) / min(diff(breaks)))

Dividing diff(breaks) by min(diff(breaks)) removes the scaling effect
of the magnitude of the data, and simply leaves the relative
difference in bin size.

For the case where all the bins are the same size, the calculation is
equivalent to dens <- counts / n

For all other cases, the density is scaled by the size of the bin, but
unaffected by the magnitude of the data.

So, what am I misunderstanding?  Why is density calculated as it is,
and what does it mean?

Thanks,


James
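
A quick numerical check may help with the interpretation. With freq = FALSE the bar heights are counts/(n * bin width), so it is the bar areas (height times width), not the heights, that sum to 1. The seed below is an arbitrary choice made only for this sketch.

```r
set.seed(1)                  # arbitrary seed, chosen only for this sketch
x <- runif(100)
y <- x / 1000

hx <- hist(x, plot = FALSE)
hy <- hist(y, plot = FALSE)

# Bar area = density * bin width; the areas sum to 1 for both data sets.
area_x <- sum(hx$density * diff(hx$breaks))
area_y <- sum(hy$density * diff(hy$breaks))
c(area_x, area_y)

# Relative frequencies (counts / n) do not depend on the units of the data.
rel_x <- hx$counts / sum(hx$counts)
rel_x
```

Read this way, the larger heights for y offset its narrower bins, in the same way that a probability density changes by a constant factor when the variable is rescaled.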


*example from 
http://stats.stackexchange.com/questions/17258/odd-problem-with-a-histogram-in-r-with-a-relative-frequency-axis



Re: [R] plot two time series with different length and different starting point in one figure.

2013-01-22 Thread arun
Hi Rebecca,
No problem.
Just a doubt regarding the last calendar day and last business day.
dateA <- seq(as.Date("01FEB2012", format="%d%B%Y"), length=15, by="1 month") - 1
# gives the last calendar day of each month
dateB <- seq.Date(as.Date("28MAR2012", format="%d%B%Y"), as.Date("28DEC2012", format="%d%B%Y"), by="month")
# here I used day 28.  If the day didn't change, then this works.
set.seed(15)
A <- data.frame(dateA, value=cumsum(sample(1:50, 15, replace=TRUE)))
set.seed(25)
B <- data.frame(dateB, value=cumsum(sample(1:72, 10, replace=TRUE)))
A[,1] <- as.Date(gsub("\\d+$", "28", A[,1]))
library(xts)
library(zoo)
Anew <- as.xts(A[,-1], order.by=A[,1])
Bnew <- as.xts(B[,-1], order.by=B[,1])
res <- merge(Anew, Bnew)
plot.zoo(res)
 
From your reply, it seems like dateB day didn't change.

A.K.





- Original Message -
From: Yuan, Rebecca rebecca.y...@bankofamerica.com
To: 'arun' smartpink...@yahoo.com
Cc: R help r-help@r-project.org
Sent: Tuesday, January 22, 2013 5:28 PM
Subject: RE: [R] plot two time series with different length and different 
starting point in one figure.

Hello Arun,

Thanks very much! In this way, it works! I convert both A and B to the same day 
of the month, and therefore there is no NA shown for different last business 
day and last calendar day of the month.

You are very help!

Cheers,

Rebecca

-Original Message-
From: arun [mailto:smartpink...@yahoo.com] 
Sent: Tuesday, January 22, 2013 5:06 PM
To: Yuan, Rebecca
Cc: R help
Subject: Re: [R] plot two time series with different length and different 
starting point in one figure.

HI Rebecca,
Try this:

dateA-seq.Date(as.Date(28JAN2012,format=%d%B%Y),as.Date(28DEC2012,format=%d%B%Y),by=month)
dateB-seq.Date(as.Date(30JAN2012,format=%d%B%Y),as.Date(30DEC2012,format=%d%B%Y),by=month)
set.seed(15)
 A-data.frame(dateA,value=cumsum(sample(1:50,12,replace=TRUE)))
 set.seed(25)
 B-data.frame(dateB,value=cumsum(sample(1:72,12,replace=TRUE)))
B[,1]-as.Date(gsub(\\d+$,28,B[,1]))
 
B[,1][duplicated(B[,1],fromLast=TRUE)]-as.Date(gsub((.*-).*(-.*),\\102\\2,B[,1][duplicated(B[,1],fromLast=TRUE)]))
 #this step may not be needed in ur data.  In the month of march, there were 
two values
library(xts)
Anew-as.xts(A[,-1],order.by=A[,1])
 Bnew-as.xts(B[,-1],order.by=B[,1])
 res-merge(Anew,Bnew)
library(zoo)
plot.zoo(res)
A.K.



- Original Message -
From: Yuan, Rebecca rebecca.y...@bankofamerica.com
To: 'arun' smartpink...@yahoo.com
Cc: 
Sent: Tuesday, January 22, 2013 3:53 PM
Subject: RE: [R] plot two time series with different length and different 
starting point in one figure.

Hello Arun,

I do not want to remove those NA values because they are the monthly data but 
recorded as the last calendar date in A and last business date in B.

I tried to use 

raw_time    - substr(raw_time,3,9)
raw_time    - as.Date(raw_time,format=%d%B%Y)

to cutoff the date and leave the month and year in raw_time, and then convert 
it to a valid date type of data, but I failed.

Is there a way that I can present

2012-09-28       NA          NA    5400726 14861715970 2012-09-30  5035606 
14832837436         NA          NA

into something like

2012-09-30  5035606 14832837436    5400726 14861715970

By converting 2012-09-28 to the last calendar date as of 2012-09-30 then B will 
be recorded at the last business date of the month, and will not have any NA 
values.

Dput() gives me

 dput(tail(res))
structure(c(121, NA, 111, 111, 120, 119, 309, NA, 313, 307, 30, 313, 130, 130, 
NA, 130, 130, 130, 309, 313, NA, 309, 310, 315), class = c(xts, zoo), 
.indexCLASS = Date, .indexTZ = , tclass = Date, tzone = , index = 
structure(c(134, 134, 134, 135, 135, 135), tzone = , tclass = Date), .Dim = 
c(6L, 4L), .Dimnames = list(NULL, c(raw_acct, raw_baln, raw_acct.1,
raw_baln.1)))

Thanks very much!

Cheers,

Rebecca

-Original Message-
From: arun [mailto:smartpink...@yahoo.com]
Sent: Tuesday, January 22, 2013 3:41 PM
To: Yuan, Rebecca
Cc: R help
Subject: Re: [R] plot two time series with different length and different 
starting point in one figure.

Hi Rebecca,

In the previous email,
res <- merge(Anew,Bnew)
head(res)
#   Anew Bnew
#2012-01-01  181   NA
#2012-01-02   59   NA
#2012-01-03  290   NA
#2012-01-04  196   NA
#2012-01-05  111   NA
#2012-01-06  297   NA
 
plot.zoo(res) # removes the NA values from Bnew.. (if NA was present in Anew, I 
guess, it would remove that from plotting)  

If you want to remove the NA rows:
use, na.omit() or complete.cases()? #as I did in the previous email.

Could you dput() an example dataset?

A.K.
 




- Original Message -
From: Yuan, Rebecca rebecca.y...@bankofamerica.com
To: 'arun' smartpink...@yahoo.com
Cc: R help r-help@r-project.org
Sent: Tuesday, January 22, 2013 2:38 PM
Subject: RE: [R] plot two time series with different length and different 
starting point in one figure.

Hello Arun,

This would help me to get the date type of data. A new question comes out that 
since the dates are not exactly the same on two date sets, there are some NA 
values 

[R] summarise subsets of a vector

2013-01-22 Thread Wim Kreinen
Hello,

I have a vector called test, and I wish to compute the mean of the first
10 numbers, the second 10 numbers, etc.
How can I do that?
Thanks Wim

  dput (test)
c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0.71, 0.21875, 0, 0.27375, 0.26125,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0.84125,
0.0575, 0.92625, 0.12, 0, 0)

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How to assign time series to a vector with one leap year

2013-01-22 Thread arun
HI,
You can check this link:
http://r.789695.n4.nabble.com/leap-years-in-temporal-series-command-ts-td3309014.html

Also, this may help you:
library(lubridate) # see ?leap_year
 leap_year(2008)
#[1] TRUE
 ymd("2008-2-29")
# 1 parsed with %Y-%m-%d
#[1] "2008-02-29 UTC"
A.K.
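
As a rough sketch of one alternative to a fixed-frequency ts object: an explicit hourly time index handles the leap year automatically. The object names below are illustrative, and UTC is assumed so that daylight-saving changes do not add or drop hours.

```r
# Hourly index from Jan 1 2008 00:00 to Dec 31 2009 23:00 (UTC assumed).
idx <- seq(from = as.POSIXct("2008-01-01 00:00", tz = "UTC"),
           to   = as.POSIXct("2009-12-31 23:00", tz = "UTC"),
           by   = "hour")

# Count the hours that fall in each year.
n_total <- length(idx)
n_2008  <- sum(format(idx, "%Y") == "2008")  # leap year: 366 * 24
n_2009  <- sum(format(idx, "%Y") == "2009")  # regular year: 365 * 24
```

An index like this can then be paired with the data column, for example in a data frame or as the order.by argument of a zoo or xts object.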




- Original Message -
From: Janesh Devkota janesh.devk...@gmail.com
To: r-help@r-project.org
Cc: 
Sent: Tuesday, January 22, 2013 2:46 PM
Subject: [R] How to assign time series to a vector with one leap year

Hello All,

I am trying to do the time series analysis in R and I want to assign a
vector as a time series. The data I provided is hourly. The data is from
Jan 1 2008 to Dec 31 2009. How can I assign the data such that the first
year is leap year and second is not ?

airtemp <- read.csv("airtemp.csv",header=T,sep="")

aw <- ts(airtemp,start=2008,frequency=8784,end=2009)

I assigned frequency as 8784 because 2008 year will have 8784 hourly data
points and 2009 has 8760 data points. The total data points are 17544

The data can be found on
https://www.dropbox.com/s/03z74632v1f3g1e/airtemp.csv

I apologize if this is very trivial to some of you.

Thanks.

    [[alternative HTML version deleted]]



Re: [R] tapply and functions with more than one objects

2013-01-22 Thread David Winsemius

On Jan 22, 2013, at 2:24 PM, Dominic Roye wrote:

 Hello,
 
 How can I use a custom function in tapply which has more than one variable?
 
 I mean sum(x) only needs one object, but when I have a function
 function(x,y) with more, how do I indicate where the other variables
 to use are?

You can use:

lapply(split( multi_col_object, category_vec) , function(x,y){sum(x,y)}  ) 

aggregate(dat, category, FUN=sum)

Or:

do.call(rbind, by( multi_col_object, category_vec, function(x,y){ } ) )

Sometimes `Reduce` is more compact. Other times `mapply` is needed.
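
A minimal sketch of the split-based approach, assuming a small made-up data frame `dat` and a hypothetical two-argument function `wsum`; none of these names come from the original question:

```r
# Made-up data: a grouping column and two numeric columns.
dat <- data.frame(g = c("a", "a", "b", "b"),
                  x = 1:4,
                  y = c(10, 20, 30, 40))

# A function of two variables (hypothetical example).
wsum <- function(x, y) sum(x * y)

# Split the whole data frame by group, then hand both columns
# of each piece to the function.
res <- sapply(split(dat, dat$g), function(d) wsum(d$x, d$y))
print(res)
```

Splitting the data frame rather than a single vector is what makes both columns available inside each group.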
-- 

David Winsemius
Alameda, CA, USA



Re: [R] density of hist(freq = FALSE) inversely affected by data magnitude

2013-01-22 Thread William Dunlap
The probability density function is not unitless - it is the derivative of the
[cumulative] probability distribution function, so it has units of
delta-probability-mass over delta-x.  It must integrate to 1 (over all
possible x).  hist(freq=FALSE,x) or hist(prob=TRUE,x) displays an estimate of
the density function, and the following example shows how the scale matches
what you get from the presumed population density function.

f <- function (n, sd)
{
    x <- rnorm(n, sd = sd)
    hist(x, freq = FALSE) # estimated density
    s <- seq(min(x), max(x), len = 129)
    lines(s, dnorm(s, sd = sd), col = "red") # overlay expected density for this sample
}
f(1e6, sd=1)
f(100, sd=1)
f(100, sd=0.0001)
f(1e6, sd=0.0001)
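
The "integrates to 1" property can also be checked numerically without plotting. This is only a sketch: the seed and sample size are arbitrary, and it relies on hist() returning `density` and `breaks` components.

```r
set.seed(1)
x <- runif(100)
y <- x / 1000            # same data on a scale 1000 times smaller

hx <- hist(x, plot = FALSE)
hy <- hist(y, plot = FALSE)

# Area under each histogram: sum of (bar height * bar width).
area_x <- sum(hx$density * diff(hx$breaks))
area_y <- sum(hy$density * diff(hy$breaks))

# The bars of y are much narrower, so they must be much taller
# for the total area to stay the same.
ratio <- max(hy$density) / max(hx$density)
```

Both areas come out as 1, which is why the density values grow when the data (and hence the bin widths) shrink.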

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com


 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
 Behalf
 Of J Toll
 Sent: Tuesday, January 22, 2013 2:48 PM
 To: r-help
 Subject: [R] density of hist(freq = FALSE) inversely affected by data 
 magnitude
 
 Hi,
 
 I have a couple of observations, a question or two, and perhaps a
 suggestion related to the plotting of density on the y-axis within the
 hist() function when freq=FALSE.  I was using the function and trying
 to develop an intuitive understanding of what the density is telling
 me.  After reading through this fairly helpful post:
 
 http://stats.stackexchange.com/questions/17258/odd-problem-with-a-histogram-in-r-with-a-relative-frequency-axis
 
 I finally realized that in the case where freq = FALSE, the y-axis
 isn't really telling me the density.  It's actually indicating the
 density multiplied by the bin size.  I assume this is for the case
 where the bins may be of non-regular size.
 
 from hist.default:
 
 dens <- counts/(n * diff(breaks))
 
 So the count in each bin is divided by the total number of
 observations (n) multiplied by the size of the bin.  The problem, as I
 see it, is that the density ends up being scaled by the size of the
 bins, which is inversely proportional to the magnitude of the data.
 Therefore the magnitude of the data is directly affecting the density,
 which seems problematic.
 
 For example*:
 
 set.seed()
 x <- runif(100)
 y <- x / 1000
 
 par(mfrow = c(2, 1))
 hist(x, prob = TRUE)
 hist(y, prob = TRUE)
 
 From this example, you see that the density for the y histogram is
 1000 times larger, simply because the y data is 1000 times smaller.
 Again, that seems problematic.  It seems to me, that the density
 should be unit-less, but here it's affected by the magnitude of the
 data.
 
 So, my question is, why is density calculated this way?
 
 For the case where all the bins are of the same size, I would think
 density should simply be calculated as:
 
 dens <- counts / n
 
 Of course, that might be somewhat misleading for the case where the
 bin sizes vary.  So then why not calculate density as:
 
 dens <- counts / (n * diff(breaks) / min(diff(breaks)))
 
 Dividing diff(breaks) by min(diff(breaks)) removes the scaling effect
 of the magnitude of the data, and simply leaves the relative
 difference in bin size.
 
 For the case where all the bins are the same size, the calculation is
 equivalent to dens <- counts / n
 
 For all other cases, the density is scaled by the size of the bin, but
 unaffected by the magnitude of the data.
 
 So, what am I misunderstanding?  Why is density calculated as it is,
 and what does it mean?
 
 Thanks,
 
 
 James
 
 
 *example from
 http://stats.stackexchange.com/questions/17258/odd-problem-with-a-histogram-in-r-with-a-relative-frequency-axis
 


Re: [R] What is the convergence criterion for binomial logit in glm?

2013-01-22 Thread David Winsemius

On Jan 22, 2013, at 2:55 PM, Dimitri Liakhovitski wrote:

 Dear R-ers,
 
 I am running logistics regression using package glm: glm(myDV ~ .,
 data=mydata, family=binomial(logit))
 
 I have a general question: in glm (binary logit) - what convergence
 criterion is being used?

You should look at the help page for `glm` (and follow the obvious links.)
 
 -- 

David Winsemius
Alameda, CA, USA



Re: [R] Adding a line to barchart

2013-01-22 Thread arun
Hi,

Maybe this helps:
 barchart(npp,origin=0,box.width=1,
 panel=function(x,y,...){
 panel.barchart(x,y,...)
 panel.abline(v=2,col.line="red",lty=3)})
A.K.




- Original Message -
From: Jonathan Greenberg j...@illinois.edu
To: r-help r-help@r-project.org
Cc: 
Sent: Tuesday, January 22, 2013 5:41 PM
Subject: [R] Adding a line to barchart

R-helpers:

I need a quick help with the following graph (I'm a lattice newbie):

require(lattice)
npp=1:5
names(npp)=c("A","B","C","D","E")
barchart(npp,origin=0,box.width=1)

# What I want to do, is add a single vertical line positioned at x = 2 that
lays over the bars (say, using a dotted line).  How do I go about doing
this?

--j

-- 
Jonathan A. Greenberg, PhD
Assistant Professor
Global Environmental Analysis and Remote Sensing (GEARS) Laboratory
Department of Geography and Geographic Information Science
University of Illinois at Urbana-Champaign
607 South Mathews Avenue, MC 150
Urbana, IL 61801
Phone: 217-300-1924
http://www.geog.illinois.edu/~jgrn/
AIM: jgrn307, MSN: jgrn...@hotmail.com, Gchat: jgrn307, Skype: jgrn3007

    [[alternative HTML version deleted]]



Re: [R] summarise subsets of a vector

2013-01-22 Thread arun
Hi,
try this:
 unlist(lapply(split(test,((seq_along(test)-1)%/% 10)+1),mean))
#   1    2    3    4    5    6    7    8 
#0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.146375 
  #     9   10   11 
#0.00 0.194500 0.00 
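
The same grouping index also works with tapply. A small self-contained sketch on a made-up vector `v` (not the poster's data):

```r
# 21 values: ten 0s, ten 1s, and a trailing 2 that forms a short last group.
v <- c(rep(0, 10), rep(1, 10), 2)

# Integer division maps positions 1-10 to group 1, 11-20 to group 2, and so on.
grp <- (seq_along(v) - 1) %/% 10 + 1

res <- tapply(v, grp, mean)
print(res)
```

A final group with fewer than 10 elements is averaged over the elements it actually has.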

A.K.



- Original Message -
From: Wim Kreinen wkrei...@gmail.com
To: r-help r-help@r-project.org
Cc: 
Sent: Tuesday, January 22, 2013 6:09 PM
Subject: [R] summarise subsets of a vector

Hello,

I have vector called test. And now I wish to measure the mean of the first
10 number, the second 10 numbers etc
How does it work?
Thanks Wim

 dput (test)
c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0.71, 0.21875, 0, 0.27375, 0.26125,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0.84125,
0.0575, 0.92625, 0.12, 0, 0)

    [[alternative HTML version deleted]]



Re: [R] What is the convergence criterion for binomial logit in glm?

2013-01-22 Thread Dimitri Liakhovitski
I already looked. This help file for loglin (
http://127.0.0.1:12583/library/stats/html/loglin.html) says:
The Iterative Proportional Fitting algorithm as presented in Haberman
(1972) is used for fitting the model. At most iter iterations are
performed, convergence is taken to occur when the maximum deviation between
observed and fitted margins is less than eps. And the default eps is 0.1

So, is it then the convergence criterion used by glm when
family=binomial(logit)?
I just need to know for sure.

Thanks for confirming!
Dimitri



On Tue, Jan 22, 2013 at 6:37 PM, David Winsemius dwinsem...@comcast.net wrote:


 On Jan 22, 2013, at 2:55 PM, Dimitri Liakhovitski wrote:

  Dear R-ers,
 
  I am running logistics regression using package glm: glm(myDV ~ .,
  data=mydata, family=binomial(logit))
 
  I have a general question: in glm (binary logit) - what convergence
  criterion is being used?

 You should look at the help page for `glm` (and follow the obvious links.)
 
  --

 David Winsemius
 Alameda, CA, USA




-- 
Dimitri Liakhovitski
gfk.com http://marketfusionanalytics.com/



Re: [R] What is the convergence criterion for binomial logit in glm?

2013-01-22 Thread David Winsemius

On Jan 22, 2013, at 3:59 PM, Dimitri Liakhovitski wrote:

 I already looked. This help file for loglin 
 (http://127.0.0.1:12583/library/stats/html/loglin.html) says:
 The Iterative Proportional Fitting algorithm as presented in Haberman (1972) 
 is used for fitting the model. At most iter iterations are performed, 
 convergence is taken to occur when the maximum deviation between observed and 
 fitted margins is less than eps. And the default eps is 0.1
  
 So, is it then the convergence criterion used by glm when 
 family=binomial(logit)?
 I just need to know for sure.

I assumed that you would follow the link on help(glm) to `glm.control`, where 
the convergence criterion is described and can be altered. The link to that help 
page is at the end of the line that reads:

control
a list of parameters for controlling the fitting process. For glm.fit this is 
passed to glm.control.

The default epsilon is NOT 0.1
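
For reference, the defaults can be inspected directly. In this sketch the model on the built-in mtcars data is only an illustration and is not the poster's model; per ?glm.control, iteration stops when |dev - dev_old| / (|dev| + 0.1) < epsilon.

```r
# Default fitting controls used by glm().
ctrl <- glm.control()
ctrl$epsilon   # convergence tolerance
ctrl$maxit     # maximum number of IWLS iterations

# Illustrative fit with a tighter tolerance passed via 'control'.
fit <- glm(am ~ wt, data = mtcars, family = binomial("logit"),
           control = glm.control(epsilon = 1e-10, maxit = 50))
fit$converged  # did the criterion get met?
fit$iter       # number of iterations used
```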

-- 
David.
  
 Thanks for confirming!
 Dimitri
 
 
  
 On Tue, Jan 22, 2013 at 6:37 PM, David Winsemius dwinsem...@comcast.net 
 wrote:
 
 On Jan 22, 2013, at 2:55 PM, Dimitri Liakhovitski wrote:
 
  Dear R-ers,
 
  I am running logistics regression using package glm: glm(myDV ~ .,
  data=mydata, family=binomial(logit))
 
  I have a general question: in glm (binary logit) - what convergence
  criterion is being used?
 
 You should look at the help page for `glm` (and follow the obvious links.)
 
  --
 
 David Winsemius
 Alameda, CA, USA
 
 
 
 
 -- 
 Dimitri Liakhovitski
 gfk.com

David Winsemius
Alameda, CA, USA



[R] How to construct a valid seed for l'Ecuyer's method with given .Random.seed?

2013-01-22 Thread Marius Hofert
Dear expeRts,

I struggle with the following problem using snow clusters for parallel 
computing: I would like to specify l'Ecuyer's random number generator. Base R 
creates a .Random.seed of length 7, the first value indicating the kind of 
random number generator. I would thus like to use the components 2 to 7 as the 
seed for l'Ecuyer's random number generator.

By doing so, I receive (see the minimal example below):

,
|  Loading required package: Rmpi
| Loading required package: grDevices
| Loading required package: grDevices
| Loading required package: grDevices
| Loading required package: grDevices
|   4 slaves are spawned successfully. 0 failed.
| Loading required package: rlecuyer
| Error in .lec.SetPackageSeed(seed) (from #11) :
|   Seed[0] = -930997252, Seed is not set.
`

What's the problem? How can I construct a valid seed for l'Ecuyer's rng with
just the information in .Random.seed?

Thanks & Cheers,

Marius


Here is the minimal example:

require(doSNOW)
require(foreach)

doForeach <- function(n, seed=1, type="MPI")
{
    ## create cluster object
    cl <- snow::makeCluster(parallel::detectCores(), type=type)
    on.exit(snow::stopCluster(cl)) ## shut down cluster and terminate execution environment
    registerDoSNOW(cl) ## register the cluster object with foreach

    ## seed
    if(seed=="L'Ecuyer-CMRG") {
        if(!exists(".Random.seed")) stop(".Random.seed does not exist - in l'Ecuyer setting")
        .t <- snow::clusterSetupRNG(cl, seed=.Random.seed[2:7]) # => fails!
    }

    ## actual work
    foreach(i=seq_len(n)) %dopar% {
        runif(1)
    }
}

## standard (base) way of specifying l'Ecuyer
RNGkind("L'Ecuyer-CMRG") # => .Random.seed is of length 7
res <- doForeach(10, seed="L'Ecuyer-CMRG")



Re: [R] Create a Data Frame from an XML

2013-01-22 Thread Duncan Temple Lang

Hi Adam

 [You seem to have sent the same message twice to the mailing list.]

There are various strategies/approaches to creating the data frame
from the XML.

Perhaps the approach that most closely follows your approach is

  xmlRoot(doc)[ "row" ]

which returns a list of XML nodes whose node name is "row" that are
children of the root node <data>.

So
  sapply(xmlRoot(doc) [ "row" ], xmlAttrs)

yields a matrix with as many columns as there are <row> nodes
and with 4 rows - one for each of the BRAND, NUM, YEAR and VALUE attributes.

So

  d = t( sapply(xmlRoot(doc) [ "row" ], xmlAttrs) )

gives you a matrix with the correct rows and column orientation
and now you can turn that into a data frame, converting the
columns into numbers, etc. as you want with regular R commands
(i.e. independently of the XML).


 D.

On 1/22/13 1:43 PM, Adam Gabbert wrote:
  Hello,
 
 I'm attempting to read information from an XML into a data frame in R using
 the XML package. I am unable to get the data into a data frame as I would
 like.  I have some sample code below.
 
 *XML Code:*
 
 Header...
 
 Data I want in a data frame:
 
 <data>
   <row BRAND="GMC" NUM="1" YEAR="1999" VALUE="1" />
   <row BRAND="FORD" NUM="1" YEAR="2000" VALUE="12000" />
   <row BRAND="GMC" NUM="1" YEAR="2001" VALUE="12500" />
   <row BRAND="FORD" NUM="1" YEAR="2002" VALUE="13000" />
   <row BRAND="GMC" NUM="1" YEAR="2003" VALUE="14000" />
   <row BRAND="FORD" NUM="1" YEAR="2004" VALUE="17000" />
   <row BRAND="GMC" NUM="1" YEAR="2005" VALUE="15000" />
   <row BRAND="GMC" NUM="1" YEAR="1967" VALUE="PRICLESS" />
   <row BRAND="FORD" NUM="1" YEAR="2007" VALUE="17500" />
   <row BRAND="GMC" NUM="1" YEAR="2008" VALUE="22000" />
 </data>
 
 *R Code:*
 
 doc <- xmlInternalTreeParse("Sample2.xml")
 top <- xmlRoot(doc)
 xmlName(top)
 names(top)
 art <- top[["row"]]
 art
 
 *Output:*
 
 art
 <row BRAND="GMC" NUM="1" YEAR="1999" VALUE="1"/>
 
 * *
 
 
 This is where I am having difficulties.  I am unable to access additional
 rows; ( i.e.  row BRAND=GMC NUM=1 YEAR=1967 VALUE=PRICLESS / )
 
 and I am unable to access the individual entries to actually create the
 data frame.  The data frame I would like is as follows:
 
 BRAND  NUM  YEAR  VALUE
 GMC    1    1999  1
 FORD   2    2000  12000
 GMC    1    2001  12500
 etc
 
 Any help or suggestions would be appreciated.  Conversely, my eventual goal
 would be to take a data frame and write it into an XML in the previously
 shown format.
 
 Thank you
 
 AG
 
   [[alternative HTML version deleted]]
 


[R] csv mask order

2013-01-22 Thread Todd Sformo
I have imported a CSV file:

rfishR <- read.csv(file="rfishR.csv", stringsAsFactors = FALSE,
                   strip.white = TRUE, na.strings = c("NA","") )

attach(rfishR)

When I call it up in R, it starts with line 2066 rather than 1 and some of the 
headers (used Headers = TRUE, too) are masked?
Sample data

loc   lat   lon     datum  water  date  obs  net  species           length  mass  other  Dispos
NS10  69.5  -156.8  NAD83  Chuc         pt   f    fourhorn sculpin  225     na    na     id
NS10  69.5  -156.4  NAD83  Chuc         pt   f    fourhorn sculpin  293     na    na     id
NS10  69.5  -156.2  NAD83  Chuc         pt   f    fourhorn sculpin  243     na    na     id

Please help.
-TS




Re: [R] density of hist(freq = FALSE) inversely affected by data magnitude

2013-01-22 Thread J Toll
Bill,

Thank you.  I got it.  That can require a fair amount of work to
interpret the density, especially with odd or irregular bin sizes.

Thanks again,

James



On Tue, Jan 22, 2013 at 5:33 PM, William Dunlap wdun...@tibco.com wrote:
 The probability density function is not unitless - it is the derivative of the
 [cumulative] probability distribution function so it has units 
 delta-probability-mass
 over delta-x.  It must integrate to 1 (over the all possible x).  
 hist(freq=FALSE,x)
 or hist(prob=TRUE,x) displays an estimate of the density function and the 
 following
 example shows how the scale matches what you get from the presumed
 population density function.

 f
 function (n, sd)
 {
 x - rnorm(n, sd = sd)
 hist(x, freq = FALSE) # estimated density
 s - seq(min(x), max(x), len = 129)
 lines(s, dnorm(s, sd = sd), col = red) # overlay expected density for 
 this sample
 }
 f(1e6, sd=1)
 f(100, sd=1)
 f(100, sd=0.0001)
 f(1e6, sd=0.0001)

 Bill Dunlap
 Spotfire, TIBCO Software
 wdunlap tibco.com



Re: [R] What is the convergence criterion for binomial logit in glm?

2013-01-22 Thread Dimitri Liakhovitski
Thanks a lot, David.
Yes, now I see it - it's 1e-8
Dimitri


On Tue, Jan 22, 2013 at 7:08 PM, David Winsemius dwinsem...@comcast.net wrote:

 glm.control





-- 
Dimitri Liakhovitski
gfk.com http://marketfusionanalytics.com/



Re: [R] Creating a Data Frame from an XML

2013-01-22 Thread Ben Tupper

On Jan 22, 2013, at 3:11 PM, Adam Gabbert wrote:

 Hello,
 
 I'm attempting to read information from an XML into a data frame in R using
 the XML package. I am unable to get the data into a data frame as I would
 like.  I have some sample code below.
 
 *XML Code:*
 
 Header...
 
 Data I want in a data frame:
 
 <data>
   <row BRAND="GMC" NUM="1" YEAR="1999" VALUE="1" />
   <row BRAND="FORD" NUM="1" YEAR="2000" VALUE="12000" />
   <row BRAND="GMC" NUM="1" YEAR="2001" VALUE="12500" />
   <row BRAND="FORD" NUM="1" YEAR="2002" VALUE="13000" />
   <row BRAND="GMC" NUM="1" YEAR="2003" VALUE="14000" />
   <row BRAND="FORD" NUM="1" YEAR="2004" VALUE="17000" />
   <row BRAND="GMC" NUM="1" YEAR="2005" VALUE="15000" />
   <row BRAND="GMC" NUM="1" YEAR="1967" VALUE="PRICLESS" />
   <row BRAND="FORD" NUM="1" YEAR="2007" VALUE="17500" />
   <row BRAND="GMC" NUM="1" YEAR="2008" VALUE="22000" />
 </data>
 
 *R Code:*
 
 doc <- xmlInternalTreeParse("Sample2.xml")
 top <- xmlRoot(doc)
 xmlName(top)
 names(top)
 art <- top[["row"]]
 art
 
 *Output:*
 
 art
 <row BRAND="GMC" NUM="1" YEAR="1999" VALUE="1"/>
 
 
 
 
 This is where I am having difficulties.  I am unable to access additional
 rows; ( i.e.  row BRAND=GMC NUM=1 YEAR=1967 VALUE=PRICLESS / )
 
 and I am unable to access the individual entries to actually create the
 data frame.  The data frame I would like is as follows:
 
 BRAND  NUM  YEAR  VALUE
 GMC    1    1999  1
 FORD   2    2000  12000
 GMC    1    2001  12500
etc
 
 Any help or suggestions would be appreciated.  Conversely, my eventual goal
 would be to take a data frame and write it into an XML in the previously
 shown format.
 
Hi,

You are so close!

You have a number of nodes with the name 'row'.  The [[ function selects just 
one item from a list, and when several share that name it returns just the 
first.  So you really want to use the [ function instead and then select by 
order index using [[

library(XML)

> s <- c("<data>",
+   "<row BRAND=\"GMC\" NUM=\"1\" YEAR=\"1999\" VALUE=\"1\" />",
+   "<row BRAND=\"FORD\" NUM=\"1\" YEAR=\"2000\" VALUE=\"12000\" />",
+   "<row BRAND=\"GMC\" NUM=\"1\" YEAR=\"2001\" VALUE=\"12500\" />",
+   "<row BRAND=\"FORD\" NUM=\"1\" YEAR=\"2002\" VALUE=\"13000\" />",
+   "<row BRAND=\"GMC\" NUM=\"1\" YEAR=\"2003\" VALUE=\"14000\" />",
+   "<row BRAND=\"FORD\" NUM=\"1\" YEAR=\"2004\" VALUE=\"17000\" />",
+   "<row BRAND=\"GMC\" NUM=\"1\" YEAR=\"2005\" VALUE=\"15000\" />",
+   "<row BRAND=\"GMC\" NUM=\"1\" YEAR=\"1967\" VALUE=\"PRICLESS\" />",
+   "<row BRAND=\"FORD\" NUM=\"1\" YEAR=\"2007\" VALUE=\"17500\" />",
+   "<row BRAND=\"GMC\" NUM=\"1\" YEAR=\"2008\" VALUE=\"22000\" />",
+   "</data>")

> x <- xmlRoot(xmlTreeParse(s, asText = TRUE, useInternalNodes = TRUE))

> x["row"][[1]]
<row BRAND="GMC" NUM="1" YEAR="1999" VALUE="1"/>

> x["row"][[2]]
<row BRAND="FORD" NUM="1" YEAR="2000" VALUE="12000"/>

Your rows are set up so the attributes have the values you want - use xmlAttrs 
to retrieve them.

> xmlAttrs(x["row"][[2]])
  BRAND     NUM    YEAR   VALUE 
 "FORD"     "1"  "2000" "12000" 


You can use lapply to iterate through each row and apply the xmlAttrs function. 
You'll end up with a list of character vectors.

> y <- lapply(x["row"], xmlAttrs)
> str(y)
List of 10
 $ row: Named chr [1:4] "GMC" "1" "1999" "1"
  ..- attr(*, "names")= chr [1:4] "BRAND" "NUM" "YEAR" "VALUE"
 $ row: Named chr [1:4] "FORD" "1" "2000" "12000"
  ..- attr(*, "names")= chr [1:4] "BRAND" "NUM" "YEAR" "VALUE"
 $ row: Named chr [1:4] "GMC" "1" "2001" "12500"
  ..- attr(*, "names")= chr [1:4] "BRAND" "NUM" "YEAR" "VALUE"
.
.
.

Next make a character matrix using do.call and rbind ...

> m <- do.call(rbind, y)
> str(m)
 chr [1:10, 1:4] "GMC" "FORD" "GMC" "FORD" "GMC" "FORD" "GMC" "GMC" "FORD" ...
 - attr(*, "dimnames")=List of 2
  ..$ : chr [1:10] "row" "row" "row" "row" ...
  ..$ : chr [1:4] "BRAND" "NUM" "YEAR" "VALUE"

And then on to a data.frame...

> d <- as.data.frame(m)
> str(d)
'data.frame':   10 obs. of  4 variables:
 $ BRAND: chr  "GMC" "FORD" "GMC" "FORD" ...
 $ NUM  : chr  "1" "1" "1" "1" ...
 $ YEAR : chr  "1999" "2000" "2001" "2002" ...
 $ VALUE: chr  "1" "12000" "12500" "13000" ...
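
Since every column comes out as character, a type-conversion step is usually wanted afterwards. The following is only a sketch: it rebuilds a three-row stand-in for such a character data frame by hand (so it runs without the XML package) and treats non-numeric VALUE entries such as "PRICLESS" as missing, which is an assumption about what is wanted.

```r
# Hand-built stand-in for a character data frame like the one above.
d <- data.frame(BRAND = c("GMC", "FORD", "GMC"),
                NUM   = c("1", "1", "1"),
                YEAR  = c("1999", "2000", "1967"),
                VALUE = c("1", "12000", "PRICLESS"),
                stringsAsFactors = FALSE)

# Convert the numeric-looking columns.
d$NUM  <- as.integer(d$NUM)
d$YEAR <- as.integer(d$YEAR)

# Entries that are not numbers (here "PRICLESS") become NA;
# the coercion warning is expected and therefore suppressed.
d$VALUE <- suppressWarnings(as.numeric(d$VALUE))

str(d)
```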

Cheers,
Ben




 Thank you
 
 AG
 
   [[alternative HTML version deleted]]
 

Ben Tupper
Bigelow Laboratory for Ocean Sciences
180 McKown Point Rd. P.O. Box 475
West Boothbay Harbor, Maine   04575-0475 
http://www.bigelow.org



Re: [R] csv mask order

2013-01-22 Thread Peter Langfelder
Do your lines start with the hash mark #? If so, they are considered
comments. Set comment.char="" in your call to read.csv. Another
frequent culprit (personal experience) is apostrophes ('). If you
have any in your file, use the argument quote = "\"" or, if you are
sure the data are not quoted, use quote="". This is all described in
detail in help(read.csv); you may want to study it carefully to see
whether your file is misinterpreted in some subtle way.
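
A small sketch of passing both arguments explicitly. The sample text is invented for illustration and is read through the `text=` argument instead of a file:

```r
# Invented sample with a hash mark and apostrophes inside the fields.
txt <- "loc,species,length
NS10,fourhorn sculpin,225
NS#11,O'Brien's catch,293
"

# Only the double quote acts as a quote character, and nothing
# is treated as a comment.
dat <- read.csv(text = txt, quote = "\"", comment.char = "",
                stringsAsFactors = FALSE)

nrow(dat)
dat$species
```

If a read comes back with fewer rows or columns than the file has, count.fields() on the same file can help locate the lines that are being split differently.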

HTH

Peter

On Tue, Jan 22, 2013 at 4:49 PM, Todd Sformo
todd.sfo...@north-slope.org wrote:
 I have imported a CSV file:

 rfishR <- read.csv(file="rfishR.csv", stringsAsFactors = FALSE,
                    strip.white = TRUE, na.strings = c("NA","") )

 attach(rfishR)

 When I call it up in R, it starts with line 2066 rather than 1 and some of 
 the headers (used Headers = TRUE, too) are masked?
 Sample data

 loc   lat   lon     datum  water  date  obs  net  species           length  mass  other  Dispos
 NS10  69.5  -156.8  NAD83  Chuc         pt   f    fourhorn sculpin  225     na    na     id
 NS10  69.5  -156.4  NAD83  Chuc         pt   f    fourhorn sculpin  293     na    na     id
 NS10  69.5  -156.2  NAD83  Chuc         pt   f    fourhorn sculpin  243     na    na     id

 Please help.
 -TS




Re: [R] Creating a Data Frame from an XML

2013-01-22 Thread Gabor Grothendieck
On Tue, Jan 22, 2013 at 3:11 PM, Adam Gabbert adamjgabb...@gmail.com wrote:
 Hello,

 I'm attempting to read information from an XML into a data frame in R using
 the XML package. I am unable to get the data into a data frame as I would
 like.  I have some sample code below.

 *XML Code:*

 Header...

 Data I want in a data frame:

 <data>
   <row BRAND="GMC" NUM="1" YEAR="1999" VALUE="1" />
   <row BRAND="FORD" NUM="1" YEAR="2000" VALUE="12000" />
   <row BRAND="GMC" NUM="1" YEAR="2001" VALUE="12500" />
   <row BRAND="FORD" NUM="1" YEAR="2002" VALUE="13000" />
   <row BRAND="GMC" NUM="1" YEAR="2003" VALUE="14000" />
   <row BRAND="FORD" NUM="1" YEAR="2004" VALUE="17000" />
   <row BRAND="GMC" NUM="1" YEAR="2005" VALUE="15000" />
   <row BRAND="GMC" NUM="1" YEAR="1967" VALUE="PRICLESS" />
   <row BRAND="FORD" NUM="1" YEAR="2007" VALUE="17500" />
   <row BRAND="GMC" NUM="1" YEAR="2008" VALUE="22000" />
 </data>

 *R Code:*

 doc <- xmlInternalTreeParse("Sample2.xml")
 top <- xmlRoot(doc)
 xmlName(top)
 names(top)
 art <- top[["row"]]
 art

This will get a data frame of character columns:

> as.data.frame(t(xpathSApply(doc, "//row", xmlAttrs)), stringsAsFactors = FALSE)
   BRAND NUM YEAR    VALUE
1    GMC   1 1999        1
2   FORD   1 2000    12000
3    GMC   1 2001    12500
4   FORD   1 2002    13000
5    GMC   1 2003    14000
6   FORD   1 2004    17000
7    GMC   1 2005    15000
8    GMC   1 1967 PRICLESS
9   FORD   1 2007    17500
10   GMC   1 2008    22000


--
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com



Re: [R] user units in plotrix

2013-01-22 Thread Murat Tasan
oo, sounds like exactly what i want!
thanks!

-m

On Tue, Jan 22, 2013 at 4:13 PM, Greg Snow 538...@gmail.com wrote:
 If you want to convert between different units using base graphics then look
 at the grconvertX and grconvertY functions (in the graphics package).  These
 functions will convert from/to user coordinates, inches, device, figure, and
 plot coordinates.  So you could use grconvertX to  find out what user value
 on the x scale to give to draw.circle that would then generate a circle with
 a given size in inches, or relative to the device, figure, or plotting
 region.
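
The conversion described above can be sketched as follows. This is an illustration, not a definitive recipe: the device size, the axis limits, the half-inch target and the names `r_user` and `r_user_y` are all made up, and the `draw.circle()` call rests on the assumption that its radius is interpreted in x-axis user units (it only runs if plotrix is installed).

```r
# Throwaway PDF device (dimensions in inches) with very different
# x and y ranges, as in the original question.
pdf(file = tempfile(fileext = ".pdf"), width = 6, height = 4)
plot(c(0, 1), c(0, 1e6), type = "n", xlab = "x", ylab = "y")

# Length of half an inch expressed in x-axis user coordinates:
# convert two positions 0.5 inches apart and take the difference.
r_user <- diff(grconvertX(c(0, 0.5), from = "inches", to = "user"))

# The same half inch in y-axis user coordinates, for comparison.
r_user_y <- diff(grconvertY(c(0, 0.5), from = "inches", to = "user"))

# Assumption: draw.circle() takes its radius in x-axis user units.
if (requireNamespace("plotrix", quietly = TRUE)) {
  plotrix::draw.circle(0.5, 5e5, radius = r_user)
}

invisible(dev.off())
```

Taking the difference of two converted positions, rather than converting a single value, gives a length instead of a location, which is what a radius needs.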


 On Sun, Jan 20, 2013 at 2:59 PM, Murat Tasan mmu...@gmail.com wrote:

 hi all - i'm having some difficulty figuring out how to convert
 between user units (which i can't find a definition for in the
 plotrix package) and either (a) device units (e.g. inches with PDFs)
 or (b) user coordinates along any particular axis.

 as an example, suppose i set up a PDF device with inches, the device
 has both outer and inner margins, and the plot region has drastically
 different x and y coordinate ranges (e.g. xlim = c(0, 1), ylim = c(0,
 SOME_VERY_LARGE_NUMBER)).

 now i'd like to draw.circle(...) but i can't figure out what units the
 radius argument takes.
 "user units" doesn't appear to be inches in this case, and if it
 corresponds to user coordinates, i don't know which axis' scaling is
 to be used as the reference.

 ideally, one would be able to specify the radius in user coordinates
 while specifying _which_ axis to use as the standard (e.g. an axis =
 "y" or axis = "x" argument).

 getFigCtr(...) can help in figuring this out, but its argument takes
 the relative position of the figure region, rather than the plot
 region, which is more apt for properly placing shapes.

 i know the grid package has extensive unit conversion code, but i'm
 trying to update a series of figures using only base graphics...

 i can't seem to find a rigorous definition of user units anywhere in
 the plotrix package.
 anyone know of where i can find this info?

 cheers,

 -m





 --
 Gregory (Greg) L. Snow Ph.D.
 538...@gmail.com



Re: [R] Adding a line to barchart

2013-01-22 Thread PIKAL Petr
Hi
This function adds a line to each panel:

addLine <- function(a = NULL, b = NULL, v = NULL, h = NULL, ..., once = FALSE)
{
    # Matrix of panel numbers in the current lattice plot; 0 marks an empty cell.
    tcL <- trellis.currentLayout()
    k <- 0
    for (i in 1:nrow(tcL)) for (j in 1:ncol(tcL)) if (tcL[i, j] > 0) {
        k <- k + 1
        trellis.focus("panel", j, i, highlight = FALSE)
        # once = TRUE: use the k-th element of each argument for the k-th panel.
        if (once)
            panel.abline(a = a[k], b = b[k], v = v[k], h = h[k], ...)
        else panel.abline(a = a, b = b, v = v, h = h, ...)
        trellis.unfocus()
    }
}


addLine(v=2, col=2, lty=3)
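
A different route, not the one used in addLine() above, is to draw the line from inside a custom panel function, so it is part of the plot object itself. A minimal sketch, assuming the data from the original question; the object name `p` is made up for illustration:

```r
library(lattice)

# Data from the original question.
npp <- 1:5
names(npp) <- c("A", "B", "C", "D", "E")

# Draw the bars first, then a dotted vertical line at x = 2 on top of them.
p <- barchart(npp, origin = 0, box.width = 1,
              panel = function(...) {
                  panel.barchart(...)
                  panel.abline(v = 2, lty = 3, col = 2)
              })
print(p)
```

Because the line is drawn after panel.barchart(), it lies over the bars, and it is redrawn whenever the object is printed again.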

Petr

 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-
 project.org] On Behalf Of Jonathan Greenberg
 Sent: Tuesday, January 22, 2013 11:42 PM
 To: r-help
 Subject: [R] Adding a line to barchart
 
 R-helpers:
 
 I need a quick help with the following graph (I'm a lattice newbie):
 
 require(lattice)
 npp=1:5
 names(npp)=c("A","B","C","D","E")
 barchart(npp,origin=0,box.width=1)
 
 # What I want to do is add a single vertical line positioned at x = 2
 that lies over the bars (say, using a dotted line).  How do I go about
 doing this?
 
 --j
 
 --
 Jonathan A. Greenberg, PhD
 Assistant Professor
 Global Environmental Analysis and Remote Sensing (GEARS) Laboratory
 Department of Geography and Geographic Information Science University
 of Illinois at Urbana-Champaign
 607 South Mathews Avenue, MC 150
 Urbana, IL 61801
 Phone: 217-300-1924
 http://www.geog.illinois.edu/~jgrn/
 AIM: jgrn307, MSN: jgrn...@hotmail.com, Gchat: jgrn307, Skype: jgrn3007
 


