date:20130122



On Jan 22, 2013, at 5:58 AM, Simonas Kecorius wrote:


Hey Duncan,

I can't imagine either what formula OpenOffice uses for quantiles. I have
checked a series of 24 values, calculating quantiles with both OpenOffice
and R. The results are identical. The problem arises when I try to implement
the quantile calculation in this form:
dat2<-with(dat1,aggregate(cbind(dat1[,1:71]),by=list(newID),quantiles,0.1,type=4))
This code does not generate an error, but I guess it does not give the right
result either.


You guess? What result and what is right?


So my question would be:
How could I calculate quantiles for a big data.frame in R (71 columns and
288 rows)? I need to take 24 rows, calculate quantiles, then take another
24 rows, and so on, for all 71 columns.



You have already been told that you are misspelling the name of the R
function.

The other open question in my mind is whether you were hoping for
something other than a single quantile (in this case the 10th
percentile), or perhaps wanted the quantiles that would divide your
data into deciles.


If you want to do the calculation within groups then the second  
argument to `aggregate` must specify the grouping. By design  
`aggregate` will apply the function on all columns.
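The grouping advice can be sketched as follows. This is only an illustration under assumptions: `dat1` here is simulated stand-in data with 3 columns instead of 71, and `newID` labels blocks of 24 consecutive rows as described in the question.

```r
# Hypothetical stand-in for the poster's data: 288 rows, 3 columns instead of 71.
set.seed(1)
dat1 <- as.data.frame(matrix(rnorm(288 * 3), nrow = 288))

# One group label per block of 24 consecutive rows (288 / 24 = 12 groups).
newID <- rep(seq_len(nrow(dat1) / 24), each = 24)

# A single quantile per group and column; note the spelling quantile(),
# not quantiles(). Extra arguments are passed on to FUN.
q10 <- aggregate(dat1, by = list(newID = newID), FUN = quantile,
                 probs = 0.1, type = 4)

# All nine deciles per group; each data column becomes a 12 x 9 matrix.
deciles <- aggregate(dat1, by = list(newID = newID), FUN = quantile,
                     probs = seq(0.1, 0.9, by = 0.1), type = 4)

dim(q10)
dim(deciles$V1)
```

Passing the grouping vector through `by` and the data frame as the first argument avoids the `with(...)`/`cbind(...)` wrapping used in the question.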

--
David.


Thanks in advance.




2013/1/22 Duncan Murdoch murdoch.dun...@gmail.com


On 13-01-21 6:41 PM, Simonas Kecorius wrote:


Dear R users,

I ran into a problem dealing with percentiles in R.

From my previous questions: I have a big data.frame, with lots of
columns and rows. The following commands enable me to calculate means for
the whole data frame.

dat1$newID<-rep(1:(nrow(dat1)/12),each=12) #if nrow(dat1)/12 is integer

dat2<-with(dat1,aggregate(cbind(dat1[,1:71]),by=list(newID),mean))

What I need is to calculate percentiles for each group (there are 12
values in a group). I tried the following:

duomenai<-with(dat1,aggregate(cbind(dat1[,1:71]),by=list(newID),quantiles,0.1,type=4))



You didn't define quantiles, so that won't work.  Assuming that's a typo,
and you meant quantile...




First, is this syntax right?
Secondly, I tried to calculate percentiles using OpenOffice and there is
disagreement between the values. If I do the calculation for a single row of
numbers, then the R and OpenOffice numbers coincide, but for a data.frame it
seems that something goes wrong.



There are lots of different formulas for empirical quantiles.  The ones
available in R are described in the ?quantile help topic.  What formula
does OpenOffice use?
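To see how much the choice of formula can matter, here is a small sketch comparing the nine `type` values of `quantile()` on made-up data. The vector `x` is hypothetical, and the sketch makes no claim about which formula OpenOffice uses.

```r
# Hypothetical sample of 10 values.
x <- c(2, 4, 4, 5, 7, 9, 10, 12, 15, 21)

# 10th percentile under each of the nine definitions documented in ?quantile.
q_by_type <- sapply(1:9, function(t) {
  quantile(x, probs = 0.1, type = t, names = FALSE)
})
names(q_by_type) <- paste0("type", 1:9)
q_by_type
```

With small groups (12 or 24 values) the differences between types can be large, so two programs agree only if they use the same definition.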

Duncan Murdoch





--
Simonas Kecorius

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


David Winsemius, MD
Alameda, CA, USA


Re: [R] A smart way to use $ in data frame

Hello Greg,

Thanks very much!

This helps!

Cheers,

Rebecca

From: Greg Snow [mailto:538...@gmail.com]
Sent: Friday, January 18, 2013 5:17 PM
To: Yuan, Rebecca
Cc: R help
Subject: Re: [R] A smart way to use $ in data frame

The important thing to understand is that $ is a shortcut for [[ and you are 
moving into the realm where a shortcut is the longest distance between 2 points 
(see fortune(312)).

So your code can be something like:

state <- 'oldstate'
balance <- 'oldbalance'
dataa[[balance]][ dataa[[state]]=='AR' ]

You may also benefit from learning to use tools like with and subset 
(though subset has its own complications when used inside of other functions) 
or grep and match to find the columns of interest.
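A minimal sketch of the `[[` approach wrapped in a helper. The function name `pick_balance` and its arguments are hypothetical, and `dataa` is rebuilt here from two columns of the table in the original question.

```r
# Two columns of the dataa table from the question.
dataa <- data.frame(
  newstate   = c("AR", "VA", "WA", "AR"),
  newbalance = c(1170, 4565, 2726, 2700)
)

# Hypothetical helper: column names are passed as strings, so the same code
# works for any data frame regardless of its naming scheme or column order.
pick_balance <- function(df, balance, state, value = "AR") {
  df[[balance]][df[[state]] == value]
}

ar_balance <- pick_balance(dataa, balance = "newbalance", state = "newstate")
ar_balance
```

The same call with `balance = "oldbalance"` and `state = "oldstate"` would apply to a data frame that uses the `old*` names.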

On Fri, Jan 18, 2013 at 12:40 PM, Yuan, Rebecca 
rebecca.y...@bankofamerica.com wrote:
Hello all,

I have a data frame dataa:

    newdate newstate newid newbalance newaccounts
1 31DEC2001       AR     1       1170          61
2 31DEC2001       VA     2       4565          54
3 31DEC2001       WA     3       2726          35
4 31DEC2001       AR     3       2700          35

The following gives me the balance of state AR:

dataa$newbalance[dataa$newstate == 'AR']
1170
2700

Now, I have a different data frame datab. It is very similar to dataa, 
except that the names of the columns are different, and the order of the 
columns is different:

  oldstate   olddate oldbalance oldid oldaccounts
1       AR 31DEC2012       1234     7          40
2       WA 31DEC2012                3          30
3       VA 31DEC2012       2345     5          23
3       AR 31DEC2012       5673     5          23

datab$oldbalance[datab$oldstate== 'AR' ]
1234
5673

Could I have a way to quote

data$balance[data$state == 'AR']

in general, where balance=oldbalance, state=oldstate when data=dataa, and 
balance = newbalance, state = newstate when data=datab ?

Thanks very much!

Cheers,

Rebecca


[R] plot two time series with different length and different starting point in one figure.

Hello,

I do have two different time series A and B, they are different in length and 
starting point. A starts in Jan, 2012 and ends in Dec, 2012 and B starts in 
March, 2012 and ends in Nov, 2012.

How can I plot those two series A and B in the same plot? I.E., from Jan. 2012 
- Feb, 2012, it would have one data point from A and from Mar, 2012-Nov, 2012, 
it would have two data points from A and B, and in December 2012, it would have 
one data point from A.

Thanks very much!

Cheers,

Rebecca



Re: [R] plot two time series with different length and different starting point in one figure.

2013-01-22 Thread PIKAL Petr

Hi

 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-
 project.org] On Behalf Of Yuan, Rebecca
 Sent: Tuesday, January 22, 2013 4:07 PM
 To: R help
 Subject: [R] plot two time series with different length and different
 starting point in one figure.

 Hello,

 I do have two different time series A and B, they are different in
 length and starting point. A starts in Jan, 2012 and ends in Dec, 2012
 and B starts in March, 2012 and ends in Nov, 2012.

 How can I plot those two series A and B in the same plot? I.E., from
 Jan. 2012 - Feb, 2012, it would have one data point from A and from
 Mar, 2012-Nov, 2012, it would have two data points from A and B, and in
 December 2012, it would have one data point from A.

Merge those 2 series.

?merge
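A rough sketch of the merge suggestion, under assumptions: the data frames `A` and `B`, their column names `time` and `balance`, and the monthly values are all made up for illustration.

```r
# Hypothetical monthly series: A covers Jan-Dec 2012, B covers Mar-Nov 2012.
A <- data.frame(time = seq(as.Date("2012-01-01"), by = "month", length.out = 12),
                balance = seq(100, 210, by = 10))
B <- data.frame(time = seq(as.Date("2012-03-01"), by = "month", length.out = 9),
                balance = seq(150, 230, by = 10))

# Outer join on the shared time column; suffixes keep the two value columns
# apart even though both inputs call theirs "balance".
AB <- merge(A, B, by = "time", all = TRUE, suffixes = c(".A", ".B"))

# Plot both series on common axes; months without B values are simply NA.
ylim <- range(c(AB$balance.A, AB$balance.B), na.rm = TRUE)
plot(AB$time, AB$balance.A, type = "b", ylim = ylim,
     xlab = "time", ylab = "balance")
lines(AB$time, AB$balance.B, type = "b", col = "red")
```

Using `all = TRUE` keeps the months where only one series has data, which is what the question asks for.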

Regards
Petr


Re: [R] plot two time series with different length and different starting point in one figure.

Hello Petr,

As the time series have the same column names, I got an error message like:

m1<-merge(A, B, by.x = "time", by.y = "balance")
Error in fix.by(by.x, x) : 'by' must specify uniquely valid column(s)


The point of plotting A and B in one plot is to compare them.

Any other thoughts?

Thanks,

Rebecca



Re: [R] Erro message in glmmADMB

2013-01-22 Thread Ben Bolker

peixotop peixotop at leuphana.de writes:


 I am using glmmADMB and when I run some models, I receive the following
 message:
 
 Error in glmmadmb(eumencells ~ 1 + (1 | owners), data = pred3, family =
 "nbinom",  :
 The function maximizer failed (couldn't find STD file)
 In addition: Lost warning messages:
 running command 'C:\Windows\system32\cmd.exe /c
 C:/Users/helenametal/Documents/R/win-library/2.15/glmmADMB/bin/windows32/glmmadmb.exe
 -maxfn 500 -maxph 5 -noinit -shess' had status 1

  Sorry, this is not nearly enough information  for diagnosis.
This message just means that *something* went wrong during the
optimization step (I do appreciate that it would be good to
improve the error messages, although there may not be that
much more information available).

  Please (1) follow-up to r-sig-mixed-mod...@r-project.org
and (2) give more complete information on the full model you
ran, contents of pred3, etc. (see e.g. http://tinyurl.com/reproducible-000)

Here's a minimal example that shows that a model of the
form you present *could* work:
 
pred3 <- data.frame(owners=rep(letters[1:20],each=20))
set.seed(1001)
u <- rnorm(20,sd=2)
pred3$eumencells <- rnbinom(nrow(pred3),mu=exp(1.5+u),size=2)
library(glmmADMB)
glmmadmb(eumencells ~ 1 + (1|owners),family="nbinom",data=pred3)

-- although it doesn't work very well -- it essentially estimates
the random effects as zero, lumps the among-owner variance into
the NB variance, and mis-estimates the intercept.  I don't blame
glmmADMB for this, though, it's a small data set and a tough problem.


[R] Assistant

2013-01-22 Thread Adelabu Ahmmed

Good-day Sir,

I am an R user trying to estimate the parameters of a beta distribution for a 
particular dataset, but I get this error, which is not clear to me: (Initial 
value in vmmin is not finite)
 beta.fit <- fitdistr(data,densfun=dbeta,shape1=value , shape2=value)
 Kindly assist.
 Expecting your reply.


Re: [R] Simple use of dcast (reshape2 package)

Hi,

This could be done with ?aggregate()
res<-aggregate(aa$Eaten,by=list(ID=aa$ID),FUN=function(x) x)
res1<-data.frame(ID=res[,1],data.frame(res[[2]]))
names(res1)[2:3]<-unique(aa$Target)
res1
#  ID TPP GPA
#1  1   0   9
#2  2   1  11
#3  3   3   8
#4  4   1   8
#5  5   2  10
A.K.




- Original Message -
From: Patrick Connolly p_conno...@slingshot.co.nz
To: R-help r-help@r-project.org
Cc: 
Sent: Tuesday, January 22, 2013 4:23 AM
Subject: [R] Simple use of dcast (reshape2 package)

Suppose I have a small dataframe

 aa
     Target Eaten ID
50      TPP     0  1
51      TPP     1  2
52      TPP     3  3
53      TPP     1  4
54      TPP     2  5
50.1    GPA     9  1
51.1    GPA    11  2
52.1    GPA     8  3
53.1    GPA     8  4
54.1    GPA    10  5

And I want to reshape it into 

  ID TPP GPA
1  1   0   9
2  2   1  11
3  3   3   8
4  4   1   8
5  5   2  10

I realise that dcast function in the reshape2 package can handle much
more complicated tasks than that, but I can't make it do a simple one.

If I simply tried 

 dcast(aa, ... ~ Target)
Using ID as value column: use value.var to override.
Aggregation function missing: defaulting to length
  Eaten GPA TPP
1     0   0   1
2     1   0   2
3     2   0   1
4     3   0   1
5     8   2   0
6     9   1   0
7    10   1   0
8    11   1   0

As per the help file, it's giving counts of the numbers in the Eaten
column since that's the default fun.aggregate value.

My questions are: what fun.aggregate would work?  Alternatively, can
value.var be set to something useful?
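One possible answer to the `value.var` question, as a sketch: name the value column explicitly and put `ID` on the left of the formula. The `aa` data frame is rebuilt from the table above. The `dcast` call is guarded because reshape2 may not be installed, and a base R `reshape()` equivalent is shown alongside it.

```r
# The small data frame from the question.
aa <- data.frame(
  Target = rep(c("TPP", "GPA"), each = 5),
  Eaten  = c(0, 1, 3, 1, 2, 9, 11, 8, 8, 10),
  ID     = rep(1:5, times = 2)
)

# Base R: one row per ID, one Eaten column per Target.
wide_base <- reshape(aa, idvar = "ID", timevar = "Target", direction = "wide")

# reshape2, if available: naming value.var stops dcast() from guessing ID as
# the value column, and each ID/Target cell holds one value, so no
# aggregation function is needed.
if (requireNamespace("reshape2", quietly = TRUE)) {
  wide_dcast <- reshape2::dcast(aa, ID ~ Target, value.var = "Eaten")
}

wide_base
```

The base R result names its columns `Eaten.TPP` and `Eaten.GPA`; they can be renamed afterwards if plain `TPP` and `GPA` are preferred.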

TIA

-- 
~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.  
   ___    Patrick Connolly  
{~._.~}                   Great minds discuss ideas    
_( Y )_               Average minds discuss events 
(:_~*~_:)                  Small minds discuss people  
(_)-(_)                            . Eleanor Roosevelt
      
~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.


Re: [R] Remove my adress from mailing list

HI,

Please check the link:
https://stat.ethz.ch/mailman/listinfo/r-help


At the end, there is an option to unsubscribe:
To unsubscribe from R-help, get a password reminder,
or change your subscription options enter your subscription
email address:
Hope it helps:

A.K.



- Original Message -
From: M. Maurice m.mauric...@yahoo.de
To: R-help@r-project.org R-help@r-project.org
Cc: 
Sent: Tuesday, January 22, 2013 7:13 AM
Subject: [R] Remove my adress from mailing list

Hello!

I wish, that my email-adress is removed from the R-help mailing list.

Thanks!

Re: [R] How to align group based on the common values of two columns in r

Hi,

I am not sure about the logic behind creation of groups, especially, how do you 
want to assign the group number to a particular combination of Feature and OS.
One possible way would be:
 dat1$Group<-paste(dat1[,1],dat1[,2],sep="")
 dat1
#  Feature OS Group
#1   4  2    42
#2   4  1    41
#3   4  3    43
#4   1  2    12
#5   4  1    41
A.K.






- Original Message -
From: Tammy Ma metal_lical...@live.com
To: r-help@r-project.org r-help@r-project.org
Cc: 
Sent: Tuesday, January 22, 2013 4:28 AM
Subject: [R] How to align group based on the common values of two columns in r


HI,

I met this problem:

I have the feature data frame:


   Feature     OS
     4              2
     4              1
     4              3
     1              2
     4              1


What I want to do is to automatically create one more column called Group:

   Feature     OS      Group
     4              2         1
     4              1         2
     4              3         3
     1              2         4
     4              1         2



I don't want to use ifelse, because I have so many combinations of Feature and 
OS that I cannot even count them. I just want something that automatically 
creates a group indicator based on the different combinations of Feature and OS.
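A short sketch of one way to get the integer indicator shown in the desired output, numbering each Feature/OS combination in order of first appearance. The variable name `key` is introduced only for illustration.

```r
# The example data from the question.
dat1 <- data.frame(Feature = c(4, 4, 4, 1, 4), OS = c(2, 1, 3, 2, 1))

# Build one key per row; the separator keeps e.g. (1, 12) and (11, 2) distinct.
key <- paste(dat1$Feature, dat1$OS, sep = "_")

# Position of each key among the unique keys gives the group number.
dat1$Group <- match(key, unique(key))
dat1
```

No `ifelse` chain is needed, and new combinations in the data get the next free number automatically.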

Thanks for your help.


Kind regards,
Tammy


                          

Re: [R] simple reshape

Hi,

You could also do this by:
set.seed(15)
tr.df<-data.frame(ID=rep(1:29,each=3),prep=runif(87,1,3),postp=runif(87,0.5,1.5))
tr.df$time<-1:87
res<- reshape(tr.df, varying=2:3, v.names="value", 
times=c("prep","postp"),idvar="time",timevar="prepost",direction="long")
res<-res[order(res$ID,res$time),]
 row.names(res)<-1:nrow(res)
 head(res,4)
#  ID time prepost value
#1  1    1    prep 2.2042281
#2  1    1   postp 1.3553657
#3  1    2    prep 1.3900879
#4  1    2   postp 0.8674933
A.K.



- Original Message -
From: Jim Lemon j...@bitwrit.com.au
To: Troels Ring tr...@gvdnet.dk
Cc: r-help@r-project.org
Sent: Tuesday, January 22, 2013 4:46 AM
Subject: Re: [R] simple reshape

On 01/22/2013 07:19 PM, Troels Ring wrote:
 Dear friends - this is a very simple question - I have a data frame
 'data.frame': 87 obs. of 3 variables:
 $ ID : int 1 1 1 2 2 2 3 3 3 4 ...
 $ prep : num 1.18 1.38 1.34 1.93 2.38 2.24 1.17 1.13 1.21 1.89 ...
 $ postp: num 0.63 0.71 0.75 1.01 1.12 1.07 0.87 0.64 0.7 0.8 ...

 - 29 persons (ID) each measured three times before and after an
 intervention: prep and postp -
 I need data rearranged like

 ID time val
 1 1 prep
 1 2 postp
 1 1
 1 2
 1 1
 1 2
 I cannot make reshape or stack do the trick.

Hi Troels,
With a bit of extra processing I think rep_n_stack (prettyR) will do 
what you want:

library(prettyR)
# fake some data
tr.df<-data.frame(ID=rep(1:29,each=3),prep=runif(87,1,3),postp=runif(87,0.5,1.5))
# add a repeat number
tr.df$repno<-rep(1:3,29)
# get the reshaped data frame
trlong.df<-rep_n_stack(tr.df,to.stack=2:3,
  stack.names=c("prepost","value"))
# reorder it
trlong.df[order(trlong.df$ID,trlong.df$repno),]

     ID repno prepost     value
1    1     1    prep 2.9158693
88   1     1   postp 0.9932342
2    1     2    prep 1.2852817
89   1     2   postp 0.8187234
3    1     3    prep 2.5771902
90   1     3   postp 1.0033936
4    2     1    prep 2.2969320
91   2     1   postp 0.6837140
5    2     2    prep 1.3083553
92   2     2   postp 1.4537096
6    2     3    prep 2.8654184
93   2     3   postp 1.0880881
...

Jim


Re: [R] Assistant

2013-01-22 Thread Jessica Streicher

You're not giving people much to work with. I googled the error, and it seems 
to come from the call to optim and likely has to do with bad starting 
parameters.

That said, the documentation of fitdistr doesn't suggest it even supports 
"dbeta"; only "beta" is mentioned.


Re: [R] Assistant

2013-01-22 Thread Rui Barradas


Hello,

You are calling the function incorrectly. In the case of a beta fit, 
densfun should be the quoted string "beta" and the initial parameter 
values are elements of a named list. Like this:

library(MASS)

x <- rbeta(1000, shape1 = 2, shape2 = 0.5)
fitdistr(x, densfun = "beta", start = list(shape1 = 1, shape2 = 1))

As for your error, I only got it if the data clearly can not fit a beta.

y <- rgamma(1000, shape = 2, rate = 0.5)
fitdistr(y, densfun = "beta", start = list(shape1 = 1, shape2 = 1))
Error in optim(x = c(6.19809706003757, 2.32632108817696, 3.60844436009277,  :
  initial value in 'vmmin' is not finite


So revise the way you call fitdistr and then, if the error persists, 
revise the parametric distribution to be fitted.
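A hedged sketch of two checks that may help before calling `fitdistr`: whether the data lie strictly inside (0, 1), and method-of-moments starting values. The helpers `in_beta_support` and `mom_start` are hypothetical names, not part of MASS, and the data are simulated.

```r
set.seed(42)
x <- rbeta(1000, shape1 = 2, shape2 = 5)   # simulated data inside (0, 1)
y <- rgamma(1000, shape = 2, rate = 0.5)   # simulated data mostly above 1

# Hypothetical helper: the beta density is zero outside (0, 1), so any value
# there makes the log-likelihood infinite and the optimiser cannot start.
in_beta_support <- function(v) all(is.finite(v) & v > 0 & v < 1)

# Hypothetical helper: method-of-moments estimates as starting values.
mom_start <- function(v) {
  m  <- mean(v)
  s2 <- var(v)
  k  <- m * (1 - m) / s2 - 1
  list(shape1 = m * k, shape2 = (1 - m) * k)
}

in_beta_support(x)
in_beta_support(y)
start <- mom_start(x)

# Fit only when MASS is available; the start list must be named.
if (requireNamespace("MASS", quietly = TRUE)) {
  fit <- MASS::fitdistr(x, densfun = "beta", start = start)
}
```

If the support check fails, rescaling the data to (0, 1) or choosing another distribution is likely a better next step than tuning the starting values.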



Hope this helps,

Rui Barradas


[R] c(), rbind and cbind functions - why type of resulting object is double

2013-01-22 Thread Lourdes Peña Castillo

Hello Everyone,

I am using R 2.15.0 and I came across this behaviour and I was wondering
why I don't get an integer vector or an integer matrix with the following
code:

z <- c(1, 2:0, 3, 4:8)
typeof(z)
[1] "double"

z <- rbind(1, 2:0, 3, 4:8)
Warning message:
In rbind(1, 2:0, 3, 4:8) :
  number of columns of result is not a multiple of vector length (arg 2)
typeof(z)
[1] "double"

z <- matrix(c(1, 2:0, 3, 4:8), nrow = 5)
typeof(z)
[1] "double"

Shouldn't the type be integer? According to the online help, if everything is
integer the output should be integer.
But if I do this, I get an integer matrix.

z <- matrix(1:20, nrow = 5)
typeof(z)
[1] "integer"

Thanks!

Lourdes


Re: [R] c(), rbind and cbind functions - why type of resulting object is double

2013-01-22 Thread William Dunlap

 I was wondering
 why I don't get an integer vector or an integer matrix with the following
 code:
  z <- c(1, 2:0, 3, 4:8)
  typeof(z)
 [1] "double"

It is because the literals 1 and 3 have type double.  Append L to make
them literal integers.
   typeof(c(1L, 2:0, 3L, 4:8))
  [1] "integer"
The colon function (:) returns an integer vector if it can do so without
giving a numerically incorrect answer.

   typeof(1.0:3.0)
  [1] "integer"
   typeof(1.5:3.5)
  [1] "double"
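The coercion rule can be checked directly. A small sketch with illustrative variable names, showing that a single double element promotes the whole vector and two ways to get integers back:

```r
# A bare numeric literal is a double; the L suffix makes it an integer.
typeof(1)
typeof(1L)
typeof(2:0)

# c() promotes to the most general type among its arguments.
z_double <- c(1, 2:0, 3, 4:8)
z_int    <- c(1L, 2:0, 3L, 4:8)

# matrix() keeps the type of its input vector.
m_int <- matrix(z_int, nrow = 5)

# Converting after the fact also works when the values are whole numbers.
z_conv <- as.integer(z_double)
```

The same promotion applies to `rbind` and `cbind`, since they also combine their arguments into one atomic type.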

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com



Re: [R] c(), rbind and cbind functions - why type of resulting object is double

2013-01-22 Thread Patrick Burns


One place that talks about what Bill says is:

http://www.burns-stat.com/documents/tutorials/impatient-r/more-r-key-objects/more-r-numbers/

Pat




--
Patrick Burns
pbu...@pburns.seanet.com
twitter: @portfolioprobe
http://www.portfolioprobe.com/blog
http://www.burns-stat.com
(home of 'Impatient R' and 'The R Inferno')


[R] change confidence interval line length in barplot2 (plotrix package)

2013-01-22 Thread Martin Batholdy

Hi,

Is there any way to change the width of the horizontal line of confidence 
intervals in the barplot2 function (gplots package), independent of the width 
of the bars?


example code:

library(gplots)  # barplot2() is provided by gplots, not plotrix
# Example with confidence intervals and grid
hh <- t(VADeaths)[, 1]
mybarcol <- "gray20"
ci.l <- hh * 0.85
ci.u <- hh * 1.15
mp <- barplot2(hh, beside = TRUE,
        col = c("lightblue", "mistyrose",
                "lightcyan", "lavender"),
        legend = colnames(VADeaths), ylim = c(0, 20),
        main = "Death Rates in Virginia", font.main = 4,
        sub = "Faked 95 percent error bars", col.sub = mybarcol,
        cex.names = 1.5, plot.ci = TRUE, ci.l = ci.l, ci.u = ci.u)



thanks!

Re: [R] plot two time series with different length and different starting point in one figure.


On Jan 22, 2013, at 7:07 AM, Yuan, Rebecca wrote:

 Hello,
 
 I do have two different time series A and B, they are different in length and 
 starting point. A starts in Jan, 2012 and ends in Dec, 2012 and B starts in 
 March, 2012 and ends in Nov, 2012.
 
 How can I plot those two series A and B in the same plot? I.E., from Jan. 
 2012 - Feb, 2012, it would have one data point from A and from Mar, 2012-Nov, 
 2012, it would have two data points from A and B, and in December 2012, it 
 would have one data point from A.

You could set the xlim argument to c( min(timeA, timeB), max(timeA, timeB) ) in 
the `plot` of either of the series and then use `lines` for the other series, 
perhaps with a different color argument.
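A minimal sketch of the `xlim` plus `lines` suggestion. The vectors `timeA`, `timeB`, `A` and `B` are simulated stand-ins, and setting `ylim` as well is an addition here so that neither series is clipped.

```r
# Hypothetical monthly series: A covers Jan-Dec 2012, B covers Mar-Nov 2012.
timeA <- seq(as.Date("2012-01-01"), by = "month", length.out = 12)
timeB <- seq(as.Date("2012-03-01"), by = "month", length.out = 9)
set.seed(1)
A <- cumsum(rnorm(12))
B <- cumsum(rnorm(9))

# Axis limits wide enough for both series.
xlim <- range(c(timeA, timeB))
ylim <- range(c(A, B))

# Draw the first series, then add the second to the same plot.
plot(timeA, A, type = "b", xlim = xlim, ylim = ylim,
     xlab = "2012", ylab = "value")
lines(timeB, B, type = "b", col = "red")
legend("topleft", legend = c("A", "B"), col = c("black", "red"), lty = 1)
```

This avoids merging altogether, which helps when the two series do not share exact dates.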

-- 

David Winsemius
Alameda, CA, USA


Re: [R] plot two time series with different length and different starting point in one figure.

Hello Arun,

This helps me get the data into Date format. A new question comes up: since 
the dates are not exactly the same in the two data sets, there are some NA 
values in the merged data set, such as

2012-09-28       NA          NA 5400726 14861715970
2012-09-30  5035606 14832837436      NA          NA

Does R have a function to convert the dates to a format like Sep, 2012, 
so that when I merge the two data sets they will not have those NA values?
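One way this could be done, as a sketch: derive a year-month key with `format()` and merge on that key instead of the exact date. The data frames and the August values below are made up; only the two September account figures come from the rows above. The sketch assumes at most one observation per month in each series.

```r
# Hypothetical month-end data whose September dates differ between the sets.
A <- data.frame(date = as.Date(c("2012-08-31", "2012-09-30")),
                acct = c(5021456, 5035606))
B <- data.frame(date = as.Date(c("2012-08-31", "2012-09-28")),
                acct = c(5390112, 5400726))

# A year-month key such as "2012-09"; numeric formats avoid locale issues.
A$month <- format(A$date, "%Y-%m")
B$month <- format(B$date, "%Y-%m")

# Merge on the month key so 2012-09-28 and 2012-09-30 land in the same row.
AB <- merge(A, B, by = "month", all = TRUE, suffixes = c(".A", ".B"))
AB
```

For a label like "Sep,2012", `format(date, "%b,%Y")` gives a similar key, though month abbreviations depend on the locale.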

Thanks,

Rebecca

-Original Message-
From: arun [mailto:smartpink...@yahoo.com] 
Sent: Tuesday, January 22, 2013 2:15 PM
To: Yuan, Rebecca
Cc: R help
Subject: Re: [R] plot two time series with different length and different 
starting point in one figure.

Hi Rebecca,

Assuming that 'raw_data' is a data.frame with raw_time as its first column,
you could convert raw_time to Date format with:

as.Date("28FEB2002",format="%d%B%Y")
#[1] "2002-02-28"

In your data, it should be:
raw_data$raw_time<- as.Date(raw_data$raw_time,format="%d%B%Y")

Could you just dput() a few lines of your dataset if this is not working?
Tx.

A.K.




- Original Message -
From: Yuan, Rebecca rebecca.y...@bankofamerica.com
To: 'arun' smartpink...@yahoo.com
Cc: 
Sent: Tuesday, January 22, 2013 2:08 PM
Subject: RE: [R] plot two time series with different length and different 
starting point in one figure.

Hello Arun,

My data shows that I do not have a date type of data:

summary(raw_data)
      raw_time      raw_acct         raw_baln
28FEB2002:  1   Min.   : 61714   Min.   :117079835
28FEB2003:  1   1st Qu.: 75587   1st Qu.:158035150
28FEB2005:  1   Median :100234   Median :206906298
28FEB2006:  1   Mean   : 96058   Mean   :210550369
28FEB2007:  1   3rd Qu.:116908   3rd Qu.:263623782
28FEB2009:  1   Max.   :121853   Max.   :325290870
(Other)  :127                                      


How could I transfer the raw_time column to a date format, such as

summary(dateA)
        Min.      1st Qu.       Median         Mean      3rd Qu.         Max. 
2012-01-01 2012-04-01 2012-07-01 2012-07-01 2012-09-30 2012-12-31


Thanks very much!

Cheers,

Rebecca

-Original Message-
From: arun [mailto:smartpink...@yahoo.com]
Sent: Tuesday, January 22, 2013 12:39 PM
To: Yuan, Rebecca
Cc: R help; Petr PIKAL
Subject: Re: [R] plot two time series with different length and different 
starting point in one figure.

Hi,

You could also try this:
dateA <- seq.Date(as.Date("1jan2012", format="%d%b%Y"), as.Date("31Dec2012", format="%d%b%Y"), by="day")
dateB <- seq.Date(as.Date("1Mar2012", format="%d%b%Y"), as.Date("30Nov2012", format="%d%b%Y"), by="day")
set.seed(15)
A <- data.frame(dateA, value=sample(1:300, 366, replace=TRUE))
set.seed(25)
B <- data.frame(dateB, value=sample(1:300, 275, replace=TRUE))
library(xts)
Anew <- as.xts(A[,-1], order.by=A[,1])
Bnew <- as.xts(B[,-1], order.by=B[,1])
res <- merge(Anew, Bnew)            # all dates; NA where only one series has data
res1 <- res[complete.cases(res),]   # only the dates present in both series
library(zoo)
plot.zoo(res1)
plot.zoo(res)

A.K.




- Original Message -
From: Yuan, Rebecca rebecca.y...@bankofamerica.com
To: 'PIKAL Petr' petr.pi...@precheza.cz
Cc: R help r-help@r-project.org
Sent: Tuesday, January 22, 2013 10:36 AM
Subject: Re: [R] plot two time series with different length and different 
starting point in one figure.

Hello Petr,

As the time series have the same column names, I got the error message like:



> m1 <- merge(A, B, by.x = "time", by.y = "balance")
Error in fix.by(by.x, x) : 'by' must specify uniquely valid column(s)


To plot A and B in one plot is to compare the difference between them...

Any other thoughts?

Thanks,

Rebecca


-Original Message-
From: PIKAL Petr [mailto:petr.pi...@precheza.cz]
Sent: Tuesday, January 22, 2013 10:28 AM
To: Yuan, Rebecca; R help
Subject: RE: plot two time series with different length and different starting 
point in one figure.

Hi

 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-bounces@r- 
 project.org] On Behalf Of Yuan, Rebecca
 Sent: Tuesday, January 22, 2013 4:07 PM
 To: R help
 Subject: [R] plot two time series with different length and different 
 starting point in one figure.
 
 Hello,
 
 I do have two different time series A and B, they are different in 
 length and starting point. A starts in Jan, 2012 and ends in Dec, 2012 
 and B starts in March, 2012 and ends in Nov, 2012.
 
 How can I plot those two series A and B in the same plot? I.E., from 
 Jan. 2012 - Feb, 2012, it would have one data point from A and from 
 Mar, 2012-Nov, 2012, it would have two data points from A and B, and 
 in December 2012, it would have one data point from A.

Merge those 2 series.

?merge

Regards
Petr

 
 Thanks very much!
 
 Cheers,
 
 Rebecca
 
 

Re: [R] change confidence interval line length in barplot2 (plotrix package)

2013-01-22 Thread Rolf Turner




There does not appear to be any such function as barplot2 in
the current version (3.4-5) of the plotrix package.  Moreover
I can find no reference to such a function in the NEWS for
plotrix.

cheers,

Rolf Turner

On 01/23/2013 07:28 AM, Martin Batholdy wrote:

Hi,

is there any way to change the width of the horizontal line of confidence 
intervals
in the barplot2 function in the plotrix package (independent of the width of 
the bars)?


example code:

library(plotrix)
# Example with confidence intervals and grid
hh <- t(VADeaths)[, 1]
mybarcol <- "gray20"
ci.l <- hh * 0.85
ci.u <- hh * 1.15
mp <- barplot2(hh, beside = TRUE,
 col = c("lightblue", "mistyrose",
 "lightcyan", "lavender"),
 legend = colnames(VADeaths), ylim = c(0, 20),
 main = "Death Rates in Virginia", font.main = 4,
 sub = "Faked 95 percent error bars", col.sub = mybarcol,
 cex.names = 1.5, plot.ci = TRUE, ci.l = ci.l, ci.u = ci.u)




Re: [R] plot two time series with different length and different starting point in one figure.

Hello David,

If I use plot with the following code:

plot(A, type = "o", col = plot_colors[plotcolor], axes = FALSE, ann = FALSE)
par(new=TRUE)
plot(B, type = "o", col = plot_colors[plotcolor+1], axes = FALSE, ann = FALSE)
box()

I will have the two series in one plot, but they are only from March,2012 to 
Nov, 2012, the nonoverlapping months are dropped out...

I know in Matlab that I can specify the x axis such as 

Plot(timeofA, A)
Hold on;
Plot(timeofB, B)

to get them in the same figure, but in R, I do not know how to do it.

Thanks,

Rebecca

-Original Message-
From: David Winsemius [mailto:dwinsem...@comcast.net] 
Sent: Tuesday, January 22, 2013 2:34 PM
To: Yuan, Rebecca
Cc: R help
Subject: Re: [R] plot two time series with different length and different 
starting point in one figure.


On Jan 22, 2013, at 7:07 AM, Yuan, Rebecca wrote:

 Hello,
 
 I do have two different time series A and B, they are different in length and 
 starting point. A starts in Jan, 2012 and ends in Dec, 2012 and B starts in 
 March, 2012 and ends in Nov, 2012.
 
 How can I plot those two series A and B in the same plot? I.E., from Jan. 
 2012 - Feb, 2012, it would have one data point from A and from Mar, 2012-Nov, 
 2012, it would have two data points from A and B, and in December 2012, it 
 would have one data point from A.

You could set the xlim argument to c( min(timeA, timeB), max(timeA, timeB) ) in 
the `plot` of either of the series and then use `lines` for the other series, 
perhaps with a different color argument.

-- 

David Winsemius
Alameda, CA, USA


[R] How to assign time series to a vector with one leap year

2013-01-22 Thread Janesh Devkota

Hello All,

I am trying to do the time series analysis in R and I want to assign a
vector as a time series. The data I provided is hourly. The data is from
Jan 1 2008 to Dec 31 2009. How can I assign the data such that the first
year is leap year and second is not ?

airtemp <- read.csv("airtemp.csv", header=T, sep="")

aw <- ts(airtemp, start=2008, frequency=8784, end=2009)

I assigned frequency as 8784 because 2008 year will have 8784 hourly data
points and 2009 has 8760 data points. The total data points are 17544

The data can be found on
https://www.dropbox.com/s/03z74632v1f3g1e/airtemp.csv

I apologize if this is very trivial to some of you.

Thanks.


Re: [R] plot two time series with different length and different starting point in one figure.


On Jan 22, 2013, at 11:42 AM, Yuan, Rebecca wrote:

 Hello David,
 
 If I use plot with the following code:
 
   plot(A, type = "o", col = plot_colors[plotcolor], axes = FALSE, ann = FALSE)
   par(new=TRUE)
   plot(B, type = "o", col = plot_colors[plotcolor+1], axes = FALSE, ann = FALSE)
   box()
 
 I will have the two series in one plot, but they are only from March,2012 to 
 Nov, 2012, the nonoverlapping months are dropped out...
 
 I know in Matlab that I can specify the x axis such as 
 
 Plot(timeofA, A)
 Hold on;
 Plot(timeofB, B)
 
 to get them in the same figure, but in R, I do not know how to do it.

As I said before, you need to use the xlim argument to `plot`. If you 
insist on using `plot` twice then you will need to use 'xlim=' twice, although 
I thought it would be easier to use `plot` first and `lines` second.

-- 
David.
 
 Thanks,
 
 Rebecca
 
 -Original Message-
 From: David Winsemius [mailto:dwinsem...@comcast.net] 
 Sent: Tuesday, January 22, 2013 2:34 PM
 To: Yuan, Rebecca
 Cc: R help
 Subject: Re: [R] plot two time series with different length and different 
 starting point in one figure.
 
 
 On Jan 22, 2013, at 7:07 AM, Yuan, Rebecca wrote:
 
 Hello,
 
 I do have two different time series A and B, they are different in length 
 and starting point. A starts in Jan, 2012 and ends in Dec, 2012 and B starts 
 in March, 2012 and ends in Nov, 2012.
 
 How can I plot those two series A and B in the same plot? I.E., from Jan. 
 2012 - Feb, 2012, it would have one data point from A and from Mar, 
 2012-Nov, 2012, it would have two data points from A and B, and in December 
 2012, it would have one data point from A.
 
 You could set the xlim argument to c( min(timeA, timeB), max(timeA, timeB) ) 
 in the `plot` of either of the series and then use `lines` for the other 
 series, perhaps with a different color argument.
 
 -- 
 
 David Winsemius
 Alameda, CA, USA
 

[R] Creating a Data Frame from an XML

2013-01-22 Thread Adam Gabbert

Hello,

I'm attempting to read information from an XML into a data frame in R using
the XML package. I am unable to get the data into a data frame as I would
like.  I have some sample code below.

*XML Code:*

Header...

Data I want in a data frame:

   <data>
  <row BRAND="GMC" NUM="1" YEAR="1999" VALUE="1" />
  <row BRAND="FORD" NUM="1" YEAR="2000" VALUE="12000" />
  <row BRAND="GMC" NUM="1" YEAR="2001" VALUE="12500" />
  <row BRAND="FORD" NUM="1" YEAR="2002" VALUE="13000" />
  <row BRAND="GMC" NUM="1" YEAR="2003" VALUE="14000" />
  <row BRAND="FORD" NUM="1" YEAR="2004" VALUE="17000" />
  <row BRAND="GMC" NUM="1" YEAR="2005" VALUE="15000" />
  <row BRAND="GMC" NUM="1" YEAR="1967" VALUE="PRICLESS" />
  <row BRAND="FORD" NUM="1" YEAR="2007" VALUE="17500" />
  <row BRAND="GMC" NUM="1" YEAR="2008" VALUE="22000" />
   </data>

*R Code:*

doc <- xmlInternalTreeParse("Sample2.xml")
top <- xmlRoot(doc)
xmlName(top)
names(top)
art <- top[["row"]]
art

*Output:*

> art
<row BRAND="GMC" NUM="1" YEAR="1999" VALUE="1"/>




This is where I am having difficulties.  I am unable to access additional
rows; ( i.e.  row BRAND=GMC NUM=1 YEAR=1967 VALUE=PRICLESS / )

and I am unable to access the individual entries to actually create the
data frame.  The data frame I would like is as follows:

BRAND  NUM  YEAR  VALUE
GMC    1    1999  1
FORD   2    2000  12000
GMC    1    2001  12500
etc

Any help or suggestions would be appreciated.  Conversely, my eventual goal
would be to take a data frame and write it into an XML file in the previously
shown format.
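
A sketch of one way both directions could be approached with the XML package, assuming every record is a `row` element that carries its values as attributes. The inline string `xml_txt` is a shortened stand-in for the real file, whose header and root element are not shown here:

```r
library(XML)

# Inline copy of a few rows from the sample; the real file has more
xml_txt <- '<data>
  <row BRAND="GMC" NUM="1" YEAR="1999" VALUE="1" />
  <row BRAND="FORD" NUM="1" YEAR="2000" VALUE="12000" />
  <row BRAND="GMC" NUM="1" YEAR="1967" VALUE="PRICLESS" />
</data>'
doc <- xmlParse(xml_txt, asText = TRUE)

# One character vector of attributes per <row>, stacked into a data frame
rows <- getNodeSet(doc, "//row")
df <- as.data.frame(do.call(rbind, lapply(rows, xmlAttrs)),
                    stringsAsFactors = FALSE)
df$YEAR <- as.integer(df$YEAR)   # VALUE stays character because of "PRICLESS"

# Reverse direction: write a data frame back out as <row .../> elements
root <- newXMLNode("data")
for (i in seq_len(nrow(df))) {
  newXMLNode("row", attrs = unlist(lapply(df[i, ], as.character)), parent = root)
}
cat(saveXML(root))
```

The XPath expression `//row` finds the rows wherever they sit in the tree, which is why `top[["row"]]` returning only the first child is not a limitation here.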

Thank you

AG


[R] Introduction and help request

2013-01-22 Thread Ross Tinsley

Hello all

I am a researcher in the field of tourism and have just recently installed R64 
and RStudio onto my Mac (running latest OS). I ran into some problems 
installing additional packages. I have looked through the General FAQs and Mac 
FAQS but haven't been able to find a solution.

I have downloaded the various packages I need from CRAN sources and while some 
have successfully installed others have not. I have been following the 
instructions on the Mac FAQ to unzip and install the downloaded packages using 
the command line but the results seem to indicate an error (they are installed 
but then don't work properly and so are subsequently uninstalled). It happens 
on more than one so that's why I thought it might be something generic I am 
doing. Here is a copy of the command line results:

Rosss-MacBook-Pro:~ rosstinsley$ R CMD INSTALL 
/private/var/folders/ld/3f2dl80154z47_864skpt2_8gn/T/Rtmp0ittcT/downloaded_packages/Hmisc_3.10-1.tar.gz
 
* installing to library 
‘/Library/Frameworks/R.framework/Versions/2.15/Resources/library’
* installing *source* package ‘Hmisc’ ...
** package ‘Hmisc’ successfully unpacked and MD5 sums checked
** libs
*** arch - i386
sh: make: command not found
ERROR: compilation failed for package ‘Hmisc’
* removing 
‘/Library/Frameworks/R.framework/Versions/2.15/Resources/library/Hmisc’
Rosss-MacBook-Pro:~ rosstinsley$ R CMD INSTALL 
/private/var/folders/ld/3f2dl80154z47_864skpt2_8gn/T/Rtmp0ittcT/downloaded_packages/acepack_1.3-3.2.tar.gz
 
* installing to library 
‘/Library/Frameworks/R.framework/Versions/2.15/Resources/library’
* installing *source* package ‘acepack’ ...
** package ‘acepack’ successfully unpacked and MD5 sums checked
** libs
*** arch - i386
sh: make: command not found
ERROR: compilation failed for package ‘acepack’
* removing 
‘/Library/Frameworks/R.framework/Versions/2.15/Resources/library/acepack’
Rosss-MacBook-Pro:~ rosstinsley$ R CMD INSTALL 
/private/var/folders/ld/3f2dl80154z47_864skpt2_8gn/T/Rtmp0ittcT/downloaded_packages/arm_1.6-01.02.tar.gz
 
* installing to library 
‘/Library/Frameworks/R.framework/Versions/2.15/Resources/library’
ERROR: dependency ‘lme4’ is not available for package ‘arm’
* removing ‘/Library/Frameworks/R.framework/Versions/2.15/Resources/library/arm’
Rosss-MacBook-Pro:~ rosstinsley$ R CMD INSTALL 
/private/var/folders/ld/3f2dl80154z47_864skpt2_8gn/T/Rtmp0ittcT/downloaded_packages/chron_2.3-43.tar.gz
 
* installing to library 
‘/Library/Frameworks/R.framework/Versions/2.15/Resources/library’
* installing *source* package ‘chron’ ...
** package ‘chron’ successfully unpacked and MD5 sums checked
** libs
*** arch - i386
sh: make: command not found
ERROR: compilation failed for package ‘chron’
* removing 
‘/Library/Frameworks/R.framework/Versions/2.15/Resources/library/chron’
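
The repeated line `sh: make: command not found` in the log suggests that these source packages need compiling and that no `make` program is on the search path. A small sketch for checking this from within R and for letting R resolve dependencies itself; it assumes a standard CRAN build of R for Mac, where `install.packages` normally fetches pre-compiled binaries:

```r
# Is a 'make' executable visible on the PATH? Source installs need one.
has_make <- unname(nzchar(Sys.which("make")))
has_make

# Packages from the log above that are not yet installed
pkgs <- c("Hmisc", "acepack", "chron", "arm")
todo <- setdiff(pkgs, rownames(installed.packages()))

# Let R resolve dependencies (e.g. lme4 for arm); only run interactively
if (interactive() && length(todo) > 0) {
  install.packages(todo, dependencies = TRUE)
}
```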


Thank you for any help

Ross

[R] plot.mob() fails with cut() error 'breaks' are not unique

2013-01-22 Thread Jason Musil

DeaR all,

I am using mob() for model based partitioning, with a dichotomous variable 
(participant's correct/incorrect response to a test item) regressed onto a 
continuous predictor related to a given property of the test item. Although 
this variable is continuous, the value of this variable for many items in this 
particular analysis is 0. The partitioning criterion is self-reported ability 
in a related area.

> mob1 <- mob(
correct ~ circular.mean | srp.dimension,
control=mob_control(alpha=.001),
model=glinearModel,
family=binomial()
  )

> plot(mob1)

Error in cut.default(x, breaks = breaks, include.lowest = TRUE) : 
  'breaks' are not unique

The same persists if I specify either a desired number of breaks, or explicit 
breakpoints (e.g. breaks=3 or breaks=c(-0.1,0.1,0.5)). I guess this is to do 
with the funny distribution of the predictor variable, but I'm not sure what to 
do about it.
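
The error message itself can be reproduced in base R with a predictor that is mostly zero. This is an illustration of how `cut` behaves, not the actual code inside `plot.mob`; the vector `x` and the use of quartile break points are assumptions:

```r
# Hypothetical predictor: 80 zeros plus 20 spread-out positive values
x <- c(rep(0, 80), seq(0.05, 1, by = 0.05))

# Quartile-based break points collapse because most quantiles equal 0
brk <- quantile(x, probs = seq(0, 1, by = 0.25))
brk

# cut() refuses duplicated breaks; capture the message instead of stopping
err <- tryCatch(cut(x, breaks = brk, include.lowest = TRUE),
                error = function(e) conditionMessage(e))
err

# Dropping the duplicates lets cut() run, though with fewer intervals
ok <- cut(x, breaks = unique(brk), include.lowest = TRUE)
table(ok)
```

If the plotting code derives its breaks from quantiles of the predictor in each node, a node dominated by zeros would fail in the same way.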

Many thanks and apologies if this doesn't fit the mailing list---it is my first 
posting!
Jason Musil


Re: [R] How to align group based on the common values of two columns in r

Hi,
You could also try:

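dat1 <- read.table(text="
 Feature    OS
    4  2
    4  1
    4  3
    1  2
    4  1
", sep="", header=TRUE)
# Group ids follow the sorted order of the pasted keys ("12" < "41" < "42" < "43"),
# not the order in which the combinations first appear
dat1$Group <- as.numeric(factor(Reduce(paste0, dat1)))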
A.K.
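
If the groups should be numbered in the order in which each combination first appears, as in the table in the original question, a variant along these lines might do it. The helper name `key` and the "_" separator are choices made for this sketch:

```r
dat1 <- data.frame(Feature = c(4, 4, 4, 1, 4), OS = c(2, 1, 3, 2, 1))

# Build an unambiguous key per row; the separator keeps (1, 12) apart from (11, 2)
key <- paste(dat1$Feature, dat1$OS, sep = "_")

# Number the keys by where each one first occurs
dat1$Group <- match(key, unique(key))
dat1
```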

- Original Message -
From: Tammy Ma metal_lical...@live.com
To: r-help@r-project.org r-help@r-project.org
Cc: 
Sent: Tuesday, January 22, 2013 4:28 AM
Subject: [R] How to align group based on the common values of two columns in r


HI,

I met this problem:

I have the feature data frame:


   Feature     OS
     4              2
     4              1
     4              3
     1              2
     4              1


what I want to do is to automatically create one more column called group:

   Feature     OS      Group
     4              2         1
     4              1         2
     4              3         3
     1              2         4
     4              1         2



I don't want ifelse, because I have so many combinations of feature and OS that I 
cannot even count them.  I just want something to automatically create a group 
indicator based on the different combinations of feature and OS.

Thanks for your help.


Kind regards,
Tammy


                          
    [[alternative HTML version deleted]]


Re: [R] plot two time series with different length and different starting point in one figure.

Hi,

dateA <- seq.Date(as.Date("1jan2012", format="%d%b%Y"), as.Date("31Dec2012", format="%d%b%Y"), by="day")
dateB <- seq.Date(as.Date("1Mar2012", format="%d%b%Y"), as.Date("30Nov2012", format="%d%b%Y"), by="day")
set.seed(15)
A <- data.frame(dateA, value=sample(1:300, 366, replace=TRUE))
set.seed(25)
B <- data.frame(dateB, value=sample(1:300, 275, replace=TRUE))
res <- merge(A, B, by.x="dateA", by.y="dateB") #it works


A.K.



- Original Message -
From: Yuan, Rebecca rebecca.y...@bankofamerica.com
To: 'PIKAL Petr' petr.pi...@precheza.cz
Cc: R help r-help@r-project.org
Sent: Tuesday, January 22, 2013 10:36 AM
Subject: Re: [R] plot two time series with different length and different 
starting point in one figure.

Hello Petr,

As the time series have the same column names, I got the error message like:



> m1 <- merge(A, B, by.x = "time", by.y = "balance")
Error in fix.by(by.x, x) : 'by' must specify uniquely valid column(s)


To plot A and B in one plot is to compare the difference between them...

Any other thoughts?

Thanks,

Rebecca


-Original Message-
From: PIKAL Petr [mailto:petr.pi...@precheza.cz] 
Sent: Tuesday, January 22, 2013 10:28 AM
To: Yuan, Rebecca; R help
Subject: RE: plot two time series with different length and different starting 
point in one figure.

Hi

 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-bounces@r- 
 project.org] On Behalf Of Yuan, Rebecca
 Sent: Tuesday, January 22, 2013 4:07 PM
 To: R help
 Subject: [R] plot two time series with different length and different 
 starting point in one figure.
 
 Hello,
 
 I do have two different time series A and B, they are different in 
 length and starting point. A starts in Jan, 2012 and ends in Dec, 2012 
 and B starts in March, 2012 and ends in Nov, 2012.
 
 How can I plot those two series A and B in the same plot? I.E., from 
 Jan. 2012 - Feb, 2012, it would have one data point from A and from 
 Mar, 2012-Nov, 2012, it would have two data points from A and B, and 
 in December 2012, it would have one data point from A.

Merge those 2 series.

?merge

Regards
Petr

 
 Thanks very much!
 
 Cheers,
 
 Rebecca
 
 

Re: [R] plot two time series with different length and different starting point in one figure.

Hi,

You could also try this:
dateA <- seq.Date(as.Date("1jan2012", format="%d%b%Y"), as.Date("31Dec2012", format="%d%b%Y"), by="day")
dateB <- seq.Date(as.Date("1Mar2012", format="%d%b%Y"), as.Date("30Nov2012", format="%d%b%Y"), by="day")
set.seed(15)
A <- data.frame(dateA, value=sample(1:300, 366, replace=TRUE))
set.seed(25)
B <- data.frame(dateB, value=sample(1:300, 275, replace=TRUE))
library(xts)
Anew <- as.xts(A[,-1], order.by=A[,1])
Bnew <- as.xts(B[,-1], order.by=B[,1])
res <- merge(Anew, Bnew)            # all dates; NA where only one series has data
res1 <- res[complete.cases(res),]   # only the dates present in both series
library(zoo)
plot.zoo(res1)
plot.zoo(res)

A.K.




- Original Message -
From: Yuan, Rebecca rebecca.y...@bankofamerica.com
To: 'PIKAL Petr' petr.pi...@precheza.cz
Cc: R help r-help@r-project.org
Sent: Tuesday, January 22, 2013 10:36 AM
Subject: Re: [R] plot two time series with different length and different 
starting point in one figure.

Hello Petr,

As the time series have the same column names, I got the error message like:



> m1 <- merge(A, B, by.x = "time", by.y = "balance")
Error in fix.by(by.x, x) : 'by' must specify uniquely valid column(s)


To plot A and B in one plot is to compare the difference between them...

Any other thoughts?

Thanks,

Rebecca


-Original Message-
From: PIKAL Petr [mailto:petr.pi...@precheza.cz] 
Sent: Tuesday, January 22, 2013 10:28 AM
To: Yuan, Rebecca; R help
Subject: RE: plot two time series with different length and different starting 
point in one figure.

Hi

 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-bounces@r- 
 project.org] On Behalf Of Yuan, Rebecca
 Sent: Tuesday, January 22, 2013 4:07 PM
 To: R help
 Subject: [R] plot two time series with different length and different 
 starting point in one figure.
 
 Hello,
 
 I do have two different time series A and B, they are different in 
 length and starting point. A starts in Jan, 2012 and ends in Dec, 2012 
 and B starts in March, 2012 and ends in Nov, 2012.
 
 How can I plot those two series A and B in the same plot? I.E., from 
 Jan. 2012 - Feb, 2012, it would have one data point from A and from 
 Mar, 2012-Nov, 2012, it would have two data points from A and B, and 
 in December 2012, it would have one data point from A.

Merge those 2 series.

?merge

Regards
Petr

 
 Thanks very much!
 
 Cheers,
 
 Rebecca
 
 

Re: [R] fdHess function

2013-01-22 Thread Douglas Bates

Your question is better addressed to the R-help@R-project.org mailing list,
which I am copying on this reply.

You are confusing a statistical concept, the Fisher Information matrix,
with a numerical concept, the Hessian matrix of a scalar function of a
vector argument.

The Fisher information matrix is the Hessian matrix of a particular
function at its optimum and I have forgotten whether that function is the
log-likelihood or negative twice the log-likelihood or ...  Rather than get
it wrong I am sending a copy of this reply to the list where many of the
readers will be able to answer you more reliably than I can.
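
As a hedged illustration of the sign question: under the usual convention, the observed information is the negative Hessian of the log-likelihood at its maximum, which equals the Hessian of the negative log-likelihood. `fdHess` simply differentiates whatever function it is given, so the sign depends on which of the two is passed in. The sample `x` and the normal model below are made up for this sketch:

```r
library(nlme)   # provides fdHess()

# Hypothetical sample, treated as draws from a normal distribution
x <- c(4.1, 5.3, 3.8, 6.0, 5.2, 4.7, 5.9, 4.4)
n <- length(x)

# Negative log-likelihood in (mean, sd)
negll <- function(p) -sum(dnorm(x, mean = p[1], sd = p[2], log = TRUE))

# Maximum-likelihood estimates in closed form
mu_hat <- mean(x)
sd_hat <- sqrt(mean((x - mu_hat)^2))

# Finite-difference Hessian of the negative log-likelihood at the MLE:
# this is the observed information matrix
H <- fdHess(c(mu_hat, sd_hat), negll)$Hessian

# Analytic value for comparison: diag(n / sd^2, 2 n / sd^2)
H_exact <- diag(c(n, 2 * n) / sd_hat^2)
H
H_exact
```

Passing the log-likelihood itself instead of `negll` would flip the sign of every entry, which may be the source of the apparent contradiction.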


On Tue, Jan 22, 2013 at 1:22 PM, Marcos Coque Jr <mcoqu...@yahoo.com.br> wrote:

 Dear Bates,

 I am using the fdHess function for R language.
 And I have a question.

 What is the relationship with the Hessian and Fisher Information in your
 function?
 Because I think that Fisher Information=-Hessian, but I found the opposite
 in your function.
 Maybe I am getting something wrong...

 Thanks,

 Marcos



Re: [R] plot two time series with different length and different starting point in one figure.

Hi Rebecca,

Assuming that 'raw_data' is data.frame with first column as raw_time:
You could convert the raw_time to date format by 

 as.Date("28FEB2002", format="%d%B%Y")
#[1] "2002-02-28"

In your data, it should be:
raw_data$raw_time <- as.Date(raw_data$raw_time, format="%d%B%Y")

Could you just dput() a few lines of your dataset if this is not working?
Tx.

A.K.




- Original Message -
From: Yuan, Rebecca rebecca.y...@bankofamerica.com
To: 'arun' smartpink...@yahoo.com
Cc: 
Sent: Tuesday, January 22, 2013 2:08 PM
Subject: RE: [R] plot two time series with different length and different 
starting point in one figure.

Hello Arun,

My data shows that I do not have a date type of data:

summary(raw_data)
      raw_time      raw_acct         raw_baln        
28FEB2002:  1   Min.   : 61714   Min.   :117079835  
28FEB2003:  1   1st Qu.: 75587   1st Qu.:158035150  
28FEB2005:  1   Median :100234   Median :206906298  
28FEB2006:  1   Mean   : 96058   Mean   :210550369  
28FEB2007:  1   3rd Qu.:116908   3rd Qu.:263623782  
28FEB2009:  1   Max.   :121853   Max.   :325290870  
(Other)  :127                                      


How could I transfer the raw_time column to a date format, such as

summary(dateA)
        Min.      1st Qu.       Median         Mean      3rd Qu.         Max. 
2012-01-01 2012-04-01 2012-07-01 2012-07-01 2012-09-30 2012-12-31


Thanks very much!

Cheers,

Rebecca

-Original Message-
From: arun [mailto:smartpink...@yahoo.com] 
Sent: Tuesday, January 22, 2013 12:39 PM
To: Yuan, Rebecca
Cc: R help; Petr PIKAL
Subject: Re: [R] plot two time series with different length and different 
starting point in one figure.

Hi,

You could also try this:
dateA <- seq.Date(as.Date("1jan2012", format="%d%b%Y"), as.Date("31Dec2012", format="%d%b%Y"), by="day")
dateB <- seq.Date(as.Date("1Mar2012", format="%d%b%Y"), as.Date("30Nov2012", format="%d%b%Y"), by="day")
set.seed(15)
A <- data.frame(dateA, value=sample(1:300, 366, replace=TRUE))
set.seed(25)
B <- data.frame(dateB, value=sample(1:300, 275, replace=TRUE))
library(xts)
Anew <- as.xts(A[,-1], order.by=A[,1])
Bnew <- as.xts(B[,-1], order.by=B[,1])
res <- merge(Anew, Bnew)            # all dates; NA where only one series has data
res1 <- res[complete.cases(res),]   # only the dates present in both series
library(zoo)
plot.zoo(res1)
plot.zoo(res)

A.K.




- Original Message -
From: Yuan, Rebecca rebecca.y...@bankofamerica.com
To: 'PIKAL Petr' petr.pi...@precheza.cz
Cc: R help r-help@r-project.org
Sent: Tuesday, January 22, 2013 10:36 AM
Subject: Re: [R] plot two time series with different length and different 
starting point in one figure.

Hello Petr,

As the time series have the same column names, I got the error message like:



> m1 <- merge(A, B, by.x = "time", by.y = "balance")
Error in fix.by(by.x, x) : 'by' must specify uniquely valid column(s)


To plot A and B in one plot is to compare the difference between them...

Any other thoughts?

Thanks,

Rebecca


-Original Message-
From: PIKAL Petr [mailto:petr.pi...@precheza.cz]
Sent: Tuesday, January 22, 2013 10:28 AM
To: Yuan, Rebecca; R help
Subject: RE: plot two time series with different length and different starting 
point in one figure.

Hi

 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-bounces@r- 
 project.org] On Behalf Of Yuan, Rebecca
 Sent: Tuesday, January 22, 2013 4:07 PM
 To: R help
 Subject: [R] plot two time series with different length and different 
 starting point in one figure.
 
 Hello,
 
 I do have two different time series A and B, they are different in 
 length and starting point. A starts in Jan, 2012 and ends in Dec, 2012 
 and B starts in March, 2012 and ends in Nov, 2012.
 
 How can I plot those two series A and B in the same plot? I.E., from 
 Jan. 2012 - Feb, 2012, it would have one data point from A and from 
 Mar, 2012-Nov, 2012, it would have two data points from A and B, and 
 in December 2012, it would have one data point from A.

Merge those 2 series.

?merge

Regards
Petr

 
 Thanks very much!
 
 Cheers,
 
 Rebecca
 
 

Re: [R] change confidence interval line length in barplot2 (plotrix package)


On Jan 22, 2013, at 10:28 AM, Martin Batholdy wrote:

 Hi,
 
 is there any way to change the width of the horizontal line of confidence 
 intervals
 in the barplot2 function in the plotrix package (independent of the width of 
 the bars)?
 
 
 example code:
 
 library(plotrix)
 # Example with confidence intervals and grid
 hh <- t(VADeaths)[, 1]
 mybarcol <- "gray20"
 ci.l <- hh * 0.85
 ci.u <- hh * 1.15
 mp <- barplot2(hh, beside = TRUE,
 col = c("lightblue", "mistyrose",
 "lightcyan", "lavender"),
 legend = colnames(VADeaths), ylim = c(0, 20),
 main = "Death Rates in Virginia", font.main = 4,
 sub = "Faked 95 percent error bars", col.sub = mybarcol,
 cex.names = 1.5, plot.ci = TRUE, ci.l = ci.l, ci.u = ci.u)

When I did an sos::findFn("barplot2") search to locate the real `barplot2`, I 
also noted in the same package (gplots) a function named `ooplot`. It calls 
itself an extension of barplot2 and has a ci.lwd argument. Might save you the 
time of doing what I thought might be needed, hacking the code.

-- 
David Winsemius
Alameda, CA, USA


Re: [R] change confidence interval line length in barplot2 (plotrix package)

2013-01-22 Thread Martin Batholdy

Ok, I have to apologize,
I confused the packages.

It's the function barplot2 from the gplots package!


  It calls itself an extension of barplot2 and has a ci.lwd argument. Might 
 save you the time of doing what I thought might be needed, hacking the code.

Unfortunately ci.lwd controls the thickness of the line but not the horizontal 
width.
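
One workaround that does not depend on any `barplot2` option is to draw the error bars separately in base graphics, where the cap size is given in inches. This is a sketch of that alternative, not a gplots feature:

```r
hh <- t(VADeaths)[, 1]
ci.l <- hh * 0.85
ci.u <- hh * 1.15

# Draw the bars without error bars and keep the bar midpoints
mp <- barplot(hh, col = c("lightblue", "mistyrose", "lightcyan", "lavender"),
              ylim = c(0, 20), main = "Death Rates in Virginia")

# Error bars as two-headed flat "arrows"; 'length' sets the half-width of
# each cap in inches, so it does not change with the width of the bars
arrows(x0 = mp, y0 = ci.l, x1 = mp, y1 = ci.u,
       angle = 90, code = 3, length = 0.05, col = "gray20")
```

Changing `length` widens or narrows the horizontal caps, while `lwd` would control their thickness.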



On Jan 22, 2013, at 21:24 , David Winsemius dwinsem...@comcast.net wrote:

 
 On Jan 22, 2013, at 10:28 AM, Martin Batholdy wrote:
 
 Hi,
 
 is there any way to change the width of the horizontal line of confidence 
 intervals
 in the barplot2 function in the plotrix package (independent of the width of 
 the bars)?
 
 
 example code:
 
 library(plotrix)
 # Example with confidence intervals and grid
 hh <- t(VADeaths)[, 1]
 mybarcol <- "gray20"
 ci.l <- hh * 0.85
 ci.u <- hh * 1.15
 mp <- barplot2(hh, beside = TRUE,
   col = c("lightblue", "mistyrose",
   "lightcyan", "lavender"),
   legend = colnames(VADeaths), ylim = c(0, 20),
   main = "Death Rates in Virginia", font.main = 4,
   sub = "Faked 95 percent error bars", col.sub = mybarcol,
   cex.names = 1.5, plot.ci = TRUE, ci.l = ci.l, ci.u = ci.u)
 
 When I did an sos::findFn(barplot2) search to locate the real `barplot2` 
 O alos noted in the same package (gplots) a function named `ooplot`. It calls 
 itself an extenstion of barplot2 and has a ci.lwd argument. Might save you 
 the time of doing what I thought might be needed, hacking te code.
 
 -- 
 David Winsemius
 Alameda, CA, USA
 


Re: [R] Introduction and help request

2013-01-22 Thread Berend Hasselman


On 22-01-2013, at 19:20, Ross Tinsley rtins...@htmi.ch wrote:

 Hello all
 
 I am a researcher in the field of tourism and have just recently installed 
 R64 and RStudio onto my Mac (running the latest OS). I have run into some problems 
 installing additional packages. I have looked through the General FAQs and 
 Mac FAQs but haven't been able to find a solution.
 
 I have downloaded the various packages I need from CRAN sources and while 
 some have successfully installed others have not. I have been following the 
 instructions on the Mac FAQ to unzip and install the downloaded packages 
 using the command line but the results seem to indicate an error (they are 
 installed but then don't work properly and so are subsequently uninstalled). 
 It happens on more than one so that's why I thought it might be something 
 generic I am doing. Here is a copy of the command line results:
 
 Rosss-MacBook-Pro:~ rosstinsley$ R CMD INSTALL 
 /private/var/folders/ld/3f2dl80154z47_864skpt2_8gn/T/Rtmp0ittcT/downloaded_packages/Hmisc_3.10-1.tar.gz
  
 * installing to library 
 ‘/Library/Frameworks/R.framework/Versions/2.15/Resources/library’
 * installing *source* package ‘Hmisc’ ...
 ** package ‘Hmisc’ successfully unpacked and MD5 sums checked
 ** libs
 *** arch - i386
 sh: make: command not found
 ERROR: compilation failed for package ‘Hmisc’
 * removing 
 ‘/Library/Frameworks/R.framework/Versions/2.15/Resources/library/Hmisc’
 Rosss-MacBook-Pro:~ rosstinsley$ R CMD INSTALL 
 /private/var/folders/ld/3f2dl80154z47_864skpt2_8gn/T/Rtmp0ittcT/downloaded_packages/acepack_1.3-3.2.tar.gz
  
 * installing to library 
 ‘/Library/Frameworks/R.framework/Versions/2.15/Resources/library’
 * installing *source* package ‘acepack’ ...
 ** package ‘acepack’ successfully unpacked and MD5 sums checked
 ** libs
 *** arch - i386
 sh: make: command not found
 ERROR: compilation failed for package ‘acepack’
 * removing 
 ‘/Library/Frameworks/R.framework/Versions/2.15/Resources/library/acepack’
 Rosss-MacBook-Pro:~ rosstinsley$ R CMD INSTALL 
 /private/var/folders/ld/3f2dl80154z47_864skpt2_8gn/T/Rtmp0ittcT/downloaded_packages/arm_1.6-01.02.tar.gz
  
 * installing to library 
 ‘/Library/Frameworks/R.framework/Versions/2.15/Resources/library’
 ERROR: dependency ‘lme4’ is not available for package ‘arm’
 * removing 
 ‘/Library/Frameworks/R.framework/Versions/2.15/Resources/library/arm’
 Rosss-MacBook-Pro:~ rosstinsley$ R CMD INSTALL 
 /private/var/folders/ld/3f2dl80154z47_864skpt2_8gn/T/Rtmp0ittcT/downloaded_packages/chron_2.3-43.tar.gz
  
 * installing to library 
 ‘/Library/Frameworks/R.framework/Versions/2.15/Resources/library’
 * installing *source* package ‘chron’ ...
 ** package ‘chron’ successfully unpacked and MD5 sums checked
 ** libs
 *** arch - i386
 sh: make: command not found
 ERROR: compilation failed for package ‘chron’
 * removing 
 ‘/Library/Frameworks/R.framework/Versions/2.15/Resources/library/chron’


1. This belongs on the R-SIG-Mac mailing list

2. Why don't you use the R.app GUI to install the binary versions of the 
required packages? Much easier.

3. The message "sh: make: command not found" means that you don't have make 
installed.
Most likely you don't have the other required tools installed either.
If you use the R.app GUI you don't really need all those tools.

Advice: use R.app to install and if needed get the Xcode tools but only if you 
intend to compile your own packages.
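
As a quick check of point 3, R itself can report whether make is on the search path. This is only a sketch: Sys.which() is base R, the package names are the ones from the log above, and the install.packages() call is left commented out because it needs a network connection and a CRAN mirror. It assumes (as I believe is the case for the CRAN Mac build) that the default package type there is binary.

```r
# Sys.which() returns the full path of an executable, or "" if it is not found
make_path <- unname(Sys.which("make"))
have_make <- nzchar(make_path)

if (have_make) {
  message("make found at ", make_path, "; source packages can be compiled")
} else {
  message("make not found; install binary packages instead")
}

# Installing from within R (or R.app) resolves dependencies such as lme4
# for arm, which R CMD INSTALL on a single tarball does not. Uncomment to run:
# install.packages(c("Hmisc", "acepack", "arm", "chron"))
```

If have_make is FALSE, the "sh: make: command not found" lines in the log are expected for any package that contains compiled code.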

Berend


Re: [R] fdHess function

2013-01-22 Thread Mark Leeds

Hi Doug: I was just looking at this coincidentally. When theta is a vector, the
Fisher Information I_{theta} is the negative expectation of the second
derivatives of the log likelihood, so it's a matrix. In other words,
I_{theta} = -E[ partial^2 log f(X; theta) / partial theta^2 ], where theta is
a vector.

But even though the Fisher Information has a seemingly nice formula, in many
cases taking that expectation is not easy, so the Fisher Information is
approximated by its empirical counterpart, which is obtained by summing each
element of the matrix of negative second derivatives over the n observations
and then dividing each element by n.

(This is where my confusion arose when I was dealing with this, and why I'm
looking at it right now. I have a short document that I wrote to myself
explaining it, so if anyone wants it, email me individually. It's nothing
earth shattering!)
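
A small numerical illustration of the empirical counterpart, as a sketch only. It assumes a Poisson model (my choice, not from the thread), where the negative second derivative of log f(x; lambda) with respect to lambda is x / lambda^2 and the expected information per observation is 1 / lambda. The data are simulated and the names are made up for the example.

```r
set.seed(1)
x <- rpois(200, lambda = 4)     # simulated data; lambda = 4 is arbitrary
lambda_hat <- mean(x)           # MLE of a Poisson mean

# Negative second derivative of log f(x; lambda), one value per observation,
# evaluated at the estimate
neg_d2 <- x / lambda_hat^2

# Empirical information per observation: sum over the n observations, divide by n
info_emp <- sum(neg_d2) / length(x)

# Expected information per observation, evaluated at the estimate
info_exp <- 1 / lambda_hat

c(empirical = info_emp, expected = info_exp)
```

In this particular model the two agree exactly at the MLE, because mean(x) / mean(x)^2 equals 1 / mean(x); in general they differ, which is why the empirical version is described as an approximation.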


On Tue, Jan 22, 2013 at 3:27 PM, Douglas Bates ba...@stat.wisc.edu wrote:

 Your question is better addressed to the R-help@R-project.org mailing list,
 which I am copying on this reply.

 You are confusing a statistical concept, the Fisher Information matrix,
 with a numerical concept, the Hessian matrix of a scalar function of a
 vector argument.

 The Fisher information matrix is the Hessian matrix of a particular
 function at its optimum and I have forgotten whether that function is the
 log-likelihood or negative twice the log-likelihood or ...  Rather than get
 it wrong I am sending a copy of this reply to the list where many of the
 readers will be able to answer you more reliably than I can.


 On Tue, Jan 22, 2013 at 1:22 PM, Marcos Coque Jr mcoqu...@yahoo.com.br
 wrote:

  Dear Bates,
 
  I am using the fdHess function for R language.
  And I have a question.
 
  What is the relationship between the Hessian and the Fisher Information in
  your function?
  I think that Fisher Information = -Hessian, but I found the opposite
  in your function.
  Maybe I am doing something wrong...
 
  Thanks,
 
  Marcos
 


Re: [R] fdHess function

2013-01-22 Thread Mark Leeds

I neglected to mention that, once you get either I_theta or some empirical
estimate of it, you then invert it to get an estimate of the asymptotic
covariance matrix of the MLE.
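
The inversion step can be sketched with base R's optim(), which returns a numerically differentiated Hessian when hessian = TRUE. This is an illustration under assumptions, not a statement about fdHess: the normal model, the simulated data and the names negloglik and vcov_hat are all made up for the example. Note the sign convention: because the function being minimised is the negative log-likelihood, its Hessian is already the observed information and needs no extra minus sign.

```r
set.seed(2)
y <- rnorm(100, mean = 10, sd = 2)   # simulated data, arbitrary parameters

# Negative log-likelihood; sd is parameterised on the log scale so that
# the optimiser cannot propose a negative standard deviation
negloglik <- function(par) {
  -sum(dnorm(y, mean = par[1], sd = exp(par[2]), log = TRUE))
}

fit <- optim(c(mean(y), log(sd(y))), negloglik,
             method = "BFGS", hessian = TRUE)

# fit$hessian is the Hessian of the NEGATIVE log-likelihood, i.e. the
# observed information summed over all n observations
vcov_hat <- solve(fit$hessian)       # estimated asymptotic covariance matrix
se <- sqrt(diag(vcov_hat))           # standard errors of (mean, log sd)
se
```

Had the log-likelihood itself been maximised, the Hessian would be negative definite at the optimum and the information would be minus that Hessian, which may be the source of the apparent sign disagreement in the original question.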



Re: [R] plot two time series with different length and different starting point in one figure.

Hi Rebecca,

In the previous email, 
res <- merge(Anew, Bnew)
head(res)
#   Anew Bnew
#2012-01-01  181   NA
#2012-01-02   59   NA
#2012-01-03  290   NA
#2012-01-04  196   NA
#2012-01-05  111   NA
#2012-01-06  297   NA
 
plot.zoo(res) # removes the NA values from Bnew.. (if NA was present in Anew, I 
guess, it would remove that from plotting)  

If you want to remove the NA rows,
use na.omit() or complete.cases(), as I did in the previous email.

Could you dput() an example dataset?

A.K.
 




- Original Message -
From: Yuan, Rebecca rebecca.y...@bankofamerica.com
To: 'arun' smartpink...@yahoo.com
Cc: R help r-help@r-project.org
Sent: Tuesday, January 22, 2013 2:38 PM
Subject: RE: [R] plot two time series with different length and different 
starting point in one figure.

Hello Arun,

This would help me to get the date type of data. A new question comes up: 
since the dates are not exactly the same in the two data sets, there are some NA 
values in the merged data set, such as

2012-09-28       NA          NA    5400726 14861715970
2012-09-30  5035606 14832837436         NA          NA

Does R have a function to convert the date to a format such as Sep 2012, 
so that when I merge the two they will not have those NA values?

Thanks,

Rebecca

-Original Message-
From: arun [mailto:smartpink...@yahoo.com] 
Sent: Tuesday, January 22, 2013 2:15 PM
To: Yuan, Rebecca
Cc: R help
Subject: Re: [R] plot two time series with different length and different 
starting point in one figure.

Hi Rebecca,

Assuming that 'raw_data' is data.frame with first column as raw_time:
You could convert the raw_time to date format by 

 as.Date("28FEB2002", format = "%d%B%Y")
#[1] "2002-02-28"

In your data, it should be:
raw_data$raw_time <- as.Date(raw_data$raw_time, format = "%d%B%Y")
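
A self-contained version of the same conversion, as a sketch. The three raw_time values are made up to resemble the summary() output quoted further down, where raw_time appears to be a factor. Month names are matched according to the locale, so the sketch switches LC_TIME to "C" first; that line is my assumption about what a non-English session would need, not part of the original advice.

```r
invisible(Sys.setlocale("LC_TIME", "C"))   # English month names for parsing

raw_time <- factor(c("28FEB2002", "31MAR2002", "30APR2002"))  # made-up values

# %d = day of month, %b = month name, %Y = four-digit year;
# as.character() makes the factor-to-text step explicit
dates <- as.Date(as.character(raw_time), format = "%d%b%Y")
dates
class(dates)
```

Once the column has class "Date", summary() reports minimum, quartiles and maximum as dates rather than counts of factor levels.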

Could you just dput() a few lines of your dataset if this is not working?
Tx.

A.K.




- Original Message -
From: Yuan, Rebecca rebecca.y...@bankofamerica.com
To: 'arun' smartpink...@yahoo.com
Cc: 
Sent: Tuesday, January 22, 2013 2:08 PM
Subject: RE: [R] plot two time series with different length and different 
starting point in one figure.

Hello Arun,

My data shows that I do not have a date type of data:

summary(raw_data)
      raw_time      raw_acct         raw_baln
28FEB2002:  1   Min.   : 61714   Min.   :117079835
28FEB2003:  1   1st Qu.: 75587   1st Qu.:158035150
28FEB2005:  1   Median :100234   Median :206906298
28FEB2006:  1   Mean   : 96058   Mean   :210550369
28FEB2007:  1   3rd Qu.:116908   3rd Qu.:263623782
28FEB2009:  1   Max.   :121853   Max.   :325290870
(Other)  :127                                      


How could I convert the raw_time column to a date format, such as

summary(dateA)
        Min.      1st Qu.       Median         Mean      3rd Qu.         Max. 
2012-01-01 2012-04-01 2012-07-01 2012-07-01 2012-09-30 2012-12-31


Thanks very much!

Cheers,

Rebecca

-Original Message-
From: arun [mailto:smartpink...@yahoo.com]
Sent: Tuesday, January 22, 2013 12:39 PM
To: Yuan, Rebecca
Cc: R help; Petr PIKAL
Subject: Re: [R] plot two time series with different length and different 
starting point in one figure.

Hi,

You could also try this:
dateA <- seq.Date(as.Date("1jan2012", format = "%d%b%Y"),
                  as.Date("31Dec2012", format = "%d%b%Y"), by = "day")
dateB <- seq.Date(as.Date("1Mar2012", format = "%d%b%Y"),
                  as.Date("30Nov2012", format = "%d%b%Y"), by = "day")
set.seed(15)
A <- data.frame(dateA, value = sample(1:300, 366, replace = TRUE))
set.seed(25)
B <- data.frame(dateB, value = sample(1:300, 275, replace = TRUE))
library(xts)
Anew <- as.xts(A[, -1], order.by = A[, 1])
Bnew <- as.xts(B[, -1], order.by = B[, 1])
res <- merge(Anew, Bnew)
res1 <- res[complete.cases(res), ]
library(zoo)
plot.zoo(res1)
plot.zoo(res)

A.K.




- Original Message -
From: Yuan, Rebecca rebecca.y...@bankofamerica.com
To: 'PIKAL Petr' petr.pi...@precheza.cz
Cc: R help r-help@r-project.org
Sent: Tuesday, January 22, 2013 10:36 AM
Subject: Re: [R] plot two time series with different length and different 
starting point in one figure.

Hello Petr,

As the time series have the same column names, I got the error message like:



 m1 <- merge(A, B, by.x = "time", by.y = "balance")
Error in fix.by(by.x, x) : 'by' must specify uniquely valid column(s)


To plot A and B in one plot is to compare the difference between them...

Any other thoughts?

Thanks,

Rebecca


-Original Message-
From: PIKAL Petr [mailto:petr.pi...@precheza.cz]
Sent: Tuesday, January 22, 2013 10:28 AM
To: Yuan, Rebecca; R help
Subject: RE: plot two time series with different length and different starting 
point in one figure.

Hi

 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-bounces@r- 
 project.org] On Behalf Of Yuan, Rebecca
 Sent: Tuesday, January 22, 2013 4:07 PM
 To: R help
 Subject: [R] plot two time series with different length and different 
 starting point in one

Re: [R] user units in plotrix

2013-01-22 Thread Greg Snow

If you want to convert between different units using base graphics then
look at the grconvertX and grconvertY functions (in the graphics package).
 These functions will convert from/to user coordinates, inches, device,
figure, and plot coordinates.  So you could use grconvertX to  find out
what user value on the x scale to give to draw.circle that would then
generate a circle with a given size in inches, or relative to the device,
figure, or plotting region.
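
A minimal sketch of that conversion, using only base graphics. The axis ranges and the 0.5 inch size are arbitrary, and pdf(NULL) is used just so that no file is written. Whether draw.circle measures its radius along the x axis is an assumption to check on its help page; the sketch only shows how to obtain the user-coordinate length for either axis.

```r
pdf(NULL, width = 6, height = 4)     # off-screen device, nothing is written
plot(0:1, c(0, 1e6), type = "n")     # very different x and y ranges

# Length in user units of a 0.5 inch span: convert both ends of the span
# and take the difference, separately for each axis
r_inches <- 0.5
r_user_x <- diff(grconvertX(c(0, r_inches), from = "inches", to = "user"))
r_user_y <- diff(grconvertY(c(0, r_inches), from = "inches", to = "user"))

c(x = r_user_x, y = r_user_y)
invisible(dev.off())
```

The two results differ by orders of magnitude here, which is exactly the ambiguity raised in the question: a radius in user coordinates only has a definite size once an axis has been chosen.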


On Sun, Jan 20, 2013 at 2:59 PM, Murat Tasan mmu...@gmail.com wrote:

 hi all - i'm having some difficulty figuring out how to convert
 between user units (which i can't find a definition for in the
 plotrix package) and either (a) device units (e.g. inches with PDFs)
 or (b) user coordinates along any particular axis.

 as an example, suppose i set up a PDF device with inches, the device
 has both outer and inner margins, and the plot region has drastically
 different x and y coordinate ranges (e.g. xlim = c(0, 1), ylim = c(0,
 SOME_VERY_LARGE_NUMBER)).

 now i'd like to draw.circle(...) but i can't figure out what units the
 radius argument takes.
 "user units" doesn't appear to be inches in this case, and if it
 corresponds to user coordinates, i don't know which axis' scaling is
 to be used as the reference.

 ideally, one would be able to specify the radius in user coordinates
 while specifying _which_ axis to use as the standard (e.g. an axis =
 "y" or axis = "x" argument).

 getFigCtr(...) can help in figuring this out, but its argument takes
 the relative position of the figure region, rather than the plot
 region, which is more apt for properly placing shapes.

 i know the grid package has extensive unit conversion code, but i'm
 trying to update a series of figures using only base graphics...

 i can't seem to find a rigorous definition of user units anywhere in
 the plotrix package.
 anyone know of where i can find this info?

 cheers,

 -m





-- 
Gregory (Greg) L. Snow Ph.D.
538...@gmail.com


Re: [R] how to give a lengend in symbols functions

2013-01-22 Thread Greg Snow

I don't see a symbols function in the gtools package, do you mean the
symbols function in the graphics package?

If so, there is not a simple legend or key function to create the legend
(the number of possible options would make it more complicated than
building the legend by hand).  You will need to construct the legend by
hand.  You can use the symbols function to add the example symbols to the
legend and the text function to add the explanatory text.  The functions
grconvertX, grconvertY, strheight, and strwidth will help with deciding
where to place the text and symbols.
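
One way the hand-built legend could look, as a sketch with made-up data. The names size, max_in, key_vals and the key positions are all hypothetical. It relies on the documented behaviour that a numeric inches value scales the largest symbol in a call to that size, so each key circle is drawn in its own call with inches reduced in proportion.

```r
pdf(NULL)                                  # off-screen device for the sketch
set.seed(3)
x <- runif(20)
y <- runif(20)
size <- runif(20, 1, 10)                   # made-up magnitudes

max_in <- 0.2                              # size in inches given to the largest circle
plot(x, y, type = "n", xlim = c(0, 1.4))   # extra room on the right for the key
symbols(x, y, circles = size, inches = max_in, bg = "lightgreen", add = TRUE)

# Key: symbols() scales relative to the largest value in each call, so every
# key circle gets its own 'inches' value to stay consistent with the main plot
key_vals <- c(2, 5, 10)
key_x <- rep(1.2, 3)
key_y <- c(0.8, 0.6, 0.4)
key_in <- max_in * key_vals / max(size)
for (i in seq_along(key_vals)) {
  symbols(key_x[i], key_y[i], circles = key_vals[i], inches = key_in[i],
          bg = "lightgreen", add = TRUE)
}
text(key_x + 0.1, key_y, labels = key_vals, adj = 0)   # explanatory labels
invisible(dev.off())
```

The key positions are hard-coded here for brevity; grconvertX, grconvertY, strheight and strwidth would be the tools for placing them relative to the plot region instead.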


On Mon, Jan 21, 2013 at 6:37 PM, Jie Tang totang...@gmail.com wrote:

 hi Rusers

   I am trying to use symbols in the gtools package
 symbols(data1, data3, circles = data1/data3, inches = 0.1, bg = "lightgreen")

 Now I want to add a legend to tell the reader the meaning or magnitude of
 these circles.
 How can I add this information to a symbols plot, just like a legend in plot?
  Thank you.
 --
 TANG Jie
 Email: totang...@gmail.com
 Tel: 0086-2154896104
 Shanghai Typhoon Institute,China





-- 
Gregory (Greg) L. Snow Ph.D.
538...@gmail.com


[R] Create a Data Frame from an XML

2013-01-22 Thread Adam Gabbert

 Hello,

I'm attempting to read information from an XML into a data frame in R using
the XML package. I am unable to get the data into a data frame as I would
like.  I have some sample code below.

*XML Code:*

Header...

Data I want in a data frame:

<data>
  <row BRAND="GMC" NUM="1" YEAR="1999" VALUE="1" />
  <row BRAND="FORD" NUM="1" YEAR="2000" VALUE="12000" />
  <row BRAND="GMC" NUM="1" YEAR="2001" VALUE="12500" />
  <row BRAND="FORD" NUM="1" YEAR="2002" VALUE="13000" />
  <row BRAND="GMC" NUM="1" YEAR="2003" VALUE="14000" />
  <row BRAND="FORD" NUM="1" YEAR="2004" VALUE="17000" />
  <row BRAND="GMC" NUM="1" YEAR="2005" VALUE="15000" />
  <row BRAND="GMC" NUM="1" YEAR="1967" VALUE="PRICLESS" />
  <row BRAND="FORD" NUM="1" YEAR="2007" VALUE="17500" />
  <row BRAND="GMC" NUM="1" YEAR="2008" VALUE="22000" />
</data>

*R Code:*

doc <- xmlInternalTreeParse("Sample2.xml")
top <- xmlRoot(doc)
xmlName(top)
names(top)
art <- top[["row"]]
art

*Output:*

> art
<row BRAND="GMC" NUM="1" YEAR="1999" VALUE="1"/>


This is where I am having difficulties.  I am unable to access additional
rows (i.e. <row BRAND="GMC" NUM="1" YEAR="1967" VALUE="PRICLESS" />)

and I am unable to access the individual entries to actually create the
data frame.  The data frame I would like is as follows:

BRAND  NUM  YEAR  VALUE
GMC    1    1999  1
FORD   2    2000  12000
GMC    1    2001  12500
etc

Any help or suggestions would be appreciated.  Conversely, my eventual goal
would be to take a data frame and write it into an XML in the previously
shown format.
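
One possible approach, offered as a sketch rather than a tested answer for the real file. It assumes the XML package is installed and that every row element carries the same four attributes; the inline xml_text is a shortened, made-up stand-in for Sample2.xml. It collects the attributes of all row nodes with getNodeSet() and xmlAttrs(), then rebuilds the same layout with newXMLNode().

```r
library(XML)   # assumes the XML package is installed

xml_text <- '<data>
  <row BRAND="GMC" NUM="1" YEAR="1999" VALUE="1" />
  <row BRAND="FORD" NUM="1" YEAR="2000" VALUE="12000" />
  <row BRAND="GMC" NUM="1" YEAR="1967" VALUE="PRICLESS" />
</data>'

doc <- xmlParse(xml_text, asText = TRUE)

# One named character vector of attributes per <row>, stacked into a matrix
rows <- getNodeSet(doc, "//row")
df <- as.data.frame(do.call(rbind, lapply(rows, xmlAttrs)),
                    stringsAsFactors = FALSE)
df$YEAR <- as.integer(df$YEAR)   # VALUE stays character: it holds "PRICLESS"
df

# Reverse direction: one <row> node per data frame row, columns as attributes
out <- newXMLNode("data")
for (i in seq_len(nrow(df))) {
  newXMLNode("row", attrs = unlist(lapply(df[i, ], as.character)),
             parent = out)
}
out
```

For the real file, xmlParse("Sample2.xml") would replace the asText call, and saveXML(out, file = ...) could write the rebuilt tree to disk.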

Thank you

AG


Re: [R] change confidence interval line length in barplot2 (plotrix package)

2013-01-22 Thread David L Carlson

Maybe a fortunate mistake. If you use the base graphics barplot(), you can
use plotCI() in plotrix to add the confidence intervals with control over
the width of the horizontal ends of the bars (if needed, the defaults are
much narrower):

out <- barplot(hh, beside = TRUE,
   col = c("lightblue", "mistyrose", "lightcyan", "lavender"),
   legend = colnames(VADeaths), ylim = c(0, 20),
   main = "Death Rates in Virginia", font.main = 4,
   sub = "Faked 95 percent error bars", col.sub = mybarcol,
   cex.names = 1.5)
plotCI(out, hh, pch = "", gap = 0, ui = ci.u, li = ci.l, add = TRUE)

--
David L Carlson
Associate Professor of Anthropology
Texas A&M University
College Station, TX 77843-4352


Re: [R] change confidence interval line length in barplot2 (plotrix package)

2013-01-22 Thread Marc Schwartz

On Jan 22, 2013, at 2:41 PM, Martin Batholdy batho...@googlemail.com wrote:

 Ok, I have to apologize,
 I confused the packages.
 
 It's the function barplot2 from the gplots package!
 
 
 It calls itself an extension of barplot2 and has a ci.lwd argument. Might 
 save you the time of doing what I thought might be needed, hacking the code.
 
 Unfortunately ci.lwd controls the thickness of the line but not the 
 horizontal width.


barplot2() in gplots uses a hard coded width for the CI's, which is 50% of the 
bar width, so it is a consistent proportion.

You could hack the code or simply use base graphics barplot() along with either 
?segments or perhaps more easily, ?arrows, which would give you more 
flexibility.

Compare:

mp <- barplot(1:5)
arrows(mp, 1:5 + 0.5, mp, 1:5 - 0.5, code = 3, angle = 90, length = 0.1)

with:

mp <- barplot(1:5)
arrows(mp, 1:5 + 0.5, mp, 1:5 - 0.5, code = 3, angle = 90, length = 0.25)

where the 'length' argument to arrows() defines the width of the upper and 
lower boundary lines.
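
For comparison, the segments() route mentioned above could look like the following sketch, applied to the data from the original question. The cap half-width (cap) is a made-up name and value; it is expressed in x user units, so it scales with the bars, whereas the 'length' argument of arrows() is in inches.

```r
pdf(NULL)                                   # off-screen device for the sketch
hh <- t(VADeaths)[, 1]
ci.l <- hh * 0.85
ci.u <- hh * 1.15

mp <- barplot(hh, ylim = c(0, 20))          # mp holds the bar midpoints
cap <- 0.15                                 # half-width of the caps, x user units

segments(mp, ci.l, mp, ci.u)                # vertical interval lines
segments(mp - cap, ci.u, mp + cap, ci.u)    # upper caps
segments(mp - cap, ci.l, mp + cap, ci.l)    # lower caps
invisible(dev.off())
```

Changing cap alone widens or narrows the horizontal lines without touching the bar width, which is the control that was asked for.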

There are a fair number of other functions around that can add CI's to plots as 
well and a search of the archives should bear fruit.

Regards,

Marc Schwartz


 

Re: [R] plot two time series with different length and different starting point in one figure.

HI Rebecca,
Try this:

dateA <- seq.Date(as.Date("28JAN2012", format = "%d%B%Y"),
                  as.Date("28DEC2012", format = "%d%B%Y"), by = "month")
dateB <- seq.Date(as.Date("30JAN2012", format = "%d%B%Y"),
                  as.Date("30DEC2012", format = "%d%B%Y"), by = "month")
set.seed(15)
A <- data.frame(dateA, value = cumsum(sample(1:50, 12, replace = TRUE)))
set.seed(25)
B <- data.frame(dateB, value = cumsum(sample(1:72, 12, replace = TRUE)))
# Move every date in B to the 28th of its month
B[, 1] <- as.Date(gsub("\\d+$", "28", B[, 1]))
# This step may not be needed in your data. Stepping by month from the 30th
# rolls the February date over into March, so there were two values in
# March; the first of them is moved back to February here.
dup <- duplicated(B[, 1], fromLast = TRUE)
B[, 1][dup] <- as.Date(gsub("(.*-).*(-.*)", "\\102\\2", B[, 1][dup]))
library(xts)
Anew <- as.xts(A[, -1], order.by = A[, 1])
Bnew <- as.xts(B[, -1], order.by = B[, 1])
res <- merge(Anew, Bnew)
library(zoo)
plot.zoo(res)
A.K.



- Original Message -
From: Yuan, Rebecca rebecca.y...@bankofamerica.com
To: 'arun' smartpink...@yahoo.com
Cc: 
Sent: Tuesday, January 22, 2013 3:53 PM
Subject: RE: [R] plot two time series with different length and different 
starting point in one figure.

Hello Arun,

I do not want to remove those NA values because they are the monthly data but 
recorded as the last calendar date in A and last business date in B.

I tried to use 

raw_time <- substr(raw_time, 3, 9)
raw_time <- as.Date(raw_time, format = "%d%B%Y")

to cutoff the date and leave the month and year in raw_time, and then convert 
it to a valid date type of data, but I failed.

Is there a way that I can present

2012-09-28       NA          NA    5400726 14861715970 
2012-09-30  5035606 14832837436         NA          NA

into something like

2012-09-30  5035606 14832837436    5400726 14861715970

That is, by converting 2012-09-28 to the last calendar date of the month, 
2012-09-30, B would be recorded on the same date as A and the merged data 
would not have any NA values.

Dput() gives me

 dput(tail(res))
structure(c(121, NA, 111, 111, 120, 119, 309, 
NA, 313, 307, 30, 313, 130, 
130, NA, 130, 130, 130, 309, 313, 
NA, 309, 310, 315), class = c(xts, 
zoo), .indexCLASS = Date, .indexTZ = , tclass = Date, tzone = , index 
= structure(c(134, 
134, 134, 135, 135, 135), tzone = , tclass = Date), .Dim = c(6L, 
4L), .Dimnames = list(NULL, c(raw_acct, raw_baln, raw_acct.1, 
raw_baln.1)))

Thanks very much!

Cheers,

Rebecca

-Original Message-
From: arun [mailto:smartpink...@yahoo.com] 
Sent: Tuesday, January 22, 2013 3:41 PM
To: Yuan, Rebecca
Cc: R help
Subject: Re: [R] plot two time series with different length and different 
starting point in one figure.

Hi Rebecca,

In the previous email,
res <- merge(Anew, Bnew)
head(res)
#   Anew Bnew
#2012-01-01  181   NA
#2012-01-02   59   NA
#2012-01-03  290   NA
#2012-01-04  196   NA
#2012-01-05  111   NA
#2012-01-06  297   NA
 
plot.zoo(res) # removes the NA values from Bnew.. (if NA was present in Anew, I 
guess, it would remove that from plotting)  

If you want to remove the NA rows,
use na.omit() or complete.cases(), as I did in the previous email.

Could you dput() an example dataset?

A.K.
 





Re: [R] plot two time series with different length and different starting point in one figure.

Hello Arun,

Thanks very much! In this way, it works! I convert both A and B to the same day 
of the month, and therefore there is no NA shown for different last business 
day and last calendar day of the month.

You are very helpful!

Cheers,

Rebecca

-Original Message-
From: arun [mailto:smartpink...@yahoo.com] 
Sent: Tuesday, January 22, 2013 5:06 PM
To: Yuan, Rebecca
Cc: R help
Subject: Re: [R] plot two time series with different length and different 
starting point in one figure.

HI Rebecca,
Try this:

dateA <- seq.Date(as.Date("28JAN2012",format="%d%B%Y"),as.Date("28DEC2012",format="%d%B%Y"),by="month")
dateB <- seq.Date(as.Date("30JAN2012",format="%d%B%Y"),as.Date("30DEC2012",format="%d%B%Y"),by="month")
set.seed(15)
A <- data.frame(dateA,value=cumsum(sample(1:50,12,replace=TRUE)))
set.seed(25)
B <- data.frame(dateB,value=cumsum(sample(1:72,12,replace=TRUE)))
B[,1] <- as.Date(gsub("\\d+$","28",B[,1]))
B[,1][duplicated(B[,1],fromLast=TRUE)] <- as.Date(gsub("(.*-).*(-.*)","\\102\\2",B[,1][duplicated(B[,1],fromLast=TRUE)]))
#this step may not be needed in ur data.  In the month of march, there were
#two values
library(xts)
Anew <- as.xts(A[,-1],order.by=A[,1])
Bnew <- as.xts(B[,-1],order.by=B[,1])
res <- merge(Anew,Bnew)
library(zoo)
plot.zoo(res)
A.K.
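
Another way to line up two monthly series whose day of month differs is to merge on a year-month key instead of rewriting the day. This is only a base-R sketch under that assumption; the data frames and the column names acct_A and acct_B are made up for illustration.

```r
# Hypothetical month-end series: A uses the last calendar day, B the last business day
A <- data.frame(date = as.Date(c("2012-08-31", "2012-09-30", "2012-10-31")),
                acct_A = c(10, 20, 30))
B <- data.frame(date = as.Date(c("2012-08-31", "2012-09-28", "2012-10-31")),
                acct_B = c(1, 2, 3))

# Key both series by year-month so differing days within a month still match
A$month <- format(A$date, "%Y-%m")
B$month <- format(B$date, "%Y-%m")

res <- merge(A[, c("month", "acct_A")], B[, c("month", "acct_B")],
             by = "month", all = TRUE)
print(res)
```

With one observation per month in each series, every row of the merged result has both values, so no NA appears for months where only the day differs.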



- Original Message -
From: Yuan, Rebecca rebecca.y...@bankofamerica.com
To: 'arun' smartpink...@yahoo.com
Cc: 
Sent: Tuesday, January 22, 2013 3:53 PM
Subject: RE: [R] plot two time series with different length and different 
starting point in one figure.

Hello Arun,

I do not want to remove those NA values because they are the monthly data but 
recorded as the last calendar date in A and last business date in B.

I tried to use 

raw_time    <- substr(raw_time,3,9)
raw_time    <- as.Date(raw_time,format="%d%B%Y")

to cutoff the date and leave the month and year in raw_time, and then convert 
it to a valid date type of data, but I failed.

Is there a way that I can present

2012-09-28       NA          NA    5400726 14861715970 2012-09-30  5035606 
14832837436         NA          NA

into something like

2012-09-30  5035606 14832837436    5400726 14861715970

By converting 2012-09-28 to the last calendar date as of 2012-09-30 then B will 
be recorded at the last business date of the month, and will not have any NA 
values.

Dput() gives me

 dput(tail(res))
structure(c(121, NA, 111, 111, 120, 119, 309, NA, 313, 307, 30, 313, 130, 130, 
NA, 130, 130, 130, 309, 313, NA, 309, 310, 315), class = c("xts", "zoo"), 
.indexCLASS = "Date", .indexTZ = "", tclass = "Date", tzone = "", index = 
structure(c(134, 134, 134, 135, 135, 135), tzone = "", tclass = "Date"), .Dim = 
c(6L, 4L), .Dimnames = list(NULL, c("raw_acct", "raw_baln", "raw_acct.1",
"raw_baln.1")))

Thanks very much!

Cheers,

Rebecca

-Original Message-
From: arun [mailto:smartpink...@yahoo.com]
Sent: Tuesday, January 22, 2013 3:41 PM
To: Yuan, Rebecca
Cc: R help
Subject: Re: [R] plot two time series with different length and different 
starting point in one figure.

Hi Rebecca,

In the previous email,
  res <- merge(Anew,Bnew)
head(res)
#   Anew Bnew
#2012-01-01  181   NA
#2012-01-02   59   NA
#2012-01-03  290   NA
#2012-01-04  196   NA
#2012-01-05  111   NA
#2012-01-06  297   NA
 
plot.zoo(res) # removes the NA values from Bnew.. (if NA was present in Anew, I
              # guess, it would remove that from plotting)

If you want to remove the NA rows:
use, na.omit() or complete.cases()? #as I did in the previous email.

Could you dput() an example dataset?

A.K.
 





[R] tapply and functions with more than one objects

2013-01-22 Thread Dominic Roye

Hello,

How can I use a custom function in tapply that takes more than one argument?

I mean, sum(x) needs only one object, but when I have a function
function(x,y) that takes more, how do I indicate which other variables
to use?


I hope someone can help me. Thank you!!

Best regards,

Dominic

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Adding a line to barchart

2013-01-22 Thread Jonathan Greenberg

R-helpers:

I need a quick help with the following graph (I'm a lattice newbie):

require(lattice)
npp=1:5
names(npp)=c("A","B","C","D","E")
barchart(npp,origin=0,box.width=1)

# What I want to do is add a single vertical line positioned at x = 2 that
# lies over the bars (say, using a dotted line).  How do I go about doing
# this?

--j

-- 
Jonathan A. Greenberg, PhD
Assistant Professor
Global Environmental Analysis and Remote Sensing (GEARS) Laboratory
Department of Geography and Geographic Information Science
University of Illinois at Urbana-Champaign
607 South Mathews Avenue, MC 150
Urbana, IL 61801
Phone: 217-300-1924
http://www.geog.illinois.edu/~jgrn/
AIM: jgrn307, MSN: jgrn...@hotmail.com, Gchat: jgrn307, Skype: jgrn3007


[R] density of hist(freq = FALSE) inversely affected by data magnitude

2013-01-22 Thread J Toll

Hi,

I have a couple of observations, a question or two, and perhaps a
suggestion related to the plotting of density on the y-axis within the
hist() function when freq=FALSE. I was using the function and trying
to develop an intuitive understanding of what the density is telling
me. After reading through this fairly helpful post:

http://stats.stackexchange.com/questions/17258/odd-problem-with-a-histogram-in-r-with-a-relative-frequency-axis

I finally realized that in the case where freq = FALSE, the y-axis
isn't really telling me the density. It's actually indicating the
density multiplied by the bin size. I assume this is for the case
where the bins may be of non-regular size.

from hist.default:

dens <- counts/(n * diff(breaks))

So the count in each bin is divided by the total number of
observations (n) multiplied by the size of the bin. The problem, as I
see it, is that the density ends up being scaled by the size of the
bins, which is inversely proportional to the magnitude of the data.
Therefore the magnitude of the data is directly affecting the density,
which seems problematic.

For example*:

set.seed(1)  # arbitrary seed
x <- runif(100)
y <- x / 1000

par(mfrow = c(2, 1))
hist(x, prob = TRUE)
hist(y, prob = TRUE)

From this example, you see that the density for the y histogram is
1000 times larger, simply because the y data is 1000 times smaller.
Again, that seems problematic. It seems to me, that the density
should be unit-less, but here it's affected by the magnitude of the
data.

So, my question is, why is density calculated this way?

For the case where all the bins are of the same size, I would think
density should simply be calculated as:

dens <- counts / n

Of course, that might be somewhat misleading for the case where the
bin sizes vary. So then why not calculate density as:

dens <- counts / (n * diff(breaks) / min(diff(breaks)))

Dividing diff(breaks) by min(diff(breaks)) removes the scaling effect
of the magnitude of the data, and simply leaves the relative
difference in bin size.

For the case where all the bins are the same size, the calculation is
equivalent to dens <- counts / n

For all other cases, the density is scaled by the size of the bin, but
unaffected by the magnitude of the data.

So, what am I misunderstanding? Why is density calculated as it is,
and what does it mean?

Thanks,

James

*example from
http://stats.stackexchange.com/questions/17258/odd-problem-with-a-histogram-in-r-with-a-relative-frequency-axis


Re: [R] plot two time series with different length and different starting point in one figure.

Hi Rebecca,
No problem.
Just a doubt regarding the last calendar day and last business day.
dateA <- seq(as.Date("01FEB2012",format="%d%B%Y"),length=15,by="1 month")-1
#gives the last calendar day/month
dateB <- seq.Date(as.Date("28MAR2012",format="%d%B%Y"),as.Date("28DEC2012",format="%d%B%Y"),by="month")
#here I used day 28.  If it didn't change
#then this works.
set.seed(15)
A <- data.frame(dateA,value=cumsum(sample(1:50,15,replace=TRUE)))
set.seed(25)
B <- data.frame(dateB,value=cumsum(sample(1:72,10,replace=TRUE)))
A[,1] <- as.Date(gsub("\\d+$","28",A[,1]))
library(xts)
library(zoo)
Anew <- as.xts(A[,-1],order.by=A[,1])
Bnew <- as.xts(B[,-1],order.by=B[,1])
res <- merge(Anew,Bnew)
plot.zoo(res)
 
From your reply, it seems like dateB day didn't change.

A.K.






[R] summarise subsets of a vector

2013-01-22 Thread Wim Kreinen

Hello,

I have a vector called test. Now I wish to compute the mean of the first
10 numbers, the second 10 numbers, etc.
How does it work?
Thanks Wim

  dput (test)
c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0.71, 0.21875, 0, 0.27375, 0.26125,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0.84125,
0.0575, 0.92625, 0.12, 0, 0)


Re: [R] How to assign time series to a vector with one leap year

HI,
You can check this link:
http://r.789695.n4.nabble.com/leap-years-in-temporal-series-command-ts-td3309014.html

Also, this may help you:
library(lubridate), ?leap_year()
 leap_year(2008)
#[1] TRUE
 ymd("2008-2-29")
# 1 parsed with %Y-%m-%d
#[1] "2008-02-29 UTC"
A.K.
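
One way to sidestep the fixed `frequency` that ts() needs is to index the hourly values by actual timestamps, so the leap year takes care of itself. The sketch below is base R only and assumes the data are complete hourly readings in UTC; it builds the timestamps but does not read the airtemp.csv file from the question.

```r
# Hourly timestamps for 2008 (leap year) through 2009, in UTC to avoid DST gaps
stamps <- seq(as.POSIXct("2008-01-01 00:00:00", tz = "UTC"),
              as.POSIXct("2009-12-31 23:00:00", tz = "UTC"),
              by = "hour")

n_total <- length(stamps)                      # total number of hourly points
hours_per_year <- table(format(stamps, "%Y"))  # points per calendar year
print(n_total)
print(hours_per_year)
```

The counts should agree with the 8784 + 8760 = 17544 points mentioned in the question, and the timestamp vector can then serve as the index for an irregular series class such as zoo or xts.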




- Original Message -
From: Janesh Devkota janesh.devk...@gmail.com
To: r-help@r-project.org
Cc: 
Sent: Tuesday, January 22, 2013 2:46 PM
Subject: [R] How to assign time series to a vector with one leap year

Hello All,

I am trying to do the time series analysis in R and I want to assign a
vector as a time series. The data I provided is hourly. The data is from
Jan 1 2008 to Dec 31 2009. How can I assign the data such that the first
year is leap year and second is not ?

airtemp <- read.csv("airtemp.csv",header=T,sep="")

aw <- ts(airtemp,start=2008,frequency=8784,end=2009)

I assigned frequency as 8784 because 2008 year will have 8784 hourly data
points and 2009 has 8760 data points. The total data points are 17544

The data can be found on
https://www.dropbox.com/s/03z74632v1f3g1e/airtemp.csv

I apologize if this is very trivial to some of you.

Thanks.


Re: [R] tapply and functions with more than one objects


On Jan 22, 2013, at 2:24 PM, Dominic Roye wrote:

 Hello,
 
 How can I use a custom function in tapply that takes more than one argument?
 
 I mean, sum(x) needs only one object, but when I have a function
 function(x,y) that takes more, how do I indicate which other variables
 to use?

You can use:

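lapply(split( multi_col_object, category_vec) , function(d){ sum(d$x, d$y) }  )   # d is the sub-data-frame for one category; x and y stand for two of its columns

aggregate(dat, list(category), FUN=sum)   # `by` must be a list

Or:

do.call(rbind, by( multi_col_object, category_vec, function(d){ } ) )   # fill in the body; d is again one sub-data-frame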

Sometimes `Reduce` is more compact. Other times `mapply` is needed.
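
A small runnable sketch of these options, assuming a made-up data frame `dat` with columns x, y and a grouping column g, and an illustrative two-argument function `wsum` (none of these names come from the original question):

```r
# Hypothetical data: two measurement columns and a grouping variable
dat <- data.frame(x = c(1, 2, 3, 4), y = c(10, 20, 30, 40),
                  g = c("a", "a", "b", "b"))

# A custom two-argument function (illustrative only)
wsum <- function(x, y) sum(x * y)

# Option 1: split the whole data frame, then hand both columns to the function
by_split <- sapply(split(dat, dat$g), function(d) wsum(d$x, d$y))

# Option 2: tapply over row indices and look both columns up inside the function
by_index <- tapply(seq_len(nrow(dat)), dat$g,
                   function(i) wsum(dat$x[i], dat$y[i]))

# Option 3: split each column separately and pair the pieces with mapply
by_mapply <- mapply(wsum, split(dat$x, dat$g), split(dat$y, dat$g))

print(by_split)
```

All three give the same per-group values; the row-index trick in option 2 is probably the closest to keeping tapply itself.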
-- 

David Winsemius
Alameda, CA, USA


Re: [R] density of hist(freq = FALSE) inversely affected by data magnitude

2013-01-22 Thread William Dunlap

The probability density function is not unitless - it is the derivative of the
[cumulative] probability distribution function, so it has units of
delta-probability-mass over delta-x.  It must integrate to 1 (over all
possible x).  hist(freq=FALSE,x) or hist(prob=TRUE,x) displays an estimate of
the density function, and the following example shows how the scale matches
what you get from the presumed population density function.

f <- function (n, sd) 
{
    x <- rnorm(n, sd = sd)
    hist(x, freq = FALSE) # estimated density
    s <- seq(min(x), max(x), len = 129)
    lines(s, dnorm(s, sd = sd), col = "red") # overlay expected density for this sample
}
f(1e6, sd=1)
f(100, sd=1)
f(100, sd=0.0001)
f(1e6, sd=0.0001)
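
The "must integrate to 1" point can also be checked numerically without plotting. This is a minimal sketch using the x and y = x / 1000 data from the original question; the seed is arbitrary and the variable names area_x and area_y are only for illustration.

```r
set.seed(1)            # arbitrary seed, for reproducibility only
x <- runif(100)
y <- x / 1000          # same data on a scale 1000 times smaller

hx <- hist(x, plot = FALSE)
hy <- hist(y, plot = FALSE)

# Area under each histogram: sum of density * bin width
area_x <- sum(hx$density * diff(hx$breaks))
area_y <- sum(hy$density * diff(hy$breaks))
print(c(area_x, area_y))
```

Both areas come out as 1: the bars for y are taller only because its bins are narrower, which is what keeps the total probability mass fixed.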

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com



Re: [R] What is the convergence criterion for binomial logit in glm?


On Jan 22, 2013, at 2:55 PM, Dimitri Liakhovitski wrote:

 Dear R-ers,
 
 I am running a logistic regression using glm: glm(myDV ~ .,
 data=mydata, family=binomial(logit))
 
 I have a general question: in glm (binary logit) - what convergence
 criterion is being used?

You should look at the help page for `glm` (and follow the obvious links.)
 
 -- 

David Winsemius
Alameda, CA, USA


Re: [R] Adding a line to barchart

Hi,

Maybe this helps:
 barchart(npp,origin=0,box.width=1,
 panel=function(x,y,...){
 panel.barchart(x,y,...)
 panel.abline(v=2,col.line="red",lty=3)})
A.K.





Re: [R] summarise subsets of a vector

Hi,
try this:
 unlist(lapply(split(test,((seq_along(test)-1)%/% 10)+1),mean))
#       1        2        3        4        5        6        7        8 
#0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.146375 
#       9       10       11 
#0.000000 0.194500 0.000000 

A.K.
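
The same grouping index also works directly with tapply, which may read a little more simply. A sketch on a short made-up vector `v` with chunks of 5 (the question uses `test` and chunks of 10):

```r
# Hypothetical short vector; the thread's `test` works the same way
v <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12)
k <- 5  # chunk length (10 in the original question)

grp <- (seq_along(v) - 1) %/% k + 1   # 1,1,1,1,1,2,2,2,2,2,3,3
chunk_means <- tapply(v, grp, mean)
print(chunk_means)
```

A final, shorter chunk (here the last two values) simply gets its own mean.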




Re: [R] What is the convergence criterion for binomial logit in glm?

2013-01-22 Thread Dimitri Liakhovitski

I already looked. This help file for loglin (
http://127.0.0.1:12583/library/stats/html/loglin.html) says:
"The Iterative Proportional Fitting algorithm as presented in Haberman
(1972) is used for fitting the model. At most iter iterations are
performed, convergence is taken to occur when the maximum deviation between
observed and fitted margins is less than eps." And the default eps is 0.1.

So, is it then the convergence criterion used by glm when
family=binomial(logit)?
I just need to know for sure.

Thanks for confirming!
Dimitri
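
For what it is worth, the loglin criterion quoted above does not appear to be the one glm uses. According to ?glm.control, glm iterates until |dev - dev_old| / (|dev| + 0.1) < epsilon, with the tolerance and iteration cap set through glm.control(). A minimal sketch follows; the toy data frame `d` is made up purely to show how the settings are passed.

```r
ctrl <- glm.control()     # defaults documented in ?glm.control
print(ctrl$epsilon)       # convergence tolerance
print(ctrl$maxit)         # maximum number of IWLS iterations

# Hypothetical toy data to show how the tolerance can be tightened
set.seed(1)
d <- data.frame(x = rnorm(50))
d$y <- rbinom(50, 1, plogis(0.5 * d$x))
fit <- glm(y ~ x, data = d, family = binomial("logit"),
           control = glm.control(epsilon = 1e-10, maxit = 50))
print(fit$converged)
```

The fitted object's `converged` and `iter` components report whether and after how many iterations the criterion was met.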



On Tue, Jan 22, 2013 at 6:37 PM, David Winsemius dwinsem...@comcast.net wrote:


 On Jan 22, 2013, at 2:55 PM, Dimitri Liakhovitski wrote:

  Dear R-ers,
 
  I am running a logistic regression using glm: glm(myDV ~ .,
  data=mydata, family=binomial(logit))
 
  I have a general question: in glm (binary logit) - what convergence
  criterion is being used?

 You should look at the help page for `glm` (and follow the obvious links.)
 
  --

 David Winsemius
 Alameda, CA, USA




-- 
Dimitri Liakhovitski
gfk.com http://marketfusionanalytics.com/


Re: [R] What is the convergence criterion for binomial logit in glm?