Re: [R] Query on finding root

2023-08-28 Thread Ben Bolker
(I mean pdavies)

On Mon, Aug 28, 2023, 7:52 AM Ben Bolker  wrote:

> I would probably use the built in qdavies() function...
>
> On Mon, Aug 28, 2023, 7:48 AM Leonard Mada via R-help <
> r-help@r-project.org> wrote:
>
>> Dear R-Users,
>>
>> Just out of curiosity:
>> Which of the 2 methods is the better one?
>>
>> The results seem to differ slightly.
>>
>>
>> fun = function(u){((26104.50*u^0.03399381)/((1-u)^0.107)) - 28353.7}
>>
>> uniroot(fun, c(0,1))
>> # 0.6048184
>>
>> curve(fun(x), 0, 1)
>> abline(v=0.3952365, col="red")
>> abline(v=0.6048184, col="red")
>> abline(h=0, col="blue")
>>
>>
>>
>> fun = function(u){ (0.03399381*log(u) - 0.107*log(1-u)) -
>> log(28353.7/26104.50) }
>> fun = function(u){ (0.03399381*log(u) - 0.107*log1p(-u)) -
>> log(28353.7/26104.50) }
>>
>> uniroot(fun, c(0,1))
>> # 0.6047968
>>
>> curve(fun(x), 0, 1)
>> abline(v=0.3952365, col="red")
>> abline(v=0.6047968, col="red")
>> abline(h=0, col="blue")
>>
>> Sincerely,
>>
>> Leonard
>>
>> __
>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>




Re: [R] Query on finding root

2023-08-28 Thread Ben Bolker
I would probably use the built in qdavies() function...

On Mon, Aug 28, 2023, 7:48 AM Leonard Mada via R-help 
wrote:

> Dear R-Users,
>
> Just out of curiosity:
> Which of the 2 methods is the better one?
>
> The results seem to differ slightly.
>
>
> fun = function(u){((26104.50*u^0.03399381)/((1-u)^0.107)) - 28353.7}
>
> uniroot(fun, c(0,1))
> # 0.6048184
>
> curve(fun(x), 0, 1)
> abline(v=0.3952365, col="red")
> abline(v=0.6048184, col="red")
> abline(h=0, col="blue")
>
>
>
> fun = function(u){ (0.03399381*log(u) - 0.107*log(1-u)) -
> log(28353.7/26104.50) }
> fun = function(u){ (0.03399381*log(u) - 0.107*log1p(-u)) -
> log(28353.7/26104.50) }
>
> uniroot(fun, c(0,1))
> # 0.6047968
>
> curve(fun(x), 0, 1)
> abline(v=0.3952365, col="red")
> abline(v=0.6047968, col="red")
> abline(h=0, col="blue")
>
> Sincerely,
>
> Leonard
>
>




Re: [R] Query on finding root

2023-08-28 Thread Leonard Mada via R-help

Dear R-Users,

Just out of curiosity:
Which of the 2 methods is the better one?

The results seem to differ slightly.


fun = function(u){((26104.50*u^0.03399381)/((1-u)^0.107)) - 28353.7}

uniroot(fun, c(0,1))
# 0.6048184

curve(fun(x), 0, 1)
abline(v=0.3952365, col="red")
abline(v=0.6048184, col="red")
abline(h=0, col="blue")



fun = function(u){ (0.03399381*log(u) - 0.107*log(1-u)) - 
log(28353.7/26104.50) }
fun = function(u){ (0.03399381*log(u) - 0.107*log1p(-u)) - 
log(28353.7/26104.50) }


uniroot(fun, c(0,1))
# 0.6047968

curve(fun(x), 0, 1)
abline(v=0.3952365, col="red")
abline(v=0.6047968, col="red")
abline(h=0, col="blue")
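
[Editor's note: the ~2e-5 gap between the two roots above is on the order of
uniroot()'s default tolerance (.Machine$double.eps^0.25, about 1.2e-4), so it
need not reflect a real difference between the two formulations. A quick,
illustrative check -- the tightened tol value is an addition, not part of the
original post:]

```r
# Both formulations of the same equation, solved with a much tighter
# tolerance than uniroot()'s default; the roots should then coincide
# to many more digits than the two answers quoted above.
fun1 <- function(u) (26104.50 * u^0.03399381) / ((1 - u)^0.107) - 28353.7
fun2 <- function(u) 0.03399381 * log(u) - 0.107 * log1p(-u) -
  log(28353.7 / 26104.50)
r1 <- uniroot(fun1, c(0.01, 0.99), tol = 1e-12)$root
r2 <- uniroot(fun2, c(0.01, 0.99), tol = 1e-12)$root
c(r1, r2, diff = abs(r1 - r2))   # diff should be far below ~2e-5
```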

Sincerely,

Leonard



Re: [R] Query on finding root

2023-08-27 Thread Ben Bolker
   This doesn't look like homework to me -- too specific.  The posting 
guide (http://www.R-project.org/posting-guide.html) says that the list 
is not intended for "Basic statistics and classroom homework" -- again, 
this doesn't seem to fall into that category.


  tl;dr, I think the difference between the two approaches is just 
whether the lower or upper tail is considered (i.e., adding lower.tail = 
FALSE to pdavies(), or more simply taking (1-x), makes the two answers 
agree).


  If you look at the source code for pdavies() you'll see that it's 
essentially doing the same uniroot() calculation that you are.


## Q(u) = (c*u^lambda1)/((1-u)^lambda2)
mean <- 28353.7        # mean calculated from data
lambda1 <- 0.03399381  # c, lambda1 and lambda2 estimated from data
lambda2 <- 0.107
c <- 26104.50
library(Davies)  # using package
params <- c(c = c, lambda1 = lambda1, lambda2 = lambda2)  # named, so with() below finds them
u <- pdavies(x = mean, params = params, lower.tail = FALSE)
u
fun <- function(u) {
  with(as.list(params), (c*u^lambda1)/((1-u)^lambda2)) - mean
}
curve(fun, from = 0.01, to = 1)
root <- uniroot(fun, c(0.01, 1))  # renamed so uniroot() itself is not masked
abline(h = 0)
root$root



On 2023-08-27 5:40 p.m., Rolf Turner wrote:


On Fri, 25 Aug 2023 22:17:05 +0530
ASHLIN VARKEY  wrote:


Sir,


Please note that r-help is a mailing list, not a knight!


I want to solve the equation Q(u)=mean, where Q(u) represents the
quantile function. Here my Q(u)=(c*u^lamda1)/((1-u)^lamda2), which is
the quantile function of Davies (Power-pareto) distribution.  Hence I
want to solve , *(c*u^lamda1)/((1-u)^lamda2)=28353.7(Eq.1)*
where lamda1=0.03399381, lamda2=0.107 and c=26104.50. When I used
the package 'Davies' and solved Eq 1, I got the answer u=0.3952365.
But when I use the function  'uniroot' to solve the Eq.1, I got a
different answer which is  u=0.6048157.  Why did this difference
happen?  Which is the correct method to solve Eq.1. Using the value
of *u *from the first method my further calculation was nearer to
empirical values.  The R-code I used is herewith. Kindly help me to
solve this issue.

R-code
Q(u)=(c*u^lamda1)/((1-u)^lamda2)
mean=28353.7 # mean calculated from data
lamda1=.03399381 # estimates c, lamda1 and lamda2 calculated from data
lamda2=.107
c=26104.50
library(Davies)# using package
params=c(c,lamda1,lamda2)
u=pdavies(28353.7,params)
u
fun=function(u){((26104.50*u^0.03399381)/((1-u)^0.107))-28353.7}
uniroot= uniroot(fun,c(0.01,1))
uniroot


As Prof. Nash has pointed out, this looks like homework.

Some general advice:  graphics can be very revealing, and are easy to
effect in R.  Relevant method: plot.function(); relevant utility:
abline().  Look at the help for these.

cheers,

Rolf Turner



--
Dr. Benjamin Bolker
Professor, Mathematics & Statistics and Biology, McMaster University
Director, School of Computational Science and Engineering
(Acting) Graduate chair, Mathematics & Statistics
E-mail is sent at my convenience; I don't expect replies outside of 
working hours.




Re: [R] Query on finding root

2023-08-27 Thread Rolf Turner


On Fri, 25 Aug 2023 22:17:05 +0530
ASHLIN VARKEY  wrote:

> Sir,

Please note that r-help is a mailing list, not a knight!

> I want to solve the equation Q(u)=mean, where Q(u) represents the
> quantile function. Here my Q(u)=(c*u^lamda1)/((1-u)^lamda2), which is
> the quantile function of Davies (Power-pareto) distribution.  Hence I
> want to solve , *(c*u^lamda1)/((1-u)^lamda2)=28353.7(Eq.1)*
> where lamda1=0.03399381, lamda2=0.107 and c=26104.50. When I used
> the package 'Davies' and solved Eq 1, I got the answer u=0.3952365.
> But when I use the function  'uniroot' to solve the Eq.1, I got a
> different answer which is  u=0.6048157.  Why did this difference
> happen?  Which is the correct method to solve Eq.1. Using the value
> of *u *from the first method my further calculation was nearer to
> empirical values.  The R-code I used is herewith. Kindly help me to
> solve this issue.
> 
> R-code
> Q(u)=(c*u^lamda1)/((1-u)^lamda2)
> mean=28353.7 # mean calculated from data
> lamda1=.03399381 # estimates c, lamda1 and lamda2 calculated from data
> lamda2=.107
> c=26104.50
> library(Davies)# using package
> params=c(c,lamda1,lamda2)
> u=pdavies(28353.7,params)
> u
> fun=function(u){((26104.50*u^0.03399381)/((1-u)^0.107))-28353.7}
> uniroot= uniroot(fun,c(0.01,1))
> uniroot

As Prof. Nash has pointed out, this looks like homework.

Some general advice:  graphics can be very revealing, and are easy to
effect in R.  Relevant method: plot.function(); relevant utility:
abline().  Look at the help for these.
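
[Editor's note: a minimal illustration of this advice, using the function from
the quoted code above:]

```r
# plot() with a function as its first argument dispatches to
# plot.function(); abline() marks the zero line the root must cross.
fun <- function(u) (26104.50 * u^0.03399381) / ((1 - u)^0.107) - 28353.7
plot(fun, from = 0.01, to = 0.99)
abline(h = 0, lty = 2)   # the root is where the curve crosses this line
```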

cheers,

Rolf Turner

-- 
Honorary Research Fellow
Department of Statistics
University of Auckland
Stats. Dep't. (secretaries) phone:
 +64-9-373-7599 ext. 89622
Home phone: +64-9-480-4619



Re: [R] Query on finding root

2023-08-26 Thread J C Nash

Homework?

On 2023-08-25 12:47, ASHLIN VARKEY wrote:

Sir,
I want to solve the equation Q(u)=mean, where Q(u) represents the quantile
function. Here my Q(u)=(c*u^lamda1)/((1-u)^lamda2), which is the quantile
function of Davies (Power-pareto) distribution.  Hence I want to solve ,
*(c*u^lamda1)/((1-u)^lamda2)=28353.7(Eq.1)*
where lamda1=0.03399381, lamda2=0.107 and c=26104.50. When I used the
package 'Davies' and solved Eq 1, I got the answer u=0.3952365. But when I
use the function  'uniroot' to solve the Eq.1, I got a different answer
which is  u=0.6048157.  Why did this difference happen?  Which is the
correct method to solve Eq.1. Using the value of *u *from the first method
my further calculation was nearer to empirical values.  The R-code I used
is herewith. Kindly help me to solve this issue.

R-code
Q(u)=(c*u^lamda1)/((1-u)^lamda2)
mean=28353.7 # mean calculated from data
lamda1=.03399381 # estimates c, lamda1 and lamda2 calculated from data
lamda2=.107
c=26104.50
library(Davies)# using package
params=c(c,lamda1,lamda2)
u=pdavies(28353.7,params)
u
fun=function(u){((26104.50*u^0.03399381)/((1-u)^0.107))-28353.7}
uniroot= uniroot(fun,c(0.01,1))
uniroot



[R] Query on finding root

2023-08-26 Thread ASHLIN VARKEY
Sir,
I want to solve the equation Q(u) = mean, where Q(u) represents the quantile
function. Here my Q(u) = (c*u^lamda1)/((1-u)^lamda2), which is the quantile
function of the Davies (power-Pareto) distribution. Hence I want to solve

(c*u^lamda1)/((1-u)^lamda2) = 28353.7    (Eq. 1)

where lamda1 = 0.03399381, lamda2 = 0.107 and c = 26104.50. When I used the
package 'Davies' and solved Eq. 1, I got the answer u = 0.3952365. But when I
used the function 'uniroot' to solve Eq. 1, I got a different answer,
u = 0.6048157. Why did this difference happen? Which is the correct method to
solve Eq. 1? Using the value of u from the first method, my further
calculations were nearer to the empirical values. The R code I used is below.
Kindly help me to solve this issue.

R code
# Q(u) = (c*u^lamda1)/((1-u)^lamda2)
mean=28353.7 # mean calculated from data
lamda1=.03399381 # estimates c, lamda1 and lamda2 calculated from data
lamda2=.107
c=26104.50
library(Davies) # using package
params=c(c,lamda1,lamda2)
u=pdavies(28353.7,params)
u
fun=function(u){((26104.50*u^0.03399381)/((1-u)^0.107))-28353.7}
res = uniroot(fun,c(0.01,1)) # renamed: assigning to 'uniroot' masks the function
res
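
[Editor's note: a hedged reconciliation of the two answers. It assumes, per
Ben Bolker's reply elsewhere in this digest, that the discrepancy is only a
lower- vs upper-tail convention in pdavies(); `params` and `fun` are as
defined in the code above.]

```r
# If pdavies() returns the lower tail while uniroot() finds the
# upper-tail solution of Q(u) = mean, the two answers should be
# complements of one another (to within solver tolerance).
u_p <- pdavies(28353.7, params)          # reported as ~0.395
u_r <- uniroot(fun, c(0.01, 0.99))$root  # reported as ~0.605
c(1 - u_p, u_r)                          # these should be close
```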



Re: [R] Query about code

2023-04-22 Thread Jim Lemon
Hi Yeswanth,
As it says at the bottom of the first page, the corresponding author is:

yangj...@ms.xjb.ac.cn

Try that email address.

Jim

On Sat, Apr 22, 2023 at 2:59 PM ADIGARLA YESWANTH NAIDU
<102213...@smail.iitpkd.ac.in> wrote:
>
> Thanks for your reply sir ,
> Here is the reference; I want to follow the procedure which they have 
> mentioned in the method.
>
> Best Regards,
> Yeswanth,
>
> On Sat, Apr 22, 2023 at 4:08 AM Bert Gunter  wrote:
>>
>> "Perhaps you could supply a reference to
>> the work you are using?"
>>
>> ... in which case they should simply email the author directly, no?
>>
>> -- Bert
>>
>> On Fri, Apr 21, 2023 at 3:22 PM Jim Lemon  wrote:
>> >
>> > Hi Yeswanth,
>> > You seem to be referring to a specific publication by a specific
>> > author. Unless someone in R-help knows who and what you are referring
>> > to, it seems very difficult. Perhaps you could supply a reference to
>> > the work you are using?
>> >
>> > Jim
>> >
>> > On Sat, Apr 22, 2023 at 7:03 AM ADIGARLA YESWANTH NAIDU
>> > <102213...@smail.iitpkd.ac.in> wrote:
>> > >
>> > > I have been trying to write the code for the CCM analysis that you used 
>> > > in
>> > > your study, but unfortunately, I haven't been able to write it. I was
>> > > wondering if you would be willing to share the code with me. I understand
>> > > that the code may be your intellectual property, but I assure you that I
>> > > will use it solely for academic and non-commercial purposes.
>> > >
>> > > If you are able to share the code, I would greatly appreciate it. The 
>> > > code
>> > > will be a valuable resource for me to understand the implementation 
>> > > details
>> > > and reproduce the results of your study. Alternatively, if you are unable
>> > > to share the code, I would be grateful for any guidance or suggestions 
>> > > you
>> > > can provide in implementing the CCM analysis using the rEDM and
>> > > multispatialCCM packages.
>> > >
>> > > Thank you very much for your consideration. I look forward to your 
>> > > positive
>> > > response.
>> > >
>> > > Best regards,
>> > > Yeswanth.
>> > >
>> >



Re: [R] Query about code

2023-04-21 Thread Bert Gunter
"Perhaps you could supply a reference to
the work you are using?"

... in which case they should simply email the author directly, no?

-- Bert

On Fri, Apr 21, 2023 at 3:22 PM Jim Lemon  wrote:
>
> Hi Yeswanth,
> You seem to be referring to a specific publication by a specific
> author. Unless someone in R-help knows who and what you are referring
> to, it seems very difficult. Perhaps you could supply a reference to
> the work you are using?
>
> Jim
>
> On Sat, Apr 22, 2023 at 7:03 AM ADIGARLA YESWANTH NAIDU
> <102213...@smail.iitpkd.ac.in> wrote:
> >
> > I have been trying to write the code for the CCM analysis that you used in
> > your study, but unfortunately, I haven't been able to write it. I was
> > wondering if you would be willing to share the code with me. I understand
> > that the code may be your intellectual property, but I assure you that I
> > will use it solely for academic and non-commercial purposes.
> >
> > If you are able to share the code, I would greatly appreciate it. The code
> > will be a valuable resource for me to understand the implementation details
> > and reproduce the results of your study. Alternatively, if you are unable
> > to share the code, I would be grateful for any guidance or suggestions you
> > can provide in implementing the CCM analysis using the rEDM and
> > multispatialCCM packages.
> >
> > Thank you very much for your consideration. I look forward to your positive
> > response.
> >
> > Best regards,
> > Yeswanth.
> >
>



Re: [R] Query about code

2023-04-21 Thread Jim Lemon
Hi Yeswanth,
You seem to be referring to a specific publication by a specific
author. Unless someone in R-help knows who and what you are referring
to, it seems very difficult. Perhaps you could supply a reference to
the work you are using?

Jim

On Sat, Apr 22, 2023 at 7:03 AM ADIGARLA YESWANTH NAIDU
<102213...@smail.iitpkd.ac.in> wrote:
>
> I have been trying to write the code for the CCM analysis that you used in
> your study, but unfortunately, I haven't been able to write it. I was
> wondering if you would be willing to share the code with me. I understand
> that the code may be your intellectual property, but I assure you that I
> will use it solely for academic and non-commercial purposes.
>
> If you are able to share the code, I would greatly appreciate it. The code
> will be a valuable resource for me to understand the implementation details
> and reproduce the results of your study. Alternatively, if you are unable
> to share the code, I would be grateful for any guidance or suggestions you
> can provide in implementing the CCM analysis using the rEDM and
> multispatialCCM packages.
>
> Thank you very much for your consideration. I look forward to your positive
> response.
>
> Best regards,
> Yeswanth.
>



[R] Query about code

2023-04-21 Thread ADIGARLA YESWANTH NAIDU
I have been trying to write the code for the CCM analysis that you used in
your study, but unfortunately, I haven't been able to write it. I was
wondering if you would be willing to share the code with me. I understand
that the code may be your intellectual property, but I assure you that I
will use it solely for academic and non-commercial purposes.

If you are able to share the code, I would greatly appreciate it. The code
will be a valuable resource for me to understand the implementation details
and reproduce the results of your study. Alternatively, if you are unable
to share the code, I would be grateful for any guidance or suggestions you
can provide in implementing the CCM analysis using the rEDM and
multispatialCCM packages.

Thank you very much for your consideration. I look forward to your positive
response.

Best regards,
Yeswanth.



Re: [R] query in loops

2022-12-05 Thread jim holtman
So what is the problem that you would like help in correcting?  The program
seems to run.

Thanks

Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.


On Mon, Dec 5, 2022 at 12:59 PM ASHLIN VARKEY 
wrote:

> Sir,
> I want to write a loop in R to find the AIC factor. For its calculation, I
> need to run an algorithm in the attached file. Here  'x' represents the
> dataset and xi denotes the i-th observation after arranging it in ascending
> order. Q(u) and q(u) represent the quantile function and quantile density
> function respectively. For my distribution Q(u) and q(u) are given below.
> Q(u)=-α log⁡(1-u)+(b-α)u+((r-b))/2 u^2
> q(u)=b+u(r-b+α/(1-u)).
> Can you please help me to correct this program based on the algorithm?
> *R code*
> x=c(0.047, 0.296, 0.540, 1.271, 0.115, 0.334, 0.570, 1.326, 0.121, 0.395,
> 0.641, 1.447, 0.132, 0.458, 0.644, 1.485, 0.164, 0.466, 0.696, 1.553,
> 0.197, 0.501, 0.841,1.581,
> 0.203,0.507, 0.863, 1.589, 3.743, 0.260, 0.529, 1.099, 2.178, 0.282, 0.534,
> 1.219, 2.343, 2.416, 2.444, 2.825, 2.830, 3.578, 3.658, 3.978, 4.033)
> xi=sort(x)
> xi
> n=45
> alpha=-1.014
> b=.949
> r=3.11
> u=c()
> D=c()
> q=c()
> Q=c()
> for (i in 1:n) {
> u[i]=i/(n+1)
> Q[i]=-alpha*log(1-u[i])+(b-alpha)*u[i]+((r-b)/2)*(u[i]^2)
> q[i]=b+u[i]*(r-b+(alpha/(1-u[i])))
> D[i]=Q[i]-xi[i]
> if (D[i]<(10^-7)) {
>   print (q[i])
> }
> else{
>   u[i+1]=u[i]+((xi[i]-Q[i])/q[i])
> }
>   }




[R] query in loops

2022-12-05 Thread ASHLIN VARKEY
Sir,
I want to write a loop in R to find the AIC factor. For its calculation, I
need to run an algorithm in the attached file. Here  'x' represents the
dataset and xi denotes the i-th observation after arranging it in ascending
order. Q(u) and q(u) represent the quantile function and quantile density
function respectively. For my distribution Q(u) and q(u) are given below.
Q(u) = -α log(1-u) + (b-α)u + ((r-b)/2) u^2
q(u) = b + u(r-b + α/(1-u)).
Can you please help me to correct this program based on the algorithm?
*R code*
x=c(0.047, 0.296, 0.540, 1.271, 0.115, 0.334, 0.570, 1.326, 0.121, 0.395,
0.641, 1.447, 0.132, 0.458, 0.644, 1.485, 0.164, 0.466, 0.696, 1.553,
0.197, 0.501, 0.841,1.581,
0.203,0.507, 0.863, 1.589, 3.743, 0.260, 0.529, 1.099, 2.178, 0.282, 0.534,
1.219, 2.343, 2.416, 2.444, 2.825, 2.830, 3.578, 3.658, 3.978, 4.033)
xi=sort(x)
xi
n=45
alpha=-1.014
b=.949
r=3.11
u=c()
D=c()
q=c()
Q=c()
for (i in 1:n) {
u[i]=i/(n+1)
Q[i]=-alpha*log(1-u[i])+(b-alpha)*u[i]+((r-b)/2)*(u[i]^2)
q[i]=b+u[i]*(r-b+(alpha/(1-u[i])))
D[i]=Q[i]-xi[i]
if (D[i]<(10^-7)) {
  print (q[i])
}
else{
  u[i+1]=u[i]+((xi[i]-Q[i])/q[i])
}
  }
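
[Editor's note: in the loop above, the line u[i]=i/(n+1) overwrites the
Newton-type update u[i+1]=... computed on the previous pass, so the iteration
never actually iterates. A sketch of the algorithm as described -- tol and
max_iter are illustrative additions, and alpha, b, r, n and xi are as defined
above:]

```r
# Sketch only: for each ordered observation, start from u = i/(n+1),
# then repeat the Newton-type update u <- u + (x_i - Q(u))/q(u) until
# |Q(u) - x_i| < tol, recording q(u) at convergence.
Qfun <- function(u) -alpha*log(1 - u) + (b - alpha)*u + ((r - b)/2)*u^2
qfun <- function(u) b + u*(r - b + alpha/(1 - u))
tol <- 1e-7; max_iter <- 100
qhat <- numeric(n)
for (i in 1:n) {
  u <- i/(n + 1)                         # starting value, set once per i
  for (iter in 1:max_iter) {
    if (abs(Qfun(u) - xi[i]) < tol) break
    u <- u + (xi[i] - Qfun(u))/qfun(u)   # Newton-type update
  }
  qhat[i] <- qfun(u)                     # quantile density at convergence
}
qhat
```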


Re: [R] Query regarding R 'irr' package 'N.cohen.kappa'

2022-06-16 Thread Jim Lemon
Hi Kalaivani,
The N.cohen.kappa function was written by Matthias Gamer, the
maintainer of the irr package. Both that function and N2.cohen.kappa
(written by Puspendra Singh) involve corrections that are described in
the references on the respective help pages. It is likely that there
will be small differences in the estimates for large N in the
different methods of calculation. I cannot advise which would best
suit your purpose as I only did testing and refining code in the
N2.cohen.kappa function. Perhaps corresponding directly with Matthias
Gamer would be your best option.

Jim

On Fri, Jun 17, 2022 at 2:20 AM Kalaivani Mani  wrote:
>
> Dear R-help Team,
>
> I am from India and have a query on 'N.cohen.kappa' Sample size
> calculations for Cohen's Kappa Statistic. I have calculated manually the
> sample size using the formula mentioned in "Cantor, A. B. (1996)
> Sample-size calculation for Cohen’s kappa. Psychological Methods, 1, 150-
> 153". Later came to know that it can be done using the R 'irr' package. I
> got a different number.
>
> Let us consider the following two situations:
>
> For situation 1, the sample size is 1370 using R
> Testing H0: kappa = 0.81 vs. HA: kappa> 0.95 given that kappa = 0.95 and
> both raters classify 1.5% of subjects as positive.
> R command used is: N.cohen.kappa(0.015, 0.015, 0.95, 0.81, alpha=0.05,
> power=0.8, twosided=FALSE ).
>
> But for the same situation, the sample size is much higher by manual
> calculation, which is 8580.
>
> For situation 2, the sample size is 74 by using R and is matching with the
> manual calculation too.
> Testing H0: kappa = 0.81 vs. HA: kappa> 0.95 given that kappa = 0.95 and
> rater1 classify 40% of subjects and rate2 classify 50% of subjects as
> positive.
> R command used is: N.cohen.kappa(0.40, 0.50, 0.95, 0.81, alpha=0.05,
> power=0.8, twosided=FALSE ).
>
> I am attaching both the 'Excel sheet formula-kappa sample size situation1 &
> 2').
>
> Why is this so? Please help me to sort this out.
>
> Looking forward to hearing from you.
>
> Best,
> Kalaivani
>
> --
> *Dr. Kalaivani Mani, *
>
> *M.Sc., Biostatistics (CMC, Vellore), Ph.D. (AIIMS, Delhi)*
>
> *Scientist-IV*
>
> *Dept. of Biostatistics*
>
> *All India Institute of Medical Sciences*
> *New Delhi-110029, India.*
> *Mobile:91-9717319082*



[R] Query regarding R 'irr' package 'N.cohen.kappa'

2022-06-16 Thread Kalaivani Mani
Dear R-help Team,

I am from India and have a query on 'N.cohen.kappa' Sample size
calculations for Cohen's Kappa Statistic. I have calculated manually the
sample size using the formula mentioned in "Cantor, A. B. (1996)
Sample-size calculation for Cohen’s kappa. Psychological Methods, 1, 150-
153". Later came to know that it can be done using the R 'irr' package. I
got a different number.

Let us consider the following two situations:

For situation 1, the sample size is 1370 using R
Testing H0: kappa = 0.81 vs. HA: kappa> 0.95 given that kappa = 0.95 and
both raters classify 1.5% of subjects as positive.
R command used is: N.cohen.kappa(0.015, 0.015, 0.95, 0.81, alpha=0.05,
power=0.8, twosided=FALSE ).

But for the same situation, the sample size is much higher by manual
calculation, which is 8580.

For situation 2, the sample size is 74 by using R and is matching with the
manual calculation too.
Testing H0: kappa = 0.81 vs. HA: kappa> 0.95 given that kappa = 0.95 and
rater1 classify 40% of subjects and rate2 classify 50% of subjects as
positive.
R command used is: N.cohen.kappa(0.40, 0.50, 0.95, 0.81, alpha=0.05,
power=0.8, twosided=FALSE ).

I am attaching both the 'Excel sheet formula-kappa sample size situation1 &
2').

Why is this so? Please help me to sort this out.

Looking forward to hearing from you.

Best,
Kalaivani

-- 
Dr. Kalaivani Mani,
M.Sc., Biostatistics (CMC, Vellore), Ph.D. (AIIMS, Delhi)
Scientist-IV
Dept. of Biostatistics
All India Institute of Medical Sciences
New Delhi-110029, India.
Mobile: 91-9717319082


Re: [R] Query regarding stats/p.adjust package (base) - specifically 'Hochberg' function

2021-08-24 Thread Jim Lemon
This is beginning to sound like a stats taliban fatwa. I don't care if
you're using an abacus, you want to get the correct result. My guess
is that the different instantiations of the Hochberg adjustment are
using different algorithms to calculate the result. The Hochberg
adjustment is known to be sensitive to the distributions of the test
statistics. People who are more expert than I in this area have
different ideas about how to handle this problem. This probably
contributes to the hopefully small differences in the eventual
corrected p-values.
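
[Editor's note: for reference, base R computes the Hochberg step-up roughly
as follows (paraphrased from the stats package's p.adjust() source -- worth
checking line by line against whatever other implementation is being
compared):]

```r
# Hochberg step-up adjustment, paraphrased from base R's p.adjust()
# source; verify against the stats package shipped with your R version.
hochberg_adjust <- function(p) {
  n <- length(p)
  i <- n:1                          # multipliers n-i+1 run 1, 2, ..., n
  o <- order(p, decreasing = TRUE)  # largest p first
  ro <- order(o)                    # permutation to restore input order
  pmin(1, cummin((n - i + 1L) * p[o]))[ro]
}
p <- c(0.010, 0.020, 0.030, 0.040)
cbind(mine = hochberg_adjust(p), base = p.adjust(p, method = "hochberg"))
```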

Jim

On Wed, Aug 25, 2021 at 8:02 AM Rolf Turner  wrote:
>
>
> On Tue, 24 Aug 2021 14:44:55 +
> David Swanepoel  wrote:
>
> > Dear R Core Dev Team, I hope all is well your side!
> > My apologies if this is not the correct point of contact to use to
> > address this. If not, kindly advise or forward my request to the
> > relevant team/persons.
> >
> > I have a query regarding the 'Hochberg' method of the stats/p.adjust
> > R package and hope you can assist me please. I have attached the data
> > I used in Excel,
>
> 
>
> In addition to the good advice given to you earlier by Bert Gunter, you
> should consider the following advice:
>
> Don't use Excel!!!
>
> This is a corollary of a more general theorem:   Don't use Micro$oft!!!
>
> cheers,
>
> Rolf Turner
>
> --
> Honorary Research Fellow
> Department of Statistics
> University of Auckland
> Phone: +64-9-373-7599 ext. 88276
>



Re: [R] Query regarding stats/p.adjust package (base) - specifically 'Hochberg' function

2021-08-24 Thread Rolf Turner


On Tue, 24 Aug 2021 14:44:55 +
David Swanepoel  wrote:

> Dear R Core Dev Team, I hope all is well your side!
> My apologies if this is not the correct point of contact to use to
> address this. If not, kindly advise or forward my request to the
> relevant team/persons.
> 
> I have a query regarding the 'Hochberg' method of the stats/p.adjust
> R package and hope you can assist me please. I have attached the data
> I used in Excel,



In addition to the good advice given to you earlier by Bert Gunter, you
should consider the following advice:

Don't use Excel!!!

This is a corollary of a more general theorem:   Don't use Micro$oft!!!

cheers,

Rolf Turner

-- 
Honorary Research Fellow
Department of Statistics
University of Auckland
Phone: +64-9-373-7599 ext. 88276
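
One way to see exactly what stats::p.adjust() computes for the Hochberg
method discussed in this thread is to reproduce its step-up rule by hand.
A minimal sketch, using made-up p-values rather than the poster's data
(`manual` mirrors the formula in p.adjust's source):

```r
## Hochberg step-up rule, reproduced by hand and checked against
## stats::p.adjust(). The p-values here are invented for illustration.
p <- c(0.001, 0.02, 0.03, 0.04, 0.30)
m <- length(p)
o <- order(p, decreasing = TRUE)                 # largest p-value first
manual <- pmin(1, cummin(seq_len(m) * p[o]))[order(o)]
stopifnot(all.equal(manual, p.adjust(p, method = "hochberg")))
manual
## [1] 0.005 0.080 0.080 0.080 0.300
```

Note that the smallest p-value is multiplied by m here (0.001 * 5 = 0.005);
it is not left unadjusted, which bears on the MultipleTesting.com
interpretation discussed elsewhere in this thread.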



Re: [R] Query regarding stats/p.adjust package (base) - specifically 'Hochberg' function

2021-08-24 Thread David Swanepoel
Dear Bert,

Thanks for your prompt response. I used R-help as it was listed as the contact 
point in the Description file of the stats package.

I'll attempt to find more information from stats.stackexchange.com as suggested 
and also see if I can figure out the sci.stat.consult and sci.stat.math Usenet 
groups that are mentioned in the guidance document too. Much appreciated.

Kind regards,
David

-Original Message-
From: Bert Gunter  
Sent: 24 August 2021 19:51
To: David Swanepoel 
Cc: r-help@r-project.org; luke-tier...@uiowa.edu; kurt.hor...@wu.ac.at; 
t.kalib...@kent.ac.uk
Subject: Re: [R] Query regarding stats/p.adjust package (base) - specifically 
'Hochberg' function

1. No Excel attachments made it through. Binary attachments are generally 
stripped by the list server for security reasons.

2. As you may have already learned, this is the wrong forum for statistics or 
package specific questions. Read *and follow* the posting guide linked below to 
post on r-help appropriately. In particular, for questions about specific 
non-standard packages, contact package maintainers (found through e.g. 
?maintainer)

3. Statistics issues generally don't belong here. Try stats.stackexchange.com 
instead perhaps.

4. We are not *R Core development,*  and you probably should not be contacting 
them either.  See here for general guidelines for R lists:
https://www.r-project.org/mail.html


Bert Gunter

"The trouble with having an open mind is that people keep coming along and 
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )

On Tue, Aug 24, 2021 at 10:39 AM David Swanepoel  
wrote:
>
> Dear R Core Dev Team, I hope all is well your side!
> My apologies if this is not the correct point of contact to use to address 
> this. If not, kindly advise or forward my request to the relevant 
> team/persons.
>
> I have a query regarding the 'Hochberg' method of the stats/p.adjust R 
> package and hope you can assist me please. I have attached the data I used in 
> Excel, which are lists of p-values for two different tests (Hardy Weinberg 
> Equilibrium and Linkage Disequilibrium) for four population groups.
>
> The basis of my concern is a discrepancy specifically between the Hochberg 
> correction applied by four different R packages and the results of the 
> Hochberg correction by the online tool, 
> MultipleTesting.com<http://www.multipletesting.com/>.
>
> Using the below R packages/functions, I ran multiple test correction (MTC) 
> adjustments for the p-values listed in my dataset. All R packages below 
> agreed with each other regarding the 'significance' of the p-values for the 
> Hochberg adjustment.
>
>
>   *   stats/p.adjust (method: Hochberg)
>   *   mutoss/hochberg
>   *   multtest/mt.rawp2adjp (procedure: Hochberg)
>   *   elitism/mtp (method: Hochberg)
>
> In checking the same values on the MultipleTesting.com, more p-values were 
> flagged as significant for both the HWE and LD results across all four 
> populations. I show these differences in the Excel sheet attached.
> Essentially, using the R packages, only the first HWE p-value of Pop2 is 
> significant at an alpha of 0.05. Using the MT.com tool, however, multiple 
> p-values are shown to be significant across both tests with the Hochberg 
> correction (the highlighted cells in the Excel sheet).
>
>
> I asked the authors of MT.com about this, and they gave the following 
> response:
>
> "we have checked the issue, and we believe the computation by our page is 
> correct (I cannot give opinion about the other packages).
> When we look on the original Hochberg paper, and we only use the very first 
> (smallest) p value, then m"=1, thus, according to the equation in the 
> Hochberg 1988 paper, in this case practically there is no further correction 
> necessary.
> In other words, in case the *smallest* p value is smaller than alpha, then 
> the *smallest* p value will remain significant irrespective of the other p 
> values when we make the Hochberg correction."
>
> I have attached the Hochberg paper here but, unfortunately, I don't 
> understand enough of 

Re: [R] Query regarding stats/p.adjust package (base) - specifically 'Hochberg' function

2021-08-24 Thread Martin Maechler
> Bert Gunter 
> on Tue, 24 Aug 2021 10:50:50 -0700 writes:

> 1. No Excel attachments made it through. Binary
> attachments are generally stripped by the list server for
> security reasons.

> 2. As you may have already learned, this is the wrong
> forum for statistics or package specific questions. Read
> *and follow* the posting guide linked below to post on
> r-help appropriately. In particular, for questions about
> specific non-standard packages, contact package
> maintainers (found through e.g. ?maintainer)

> 3. Statistics issues generally don't belong here. Try
> stats.stackexchange.com instead perhaps.

> 4. We are not *R Core development,* and you probably
> should not be contacting them either.  See here for
> general guidelines for R lists:
> https://www.r-project.org/mail.html


> Bert Gunter

> "The trouble with having an open mind is that people keep
> coming along and sticking things into it."  -- Opus (aka
> Berkeley Breathed in his "Bloom County" comic strip )

Well, this was a bit harsh of an answer, Bert.

p.adjust() is a standard R function (package 'stats') -- as
David Swanepoel even mentioned.

I think he's okay asking here whether the algorithms used in such a
standard R function are "ok" and how/why they seemingly
differ from other implementations ...

Martin

> On Tue, Aug 24, 2021 at 10:39 AM David Swanepoel
>  wrote:
>> 
>> Dear R Core Dev Team, I hope all is well your side!  My
>> apologies if this is not the correct point of contact to
>> use to address this. If not, kindly advise or forward my
>> request to the relevant team/persons.
>> 
>> I have a query regarding the 'Hochberg' method of the
>> stats/p.adjust R package and hope you can assist me
>> please. I have attached the data I used in Excel, which
>> are lists of p-values for two different tests (Hardy
>> Weinberg Equilibrium and Linkage Disequilibrium) for four
>> population groups.
>> 
>> The basis of my concern is a discrepancy specifically
>> between the Hochberg correction applied by four different
>> R packages and the results of the Hochberg correction by
>> the online tool,
>> MultipleTesting.com.
>> 
>> Using the below R packages/functions, I ran multiple test
>> correction (MTC) adjustments for the p-values listed in
>> my dataset. All R packages below agreed with each other
>> regarding the 'significance' of the p-values for the
>> Hochberg adjustment.
>> 
>> 
>> * stats/p.adjust (method: Hochberg) * mutoss/hochberg *
>> multtest/mt.rawp2adjp (procedure: Hochberg) * elitism/mtp
>> (method: Hochberg)
>> 
>> In checking the same values on the MultipleTesting.com,
>> more p-values were flagged as significant for both the
>> HWE and LD results across all four populations. I show
>> these differences in the Excel sheet attached.
>> Essentially, using the R packages, only the first HWE
>> p-value of Pop2 is significant at an alpha of 0.05. Using
>> the MT.com tool, however, multiple p-values are shown to
>> be significant across both tests with the Hochberg
>> correction (the highlighted cells in the Excel sheet).
>> 
>> 
>> I asked the authors of MT.com about this, and they gave
>> the following response:
>> 
>> "we have checked the issue, and we believe the
>> computation by our page is correct (I cannot give opinion
>> about the other packages).  When we look on the original
>> Hochberg paper, and we only use the very first (smallest)
>> p value, then m"=1, thus, according to the equation in
>> the Hochberg 1988 paper, in this case practically there
>> is no further correction necessary.  In other words, in
>> case the *smallest* p value is smaller than alpha, then
>> the *smallest* p value will remain significant
>> irrespective of the other p values when we make the
>> Hochberg correction."
>> 
>> I have attached the Hochberg paper here but,
>> unfortunately, I don't understand enough of the stats to
>> verify this. I have applied their logic on the same Excel
>> sheet under the section "MT.com explanation", which shows
>> why they consider the highlighted values significant.
>> 
>> I have also attached the 2 R files that I used to do the
>> MTC runs and they can be run as is. They are just quite
>> long as they contain many of the other MTC methods in the
>> different packages too.
>> 
>> Kindly provide your thoughts as to whether you agree with
>> this interpretation of the Hochberg paper or not? I would
>> like to see concordance between the MT.com tool and the
>> different R packages above (or understand why they are
>> different), so that I can be more confident in the
>> explanations 
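
The practical difference between the two readings can be made concrete with
three made-up p-values (a sketch, not the poster's data): the raw smallest
value is below 0.05, but its Hochberg-adjusted value is not.

```r
## With Hochberg's step-up rule the adjusted smallest p-value is
## min_j (m - j + 1) * p_(j), not the raw p_(1), so the two
## interpretations in this thread can disagree. Invented p-values:
p <- c(0.04, 0.50, 0.60)
p.adjust(p, method = "hochberg")
## [1] 0.12 0.60 0.60   -- 0.04 is no longer significant at 0.05
min((3:1) * sort(p))    # the same 0.12, computed directly from the formula
```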

Re: [R] Query regarding stats/p.adjust package (base) - specifically 'Hochberg' function

2021-08-24 Thread Bert Gunter
1. No Excel attachments made it through. Binary attachments are
generally stripped by the list server for security reasons.

2. As you may have already learned, this is the wrong forum for
statistics or package specific questions. Read *and follow* the
posting guide linked below to post on r-help appropriately. In
particular, for questions about specific non-standard packages,
contact package maintainers (found through e.g. ?maintainer)

3. Statistics issues generally don't belong here. Try
stats.stackexchange.com instead perhaps.

4. We are not *R Core development,*  and you probably should not be
contacting them either.  See here for general guidelines for R lists:
https://www.r-project.org/mail.html


Bert Gunter

"The trouble with having an open mind is that people keep coming along
and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )

On Tue, Aug 24, 2021 at 10:39 AM David Swanepoel
 wrote:
>
> Dear R Core Dev Team, I hope all is well your side!
> My apologies if this is not the correct point of contact to use to address 
> this. If not, kindly advise or forward my request to the relevant 
> team/persons.
>
> I have a query regarding the 'Hochberg' method of the stats/p.adjust R 
> package and hope you can assist me please. I have attached the data I used in 
> Excel, which are lists of p-values for two different tests (Hardy Weinberg 
> Equilibrium and Linkage Disequilibrium) for four population groups.
>
> The basis of my concern is a discrepancy specifically between the Hochberg 
> correction applied by four different R packages and the results of the 
> Hochberg correction by the online tool, 
> MultipleTesting.com.
>
> Using the below R packages/functions, I ran multiple test correction (MTC) 
> adjustments for the p-values listed in my dataset. All R packages below 
> agreed with each other regarding the 'significance' of the p-values for the 
> Hochberg adjustment.
>
>
>   *   stats/p.adjust (method: Hochberg)
>   *   mutoss/hochberg
>   *   multtest/mt.rawp2adjp (procedure: Hochberg)
>   *   elitism/mtp (method: Hochberg)
>
> In checking the same values on the MultipleTesting.com, more p-values were 
> flagged as significant for both the HWE and LD results across all four 
> populations. I show these differences in the Excel sheet attached.
> Essentially, using the R packages, only the first HWE p-value of Pop2 is 
> significant at an alpha of 0.05. Using the MT.com tool, however, multiple 
> p-values are shown to be significant across both tests with the Hochberg 
> correction (the highlighted cells in the Excel sheet).
>
>
> I asked the authors of MT.com about this, and they gave the following 
> response:
>
> "we have checked the issue, and we believe the computation by our page is 
> correct (I cannot give opinion about the other packages).
> When we look on the original Hochberg paper, and we only use the very first 
> (smallest) p value, then m"=1, thus, according to the equation in the 
> Hochberg 1988 paper, in this case practically there is no further correction 
> necessary.
> In other words, in case the *smallest* p value is smaller than alpha, then 
> the *smallest* p value will remain significant irrespective of the other p 
> values when we make the Hochberg correction."
>
> I have attached the Hochberg paper here but, unfortunately, I don't 
> understand enough of the stats to verify this. I have applied their logic on 
> the same Excel sheet under the section "MT.com explanation", which shows why 
> they consider the highlighted values significant.
>
> I have also attached the 2 R files that I used to do the MTC runs and they 
> can be run as is. They are just quite long as they contain many of the other 
> MTC methods in the different packages too.
>
> Kindly provide your thoughts as to whether you agree with this interpretation 
> of the Hochberg paper or not? I would like to see concordance between the 
> MT.com tool and the different R packages above (or understand why they are 
> different), so that I can be more confident in the explanations of my own 
> results as a stats layman.
>
> I hope this makes sense. Please let me know if I need to clarify anything.
>
>
> Many thanks and kind regards,
> David



Re: [R] Query

2021-04-26 Thread Rui Barradas

Hello,

What you are trying to do is not possible without editing the plot 
method for objects of class "varirf". See its code with



getAnywhere("plot.varirf")


Argument axes = FALSE is set in the code followed by explicit calls to 
axis(1, etc) and axis(2, etc).


Contact the

maintainer("vars")

?


Hope this helps,

Rui Barradas


At 18:31 on 25/04/21, Sun Yong Kim wrote:

vars package

From: John Kane 
Sent: Sunday, April 25, 2021 12:30 PM
To: Sun Yong Kim 
Cc: r-help@r-project.org 
Subject: Re: [R] Query

What package has the irf function?

On Fri, 23 Apr 2021 at 09:34, Sun Yong Kim
 wrote:


Hi

I have been trying to circulate a question but my question keeps getting 
rejected. Let me try again via email. The message is below:

I am trying to change the tick size for IRF graphs that I created using the 
following code below:

par(mfrow=c(2,3))
IRF_F2 <- irf(SVARmod1, impulse="deltaw2",
   response=c("Belgium","France","Germany","Italy","Netherlands"),
n.ahead=60)
plot(IRF_F2,ylim=c(-0.5,0.3), plot.type="single", main="US Wealth Share
Shock")

The code gives me the IRFs attached to this email. However, the tick 
intervals are way too small, and I want to change them to intervals of 10. I 
was thinking something like:

par(mfrow=c(2,3))
IRF_F2 <- irf(SVARmod1, impulse="deltaw2",

response=c("Belgium","France","Germany","Italy","Netherlands"),
n.ahead=60)
plot(IRF_F2,ylim=c(-0.5,0.3), plot.type="single", main="US Wealth
Share Shock", xaxt="n")
axis(side= 1, at = c(10, 20, 30, 40, 50, 60))

But it doesn't work. Any suggestions on how to reduce the tick interval?




--
John Kane
Kingston ON Canada
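
Since plot.varirf hard-codes axes = FALSE plus its own axis() calls, one
workaround is to pull the numbers out of the irf object and plot them
yourself, where axis control works as usual. The sketch below fakes a
response path because the thread's SVARmod1 object is not available; with a
real vars::irf() result the series would come from something like
IRF_F2$irf$deltaw2[, "France"] (an assumption about the object layout, not
tested here):

```r
## Sketch: manual plot of an IRF-style series with ticks every 10 steps.
## `resp` stands in for one response column of a vars::irf() result.
set.seed(1)
resp <- cumsum(rnorm(61, 0, 0.05))          # fake 60-step response path
plot(0:60, resp, type = "l", xaxt = "n",    # suppress the default x-axis
     xlab = "horizon", ylab = "response", main = "US Wealth Share Shock")
axis(side = 1, at = seq(0, 60, by = 10))    # then draw the axis we want
abline(h = 0, lty = 2)
```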







Re: [R] Query

2021-04-26 Thread Jim Lemon
Hi SunYong,
The docs are not exactly clear on this, but you might try adding
axes=FALSE instead of xaxt="n" and then calling both axes separately.

Jim


On Mon, Apr 26, 2021 at 5:28 PM Sun Yong Kim
 wrote:
>
> vars package
> 
> From: John Kane 
> Sent: Sunday, April 25, 2021 12:30 PM
> To: Sun Yong Kim 
> Cc: r-help@r-project.org 
> Subject: Re: [R] Query
>
> What package has the irf function?
>
> On Fri, 23 Apr 2021 at 09:34, Sun Yong Kim
>  wrote:
> >
> > Hi
> >
> > I have been trying to circulate a question but my question keeps getting 
> > rejected. Let me try again via email. The message is below:
> >
> > I am trying to change the tick size for IRF graphs that I created using the 
> > following code below:
> >
> > par(mfrow=c(2,3))
> > IRF_F2 <- irf(SVARmod1, impulse="deltaw2",
> >   response=c("Belgium","France","Germany","Italy","Netherlands"),
> > n.ahead=60)
> > plot(IRF_F2,ylim=c(-0.5,0.3), plot.type="single", main="US Wealth Share
> > Shock")
> >
> > The code gives me the following IRFs I have attached to this email. However 
> > the tick intervals is way too small and I want to change it to intervals of 
> > 10. I was thinking something like:
> >
> > par(mfrow=c(2,3))
> > IRF_F2 <- irf(SVARmod1, impulse="deltaw2",
> >
> > response=c("Belgium","France","Germany","Italy","Netherlands"),
> > n.ahead=60)
> > plot(IRF_F2,ylim=c(-0.5,0.3), plot.type="single", main="US Wealth
> > Share Shock", xaxt="n")
> > axis(side= 1, at = c(10, 20, 30, 40, 50, 60))
> >
> > But it doesn't work. Any suggestions on how to reduce the tick interval?
>
>
>
> --
> John Kane
> Kingston ON Canada
>
>



Re: [R] Query

2021-04-26 Thread Sun Yong Kim
vars package

From: John Kane 
Sent: Sunday, April 25, 2021 12:30 PM
To: Sun Yong Kim 
Cc: r-help@r-project.org 
Subject: Re: [R] Query

What package has the irf function?

On Fri, 23 Apr 2021 at 09:34, Sun Yong Kim
 wrote:
>
> Hi
>
> I have been trying to circulate a question but my question keeps getting 
> rejected. Let me try again via email. The message is below:
>
> I am trying to change the tick size for IRF graphs that I created using the 
> following code below:
>
> par(mfrow=c(2,3))
> IRF_F2 <- irf(SVARmod1, impulse="deltaw2",
>   response=c("Belgium","France","Germany","Italy","Netherlands"),
> n.ahead=60)
> plot(IRF_F2,ylim=c(-0.5,0.3), plot.type="single", main="US Wealth Share
> Shock")
>
> The code gives me the following IRFs I have attached to this email. However 
> the tick intervals is way too small and I want to change it to intervals of 
> 10. I was thinking something like:
>
> par(mfrow=c(2,3))
> IRF_F2 <- irf(SVARmod1, impulse="deltaw2",
>
> response=c("Belgium","France","Germany","Italy","Netherlands"),
> n.ahead=60)
> plot(IRF_F2,ylim=c(-0.5,0.3), plot.type="single", main="US Wealth
> Share Shock", xaxt="n")
> axis(side= 1, at = c(10, 20, 30, 40, 50, 60))
>
> But it doesn't work. Any suggestions on how to reduce the tick interval?



--
John Kane
Kingston ON Canada




Re: [R] Query

2021-04-25 Thread John Kane
What package has the irf function?

On Fri, 23 Apr 2021 at 09:34, Sun Yong Kim
 wrote:
>
> Hi
>
> I have been trying to circulate a question but my question keeps getting 
> rejected. Let me try again via email. The message is below:
>
> I am trying to change the tick size for IRF graphs that I created using the 
> following code below:
>
> par(mfrow=c(2,3))
> IRF_F2 <- irf(SVARmod1, impulse="deltaw2",
>   response=c("Belgium","France","Germany","Italy","Netherlands"),
> n.ahead=60)
> plot(IRF_F2,ylim=c(-0.5,0.3), plot.type="single", main="US Wealth Share
> Shock")
>
> The code gives me the following IRFs I have attached to this email. However 
> the tick intervals is way too small and I want to change it to intervals of 
> 10. I was thinking something like:
>
> par(mfrow=c(2,3))
> IRF_F2 <- irf(SVARmod1, impulse="deltaw2",
>
> response=c("Belgium","France","Germany","Italy","Netherlands"),
> n.ahead=60)
> plot(IRF_F2,ylim=c(-0.5,0.3), plot.type="single", main="US Wealth
> Share Shock", xaxt="n")
> axis(side= 1, at = c(10, 20, 30, 40, 50, 60))
>
> But it doesn't work. Any suggestions on how to reduce the tick interval?



-- 
John Kane
Kingston ON Canada



[R] Query

2021-04-23 Thread Sun Yong Kim
Hi

I have been trying to circulate a question but my question keeps getting 
rejected. Let me try again via email. The message is below:

I am trying to change the tick size for IRF graphs that I created using the 
following code below:

par(mfrow=c(2,3))
IRF_F2 <- irf(SVARmod1, impulse="deltaw2",
  response=c("Belgium","France","Germany","Italy","Netherlands"),
n.ahead=60)
plot(IRF_F2,ylim=c(-0.5,0.3), plot.type="single", main="US Wealth Share
Shock")

The code gives me the IRFs attached to this email. However, the tick 
intervals are way too small, and I want to change them to intervals of 10. I 
was thinking something like:

par(mfrow=c(2,3))
IRF_F2 <- irf(SVARmod1, impulse="deltaw2",

response=c("Belgium","France","Germany","Italy","Netherlands"),
n.ahead=60)
plot(IRF_F2,ylim=c(-0.5,0.3), plot.type="single", main="US Wealth
Share Shock", xaxt="n")
axis(side= 1, at = c(10, 20, 30, 40, 50, 60))

But it doesn't work. Any suggestions on how to reduce the tick interval?


Re: [R] Query on constrained regressions using -mgcv- and -pcls-

2020-11-02 Thread Bert Gunter
Warning: I did *not* attempt to follow your query (original or addendum) in
detail. But as you have not yet received a reply, it may be because your
post seems mostly about statistical issues, which are generally off topic
here. This list is primarily about R programming issues. If statistical
issues are your primary focus, SO may be a better place to post:
https://stats.stackexchange.com/

Otherwise, I guess you'll just have to continue waiting.

Incidentally, suggestions for improvements in nonstandard packages should
generally be sent to the package maintainer (?maintainer) rather than
posted here. Maintainers may not even check this list.

Finally, this is a plain text list. HTML posts often get mangled by the
server.

Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Mon, Nov 2, 2020 at 9:28 PM Clive Nicholas via R-help <
r-help@r-project.org> wrote:

> As an addendum / erratum to my original post, the second block of code
> should read for completeness:
>
> set.seed(02102020)
> N=500
> M=10
> rater=rep(1:M, each = N)
> lead_n=as.factor(rep(1:N,M))
> a=rep(rnorm(N),M)
> z=rep(round(25+2*rnorm(N)+.2*a))
> x=a+rnorm(N*M)
> y=.5*x+5*a-.5*z+2*rnorm(N*M)
> x_cl=rep(aggregate(x, list(lead_n), mean)[,2], M)
> model=lm(y~x+x_cl+z)
> summary(model)
> y=1+1.5*x+4.6*x_cl-0.5*z
> x.mat=cbind(rep(1,length(y)),x,x_cl,z)
> ls.print(lsfit(x.mat,y,intercept=FALSE))
> M=list(y=y,
> w=rep(1, length(y)),
> X=x.mat,
> C=matrix(0,0,0),
> p=rep(1, ncol(x.mat)),
> off=array(0,0),
> S=list(),
> sp=array(0,0),
> Ain=diag(ncol(x.mat)),
> bin=rep(0, ncol(x.mat)) )
> pcls(M)
>
> However, all my questions stand.
>
> Ta, Clive
>
> On Tue, 3 Nov 2020 at 01:14, Clive Nicholas 
> wrote:
>
> > Hello all,
> >
> > I'll level with you: I'm puzzled!
> >
> > How is it that this constrained regression routine using -pcls- runs
> > satisfactorily (courtesy of Tian Zheng):
> >
> > library(mgcv)
> > options(digits=3)
> > x.1=rnorm(100, 0, 1)
> > x.2=rnorm(100, 0, 1)
> > x.3=rnorm(100, 0, 1)
> > x.4=rnorm(100, 0, 1)
> > y=1+0.5*x.1-0.2*x.2+0.3*x.3+0.1*x.4+rnorm(100, 0, 0.01)
> > x.mat=cbind(rep(1, length(y)), x.1, x.2, x.3, x.4)
> > ls.print(lsfit(x.mat, y, intercept=FALSE))
> > M=list(y=y,
> > w=rep(1, length(y)),
> > X=x.mat,
> > C=matrix(0,0,0),
> > p=rep(1, ncol(x.mat)),
> > off=array(0,0),
> > S=list(),
> > sp=array(0,0),
> > Ain=diag(ncol(x.mat)),
> > bin=rep(0, ncol(x.mat)) )
> > pcls(M)
> > Residual Standard Error=0.0095
> > R-Square=1
> > F-statistic (df=5, 95)=314735
> > p-value=0
> >
>       Estimate Std.Err t-value Pr(>|t|)
>          1.000  0.0010  1043.9        0
> x.1      0.501  0.0010   512.6        0
> x.2     -0.202  0.0009  -231.6        0
> x.3      0.298  0.0010   297.8        0
> x.4      0.103  0.0011    94.8        0
> >
> > but this one does not for a panel dataset:
> >
> > set.seed(02102020)
> > N=500
> > M=10
> > rater=rep(1:M, each = N)
> > lead_n=as.factor(rep(1:N,M))
> > a=rep(rnorm(N),M)
> > z=rep(round(25+2*rnorm(N)+.2*a))
> > x=a+rnorm(N*M)
> > y=.5*x+5*a-.5*z+2*rnorm(N*M)
> > x_cl=rep(aggregate(x, list(lead_n), mean)[,2], M)
> > model=lm(y~x+x_cl+z)
> > summary(model)
> > y=1+1.5*x+4.6*x_cl-0.5*z
> > x.mat=cbind(rep(1,length(y)),x,x_cl,z)
> > ls.print(lsfit(x.mat,y,intercept=FALSE))
> >
> > Residual Standard Error=0
> > R-Square=1
> > F-statistic (df=4, 4996)=5.06e+30
> > p-value=0
> >
>      Estimate Std.Err   t-value Pr(>|t|)
>           1.0       0  2.89e+13        0
> x         0.8       0  2.71e+14        0
> x_cl      4.6       0  1.18e+15        0
> z        -0.5       0 -3.63e+14        0
> >
> > ?
> >
> > There shouldn't be anything wrong with the second set of data, unless
> I've
> > missed something obvious (that constraints don't work for panel data?
> Seems
> > unlikely to me)!
> >
> > Also:
> >
> > (1) I'm ultimately looking just to constrain ONE coefficient whilst
> > allowing the other coefficients to be unconstrained (I tried this with
> the
> > first dataset by setting
> >
> > y=1+0.5*x.1-x.2+x.3+x.4
> >
> > in the call, but got similar-looking output to what I got in the second
> > dataset); and
> >
> > (2) it would be really useful to have the call to -pcls(M)- produce more
> > informative output (SEs, t-values, fit stats, etc).
> >
> > Many thanks in anticipation of your expert help and being told what a
> > clueless berk I am,
> > Clive
> >
> > --
> > Clive Nicholas
> >
> > "My colleagues in the social sciences talk a great deal about
> methodology.
> > I prefer to call it style." -- Freeman J. Dyson
> >
>
>
> --
> Clive Nicholas
>
> "My colleagues in the social sciences talk a great deal about methodology.
> I prefer to call it style." -- Freeman J. Dyson
>

Re: [R] Query on constrained regressions using -mgcv- and -pcls-

2020-11-02 Thread Clive Nicholas via R-help
As an addendum / erratum to my original post, the second block of code
should read for completeness:

set.seed(02102020)
N=500
M=10
rater=rep(1:M, each = N)
lead_n=as.factor(rep(1:N,M))
a=rep(rnorm(N),M)
z=rep(round(25+2*rnorm(N)+.2*a))
x=a+rnorm(N*M)
y=.5*x+5*a-.5*z+2*rnorm(N*M)
x_cl=rep(aggregate(x, list(lead_n), mean)[,2], M)
model=lm(y~x+x_cl+z)
summary(model)
y=1+1.5*x+4.6*x_cl-0.5*z
x.mat=cbind(rep(1,length(y)),x,x_cl,z)
ls.print(lsfit(x.mat,y,intercept=FALSE))
M=list(y=y,
w=rep(1, length(y)),
X=x.mat,
C=matrix(0,0,0),
p=rep(1, ncol(x.mat)),
off=array(0,0),
S=list(),
sp=array(0,0),
Ain=diag(ncol(x.mat)),
bin=rep(0, ncol(x.mat)) )
pcls(M)

However, all my questions stand.

Ta, Clive

On Tue, 3 Nov 2020 at 01:14, Clive Nicholas 
wrote:

> Hello all,
>
> I'll level with you: I'm puzzled!
>
> How is it that this constrained regression routine using -pcls- runs
> satisfactorily (courtesy of Tian Zheng):
>
> library(mgcv)
> options(digits=3)
> x.1=rnorm(100, 0, 1)
> x.2=rnorm(100, 0, 1)
> x.3=rnorm(100, 0, 1)
> x.4=rnorm(100, 0, 1)
> y=1+0.5*x.1-0.2*x.2+0.3*x.3+0.1*x.4+rnorm(100, 0, 0.01)
> x.mat=cbind(rep(1, length(y)), x.1, x.2, x.3, x.4)
> ls.print(lsfit(x.mat, y, intercept=FALSE))
> M=list(y=y,
> w=rep(1, length(y)),
> X=x.mat,
> C=matrix(0,0,0),
> p=rep(1, ncol(x.mat)),
> off=array(0,0),
> S=list(),
> sp=array(0,0),
> Ain=diag(ncol(x.mat)),
> bin=rep(0, ncol(x.mat)) )
> pcls(M)
> Residual Standard Error=0.0095
> R-Square=1
> F-statistic (df=5, 95)=314735
> p-value=0
>
>      Estimate Std.Err t-value Pr(>|t|)
>         1.000  0.0010  1043.9        0
> x.1     0.501  0.0010   512.6        0
> x.2    -0.202  0.0009  -231.6        0
> x.3     0.298  0.0010   297.8        0
> x.4     0.103  0.0011    94.8        0
>
> but this one does not for a panel dataset:
>
> set.seed(02102020)
> N=500
> M=10
> rater=rep(1:M, each = N)
> lead_n=as.factor(rep(1:N,M))
> a=rep(rnorm(N),M)
> z=rep(round(25+2*rnorm(N)+.2*a))
> x=a+rnorm(N*M)
> y=.5*x+5*a-.5*z+2*rnorm(N*M)
> x_cl=rep(aggregate(x, list(lead_n), mean)[,2],M)
> model=lm(y~x+x_cl+z)
> summary(model)
> y=1+1.5*x+4.6*x_cl-0.5*z
> x.mat=cbind(rep(1,length(y)),x,x_cl,z)
> ls.print(lsfit(x.mat,y,intercept=FALSE))
>
> Residual Standard Error=0
> R-Square=1
> F-statistic (df=4, 4996)=5.06e+30
> p-value=0
>
>      Estimate Std.Err   t-value Pr(>|t|)
>           1.0       0  2.89e+13        0
> x         0.8       0  2.71e+14        0
> x_cl      4.6       0  1.18e+15        0
> z        -0.5       0 -3.63e+14        0
>
> ?
>
> There shouldn't be anything wrong with the second set of data, unless I've
> missed something obvious (that constraints don't work for panel data? Seems
> unlikely to me)!
>
> Also:
>
> (1) I'm ultimately looking just to constrain ONE coefficient whilst
> allowing the other coefficients to be unconstrained (I tried this with the
> first dataset by setting
>
> y=1+0.5*x.1-x.2+x.3+x.4
>
> in the call, but got similar-looking output to what I got in the second
> dataset); and
>
> (2) it would be really useful to have the call to -pcls(M)- produce more
> informative output (SEs, t-values, fit stats, etc).
>
> Many thanks in anticipation of your expert help and being told what a
> clueless berk I am,
> Clive
>
> --
> Clive Nicholas
>
> "My colleagues in the social sciences talk a great deal about methodology.
> I prefer to call it style." -- Freeman J. Dyson
>


-- 
Clive Nicholas

"My colleagues in the social sciences talk a great deal about methodology.
I prefer to call it style." -- Freeman J. Dyson


__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Query on constrained regressions using -mgcv- and -pcls-

2020-11-02 Thread Clive Nicholas via R-help
Hello all,

I'll level with you: I'm puzzled!

How is it that this constrained regression routine using -pcls- runs
satisfactorily (courtesy of Tian Zheng):

library(mgcv)
options(digits=3)
x.1=rnorm(100, 0, 1)
x.2=rnorm(100, 0, 1)
x.3=rnorm(100, 0, 1)
x.4=rnorm(100, 0, 1)
y=1+0.5*x.1-0.2*x.2+0.3*x.3+0.1*x.4+rnorm(100, 0, 0.01)
x.mat=cbind(rep(1, length(y)), x.1, x.2, x.3, x.4)
ls.print(lsfit(x.mat, y, intercept=FALSE))
M=list(y=y,
w=rep(1, length(y)),
X=x.mat,
C=matrix(0,0,0),
p=rep(1, ncol(x.mat)),
off=array(0,0),
S=list(),
sp=array(0,0),
Ain=diag(ncol(x.mat)),
bin=rep(0, ncol(x.mat)) )
pcls(M)
Residual Standard Error=0.0095
R-Square=1
F-statistic (df=5, 95)=314735
p-value=0

     Estimate Std.Err t-value Pr(>|t|)
        1.000  0.0010  1043.9        0
x.1     0.501  0.0010   512.6        0
x.2    -0.202  0.0009  -231.6        0
x.3     0.298  0.0010   297.8        0
x.4     0.103  0.0011    94.8        0

but this one does not for a panel dataset:

set.seed(02102020)
N=500
M=10
rater=rep(1:M, each = N)
lead_n=as.factor(rep(1:N,M))
a=rep(rnorm(N),M)
z=rep(round(25+2*rnorm(N)+.2*a))
x=a+rnorm(N*M)
y=.5*x+5*a-.5*z+2*rnorm(N*M)
x_cl=rep(aggregate(x, list(lead_n), mean)[,2],M)
model=lm(y~x+x_cl+z)
summary(model)
y=1+1.5*x+4.6*x_cl-0.5*z
x.mat=cbind(rep(1,length(y)),x,x_cl,z)
ls.print(lsfit(x.mat,y,intercept=FALSE))

Residual Standard Error=0
R-Square=1
F-statistic (df=4, 4996)=5.06e+30
p-value=0

     Estimate Std.Err   t-value Pr(>|t|)
          1.0       0  2.89e+13        0
x         0.8       0  2.71e+14        0
x_cl      4.6       0  1.18e+15        0
z        -0.5       0 -3.63e+14        0

?

There shouldn't be anything wrong with the second set of data, unless I've
missed something obvious (that constraints don't work for panel data? Seems
unlikely to me)!

Also:

(1) I'm ultimately looking just to constrain ONE coefficient whilst
allowing the other coefficients to be unconstrained (I tried this with the
first dataset by setting

y=1+0.5*x.1-x.2+x.3+x.4

in the call, but got similar-looking output to what I got in the second
dataset); and

(2) it would be really useful to have the call to -pcls(M)- produce more
informative output (SEs, t-values, fit stats, etc).
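[For (1), a minimal sketch — mine, not from the thread: as I read mgcv's documentation, pcls() enforces the inequality constraints Ain %*% p > bin, so constraining a single coefficient just means giving Ain one row instead of the full identity matrix. Here np and x.mat refer to the objects built above.]

```r
# Constrain only the x.1 coefficient (column 2 of x.mat) to be
# non-negative, leaving the other coefficients unconstrained.
np <- 5                            # ncol(x.mat) in the first example
Ain <- matrix(0, nrow = 1, ncol = np)
Ain[1, 2] <- 1                     # this single row enforces beta[2] >= 0
bin <- 0
# M$Ain <- Ain; M$bin <- bin; pcls(M)   # then refit as before (needs mgcv)
```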

Many thanks in anticipation of your expert help and being told what a
clueless berk I am,
Clive





Re: [R] Query on contour plots

2020-06-02 Thread Abby Spurdle
> The contour lines are actually useful to see groupings.
> However w/o a legend for density it is not possible to see what is
> presented.

I need to reiterate that the diagonal lines may be important.

Also, I'm not sure I see the point in adding density values.
Unless people have a good knowledge of probability theory and
calculus, I doubt that specific density values will be useful.
i.e. If I said the density was 0.0035, what does that tell you...?

If you really want to add a legend, it's possible.

But this creates at least two problems:
(1) In the base graphics system, the resulting plots can't be nested.
(2) It's difficult to interpret specific color-encoded values.

In my opinion, a better idea, is to label the contour lines.
In my packages, this is possible by using contour.labels=TRUE,
however, the defaults are ugly.
(Something else for my todo list).

Here's a slightly more complex example, with prettier contour labels:

library (barsurf)
library (KernSmooth)
set.bs.theme ("heat")

plot_ds <- function (dataset, main="", xlim, ylim, ...,
    ncontours=3, labcex=0.8, ndec=3,
    k1=1, k2=1, n=30)
{   names <- names (dataset)
    x <- dataset [,1]
    y <- dataset [,2]
    bw.x <- k1 * bw.nrd (x)
    bw.y <- k2 * bw.nrd (y)
    if (missing (xlim) )
        xlim <- range (x) + c(-1, 1) * bw.x
    if (missing (ylim) )
        ylim <- range (y) + c(-1, 1) * bw.y

    ks <- bkde2D (dataset, c (bw.x, bw.y),
        c (n, n), list (xlim, ylim), FALSE)

    fb <- seq (min (ks$fhat), max (ks$fhat),
        length.out = ncontours + 2)
    fb <- fb [2:(ncontours + 1)]
    fb <- round (fb, ndec)

    plot_cfield (ks$x1, ks$x2, ks$fhat,
        contours=FALSE,
        main=main, xlab = names [1], ylab = names [2],
        xyrel="m")
    points (x, y, pch=16, col="#00000040")   # translucent black
    contour (ks$x1, ks$x2, ks$fhat, levels=fb, labcex=labcex, add=TRUE)
}

plot_ds (bat_call, "plot 2", c (25, 28), c (-15, 10), k1=1.25, k2=1.25)

If you still want a legend, have a look at:
graphics::filled.contour

And then modify the second half of my code, starting after ks <- ...
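[For what it's worth, a self-contained sketch of that filled.contour idea, on synthetic data rather than the bat calls (bkde2D from KernSmooth, bw.nrd from stats); filled.contour draws its own colour key, which doubles as the density legend.]

```r
library (KernSmooth)

set.seed (1)
# synthetic stand-in for the Fc/Sc columns
d <- cbind (rnorm (200, 26, 0.5), rnorm (200, -5, 2))

# bivariate kernel density estimate on the default 51 x 51 grid
ks <- bkde2D (d, bandwidth = c (bw.nrd (d [,1]), bw.nrd (d [,2])))

# filled.contour adds a colour key on the right
filled.contour (ks$x1, ks$x2, ks$fhat,
    xlab="Fc", ylab="Sc",
    key.title = title (main="Density", cex.main=0.8))
```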



Re: [R] Query on contour plots

2020-06-02 Thread Abby Spurdle
>  that extraneous white lines in PDFs are the fault of the PDF
> viewing program rather than of R.

Except it's a PNG file.

I've tried to minimize artifacts viewing PDF files.
But assumed (falsely?) that PNGs and other raster formats, would be fine.



Re: [R] Query on contour plots

2020-06-02 Thread Neotropical bat risk assessments

Tnx Jim,

Yes, if there is a way to first extract each data file's Fc and Sc ranges
and then link them to the plot, that would be stellar.

I will look at this code and see how it is working so far.

Thanks a million.
Bruce

Hi Bruce & Abby,
Here is a start on merging the two plots.
Abby - I had to cheat on the legend colors as I could not work out
from the help pages how to specify the range of colors. Also I don't
know the range of densities. Both should be easy to fix. While I
specified xlab and ylab, they don't seem to make it to the plotting
functions. More study needed.
Bruce - The following code gives a general idea of how to automate
plotting from a single data set. Let me know whether you want
automated adjustment of axes, etc.
Both - I suspect that the constraints forming the diagonal lines are
due to characteristics of the bat larynx.

bfs<-read.csv("Procen_sample.csv")
# split out what you want to identify the plot
species<-unlist(strsplit("Procen_sample.csv","_"))[1]
library(bivariate)
# define the plot sequence
plot_ds <- function (dataset, main="", xlim, ylim, ..., k1=1, k2=1)
 {   names <- names (dataset)
 fh <- kbvpdf (dataset [,1], dataset [,2], k1 * bw.nrd (dataset
[,1]), k2 * bw.nrd (dataset [,2]) )
 plot (fh, main=main, xlab = names [1], ylab = names [2],
 xlim=xlim, ylim=ylim,
 ncontours=2)
}
# open the device
png(paste0(species,".png"))
# leave space for the color legend
par(mar=c(6,4,4,2))
plot_ds (bfs[,c("Fc","Sc")],
  main=paste(species,"characteristic bat call"),
  xlab="Frequency (kHz)",ylab="Characteristic slope (octaves/s)",
  k1=1.25, k2=1.25)
library(plotrix)
xylim<-par("usr")
color.legend(xylim[1],xylim[3]-(xylim[4]-xylim[3])/7,
  xylim[1]+(xylim[2]-xylim[1])/4,xylim[3]-(xylim[4]-xylim[3])/10,
legend=seq(0,10,length.out=5),
rect.col=color.scale(0:4,extremes=c("#7be6bd","#bdb3df")),align="rb")
text(xylim[1]+(xylim[2]-xylim[1])/8,
  xylim[3]-(xylim[4]-xylim[3])/5,
  "Density",xpd=TRUE)
dev.off()

Jim

On Wed, Jun 3, 2020 at 6:22 AM Neotropical bat risk assessments
 wrote:

Hi Abby,

The contour lines are actually useful to see groupings.
However w/o a legend for density it is not possible to see what is
presented.

Very nice

Jim, thank you.
However, the (deterministic, or near-deterministic) diagonal lines in
the plot, make me question the suitability of this approach.
In my plot, the contour lines could be removed, and brighter colors
could be used.

But perhaps, a better approach would be to model those lines...
And it's not clear from the plot, if all the observations fall on a
diagonal line...


P.S.
I'm not sure why there's a white line on the plot.
Most of my testing was with PDF output, I will need to do some more
testing with PNG output.


--
Bruce W. Miller, PhD.
Neotropical bat risk assessments
Conservation Fellow - Wildlife Conservation Society

If we lose the bats, we may lose much of the tropical vegetation and the lungs 
of the planet

Using acoustic sampling to identify and map species distributions
and pioneering acoustic tools for ecology and conservation of bats for >25 
years.

Key projects include providing free interactive identification keys and call 
fact sheets for the vocal signatures of New World Bats







Re: [R] Query on contour plots

2020-06-02 Thread David Winsemius



On 6/2/20 11:44 AM, Abby Spurdle wrote:

Very nice

Jim, thank you.
However, the (deterministic, or near-deterministic) diagonal lines in
the plot, make me question the suitability of this approach.
In my plot, the contour lines could be removed, and brighter colors
could be used.

But perhaps, a better approach would be to model those lines...
And it's not clear from the plot, if all the observations fall on a
diagonal line...


P.S.
I'm not sure why there's a white line on the plot.



I think if you search the archives of Rhelp you will find many such 
whinges and that extraneous white lines in PDFs are the fault of the PDF 
viewing program rather than of R.



--

David.


Most of my testing was with PDF output, I will need to do some more
testing with PNG output.



Re: [R] Query on contour plots

2020-06-02 Thread Jim Lemon
Hi Bruce & Abby,
Here is a start on merging the two plots.
Abby - I had to cheat on the legend colors as I could not work out
from the help pages how to specify the range of colors. Also I don't
know the range of densities. Both should be easy to fix. While I
specified xlab and ylab, they don't seem to make it to the plotting
functions. More study needed.
Bruce - The following code gives a general idea of how to automate
plotting from a single data set. Let me know whether you want
automated adjustment of axes, etc.
Both - I suspect that the constraints forming the diagonal lines are
due to characteristics of the bat larynx.

bfs<-read.csv("Procen_sample.csv")
# split out what you want to identify the plot
species<-unlist(strsplit("Procen_sample.csv","_"))[1]
library(bivariate)
# define the plot sequence
plot_ds <- function (dataset, main="", xlim, ylim, ..., k1=1, k2=1)
{   names <- names (dataset)
fh <- kbvpdf (dataset [,1], dataset [,2], k1 * bw.nrd (dataset
[,1]), k2 * bw.nrd (dataset [,2]) )
plot (fh, main=main, xlab = names [1], ylab = names [2],
xlim=xlim, ylim=ylim,
ncontours=2)
}
# open the device
png(paste0(species,".png"))
# leave space for the color legend
par(mar=c(6,4,4,2))
plot_ds (bfs[,c("Fc","Sc")],
 main=paste(species,"characteristic bat call"),
 xlab="Frequency (kHz)",ylab="Characteristic slope (octaves/s)",
 k1=1.25, k2=1.25)
library(plotrix)
xylim<-par("usr")
color.legend(xylim[1],xylim[3]-(xylim[4]-xylim[3])/7,
 xylim[1]+(xylim[2]-xylim[1])/4,xylim[3]-(xylim[4]-xylim[3])/10,
legend=seq(0,10,length.out=5),
rect.col=color.scale(0:4,extremes=c("#7be6bd","#bdb3df")),align="rb")
text(xylim[1]+(xylim[2]-xylim[1])/8,
 xylim[3]-(xylim[4]-xylim[3])/5,
 "Density",xpd=TRUE)
dev.off()

Jim

On Wed, Jun 3, 2020 at 6:22 AM Neotropical bat risk assessments
 wrote:
>
> Hi Abby,
>
> The contour lines are actually useful to see groupings.
> However w/o a legend for density it is not possible to see what is
> presented.
> >> Very nice
> > Jim, thank you.
> > However, the (deterministic, or near-deterministic) diagonal lines in
> > the plot, make me question the suitability of this approach.
> > In my plot, the contour lines could be removed, and brighter colors
> > could be used.
> >
> > But perhaps, a better approach would be to model those lines...
> > And it's not clear from the plot, if all the observations fall on a
> > diagonal line...
> >
> >
> > P.S.
> > I'm not sure why there's a white line on the plot.
> > Most of my testing was with PDF output, I will need to do some more
> > testing with PNG output.
>
>
> --
> Bruce W. Miller, PhD.
> Neotropical bat risk assessments
> Conservation Fellow - Wildlife Conservation Society
>
> If we lose the bats, we may lose much of the tropical vegetation and the 
> lungs of the planet
>
> Using acoustic sampling to identify and map species distributions
> and pioneering acoustic tools for ecology and conservation of bats for >25 
> years.
>
> Key projects include providing free interactive identification keys and call 
> fact sheets for the vocal signatures of New World Bats
>


Re: [R] Query on contour plots

2020-06-02 Thread Neotropical bat risk assessments

Hi Abby,

The contour lines are actually useful to see groupings.
However w/o a legend for density it is not possible to see what is 
presented.

Very nice

Jim, thank you.
However, the (deterministic, or near-deterministic) diagonal lines in
the plot, make me question the suitability of this approach.
In my plot, the contour lines could be removed, and brighter colors
could be used.

But perhaps, a better approach would be to model those lines...
And it's not clear from the plot, if all the observations fall on a
diagonal line...


P.S.
I'm not sure why there's a white line on the plot.
Most of my testing was with PDF output, I will need to do some more
testing with PNG output.






Re: [R] Query on contour plots

2020-06-02 Thread Abby Spurdle
> Very nice

Jim, thank you.
However, the (deterministic, or near-deterministic) diagonal lines in
the plot, make me question the suitability of this approach.
In my plot, the contour lines could be removed, and brighter colors
could be used.

But perhaps, a better approach would be to model those lines...
And it's not clear from the plot, if all the observations fall on a
diagonal line...


P.S.
I'm not sure why there's a white line on the plot.
Most of my testing was with PDF output, I will need to do some more
testing with PNG output.



Re: [R] Query on contour plots

2020-06-02 Thread Neotropical bat risk assessments
Hi all,

I spent some time this morning fiddling with the parameters in the plot
code provided by Jim and Abby, changing some important ones.

Jim did note
# set the matrix limits a bit beyond the data ranges
fcsc_mat<-makeDensityMatrix(bfs$Fc,bfs$Sc,nx=100,ny=100,
  zfun="sum",xlim=c(30,45),ylim=c(-55,110)) and

axis(1,at=seq(5,95,10),round(seq(30.0,50.0,length.out=10),1))

axis(2,at=seq(5,95,10),round(seq(-55,110,length.out=10),1))

So after editing the lines above to match what the data includes, the
plots for various species are working!

I now need to figure out how to add a legend for the density values in 
the bivariate package plots.

I am assuming there can be a line or so of code that can extract the 
min-max values from the actual data files
that will update the xlim, ylim and axis data?  I think this should be a 
simple first step after reading in each new data set.
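[Something like this might do as that first step — a sketch only, assuming every file has Fc and Sc columns; the 5% padding fraction is arbitrary.]

```r
# Pool the Fc/Sc ranges over all data sets, then reuse one common,
# slightly padded xlim/ylim for every plot.
pooled_limits <- function (datasets, pad = 0.05)
{   fc <- unlist (lapply (datasets, `[[`, "Fc"))
    sc <- unlist (lapply (datasets, `[[`, "Sc"))
    list (xlim = range (fc) + c (-1, 1) * pad * diff (range (fc)),
        ylim = range (sc) + c (-1, 1) * pad * diff (range (sc)))
}

# e.g. with two toy data sets:
d1 <- data.frame (Fc = c (26, 27), Sc = c (-6, -2))
d2 <- data.frame (Fc = c (44, 50), Sc = c (-10, 5))
lims <- pooled_limits (list (d1, d2))
```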

I can not thank Jim and Abby enough.  Super helpful

Cheers,
Bruce






Re: [R] Query on contour plots

2020-06-02 Thread Neotropical bat risk assessments
Hi all,

Many thanks for the efforts and suggestions.

This is getting closer to what is needed.  No legend showing the density 
values yet.
I was able to replicate a similar plot with the original data set.
However, when I tried this with a different data set that has other Fc &
Sc values, the plot does not work... just a blank PNG.
Code from console below:

  >bfs<-Eptfur
 > dim(bfs)
[1] 5638   17
 > names(bfs)
  [1] "Filename" "st"   "Dur"  "TBC"  "Fmax" "Fmin" "Fmean"
  [8] "Tk"   "Fk"   "Qk"   "Tc"   "Fc" "Dc"   "S1"
[15] "Sc"   "Qual" "Pmc"
 > library(plotrix)
 > # set the matrix limits a bit beyond the data ranges
 > fcsc_mat<-makeDensityMatrix(bfs$Fc,bfs$Sc,nx=25,ny=25,
+ zfun="sum",xlim=c(24,29),ylim=c(-20,10))
Range of density (>0) - Inf -Inf
Warning messages:
1: In min(x) : no non-missing arguments to min; returning Inf
2: In max(x) : no non-missing arguments to max; returning -Inf
 > png("bat_call_plot.png")
 > par(mar=c(6,4,4,2))
 > color2D.matplot(fcsc_mat,
+ main="Frequency by slope of bat calls",
+ extremes=c("yellow","red"),xlab="Frequency (kHz)",
+ ylab="Characteristic slope (octaves/s)",
+ border=NA,axes=FALSE)
Warning messages:
1: In min(x) : no non-missing arguments to min; returning Inf
2: In max(x) : no non-missing arguments to max; returning -Inf
3: In min(x) : no non-missing arguments to min; returning Inf
4: In max(x) : no non-missing arguments to max; returning -Inf
 > axis(1,at=seq(5,95,10),round(seq(24.5,28.5,length.out=10),1))
 > axis(2,at=seq(5,95,10),round(seq(-20,10,length.out=10),1))
 > color.legend(0,-14,25,-10,legend=seq(0,10,length.out=5),
+ rect.col=color.scale(0:4,extremes=c("yellow","red")),align="rb")
 > text(12.5,-20,"Density (cell count)",xpd=TRUE)
 > dev.off()
null device
   1

I will not need to add a function to iterate, as I will not be running
this as an iterative task all at once... I just need the code to be able
to use different data sets that have the same fields.
The Sc values over the 200+ data sets will range from potentially large 
negative numbers to positive numbers depending on the slope of the 
calls, i.e. increasing frequencies or decreasing frequencies.
An example of these two parameters for a single species with descriptive 
stats.
N is the valid number of call pulses, followed by the 10%-90% bins into
which the call pulses fall.

Parameters      N      Min     Max    Mean  St.Dev     10%     25%     75%     90%
Fc          32802    43.01   50.00   46.86    1.31   45.07   45.98   47.76   48.63
Sc          32802  -309.78   13.76   -6.60   10.98  -10.31   -7.50   -3.91   -2.81


I am very appreciative and thank you both for guiding the efforts.

Bruce
> Very nice. I forgot that you didn't have the complete data set.
>
> png("as_bat_call.png")
> plot_ds (bfs[,c("Fc","Sc")], "plot 1", xlim = c (25, 30), ylim = c (-15, 10),
>  k1=1.25, k2=1.25)
> dev.off()
>
> Jim
>
> On Tue, Jun 2, 2020 at 6:24 PM Abby Spurdle  wrote:
>> I'm putting this back on the list.
>>
>>> So how would I set up the code to do this with the data type I have?
>>> I will need to replicate the same task > 200 times with other data sets.
>>> What I need to do is plot *Fc *against *Sc* with the third dimension being 
>>> the *density* of the data points.
>> Using Jim's bat_call data:
>>
>>  library (bivariate)
>>
>>  plot_ds <- function (dataset, main="", xlim, ylim, ..., k1=1, k2=1)
>>  {   names <- names (dataset)
>>  fh <- kbvpdf (dataset [,1], dataset [,2], k1 * bw.nrd (dataset
>> [,1]), k2 * bw.nrd (dataset [,2]) )
>>  plot (fh, main=main, xlab = names [1], ylab = names [2],
>>  xlim=xlim, ylim=ylim,
>>  ncontours=2)
>>  }
>>
>>  plot_ds (bat_call, "plot 1", k1=1.25, k2=1.25)
>>
>> Note that I've used stats::bw.nrd.
>> The k1 and k2 values, simply scale the default bandwidth.
>> (In this case, I've increased the smoothness).
>>
>> If you want to do it 200+ times:
>> (1) Create another function, to iterate over each data set.
>> (2) If you want to save the plots, you will need to add in a call to
>> pdf/png/etc and close the device, in each iteration.
>> (3) It may be desirable to have constant xlim/ylim values, ideally
>> based on the ranges of the combined data:
>>
>>  plot_ds (bat_call, "plot 1", xlim = c (25, 30), ylim = c (-15, 10),
>>  k1=1.25, k2=1.25)
>>
>> __
>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.



Re: [R] Query on contour plots

2020-06-02 Thread Jim Lemon
Very nice. I forgot that you didn't have the complete data set.

png("as_bat_call.png")
plot_ds (bfs[,c("Fc","Sc")], "plot 1", xlim = c (25, 30), ylim = c (-15, 10),
k1=1.25, k2=1.25)
dev.off()

Jim

On Tue, Jun 2, 2020 at 6:24 PM Abby Spurdle  wrote:
>
> I'm putting this back on the list.
>
> > So how would I set up the code to do this with the data type I have?
>
> > I will need to replicate the same task > 200 times with other data sets.
> > What I need to do is plot *Fc *against *Sc* with the third dimension being 
> > the *density* of the data points.
>
> Using Jim's bat_call data:
>
> library (bivariate)
>
> plot_ds <- function (dataset, main="", xlim, ylim, ..., k1=1, k2=1)
> {   names <- names (dataset)
> fh <- kbvpdf (dataset [,1], dataset [,2], k1 * bw.nrd (dataset
> [,1]), k2 * bw.nrd (dataset [,2]) )
> plot (fh, main=main, xlab = names [1], ylab = names [2],
> xlim=xlim, ylim=ylim,
> ncontours=2)
> }
>
> plot_ds (bat_call, "plot 1", k1=1.25, k2=1.25)
>
> Note that I've used stats::bw.nrd.
> The k1 and k2 values, simply scale the default bandwidth.
> (In this case, I've increased the smoothness).
>
> If you want to do it 200+ times:
> (1) Create another function, to iterate over each data set.
> (2) If you want to save the plots, you will need to add in a call to
> pdf/png/etc and close the device, in each iteration.
> (3) It may be desirable to have constant xlim/ylim values, ideally
> based on the ranges of the combined data:
>
> plot_ds (bat_call, "plot 1", xlim = c (25, 30), ylim = c (-15, 10),
> k1=1.25, k2=1.25)
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.


Re: [R] Query on contour plots

2020-06-02 Thread Abby Spurdle
I'm putting this back on the list.

> So how would I set up the code to do this with the data type I have?

> I will need to replicate the same task > 200 times with other data sets.
> What I need to do is plot *Fc *against *Sc* with the third dimension being 
> the *density* of the data points.

Using Jim's bat_call data:

library (bivariate)

plot_ds <- function (dataset, main="", xlim, ylim, ..., k1=1, k2=1)
{   names <- names (dataset)
fh <- kbvpdf (dataset [,1], dataset [,2], k1 * bw.nrd (dataset
[,1]), k2 * bw.nrd (dataset [,2]) )
plot (fh, main=main, xlab = names [1], ylab = names [2],
xlim=xlim, ylim=ylim,
ncontours=2)
}

plot_ds (bat_call, "plot 1", k1=1.25, k2=1.25)

Note that I've used stats::bw.nrd.
The k1 and k2 values, simply scale the default bandwidth.
(In this case, I've increased the smoothness).

If you want to do it 200+ times:
(1) Create another function, to iterate over each data set.
(2) If you want to save the plots, you will need to add in a call to
pdf/png/etc and close the device, in each iteration.
(3) It may be desirable to have constant xlim/ylim values, ideally
based on the ranges of the combined data:

plot_ds (bat_call, "plot 1", xlim = c (25, 30), ylim = c (-15, 10),
k1=1.25, k2=1.25)
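[Points (1) and (2) might be sketched like this — assuming the plot_ds() above is defined and that the files follow a naming scheme like "Procen_sample.csv"; both are assumptions, not something from the thread.]

```r
plot_all <- function (files, xlim, ylim)
{   for (f in files)
    {   d <- read.csv (f)
        # species name taken from the part of the filename before "_"
        species <- unlist (strsplit (basename (f), "_")) [1]
        png (paste0 (species, ".png"))   # one PNG per data set
        plot_ds (d [, c ("Fc", "Sc")], species, xlim, ylim, k1=1.25, k2=1.25)
        dev.off ()
    }
}
# plot_all (list.files (pattern="_sample\\.csv$"), c (25, 30), c (-15, 10))
```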



Re: [R] Query on contour plots

2020-06-01 Thread Jim Lemon
Good morning Bruce & Abby,
The fruit bats of Sydney have retreated to their camps so I can
finally answer your last two queries. Attached is a plot of your data
set on a 100 x 100 grid. This is how I did it:

bfs<-read.csv("Procen_sample.csv")
dim(bfs)
names(bfs)
library(plotrix)
# set the matrix limits a bit beyond the data ranges
fcsc_mat<-makeDensityMatrix(bfs$Fc,bfs$Sc,nx=100,ny=100,
 zfun="sum",xlim=c(24,29),ylim=c(-20,10))
png("bat_call.png")
par(mar=c(6,4,4,2))
color2D.matplot(fcsc_mat,
 main="Frequency by chirp slope of bat calls",
 extremes=c("yellow","red"),xlab="Frequency (kHz)",
 ylab="Characteristic slope (octaves/s)",
 border=NA,axes=FALSE)
axis(1,at=seq(5,95,10),round(seq(24.5,28.5,length.out=10),1))
axis(2,at=seq(5,95,10),round(seq(-20,10,length.out=10),1))
color.legend(0,-14,25,-10,legend=seq(0,10,length.out=5),
 rect.col=color.scale(0:4,extremes=c("yellow","red")),align="rb")
text(12.5,-20,"Density (cell count)",xpd=TRUE)
dev.off()

Abby's bivariate package looks like it will do some things that
color2D.matplot won't. However, I haven't had time to install it and
try it out, so I don't know whether it will be as easy to plug
different calls onto the same grid. Also, there appears to be
constraints on the frequency and slope in the calls and I don't know
enough about them to say why. Further tweaking may lead to better
solutions.

Jim


Re: [R] Query on contour plots

2020-06-01 Thread Abby Spurdle
Hi,

I'm probably biased.

But my package, bivariate, contains a wrapper for KernSmooth::bkde2D,
which can produce both 3D surface plots and (pretty) contour plots of
bivariate kernel density estimates, conveniently.

https://cran.r-project.org/web/packages/bivariate/vignettes/bivariate.pdf
(pages 18 to 19)
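
For readers who prefer to stay with base graphics, the estimator underneath
bivariate can also be called directly. A minimal sketch (assuming the
KernSmooth package is installed and the calls are in a data frame bfs with
columns Fc and Sc, as in Jim's post; the bandwidths are guesses to be tuned):

```r
library(KernSmooth)

# 2D kernel density estimate on a regular grid
est <- bkde2D(cbind(bfs$Fc, bfs$Sc),
              bandwidth = c(0.5, 2),    # assumed values -- tune for your data
              gridsize = c(100, 100))

# contour plot of the estimated density
contour(est$x1, est$x2, est$fhat,
        xlab = "Frequency (kHz)",
        ylab = "Characteristic slope (octaves/s)")
```

persp(est$x1, est$x2, est$fhat) would give the corresponding 3D surface view.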


On Mon, Jun 1, 2020 at 5:16 AM Neotropical bat risk assessments
 wrote:
>
> Hi all,
>
> While exploring  packages for 3D plots that several folks suggested (Tnx
> all!)
> It seems what I really need is a contour plot.  This is not working in
> the Deducer GUI.
>
> This will be an aid to separating bats by their vocal signatures.
> What I need to do is plot *Fc *against *Sc* with the third dimension
> being the *density* of the data points in the Fc-Sc plot.
>
> Data format is like this abbreviated sample.  Fc is a frequency in kHz
> and Sc is the characteristic slope  (octaves per second) of each call pulse.
>
> Any suggestions, guidance greatly appreciated.
> Bruce
>
> Fc  Sc
> 26.58   -5.95
> 27.03   -8.2
> 27.16   -2.07
> 26.19   -7.68
> 26.62   -3.99
> 26.85   -6.08
> 26.94   0
> 26.1    -5.74
> 26.62   -5.96
> 26.85   -4.05
> 26.98   -4.09
> 26.02   -5.69
> 26.53   -7.89
> 26.62   -2
> 26.8    -4.04
> 28.73   7
> 25.72   -2.97
> 26.14   -5.76
> 26.32   -3.89
> 26.4    0
> 26.32   5.88
>
>



Re: [R] Query on contour plots

2020-05-31 Thread Jim Lemon
Hi Bruce,
With a much larger data set, you would see a smoother plot like your
sample. I plotted frequency as the abscissa and slope as the ordinate. It
looks as though your sample has it the other way round and the plot limits
are extended beyond the range of the data. However, makeDensityMatrix and
color2D.matplot could produce a plot like it.

Jim

On Mon, Jun 1, 2020 at 11:13 AM Neotropical bat risk assessments <
neotropical.b...@gmail.com> wrote:

> Tnx Jim
>
> Great help.
> I need to read about package plotrix .
> Hoping to achieve something like this sample on right.
>



Re: [R] Query on contour plots

2020-05-31 Thread Jim Lemon
Hi Neo,
It's a bit of a guess, but try this:

bat_call<-read.table(text="Fc  Sc
26.58   -5.95
27.03   -8.2
27.16   -2.07
26.19   -7.68
26.62   -3.99
26.85   -6.08
26.94   0
26.1    -5.74
26.62   -5.96
26.85   -4.05
26.98   -4.09
26.02   -5.69
26.53   -7.89
26.62   -2
26.8    -4.04
28.73   7
25.72   -2.97
26.14   -5.76
26.32   -3.89
26.4    0
26.32   5.88",
header=TRUE)
library(plotrix)
color2D.matplot(makeDensityMatrix(bat_call$Fc,bat_call$Sc,nx=5,ny=5,
 zfun="sum",xlim=range(bat_call$Fc),ylim=range(bat_call$Sc)),
 main="Map of bat calls",extremes=c("blue","red"),xlab="Frequency",
 ylab="Characteristic slope",axes=FALSE)
axis(1,at=seq(0.5,4.5,1),seq(26.3,28.3,0.5))
axis(2,at=seq(0.5,4.5,1),seq(4,-11.2,-3.5))
color.legend(-0.5,-0.65,1,-0.45,legend=seq(0,4,length.out=5),
 rect.col=color.scale(0:4,extremes=c("blue","red")),align="rb")
text(0.25,-0.89,"Density",xpd=TRUE)

Jim

On Mon, Jun 1, 2020 at 3:16 AM Neotropical bat risk assessments
 wrote:
>
> Hi all,
>
> While exploring  packages for 3D plots that several folks suggested (Tnx
> all!)
> It seems what I really need is a contour plot.  This is not working in
> the Deducer GUI.
>
> This will be an aid to separating bats by their vocal signatures.
> What I need to do is plot *Fc *against *Sc* with the third dimension
> being the *density* of the data points in the Fc-Sc plot.
>
> Data format is like this abbreviated sample.  Fc is a frequency in kHz
> and Sc is the characteristic slope  (octaves per second) of each call pulse.
>
> Any suggestions, guidance greatly appreciated.
> Bruce
>
> Fc  Sc
> 26.58   -5.95
> 27.03   -8.2
> 27.16   -2.07
> 26.19   -7.68
> 26.62   -3.99
> 26.85   -6.08
> 26.94   0
> 26.1    -5.74
> 26.62   -5.96
> 26.85   -4.05
> 26.98   -4.09
> 26.02   -5.69
> 26.53   -7.89
> 26.62   -2
> 26.8    -4.04
> 28.73   7
> 25.72   -2.97
> 26.14   -5.76
> 26.32   -3.89
> 26.4    0
> 26.32   5.88
>
>



[R] Query on contour plots

2020-05-31 Thread Neotropical bat risk assessments
Hi all,

While exploring  packages for 3D plots that several folks suggested (Tnx 
all!)
It seems what I really need is a contour plot.  This is not working in
the Deducer GUI.

This will be an aid to separating bats by their vocal signatures.
What I need to do is plot *Fc *against *Sc* with the third dimension 
being the *density* of the data points in the Fc-Sc plot.

Data format is like this abbreviated sample.  Fc is a frequency in kHz 
and Sc is the characteristic slope  (octaves per second) of each call pulse.

Any suggestions, guidance greatly appreciated.
Bruce

Fc  Sc
26.58   -5.95
27.03   -8.2
27.16   -2.07
26.19   -7.68
26.62   -3.99
26.85   -6.08
26.94   0
26.1    -5.74
26.62   -5.96
26.85   -4.05
26.98   -4.09
26.02   -5.69
26.53   -7.89
26.62   -2
26.8    -4.04
28.73   7
25.72   -2.97
26.14   -5.76
26.32   -3.89
26.4    0
26.32   5.88




Re: [R] Query on 3d plotting packages

2020-05-31 Thread Bert Gunter
1. Search
2. Search!
3. Search!!

"3D plotting" at rseek.org (or R 3-d plotting on google)
CRAN plotting task view:

https://CRAN.R-project.org/view=Graphics

... and you don't even necessarily need 3D plotting if you encode the
density with color à la heatmaps.
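
As a rough illustration of Bert's point (not code from the thread): a
bivariate density can be shown as a colour-encoded heatmap with MASS::kde2d,
assuming numeric vectors Fc and Sc like the poster's data:

```r
library(MASS)

# kernel density estimate on a 100 x 100 grid
dens <- kde2d(Fc, Sc, n = 100)

# colour encodes density; no 3D plotting needed
image(dens, col = hcl.colors(25, "YlOrRd", rev = TRUE),
      xlab = "Fc (kHz)", ylab = "Sc (octaves/s)")
contour(dens, add = TRUE)   # optional contour overlay
```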

Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Sun, May 31, 2020 at 8:05 AM Neotropical bat risk assessments <
neotropical.b...@gmail.com> wrote:

> Hi all,
>
> Fumbling around trying to find a plot package to do 3D plots.
> This will be an aid to separating bats by their vocal signatures.
> What I need to do is plot *Fc *against *Sc* with the third dimension
> being the *density* of the data points in the Fc-Sc plot.
>
> Data format is like this abbreviated sample.  Fc is a frequency in kHz
> and Sc is the characteristic slope  (octaves per second) of each call
> pulse.
>
> Any suggestions, guidance greatly appreciated.
> Bruce
>
> Fc  Sc
> 26.58   -5.95
> 27.03   -8.2
> 27.16   -2.07
> 26.19   -7.68
> 26.62   -3.99
> 26.85   -6.08
> 26.94   0
> 26.1    -5.74
> 26.62   -5.96
> 26.85   -4.05
> 26.98   -4.09
> 26.02   -5.69
> 26.53   -7.89
> 26.62   -2
> 26.8    -4.04
> 28.73   7
> 25.72   -2.97
> 26.14   -5.76
> 26.32   -3.89
> 26.4    0
> 26.32   5.88
>
>



[R] Query on 3d plotting packages

2020-05-31 Thread Neotropical bat risk assessments
Hi all,

Fumbling around trying to find a plot package to do 3D plots.
This will be an aid to separating bats by their vocal signatures.
What I need to do is plot *Fc *against *Sc* with the third dimension 
being the *density* of the data points in the Fc-Sc plot.

Data format is like this abbreviated sample.  Fc is a frequency in kHz 
and Sc is the characteristic slope  (octaves per second) of each call pulse.

Any suggestions, guidance greatly appreciated.
Bruce

Fc  Sc
26.58   -5.95
27.03   -8.2
27.16   -2.07
26.19   -7.68
26.62   -3.99
26.85   -6.08
26.94   0
26.1-5.74
26.62   -5.96
26.85   -4.05
26.98   -4.09
26.02   -5.69
26.53   -7.89
26.62   -2
26.8    -4.04
28.73   7
25.72   -2.97
26.14   -5.76
26.32   -3.89
26.4    0
26.32   5.88




Re: [R] Query about calculating the monthly average of daily data columns

2019-10-20 Thread Jim Lemon
Hi Subhamitra,
This is not the only way to do this, but if you only want the monthly
averages, it is simple:

# I had to change the "soft" tabs in your email to commas
# in order to read the data in
spdf<-read.table(text="PERMNO,DATE,Spread
111,19940103,0.025464308
111,19940104,0.064424296
111,19940105,0.018579337
111,19940106,0.018872211
111,19940107,0.065279782
111,19940110,0.063485905
111,19940111,0.018355453
111,19940112,0.064135683
111,19940113,0.063519987
111,19940114,0.018277351
111,19940117,0.018628417
111,19940118,0.065630229
111,19940119,0.018713152
111,19940120,0.019119037
111,19940121,0.068342043
111,19940124,0.020843244
111,19940125,0.019954211
111,19940126,0.018980321
111,19940127,0.066827165
111,19940128,0.067459235
111,19940131,0.068682559
111,19940201,0.02081465
111,19940202,0.068236091
111,19940203,0.068821406
111,19940204,0.020075648
111,19940207,0.066070584
111,19940208,0.066068837
111,19940209,0.019077072
111,19940210,0.065894875
111,19940211,0.018847478
111,19940214,0.065040844
111,19940215,0.01880332
111,19940216,0.018836199
111,19940217,0.06665
111,19940218,0.067116793
111,19940221,0.068809742
111,19940222,0.068230213
111,19940223,0.069502855
111,19940224,0.070383523
111,19940225,0.020430811
111,19940228,0.067087257
111,19940301,0.066776479
111,19940302,0.019959031
111,19940303,0.066596469
111,19940304,0.019131334
111,19940307,0.019312528
111,19940308,0.067349909
111,19940309,0.068916431
111,19940310,0.068620043
111,19940311,0.070494844
111,19940314,0.071056842
111,19940315,0.071042517
111,19940316,0.072401771
111,19940317,0.071940001
111,19940318,0.07352884
111,19940321,0.072671688
111,19940322,0.072652595
111,19940323,0.021352138
111,19940324,0.069933727
111,19940325,0.068717467
111,19940328,0.020470748
111,19940329,0.020003748
111,19940330,0.065833717
111,19940331,0.065268388
111,19940401,0.018762356
111,19940404,0.064914179
111,19940405,0.064706743
111,19940406,0.018764175
111,19940407,0.06524806
111,19940408,0.018593449
111,19940411,0.064913949
111,19940412,0.01872089
111,19940413,0.018729328
111,19940414,0.018978773
111,19940415,0.065477137
111,19940418,0.064614365
111,19940419,0.064184148
111,19940420,0.018553192
111,19940421,0.066872771
111,19940422,0.06680782
111,19940425,0.067467961
111,19940426,0.02014297
111,19940427,0.062464016
111,19940428,0.062357052
112,19940429,0.000233993
112,19940103,0.000815264
112,19940104,0.000238165
112,19940105,0.000813632
112,19940106,0.000236915
112,19940107,0.000809102
112,19940110,0.000801642
112,19940111,0.000797932
112,19940112,0.000795251
112,19940113,0.000795186
112,19940114,0.000231359
112,19940117,0.000232134
112,19940118,0.000233718
112,19940119,0.000233993
112,19940120,0.000234694
112,19940121,0.000235753
112,19940124,0.000808653
112,19940125,0.000235604
112,19940126,0.000805068
112,19940127,0.000802337
112,19940128,0.000801768
112,19940131,0.000233517
112,19940201,0.000797431
112,19940202,0.00028
112,19940203,0.000233826
112,19940204,0.000799519
112,19940207,0.000798105
112,19940208,0.000792245
112,19940209,0.000231113
112,19940210,0.000233413
112,19940211,0.000798168
112,19940214,0.000233282
112,19940215,0.000797848
112,19940216,0.000785165
112,19940217,0.000228426
112,19940218,0.000786783
112,19940221,0.00078343
112,19940222,0.000781459
112,19940223,0.000776264
112,19940224,0.000226399
112,19940225,0.000779066
112,19940228,0.000773603
112,19940301,0.000226487
112,19940302,0.000775233
112,19940303,0.000227017
112,19940304,0.000227854
112,19940307,0.000782814
112,19940308,0.000229164
112,19940309,0.000787033
112,19940310,0.000784049
112,19940311,0.000228984
112,19940314,0.00078697
112,19940315,0.000782567
112,19940316,0.000228516
112,19940317,0.000786347
112,19940318,0.000229236
112,19940321,0.000230107
112,19940322,0.000792689
112,19940323,0.000787284
112,19940324,0.000787221
112,19940325,0.000227978",
header=TRUE,sep=",",stringsAsFactors=FALSE)
# split the year and month out of the date string
# as you have more than one year in your complete
# data set
spdf$yrmon<-substr(spdf$DATE,1,6)
# get the mean for each PERMNO and year/month
by(spdf$Spread,spdf[,c("PERMNO","yrmon")],mean)
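
If a data frame is more convenient than the `by` object (e.g. for export),
the same summary can be obtained with aggregate() -- a sketch using the
spdf data frame built above:

```r
# one row per PERMNO x year/month, holding the mean Spread
monthly <- aggregate(Spread ~ PERMNO + yrmon, data = spdf, FUN = mean)
head(monthly)
```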

Jim

On Sun, Oct 20, 2019 at 11:09 PM Subhamitra Patra <
subhamitra.pa...@gmail.com> wrote:

>
> Here I am asking one more question (just for learning purposes): if my
> country name and its respective variable are in panel format, and I want
> to take the monthly average for each country, how should the code be
> arranged? For your convenience, I am providing a small data sample below.
>
>
>



Re: [R] Query about calculating the monthly average of daily data columns

2019-10-20 Thread Rui Barradas
42863, 10.3509617168731,
9.09646558899397, 11.270647314, 11.3984335011704, 11.4808985388742,
10.5608771133999, 10.3684356806175, 10.4815588822618, 10.5818867877558,
12.2561035284691, 8.6464271477849, 10.3412351841865, 10.7577574534162,
11.1124067479261, 9.91627943243343, 10.6356898895291, 10.2107566441478,
10.0672734202575, 10.2385787014999, 11.7112606160069, 10.0453801263575,
8.84654136100724, 10.2173421609193, 9.27919801705716, 10.4755578829547,
7.69340209082122, 9.24705253848083, 10.8415406794597, 8.69603117680965,
11.2589214416702, 10.5425642239737, 10.1389355042458, 9.17267675180435,
12.3052338002213, 10.0181674985326, 12.2715476751051, 9.64516268052739,
10.6305299379912, 10.1829347684655, 9.97983942366781, 10.2559847744715,
10.309221814, 9.75215330673072, 10.250464278709, 9.31118800061454,
10.3310666767171, 9.09703848990093, 10.241195105962, 8.57290406448477,
8.98090855172704, 8.64653101832229, 12.6791587435376, 9.56000538681993,
10.4062255533723, 11.067091476284, 10.5255014737268, 10.2240941949978,
9.13081571869084, 9.5942352120783, 9.2753466212409, 10.2789293993548,
8.10255065585342, 9.48751297655077, 8.51198576785003, 9.46310532206947,
9.86727270762806, 11.5149248124739, 9.31557156735022, 9.34351230206303,
10.022139448869, 11.4111350893792, 8.57891783464065, 10.3761090924661,
9.38300408584683, 9.33694577526158, 9.2581686085, 9.29856853889735,
8.4250073823245, 8.83022950824832, 9.1510846172981, 10.2553042376765,
10.0739540955956, 9.04955917463259, 10.8927827168631, 9.44611041690694,
10.7883395708593, 10.6010088332078, 7.72560864006592, 10.1760839916637,
11.5576569894392, 11.384809257294, 8.73504353987083, 9.00585942714512,
9.62327893504013, 10.3527072699866, 10.5220100705827, 8.74921668696853,
8.56415116683662, 12.1348451793815, 10.9496674323819, 9.64443817181322,
9.52977454697087, 10.4281877186725, 8.52701721410292, 11.6911584965782,
10.2300108250139, 8.65368821276485, 11.7733431942379, 10.2060233777681,
9.57291673029552, 9.82687667895106, 10.5939736188493, 11.2510605726337,
10.3383384488323, 9.92301237292945, 10.0164623230529, 10.4939857044034,
10.5631769648289, 10.935731043532, 11.0659359187168, 8.51697010486427,
9.79512310587405, 9.35132038807071, 11.3286703149903, 10.4621597293933,
10.4099459919071, 8.86246315190942, 9.30054044639769, 9.40346575227191,
9.59278722974697)), row.names = c(NA, -260L), class = "data.frame")





From: Subhamitra Patra <mailto:subhamitra.pa...@gmail.com>
Sent: Friday, September 13, 2019 3:59 PM
To: PIKAL Petr <mailto:petr.pi...@precheza.cz>; r-help mailing list
<mailto:r-help@r-project.org>
Subject: Re: [R] Query about calculating the monthly average of daily

data

columns

Dear PIKAL,

Thank you very much for your suggestion.

I tried your previously suggested code and am getting the average value for
each month for both countries A and B. But in your recent email, you suggest
not changing the date column to a real date. If I go through your recently
suggested code, i.e.

  "aggregate(value column, list(format(date column, "%m.%Y"), country
column), mean)"

I am getting an error: "aggregate(value, list(format(date, "%m.%Y"),
country), mean) : object 'value' not found".

Here my question is: "do I need to define the date column, country column, and
value column separately?"

Further, I need the average-value result arranged like the example below in a
data frame

Month   Country A   Country B
Jan 1994    26.66       35.78
Feb 1994    26.13       29.14

so that it will be easy for me to export to excel, and to use for the
further calculations.

Please suggest me in this regard.

Thank you.











On Fri, Sep 13, 2019 at 7:03 PM PIKAL Petr 
petr.pi...@precheza.cz>

wrote:
Hi

I am almost 100% sure that you would spare yourself much trouble if you
changed your date column to a real date

?as.Date

reshape your wide format to long one
library(reshape2)
?melt

to get 3 column data.frame with one date column, one country column and
one value column

use ?aggregate and ?format to get summary value

something like
aggregate(value column, list(format(date column, "%m.%Y"), country
column), mean)

But if you insist on scratching your left ear with your right hand across
your head, you could continue your way.

Cheers
Petr
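
Petr's melt-then-aggregate pipeline, sketched on a tiny invented wide data
set (the column names here are assumptions, not the poster's actual ones):

```r
library(reshape2)

wide <- data.frame(date = as.Date(c("1994-01-03", "1994-01-04", "1994-02-01")),
                   countryA = c(26.5, 26.8, 26.1),
                   countryB = c(35.2, 36.3, 29.1))

# wide -> long: one row per (date, country)
long <- melt(wide, id.vars = "date",
             variable.name = "country", value.name = "value")

# monthly mean per country; the columns are referenced explicitly, which
# avoids the "object 'value' not found" error reported later in the thread
aggregate(long$value,
          by = list(month = format(long$date, "%m.%Y"),
                    country = long$country),
          FUN = mean)
```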


-Original Message-
From: R-help <mailto:r-help-boun...@r-project.org> On Behalf Of

Subhamitra

Patra
Sent: Friday, September 13, 2019 3:20 PM
To: Jim Lemon <mailto:drjimle...@gmail.com>; r-help mailing list


> <r-help@r-project.org>
Subject: Re: [R] Query about calculating the monthly average of daily

data

columns

Dear Sir,

Yes, I understood the logic. But, still, I have a few queries that I

mentioned

below your answers.

"# if
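
The Month-by-country layout requested above can be produced by pivoting the
aggregated long result back to wide form; a sketch with reshape2::dcast on
hypothetical column names:

```r
library(reshape2)

# long result as it might come out of aggregate(): month, country, value
res <- data.frame(month   = c("Jan 1994", "Jan 1994", "Feb 1994", "Feb 1994"),
                  country = c("A", "B", "A", "B"),
                  value   = c(26.66, 35.78, 26.13, 29.14))

# one row per month, one column per country
dcast(res, month ~ country, value.var = "value")
```

write.csv() on the result then gives the Excel-ready file the poster asked for.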

Re: [R] Query about calculating the monthly average of daily data columns

2019-10-20 Thread jim holtman

Re: [R] Query about calculating the monthly average of daily data columns

2019-10-20 Thread Subhamitra Patra
>
>
>
>
>
> From: Subhamitra Patra <mailto:subhamitra.pa...@gmail.com>
> Sent: Friday, September 13, 2019 3:59 PM
> To: PIKAL Petr <mailto:petr.pi...@precheza.cz>; r-help mailing list
> <mailto:r-help@r-project.org>
> Subject: Re: [R] Query about calculating the monthly average of daily data
> columns
>
> Dear PIKAL,
>
> Thank you very much for your suggestion.
>
> I tried your previous suggested code and getting the average value for
> each month for both country A, and B. But in your recent email, you are
> suggesting not to change the date column to real date. If I am going
> through your recently suggested code, i.e.
>
>  "aggregate(value column, list(format(date column, "%m.%Y"), country
> column), mean)"
>
> I am getting an Error that "aggregate(value, list(format(date, "%m.%Y"),
> country), mean) : object 'value' not found".
>
> Here, my query "may I need to define the date column, country column, and
> value column separately?"
>
> Further, I need something the average value result like below in the data
> frame
>
> Month   Country A   Country B
> Jan 1994    26.66     35.78
> Feb 1994    26.13     29.14
>
> so that it will be easy for me to export to excel, and to use for the
> further calculations.
>
> Please suggest me in this regard.
>
> Thank you.
>
>
>
>
>
>
>
>
>
> On Fri, Sep 13, 2019 at 7:03 PM PIKAL Petr <mailto:petr.pi...@precheza.cz>
> wrote:
> Hi
>
> I am almost 100% sure that you would spare yourself much trouble if you
> changed your date column to real date
>
> ?as.Date
>
> reshape your wide format to long one
> library(reshape2)
> ?melt
>
> to get 3 column data.frame with one date column, one country column and
> one value column
>
> use ?aggregate and ?format to get summary value
>
> something like
> aggregate(value column, list(format(date column, "%m.%Y"), country
> column), mean)
>
> But if you insist on scratching your left ear with your right hand across
> your head, you could continue your way.
>
> Cheers
> Petr
>
> > -Original Message-
> > From: R-help <mailto:r-help-boun...@r-project.org> On Behalf Of
> Subhamitra
> > Patra
> > Sent: Fri

Re: [R] Query about calculating the monthly average of daily data columns

2019-09-16 Thread PIKAL Petr





From: Subhamitra Patra <mailto:subhamitra.pa...@gmail.com> 
Sent: Friday, September 13, 2019 3:59 PM
To: PIKAL Petr <mailto:petr.pi...@precheza.cz>; r-help mailing list 
<mailto:r-help@r-project.org>
Subject: Re: [R] Query about calculating the monthly average of daily data 
columns

Dear PIKAL,

Thank you very much for your suggestion.

I tried the code you previously suggested and am getting the average value for each 
month for both country A and B. But in your recent email, you are suggesting 
not to change the date column to a real date. If I go through your recently 
suggested code, i.e.

 "aggregate(value column, list(format(date column, "%m.%Y"), country column), 
mean)"

I am getting an error: "aggregate(value, list(format(date, "%m.%Y"), 
country), mean) : object 'value' not found". 

Here, my query is: "do I need to define the date column, country column, and value 
column separately?"

Further, I need the average results arranged in a data frame like the one below

Month       Country A   Country B
Jan 1994    26.66         35.78
Feb 1994    26.13         29.14

so that it will be easy for me to export to Excel and to use for further 
calculations.

Please advise me in this regard.

Thank you.








On Fri, Sep 13, 2019 at 7:03 PM PIKAL Petr <mailto:petr.pi...@precheza.cz> 
wrote:
Hi

Re: [R] Query about calculating the monthly average of daily data columns

2019-09-13 Thread Jim Lemon
Sorry, forgot to include the list.

On Sat, Sep 14, 2019 at 10:27 AM Jim Lemon  wrote:
>
> See inline
>
> On Fri, Sep 13, 2019 at 11:20 PM Subhamitra Patra 
>  wrote:
>>
>> Dear Sir,
>>
>> Yes, I understood the logic. But, still, I have a few queries that I 
>> mentioned below your answers.
>>
>>> "# if you only have to get the monthly averages, it can be done this way
>>> spdat$month<-sapply(strsplit(spdat$dates,"-"),"[",2)
>>> spdat$year<-sapply(strsplit(spdat$dates,"-"),"[",3)"
>>>
>>> B. Here, I need to define the no. of months, and years separately, right? 
>>> or else what 2, and 3 (in bold) indicates?
>>
>>
>> To get the grouping variable of sequential months that you want, you only 
>> need the month and year values of the dates in the first column. First I 
>> used the "strsplit" function to split the date field at the hyphens, then 
>> used "sapply" to extract ("[") the second (month) and third (year) parts as 
>> two new columns. Because you have more than one year of data, you need the 
>> year values or you will group all Januarys, all Februarys and so on. Notice 
>> how I pass both of the new columns as a list (a data frame is a type of 
>> list) in the call to get the mean of each month.
>>
>> 1. Here, as per my understanding, the "3" indicates the 3rd year, right? 
>> But, you showed an average for 2 months of the same year. Then, what "3" in 
>> the  spdat$year object indicate?
>
>
> No, as I explained in the initial email and below, the "strsplit" function 
> takes one or more strings (your dates) and breaks them at the specified 
> character ("-"), So
>
> strsplit("1-1-1994","-")
> [[1]]
> [1] "1"    "1"    "1994"
>
> That is passed to the "sapply" function that applies the extraction ("[") 
> operator to the result of "strsplit". The "3" indicates that you want to 
> extract the third element, in this case, the year.
>
> > sapply(strsplit("1-1-1994","-"),"[",3)
> [1] "1994"
>
> So by splitting the dates and extracting the second (month) and third (year) 
> element from each date, we have all the information needed to create a 
> grouping variable for monthly averages.
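The split-and-extract step described above can be sketched end to end; a minimal runnable example (the dates below are invented, not taken from the poster's file):

```r
# Split each date string at the hyphens, then pull out the pieces.
dates <- c("1-1-1994", "15-2-1994", "3-1-1995")   # invented example dates
parts <- strsplit(dates, "-")       # list of c(day, month, year) vectors
month <- sapply(parts, "[", 2)      # "[" with index 2 extracts the month
year  <- sapply(parts, "[", 3)      # index 3 extracts the year
paste(month, year, sep = "-")       # "1-1994" "2-1994" "1-1995"
```

Combining the two extracted parts this way gives one label per calendar month, which is exactly what the grouping needs.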
>
>>
>>
>>> C. From this part, I got the exact average values of both January and 
>>> February of 1994 for country A, and B. But, in code, I have a query that I 
>>> need to define  spdat$returnA, and  spdat$returnB separately before writing 
>>> this code, right? Like this, I need to define for each 84 countries 
>>> separately with their respective number of months, and years before writing 
>>> this code, right?
>>
>>
>> I don't think so. Because I don't know what your data looks like, I am 
>> guessing that for each row, it has columns for each of the 84 countries. I 
>> don't know what these columns are named, either. Maybe:
>>
>> date Australia   Belarus   ...Zambia
>> 01/01/1994   20 21 22
>> ...
>>
>> Here, due to my misunderstanding about the code, I was wrong. But, what data 
>> structure you guessed, it is absolutely right that for each row, I have 
>> columns for each of the 84 countries. So, I think, I need to define the date 
>> column with no. of months, and years once for all the countries. Therefore, 
>> I got my answer to the first and third question in the previous email (what 
>> you suggested) that I no need to define the column of each country, as the 
>> date, and no. of observations are same for all countries. But, the no. of 
>> days are different for each month, and similarly, for each year. So, I think 
>> I need to define date for each year separately.  Hence, I have given an 
>> example of 12 months, for 2 years (i.e. 1994, and 1995), and have written 
>> the following code. Please correct me in case I am wrong.
>>
>>  spdat<-data.frame(
>>   
>> dates=paste(c(1:21,1:20,1:23,1:21,1:22,1:22,1:21,1:23,1:22,1:21,1:22,1:22),c(rep(1,21),rep(2,20),rep(3,23),
>>  rep(4,21), 
>> rep(5,22),rep(6,22),rep(7,21),rep(8,23),rep(9,22),rep(10,21),rep(11,22),rep(12,22)),rep(1994,260)
>>  
>> dates1=paste(c(1:22,1:20,1:23,1:20,1:23,1:22,1:21,1:23,1:21,1:22,1:22,1:21),c(rep(1,22),rep(2,20),rep(3,23),
>>  rep(4,20), 
>> rep(5,23),rep(6,22),rep(7,21),rep(8,23),rep(9,21),rep(10,21),rep(11,22),rep(12,21)),rep(1995,259)
>>  ,sep="-")
>>
> First, you don't have to recreate the data that you already have. I did 
> because I don't have it and have to guess what it looks like. Remember 
> neither I nor any of the others who have offered help have your data or even 
> a representative sample. If you tried the code above, you surely must know 
> that it doesn't work. I could create code that would produce the dates from 
> 1-1-1994 to 31/12/1995 or any other stretch you would like, but it would only 
> confuse you more.  _You already have the dates in your data file._ What I 
> have shown you is how to use those dates to create the grouping variable that 
> you want.
>
>> Concerning the exporting of structure of the dataset to excel, I will have 
>> 12*84 matrix. But, please 

Re: [R] Query about calculating the monthly average of daily data columns

2019-09-13 Thread Subhamitra Patra
Dear PIKAL,

Thank you very much for your suggestion.

I tried the code you previously suggested and am getting the average value for each
month for both country A and B. But in your recent email, you are
suggesting not to change the date column to a real date. If I go
through your recently suggested code, i.e.

 "aggregate(value column, list(format(date column, "%m.%Y"), country
column), mean)"

I am getting an Error that "*aggregate(value, list(format(date, "%m.%Y"),
country), mean) : **object 'value' not found"*.

Here, my query is: "*do I need to define the date column, country column, and
value column separately?"*

Further, I need the average results arranged in a data frame like the one
below

Month       Country A   Country B
Jan 1994    26.66       35.78
Feb 1994    26.13       29.14

so that it will be easy for me to export to Excel and to use for
further calculations.

Please advise me in this regard.

Thank you.








On Fri, Sep 13, 2019 at 7:03 PM PIKAL Petr  wrote:

> Hi
>
> I am almost 100% sure that you would spare yourself much trouble if you
> changed your date column to real date
>
> ?as.Date
>
> reshape your wide format to long one
> library(reshape2)
> ?melt
>
> to get 3 column data.frame with one date column, one country column and
> one value column
>
> use ?aggregate and ?format to get summary value
>
> something like
> aggregate(value column, list(format(date column, "%m.%Y"), country
> column), mean)
>
> But if you insist on scratching your left ear with your right hand across
> your head, you could continue your way.
>
> Cheers
> Petr
>
> > -Original Message-
> > From: R-help  On Behalf Of Subhamitra
> > Patra
> > Sent: Friday, September 13, 2019 3:20 PM
> > To: Jim Lemon; r-help mailing list <r-help@r-project.org>
> > Subject: Re: [R] Query about calculating the monthly average of daily
> data
> > columns
> >
> > Dear Sir,
> >
> > Yes, I understood the logic. But, still, I have a few queries that I
> mentioned
> > below your answers.
> >
> > "# if you only have to get the monthly averages, it can be done this way
> > > spdat$month<-sapply(strsplit(spdat$dates,"-"),"["*,2*)
> > > spdat$year<-sapply(strsplit(spdat$dates,"-"),"[",*3*)"
> > >
> > > B. Here, I need to define the no. of months, and years separately,
> right?
> > > or else what 2, and 3 (in bold) indicates?
> > >
> >
> > To get the grouping variable of sequential months that you want, you only
> > need the month and year values of the dates in the first column. First I
> used
> > the "strsplit" function to split the date field at the hyphens, then used
> > "sapply" to extract ("[") the second (month) and *third (year)* parts as
> two
> > new columns. Because you have more than one year of data, you need the
> > year values or you will group all Januarys, all Februarys and so on.
> > Notice how I pass both of the new columns as a list (a data frame is a
> type of
> > list) in the call to get the mean of each month.
> >
> > 1. Here, as per my understanding, the "3" indicates the 3rd year, right?
> > But, you showed an average for 2 months of the same year. Then, what "3"
> > in the  spdat$year object indicate?
> >
> >
> > C. From this part, I got the exact average values of both January and
> > > February of 1994 for country A, and B. But, in code, I have a query
> > > that I need to define  spdat$returnA, and  spdat$returnB separately
> > > before writing this code, right? Like this, I need to define for each
> > > 84 countries separately with their respective number of months, and
> > > years before writing this code, right?
> > >
> >
> > I don't think so. Because I don't know what your data looks like, I am
> > guessing that for each row, it has columns for each of the 84 countries.
> I
> > don't know what these columns are named, either. Maybe:
> >
> > date Australia   Belarus   ...Zambia
> > 01/01/1994   20 21 22
> > ...
> >
> > Here, due to my misunderstanding about the code, I was wrong. But, what
> > data structure you guessed, it is absolutely right that for each row, I
> have
> > columns for ea

Re: [R] Query about calculating the monthly average of daily data columns

2019-09-13 Thread PIKAL Petr
Hi

I am almost 100% sure that you would spare yourself much trouble if you changed 
your date column to real date

?as.Date

reshape your wide format to long one
library(reshape2)
?melt

to get 3 column data.frame with one date column, one country column and one 
value column

use ?aggregate and ?format to get summary value

something like
aggregate(value column, list(format(date column, "%m.%Y"), country column), 
mean)

But if you insist on scratching your left ear with your right hand across your head, 
you could continue your way.

Cheers
Petr
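Petr's recipe above, put together on invented data (this assumes the reshape2 package is installed; the country names and values are made up for illustration):

```r
library(reshape2)

# Invented wide data: one date column plus one value column per country
spdat <- data.frame(
  dates    = c("1-1-1994", "2-1-1994", "1-2-1994", "2-2-1994"),
  CountryA = c(20, 22, 30, 32),
  CountryB = c(10, 12, 40, 42),
  stringsAsFactors = FALSE)

# 1. character -> real Date
spdat$dates <- as.Date(spdat$dates, format = "%d-%m-%Y")

# 2. wide -> long: one row per (date, country) pair
long <- melt(spdat, id.vars = "dates", variable.name = "country")

# 3. monthly mean per country
res <- aggregate(long$value,
                 list(month = format(long$dates, "%m.%Y"),
                      country = long$country),
                 mean)
res
```

With the long format, the same three lines cover any number of countries, so nothing has to be defined per country.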

> -Original Message-
> From: R-help  On Behalf Of Subhamitra
> Patra
> Sent: Friday, September 13, 2019 3:20 PM
> To: Jim Lemon; r-help mailing list <r-help@r-project.org>
> Subject: Re: [R] Query about calculating the monthly average of daily data
> columns
>
> Dear Sir,
>
> Yes, I understood the logic. But, still, I have a few queries that I mentioned
> below your answers.
>
> "# if you only have to get the monthly averages, it can be done this way
> > spdat$month<-sapply(strsplit(spdat$dates,"-"),"["*,2*)
> > spdat$year<-sapply(strsplit(spdat$dates,"-"),"[",*3*)"
> >
> > B. Here, I need to define the no. of months, and years separately, right?
> > or else what 2, and 3 (in bold) indicates?
> >
>
> To get the grouping variable of sequential months that you want, you only
> need the month and year values of the dates in the first column. First I used
> the "strsplit" function to split the date field at the hyphens, then used
> "sapply" to extract ("[") the second (month) and *third (year)* parts as two
> new columns. Because you have more than one year of data, you need the
> year values or you will group all Januarys, all Februarys and so on.
> Notice how I pass both of the new columns as a list (a data frame is a type of
> list) in the call to get the mean of each month.
>
> 1. Here, as per my understanding, the "3" indicates the 3rd year, right?
> But, you showed an average for 2 months of the same year. Then, what "3"
> in the  spdat$year object indicate?
>
>
> C. From this part, I got the exact average values of both January and
> > February of 1994 for country A, and B. But, in code, I have a query
> > that I need to define  spdat$returnA, and  spdat$returnB separately
> > before writing this code, right? Like this, I need to define for each
> > 84 countries separately with their respective number of months, and
> > years before writing this code, right?
> >
>
> I don't think so. Because I don't know what your data looks like, I am
> guessing that for each row, it has columns for each of the 84 countries. I
> don't know what these columns are named, either. Maybe:
>
> date Australia   Belarus   ...Zambia
> 01/01/1994   20 21 22
> ...
>
> Here, due to my misunderstanding about the code, I was wrong. But, what
> data structure you guessed, it is absolutely right that for each row, I have
> columns for each of the 84 countries. So, I think, I need to define the date
> column with no. of months, and years once for all the countries.
> Therefore, I got my answer to the first and third question in the previous
> email (what you suggested) that I no need to define the column of each
> country, as the date, and no. of observations are same for all countries.
> But, the no. of days are different for each month, and similarly, for each
> year. So, I think I need to define date for each year separately.  Hence, I 
> have
> given an example of 12 months, for 2 years (i.e. 1994, and 1995), and have
> written the following code. Please correct me in case I am wrong.
>
>  spdat<-data.frame(
>
> dates=paste(c(1:21,1:20,1:23,1:21,1:22,1:22,1:21,1:23,1:22,1:21,1:22,1:22),c(r
> ep(1,21),rep(2,20),
> rep(3,23), rep(4,21),
> rep(5,22),rep(6,22),rep(7,21),rep(8,23),rep(9,22),rep(10,21),rep(11,22),rep(12
> ,22)
> ),rep(1994,260)
>  dates1=
> paste(c(1:22,1:20,1:23,1:20,1:23,1:22,1:21,1:23,1:21,1:22,1:22,1:21),c(rep(1,2
> 2),rep(2,20),
> rep(3,23), rep(4,20),
> rep(5,23),rep(6,22),rep(7,21),rep(8,23),rep(9,21),rep(10,21),rep(11,22),rep(12
> ,21)
> ),rep(1995,259) ,sep="-")
>
> Concerning the exporting of structure of the dataset to excel, I will have
> 12*84 matrix. But, please suggest me the way to proceed for the large
> sample. I have mentioned below what I understood from your code. Please
> correct me if I am wrong.
> 1. I need to define the date for each year as the no. of days in each month
> are different for each year (as mentioned in my above code). For instance, in
> my data file, Jan 1994 has 21 days wh

Re: [R] Query about calculating the monthly average of daily data columns

2019-09-13 Thread Subhamitra Patra
Dear Sir,

Yes, I understood the logic. But, still, I have a few queries that I
mentioned below your answers.

"# if you only have to get the monthly averages, it can be done this way
> spdat$month<-sapply(strsplit(spdat$dates,"-"),"["*,2*)
> spdat$year<-sapply(strsplit(spdat$dates,"-"),"[",*3*)"
>
> B. Here, I need to define the no. of months, and years separately, right?
> or else what 2, and 3 (in bold) indicates?
>

To get the grouping variable of sequential months that you want, you only
need the month and year values of the dates in the first column. First I
used the "strsplit" function to split the date field at the hyphens, then
used "sapply" to extract ("[") the second (month) and *third (year)* parts
as two new columns. Because you have more than one year of data, you need
the year values or you will group all Januarys, all Februarys and so on.
Notice how I pass both of the new columns as a list (a data frame is a type
of list) in the call to get the mean of each month.

1. Here, as per my understanding, the "3" indicates the 3rd year, right?
But, you showed an average for 2 months of the same year. Then, what "3" in
the  spdat$year object indicate?


C. From this part, I got the exact average values of both January and
> February of 1994 for country A, and B. But, in code, I have a query that I
> need to define  spdat$returnA, and  spdat$returnB separately before writing
> this code, right? Like this, I need to define for each 84 countries
> separately with their respective number of months, and years before writing
> this code, right?
>

I don't think so. Because I don't know what your data looks like, I am
guessing that for each row, it has columns for each of the 84 countries. I
don't know what these columns are named, either. Maybe:

date Australia   Belarus   ...Zambia
01/01/1994   20 21 22
...

Here, due to my misunderstanding about the code, I was wrong. But, what
data structure you guessed, it is absolutely right that for each row, I
have columns for each of the 84 countries. So, I think, I need to define
the date column with no. of months, and years once for all the countries.
Therefore, I got my answer to the first and third questions in the previous
email (what you suggested): I do not need to define the column of each
country, as the date and number of observations are the same for all countries.
But, the no. of days are different for each month, and similarly, for each
year. So, I think I need to define date for each year separately.  Hence, I
have given an example of 12 months, for 2 years (i.e. 1994, and 1995), and
have written the following code. Please correct me in case I am wrong.

 spdat<-data.frame(

dates=paste(c(1:21,1:20,1:23,1:21,1:22,1:22,1:21,1:23,1:22,1:21,1:22,1:22),c(rep(1,21),rep(2,20),
rep(3,23), rep(4,21),
rep(5,22),rep(6,22),rep(7,21),rep(8,23),rep(9,22),rep(10,21),rep(11,22),rep(12,22)
),rep(1994,260)
 dates1=
paste(c(1:22,1:20,1:23,1:20,1:23,1:22,1:21,1:23,1:21,1:22,1:22,1:21),c(rep(1,22),rep(2,20),
rep(3,23), rep(4,20),
rep(5,23),rep(6,22),rep(7,21),rep(8,23),rep(9,21),rep(10,21),rep(11,22),rep(12,21)
),rep(1995,259) ,sep="-")

Concerning the export of the dataset's structure to Excel, I will have a
12*84 matrix. But please suggest a way to proceed for the large
sample. I have mentioned below what I understood from your code. Please
correct me if I am wrong.
1. I need to define the date for each year as the no. of days in each month
are different for each year (as mentioned in my above code). For instance,
in my data file, Jan 1994 has 21 days while Jan 1995 has 22 days.
2. Need to define the date column as character.
3. Need to define the monthly average for each month, and year. So, now
code will be as follows.
spdat$month<-sapply(strsplit(spdat$dates,"-"),"[",2,3,4,5,6,7,8,9,10,11,12)
  As I need all months average sequentially.
spdat$year<-sapply(strsplit(spdat$dates,"-"),"[",3)

Here, this meaning of "3", I am really unable to get.

4. Need to define each country with each month and year as mentioned in the
last part of your code.

Please advise me in this regard.

Thank you.








On Fri, Sep 13, 2019 at 4:24 PM Jim Lemon  wrote:

> Hi Subhamitra,
> I'll try to write my answers adjacent to your questions below.
>
> On Fri, Sep 13, 2019 at 6:08 PM Subhamitra Patra <
> subhamitra.pa...@gmail.com> wrote:
>
>> Dear Sir,
>>
>> Thank you very much for your suggestion.
>>
>> Yes, your suggested code worked. But, actually, I have data from 3rd
>> January 1994 to 3rd August 2017 for very large (i.e. for 84 countries)
>> sample. From this, I have given the example of the years up to 2000. Before
>> applying the same code for the long 24 years, I want to learn the logic
>> 

Re: [R] Query about calculating the monthly average of daily data columns

2019-09-13 Thread Jim Lemon
Hi Subhamitra,
I'll try to write my answers adjacent to your questions below.

On Fri, Sep 13, 2019 at 6:08 PM Subhamitra Patra 
wrote:

> Dear Sir,
>
> Thank you very much for your suggestion.
>
> Yes, your suggested code worked. But, actually, I have data from 3rd
> January 1994 to 3rd August 2017 for very large (i.e. for 84 countries)
> sample. From this, I have given the example of the years up to 2000. Before
> applying the same code for the long 24 years, I want to learn the logic
> behind the code. Actually, some part of the code is not understandable to
> me which I mentioned in the bold letter as follows.
>
> "spdat<-data.frame(
>   dates=paste(c(1:30,1:28),c(rep(1,30),rep(2,28)),rep(1994,58),sep="-"),
>   returnA=sample(*15:50*,58,TRUE),returnB=sample(*10:45*,58,TRUE))"
>
> A. Here, I need to define the no. of days in a month, and the no. of
> countries name separately, right? But, what is meant by 15:50, and 10:45 in
> return A, and B respectively?
>

To paraphrase Donald Trump, this is FAKE DATA! I have no idea what the real
values of return are, so I made them up using the "sample" function.
However, this is not meant to mislead anyone, just to show how whatever
numbers are in your data can be used in calculations. The colon (":")
operator creates a sequence of numbers starting with the one to the left
and ending with the one to the right.

>
> "# if you only have to get the monthly averages, it can be done this way
> spdat$month<-sapply(strsplit(spdat$dates,"-"),"["*,2*)
> spdat$year<-sapply(strsplit(spdat$dates,"-"),"[",*3*)"
>
> B. Here, I need to define the no. of months, and years separately, right?
> or else what 2, and 3 (in bold) indicates?
>

To get the grouping variable of sequential months that you want, you only
need the month and year values of the dates in the first column. First I
used the "strsplit" function to split the date field at the hyphens, then
used "sapply" to extract ("[") the second (month) and third (year) parts as
two new columns. Because you have more than one year of data, you need the
year values or you will group all Januarys, all Februarys and so on. Notice
how I pass both of the new columns as a list (a data frame is a type of
list) in the call to get the mean of each month.
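The paragraph above can be reduced to a tiny runnable sketch (the data is invented): the two extracted columns are passed together as one data frame, which `by` treats as the list of grouping factors.

```r
spdat <- data.frame(
  dates   = c("1-1-1994", "2-1-1994", "1-2-1994", "2-2-1994"),
  returnA = c(20, 22, 30, 32),
  stringsAsFactors = FALSE)
spdat$month <- sapply(strsplit(spdat$dates, "-"), "[", 2)
spdat$year  <- sapply(strsplit(spdat$dates, "-"), "[", 3)
# Both grouping columns are supplied at once as spdat[, c("month", "year")]
monthlyA <- by(spdat$returnA, spdat[, c("month", "year")], mean)
monthlyA  # month 1 / 1994: 21, month 2 / 1994: 31
```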

>
> "# get the averages by month and year - is this correct?
> monthlyA<-by(*spdat$returnA*,spdat[,c("month","year")],mean)
> monthlyB<-by(*spdat$returnB*,spdat[,c("month","year")],mean)"
>
> C. From this part, I got the exact average values of both January and
> February of 1994 for country A, and B. But, in code, I have a query that I
> need to define  spdat$returnA, and  spdat$returnB separately before writing
> this code, right? Like this, I need to define for each 84 countries
> separately with their respective number of months, and years before writing
> this code, right?
>

I don't think so. Because I don't know what your data looks like, I am
guessing that for each row, it has columns for each of the 84 countries. I
don't know what these columns are named, either. Maybe:

date Australia   Belarus   ...Zambia
01/01/1994   20 21 22
...


> Yes, after obtaining the monthly average for each country's data, I need
> to use them for further calculations. So, I want to export the result to
> Excel. But until I understand the code, I don't think I will be able to apply
> it to the entire sample, or to discuss the format of the resulting columns
> for export to Excel.
>

Say that we perform the grouped mean calculation for the first two country
columns like this:
monmeans<-sapply(spdat[,2:3],by,spdat[,c("month","year")],mean)
monmeans
Australia  Belarus
[1,]  29.7 30.4
[2,]  34.17857 27.39286

We are presented with a 2x2 matrix of monthly means in just the format
someone might use for importing into Excel. The first row is January 1994,
the second February 1994 and so on. By expanding the columns to include all
the countries in your data, You should have the result you want.

Jim
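The sapply/by combination plus the export step might look like this on invented data (the country names, values, and file name are all made up; write.csv produces a file Excel opens directly):

```r
# Invented daily data with pre-extracted month/year grouping columns
spdat <- data.frame(
  month     = rep(c("01", "02"), each = 2),
  year      = "1994",
  Australia = c(20, 22, 30, 32),
  Belarus   = c(10, 12, 40, 42),
  stringsAsFactors = FALSE)

# One column of monthly means per country, rows in month/year order
monmeans <- sapply(spdat[, c("Australia", "Belarus")],
                   by, spdat[, c("month", "year")], mean)
rownames(monmeans) <- c("Jan 1994", "Feb 1994")
monmeans
#          Australia Belarus
# Jan 1994        21      11
# Feb 1994        31      41

write.csv(monmeans, "monthly_means.csv")  # invented file name
```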


__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Query about calculating the monthly average of daily data columns

2019-09-13 Thread PIKAL Petr
Hi

I may be completely wrong, but reshape/aggregate should be what you want
spdat
   dates returnA returnB
1   1-1-1994  16  13
2   2-1-1994  44  10
3   3-1-1994  24  32
.
> library(reshape2)
> spdat.m <- melt(spdat)
Using dates as id variables
> str(spdat.m)
'data.frame':   116 obs. of  3 variables:
 $ dates   : Factor w/ 58 levels "1-1-1994","1-2-1994",..: 1 23 44 47 49 51 53 
55 57 3 ...
 $ variable: Factor w/ 2 levels "returnA","returnB": 1 1 1 1 1 1 1 1 1 1 ...
 $ value   : int  16 44 24 47 16 35 34 34 26 36 ...
> spdat.m$realdate <- as.Date(spdat.m[,1], format="%d-%m-%Y")
> aggregate(spdat.m$value, list(format(spdat.m$realdate, "%m.%Y"), 
> spdat.m$variable), mean)
  Group.1 Group.2x
1 01.1994 returnA 31.9
2 02.1994 returnA 32.39286
3 01.1994 returnB 24.26667
4 02.1994 returnB 30.03571

Cheers
Petr
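One caveat worth adding to this recipe: "%m.%Y" labels sort alphabetically rather than chronologically once more than one year is involved, so a "%Y-%m" format may be the safer grouping key (dates invented):

```r
d <- as.Date(c("15-01-1994", "15-02-1994", "15-01-1995"), format = "%d-%m-%Y")
sort(format(d, "%m.%Y"))  # "01.1994" "01.1995" "02.1994" -- Feb 1994 sorts last
sort(format(d, "%Y-%m"))  # "1994-01" "1994-02" "1995-01" -- calendar order
```

Since aggregate orders its output by the grouping values, the choice of format decides the row order of the result.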

> -Original Message-
> From: R-help  On Behalf Of Subhamitra
> Patra
> Sent: Friday, September 13, 2019 10:08 AM
> To: Jim Lemon 
> Cc: r-help mailing list 
> Subject: Re: [R] Query about calculating the monthly average of daily data
> columns
>
> Dear Sir,
>
> Thank you very much for your suggestion.
>
> Yes, your suggested code worked. But, actually, I have data from 3rd January
> 1994 to 3rd August 2017 for very large (i.e. for 84 countries) sample. From
> this, I have given the example of the years up to 2000. Before applying the
> same code for the long 24 years, I want to learn the logic behind the code.
> Actually, some part of the code is not understandable to me which I
> mentioned in the bold letter as follows.
>
> "spdat<-data.frame(
>   dates=paste(c(1:30,1:28),c(rep(1,30),rep(2,28)),rep(1994,58),sep="-"),
>   returnA=sample(*15:50*,58,TRUE),returnB=sample(*10:45*,58,TRUE))"
>
> A. Here, I need to define the no. of days in a month, and the no. of countries
> name separately, right? But, what is meant by 15:50, and 10:45 in return A,
> and B respectively?
>
> "# if you only have to get the monthly averages, it can be done this way
> spdat$month<-sapply(strsplit(spdat$dates,"-"),"["*,2*)
> spdat$year<-sapply(strsplit(spdat$dates,"-"),"[",*3*)"
>
> B. Here, I need to define the no. of months, and years separately, right?
> or else what 2, and 3 (in bold) indicates?
>
> "# get the averages by month and year - is this correct?
> monthlyA<-by(*spdat$returnA*,spdat[,c("month","year")],mean)
> monthlyB<-by(*spdat$returnB*,spdat[,c("month","year")],mean)"
>
> C. From this part, I got the exact average values of both January and
> February of 1994 for country A, and B. But, in code, I have a query that I
> need to define  spdat$returnA, and  spdat$returnB separately before writing
> this code, right? Like this, I need to define for each 84 countries separately
> with their respective number of months, and years before writing this code,
> right?
>
> Yes, after obtaining the monthly average for each country's data, I need to
> use them for further calculations. So, I want to export the result to Excel.
> But until I understand the code, I don't think I will be able to apply it to
> the entire sample, or to discuss the format of the resulting columns for
> export to Excel.
>
> Therefore, kindly help me to understand the code.
>
> Thank you very much, Sir, and thanks to this R forum for helping the R-
> beginners.
>
>
>
>
> On Fri, Sep 13, 2019 at 3:15 AM Jim Lemon  wrote:
>
> > Hi Subhamitra,
> > Your data didn't make it through, so I guess the first thing is to
> > guess what it looks like. Here's a try at just January and February of
> > 1994 so that we can see the result on the screen. The logic will work
> > just as well for the whole seven years.
> >
> > # create fake data for the first two months spdat<-data.frame(
> > dates=paste(c(1:30,1:28),c(rep(1,30),rep(2,28)),rep(1994,58),sep="-"),
> >  returnA=sample(15:50,58,TRUE),returnB=sample(10:45,58,TRUE))
> > # I'll assume that the dates in your file are character, not factor
> > spdat$dates<-as.character(spdat$dates)
> > # if you only have to get the monthly averages, it can be done this
> > way
> > spdat$month<-sapply(strsplit(spdat$dates,"-"),"[",2)
> >

Re: [R] Query about calculating the monthly average of daily data columns

2019-09-13 Thread Subhamitra Patra
Dear Sir,

Thank you very much for your suggestion.

Yes, your suggested code worked. But, actually, I have data from 3rd
January 1994 to 3rd August 2017 for very large (i.e. for 84 countries)
sample. From this, I have given the example of the years up to 2000. Before
applying the same code for the long 24 years, I want to learn the logic
behind the code. Actually, some part of the code is not understandable to
me which I mentioned in the bold letter as follows.

"spdat<-data.frame(
  dates=paste(c(1:30,1:28),c(rep(1,30),rep(2,28)),rep(1994,58),sep="-"),
  returnA=sample(*15:50*,58,TRUE),returnB=sample(*10:45*,58,TRUE))"

A. Here, I need to define the no. of days in a month, and the no. of
countries name separately, right? But, what is meant by 15:50, and 10:45 in
return A, and B respectively?

"# if you only have to get the monthly averages, it can be done this way
spdat$month<-sapply(strsplit(spdat$dates,"-"),"["*,2*)
spdat$year<-sapply(strsplit(spdat$dates,"-"),"[",*3*)"

B. Here, I need to define the no. of months, and years separately, right?
or else what 2, and 3 (in bold) indicates?

"# get the averages by month and year - is this correct?
monthlyA<-by(*spdat$returnA*,spdat[,c("month","year")],mean)
monthlyB<-by(*spdat$returnB*,spdat[,c("month","year")],mean)"

C. From this part, I got the exact average values of both January and
February of 1994 for country A, and B. But, in code, I have a query that I
need to define  spdat$returnA, and  spdat$returnB separately before writing
this code, right? Like this, I need to define for each 84 countries
separately with their respective number of months, and years before writing
this code, right?

Yes, after obtaining the monthly average for each country's data, I need to
use them for further calculations. So, I want to export the result to
Excel. But until I understand the code, I don't think I will be able to
apply it to the entire sample, or to discuss the format of the resulting
columns for export to Excel.

Therefore, kindly help me to understand the code.

Thank you very much, Sir, and thanks to this R forum for helping the
R-beginners.




On Fri, Sep 13, 2019 at 3:15 AM Jim Lemon  wrote:

> Hi Subhamitra,
> Your data didn't make it through, so I guess the first thing is to
> guess what it looks like. Here's a try at just January and February of
> 1994 so that we can see the result on the screen. The logic will work
> just as well for the whole seven years.
>
> # create fake data for the first two months
> spdat<-data.frame(
>  dates=paste(c(1:30,1:28),c(rep(1,30),rep(2,28)),rep(1994,58),sep="-"),
>  returnA=sample(15:50,58,TRUE),returnB=sample(10:45,58,TRUE))
> # I'll assume that the dates in your file are character, not factor
> spdat$dates<-as.character(spdat$dates)
> # if you only have to get the monthly averages, it can be done this way
> spdat$month<-sapply(strsplit(spdat$dates,"-"),"[",2)
> spdat$year<-sapply(strsplit(spdat$dates,"-"),"[",3)
> # get the averages by month and year - is this correct?
> monthlyA<-by(spdat$returnA,spdat[,c("month","year")],mean)
> monthlyB<-by(spdat$returnB,spdat[,c("month","year")],mean)
>
> Now you have what you say you want:
>
> monthlyA
> month: 1
> year: 1994
> [1] 34.1
> 
> month: 2
> year: 1994
> [1] 33.32143
>
> monthlyB
> month: 1
> year: 1994
> [1] 29.7
> 
> month: 2
> year: 1994
> [1] 27.28571
>
> Sorry I didn't use a loop (for(month in 1:12) ... for (year in
> 1994:2000) ...), too lazy.
> Now you have to let us know how this information is to be formatted to
> go into Excel. Excel will import the text as above, but I think you
> want something that you can use for further calculations.
>
> Jim
>
> On Fri, Sep 13, 2019 at 12:54 AM Subhamitra Patra
>  wrote:
> >
> > Dear R-users,
> >
> > I have daily data from 03-01-1994 to 29-12-2000. In my datafile, the first
> > column is date and the second and third columns are the returns of the
> > country A, and B. Here, the date column is same for both countries. I
> want
> > to calculate the monthly average of both country's returns by using a
> loop,
> > and then, I want to export the results into excel.
> >
> > Please help me in this regard.
> >
> > Please find the attached datasheet.
> >
> > Thank you.
> >
> > --
> > *Best Regards,*
> > *Subhamitra Patra*
> > *Phd. Research Scholar*
> > *Department of Humanities and Social Sciences*
> > *Indian Institute of Technology, Kharagpur*
> > *INDIA*
> >

Re: [R] Query about calculating the monthly average of daily data columns

2019-09-12 Thread Jim Lemon
Hi Subhamitra,
Your data didn't make it through, so I guess the first thing is to
guess what it looks like. Here's a try at just January and February of
1994 so that we can see the result on the screen. The logic will work
just as well for the whole seven years.

# create fake data for the first two months
spdat<-data.frame(
 dates=paste(c(1:30,1:28),c(rep(1,30),rep(2,28)),rep(1994,58),sep="-"),
 returnA=sample(15:50,58,TRUE),returnB=sample(10:45,58,TRUE))
# I'll assume that the dates in your file are character, not factor
spdat$dates<-as.character(spdat$dates)
# if you only have to get the monthly averages, it can be done this way
spdat$month<-sapply(strsplit(spdat$dates,"-"),"[",2)
spdat$year<-sapply(strsplit(spdat$dates,"-"),"[",3)
# get the averages by month and year - is this correct?
monthlyA<-by(spdat$returnA,spdat[,c("month","year")],mean)
monthlyB<-by(spdat$returnB,spdat[,c("month","year")],mean)

Now you have what you say you want:

monthlyA
month: 1
year: 1994
[1] 34.1

month: 2
year: 1994
[1] 33.32143

monthlyB
month: 1
year: 1994
[1] 29.7

month: 2
year: 1994
[1] 27.28571

Sorry I didn't use a loop (for(month in 1:12) ... for (year in
1994:2000) ...), too lazy.
Now you have to let us know how this information is to be formatted to
go into Excel. Excel will import the text as above, but I think you
want something that you can use for further calculations.
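One way to get these monthly means into a rectangular form that Excel can import is to compute them with tapply() (the same grouping as by() above) and flatten the result; this is a sketch, and the output file name is just an example:

```r
# tapply() gives a month x year array of means, equivalent to the by() call
mA <- tapply(spdat$returnA, spdat[, c("month", "year")], mean)
mB <- tapply(spdat$returnB, spdat[, c("month", "year")], mean)
# as.table()/as.data.frame() turn each array into long form:
# one row per (month, year) with the average in the named column
avgA <- as.data.frame(as.table(mA), responseName = "avgReturnA")
avgB <- as.data.frame(as.table(mB), responseName = "avgReturnB")
monthly <- merge(avgA, avgB)
# write.csv() produces a file that Excel opens directly
write.csv(monthly, "monthly_averages.csv", row.names = FALSE)
```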

Jim

On Fri, Sep 13, 2019 at 12:54 AM Subhamitra Patra
 wrote:
>
> Dear R-users,
>
> I have daily data from 03-01-1994 to 29-12-2000. In my datafile, the first
> column is date and the second and third columns are the returns of the
> country A, and B. Here, the date column is same for both countries. I want
> to calculate the monthly average of both country's returns by using a loop,
> and then, I want to export the results into excel.
>
> Please help me in this regard.
>
> Please find the attached datasheet.
>
> Thank you.
>
> --
> *Best Regards,*
> *Subhamitra Patra*
> *Phd. Research Scholar*
> *Department of Humanities and Social Sciences*
> *Indian Institute of Technology, Kharagpur*
> *INDIA*
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Query about calculating the monthly average of daily data columns

2019-09-12 Thread Rui Barradas

Hello,

Inline.

On 12/09/19 at 17:33, Bert Gunter wrote:
> But she wants *monthly* averages, Rui.

Thanks, my mistake.

> Ergo ave() or tidyData equivalent, right?

Maybe. But ave() returns as many values as the input length; this seems
more suited to tapply or aggregate.



I will first create an example data set.

set.seed(1234)
start <- as.Date("03-01-1994", "%d-%m-%Y")
end <- as.Date("29-12-2000", "%d-%m-%Y")
date <- seq(start, end, by = "day")
date <- date[as.integer(format(date, "%u")) %in% 1:5]
df1 <- data.frame(date,
  CountryA = rnorm(length(date)),
  CountryB = rnorm(length(date)))


Now the averages by month

month <- zoo::as.yearmon(df1[[1]])
aggA <- aggregate(CountryA ~ month, df1, mean)
aggB <- aggregate(CountryB ~ month, df1, mean)
MonthReturns <- merge(aggA, aggB)
head(MonthReturns)


Final clean up.

rm(date, month, aggA, aggB)
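As a small variation on the above, the two aggregate() calls can also be collapsed into one by putting both columns on the left-hand side of the formula (a sketch using the same example data frame df1):

```r
# store the year-month key in the data frame itself, then aggregate
# both countries' returns in a single pass
df1$month <- zoo::as.yearmon(df1$date)
MonthReturns <- aggregate(cbind(CountryA, CountryB) ~ month, data = df1,
                          FUN = mean)
head(MonthReturns)
```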


Hope this helps,

Rui Barradas


-- Bert

On Thu, Sep 12, 2019 at 8:41 AM Rui Barradas > wrote:


Hello,

Please include data, say

dput(head(data, 20))  # post the output of this


But, is the problem as simple as

rowMeans(data[2:3], na.rm = TRUE)

?

Hope this helps,

Rui Barradas


Às 15:53 de 12/09/19, Subhamitra Patra escreveu:
 > Dear R-users,
 >
 > I have daily data from 03-01-1994 to 29-12-2000. In my datafile,
the first
 > column is date and the second and third columns are the returns
of the
 > country A, and B. Here, the date column is same for both
countries. I want
 > to calculate the monthly average of both country's returns by
using a loop,
 > and then, I want to export the results into excel.
 >
 > Please help me in this regard.
 >
 > Please find the attached datasheet.
 >
 > Thank you.
 >

__
R-help@r-project.org  mailing list --
To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Query about calculating the monthly average of daily data columns

2019-09-12 Thread Rui Barradas

Hello,

Please include data, say

dput(head(data, 20))  # post the output of this


But, is the problem as simple as

rowMeans(data[2:3], na.rm = TRUE)

?

Hope this helps,

Rui Barradas


On 12/09/19 at 15:53, Subhamitra Patra wrote:

Dear R-users,

I have daily data from 03-01-1994 to 29-12-2000. In my datafile, the first
column is date and the second and third columns are the returns of the
country A, and B. Here, the date column is same for both countries. I want
to calculate the monthly average of both country's returns by using a loop,
and then, I want to export the results into excel.

Please help me in this regard.

Please find the attached datasheet.

Thank you.



__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Query about calculating the monthly average of daily data columns

2019-09-12 Thread Bert Gunter
No reproducible example, so hard to say. What class is your "date" column?
-- factor, character, Date?  See ?Date
Once you have an object of appropriate class -- see ?format.Date -- ?months
can extract the month and ?ave can do your averaging. No explicit looping
is needed.

The tidydata alternative universe can also do all these things if that's
where you prefer to live.

As usual, any attached data was stripped. See ?dput for one way to include
data in your post.

Cheers,
Bert


On Thu, Sep 12, 2019 at 7:54 AM Subhamitra Patra 
wrote:

> Dear R-users,
>
> I have daily data from 03-01-1994 to 29-12-2000. In my datafile, the first
> column is date and the second and third columns are the returns of the
> country A, and B. Here, the date column is same for both countries. I want
> to calculate the monthly average of both country's returns by using a loop,
> and then, I want to export the results into excel.
>
> Please help me in this regard.
>
> Please find the attached datasheet.
>
> Thank you.
>
> --
> *Best Regards,*
> *Subhamitra Patra*
> *Phd. Research Scholar*
> *Department of Humanities and Social Sciences*
> *Indian Institute of Technology, Kharagpur*
> *INDIA*
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Query about calculating the monthly average of daily data columns

2019-09-12 Thread Subhamitra Patra
Dear R-users,

I have daily data from 03-01-1994 to 29-12-2000. In my datafile, the first
column is date and the second and third columns are the returns of
country A and B. Here, the date column is the same for both countries. I want
to calculate the monthly average of both country's returns by using a loop,
and then, I want to export the results into excel.

Please help me in this regard.

Please find the attached datasheet.

Thank you.

-- 
*Best Regards,*
*Subhamitra Patra*
*Phd. Research Scholar*
*Department of Humanities and Social Sciences*
*Indian Institute of Technology, Kharagpur*
*INDIA*

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Query on R-squared correlation coefficient for linear regression through origin

2018-09-27 Thread Rui Barradas

Hello,

As for R^2 in Excel for models without an intercept, maybe the following 
are relevant.


https://support.microsoft.com/en-us/help/829249/you-will-receive-an-incorrect-r-squared-value-in-the-chart-tool-in-exc

https://stat.ethz.ch/pipermail/r-help/2012-July/318347.html


Hope this helps,

Rui Barradas

On 27/09/2018 at 11:56, Patrick Barrie wrote:

I have a query on the R-squared correlation coefficient for linear
regression through the origin.

The general expression for R-squared in regression (whether linear or
non-linear) is
R-squared = 1 - sum(y-ypredicted)^2 / sum(y-ybar)^2

However, the lm function within R does not seem to use this expression
when the intercept is constrained to be zero. It gives results different
to Excel and other data analysis packages.

As an example (using built-in cars dataframe):

> cars.lm=lm(dist ~ 0+speed, data=cars) # linear regression through origin
> summary(cars.lm)$r.squared # report R-squared
[1] 0.8962893
> 1-deviance(cars.lm)/sum((cars$dist-mean(cars$dist))^2) # calculates R-squared directly
[1] 0.6018997
> # The latter corresponds to the value reported by Excel (and other data analysis packages)
> # Note that we expect R-squared to be smaller for linear regression through the origin
> # than for linear regression without a constraint (which is 0.6511 in this example)

Does anyone know what R is doing in this case? Is there an option to get
R to return what I termed the "general" expression for R-squared? The
adjusted R-squared value is also affected. [Other parameters all seem
correct.]

Thanks for any help on this issue,

Patrick

P.S. I believe old versions of Excel (before 2003) also had this issue.



__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Query on R-squared correlation coefficient for linear regression through origin

2018-09-27 Thread peter dalgaard
This is an old discussion. The thing that R is doing is to compare the model to 
the model without any regressors, which in the no-intercept case is the 
constant zero. Otherwise, you would be comparing non-nested models and the R^2 
would not satisfy the property of being between 0 and 1. 

A similar issue affects anova tables, where the regression sum of squares is 
sum(yhat^2) rather than sum((yhat - ybar)^2).

-pd
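The two conventions can be reproduced side by side; this is a sketch using the built-in cars example from the question, relying only on the fact that for an OLS fit the residuals are orthogonal to the fitted values:

```r
fit  <- lm(dist ~ 0 + speed, data = cars)  # regression through the origin
y    <- cars$dist
yhat <- fitted(fit)
# R's convention: compare against the model with no regressors,
# which here is the constant zero -- matches summary(fit)$r.squared
1 - sum((y - yhat)^2) / sum(y^2)             # 0.8962893
# "General" convention: compare against the mean-only model,
# the value Excel and other packages report
1 - sum((y - yhat)^2) / sum((y - mean(y))^2) # 0.6018997
```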

> On 27 Sep 2018, at 12:56 , Patrick Barrie  wrote:
> 
> I have a query on the R-squared correlation coefficient for linear 
> regression through the origin.
> 
> The general expression for R-squared in regression (whether linear or 
> non-linear) is
> R-squared = 1 - sum(y-ypredicted)^2 / sum(y-ybar)^2
> 
> However, the lm function within R does not seem to use this expression 
> when the intercept is constrained to be zero. It gives results different 
> to Excel and other data analysis packages.
> 
> As an example (using built-in cars dataframe):
>> cars.lm=lm(dist ~ 0+speed, data=cars) # linear regression through origin
>> summary(cars.lm)$r.squared # report R-squared
> [1] 0.8962893
>> 1-deviance(cars.lm)/sum((cars$dist-mean(cars$dist))^2) # calculates R-squared directly
> [1] 0.6018997
>> # The latter corresponds to the value reported by Excel (and other data analysis packages)
>> # Note that we expect R-squared to be smaller for linear regression through the origin
>> # than for linear regression without a constraint (which is 0.6511 in this example)
> 
> Does anyone know what R is doing in this case? Is there an option to get 
> R to return what I termed the "general" expression for R-squared? The 
> adjusted R-squared value is also affected. [Other parameters all seem 
> correct.]
> 
> Thanks for any help on this issue,
> 
> Patrick
> 
> P.S. I believe old versions of Excel (before 2003) also had this issue.
> 
> -- 
> Dr Patrick J. Barrie
> Department of Chemical Engineering and Biotechnology
> University of Cambridge
> Philippa Fawcett Drive, Cambridge CB3 0AS
> 01223 331864
> pj...@cam.ac.uk
> 
> 
>   [[alternative HTML version deleted]]
> 
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd@cbs.dk  Priv: pda...@gmail.com

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Query on R-squared correlation coefficient for linear regression through origin

2018-09-27 Thread Eric Berger
See also this thread in stats.stackexchange

https://stats.stackexchange.com/questions/26176/removal-of-statistically-significant-intercept-term-increases-r2-in-linear-mo



On Thu, Sep 27, 2018 at 3:43 PM, J C Nash  wrote:

> This issue traces back to the very unfortunate use
> of R-squared as the name of a tool to simply compare a model to the model
> that
> is a single number (the mean). The mean can be shown to be the optimal
> choice
> for a model that is a single number, so it makes sense to try to do better.
>
> The OP has the correct form -- and I find no matter what the software, when
> working with models that do NOT have a constant in them (i.e., nonlinear
> models, regression through the origin) it pays to do the calculation
> "manually". In R it is really easy to write the necessary function, so
> why take a chance that a software developer has tried to expand the concept
> using a personal choice that is beyond a clear definition.
>
> I've commented elsewhere that I use this statistic even for nonlinear
> models in my own software, since I think one should do better than the
> mean for a model, but other workers shy away from using it for nonlinear
> models because there may be false interpretation based on its use for
> linear models.
>
> JN
>
>
> On 2018-09-27 06:56 AM, Patrick Barrie wrote:
> > I have a query on the R-squared correlation coefficient for linear
> > regression through the origin.
> >
> > The general expression for R-squared in regression (whether linear or
> > non-linear) is
> > R-squared = 1 - sum(y-ypredicted)^2 / sum(y-ybar)^2
> >
> > However, the lm function within R does not seem to use this expression
> > when the intercept is constrained to be zero. It gives results different
> > to Excel and other data analysis packages.
> >
> > As an example (using built-in cars dataframe):
> >> cars.lm=lm(dist ~ 0+speed, data=cars) # linear regression through origin
> >> summary(cars.lm)$r.squared # report R-squared
> > [1] 0.8962893
> >> 1-deviance(cars.lm)/sum((cars$dist-mean(cars$dist))^2) # calculates R-squared directly
> > [1] 0.6018997
> >> # The latter corresponds to the value reported by Excel (and other data analysis packages)
> >> # Note that we expect R-squared to be smaller for linear regression through the origin
> >> # than for linear regression without a constraint (which is 0.6511 in this example)
> >
> > Does anyone know what R is doing in this case? Is there an option to get
> > R to return what I termed the "general" expression for R-squared? The
> > adjusted R-squared value is also affected. [Other parameters all seem
> > correct.]
> >
> > Thanks for any help on this issue,
> >
> > Patrick
> >
> > P.S. I believe old versions of Excel (before 2003) also had this issue.
> >
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/
> posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Query on R-squared correlation coefficient for linear regression through origin

2018-09-27 Thread J C Nash
This issue traces back to the very unfortunate use
of R-squared as the name of a tool to simply compare a model to the model that
is a single number (the mean). The mean can be shown to be the optimal choice
for a model that is a single number, so it makes sense to try to do better.

The OP has the correct form -- and I find no matter what the software, when
working with models that do NOT have a constant in them (i.e., nonlinear
models, regression through the origin) it pays to do the calculation
"manually". In R it is really easy to write the necessary function, so
why take a chance that a software developer has tried to expand the concept
using a personal choice that is beyond a clear definition.

I've commented elsewhere that I use this statistic even for nonlinear
models in my own software, since I think one should do better than the
mean for a model, but other workers shy away from using it for nonlinear
models because there may be false interpretation based on its use for
linear models.

JN
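The "necessary function" JN mentions is only a couple of lines; this is a minimal sketch (the function name is illustrative) comparing a model's predictions against the mean-only baseline:

```r
# R-squared relative to the mean-only model; works for any model that
# yields predicted values, linear or nonlinear, intercept or not
r_squared_vs_mean <- function(y, yhat) {
  1 - sum((y - yhat)^2) / sum((y - mean(y))^2)
}

# applied to the no-intercept example from the question
fit <- lm(dist ~ 0 + speed, data = cars)
r_squared_vs_mean(cars$dist, fitted(fit))  # 0.6018997, the Excel value
```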


On 2018-09-27 06:56 AM, Patrick Barrie wrote:
> I have a query on the R-squared correlation coefficient for linear 
> regression through the origin.
> 
> The general expression for R-squared in regression (whether linear or 
> non-linear) is
> R-squared = 1 - sum(y-ypredicted)^2 / sum(y-ybar)^2
> 
> However, the lm function within R does not seem to use this expression 
> when the intercept is constrained to be zero. It gives results different 
> to Excel and other data analysis packages.
> 
> As an example (using built-in cars dataframe):
>> cars.lm=lm(dist ~ 0+speed, data=cars) # linear regression through origin
>> summary(cars.lm)$r.squared # report R-squared
> [1] 0.8962893
>> 1-deviance(cars.lm)/sum((cars$dist-mean(cars$dist))^2) # calculates R-squared directly
> [1] 0.6018997
>> # The latter corresponds to the value reported by Excel (and other data analysis packages)
>> # Note that we expect R-squared to be smaller for linear regression through the origin
>> # than for linear regression without a constraint (which is 0.6511 in this example)
> 
> Does anyone know what R is doing in this case? Is there an option to get 
> R to return what I termed the "general" expression for R-squared? The 
> adjusted R-squared value is also affected. [Other parameters all seem 
> correct.]
> 
> Thanks for any help on this issue,
> 
> Patrick
> 
> P.S. I believe old versions of Excel (before 2003) also had this issue.
>

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Query on R-squared correlation coefficient for linear regression through origin

2018-09-27 Thread Patrick Barrie
I have a query on the R-squared correlation coefficient for linear 
regression through the origin.

The general expression for R-squared in regression (whether linear or 
non-linear) is
R-squared = 1 - sum(y-ypredicted)^2 / sum(y-ybar)^2

However, the lm function within R does not seem to use this expression 
when the intercept is constrained to be zero. It gives results different 
to Excel and other data analysis packages.

As an example (using built-in cars dataframe):
> cars.lm=lm(dist ~ 0+speed, data=cars) # linear regression through origin
> summary(cars.lm)$r.squared # report R-squared
[1] 0.8962893
> 1-deviance(cars.lm)/sum((cars$dist-mean(cars$dist))^2) # calculates R-squared directly
[1] 0.6018997
> # The latter corresponds to the value reported by Excel (and other data analysis packages)
> # Note that we expect R-squared to be smaller for linear regression through the origin
> # than for linear regression without a constraint (which is 0.6511 in this example)

Does anyone know what R is doing in this case? Is there an option to get 
R to return what I termed the "general" expression for R-squared? The 
adjusted R-squared value is also affected. [Other parameters all seem 
correct.]

Thanks for any help on this issue,

Patrick

P.S. I believe old versions of Excel (before 2003) also had this issue.

-- 
Dr Patrick J. Barrie
Department of Chemical Engineering and Biotechnology
University of Cambridge
Philippa Fawcett Drive, Cambridge CB3 0AS
01223 331864
pj...@cam.ac.uk


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Query on read.xls function in gdata

2018-09-06 Thread PIKAL Petr
Hi

You need to help yourself here. My guess is that you did not tell the read_excel
function where your Excel file is.

When the file is in working directory it works seamlessly.

> library(readxl)
> read_excel("ebc.xlsx")
# A tibble: 8 x 16
  material   Rok osoby    `8oh` `8ohg` `5ohm`  otyr `3tyr`   mda   hhe   hne
1 tio2     2012. vyroba     25.    35.    25.   38.    42.   40.   40.   40.
2 tio2     2013. vyroba     38.    46.    35.   45.    48.   45.   55.   45.
3 tio2     2012. kontrola   10.    12.    12.   28.    14.   20.   15.   15.
4 tio2     2013. kontrola   17.    15.    17.   18.    20.   20.   15.   20.
5 fe2o3    2013. vyroba     28.    32.    25.   28.    29.   32.   30.   32.
6 fe2o3    2013. kontrola   17.    15.    17.   18.    20.   20.   17.   19.
7 kompozit 2016. vyroba     28.    37.    28.   32.    32.   25.   25.   25.
8 kompozit 2016. kontrola   20.    20.    20.   22.    22.   18.   20.   20.
# ... with 5 more variables: `8iso` , ltb4 , ltc4 , ltd4 ,
#   lte4 
>

You should either tell the read function where your Excel file is, change
the working directory, or copy the Excel file to your working directory, whichever
is easiest and most convenient for you.
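In practice each of those options is a line or two; this is a sketch, and the paths shown are hypothetical placeholders for wherever the file actually lives:

```r
library(readxl)

# option 1: give read_excel() the full path to the file
d <- read_excel("C:/data/ebc.xlsx")

# option 2: change the working directory first, then use the bare name
setwd("C:/data")
d <- read_excel("ebc.xlsx")

# helpful checks: where is R looking, and what does it see there?
getwd()
list.files()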

Cheers
Petr


From: Aakash Kumar 
Sent: Thursday, September 6, 2018 2:48 PM
To: PIKAL Petr ; r-help@r-project.org
Subject: RE: [R] Query on read.xls function in gdata

Thanks for the suggestion, Petr.

I did try using read_xls and read_excel functions, but I got an error :

"Error in read_fun : Failed to open .xls file"

The file can be manually opened though.

It would be very helpful if you could help me out with this.

Thanks.

Regards,
Aakash


On 06-Sep-2018 17:00, "PIKAL Petr" 
mailto:petr.pi...@precheza.cz>> wrote:
Hi

If you do not need to stick with gdata you could try package readxl, it does 
not need any further packages.

https://cran.r-project.org/web/packages/readxl/readxl.pdf

It results in tibble data but change to data.frame is easy.

Cheers
Petr
> -Original Message-
> From: R-help 
> mailto:r-help-boun...@r-project.org>> On Behalf 
> Of Aakash Kumar
> Sent: Thursday, September 6, 2018 12:43 PM
> To: r-help@r-project.org
> Subject: [R] Query on read.xls function in gdata
>
> Hi Team,
>
> I am trying to read in .xls files in R using read.xls function present in 
> gdata
> package. I have installed the required perl dependencies as well.
> Yet, I am facing the following error.
>
> "*Error in xls2sep : Intermediate file is missing*"
>
> Could someone please help me out understanding the cause and if possible, a
> fix for the same?
>
> Thanks in advance!
>
> Regards,
> Aakash
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
Osobní údaje: Informace o zpracování a ochraně osobních údajů obchodních 
partnerů PRECHEZA a.s. jsou zveřejněny na: 
https://www.precheza.cz/zasady-ochrany-osobnich-udaju/ | Information about 
processing and protection of business partner’s personal data are available on 
website: https://www.precheza.cz/en/personal-data-protection-principles/
Důvěrnost: Tento e-mail a jakékoliv k němu připojené dokumenty jsou důvěrné a 
podléhají tomuto právně závaznému prohlášení o vyloučení odpovědnosti: 
https://www.precheza.cz/01-dovetek/ | This email and any documents attached to 
it may be confidential and are subject to the legally binding disclaimer: 
https://www.precheza.cz/en/01-disclaimer/


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Query on read.xls function in gdata

2018-09-06 Thread Jeff Newmiller
Without a reproducible example that is highly unlikely to happen.

However, you might just want to try the openxlsx or readxl packages instead.

On September 6, 2018 3:43:05 AM PDT, Aakash Kumar  wrote:
>Hi Team,
>
>I am trying to read in .xls files in R using read.xls function present
>in
>gdata package. I have installed the required perl dependencies as well.
>Yet, I am facing the following error.
>
>"*Error in xls2sep : Intermediate file is missing*"
>
>Could someone please help me out understanding the cause and if
>possible, a
>fix for the same?
>
>Thanks in advance!
>
>Regards,
>Aakash
>
>   [[alternative HTML version deleted]]
>
>__
>R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>https://stat.ethz.ch/mailman/listinfo/r-help
>PLEASE do read the posting guide
>http://www.R-project.org/posting-guide.html
>and provide commented, minimal, self-contained, reproducible code.

-- 
Sent from my phone. Please excuse my brevity.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Query on read.xls function in gdata

2018-09-06 Thread PIKAL Petr
Hi

If you do not need to stick with gdata you could try package readxl, it does 
not need any further packages.

https://cran.r-project.org/web/packages/readxl/readxl.pdf

It results in tibble data but change to data.frame is easy.

Cheers
Petr
> -Original Message-
> From: R-help  On Behalf Of Aakash Kumar
> Sent: Thursday, September 6, 2018 12:43 PM
> To: r-help@r-project.org
> Subject: [R] Query on read.xls function in gdata
>
> Hi Team,
>
> I am trying to read in .xls files in R using read.xls function present in 
> gdata
> package. I have installed the required perl dependencies as well.
> Yet, I am facing the following error.
>
> "*Error in xls2sep : Intermediate file is missing*"
>
> Could someone please help me out understanding the cause and if possible, a
> fix for the same?
>
> Thanks in advance!
>
> Regards,
> Aakash
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
Osobní údaje: Informace o zpracování a ochraně osobních údajů obchodních 
partnerů PRECHEZA a.s. jsou zveřejněny na: 
https://www.precheza.cz/zasady-ochrany-osobnich-udaju/ | Information about 
processing and protection of business partner’s personal data are available on 
website: https://www.precheza.cz/en/personal-data-protection-principles/
Důvěrnost: Tento e-mail a jakékoliv k němu připojené dokumenty jsou důvěrné a 
podléhají tomuto právně závaznému prohlášení o vyloučení odpovědnosti: 
https://www.precheza.cz/01-dovetek/ | This email and any documents attached to 
it may be confidential and are subject to the legally binding disclaimer: 
https://www.precheza.cz/en/01-disclaimer/

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Query on read.xls function in gdata

2018-09-06 Thread Aakash Kumar
Hi Team,

I am trying to read in .xls files in R using read.xls function present in
gdata package. I have installed the required perl dependencies as well.
Yet, I am facing the following error.

"*Error in xls2sep : Intermediate file is missing*"

Could someone please help me out understanding the cause and if possible, a
fix for the same?

Thanks in advance!

Regards,
Aakash

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Query on convergence

2018-07-25 Thread PIKAL Petr
Hi

maybe

ii <- TRUE
while(ii) {
  ## do something
  if(some condition of two variables is met) ii <- FALSE
}

But in R such constructions are seldom necessary.
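To make that concrete, a small self-contained example (the iterated function here, Newton's update for sqrt(2), is my own illustration, not from the question):

```r
# Iterate until two successive values agree to within a tolerance.
# The update rule (Newton's step for sqrt(2)) is just an illustration.
x_old <- 1
repeat {
  x_new <- (x_old + 2 / x_old) / 2        # produce the next value
  if (abs(x_new - x_old) < 1e-10) break   # converged: stop the loop
  x_old <- x_new
}
x_new  # ~ 1.414214, i.e. sqrt(2)
```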

Cheers
Petr

> -Original Message-
> From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of tembe-
> atanasio...@ynu.jp
> Sent: Wednesday, July 25, 2018 1:22 PM
> To: r-help@r-project.org
> Subject: [R] Query on convergence
>
> Hello,
>
>
>
> Is there somebody who can demonstrate how to code a while loop that ends
> when a convergence between the values of two or more variables (say vectors)
> is reached? Thank you
>
> Regards
>
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Query on convergence

2018-07-25 Thread tembe-atanasio...@ynu.jp
Hello,



Is there somebody who can demonstrate how to code a while loop that ends when a 
convergence between the values of two or more variables (say vectors) is 
reached? Thank you

Regards


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Query on while loop

2018-07-20 Thread Bert Gunter
I don't know how to say this charitably, but your post indicates that you
**really need to go through an R tutorial or two.** Rather than give you
answers to these very basic matters, a couple of hints:

1. A and P are vectors with 3 elements, not matrices.

2. I presume things like c11 and c32 are meant to be subscripts, but that is
not how subscripts are written in R. R can also do such calculations on
whole objects rather than elementwise.

3. X1 is undefined (Inf), as K = 0 makes it 1/0. So I have no idea what you
expect here.
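For what it's worth, a hedged whole-object sketch of the balancing iteration the post seems to be after (the starting values, iteration cap, and tolerance are my assumptions, and K is dropped since K = 0 makes X1 undefined):

```r
a  <- matrix(c(100, 350, 100, 240, 150, 210, 60, 120, 200), 3, 3)
cc <- matrix(c(2, 9, 13, 10, 4, 11, 14, 12, 3), 3, 3)
A  <- colSums(a)   # column totals
P  <- rowSums(a)   # row totals
Fm <- cc^(-2)      # impedance weights ("F" avoided: it aliases FALSE)

X <- rep(1, 3)     # assumed starting values
Y <- rep(1, 3)
for (it in 1:1000) {
  # whole-object updates -- no c11, c12, ... needed
  X_new <- 1 / as.vector(Fm %*% (Y * A))
  Y_new <- 1 / as.vector(t(Fm) %*% (X_new * P))
  done  <- max(abs(X_new - X), abs(Y_new - Y)) < 1e-10
  X <- X_new
  Y <- Y_new
  if (done) break
}
X
Y
```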

Cheers,
Bert


Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )

On Fri, Jul 20, 2018 at 2:57 PM, Atanasio Alberto Tembe Tembe <
manote...@gmail.com> wrote:

>  Hello,
>
> I have two matrices: a<-matrix(c(100,350,100,240,150,210,60,120,200 ),3,3)
> and c<-matrix(c(2,9,13,10,4,11,14,12,3),3,3).
>
> I have also defined the following variables:
> K=0
> A[i,j]=colSums(a)
> P[i,j]=rowSums(a)
> F[i,j]=c[i,j]^(-2 )
>
> Using these data I want to perform the calculation which must end when
> a convergence between X and Y values is reached.
>
> X1=1/(K*A1*c11+K*A2*c12+K*A3*c13)
>
> Y1=1/(X1*P1*c11+X1*P1*c12+X1*P1*c13)
>
> X2=1/(Y1*A1*c21+Y1*A2*c22+Y1*A3*c23)
>
> Y2=1/(X2*P2*c21+X2*P2*c22+X2*P2*c23)
>
> X3=1/(Y1*A1*c31+Y1*A2*c32+Y1*A3*c33)
>
> Y3=1/(X2*P3*c31+X2*P3*c32+X2*P3*c33)
>
>
>
> I have been struggling over this for some time. Your support is highly
> appreciated.
>
> Thanks
>
>
> --
> Atanasio Alberto Tembe (Mr)
> Doctoral student
> Graduate School of Urban Innovation
> Transportation and Urban Engineering Laboratory
> Yokohama National University
> Tel: +81-(0)80-4605-1305
> Mail: tembe-atanasio...@ynu.jp
>       manote...@gmail.com
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/
> posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Query on while loop

2018-07-20 Thread Atanasio Alberto Tembe Tembe
 Hello,

I have two matrices: a<-matrix(c(100,350,100,240,150,210,60,120,200 ),3,3)
and c<-matrix(c(2,9,13,10,4,11,14,12,3),3,3).

I have also defined the following variables:
K=0
A[i,j]=colSums(a)
P[i,j]=rowSums(a)
F[i,j]=c[i,j]^(-2 )

Using these data I want to perform the calculation which must end when
a convergence between X and Y values is reached.

X1=1/(K*A1*c11+K*A2*c12+K*A3*c13)

Y1=1/(X1*P1*c11+X1*P1*c12+X1*P1*c13)

X2=1/(Y1*A1*c21+Y1*A2*c22+Y1*A3*c23)

Y2=1/(X2*P2*c21+X2*P2*c22+X2*P2*c23)

X3=1/(Y1*A1*c31+Y1*A2*c32+Y1*A3*c33)

Y3=1/(X2*P3*c31+X2*P3*c32+X2*P3*c33)



I have been struggling over this for some time. Your support is highly
appreciated.

Thanks


-- 
Atanasio Alberto Tembe (Mr)
Doctoral student
Graduate School of Urban Innovation
Transportation and Urban Engineering Laboratory
Yokohama National University
Tel: +81-(0)80-4605-1305
Mail: tembe-atanasio...@ynu.jp
      manote...@gmail.com

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Query regarding simulating weibull aft model with predefined censoring rate

2018-07-18 Thread Bert Gunter
Off topic for r-help (see posting guide linked below).

Suggest posting on stats.stackexchange.com instead.

Cheers,
Bert



Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )

On Wed, Jul 18, 2018 at 12:16 PM, Fabiha Binte Farooq 
wrote:

> Hi there,
> I am an MS student from Bangladesh. I am doing thesis in my MS degree. In
> my research, I am generating data from weibull distribution and my model is
> accelerated failure time (AFT) model. I am considering right censoring as
> well as covariates. Now I have been facing difficulties to generate
> censoring time controlling censoring proportion. I am attaching my codes
> here.
>
> Problem. I have generated censoring time using a relationship between scale
> and covariates from an article for PH model. But my model is AFT. Is it
> authentic to use it here? Please help!!!
>
> Sincerely,
> Fabiha
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/
> posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Query regarding simulating weibull aft model with predefined censoring rate

2018-07-18 Thread Fabiha Binte Farooq
Hi there,
I am an MS student from Bangladesh. I am doing thesis in my MS degree. In
my research, I am generating data from weibull distribution and my model is
accelerated failure time (AFT) model. I am considering right censoring as
well as covariates. Now I have been facing difficulties to generate
censoring time controlling censoring proportion. I am attaching my codes
here.

Problem. I have generated censoring time using a relationship between scale
and covariates from an article for PH model. But my model is AFT. Is it
authentic to use it here? Please help!!!

Sincerely,
Fabiha
library(survival)
##true coefficients
b0=.25## intercept
b1=2
b2=3  ## regression coefficients 
b3=.5
n=30
## covariates
library(MASS)
mu=rep(0,3)
sigma=matrix(.7,nrow=3,ncol=3)+diag(3)*.3
x = mvrnorm(n, mu=mu, Sigma=sigma)
x1=x[,1]
x2=x[,2]
x3=x[,3]
## real survival times
t = rweibull(n, shape=.5, scale=exp((b0+b1*x1+b2*x2+b3*x3)))
## censoring times
c = rweibull(n, shape=.5, scale=exp(-(b0+b1*x1+b2*x2+b3*x3)/.5))
## observed times
time = pmin(t, c)
time 
## censoring indicator
status=as.numeric(c>t)
status
## data frame
data=data.frame(time,status,x1,x2,x3)
data
## model fitting
model=survreg(Surv(time, status)~x1+x2+x3, dist="weibull", data=data)
summary=summary(model)
summary
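On the question of controlling the censoring proportion, one common trick (a hedged sketch, not taken from the article mentioned) is to give the censoring times an exponential distribution and solve for the rate that yields the target expected censoring fraction:

```r
set.seed(42)
t_event <- rweibull(1e4, shape = 0.5, scale = 1)   # simulated event times
target  <- 0.30                                    # desired censoring rate

# For C ~ Exp(rate), P(C < t) = 1 - exp(-rate * t), so the expected
# censoring proportion over the simulated event times is a smooth,
# increasing function of the rate; solve it for the target with uniroot().
prop_cens <- function(rate) mean(1 - exp(-rate * t_event)) - target
rate_hat  <- uniroot(prop_cens, c(1e-8, 1e3))$root

cens <- rexp(length(t_event), rate_hat)  # censoring times
mean(cens < t_event)                     # close to 0.30
```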

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Query on the Arimax modeling results

2018-05-25 Thread Sanchi Bhatia
Hi R team,



We’ve run Arimax models in R. We had a lot of queries around the
interpretation of the outputs.



Dependent variable = Volume (Growth %)

Independent variables = 3 macroeconomic variables (Growth %)



Following is the line of code



Arimax.Model <- auto.arima(y = input.data[,"Volume"], xreg =
input.data[,model.vars], seasonal = F)



Following is the output



Series: input.data[, "Volume"]

Regression with ARIMA(0,0,0) errors



Coefficients:
      Birth_Rate_Change  Proportion_Female_labour  Females_20_39
                97.7658                    1.3701         9.7528
s.e.            23.8575                     0.305         3.9874

sigma^2 estimated as 4.316:  log likelihood=-24.07
AIC=56.15   AICc=61.86   BIC=58.09



Query:

   1. Could you help us understand the interpretation of the coefficients
      obtained? Even if it's not a growth model, how would we interpret the
      coefficients?
   2. Is there a limit on the number of regressors we can get in the model
      in an ARIMAX modeling exercise in R? This is because we are only
      getting 3 regressors at max.



Will be very thankful if you could provide answers to our queries.



Best Regards



Regards,
Sanchi
………
Sanchi Bhatia
Senior Analyst  |  o +91-124-495-3813  |  m +91-9810531725
Absolutdata – Intelligent Analytics
San Francisco  |  London  |  Dubai  |  New Delhi  |  Bangalore  |  Singapore


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Query - Merging and conditional replacement of values in a data frame

2017-02-13 Thread MacQueen, Don
How about this?

foo <- merge(df1, df2, all=TRUE)

is.new <- !is.na(foo$v11)
foo$v1[is.new] <- foo$v11[is.new]

foo <- foo[, names(df1)]

> foo
  time  v1 v2 v3
1    1   2  3  4
2    2   5  6  4
3    3 112  3  4
4    4 112  3  4
5    5   2  3  4
6    6   2  3  4


-- 
Don MacQueen

Lawrence Livermore National Laboratory
7000 East Ave., L-627
Livermore, CA 94550
925-423-1062


On 2/11/17, 4:13 PM, "R-help on behalf of Bhaskar Mitra" 
 wrote:

Hello Everyone,

I have two data frames df1 and df2 as shown below. They
are of different length. However, they have one common column - time.

df1 <-
time v1  v2 v3
1 2   3  4
2 5   6  4
3 1   3  4
4 1   3  4
5 2   3  4
6 2   3  4


df2 <-
time v11  v12 v13
3 112   3  4
4 112   3  4

By matching the 'time' column in df1 and df2, I am trying to modify column
'v1' in df1 by replacing it
with values in column 'v11' in df2. The modified df1 should look something
like this:

df1 <-
time v1   v2 v3
1 2   3  4
2 5   6  4
3 112 3  4
4 112 3  4
5 2   3  4
6 2   3  4

I tried to use the 'merge' function to combine df1 and df2 followed by
the conditional 'ifelse' statement. However, that doesn't seem to work.

Can I replace the values in df1 by not merging the two data frames?

Thanks for your help,

Regards,
Bhaskar

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Query - Merging and conditional replacement of values in a data frame

2017-02-12 Thread Bhaskar Mitra
Thanks for all your help. This is helpful.

Best,
Bhaskar

On Sun, Feb 12, 2017 at 4:35 AM, Jim Lemon  wrote:

> Hi Bhaskar,
> Maybe:
>
> df1 <-read.table(text="time v1  v2 v3
> 1 2   3  4
> 2 5   6  4
> 3 1   3  4
> 4 1   3  4
> 5 2   3  4
> 6 2   3  4",
> header=TRUE)
>
>
> df2 <-read.table(text="time v11  v12 v13
> 3 112   3  4
> 4 112   3  4",
> header=TRUE)
>
> for(time1 in df1$time) {
>  time2<-which(df2$time==time1)
>  if(length(time2)) df1[df1$time==time1,]<-df2[time2,]
> }
>
> Jim
>
>
> On Sun, Feb 12, 2017 at 11:13 AM, Bhaskar Mitra
>  wrote:
> > Hello Everyone,
> >
> > I have two data frames df1 and df2 as shown below. They
> > are of different length. However, they have one common column - time.
> >
> > df1 <-
> > time v1  v2 v3
> > 1 2   3  4
> > 2 5   6  4
> > 3 1   3  4
> > 4 1   3  4
> > 5 2   3  4
> > 6 2   3  4
> >
> >
> > df2 <-
> > time v11  v12 v13
> > 3 112   3  4
> > 4 112   3  4
> >
> > By matching the 'time' column in df1 and df2, I am trying to modify
> column
> > 'v1' in df1 by replacing it
> > with values in column 'v11' in df2. The modified df1 should look
> something
> > like this:
> >
> > df1 <-
> > time v1   v2 v3
> > 1 2   3  4
> > 2 5   6  4
> > 3 112 3  4
> > 4 112 3  4
> > 5 2   3  4
> > 6 2   3  4
> >
> > I tried to use the 'merge' function to combine df1 and df2 followed by
> > the conditional 'ifelse' statement. However, that doesn't seem to work.
> >
> > Can I replace the values in df1 by not merging the two data frames?
> >
> > Thanks for your help,
> >
> > Regards,
> > Bhaskar
> >
> > [[alternative HTML version deleted]]
> >
> > __
> > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/
> posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Query - Merging and conditional replacement of values in a data frame

2017-02-12 Thread Jim Lemon
Hi Bhaskar,
Maybe:

df1 <-read.table(text="time v1  v2 v3
1 2   3  4
2 5   6  4
3 1   3  4
4 1   3  4
5 2   3  4
6 2   3  4",
header=TRUE)


df2 <-read.table(text="time v11  v12 v13
3 112   3  4
4 112   3  4",
header=TRUE)

for(time1 in df1$time) {
 time2<-which(df2$time==time1)
 if(length(time2)) df1[df1$time==time1,]<-df2[time2,]
}

Jim


On Sun, Feb 12, 2017 at 11:13 AM, Bhaskar Mitra
 wrote:
> Hello Everyone,
>
> I have two data frames df1 and df2 as shown below. They
> are of different length. However, they have one common column - time.
>
> df1 <-
> time v1  v2 v3
> 1 2   3  4
> 2 5   6  4
> 3 1   3  4
> 4 1   3  4
> 5 2   3  4
> 6 2   3  4
>
>
> df2 <-
> time v11  v12 v13
> 3 112   3  4
> 4 112   3  4
>
> By matching the 'time' column in df1 and df2, I am trying to modify column
> 'v1' in df1 by replacing it
> with values in column 'v11' in df2. The modified df1 should look something
> like this:
>
> df1 <-
> time v1   v2 v3
> 1 2   3  4
> 2 5   6  4
> 3 112 3  4
> 4 112 3  4
> 5 2   3  4
> 6 2   3  4
>
> I tried to use the 'merge' function to combine df1 and df2 followed by
> the conditional 'ifelse' statement. However, that doesn't seem to work.
>
> Can I replace the values in df1 by not merging the two data frames?
>
> Thanks for your help,
>
> Regards,
> Bhaskar
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Query - Merging and conditional replacement of values in a data frame

2017-02-11 Thread Jeff Newmiller

Or use rownames and subscripting?

df1 <- read.table( text=
"time v1  v2 v3
1 2   3  4
2 5   6  4
3 1   3  4
4 1   3  4
5 2   3  4
6 2   3  4
",header=TRUE)

df2 <- read.table( text=
"time v11  v12 v13
3 112   3  4
4 112   3  4
",header=TRUE)

df3 <- df1
rownames( df3 ) <- df3$time
df3[ as.character( df2$time ), "v1" ] <- df2[ , "v11" ]
df3
df3[ "7", c( "time", "v1" ) ] <- data.frame( time=7, v1=2 )
df3
df2b <- data.frame( time=c(7,8), v2=c(4,5), v3=c(6,7) )
df2b
df3[ df2b$time, c( "time", "v2", "v3" ) ] <- df2b
df3

On Sat, 11 Feb 2017, Bert Gunter wrote:


Your "assignments" (<-) are not legitimate R code that can be cut and
pasted. Learn to use dput() to provide examples that we can use.

You fail to say whether the time column of df2 is a proper subset of
df1 or may contain times not in df1. I shall assume the latter. You
also did not say whether the time values occur in order in both data
frames. I shall assume they do not.

If I understand correctly, then match and subscripting will do it,
something like



df1 <-data.frame(time = 1:6, v1 = c(2,5,1,1,2,2))
df2 <- data.frame(time = 4:3,v11 = c(112,113))



wm <- match(df1$time,df2$time)
df1[!is.na(wm),"v1"] <- df2[na.omit(wm),"v11"]



df1


  time  v1
1    1   2
2    2   5
3    3 113
4    4 112
5    5   2
6    6   2

Cheers,
Bert





Bert Gunter

"The trouble with having an open mind is that people keep coming along
and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Sat, Feb 11, 2017 at 4:13 PM, Bhaskar Mitra
 wrote:

Hello Everyone,

I have two data frames df1 and df2 as shown below. They
are of different length. However, they have one common column - time.

df1 <-
time v1  v2 v3
1 2   3  4
2 5   6  4
3 1   3  4
4 1   3  4
5 2   3  4
6 2   3  4


df2 <-
time v11  v12 v13
3 112   3  4
4 112   3  4

By matching the 'time' column in df1 and df2, I am trying to modify column
'v1' in df1 by replacing it
with values in column 'v11' in df2. The modified df1 should look something
like this:

df1 <-
time v1   v2 v3
1 2   3  4
2 5   6  4
3 112 3  4
4 112 3  4
5 2   3  4
6 2   3  4

I tried to use the 'merge' function to combine df1 and df2 followed by
the conditional 'ifelse' statement. However, that doesn't seem to work.

Can I replace the values in df1 by not merging the two data frames?

Thanks for your help,

Regards,
Bhaskar

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



---
Jeff NewmillerThe .   .  Go Live...
DCN:Basics: ##.#.   ##.#.  Live Go...
  Live:   OO#.. Dead: OO#..  Playing
Research Engineer (Solar/BatteriesO.O#.   #.O#.  with
/Software/Embedded Controllers)   .OO#.   .OO#.  rocks...1k

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Query - Merging and conditional replacement of values in a data frame

2017-02-11 Thread Bert Gunter
Your "assignments" (<-) are not legitimate R code that can be cut and
pasted. Learn to use dput() to provide examples that we can use.

You fail to say whether the time column of df2 is a proper subset of
df1 or may contain times not in df1. I shall assume the latter. You
also did not say whether the time values occur in order in both data
frames. I shall assume they do not.

If I understand correctly, then match and subscripting will do it,
something like


> df1 <-data.frame(time = 1:6, v1 = c(2,5,1,1,2,2))
> df2 <- data.frame(time = 4:3,v11 = c(112,113))

> wm <- match(df1$time,df2$time)
> df1[!is.na(wm),"v1"] <- df2[na.omit(wm),"v11"]

> df1

  time  v1
1    1   2
2    2   5
3    3 113
4    4 112
5    5   2
6    6   2

Cheers,
Bert





Bert Gunter

"The trouble with having an open mind is that people keep coming along
and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Sat, Feb 11, 2017 at 4:13 PM, Bhaskar Mitra
 wrote:
> Hello Everyone,
>
> I have two data frames df1 and df2 as shown below. They
> are of different length. However, they have one common column - time.
>
> df1 <-
> time v1  v2 v3
> 1 2   3  4
> 2 5   6  4
> 3 1   3  4
> 4 1   3  4
> 5 2   3  4
> 6 2   3  4
>
>
> df2 <-
> time v11  v12 v13
> 3 112   3  4
> 4 112   3  4
>
> By matching the 'time' column in df1 and df2, I am trying to modify column
> 'v1' in df1 by replacing it
> with values in column 'v11' in df2. The modified df1 should look something
> like this:
>
> df1 <-
> time v1   v2 v3
> 1 2   3  4
> 2 5   6  4
> 3 112 3  4
> 4 112 3  4
> 5 2   3  4
> 6 2   3  4
>
> I tried to use the 'merge' function to combine df1 and df2 followed by
> the conditional 'ifelse' statement. However, that doesn't seem to work.
>
> Can I replace the values in df1 by not merging the two data frames?
>
> Thanks for your help,
>
> Regards,
> Bhaskar
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Query - Merging and conditional replacement of values in a data frame

2017-02-11 Thread Bhaskar Mitra
Hello Everyone,

I have two data frames df1 and df2 as shown below. They
are of different length. However, they have one common column - time.

df1 <-
time v1  v2 v3
1 2   3  4
2 5   6  4
3 1   3  4
4 1   3  4
5 2   3  4
6 2   3  4


df2 <-
time v11  v12 v13
3 112   3  4
4 112   3  4

By matching the 'time' column in df1 and df2, I am trying to modify column
'v1' in df1 by replacing it
with values in column 'v11' in df2. The modified df1 should look something
like this:

df1 <-
time v1   v2 v3
1 2   3  4
2 5   6  4
3 112 3  4
4 112 3  4
5 2   3  4
6 2   3  4

I tried to use the 'merge' function to combine df1 and df2 followed by
the conditional 'ifelse' statement. However, that doesn't seem to work.

Can I replace the values in df1 by not merging the two data frames?

Thanks for your help,

Regards,
Bhaskar

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Query regarding Approximate/Fuzzy matching & String Extraction(numeric) in R

2016-09-24 Thread Bert Gunter
"So I want a **Fuzzy logic approach** to..."

That is a near meaningless buzzword.

I suggest you search on "fuzzy logic" on the rseek.org website and see
if any of the hits there does whatever it is that you have in mind.

Cheers,
Bert




Bert Gunter

"The trouble with having an open mind is that people keep coming along
and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Sat, Sep 24, 2016 at 11:49 AM, Aarushi Kaushal
 wrote:
> Hey there,
>
> I work for an organisation named Bullero Capital Pvt. Ltd. in New Delhi,
> which is involved in financial services, Portfolio management to be
> precise. Recently we've started creating ourselves a database using R for
> all the stocks etc. to be automated and hence analyzed accordingly for
> future investment purposes (data related to which is already available, and
> in our possession).
>
> I and a colleague of mine, we are currently at the data cleaning stage -
> where we need to organize and format the data according to how we want it
> in the database. The problem lies in notation & symbols used in the
> original csv data files acquired from the government website - where we
> have to do approximate matching (for efficiency) and thereby extract the
> numerics only from that string of characters from the respective columns of
> the dataframe.
>
> 1.) As of now we are looking at using the agrep function, to detect &
> locate the pattern matches namely - DIVIDEND , SPLIT, BONUS
>
> 2.) From there on carry out the extraction of the respective numeric values
> associated with these actions in to the corresponding columns -
> BONUS_NUM(Numerator for the ratio), BONUS_DEN( Denominator for the ratio),
> SPLIT_NUM(Numerator for the ratio), SPLIT_DEN (Denominator for the Ratio),
> FInal Dividend, Interim Dividend & Special Dividend.
>
>
> COLUMN PURPOSE
>
>1. DIVIDEND-RE.1/- PER SHARE
>2. AGM/DIV-RS.3.50 PER SHARE
>3. SPL DIV-RS.2.70 PER SHARE
>4. DIV - FIN 3.50RE PER SHARE + SPL-Rs.1.4
>5. FV SPLIT Rs.10 to RE.1
>6. BON 3:2 + SPLT Rs. 5 to Rs.2.5
>7. BONUS 4:1
>8. DIV:10%
>
> Ex.
> DIVIDEND-RE.1/- PER SHARE
> FINAL_DIV-1
>
> AGM/DIV-RS.3.50 PER SHARE
> FINAL_DIV-3.50
>
> SPL DIV-RS.2.70 PER SHARE
> SPECIAL DIV-2.70
>
> Ex.
> FV SPLIT Rs.10 to RE.1
> SPLIT_NUM - 1
> SPLIT_DEN - 10
>
> Ex. BONUS 4:1
> BONUS_NUM - 4
> BONUS_DEN - 1
>
> However, the problem with that is that agrep returns the vector indices
>  instead of the string indices which makes it cumbersome to extract the
> numeric values following the respective matches.
> So I want a Fuzzy logic approach to
>
>- check for the presence of SPLIT, DIVIDEND, BONUS
>- index of which ever cell the pattern match occurs in the column
>PURPOSE of the data frame
>- index position of that particular pattern in the string to extract the
>numerical value following the matched pattern
>
> *Basically Is there any way in R to determine if the patterns can be
> checked and matched approximately while returning for value - the indices
> for the same in the respective strings?**(such that if in case the symbols
> change furthermore in the future according to the government website's
> notation in the data storage, or the format/positioning/spacing changes -
> it could account for all those changes automatically.)*
> I am attaching below the .csv file consisting of just the column we need to
> carry out the cleaning in for your convenience.
>
> It would be very helpful, if we could get some guidance as to how to
> proceed further at the earliest.
>
> regards,
> aarushi kaushal
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Query regarding Approximate/Fuzzy matching & String Extraction(numeric) in R

2016-09-24 Thread David Winsemius

> On Sep 24, 2016, at 11:49 AM, Aarushi Kaushal  
> wrote:
> 
> Hey there,
> 
> I work for an organisation named Bullero Capital Pvt. Ltd. in New Delhi,
> which is involved in financial services, Portfolio management to be
> precise. Recently we've started creating ourselves a database using R for
> all the stocks etc. to be automated and hence analyzed accordingly for
> future investment purposes (data related to which is already available, and
> in our possession).
> 
> I and a colleague of mine, we are currently at the data cleaning stage -
> where we need to organize and format the data according to how we want it
> in the database. The problem lies in notation & symbols used in the
> original csv data files acquired from the government website - where we
> have to do approximate matching (for efficiency) and thereby extract the
> numerics only from that string of characters from the respective columns of
> the dataframe.
> 
> 1.) As of now we are looking at using the agrep function, to detect &
> locate the pattern matches namely - DIVIDEND , SPLIT, BONUS
> 
> 2.) From there on carry out the extraction of the respective numeric values
> associated with these actions in to the corresponding columns -
> BONUS_NUM(Numerator for the ratio), BONUS_DEN( Denominator for the ratio),
> SPLIT_NUM(Numerator for the ratio), SPLIT_DEN (Denominator for the Ratio),
> FInal Dividend, Interim Dividend & Special Dividend.
> 
> 
> COLUMN PURPOSE
> 
>   1. DIVIDEND-RE.1/- PER SHARE
>   2. AGM/DIV-RS.3.50 PER SHARE
>   3. SPL DIV-RS.2.70 PER SHARE
>   4. DIV - FIN 3.50RE PER SHARE + SPL-Rs.1.4
>   5. FV SPLIT Rs.10 to RE.1
>   6. BON 3:2 + SPLT Rs. 5 to Rs.2.5
>   7. BONUS 4:1
>   8. DIV:10%
> 
> Ex.
> DIVIDEND-RE.1/- PER SHARE
> FINAL_DIV-1
> 
> AGM/DIV-RS.3.50 PER SHARE
> FINAL_DIV-3.50
> 
> SPL DIV-RS.2.70 PER SHARE
> SPECIAL DIV-2.70
> 
> Ex.
> FV SPLIT Rs.10 to RE.1
> SPLIT_NUM - 1
> SPLIT_DEN - 10
> 
> Ex. BONUS 4:1
> BONUS_NUM - 4
> BONUS_DEN - 1
> 
> However, the problem with that is that agrep returns the vector indices
> instead of the string indices which makes it cumbersome to extract the
> numeric values following the respective matches.

Please read ?agrep which was my starting point. (I needed to see if `agrep` was 
like grep in being capable of returning character values of matches.)

Can you explain what that actually means? What would be a "string index" if it 
is not the value returned when the `agrep` parameter is set as value=TRUE?
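For what it's worth, a minimal sketch (sample strings transcribed from the question) showing that `agrep(..., value = TRUE)` returns the matched strings themselves, after which `regmatches()` can pull out the numbers:

```r
# Sample PURPOSE strings transcribed from the question
purpose <- c("DIVIDEND-RE.1/- PER SHARE",
             "AGM/DIV-RS.3.50 PER SHARE",
             "FV SPLIT Rs.10 to RE.1",
             "BONUS 4:1")

# Fuzzy match: value = TRUE returns the matched strings,
# not their positions in the vector
hits <- agrep("BONUS", purpose, max.distance = 1, value = TRUE)

# Extract every number (integer or decimal) from the first hit
nums <- regmatches(hits[1], gregexpr("[0-9]+(\\.[0-9]+)?", hits[1]))[[1]]

hits   # "BONUS 4:1"
nums   # "4" "1"
```

The same `regmatches()` step works on the DIVIDEND and SPLIT strings; only the `agrep()` pattern changes.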


> So I want a Fuzzy logic approach to
> 
>   - check for the presence of SPLIT, DIVIDEND, BONUS
>   - index of which ever cell the pattern match occurs in the column
>   PURPOSE of the data frame
>   - index position of that particular pattern in the string to extract the
>   numerical value following the matched pattern
> 
> *Basically Is there any way in R to determine if the patterns can be
> checked and matched approximately while returning for value - the indices
> for the same in the respective strings?**(such that if in case the symbols
> change furthermore in the future according to the government website's
> notation in the data storage, or the format/positioning/spacing changes -
> it could account for all those changes automatically.)*
> I am attaching below the .csv file consisting of just the column we need to
> carry out the cleaning in for your convenience.
> 
> It would be very helpful, if we could get some guidance as to how to
> proceed further at the earliest.

It would be helpful for us for _you_ to construct a simple example and explain 
what was desired from it (as is described in the Posting Guide).

-- 

David Winsemius
Alameda, CA, USA

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Query regarding Approximate/Fuzzy matching & String Extraction(numeric) in R

2016-09-24 Thread Aarushi Kaushal
Hey there,

I work for an organisation named Bullero Capital Pvt. Ltd. in New Delhi,
which is involved in financial services, Portfolio management to be
precise. Recently we've started creating a database using R, so that all the
stocks etc. can be automated and hence analyzed for future investment
purposes (the data for which is already available and in our possession).

A colleague and I are currently at the data-cleaning stage, where we need to
organize and format the data according to how we want it in the database. The
problem lies in the notation and symbols used in the original CSV data files
acquired from the government website, where we have to do approximate matching
(for efficiency) and then extract only the numerics from those character
strings in the respective columns of the data frame.

1.) As of now we are looking at using the agrep function, to detect &
locate the pattern matches namely - DIVIDEND , SPLIT, BONUS

2.) From there on, carry out the extraction of the respective numeric values
associated with these actions into the corresponding columns -
BONUS_NUM (numerator for the ratio), BONUS_DEN (denominator for the ratio),
SPLIT_NUM (numerator for the ratio), SPLIT_DEN (denominator for the ratio),
Final Dividend, Interim Dividend & Special Dividend.


COLUMN PURPOSE

   1. DIVIDEND-RE.1/- PER SHARE
   2. AGM/DIV-RS.3.50 PER SHARE
   3. SPL DIV-RS.2.70 PER SHARE
   4. DIV - FIN 3.50RE PER SHARE + SPL-Rs.1.4
   5. FV SPLIT Rs.10 to RE.1
   6. BON 3:2 + SPLT Rs. 5 to Rs.2.5
   7. BONUS 4:1
   8. DIV:10%

Ex.
DIVIDEND-RE.1/- PER SHARE
FINAL_DIV-1

AGM/DIV-RS.3.50 PER SHARE
FINAL_DIV-3.50

SPL DIV-RS.2.70 PER SHARE
SPECIAL DIV-2.70

Ex.
FV SPLIT Rs.10 to RE.1
SPLIT_NUM - 1
SPLIT_DEN - 10

Ex. BONUS 4:1
BONUS_NUM - 4
BONUS_DEN - 1

However, the problem with that is that agrep returns the vector indices
 instead of the string indices which makes it cumbersome to extract the
numeric values following the respective matches.
So I want a Fuzzy logic approach to

   - check for the presence of SPLIT, DIVIDEND, BONUS
   - index of which ever cell the pattern match occurs in the column
   PURPOSE of the data frame
   - index position of that particular pattern in the string to extract the
   numerical value following the matched pattern

*Basically, is there any way in R to check and match the patterns
approximately while returning, as the value, their indices within the
respective strings?* *(Such that if the symbols change further in the future
according to the government website's notation in the data storage, or the
format/positioning/spacing changes, it could account for all those changes
automatically.)*
I am attaching below the .csv file consisting of just the column we need to
carry out the cleaning in for your convenience.

It would be very helpful, if we could get some guidance as to how to
proceed further at the earliest.

regards,
aarushi kaushal


Re: [R] Query on the R of free soft version 3

2016-09-21 Thread John Kane
What is your operating system?

Please do not post in HTML.

John Kane
Kingston ON Canada


> -Original Message-
> From: kkam-...@echigo.ne.jp
> Sent: Wed, 21 Sep 2016 09:30:15 +0900
> To: r-help@r-project.org
> Subject: [R] Query on the R of free soft version 3
> 
> Dear
> 
> 
> 
> Although I can install the new version of the R, I can not open the soft.
> 
> 
> 
> How do I do it?
> 
> 
> 
> Kyuzi Kamoi, MD.
> 
> 
>   [[alternative HTML version deleted]]
> 
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.





Re: [R] Query on the R of free soft version 3

2016-09-21 Thread PIKAL Petr
Hi

> -Original Message-
> From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Kamin
> Kyuji
> Sent: Wednesday, September 21, 2016 2:30 AM
> To: R-help@r-project.org
> Subject: [R] Query on the R of free soft version 3
>
> Dear
>
>
>
> Although I can install the new version of the R, I can not open the soft.
>
>
>
> How do I do it?

Did you try to double-click on the R icon?

Your short question implies either mind-reading capability on our part or our
presence in your office. The latter is not the case; however, some helpers are
better at the former than others, so you may get better hints.

Cheers
Petr

>
>
>
> Kyuzi Kamoi, MD.
>
>
>   [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-
> guide.html
> and provide commented, minimal, self-contained, reproducible code.



This e-mail and any documents attached to it may be confidential and are 
intended only for its intended recipients.
If you received this e-mail by mistake, please immediately inform its sender. 
Delete the contents of this e-mail with all attachments and its copies from 
your system.
If you are not the intended recipient of this e-mail, you are not authorized to 
use, disseminate, copy or disclose this e-mail in any manner.
The sender of this e-mail shall not be liable for any possible damage caused by 
modifications of the e-mail or by delay with transfer of the email.

In case that this e-mail forms part of business dealings:
- the sender reserves the right to end negotiations about entering into a 
contract in any time, for any reason, and without stating any reasoning.
- if the e-mail contains an offer, the recipient is entitled to immediately 
accept such offer; The sender of this e-mail (offer) excludes any acceptance of 
the offer on the part of the recipient containing any amendment or variation.
- the sender insists on that the respective contract is concluded only upon an 
express mutual agreement on all its aspects.
- the sender of this e-mail informs that he/she is not authorized to enter into 
any contracts on behalf of the company except for cases in which he/she is 
expressly authorized to do so in writing, and such authorization or power of 
attorney is submitted to the recipient or the person represented by the 
recipient, or the existence of such authorization is known to the recipient of 
the person represented by the recipient.

Re: [R] Query on the R of free soft version 3

2016-09-21 Thread David Winsemius

> On Sep 20, 2016, at 5:30 PM, Kamin Kyuji  wrote:
> 
> Dear
> 
> 
> 
> Although I can install the new version of the R, I can not open the soft.
> 
> 
> 
> How do I do it?

Surely you will need to tell us more than that. This just tells us you are 
having problems but nothing else.


> 
> 
> 
> Kyuzi Kamoi, MD.
> 
> 
>   [[alternative HTML version deleted]]
> 
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

David Winsemius
Alameda, CA, USA



[R] Query on the R of free soft version 3

2016-09-21 Thread Kamin Kyuji
Dear

 

Although I can install the new version of R, I cannot open the software.

 

How do I do it?

 

Kyuzi Kamoi, MD.


[[alternative HTML version deleted]]



Re: [R] Query to find minimum value in a matrix in R

2016-09-16 Thread PIKAL Petr
Hi

you can follow logic of functions by using debug and see how they operate by 
inspecting objects evaluated within functions.

See
?debug

However, it seems to me that your functions are quite complicated. If I 
understand correctly, they compute the minimum value of the upper part of a 
matrix. If so, the following function does the same, and is shorter, more 
understandable, and extensible.

min.upper <- function(x) {
  mm <- min(x[upper.tri(x)])
  x[lower.tri(x)] <- NA
  ind <- which(x == mm, arr.ind = TRUE)
  c(mm, ind)
}

mat <- structure(c(0, 5, 9, 13, 5, 0, 10, 14, 9, 10, 0, 15, 13, 14,
15, 0), .Dim = c(4L, 4L), .Dimnames = list(NULL, c("col1", "col2",
"col3", "col4")))
min.upper(mat)
[1] 5 1 2

Cheers
Petr
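Applying min.upper() to the 5x5 matrix from susmita's original question (values transcribed from the thread; the function is reprised here so the snippet runs on its own):

```r
# Petr's min.upper(), reprised so this snippet is self-contained
min.upper <- function(x) {
  mm <- min(x[upper.tri(x)])
  x[lower.tri(x)] <- NA
  ind <- which(x == mm, arr.ind = TRUE)
  c(mm, ind)
}

# The 5x5 symmetric matrix from the original question
mat5 <- matrix(c( 0, 12, 13,  8, 20,
                 12,  0, 15, 28, 88,
                 13, 15,  0,  6,  9,
                  8, 28,  6,  0, 33,
                 20, 88,  9, 33,  0), nrow = 5, byrow = TRUE)

min.upper(mat5)   # minimum 6, found at row 3, column 4
```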

> -Original Message-
> From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of susmita T
> Sent: Friday, September 16, 2016 2:09 PM
> To: r-help@r-project.org
> Subject: [R] Query to find minimum value in a matrix in R
>
> Hi,
> Good Morning! I am new to R and finding difficulty in understanding the
> code. Since few days I am stuck at single line of code which I am unable to
> understand.
> Though there may be number of logics to find min value. As a new beginner I
> am following a book and as it has the following code
>
> mind<-function(d)
> {
>   n<-nrow(d)
>   dd<-cbind(d,1:n)
>   wmins<-apply(dd[-n,],1,imin)
>   i<-which.min(wmins[2,])
>   j<-wmins[1,i]
>   return(c(d[i,j],i,j))
> }
> imin<-function(x)
> {
>   lx<-length(x)
>   i<-x[lx]
>   j<-which.min(x[(i+1):(lx-1)])
>   k<-i+j
>   return(c(k,x[k]))
> }
>
> So when executed this with mind(below matrix) I get
> 0 12  13  8   20
> 120   15  28  88
> 1315  0   6   9
> 8 28  6   0   33
> 2088  9   33  0
> the answer as 6 , row 3 column 4
>
> Due to the symmetry of the matrix , the skipping of the early part of row is
> done by using expression (x[(i+1):(lx-1)])..(which is in red color in the code
> shown above). I am unable to understand the line in red code and how it is
> implemented in the line 5(i.e wins)…(shown in pink color in the code above I
> have done necessary homework to understand but still finding it hard to get
> it. Please someone help.
>
>
>   [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-
> guide.html
> and provide commented, minimal, self-contained, reproducible code.


Re: [R] Query to find minimum value in a matrix in R

2016-09-16 Thread S Ellison
> I am unable to understand the line in red code 
Colour does not survive plain text transmission; try adding comments (# ...) 
instead, or state which line of code you do not understand.

In the mean time you could take a look, first, as 
?cbind
?apply
?'[' 

with particular attention to the meaning of negative indices (like '-n' in 
dd[-n,])
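To illustrate the negative-index and cbind() points above, a tiny sketch with hypothetical data:

```r
# Hypothetical 4x3 matrix
m <- matrix(1:12, nrow = 4)
n <- nrow(m)

m[-n, ]               # a negative row index DROPS that row: rows 1..n-1 remain

dd <- cbind(m, 1:n)   # appends each row's own index as a final column,
                      # which is what mind() does before calling apply()
```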

S Ellison



> -Original Message-
> From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of susmita T
> Sent: 16 September 2016 13:09
> To: r-help@r-project.org
> Subject: [R] Query to find minimum value in a matrix in R
> 
> Hi,
> Good Morning! I am new to R and finding difficulty in understanding the code.
> Since few days I am stuck at single line of code which I am unable to
> understand.
> Though there may be number of logics to find min value. As a new beginner I
> am following a book and as it has the following code
> 
> mind<-function(d)
> {
>   n<-nrow(d)
>   dd<-cbind(d,1:n)
>   wmins<-apply(dd[-n,],1,imin)
>   i<-which.min(wmins[2,])
>   j<-wmins[1,i]
>   return(c(d[i,j],i,j))
> }
> imin<-function(x)
> {
>   lx<-length(x)
>   i<-x[lx]
>   j<-which.min(x[(i+1):(lx-1)])
>   k<-i+j
>   return(c(k,x[k]))
> }
> 
> So when executed this with mind(below matrix) I get
> 0 12  13  8   20
> 120   15  28  88
> 1315  0   6   9
> 8 28  6   0   33
> 2088  9   33  0
> the answer as 6 , row 3 column 4
> 
> Due to the symmetry of the matrix , the skipping of the early part of row is
> done by using expression (x[(i+1):(lx-1)])..(which is in red color in the code
> shown above). I am unable to understand the line in red code and how it is
> implemented in the line 5(i.e wins)…(shown in pink color in the code above I
> have done necessary homework to understand but still finding it hard to get 
> it.
> Please someone help.
> 
> 
>   [[alternative HTML version deleted]]
> 
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.


***
This email and any attachments are confidential. Any use, copying or
disclosure other than by the intended recipient is unauthorised. If 
you have received this message in error, please notify the sender 
immediately via +44(0)20 8943 7000 or notify postmas...@lgcgroup.com 
and delete this message and any copies from your computer and network. 
LGC Limited. Registered in England 2991879. 
Registered office: Queens Road, Teddington, Middlesex, TW11 0LY, UK

[R] Query to find minimum value in a matrix in R

2016-09-16 Thread susmita T
Hi,
Good morning! I am new to R and am finding it difficult to understand some 
code. For a few days I have been stuck on a single line of code that I cannot 
understand. Although there may be several ways to find a minimum value, as a 
beginner I am following a book, which has the following code:

mind <- function(d)
{
  n <- nrow(d)
  dd <- cbind(d, 1:n)
  wmins <- apply(dd[-n, ], 1, imin)
  i <- which.min(wmins[2, ])
  j <- wmins[1, i]
  return(c(d[i, j], i, j))
}
imin <- function(x)
{
  lx <- length(x)
  i <- x[lx]
  j <- which.min(x[(i+1):(lx-1)])
  k <- i + j
  return(c(k, x[k]))
}

So when I execute mind() on the matrix below, I get
0   12  13  8   20
12  0   15  28  88
13  15  0   6   9
8   28  6   0   33
20  88  9   33  0
the answer 6, at row 3, column 4.

Due to the symmetry of the matrix, the skipping of the early part of each row 
is done with the expression x[(i+1):(lx-1)] (which is in red in the code shown 
above). I am unable to understand that line, and how it is used in line 5 
(i.e. the wmins line, shown in pink in the code above). I have done the 
necessary homework but still find it hard to get. Please, someone help.


[[alternative HTML version deleted]]


Re: [R] Query about Text Preprocessing (Encoding)

2016-05-29 Thread Duncan Murdoch

On 29/05/2016 3:20 AM, Khadija Shakeel wrote:

I want to work with the Urdu language, but R only displays Urdu text and 
cannot work with it. I want to apply the preprocessing steps of text mining, 
but R is not responding for this text.
Help me: how can I handle this problem?

here are some pictures of word cloud of Urdu text.



R doesn't currently have a translation team (see 
translation.r-project.org) for Urdu, so it may be hard for you to get 
Urdu-specific support.  However, I would guess the problems you are 
having are common to other languages that use non-Roman alphabets, and 
you may get some advice from the translation teams for one of them.


The general issues that I know of are:

 - R needs to know your encoding.  On Unix-alikes the best support is 
for UTF-8; Windows support is weaker, because Windows tends to use 
UTF-16 or other multibyte encodings, and R's support for those is mixed.


 - You need to make sure your graphics device supports your alphabet. 
Not all graphics devices have character support for all languages.


Duncan Murdoch
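As a quick check of the first point, a small sketch (the example string is illustrative, not from the original post):

```r
# A four-character word in Arabic script, written with \u escapes;
# R stores such literals as UTF-8 and marks their encoding accordingly
x <- "\u0627\u0631\u062f\u0648"

Encoding(x)               # "UTF-8"
nchar(x)                  # 4 characters ...
nchar(x, type = "bytes")  # ... occupying 8 bytes (2 bytes per code point here)
```

If `Encoding()` reports "unknown" for text read from a file, declaring the encoding at read time (e.g. the `encoding` argument of `readLines()`) is usually the first thing to fix.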



[R] Query about Text Preprocessing (Encoding)

2016-05-29 Thread Khadija Shakeel
I want to work with the Urdu language, but R only displays Urdu text and
cannot work with it. I want to apply the preprocessing steps of text mining,
but R is not responding for this text.
Help me: how can I handle this problem?

here are some pictures of word cloud of Urdu text.

-- 
Khadija Shakeel

Re: [R] Query about use of format in strptime

2016-04-11 Thread MacQueen, Don
First I added one row to your data, to illustrate a case with missing
times:

year month day hh mm hs
2007 11 19 0 0 0.00
2007 11 19 0 30 0.00
2007 11 19 1 0 0.00
2007 11 19 1 30 0.00
2007 11 19 2 0 0.00
2007 11 19 2 30 0.00
2007 11 19 3 0 0.00
2007 11 19 3 30 0.00
2007 11 19 4 0 0.00
2007 11 19 4 30 0.00
2007 11 19 6 30 0.00

(and I put it in a separate file named snowday.dat)

Then try this:

sd <- read.table('snowday.dat', sep=' ', head=TRUE)
sd$tm <- as.POSIXct( paste(sd$year, sd$month, sd$day, sd$hh, sd$mm,
sep='-'), format='%Y-%m-%d-%H-%M')
dft <- data.frame( tm=seq(min(sd$tm), max(sd$tm), by='30 min') )
sd <- merge(sd, dft, all=TRUE)



This appears to do what you are asking for (if I understand correctly).

> sd
                    tm year month day hh mm hs
1  2007-11-19 00:00:00 2007    11  19  0  0  0
2  2007-11-19 00:30:00 2007    11  19  0 30  0
3  2007-11-19 01:00:00 2007    11  19  1  0  0
4  2007-11-19 01:30:00 2007    11  19  1 30  0
5  2007-11-19 02:00:00 2007    11  19  2  0  0
6  2007-11-19 02:30:00 2007    11  19  2 30  0
7  2007-11-19 03:00:00 2007    11  19  3  0  0
8  2007-11-19 03:30:00 2007    11  19  3 30  0
9  2007-11-19 04:00:00 2007    11  19  4  0  0
10 2007-11-19 04:30:00 2007    11  19  4 30  0
11 2007-11-19 05:00:00   NA    NA  NA NA NA NA
12 2007-11-19 05:30:00   NA    NA  NA NA NA NA
13 2007-11-19 06:00:00   NA    NA  NA NA NA NA
14 2007-11-19 06:30:00 2007    11  19  6 30  0



Notes:
There is no need to use factor()
As David said, don't use POSIXlt. Use POSIXct instead.

-- 
Don MacQueen

Lawrence Livermore National Laboratory
7000 East Ave., L-627
Livermore, CA 94550
925-423-1062





On 4/11/16, 1:48 AM, "R-help on behalf of Stefano Sofia"
<r-help-boun...@r-project.org on behalf of
stefano.so...@regione.marche.it> wrote:

>Dear Jim and dear Enrico,
>thank you for your replies.
>Unfortunately your hints didn't solve my problem, and I am getting mad.
>Can I show you my whole process? I will be as quick as possible.
>I start from a data frame called Snow of the form
>
>year month day hh mm hs
>2007 11 19 0 0 0.00
>2007 11 19 0 30 0.00
>2007 11 19 1 0 0.00
>2007 11 19 1 30 0.00
>2007 11 19 2 0 0.00
>2007 11 19 2 30 0.00
>2007 11 19 3 0 0.00
>2007 11 19 3 30 0.00
>2007 11 19 4 0 0.00
>2007 11 19 4 30 0.00
>...
>
>whth semi-hourly data.
>I need to deal with date so I used strptime:
>
>Snow$data_factor <- as.factor(paste(Snow$year, Snow$month, Snow$day,
>Snow$hh, Snow$mm, sep="-"))
>Snow$data_strptime <- strptime(Snow$data_factor, format =
>"%Y-%m-%d-%H-%M")
>
>It gives me
>
>year month day hh mm hs  data_factor  data_strptime
>1 200711  19  0  0  0  2007-11-19-0-0 2007-11-19 00:00:00
>2 200711  19  0 30  0  2007-11-19-0-30  2007-11-19 00:30:00
>3 200711  19  1  0  0  2007-11-19-1-0  2007-11-19 01:00:00
>4 200711  19  1 30  0  2007-11-19-1-30  2007-11-19 01:30:00
>5 200711  19  2  0  0  2007-11-19-2-0  2007-11-19 02:00:00
>6 200711  19  2 30  0  2007-11-19-2-30  2007-11-19 02:30:00
>7 200711  19  3  0  0  2007-11-19-3-0  2007-11-19 03:00:00
>8 200711  19  3 30  0  2007-11-19-3-30  2007-11-19 03:30:00
>9 200711  19  4  0  0  2007-11-19-4-0  2007-11-19 04:00:00
>10   200711  19  4 30  0  2007-11-19-4-30  2007-11-19 04:30:00
>...
>
>The type of the column data_strptime is
>$data_strptime
>[1] "POSIXlt" "POSIXt"
>
>Because of some days (or part of them) might be missing, given a time
>interval I want to create a new data frame with all time-steps and then
>merge the new data frame with the old one.
>In order to create a new data frame with all time-steps, I thought to use
>
>df_new <- data.frame(data_strptime=seq(init_day, fin_day, by="30 mins"))
>
>and then
>
>Snow_all <- merge(df_new, Snow, by=("data_strptime"), all.x=TRUE)
>
>My problem is in dealing with init_day and fin_day, respectively for
>example "20071119" and "20071121".
>I am not able to create a sequence of class "POSIXlt" "POSIXt", in order
>to merge the two data frames.
>
>Could you please help me in this?
>Thank you again for your attention
>Stefano
>
>
>
>Da: Jim Lemon [drjimle...@gmail.com]
>Inviato: lunedì 11 aprile 2016 9.47
>A: Stefano Sofia
>Cc: r-help@r-project.org
>Oggetto: Re: [R] Query about use of format in strptime
>
>Hi Stefano,
>As the help page says:
>
>"The default for the format methods is "%Y-%m-%d %H:%M:%S" if any
>element has a time component which is not midnight, and "%Y-%m-%d"
>otherwise. This is because when the result is printed, it uses the
>default for

Re: [R] Query about use of format in strptime

2016-04-11 Thread David Winsemius

> On Apr 11, 2016, at 1:48 AM, Stefano Sofia <stefano.so...@regione.marche.it> 
> wrote:
> 
> Dear Jim and dear Enrico,
> thank you for your replies.
> Unfortunately your hints didn't solve my problem, and I am getting mad.
> Can I show you my whole process? I will be as quick as possible.
> I start from a data frame called Snow of the form
> 
> year month day hh mm hs
> 2007 11 19 0 0 0.00
> 2007 11 19 0 30 0.00
> 2007 11 19 1 0 0.00
> 2007 11 19 1 30 0.00
> 2007 11 19 2 0 0.00
> 2007 11 19 2 30 0.00
> 2007 11 19 3 0 0.00
> 2007 11 19 3 30 0.00
> 2007 11 19 4 0 0.00
> 2007 11 19 4 30 0.00
> ...
> 
> whth semi-hourly data.
> I need to deal with date so I used strptime:
> 
> Snow$data_factor <- as.factor(paste(Snow$year, Snow$month, Snow$day, Snow$hh, 
> Snow$mm, sep="-"))
> Snow$data_strptime <- strptime(Snow$data_factor, format = "%Y-%m-%d-%H-%M")
> 
> It gives me
> 
> year month day hh mm hs  data_factor  data_strptime
> 1 200711  19  0  0  0  2007-11-19-0-0 2007-11-19 00:00:00
> 2 200711  19  0 30  0  2007-11-19-0-30  2007-11-19 00:30:00
> 3 200711  19  1  0  0  2007-11-19-1-0  2007-11-19 01:00:00
> 4 200711  19  1 30  0  2007-11-19-1-30  2007-11-19 01:30:00
> 5 200711  19  2  0  0  2007-11-19-2-0  2007-11-19 02:00:00
> 6 200711  19  2 30  0  2007-11-19-2-30  2007-11-19 02:30:00
> 7 200711  19  3  0  0  2007-11-19-3-0  2007-11-19 03:00:00
> 8 200711  19  3 30  0  2007-11-19-3-30  2007-11-19 03:30:00
> 9 200711  19  4  0  0  2007-11-19-4-0  2007-11-19 04:00:00
> 10   200711  19  4 30  0  2007-11-19-4-30  2007-11-19 04:30:00
> ...
> 
> The type of the column data_strptime is
> $data_strptime
> [1] "POSIXlt" "POSIXt"
> 
> Because of some days (or part of them) might be missing, given a time 
> interval I want to create a new data frame with all time-steps and then merge 
> the new data frame with the old one.
> In order to create a new data frame with all time-steps, I thought to use
> 
> df_new <- data.frame(data_strptime=seq(init_day, fin_day, by="30 mins"))
> 
> and then
> 
> Snow_all <- merge(df_new, Snow, by=("data_strptime"), all.x=TRUE)
> 
> My problem is in dealing with init_day and fin_day, respectively for example 
> "20071119" and "20071121".
> I am not able to create a sequence of class "POSIXlt" "POSIXt", in order to 
> merge the two data frames.



First you asked about character values with dashes in them and now you want no 
dashes. Make up your mind:


init_day <- "20071119"
fin_day  <- "20071121"
df_new <- data.frame(data_strptime = seq(as.POSIXct(init_day, format = "%Y%m%d"),
                                         as.POSIXct(fin_day,  format = "%Y%m%d"),
                                         by = "30 mins"))

Do NOT use POSIXlt for dataframe columns.

-- 
David.
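For reference, a runnable version of that sequence (dates taken from the thread; the timezone is pinned to UTC here only so the step count is deterministic):

```r
init_day <- "20071119"
fin_day  <- "20071121"

# Half-hour grid from midnight on init_day to midnight on fin_day
tms <- seq(as.POSIXct(init_day, format = "%Y%m%d", tz = "UTC"),
           as.POSIXct(fin_day,  format = "%Y%m%d", tz = "UTC"),
           by = "30 mins")

length(tms)   # 97: 2 days x 48 half-hours, plus the final endpoint
```

A data frame built from `tms` can then be merged with the observations, as in Don MacQueen's reply, so that missing half-hours appear as NA rows.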


> 
> Could you please help me in this?
> Thank you again for your attention
> Stefano
> 
> 
> 
> Da: Jim Lemon [drjimle...@gmail.com]
> Inviato: lunedì 11 aprile 2016 9.47
> A: Stefano Sofia
> Cc: r-help@r-project.org
> Oggetto: Re: [R] Query about use of format in strptime
> 
> Hi Stefano,
> As the help page says:
> 
> "The default for the format methods is "%Y-%m-%d %H:%M:%S" if any
> element has a time component which is not midnight, and "%Y-%m-%d"
> otherwise. This is because when the result is printed, it uses the
> default format. If you want a specified output representation:
> 
> format(strptime(init_day, format="%Y-%m-%d-%H-%M"),"%Y-%M-%d %H:%M")
> [1] "2015-30-24 00:30"
> 
> For the "midnight" case:
> 
> format(strptime(init_day, format="%Y-%m-%d-%H-%M"),"%Y-%m-%d %H:%M")
> [1] "2015-02-24 00:00"
> 
> Jim
> 
> 
> On Mon, Apr 11, 2016 at 5:22 PM, Stefano Sofia
> <stefano.so...@regione.marche.it> wrote:
>> Dear R-list users,
>> I need to use strptime because I have to deal with date with hours and 
>> minutes.
>> I read the manual for strptime and I also looked at many examples, but when 
>> I try to apply it to my code, I always encounter some problems.
>> I try to change the default format, with no success. Why? How can I change 
>> the format?
>> 
>> 1.
>> init_day <- as.factor("2015-02-24-00-30")
>> strptime(init_day, format="%Y-%m-%d-%H-%M")
>> [1] "2015-02-24 00:30:00"
>> It works, but why also seconds are shown if in format seconds are not 
>> specified?
>> 
>> 2.
>> init_day <- as.factor("2015-02-24-0-00")

Re: [R] Query about use of format in strptime

2016-04-11 Thread Stefano Sofia
Dear Jim and dear Enrico,
thank you for your replies.
Unfortunately your hints didn't solve my problem, and I am getting mad.
Can I show you my whole process? I will be as quick as possible.
I start from a data frame called Snow of the form

year month day hh mm hs
2007 11 19 0 0 0.00
2007 11 19 0 30 0.00
2007 11 19 1 0 0.00
2007 11 19 1 30 0.00
2007 11 19 2 0 0.00
2007 11 19 2 30 0.00
2007 11 19 3 0 0.00
2007 11 19 3 30 0.00
2007 11 19 4 0 0.00
2007 11 19 4 30 0.00
...

with semi-hourly data.
I need to deal with date so I used strptime:

Snow$data_factor <- as.factor(paste(Snow$year, Snow$month, Snow$day, Snow$hh, 
Snow$mm, sep="-"))
Snow$data_strptime <- strptime(Snow$data_factor, format = "%Y-%m-%d-%H-%M")

It gives me

   year month day hh mm hs      data_factor       data_strptime
1  2007    11  19  0  0  0   2007-11-19-0-0 2007-11-19 00:00:00
2  2007    11  19  0 30  0  2007-11-19-0-30 2007-11-19 00:30:00
3  2007    11  19  1  0  0   2007-11-19-1-0 2007-11-19 01:00:00
4  2007    11  19  1 30  0  2007-11-19-1-30 2007-11-19 01:30:00
5  2007    11  19  2  0  0   2007-11-19-2-0 2007-11-19 02:00:00
6  2007    11  19  2 30  0  2007-11-19-2-30 2007-11-19 02:30:00
7  2007    11  19  3  0  0   2007-11-19-3-0 2007-11-19 03:00:00
8  2007    11  19  3 30  0  2007-11-19-3-30 2007-11-19 03:30:00
9  2007    11  19  4  0  0   2007-11-19-4-0 2007-11-19 04:00:00
10 2007    11  19  4 30  0  2007-11-19-4-30 2007-11-19 04:30:00
...

The type of the column data_strptime is
$data_strptime
[1] "POSIXlt" "POSIXt"
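As an aside, base R's ISOdatetime() builds the timestamp in one call from the
numeric component columns and returns POSIXct rather than POSIXlt (a sketch
assuming the Snow columns shown above; the tz choice is an assumption):

```r
# One call instead of paste() + strptime(); the result is POSIXct
Snow$data_posix <- ISOdatetime(Snow$year, Snow$month, Snow$day,
                               Snow$hh, Snow$mm, 0, tz = "UTC")
```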

Because some days (or parts of them) might be missing, given a time interval 
I want to create a new data frame with all time-steps and then merge the new 
data frame with the old one.
In order to create a new data frame with all time-steps, I thought to use

df_new <- data.frame(data_strptime=seq(init_day, fin_day, by="30 mins"))

and then

Snow_all <- merge(df_new, Snow, by=("data_strptime"), all.x=TRUE)
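The two steps above can be sketched end-to-end with toy values (POSIXct
throughout, since seq() and merge() handle it much better than POSIXlt; the
hs values and the missing 01:00 row are invented for illustration):

```r
# Toy Snow table with the 01:00 half-hour deliberately missing
Snow <- data.frame(
  data_strptime = as.POSIXct(c("2007-11-19 00:00", "2007-11-19 00:30",
                               "2007-11-19 01:30"), tz = "UTC"),
  hs = c(0, 0, 0)
)

# Complete half-hourly grid over the observed interval
df_new <- data.frame(
  data_strptime = seq(min(Snow$data_strptime), max(Snow$data_strptime),
                      by = "30 mins")
)

# Left join: the missing time step appears with hs = NA
Snow_all <- merge(df_new, Snow, by = "data_strptime", all.x = TRUE)
nrow(Snow_all)             # 4
sum(is.na(Snow_all$hs))    # 1
```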

My problem is in dealing with init_day and fin_day, for example "20071119" 
and "20071121" respectively.
I am not able to create a sequence of class "POSIXlt" "POSIXt", in order to 
merge the two data frames.

Could you please help me in this?
Thank you again for your attention
Stefano



Da: Jim Lemon [drjimle...@gmail.com]
Inviato: lunedì 11 aprile 2016 9.47
A: Stefano Sofia
Cc: r-help@r-project.org
Oggetto: Re: [R] Query about use of format in strptime

Hi Stefano,
As the help page says:

"The default for the format methods is "%Y-%m-%d %H:%M:%S" if any
element has a time component which is not midnight, and "%Y-%m-%d"
otherwise." This is because when the result is printed, it uses the
default format. If you want a specified output representation:

format(strptime(init_day, format="%Y-%m-%d-%H-%M"),"%Y-%M-%d %H:%M")
[1] "2015-30-24 00:30"

For the "midnight" case:

format(strptime(init_day, format="%Y-%m-%d-%H-%M"),"%Y-%m-%d %H:%M")
[1] "2015-02-24 00:00"

Jim
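To the second question in the quoted message below (why the midnight value
prints without a time): the parsed value is complete, and format() recovers
the time on demand (a small check):

```r
x <- strptime("2015-02-24-0-00", format = "%Y-%m-%d-%H-%M")
format(x, "%Y-%m-%d %H:%M")  # "2015-02-24 00:00" -- the time is stored
c(x$hour, x$min)             # 0 0 -- components present despite the terse print
```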


On Mon, Apr 11, 2016 at 5:22 PM, Stefano Sofia
<stefano.so...@regione.marche.it> wrote:
> Dear R-list users,
> I need to use strptime because I have to deal with date with hours and 
> minutes.
> I read the manual for strptime and I also looked at many examples, but when I 
> try to apply it to my code, I always encounter some problems.
> I try to change the default format, with no success. Why? How can I change 
> the format?
>
> 1.
> init_day <- as.factor("2015-02-24-00-30")
> strptime(init_day, format="%Y-%m-%d-%H-%M")
> [1] "2015-02-24 00:30:00"
> It works, but why also seconds are shown if in format seconds are not 
> specified?
>
> 2.
> init_day <- as.factor("2015-02-24-0-00")
> strptime(init_day, format="%Y-%m-%d-%H-%M")
> [1] "2015-02-24"
> Again, the specified format is not applied. Why?
>
> Thank you for your attention and your help
> Stefano
>
>
> 
>
> IMPORTANT NOTICE: This e-mail message may contain confidential 
> information and is therefore intended only for authorized recipients. 
> E-mail messages for Regione Marche clients may contain confidential and 
> legally privileged information. If you are not the specified recipient, 
> do not read, copy, forward, or store this message. If you have received 
> this message in error, forward it to the sender and delete it completely 
> from your computer system. Pursuant to art. 6 of DGR no. 1394/2008, 
> please note that, in case of necessity and urgency, the reply to this 
> e-mail message may be
