Re: [R] difference between script and a function....

2022-12-24 Thread akshay kulkarni
Dear Ivan,
   Thanks a lot.!

Thanking you,
Yours sincerely,
AKSHAY M KULKARNI

From: Ivan Krylov 
Sent: Saturday, December 24, 2022 9:27 PM
To: akshay kulkarni 
Cc: R help Mailing list 
Subject: Re: [R] difference between script and a function

On Sat, 24 Dec 2022 15:47:14 +
akshay kulkarni  wrote:

> How do you debug if there is an error, particularly if I run the
> script from the BASH prompt?

Post-mortem debugging for non-interactive R scripts can be enabled by
setting options(error = quote(dump.frames("A_SUITABLE_FILE_NAME",
to.file = TRUE))) in your script. See ?dump.frames for more information.

When you're running command-line R interactively, the main tools at
your disposal are traceback() and browser() (see their respective help
pages). There are two main ways to use the browser: (1) by setting the
debugging flag on a function via debug(function) or debugonce(function)
-- you will land in the browser when the function is called -- and (2)
by setting options(error = recover), which will launch the browser at
the time of the crash, letting you walk the stack and inspect the
values of various variables.

For more information on pure-R debugging, see The R Inferno book:
<https://www.burns-stat.com/documents/books/the-r-inferno/>

--
Best regards,
Ivan


__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] difference between script and a function....

2022-12-24 Thread Ivan Krylov
On Sat, 24 Dec 2022 15:47:14 +
akshay kulkarni  wrote:

> How do you debug if there is an error, particularly if I run the
> script from the BASH prompt? 

Post-mortem debugging for non-interactive R scripts can be enabled by
setting options(error = quote(dump.frames("A_SUITABLE_FILE_NAME",
to.file = TRUE))) in your script. See ?dump.frames for more information.
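
A minimal sketch of that post-mortem setup. The dump name crash_dump and the failing function f() are placeholders, not anything from this thread. The demo writes a small script, runs it with Rscript in a scratch directory, and checks that the dump file appears:

```r
# Write a small script that fails and dumps its call frames on error.
# "crash_dump" and f() are illustrative names only.
script <- tempfile(fileext = ".R")
writeLines(c(
  'options(error = quote({',
  '  dump.frames("crash_dump", to.file = TRUE)  # writes crash_dump.rda',
  '  q(status = 1)                              # then exit with an error status',
  '}))',
  'f <- function(x) stop("something went wrong")',
  'f(1)'
), script)

# Run it non-interactively in a scratch directory, as cron or a shell would.
workdir <- tempfile("dumpdemo")
dir.create(workdir)
old_wd <- setwd(workdir)
status <- system2(file.path(R.home("bin"), "Rscript"), shQuote(script),
                  stdout = FALSE, stderr = FALSE)
setwd(old_wd)

dump_file <- file.path(workdir, "crash_dump.rda")
file.exists(dump_file)

# Later, in an interactive session:
#   load(dump_file)        # restores an object named crash_dump
#   debugger(crash_dump)   # choose a frame and inspect its variables
```

The q(status = 1) call matters in batch use: without it the script would carry on after the dump instead of stopping with a failure status that cron or the shell can see.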

When you're running command-line R interactively, the main tools at
your disposal are traceback() and browser() (see their respective help
pages). There are two main ways to use the browser: (1) by setting the
debugging flag on a function via debug(function) or debugonce(function)
-- you will land in the browser when the function is called -- and (2)
by setting options(error = recover), which will launch the browser at
the time of the crash, letting you walk the stack and inspect the
values of various variables.
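
A small sketch of the interactive tools, assuming a made-up function f(). It only sets and clears the debugging flag so that it also runs non-interactively; the browser commands are listed in comments:

```r
# f() is a made-up function used only for illustration.
f <- function(x) {
  y <- x + 1
  y * 2
}

debug(f)                  # set the debugging flag on f
flagged <- isdebugged(f)  # TRUE: the next call to f() would open the browser
undebug(f)                # clear the flag so this sketch stays non-interactive

# In an interactive session you would call f(1) while the flag is set and,
# at the Browse[1]> prompt, use:
#   n      run the next line
#   c      continue to the end of the function
#   where  show the call stack
#   Q      quit the browser
# debugonce(f) works the same way but applies to a single call only.
#
# To enter the browser at the point of a crash instead:
#   options(error = recover)
#   f("a")   # the error in x + 1 offers a menu of frames to inspect
```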

For more information on pure-R debugging, see The R Inferno book:
<https://www.burns-stat.com/documents/books/the-r-inferno/>

-- 
Best regards,
Ivan



Re: [R] difference between script and a function....

2022-12-24 Thread akshay kulkarni
Dear Ivan,
  Thanks for the reply. One last question:

How do you debug if there is an error, particularly if I run the script
from the BASH prompt? The closest I have come to debugging a script is
sourcing it in RStudio, where everything is intuitive. But how is it done
from the Linux prompt?

Thanking you,
Yours sincerely,
AKSHAY M KULKARNI



Re: [R] difference between script and a function....

2022-12-24 Thread Ivan Krylov
On Sat, 24 Dec 2022 14:54:52 +
akshay kulkarni  wrote:

> If there is  some error in the script, the error will be output to
> the stdout? Or to the file that it creates for saving the output of
> the script?

When using Rscript: to stderr, to be precise.

When using R CMD BATCH: to the Rout file.
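
A sketch of the Rscript case, using a throwaway two-line script written to a temporary file (the script contents are invented for the demo). Regular output and the error message end up in different streams, which is what a shell redirect such as `Rscript script.R > out.log 2> err.log` relies on:

```r
# A throwaway script: one line of normal output, then an error.
script <- tempfile(fileext = ".R")
writeLines(c('cat("result: 42\\n")', 'stop("boom")'), script)

# Capture the two streams in separate files.
out_file <- tempfile(fileext = ".out")
err_file <- tempfile(fileext = ".err")
status <- system2(file.path(R.home("bin"), "Rscript"), shQuote(script),
                  stdout = out_file, stderr = err_file)

stdout_lines <- readLines(out_file)  # what cat()/print() wrote
stderr_lines <- readLines(err_file)  # the error message and "Execution halted"
status                               # non-zero because the script failed
```

With `R CMD BATCH script.R`, both the echoed commands and the error text go to script.Rout instead.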

> Will ALL the intermediate objects be stored ?

May I shamelessly plug my own package, depcache? Using its cache()
function, you can save intermediate objects into *.rds files in a
relatively transparent manner. By default, they are stored in a
subdirectory of the working directory. A downside is that I haven't
come up with a useful cache expiration strategy yet.

A more involved approach that will also let you access your
intermediates is provided by the "targets" package, not only on CRAN
but also peer-reviewed by rOpenSci.
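
For comparison, a hand-rolled version of the idea in base R. This is not the API of depcache or targets: cached_rds() and slow_step() are hypothetical helpers written for this sketch, and there is no invalidation when the code or inputs change:

```r
# cached_rds() is a hypothetical helper, not part of depcache or targets.
cached_rds <- function(path, compute) {
  if (file.exists(path)) {
    return(readRDS(path))   # reuse the stored intermediate
  }
  value <- compute()        # otherwise compute it ...
  saveRDS(value, path)      # ... and store it for the next run
  value
}

cache_file <- file.path(tempdir(), "slow_step.rds")
unlink(cache_file)          # start from a clean state for the demo

calls <- 0
slow_step <- function() {
  calls <<- calls + 1       # count how often the real work runs
  sum(sqrt(1:1000))
}

first  <- cached_rds(cache_file, slow_step)  # computes and writes the file
second <- cached_rds(cache_file, slow_step)  # reads the file, no recomputation
```

The packages mentioned above exist precisely because deciding when such a file is stale is the hard part.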

-- 
Best regards,
Ivan



Re: [R] difference between script and a function....

2022-12-24 Thread akshay kulkarni
Dear Simmons,
Thanks a lot. One more question:

If there is an error in the script, will the error be written to stdout,
or to the file that is created for saving the output of the script?

Thanking you,
Yours sincerely,
AKSHAY M KULKARNI



Re: [R] difference between script and a function....

2022-12-24 Thread Andrew Simmons
1. The execution environment for a script is the global environment. Each R
script run from a shell will be given its own global environment. Each R
session has exactly one global environment, but you can have several active
R sessions.

2. Using return() in a script rather than in a function throws an error:

Error: no function to return from, jumping to top level
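
One common way around this, sketched under the assumption that the script's work can be wrapped in a function. The name main() is only a convention, not something R requires:

```r
# main() is an illustrative wrapper; R does not treat the name specially.
main <- function(x) {
  if (x < 0) {
    return(1L)    # early exit is fine here because we are inside a function
  }
  cat("processing", x, "\n")
  0L
}

status <- main(5)

# In a real script the last line would typically be:
#   quit(status = status)
# which ends the R process with that exit code.
```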

3. You can use print(), cat(), and writeLines() to write to the output. R
does not save your objects when the script exits. You could use save() or
save.image() to save your objects, or saveRDS() if you only want to save
a single object. You could also use source() if you just want the objects
created by another script.
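
A short sketch of those calls, with made-up objects x and fit_data and temporary files standing in for real paths. The new.env() at the end only imitates a second R session for the purpose of the demo:

```r
x <- c(a = 1, b = 2)
fit_data <- data.frame(id = 1:3, value = c(2.5, 3.5, 4.5))

# Writing to the output
print(x)                                        # formatted as at the console
cat("mean value:", mean(fit_data$value), "\n")  # plain text, no indices or quotes
writeLines("done")

# One object per file; the reader chooses the name on the way back in
rds_file <- tempfile(fileext = ".rds")
saveRDS(fit_data, rds_file)
restored <- readRDS(rds_file)

# Several named objects in one file; load() restores them under their own names
rda_file <- tempfile(fileext = ".RData")
save(x, fit_data, file = rda_file)
other_session <- new.env()                      # stands in for a second R session
loaded_names <- load(rda_file, envir = other_session)
```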

4. Will you have shared access to the objects in another R session? No,
objects are not shared, unless you've got something weird set up with
external pointers. Each session has its own global environment.

5. Any of the help pages for the functions listed above would help. You
can also do

?utils::Rscript



[R] difference between script and a function....

2022-12-24 Thread akshay kulkarni
Dear members,
I will be running scripts automatically on RHEL with crontab. I want to
know the differences between running a script and running a function. In
particular:


  1.  An execution environment is created for a function. What about a
script? Is its execution environment the global environment?
  2.  What happens if I use return() at the end of a script?
  3.  I came to know that when you use R CMD BATCH, a file is created and
the output is saved to that file. How do I output what I want to output
from a script? Should I use return()? Will ALL the intermediate objects
be stored?
  4.  When I run a script, it will have access to the objects in the
global environment, right? That is, I can use all the functions in the
global environment, right?
  5.  Any links to resources for further information? How do I access the
relevant documentation?

Any help will be greatly appreciated.

thanking you,
yours sincerely,
AKSHAY M KULKARNI



Re: [R] Difference in offset values in R and STATA

2020-04-14 Thread Patrick (Malone Quantitative)
Also, is the default base for "log" the same in both programs?
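
On the R side this is easy to check: log() is the natural logarithm unless a base is supplied. What the other program uses has to be confirmed in its own documentation. The months vector here is made up for illustration:

```r
months <- c(1, 10, 100)       # illustrative values only

natural <- log(months)        # natural logarithm (base e) is the default
base10  <- log10(months)      # base 10
same10  <- log(months, base = 10)  # equivalent to log10()
```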


On Tue, Apr 14, 2020 at 3:36 PM Sorkin, John  wrote:
>
> Your question is unlikely to be answered unless you post code demonstrating 
> the problem
> J
>
> John David Sorkin M.D., Ph.D.
> Professor of Medicine
> Chief, Biostatistics and Informatics
> University of Maryland School of Medicine Division of Gerontology and 
> Geriatric Medicine
> Baltimore VA Medical Center
> 10 North Greene Street
> GRECC (BT/18/GR)
> Baltimore, MD 21201-1524
> (Phone) 410-605-7119
> (Fax) 410-605-7913 (Please call phone number above prior to 
> faxing)
>
> On Apr 14, 2020, at 3:23 PM, Haddison Mureithi  
> wrote:
>
> Hae guys,
> When performing a poisson regression sometimes one has to input the
> offset/exposure variable to account for individual time spent in a certain
> therapy before acquiring a certain condition of interest, whereby in r
> offset(log(months)) and in STATA offset(log(months)) results differ.
> Thereby increasing the difference of the overall outcome, i don't know why
> any one know why?



-- 
Patrick S. Malone, Ph.D., Malone Quantitative
NEW Service Models: http://malonequantitative.com

He/Him/His



Re: [R] Difference in offset values in R and STATA

2020-04-14 Thread Sorkin, John
Your question is unlikely to be answered unless you post code demonstrating the 
problem.
J

John David Sorkin M.D., Ph.D.
Professor of Medicine
Chief, Biostatistics and Informatics
University of Maryland School of Medicine Division of Gerontology and Geriatric 
Medicine
Baltimore VA Medical Center
10 North Greene Street
GRECC (BT/18/GR)
Baltimore, MD 21201-1524
(Phone) 410-605-7119
(Fax) 410-605-7913 (Please call phone number above prior to 
faxing)

On Apr 14, 2020, at 3:23 PM, Haddison Mureithi  
wrote:

Hae guys,
When performing a poisson regression sometimes one has to input the
offset/exposure variable to account for individual time spent in a certain
therapy before acquiring a certain condition of interest, whereby in r
offset(log(months)) and in STATA offset(log(months)) results differ.
Thereby increasing the difference of the overall outcome, i don't know why
any one know why?



[R] Difference in offset values in R and STATA

2020-04-14 Thread Haddison Mureithi
Hi guys,
When performing a Poisson regression, one sometimes has to include an
offset/exposure variable to account for the time each individual spent in
a certain therapy before acquiring the condition of interest. When I use
offset(log(months)) in R and offset(log(months)) in STATA, the results
differ, which increases the difference in the overall outcome. I don't
know why. Does anyone know why?
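
For reference, a self-contained sketch of the R side on simulated data. The variables events, x and months and the coefficients used to generate them are invented for the demo. It shows the two equivalent ways of supplying the offset to glm(), so that the R fit can be compared with the output of another program:

```r
set.seed(1)
n <- 200
months <- runif(n, 1, 24)             # simulated exposure time
x <- rbinom(n, 1, 0.5)                # simulated binary covariate
events <- rpois(n, lambda = months * exp(-2 + 0.5 * x))

# Offset inside the formula
fit_formula <- glm(events ~ x + offset(log(months)), family = poisson)

# Offset passed as an argument: the same model
fit_argument <- glm(events ~ x, offset = log(months), family = poisson)

coef(fit_formula)
coef(fit_argument)
```

In both cases the offset enters the linear predictor with its coefficient fixed at 1, so the remaining coefficients describe the log rate per month.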



[R] Difference between cor_auto (qgraph package) and lavCor (lavaan package)

2019-04-19 Thread Enjalbert Line via R-help
Hello,
I would like to know the difference between two commands, cor_auto (from the
qgraph package) and lavCor (from the lavaan package), for computing a
polychoric correlation matrix in order to do a network analysis.
I have the responses to the SF-36 questionnaire (36 items with ordered
responses) and I would like to obtain a polychoric correlation matrix by
following Sacha Epskamp's method. But I get a warning when using cor_auto:
"Correlation matrix is not positive definite. Finding nearest positive definite
matrix". Can I ignore this warning or not? I ask because when I compute a
polychoric matrix in STATA, I do not get a warning message.
I also don't understand how lavCor differs, because when I use lavCor I do
not get the warning message either.
cor.auto1 <- cor_auto(Data_1it)
cor.lav1 <- lavCor(Data_1it, ordered = names(Data_1it))
Thank you in advance for your answer. 
Line
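
One way to see what the warning is about, independent of either package: a symmetric matrix is positive definite only if all of its eigenvalues are positive, and a matrix assembled from pairwise estimates need not satisfy that. The sketch below uses a made-up 3 x 3 matrix, not the SF-36 data, and base R only:

```r
# Hypothetical "correlation" matrix, invented for illustration:
# every entry is a valid correlation, but the three are mutually inconsistent.
R_bad <- matrix(c( 1.0,  0.9, -0.9,
                   0.9,  1.0,  0.9,
                  -0.9,  0.9,  1.0), nrow = 3, byrow = TRUE)

ev_bad <- eigen(R_bad, symmetric = TRUE, only.values = TRUE)$values
is_pd_bad <- all(ev_bad > 0)    # FALSE: at least one eigenvalue is negative

# For comparison, the identity matrix is positive definite.
ev_ok <- eigen(diag(3), symmetric = TRUE, only.values = TRUE)$values
is_pd_ok <- all(ev_ok > 0)      # TRUE
```

Applying the same eigen() check to cor.auto1 and cor.lav1 would show whether the two matrices actually differ in this respect.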




Re: [R] difference between ifelse and if...else?

2017-12-13 Thread MacQueen, Don
Because ifelse is not intended to be an alternative to if ... else. They exist 
for different purposes.

(besides the other replies, a careful reading of their help pages, and trying 
the examples, should explain the different purposes).

--
Don MacQueen
Lawrence Livermore National Laboratory
7000 East Ave., L-627
Livermore, CA 94550
925-423-1062
Lab cell 925-724-7509
 
 

On 12/13/17, 7:31 AM, "R-help on behalf of Jinsong Zhao" 
 wrote:

Hi there,

I don't know why the following codes are return different results.

 > ifelse(3 > 2, 1:3, length(1:3))
[1] 1
 > if (3 > 2) 1:3 else length(1:3)
[1] 1 2 3

Any hints?

Best,
Jinsong



Re: [R] difference between ifelse and if...else?

2017-12-13 Thread Duncan Murdoch

On 13/12/2017 10:31 AM, Jinsong Zhao wrote:

Hi there,

I don't know why the following codes are return different results.

  > ifelse(3 > 2, 1:3, length(1:3))
[1] 1
  > if (3 > 2) 1:3 else length(1:3)
[1] 1 2 3

Any hints?


The documentation in the help pages ?ifelse and ?"if" explains it pretty 
clearly.  Think of ifelse() as a function with vector inputs and a 
vector output, and if() as a flow control construct.


Duncan Murdoch



Re: [R] difference between ifelse and if...else?

2017-12-13 Thread Eric Berger
ifelse returns the "shape" of the first argument.

In your ifelse the shape of "3 > 2" is a vector of length one, so it will
return a vector of length one.

Avoid "ifelse" until you are very comfortable with it. It can often burn
you.
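
A small sketch of the difference, reusing the expression from the question and adding an element-wise case (the vector x is made up for the example):

```r
test <- 3 > 2                         # logical vector of length one

a <- ifelse(test, 1:3, length(1:3))   # result has the length of test
b <- if (test) 1:3 else length(1:3)   # the whole chosen branch is returned

# ifelse() is meant for element-wise choices over a vector condition
x <- c(-2, 0, 5)
signs <- ifelse(x > 0, "positive", "not positive")
```

Here a is the single value 1 and b is the full vector 1 2 3, matching the output in the original post.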




On Wed, Dec 13, 2017 at 5:33 PM, jeremiah rounds 
wrote:

> ifelse is vectorized.
>
> On Wed, Dec 13, 2017 at 7:31 AM, Jinsong Zhao  wrote:
>
> > Hi there,
> >
> > I don't know why the following codes are return different results.
> >
> > > ifelse(3 > 2, 1:3, length(1:3))
> > [1] 1
> > > if (3 > 2) 1:3 else length(1:3)
> > [1] 1 2 3
> >
> > Any hints?
> >
> > Best,
> > Jinsong
> >


Re: [R] difference between ifelse and if...else?

2017-12-13 Thread jeremiah rounds
ifelse is vectorized.

On Wed, Dec 13, 2017 at 7:31 AM, Jinsong Zhao  wrote:

> Hi there,
>
> I don't know why the following codes are return different results.
>
> > ifelse(3 > 2, 1:3, length(1:3))
> [1] 1
> > if (3 > 2) 1:3 else length(1:3)
> [1] 1 2 3
>
> Any hints?
>
> Best,
> Jinsong
>


[R] difference between ifelse and if...else?

2017-12-13 Thread Jinsong Zhao

Hi there,

I don't know why the following code returns different results.

> ifelse(3 > 2, 1:3, length(1:3))
[1] 1
> if (3 > 2) 1:3 else length(1:3)
[1] 1 2 3

Any hints?

Best,
Jinsong



[R] difference-in-difference method for estimating causal impact,

2017-04-18 Thread Ralf Pfeiffer via R-help



Hello,

I want to estimate the causal impact on a scale variable, using the
difference-in-difference method with the following 4 groups:
- control and treatment group (counterfactual analysis)
- two periods, with measurement before and after treatment.

After estimating the causal effect (if one exists, of course), I would like
to carry out deeper analyses: causal mediation analysis and fixed-effects
analysis.

Does anyone know a package or packages with which one can do these estimations,
calculations and graphics?
Thanks a lot for any information.
iksmax
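
For the basic two-group, two-period design described above, the estimate itself needs nothing beyond base R: it is the interaction coefficient in a linear model. The sketch below runs on simulated data; all variable names, the sample size and the true effect of 2 are invented for the demo:

```r
set.seed(42)
n_per_cell <- 250
d <- expand.grid(id = seq_len(n_per_cell), treated = 0:1, post = 0:1)

true_effect <- 2                      # simulated treatment effect
d$y <- 10 + 1.5 * d$treated + 0.5 * d$post +
  true_effect * d$treated * d$post + rnorm(nrow(d), sd = 0.5)

# The coefficient on treated:post is the difference-in-difference estimate
fit <- lm(y ~ treated * post, data = d)
did_estimate <- coef(fit)[["treated:post"]]

# The same number from the four cell means
m <- tapply(d$y, list(d$treated, d$post), mean)
did_manual <- (m["1", "1"] - m["1", "0"]) - (m["0", "1"] - m["0", "0"])
```

summary(fit) then gives a standard error for the interaction term. The mediation and fixed-effects analyses mentioned in the question are outside the scope of this sketch.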


   

Re: [R] Difference between console output of cat and print

2017-04-17 Thread Bert Gunter
Well,...

cat() is as Jeff describes.

However, print() is a generic function (see ?UseMethod) for which
there are literally hundreds of different methods that may do far more
than, or something quite different from, merely outputting character
strings. For example, the print method for trellis objects,
print.trellis, draws a graph of the object.

The print method called for print(10) is the default method, for
which ?print.default should be consulted: it is actually printing a
vector of length 1, and the print method for vectors labels each line
with the index of the first item printed.
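
A sketch of these points, using capture.output() to record what each call writes. The "temperature" class and its print method are invented for the example:

```r
out_cat   <- capture.output(cat(10))      # bare value, no index
out_print <- capture.output(print(10))    # default method labels the line with [1]

# print() shows the string as it would be typed; cat() interprets the escape
out_str_print <- capture.output(print("a\nb"))
out_str_cat   <- capture.output(cat("a\nb\n"))

# A user-defined S3 method: print() dispatches on the class of its argument.
# "temperature" is a made-up class for this sketch.
print.temperature <- function(x, ...) {
  cat(format(unclass(x)), "degrees C\n")
  invisible(x)
}
temp <- structure(21.5, class = "temperature")
out_s3 <- capture.output(print(temp))
```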

Please read "An Introduction to R" or another R tutorial to learn about S3 methods.

Cheers,
Bert
Bert Gunter

"The trouble with having an open mind is that people keep coming along
and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Mon, Apr 17, 2017 at 3:01 PM, Jeff Newmiller
 wrote:
> Please stop posting html email per the Posting Guide. You are only going to 
> reduce the chance of successfully communicating your questions to experienced 
> users on this list.
>
> Re cat vs print: the purpose of print is to show values much as they are 
> entered in source code, so quotes and escaped characters such as "\n" are 
> shown. Cat is intended to provide a way to send characters straight to the 
> console so the effects of special characters can be visible (i.e. getting 
> text on the next line when a "\n" occurs in a string). Thus the element 
> numbering is not relevant there.
> --
> Sent from my phone. Please excuse my brevity.
>
> On April 17, 2017 12:18:36 AM PDT, Data MagicPro  
> wrote:
>>Since both *cat * as well as * print * create a character vector for
>>outputing on the screen. Still both give different results as apparant
>>below. My query is why so ?
>>
>>
>>> cat(10)
>>10
>>> print(10)
>>[1] 10
>>
>>Why is the [1] of index number missing in case of *cat *?
>>
>>Thanks
>>Ramnik
>>


Re: [R] Difference between console output of cat and print

2017-04-17 Thread Jeff Newmiller
Please stop posting html email per the Posting Guide. You are only going to 
reduce the chance of successfully communicating your questions to experienced 
users on this list.

Re cat vs print: the purpose of print() is to show values much as they are 
entered in source code, so quotes and escaped characters such as "\n" are 
shown. cat() is intended to provide a way to send characters straight to the 
console so the effects of special characters are visible (e.g. getting text 
on the next line when a "\n" occurs in a string). Thus the element numbering is 
not relevant there.
-- 
Sent from my phone. Please excuse my brevity.

On April 17, 2017 12:18:36 AM PDT, Data MagicPro  
wrote:
>Since both *cat * as well as * print * create a character vector for
>outputing on the screen. Still both give different results as apparant
>below. My query is why so ?
>
>
>> cat(10)
>10
>> print(10)
>[1] 10
>
>Why is the [1] of index number missing in case of *cat *?
>
>Thanks
>Ramnik
>


[R] Difference between console output of cat and print

2017-04-17 Thread Data MagicPro
Both *cat* and *print* create a character vector for output on the
screen, yet they give different results, as is apparent below. Why is
that?


> cat(10)
10
> print(10)
[1] 10

Why is the [1] index number missing in the case of *cat*?

Thanks
Ramnik



Re: [R] difference metric info of same font on different device

2017-04-08 Thread Jinsong Zhao

On 2017/4/7 23:13, Jeff Newmiller wrote:

I think it is a fundamental characteristic of graphics drivers that output will 
look different in the details... you are on a wild goose chase. Postscript in 
particular has a huge advantage in font presentation over other graphics output 
mechanisms.



I agree with your opinion on PostScript.

However, as shown in the plots attached to my previous post, the glyph 
metric info that R uses for any CID-keyed font is based on an assumption. 
In fact, it is not possible to get the metric info of a CID-keyed font 
without accessing the actual font, which may be in TrueType or OpenType/CFF 
format. And I don't think R can find the actual font.


Best,
Jinsong



Re: [R] difference metric info of same font on different device

2017-04-08 Thread Jinsong Zhao

On 2017/4/7 23:13, Jeff Newmiller wrote:

I think it is a fundamental characteristic of graphics drivers that output will 
look different in the details... you are on a wild goose chase. Postscript in 
particular has a huge advantage in font presentation over other graphics output 
mechanisms.



Well, the problem stems from the MetricInfo of CID-keyed fonts, which 
are intended only for use for the glyphs of East Asian languages, which 
are all monospaced and are all treated as filling the same bounding box. 
(from the help page of CIDFont)


However, is it possible to use the same MetricInfo of CID-keyed fonts as 
that for png() or windows()?


Best,
Jinsong



Re: [R] difference metric info of same font on different device

2017-04-07 Thread Jeff Newmiller
I think it is a fundamental characteristic of graphics drivers that output will 
look different in the details... you are on a wild goose chase. Postscript in 
particular has a huge advantage in font presentation over other graphics output 
mechanisms. 
-- 
Sent from my phone. Please excuse my brevity.

On April 7, 2017 1:05:45 AM PDT, Jinsong Zhao  wrote:
>Hi there,
>
>I try to plot with custom fonts, which have good shape Latin and CJK 
>characters. I set up all the fonts correctly. However, when I plot the 
>same code on png() and postscript(), I get different result. The main 
>problem is the space between characters is narrower in postscript()
>than 
>that in png(), and some character also overlap in postscript().  You
>can 
>see the differences from the attached png files.
>
>Is there any way to get the same plot using postscript() and png()? 
>Thanks in advance.
>
>Best,
>Jinsong
>
>The code I used is here:
>
>windowsFonts(song = windowsFont("SourceHanSerifSC-Regular"),
>  hei  = windowsFont("SourceHanSansSC-Regular"),
>  hwhei  = windowsFont("SourceHanSansHWSC-Regular"),
>  fzsong  = windowsFont("FZShuSong-Z01"),
>  fzhei = windowsFont("FZHei-B01"))
>
>postscriptFonts(song = CIDFont("SourceHanSerifSC-Regular", 
>"UniSourceHanSerifCN-UTF8-H", "UTF-8", ""),
> hei  = CIDFont("SourceHanSansSC-Regular", 
>"UniSourceHanSansCN-UTF8-H", "UTF-8", ""),
> hwhei  = CIDFont("SourceHanSansHWSC-Regular", 
>"UniSourceHanSansHWCN-UTF8-H", "UTF-8", ""),
> fzsong  = CIDFont("FZShuSong-Z01","GBK-EUC-H", 
>"GBK", ""),
> fzhei = CIDFont("FZHei-B01", "GBK-EUC-H", "GBK", ""))
>
>fa <- c("sans", "serif", "song", "hei", "hwhei", "fzsong", "fzhei")
>
>postscript("font.eps", fonts = fa, onefile = FALSE, width = 4, height =
>
>4, horizontal = FALSE)
>
>#png("font.png", width=4*300, height=4*300, res =300)
>
>plot(0,xlab="",ylab="",type="n")
>text(1, -0.75, expression(CO[2]-Hei), family = "hei")
>text(1, -0.5, expression(CO[2]-HWHei), family = "hwhei")
>text(1, -0.25, expression(CO[2]-FZHei), family = "fzhei")
>text(1, 0.0, expression(CO[2]-Sans), family = "sans")
>text(1, 0.25, expression(CO[2]-FZSong), family = "fzsong")
>text(1, 0.5, expression(CO[2]-Song), family = "song")
>text(1, 0.75, expression(CO[2]-Serif), family = "serif")
>
>dev.off()

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] difference metric info of same font on different device

2017-04-07 Thread Jinsong Zhao

Hi there,

I am trying to plot with custom fonts that have well-shaped Latin and CJK 
characters. I set up all the fonts correctly. However, when I run the 
same code with png() and postscript(), I get different results. The main 
problem is that the space between characters is narrower in postscript() than 
in png(), and some characters also overlap in postscript(). You can 
see the differences in the attached png files.


Is there any way to get the same plot using postscript() and png()? 
Thanks in advance.


Best,
Jinsong

The code I used is here:

windowsFonts(song = windowsFont("SourceHanSerifSC-Regular"),
 hei  = windowsFont("SourceHanSansSC-Regular"),
 hwhei  = windowsFont("SourceHanSansHWSC-Regular"),
 fzsong  = windowsFont("FZShuSong-Z01"),
 fzhei = windowsFont("FZHei-B01"))

postscriptFonts(song   = CIDFont("SourceHanSerifSC-Regular",
                                 "UniSourceHanSerifCN-UTF8-H", "UTF-8", ""),
                hei    = CIDFont("SourceHanSansSC-Regular",
                                 "UniSourceHanSansCN-UTF8-H", "UTF-8", ""),
                hwhei  = CIDFont("SourceHanSansHWSC-Regular",
                                 "UniSourceHanSansHWCN-UTF8-H", "UTF-8", ""),
                fzsong = CIDFont("FZShuSong-Z01", "GBK-EUC-H", "GBK", ""),
                fzhei  = CIDFont("FZHei-B01", "GBK-EUC-H", "GBK", ""))

fa <- c("sans", "serif", "song", "hei", "hwhei", "fzsong", "fzhei")

postscript("font.eps", fonts = fa, onefile = FALSE, width = 4, height = 4,
           horizontal = FALSE)


#png("font.png", width=4*300, height=4*300, res =300)

plot(0,xlab="",ylab="",type="n")
text(1, -0.75, expression(CO[2]-Hei), family = "hei")
text(1, -0.5, expression(CO[2]-HWHei), family = "hwhei")
text(1, -0.25, expression(CO[2]-FZHei), family = "fzhei")
text(1, 0.0, expression(CO[2]-Sans), family = "sans")
text(1, 0.25, expression(CO[2]-FZSong), family = "fzsong")
text(1, 0.5, expression(CO[2]-Song), family = "song")
text(1, 0.75, expression(CO[2]-Serif), family = "serif")

dev.off()
__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Difference between R for the Mac and for Windows

2017-04-01 Thread peter dalgaard
[This is drifting somewhat away from the original intention of the topic, I 
think.]

This looks like a build dependency. I get 

3.3.2 (yeah, I know, should upgrade):

> (1+2i)/0
[1] NaN+NaNi

R-devel, march 24:

> (1+2i)/0
[1] Inf+Infi

on the *same* machine. The difference is that one is stock CRAN, the other was 
built locally. So with the toolchain being updated for 3.4.0, this difference 
would likely go away. Or at least change...

-pd


> On 31 Mar 2017, at 19:15 , Berend Hasselman  wrote:
> 
> 
> I have noted a difference between R on macOS and on Kubuntu Trusty (64bits) 
> with complex division.
> I don't know what would happen with R on Windows.
> 
> R.3.3.3:
> 
> macOS (10.11.6)
> -
>> (1+2i)/0
> [1] NaN+NaNi
>> (-1+2i)/0
> [1] NaN+NaNi
>> 
>> 1i/0
> [1] NaN+NaNi
>> 1i/(0+0i)
> [1] NaN+NaNi
> 
> 
> KubuntuTrusty
> -
>> (1+2i)/0
> [1] Inf+Infi
>> (-1+2i)/0
> [1] -Inf+Infi
>> 
>> 1i/0
> [1] NaN+Infi
>> 1i/(0+0i)
> [1] NaN+Infi
> 
> Interesting to see what R on Windows delivers.
> 
> Berend Hasselman
> 

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd@cbs.dk  Priv: pda...@gmail.com

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Difference between R for the Mac and for Windows

2017-03-31 Thread Berend Hasselman

> On 31 Mar 2017, at 19:28, John McKown  wrote:
> 
> On Fri, Mar 31, 2017 at 12:15 PM, Berend Hasselman  wrote:
> 
> I have noted a difference between R on macOS and on Kubuntu Trusty (64bits) 
> with complex division.
> I don't know what would happen with R on Windows.
> 
> R.3.3.3:
> 
> macOS (10.11.6)
> -
> > (1+2i)/0
> [1] NaN+NaNi
> > (-1+2i)/0
> [1] NaN+NaNi
> >
> > 1i/0
> [1] NaN+NaNi
> > 1i/(0+0i)
> [1] NaN+NaNi
> 
> 
> KubuntuTrusty
> -
> > (1+2i)/0
> [1] Inf+Infi
> > (-1+2i)/0
> [1] -Inf+Infi
> >
> > 1i/0
> [1] NaN+Infi
> > 1i/(0+0i)
> [1] NaN+Infi
> 
> Interesting to see what R on Windows delivers.
> 
> > (1+2i)/0
> [1] Inf+Infi
> > (-1+2i)/0
> [1] -Inf+Infi
> > 1i/0
> [1] NaN+Infi
> > 1i/(0+0i)
> [1] NaN+Infi
> > Sys.info()
>                      sysname        release
>                    "Windows"        "7 x64"
>                      version       nodename
> "build 7601, Service Pack 1"   "IT-JMCKOWN"
>                      machine          login
>                     "x86-64"  "John.Mckown"
>                         user effective_user
>                "John.Mckown"  "John.Mckown"
> > 
> 
> Same as Kubuntu. I am _guessing_ that the MacOS somehow sets up the floating 
> point processing to work differently, since they are all on Intel machines 
> nowadays. Or the R was customized to detect division by zero in software and 
> not really do any floating point processing at all.
> 

I think it's the system math library that does this.

I have assumed that the Kubuntu Trusty (and Windows) give the correct result.
In my package geigen I have taken that into account and made a specialized 
complexdivision function that tries to detect a possibly wrong outcome (which 
appears to happen only on macOS).

Berend Hasselman
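
A rough base-R illustration of the kind of guard described above. This is not the geigen code: `checked_cdiv` is a hypothetical helper made up for this sketch. It relies only on the documented behaviour that is.finite() on a complex value is TRUE when both the real and imaginary parts are finite, so it flags the quotient the same way whether the platform reports NaN or Inf parts.

```r
# Hypothetical helper, not the geigen implementation: divide two complex
# numbers and report whether the quotient is finite, without depending on
# whether the platform returns NaN or Inf parts for division by zero.
checked_cdiv <- function(a, b) {
  z <- a / b
  list(value = z, finite = is.finite(z))
}

by_zero <- checked_cdiv(1+2i, 0)
regular <- checked_cdiv(1+2i, 1-1i)

by_zero$finite   # FALSE whether the value is NaN+NaNi or Inf+Infi
regular$value    # -0.5+1.5i
```

Code that only needs to know "did the division blow up?" can test the `finite` flag instead of comparing against a specific NaN/Inf pattern.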

> Berend Hasselman
> 
> 
> 
> -- 
> "Irrigation of the land with seawater desalinated by fusion power is ancient. 
> It's called 'rain'." -- Michael McClary, in alt.fusion
> 
> Maranatha! <><
> John McKown

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Difference between R for the Mac and for Windows

2017-03-31 Thread John McKown
On Fri, Mar 31, 2017 at 12:15 PM, Berend Hasselman  wrote:

>
> I have noted a difference between R on macOS and on Kubuntu Trusty (64bits)
> with complex division.
> I don't know what would happen with R on Windows.
>
> R.3.3.3:
>
> macOS (10.11.6)
> -
> > (1+2i)/0
> [1] NaN+NaNi
> > (-1+2i)/0
> [1] NaN+NaNi
> >
> > 1i/0
> [1] NaN+NaNi
> > 1i/(0+0i)
> [1] NaN+NaNi
>
>
> KubuntuTrusty
> -
> > (1+2i)/0
> [1] Inf+Infi
> > (-1+2i)/0
> [1] -Inf+Infi
> >
> > 1i/0
> [1] NaN+Infi
> > 1i/(0+0i)
> [1] NaN+Infi
>
> Interesting to see what R on Windows delivers.
>

> (1+2i)/0
[1] Inf+Infi
> (-1+2i)/0
[1] -Inf+Infi
> 1i/0
[1] NaN+Infi
> 1i/(0+0i)
[1] NaN+Infi
> Sys.info()
                     sysname        release
                   "Windows"        "7 x64"
                     version       nodename
"build 7601, Service Pack 1"   "IT-JMCKOWN"
                     machine          login
                    "x86-64"  "John.Mckown"
                        user effective_user
               "John.Mckown"  "John.Mckown"
>

Same as Kubuntu. I am _guessing_ that the MacOS somehow sets up the
floating point processing to work differently, since they are all on Intel
machines nowadays. Or the R was customized to detect division by zero in
software and not really do any floating point processing at all.

>
> Berend Hasselman
>
>

-- 
"Irrigation of the land with seawater desalinated by fusion power is
ancient. It's called 'rain'." -- Michael McClary, in alt.fusion

Maranatha! <><
John McKown

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Difference between R for the Mac and for Windows

2017-03-31 Thread Uwe Ligges



On 31.03.2017 19:15, Berend Hasselman wrote:
>
> I have noted a difference between R on macOS and on Kubuntu Trusty (64bits) with
> complex division.
> I don't know what would happen with R on Windows.
>
> R.3.3.3:
>
> macOS (10.11.6)
> -
> > (1+2i)/0
> [1] NaN+NaNi
> > (-1+2i)/0
> [1] NaN+NaNi
> >
> > 1i/0
> [1] NaN+NaNi
> > 1i/(0+0i)
> [1] NaN+NaNi
>
>
> KubuntuTrusty
> -
> > (1+2i)/0
> [1] Inf+Infi
> > (-1+2i)/0
> [1] -Inf+Infi
> >
> > 1i/0
> [1] NaN+Infi
> > 1i/(0+0i)
> [1] NaN+Infi
>
> Interesting to see what R on Windows delivers.

Same as KubuntuTrusty and what I would expect.

Best,
Uwe Ligges

> Berend Hasselman


__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Difference between R for the Mac and for Windows

2017-03-31 Thread Berend Hasselman

I have noted a difference between R on macOS and on Kubuntu Trusty (64bits) with 
complex division.
I don't know what would happen with R on Windows.

R.3.3.3:

macOS (10.11.6)
-
> (1+2i)/0
[1] NaN+NaNi
> (-1+2i)/0
[1] NaN+NaNi
> 
> 1i/0
[1] NaN+NaNi
> 1i/(0+0i)
[1] NaN+NaNi


KubuntuTrusty
-
> (1+2i)/0
[1] Inf+Infi
> (-1+2i)/0
[1] -Inf+Infi
> 
> 1i/0
[1] NaN+Infi
> 1i/(0+0i)
[1] NaN+Infi

Interesting to see what R on Windows delivers.

Berend Hasselman

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Difference between R for the Mac and for Windows

2017-03-31 Thread Ista Zahn
The only place I've noticed differences is in encoding and string sorting,
both of which are locale and library dependent.
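
A small sketch of the sorting point, using base R only. The plain sort() call follows the collation rules of whatever locale the session runs in, so its result can differ between machines; method = "radix" is documented to sort character vectors as in the C locale, which makes the order reproducible. The vector `x` is just an illustrative example.

```r
x <- c("b", "A", "a", "B")

# Locale-dependent: the result follows the collation rules of the
# current locale, so it may differ between platforms.
sort(x)

# method = "radix" sorts character vectors as in the C locale,
# so the result is the same everywhere.
portable <- sort(x, method = "radix")
portable   # "A" "B" "a" "b"
```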

Best,
Ista

On Mar 31, 2017 8:14 AM, "Neil Salkind"  wrote:

> Can someone please direct me to an answer to the question as to how R
> differs for these two operating systems, if at all? Thanks - Neil

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Difference between R for the Mac and for Windows

2017-03-31 Thread peter dalgaard
File encodings differ when you move outside of standard ASCII code. Not really 
R's problem, but it is a fly in the ointment when teaching classes with mixed 
laptop armoury and there are also differences between classroom and desktop 
computers. RStudio does have features to switch encodings, but I usually 
sidestep the issue by commenting scripts in English.
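
To make the encoding point concrete, here is a minimal base-R sketch (the variable names are only illustrative). It shows that the same non-ASCII character is stored as different bytes in UTF-8 and in Latin-1, which is why a script saved on one machine can look garbled when opened with a different default encoding.

```r
# "\u00e9" is e-acute, written as an escape so this script itself stays ASCII.
s <- enc2utf8("\u00e9")

utf8_bytes   <- charToRaw(s)
latin1_bytes <- iconv(s, from = "UTF-8", to = "latin1", toRaw = TRUE)[[1]]

utf8_bytes     # c3 a9  (two bytes in UTF-8)
latin1_bytes   # e9     (one byte in Latin-1)
```

Plain ASCII characters have the same single-byte representation in both encodings, which is why English-only comments avoid the problem.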

-pd 

> On 31 Mar 2017, at 05:40 , Boris Steipe  wrote:
> 
> I can't remember having seen my students write code that runs correctly on 
> one platform but not the other. Obviously under the hood there are 
> significant differences, but as far as code goes, R seems quite foolproof. 
> There are GUI differences in base R - but AFAIK no such differences in the 
> RStudio IDE.
> 
> B. 
> 
> 
> 
> 
>> On Mar 30, 2017, at 9:21 PM, Neil Salkind  wrote:
>> 
>> Can someone please direct me to an answer to the question as to how R 
>> differs for these two operating systems, if at all? Thanks - Neil 

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd@cbs.dk  Priv: pda...@gmail.com

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Difference between R for the Mac and for Windows

2017-03-30 Thread David Winsemius

> On Mar 30, 2017, at 8:40 PM, Boris Steipe  wrote:
> 
> I can't remember having seen my students write code that runs correctly on 
> one platform but not the other. Obviously under the hood there are 
> significant differences, but as far as code goes, R seems quite foolproof. 
> There are GUI differences in base R - but AFAIK no such differences in the 
> RStudio IDE.
> 
> B. 
> 
> 

The Mac version of R is more like the Linux version when run from the UNIX 
command line. RStudio and the R.app GUI are both nice IDEs. A few packages 
are not available on every platform because they need to link to programs that 
are only available on a particular OS. You can see which ones by visiting the 
CRAN package check pages.

-- 
David


> 
> 
>> On Mar 30, 2017, at 9:21 PM, Neil Salkind  wrote:
>> 
>> Can someone please direct me to an answer to the question as to how R 
>> differs for these two operating systems, if at all? Thanks - Neil 

David Winsemius
Alameda, CA, USA

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Difference between R for the Mac and for Windows

2017-03-30 Thread Boris Steipe
I can't remember having seen my students write code that runs correctly on one 
platform but not the other. Obviously under the hood there are significant 
differences, but as far as code goes, R seems quite foolproof. There are GUI 
differences in base R - but AFAIK no such differences in the RStudio IDE.

B. 




> On Mar 30, 2017, at 9:21 PM, Neil Salkind  wrote:
> 
> Can someone please direct me to an answer to the question as to how R differs 
> for these two operating systems, if at all? Thanks - Neil 

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Difference between R for the Mac and for Windows

2017-03-30 Thread Neil Salkind
Can someone please direct me to an answer to the question as to how R differs 
for these two operating systems, if at all? Thanks - Neil 
__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Difference in Generalized Extreme Value distribution parameter estimations using lmom and fExtremes

2017-01-03 Thread Amelia Marsh via R-help
Dear R forum

I have following dataset

amounts = c(2803102.248, 1088675.278, 10394575.14, 1007368.396, 1004871.328,
            1092956.088, 1020110.818, 997371.4487, 1000904.154, 998105.9744,
            997434.3006, 1080067.258, 997594.7992, 1000871.015, 1001321.094,
            1000713.448, 997591.2307, 1469501.54, 1066924.393, 1074918.566,
            998628.6216, 1002538.482, 1056969.243, 997386.2638, 1.36951E+11,
            997996.9907, 1001257.498, 998297.1517, 5253186.541, 1005503.303,
            997785.7993, 997327.4303, 1037039.271, 997353.5027, 998297.0299,
            1072558.563, 2713147.593, 997679.0361, 1015856.216, 1424576097,
            999165.4936, 998038.8554, 3221340.057, 1009576.799, 5.84277E+12,
            18595873.96, 1054794.099, 1005800.558, 997533.8031, 997347.4897,
            2208865120, 4224689.441, 997660.4156, 997325.1814, 46809107.76,
            1200682.819, 998921.9662, 997540.1311, 997594.3338, 1109023.716,
            1007961.274, 1939821.599, 998260.2296, 175808356.8, 1005375.437,
            997412.0361, 997383.9452, 998863.5354, 1554312.55, 997791.3639,
            997355.1921, 997476.2689, 14557283.34, 997937.3784, 1013997.695,
            1006244.593, 999265.8925, 1052001.211, 1005484.306, 1258924.294,
            998740.9426, 997896.5631, 3613729.605, 1000823.697, 1656621.398,
            997874.4055, 1056353.896, 1000380.152, 997576.3836, 997442.5109,
            998563.4918, 1032782.759, 1010023.106, 998578.6725, 997344.4766,
            997310.5771, 1002905.434, 86902124.97, 998396.3911, 1245564.907)


Using this dataset, I am trying to estimate the parameter values of Extreme 
Value Distribution. I am using the libraries lmom and fExtremes as follows:


library(lmom) 
library(fExtremes)

# 


# Parameter estimation : Using lmom 


lmom <- samlmu(amounts) 
(parameters_of_GEV_lmom <- pelgev(lmom)) 


# OUTPUT: 

> parameters_of_GEV_lmom <- pelgev(lmom); parameters_of_GEV_lmom 


xi  # Location Parameter
8.883402e+06

alpha # Scale Parameter
5.692228e+07 

k   # Shape Parameter
-9.990491e-01 



# 



# Parameter estimation : Using fExtremes

(parameters_of_GEV_fExtremes <- gevFit(amounts, type = "pwm")) 

# OUTPUT: 

Title: 
GEV Parameter Estimation 

Call: 
gevFit(x = amounts, type = "pwm") 

Estimation Type: 
gev pwm 

Estimated Parameters: 


xi# Shape Parameter
9.990479e-01


mu   # Location Parameter
8.855115e+06


beta  # Scale Parameter
5.699583e+07 



# __

While it is expected that the parameter values will differ slightly, since lmom 
uses L-moments and fExtremes uses probability-weighted moments, my concern is 
about the shape parameter. Its magnitude is the same for both methods; only the 
sign differs.

While lmom estimates shape parameter = -0.99905, fExtremes estimates shape 
parameter = 0.99905. When I used other statistical software to estimate the 
parameters, I got values exactly matching what lmom generates, except that the 
shape parameter was equal to 0.99905 (positive, the same as the fExtremes 
value) and not -0.99905 as generated by the lmom library.

Can someone guide me?


With regards

Amelia
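
One possible reading of the sign flip, offered as an assumption rather than something confirmed in this thread: the two packages may simply write the GEV shape parameter with opposite signs, lmom following Hosking's convention (shape k) and fExtremes the more common one (shape xi = -k). The base-R sketch below does not call either package, and the two function names are made up for illustration. It only checks that the two textbook forms of the GEV distribution function give the same probabilities when location and scale are equal and the shapes have opposite signs.

```r
# GEV distribution function, Hosking-style parameterization
# (location xi, scale alpha, shape k). Assumed here to be the lmom convention.
gev_cdf_hosking <- function(x, xi, alpha, k) {
  y <- -log(1 - k * (x - xi) / alpha) / k
  exp(-exp(-y))
}

# GEV distribution function, common parameterization
# (location mu, scale beta, shape xi). Assumed here to be the fExtremes convention.
gev_cdf_standard <- function(x, mu, beta, xi) {
  exp(-(1 + xi * (x - mu) / beta)^(-1 / xi))
}

x <- c(1e7, 5e7, 1e9)

# Same location and scale (the lmom estimates from the post), opposite shape signs.
p_hosking  <- gev_cdf_hosking(x, xi = 8.883402e+06, alpha = 5.692228e+07,
                              k = -9.990491e-01)
p_standard <- gev_cdf_standard(x, mu = 8.883402e+06, beta = 5.692228e+07,
                               xi = 9.990491e-01)

all.equal(p_hosking, p_standard)
```

If that reading is right, the two fits describe essentially the same distribution and only the reporting convention differs; the help pages of both packages are the place to confirm which convention each uses.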

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] difference

2016-10-29 Thread P Tennant

Hi,

As Jeff said, more than one grouping variable can be supplied, and there 
is an example at the bottom of the help page for ave(). The same goes 
for by(), but the order that you supply the grouping variables becomes 
important. Whichever grouping variable is supplied first to by() will 
change its levels first in the output sequence. You can see from your 
dataset:


d2 <- data.frame(city=rep(1:2, each=6),
                 year=c(rep(2001, 3), rep(2002, 3), rep(2001, 3), rep(2002, 3)),
                 num=c(25,75,150,35,65,120,25,95,150,35,110,120))

d2
#    city year num
# 1     1 2001  25
# 2     1 2001  75
# 3     1 2001 150
# 4     1 2002  35
# 5     1 2002  65
# 6     1 2002 120
# 7     2 2001  25
# 8     2 2001  95
# 9     2 2001 150
# 10    2 2002  35
# 11    2 2002 110
# 12    2 2002 120

that `year' changes its levels through the sequence down the table 
first, and then `city' changes. You want your new column to align with 
this sequence. If you put city first in the list of grouping variables 
for by(), rather than `year', you won't get the sequence reflected in 
your dataset:


by(d2$num, d2[c('city', 'year')], function(x) x - x[1])

# city: 1
# year: 2001
# [1]   0  50 125
# -
# city: 2
# year: 2001
# [1]   0  70 125
# -
# city: 1
# year: 2002
# [1]  0 30 85
# -
# city: 2
# year: 2002
# [1]  0 75 85
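
The ordering point can be checked directly. The sketch below rebuilds the same dataset and compares the flattened by() result for both orders of the grouping variables against the expected column; `expected`, `city_first` and `year_first` are names introduced here only for illustration.

```r
d2 <- data.frame(city = rep(1:2, each = 6),
                 year = rep(rep(c(2001, 2002), each = 3), times = 2),
                 num  = c(25, 75, 150, 35, 65, 120, 25, 95, 150, 35, 110, 120))

# Differences from the first value within each city/year block, in row order.
expected <- c(0, 50, 125, 0, 30, 85, 0, 70, 125, 0, 75, 85)

# city first: city changes fastest in the output, which does not match the rows.
city_first <- as.numeric(unlist(by(d2$num, d2[c("city", "year")],
                                   function(x) x - x[1])))

# year first: year changes fastest, matching the row order of d2.
year_first <- as.numeric(unlist(by(d2$num, d2[c("year", "city")],
                                   function(x) x - x[1])))

identical(year_first, expected)
identical(city_first, expected)
```

Only the year-first version lines up with the rows, and even that depends on the data being sorted the way d2 is.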

In contrast to using by() as I've suggested, using match() to create 
indices that flag when a new `city/year' category is encountered seems a 
more explicit, secure way to do the calculation. Adapting an earlier 
solution provided in this thread:


year.city <- with(d2, interaction(year, city))
indexOfFirstYearCity <- match(year.city, year.city)
indexOfFirstYearCity
# [1]  1  1  1  4  4  4  7  7  7 10 10 10

d2$diff <- d2$num - d2$num[indexOfFirstYearCity]
d2

   city year num diff
1     1 2001  25    0
2     1 2001  75   50
3     1 2001 150  125
4     1 2002  35    0
5     1 2002  65   30
6     1 2002 120   85
7     2 2001  25    0
8     2 2001  95   70
9     2 2001 150  125
10    2 2002  35    0
11    2 2002 110   75
12    2 2002 120   85


Philip

On 29/10/2016 3:15 PM, Jeff Newmiller wrote:

Now would be an excellent time to read the help page for ?ave. You can specify 
multiple grouping variables.


__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] difference

2016-10-28 Thread Jeff Newmiller
Now would be an excellent time to read the help page for ?ave. You can specify 
multiple grouping variables. 
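
For reference, a minimal sketch of what that looks like with the city/year data from the question. ave() takes any number of grouping vectors before the FUN argument (which must be named) and returns a vector aligned with the original rows, so the order in which the grouping variables are listed does not matter. The data frame name `d2` is just a label used here.

```r
d2 <- data.frame(city = rep(1:2, each = 6),
                 year = rep(rep(c(2001, 2002), each = 3), times = 2),
                 num  = c(25, 75, 150, 35, 65, 120, 25, 95, 150, 35, 110, 120))

# Difference from the first value within each city/year combination.
d2$diff <- ave(d2$num, d2$city, d2$year, FUN = function(v) v - v[1])

# Listing the grouping variables in the other order gives the same column.
diff_swapped <- ave(d2$num, d2$year, d2$city, FUN = function(v) v - v[1])

d2$diff
# 0 50 125 0 30 85 0 70 125 0 75 85
```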
-- 
Sent from my phone. Please excuse my brevity.

On October 28, 2016 7:28:44 PM PDT, Ashta  wrote:
>Hi all thank you very much for your help. Worked very well for that
>data set. I just found out that one of the data sets have another
>level and do the same thing, I want to calculate the difference
>between successive row values (num)  to the first row value within
>city and year.
>
>city, year, num
>1, 2001,25
>1, 2001,75
>1, 2001,150
>1, 2002,35
>1, 2002,65
>1, 2002,120
>2, 2001,25
>2, 2001,95
>2, 2001,150
>2, 2002,35
>2, 2002,110
>2, 2002,120
>
>The result will be
>
>city,year,num,Diff
>1, 2001,25, 0
>1, 2001,75, 50
>1, 2001,150, 125
>1, 2002,35, 0
>1, 2002,65, 30
>1, 2002,120, 85
>2, 2001,25, 0
>2, 2001,95, 70
>2, 2001,150, 125
>2, 2002,35, 0
>2, 2002,110, 75
>2, 2002,120, 85
>
>Thank you again
>
>
>On Fri, Oct 28, 2016 at 4:08 AM, P Tennant 
>wrote:
>> Hi,
>>
>> You could use an anonymous function to operate on each `year-block'
>of your
>> dataset, then assign the result as a new column:
>>
>> d <- data.frame(year=c(rep(2001, 3), rep(2002, 3)),
>> num=c(25,75,150,30,85,95))
>>
>> d$diff <- unlist(by(d$num, d$year, function(x) x - x[1]))
>> d
>>
>>   year num diff
>> 1 2001  25    0
>> 2 2001  75   50
>> 3 2001 150  125
>> 4 2002  30    0
>> 5 2002  85   55
>> 6 2002  95   65
>>
>>
>> Philip
>>
>>
>> On 28/10/2016 3:20 PM, Ashta wrote:
>>>
>>> Hi all,
>>>
>>> I want to calculate the difference  between successive row values to
>>> the first row value within year.
>>> How do I get that?
>>>
>>>   Here is the sample of data
>>> Year   Num
>>> 2001    25
>>> 2001    75
>>> 2001   150
>>> 2002    30
>>> 2002    85
>>> 2002    95
>>>
>>> Desired output
>>> Year   Num  diff
>>> 2001    25     0
>>> 2001    75    50
>>> 2001   150   125
>>> 2002    30     0
>>> 2002    85    55
>>> 2002    95    65
>>>
>>> Thank you.
>>>
>>
>>

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] difference

2016-10-28 Thread Ashta
Hi all thank you very much for your help. Worked very well for that
data set. I just found out that one of the data sets have another
level and do the same thing, I want to calculate the difference
between successive row values (num)  to the first row value within
city and year.

city, year, num
1, 2001,25
1, 2001,75
1, 2001,150
1, 2002,35
1, 2002,65
1, 2002,120
2, 2001,25
2, 2001,95
2, 2001,150
2, 2002,35
2, 2002,110
2, 2002,120

The result will be

city,year,num,Diff
1, 2001,25, 0
1, 2001,75, 50
1, 2001,150, 125
1, 2002,35, 0
1, 2002,65, 30
1, 2002,120, 85
2, 2001,25, 0
2, 2001,95, 70
2, 2001,150, 125
2, 2002,35, 0
2, 2002,110, 75
2, 2002,120, 85

Thank you again


On Fri, Oct 28, 2016 at 4:08 AM, P Tennant  wrote:
> Hi,
>
> You could use an anonymous function to operate on each `year-block' of your
> dataset, then assign the result as a new column:
>
> d <- data.frame(year=c(rep(2001, 3), rep(2002, 3)),
> num=c(25,75,150,30,85,95))
>
> d$diff <- unlist(by(d$num, d$year, function(x) x - x[1]))
> d
>
>   year num diff
> 1 2001  25    0
> 2 2001  75   50
> 3 2001 150  125
> 4 2002  30    0
> 5 2002  85   55
> 6 2002  95   65
>
>
> Philip
>
>
> On 28/10/2016 3:20 PM, Ashta wrote:
>>
>> Hi all,
>>
>> I want to calculate the difference  between successive row values to
>> the first row value within year.
>> How do I get that?
>>
>>   Here is the sample of data
>> Year   Num
>> 2001    25
>> 2001    75
>> 2001   150
>> 2002    30
>> 2002    85
>> 2002    95
>>
>> Desired output
>> Year   Num  diff
>> 2001    25     0
>> 2001    75    50
>> 2001   150   125
>> 2002    30     0
>> 2002    85    55
>> 2002    95    65
>>
>> Thank you.
>>
>
>

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] difference

2016-10-28 Thread Jeff Newmiller
Haven't seen a solution using ave...

d$diff <- ave( d$num, d$year, FUN = function( v ) { v - v[ 1 ] } )
-- 
Sent from my phone. Please excuse my brevity.

On October 28, 2016 12:46:15 PM CDT, William Dunlap via R-help 
 wrote:
>You could use match() to find, for each row, the index of the first
>row with the give row's year:
>
>> d <- data.frame(year=c(rep(2001, 3), rep(2002, 3)),
>  num=c(25,75,150,30,85,95))
>> indexOfFirstOfYear <- with(d, match(year, year))
>> indexOfFirstOfYear
>[1] 1 1 1 4 4 4
>> d$diff <- d$num - d$num[indexOfFirstOfYear]
>> d
>  year num diff
>1 2001  25    0
>2 2001  75   50
>3 2001 150  125
>4 2002  30    0
>5 2002  85   55
>6 2002  95   65
>
>
>Bill Dunlap
>TIBCO Software
>wdunlap tibco.com
>
>On Thu, Oct 27, 2016 at 9:20 PM, Ashta  wrote:
>
>> Hi all,
>>
>> I want to calculate the difference  between successive row values to
>> the first row value within year.
>> How do I get that?
>>
>>  Here is the sample of data
>> Year   Num
>> 2001    25
>> 2001    75
>> 2001   150
>> 2002    30
>> 2002    85
>> 2002    95
>>
>> Desired output
>> Year   Num  diff
>> 2001    25     0
>> 2001    75    50
>> 2001   150   125
>> 2002    30     0
>> 2002    85    55
>> 2002    95    65
>>
>> Thank you.
>>
>>
>
>   [[alternative HTML version deleted]]
>

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] difference

2016-10-28 Thread William Dunlap via R-help
You could use match() to find, for each row, the index of the first
row with the given row's year:

> d <- data.frame(year=c(rep(2001, 3), rep(2002, 3)),
  num=c(25,75,150,30,85,95))
> indexOfFirstOfYear <- with(d, match(year, year))
> indexOfFirstOfYear
[1] 1 1 1 4 4 4
> d$diff <- d$num - d$num[indexOfFirstOfYear]
> d
  year num diff
1 2001  25    0
2 2001  75   50
3 2001 150  125
4 2002  30    0
5 2002  85   55
6 2002  95   65


Bill Dunlap
TIBCO Software
wdunlap tibco.com

On Thu, Oct 27, 2016 at 9:20 PM, Ashta  wrote:

> Hi all,
>
> I want to calculate the difference  between successive row values to
> the first row value within year.
> How do I get that?
>
>  Here is the sample of data
> Year   Num
> 2001    25
> 2001    75
> 2001   150
> 2002    30
> 2002    85
> 2002    95
>
> Desired output
> Year   Num  diff
> 2001    25     0
> 2001    75    50
> 2001   150   125
> 2002    30     0
> 2002    85    55
> 2002    95    65
>
> Thank you.
>
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] difference

2016-10-28 Thread jim holtman
I read the problem incorrectly; I did not see that you wanted the
difference from the first entry; trying again:

> require(dplyr)
> input <- read.table(text = "Year   Num
+ 2001    25
+ 2001    75
+ 2001   150
+ 2002    30
+ 2002    85
+ 2002    95", header = TRUE)
>
> input %>%
+ group_by(Year) %>%
+ mutate(diff = Num - Num[1L])
Source: local data frame [6 x 3]
Groups: Year [2]

   Year   Num  diff

1  2001    25     0
2  2001    75    50
3  2001   150   125
4  2002    30     0
5  2002    85    55
6  2002    95    65
>
> # use data.table
> require(data.table)
> setDT(input)  # convert to data.table
> input[, diff := Num - Num[1L], by = Year][]  # print output
   Year Num diff
1: 2001  25    0
2: 2001  75   50
3: 2001 150  125
4: 2002  30    0
5: 2002  85   55
6: 2002  95   65

Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.


On Fri, Oct 28, 2016 at 12:20 AM, Ashta  wrote:
> Hi all,
>
> I want to calculate the difference  between successive row values to
> the first row value within year.
> How do I get that?
>
> Here is the sample of data
> Year   Num
> 2001    25
> 2001    75
> 2001   150
> 2002    30
> 2002    85
> 2002    95
>
> Desired output
> Year   Num  diff
> 2001    25     0
> 2001    75    50
> 2001   150   125
> 2002    30     0
> 2002    85    55
> 2002    95    65
>
> Thank you.


Re: [R] difference

2016-10-28 Thread jim holtman
Here are a couple of other ways using 'dplyr' and 'data.table'

> require(dplyr)
> input <- read.table(text = "Year   Num
+ 2001    25
+ 2001    75
+ 2001   150
+ 2002    30
+ 2002    85
+ 2002    95", header = TRUE)
>
> input %>%
+ group_by(Year) %>%
+ mutate(diff = c(0, diff(Num)))
Source: local data frame [6 x 3]
Groups: Year [2]

   Year   Num  diff

1  2001    25     0
2  2001    75    50
3  2001   150    75
4  2002    30     0
5  2002    85    55
6  2002    95    10
>
> # use data.table
> require(data.table)
Loading required package: data.table
data.table 1.9.6  For help type ?data.table or
https://github.com/Rdatatable/data.table/wiki
The fastest way to learn (by data.table authors):
https://www.datacamp.com/courses/data-analysis-the-data-table-way
---
data.table + dplyr code now lives in dtplyr.
Please library(dtplyr)!
---

Attaching package: ‘data.table’

The following objects are masked from ‘package:dplyr’:

between, last

> setDT(input)  # convert to data.table
> input[, diff := c(0, diff(Num)), by = Year][]  # print output
   Year Num diff
1: 2001  25    0
2: 2001  75   50
3: 2001 150   75
4: 2002  30    0
5: 2002  85   55
6: 2002  95   10
>

Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.


On Fri, Oct 28, 2016 at 12:20 AM, Ashta  wrote:
> Hi all,
>
> I want to calculate the difference  between successive row values to
> the first row value within year.
> How do I get that?
>
> Here is the sample of data
> Year   Num
> 2001    25
> 2001    75
> 2001   150
> 2002    30
> 2002    85
> 2002    95
>
> Desired output
> Year   Num  diff
> 2001    25     0
> 2001    75    50
> 2001   150   125
> 2002    30     0
> 2002    85    55
> 2002    95    65
>
> Thank you.

Re: [R] difference

2016-10-28 Thread P Tennant

Hi,

You could use an anonymous function to operate on each `year-block' of 
your dataset, then assign the result as a new column:


d <- data.frame(year=c(rep(2001, 3), rep(2002, 3)),
num=c(25,75,150,30,85,95))

d$diff <- unlist(by(d$num, d$year, function(x) x - x[1]))
d

  year num diff
1 2001  25    0
2 2001  75   50
3 2001 150  125
4 2002  30    0
5 2002  85   55
6 2002  95   65
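
One caveat: by() returns its results in the order of the grouping
levels, so unlist(by(...)) lines up with the rows only when the data are
already sorted by year, as they are here. A minimal sketch of an
order-preserving alternative using base R's ave(), on a made-up,
unsorted version of the data (the shuffled row order is an assumption
for illustration):

```r
# Same values as in the example above, but with the rows shuffled
d <- data.frame(year = c(2002, 2001, 2002, 2001, 2001, 2002),
                num  = c(30, 25, 85, 75, 150, 95))

# ave() applies FUN within each year and puts the results back
# in the original row positions
d$diff <- ave(d$num, d$year, FUN = function(x) x - x[1])

print(d)
```

Here "first" means the first row encountered for each year in the
current row order, so sort the data first if a particular baseline row
is intended.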


Philip

On 28/10/2016 3:20 PM, Ashta wrote:

Hi all,

I want to calculate the difference  between successive row values to
the first row value within year.
How do I get that?

Here is the sample of data
Year   Num
2001    25
2001    75
2001   150
2002    30
2002    85
2002    95

Desired output
Year   Num  diff
2001    25     0
2001    75    50
2001   150   125
2002    30     0
2002    85    55
2002    95    65

Thank you.



[R] difference

2016-10-27 Thread Ashta
Hi all,

I want to calculate the difference  between successive row values to
the first row value within year.
How do I get that?

Here is the sample of data
Year   Num
2001    25
2001    75
2001   150
2002    30
2002    85
2002    95

Desired output
Year   Num  diff
2001    25     0
2001    75    50
2001   150   125
2002    30     0
2002    85    55
2002    95    65

Thank you.



Re: [R] Difference subsetting (dataset$variable vs. dataset["variable"]

2016-05-31 Thread Jeff Newmiller
You were clearly mistaken. 

dataframe$column is almost the same as dataframe[["column"]], except that the $ 
does partial matching. Both of these "extract" a list element. 

A data frame is a list where all elements are vectors of the same length.  A 
list is a vector where each element can refer to any of a variety of types of 
objects. The names of the objects in the list are associated with the list 
vector, not the referred objects (e.g. columns).  The [] operator "slices" the 
list but keeps the names and referring semantics. The [[]] extraction operator 
(and its pal $) refer to a single element out of the list, losing access to the 
containing list and the names that go with it. 

The Introduction to R document has all this in it... it just usually glazes 
your eyes the first few times you read it.  You might find the R Inferno more 
entertaining. 
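
A minimal sketch of these distinctions, using a small made-up data frame
(df, lst and keep are illustrative names; the column name Branche is
borrowed from the question below):

```r
df <- data.frame(Branche = c("231", "315", "151"), stringsAsFactors = FALSE)

# [ ] slices the list: the result is still a (one-column) data frame
print(class(df["Branche"]))

# [[ ]] and $ extract the element itself: a character vector
print(class(df[["Branche"]]))
print(identical(df$Branche, df[["Branche"]]))

# Partial matching, shown on a plain list: $ matches "Bran" to "Branche",
# while [[ ]] requires an exact name by default and returns NULL
lst <- list(Branche = 1:3)
print(lst$Bran)
print(lst[["Bran"]])

# Filtering rows needs the extracted vector, one logical value per row
keep <- df$Branche %in% c("315", "316", "317")
print(df[keep, , drop = FALSE])
```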

-- 
Sent from my phone. Please excuse my brevity.

On May 30, 2016 11:45:52 PM PDT, g.maub...@weinwolf.de wrote:
>Hi All,
>
>I thought dataset$variable is the same as dataset["variable"]. I tried
>the following:
>
>> str(ZWW_Kunden$Branche)
> chr [1:49673] "231" "151" "151" "231" "231" "111" "231" "111" "231" "231"
> "151" "111" ...
>> str(ZWW_Kunden["Branche"])
>'data.frame':	49673 obs. of  1 variable:
> $ Branche: chr  "231" "151" "151" "231" ...
>
>and get different results: "chr [1:49673]" vs. "data.frame". The first one
>is a simple vector, the second one is a data.frame.
>
>This has consequences when subsetting a dataset and filtering cases:
>
>> ZWW_Kunden["Branche"] %in% c("315", "316", "317")
>[1] FALSE
>
>> head(ZWW_Kunden$Branche %in% c("315", "316", "317")) # head() only to shorten output
>[1] FALSE FALSE FALSE FALSE FALSE FALSE
>
>I thought dataset$variable was the same as dataset["variable"], but
>actually it's not.
>
>Can you explain what the difference is?
>
>Kind regards
>
>Georg
>


[R] Difference subsetting (dataset$variable vs. dataset["variable"]

2016-05-31 Thread G . Maubach
Hi All,

I thought dataset$variable is the same as dataset["variable"]. I tried the
following:

> str(ZWW_Kunden$Branche)
 chr [1:49673] "231" "151" "151" "231" "231" "111" "231" "111" "231" "231"
 "151" "111" ...
> str(ZWW_Kunden["Branche"])
'data.frame':	49673 obs. of  1 variable:
 $ Branche: chr  "231" "151" "151" "231" ...

and get different results: "chr [1:49673]" vs. "data.frame". The first one is
a simple vector, the second one is a data.frame.

This has consequences when subsetting a dataset and filtering cases:

> ZWW_Kunden["Branche"] %in% c("315", "316", "317")
[1] FALSE

> head(ZWW_Kunden$Branche %in% c("315", "316", "317")) # head() only to shorten output
[1] FALSE FALSE FALSE FALSE FALSE FALSE

I thought dataset$variable was the same as dataset["variable"], but
actually it's not.

Can you explain what the difference is?

Kind regards

Georg



Re: [R] difference between require and library

2016-04-27 Thread Thierry Onkelinx
Dear Jean,

Have a look at
http://stackoverflow.com/questions/5595512/what-is-the-difference-between-require-and-library
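
The difference named in the question can be seen directly: library()
stops with an error when a package is missing, whereas require() only
warns and returns FALSE, so its result can be tested. A minimal sketch,
assuming the hypothetical package name below is not installed:

```r
pkg <- "aPackageThatIsNotInstalled"  # hypothetical name, assumed absent

# require() returns FALSE (with a warning) when the package is missing
ok <- suppressWarnings(require(pkg, character.only = TRUE, quietly = TRUE))
print(ok)

# library() signals an error instead, which is caught here
res <- tryCatch({
  library(pkg, character.only = TRUE)
  "loaded"
}, error = function(e) "error")
print(res)
```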

Best regards,


ir. Thierry Onkelinx
Instituut voor natuur- en bosonderzoek / Research Institute for Nature and
Forest
team Biometrie & Kwaliteitszorg / team Biometrics & Quality Assurance
Kliniekstraat 25
1070 Anderlecht
Belgium

To call in the statistician after the experiment is done may be no more
than asking him to perform a post-mortem examination: he may be able to say
what the experiment died of. ~ Sir Ronald Aylmer Fisher
The plural of anecdote is not data. ~ Roger Brinner
The combination of some data and an aching desire for an answer does not
ensure that a reasonable answer can be extracted from a given body of data.
~ John Tukey

2016-04-27 14:54 GMT+02:00 MAURICE Jean - externe <
jean-externe.maur...@edf.fr>:

> Hi,
>
> Is there any other difference between 'require' and 'library' than the
> error or warning when the library is not found ?
>
> Jean in France
>
>
>
>
>
> This message and any attachments (the 'Message') are intended solely for
> the addressees. The information contained in this Message is confidential.
> Any use of information contained in this Message not in accord with its
> purpose, any dissemination or disclosure, either whole or partial, is
> prohibited except formal approval.
>
> If you are not the addressee, you may not copy, forward, disclose or use
> any part of it. If you have received this message in error, please delete
> it and all copies from your system and notify the sender immediately by
> return message.
>
> E-mail communication cannot be guaranteed to be timely secure, error or
> virus-free.
>

[R] difference between require and library

2016-04-27 Thread MAURICE Jean - externe
Hi,

Is there any other difference between 'require' and 'library' than the error or 
warning when the library is not found ?

Jean in France





Re: [R] difference between successive values

2016-03-05 Thread Jim Lemon
Hi catalin,
I think what you are trying to do is to retrieve the original
observations from the cumulated values. In that case Olivier's
suggestion will do what you want:

c(x[1],diff(x))
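
A minimal sketch of that round trip, with made-up observations (obs and
recovered are illustrative names):

```r
obs <- c(3, 1, 4, 1, 5)        # made-up original observations
x   <- cumsum(obs)             # cumulated values, as in the question

# keep the first cumulated value, then take successive differences
recovered <- c(x[1], diff(x))

print(x)
print(recovered)
```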

Jim


On Sat, Mar 5, 2016 at 1:59 AM, catalin roibu  wrote:
> I mean the first row value
>
> On Fri, 4 Mar 2016, 16:15 Jeff Newmiller wrote:
>
>> "Keep the first values" is imprecise, but mixing an absolute value with a
>> bunch of differences doesn't usually work out well.  I frequently choose
>> among
>>
>> x <- sample( 10 )
>> dxright <- c( 0, diff(x) )
>> dxleft <- c( diff(x), 0 )
>>
>> for calculation purposes depending on my needs.
>> --
>> Sent from my phone. Please excuse my brevity.
>>
>> On March 4, 2016 3:28:08 AM PST, Olivier Crouzet <
>> olivier.crou...@univ-nantes.fr> wrote:
>> >Hi,
>> >
>> >(1) You should provide a minimal working example;
>> >
>> >(2) But anyway, does...
>> >
>> >x = sample(10)
>> >c(x[1],diff(x))
>> >
>> >... do what you want?
>> >
>> >Olivier.
>> >
>> >
>> >On Fri, 4 Mar 2016
>> >13:22:07 +0200 catalin roibu  wrote:
>> >
>> >> Dear all!
>> >>
>> >> I want to calculate difference between successive values (cumulative
>> >> values) with R. I used diff function, but I want to keep the first
>> >> values.
>> >>
>> >> Please help me to solve this problem!
>> >>
>> >> Thank you!
>> >>
>> >> Best regards!
>> >>
>> >> CR
>> >>

Re: [R] difference between successive values

2016-03-04 Thread catalin roibu
I mean the first row value

On Fri, 4 Mar 2016, 16:15 Jeff Newmiller wrote:

> "Keep the first values" is imprecise, but mixing an absolute value with a
> bunch of differences doesn't usually work out well.  I frequently choose
> among
>
> x <- sample( 10 )
> dxright <- c( 0, diff(x) )
> dxleft <- c( diff(x), 0 )
>
> for calculation purposes depending on my needs.
> --
> Sent from my phone. Please excuse my brevity.
>
> On March 4, 2016 3:28:08 AM PST, Olivier Crouzet <
> olivier.crou...@univ-nantes.fr> wrote:
> >Hi,
> >
> >(1) You should provide a minimal working example;
> >
> >(2) But anyway, does...
> >
> >x = sample(10)
> >c(x[1],diff(x))
> >
> >... do what you want?
> >
> >Olivier.
> >
> >
> >On Fri, 4 Mar 2016
> >13:22:07 +0200 catalin roibu  wrote:
> >
> >> Dear all!
> >>
> >> I want to calculate difference between successive values (cumulative
> >> values) with R. I used diff function, but I want to keep the first
> >> values.
> >>
> >> Please help me to solve this problem!
> >>
> >> Thank you!
> >>
> >> Best regards!
> >>
> >> CR
> >>

-- 

-
-
Catalin-Constantin ROIBU
Lecturer PhD, Forestry engineer
Forestry Faculty of Suceava
Str. Universitatii no. 13, Suceava, 720229,
office phone +4 0230 52 29 78, ext. 531
mobile phone   +4 0745 53 18 01
   +4 0766 71 76 58
FAX:+4 0230 52 16 64
silvic.usv.ro 


Re: [R] difference between successive values

2016-03-04 Thread Jeff Newmiller
"Keep the first values" is imprecise, but mixing an absolute value with a bunch 
of differences doesn't usually work out well.  I frequently choose among

x <- sample( 10 )
dxright <- c( 0, diff(x) )
dxleft <- c( diff(x), 0 )

for calculation purposes depending on my needs. 
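
A small worked sketch of the two paddings, with a fixed made-up vector
in place of sample(10) so the result is reproducible:

```r
x <- c(2, 5, 9, 4)

# pad on the left: dxright[i] is x[i] - x[i-1], with 0 for the first element
dxright <- c(0, diff(x))

# pad on the right: dxleft[i] is x[i+1] - x[i], with 0 for the last element
dxleft <- c(diff(x), 0)

print(dxright)
print(dxleft)
```

Both results have the same length as x, which is what makes them
convenient as extra columns next to the original values.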
-- 
Sent from my phone. Please excuse my brevity.

On March 4, 2016 3:28:08 AM PST, Olivier Crouzet wrote:
>Hi,
>
>(1) You should provide a minimal working example;
>
>(2) But anyway, does...
>
>x = sample(10)
>c(x[1],diff(x))
>
>... do what you want?
>
>Olivier.
>
>
>On Fri, 4 Mar 2016
>13:22:07 +0200 catalin roibu  wrote:
>
>> Dear all!
>> 
>> I want to calculate difference between successive values (cumulative
>> values) with R. I used diff function, but I want to keep the first
>> values.
>> 
>> Please help me to solve this problem!
>> 
>> Thank you!
>> 
>> Best regards!
>> 
>> CR
>> 

Re: [R] difference between successive values

2016-03-04 Thread Olivier Crouzet
Hi,

(1) You should provide a minimal working example;

(2) But anyway, does...

x = sample(10)
c(x[1],diff(x))

... do what you want?

Olivier.


On Fri, 4 Mar 2016
13:22:07 +0200 catalin roibu  wrote:

> Dear all!
> 
> I want to calculate difference between successive values (cumulative
> values) with R. I used diff function, but I want to keep the first
> values.
> 
> Please help me to solve this problem!
> 
> Thank you!
> 
> Best regards!
> 
> CR
> 

-- 
  Olivier Crouzet, PhD
  Laboratoire de Linguistique de Nantes -- UMR6310
  CNRS / Université de Nantes
  Chemin de la Censive du Tertre -- BP 81227
  44312 Nantes cedex 3
  France

  http://www.lling.univ-nantes.fr/


[R] difference between successive values

2016-03-04 Thread catalin roibu
Dear all!

I want to calculate difference between successive values (cumulative
values) with R. I used diff function, but I want to keep the first values.

Please help me to solve this problem!

Thank you!

Best regards!

CR

-- 

-
-
Catalin-Constantin ROIBU
Lecturer PhD, Forestry engineer
Forestry Faculty of Suceava
Str. Universitatii no. 13, Suceava, 720229, Romania
office phone  +4 0230 52 29 78, ext. 531
mobile phone+4 0745 53 18 01
FAX:+4 0230 52 16 64
silvic.usv.ro 


[R] Difference MNP-package and rmnpGibbs from bayesm-package

2015-09-15 Thread A. Martinovici
Hi,

The first thing to check is if both functions use the same 'base' option - they 
usually don't.
On a more general note, the two approaches use different data augmentation
procedures, but this should not have a very large impact on the results,
provided enough draws and similar priors are used. If you want more detail on
the exact differences, the Imai and van Dyk (2005) paper is very useful.

Best regards,
Ana



Re: [R] difference between write.csv(...) and write.table(..., sep=, )

2015-08-16 Thread Berend Hasselman

> On 16-08-2015, at 16:38, Jinsong Zhao jsz...@yeah.net wrote:
>
> Hi there,
>
> I notice that write.csv is a wrap of write.table. However, I can't get the
> same results using both functions. Here is a reproducible example:
>
> > x <- matrix(1:6, nrow = 2)
> > rownames(x) <- letters[1:2]
> > colnames(x) <- LETTERS[1:3]
> > write.csv(x, "")
> "","A","B","C"
> "a",1,3,5
> "b",2,4,6
> > write.table(x, "", sep = ",")
> "A","B","C"
> "a",1,3,5
> "b",2,4,6
>
> The difference of outputs from both functions is clear.
>
> Is it possible to get the same results of write.csv using write.table?
>

Yes. Read the item col.names in the help for write.table and go to the section
"CSV files".

Use write.table(x, "", sep = ",", col.names = NA)

Learn to use R's help.

Berend

> Any suggestions will be really appreciated. Thanks in advance.
>
> Best,
> Jinsong
>

Re: [R] difference between write.csv(...) and write.table(..., sep=, )

2015-08-16 Thread Marc Schwartz

> On Aug 16, 2015, at 9:38 AM, Jinsong Zhao jsz...@yeah.net wrote:
>
> Hi there,
>
> I notice that write.csv is a wrap of write.table. However, I can't get the
> same results using both functions. Here is a reproducible example:
>
> > x <- matrix(1:6, nrow = 2)
> > rownames(x) <- letters[1:2]
> > colnames(x) <- LETTERS[1:3]
> > write.csv(x, "")
> "","A","B","C"
> "a",1,3,5
> "b",2,4,6
> > write.table(x, "", sep = ",")
> "A","B","C"
> "a",1,3,5
> "b",2,4,6
>
> The difference of outputs from both functions is clear.
>
> Is it possible to get the same results of write.csv using write.table?
>
> Any suggestions will be really appreciated. Thanks in advance.
>
> Best,
> Jinsong


> write.csv(x)
"","A","B","C"
"a",1,3,5
"b",2,4,6


> write.table(x, sep = ",", qmethod = "double", col.names = NA)
"","A","B","C"
"a",1,3,5
"b",2,4,6


Read the section on CSV files in ?write.table
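
The equivalence can also be checked programmatically rather than by eye.
A minimal sketch that captures the console output of both calls and
compares it (csv_lines and tab_lines are illustrative names):

```r
x <- matrix(1:6, nrow = 2,
            dimnames = list(letters[1:2], LETTERS[1:3]))

# With no file argument, both functions write to the console,
# so capture.output() can collect the lines they produce
csv_lines <- capture.output(write.csv(x))
tab_lines <- capture.output(
  write.table(x, sep = ",", qmethod = "double", col.names = NA)
)

print(csv_lines)
print(identical(csv_lines, tab_lines))
```

col.names = NA is what adds the empty header field above the row names;
without it the header has one field fewer than the data rows.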

Regards,

Marc Schwartz


Re: [R] difference between write.csv(...) and write.table(..., sep=, )

2015-08-16 Thread Michael Dewey

I think that if you do ?write.csv and then page down to the section
entitled CSV files the mystery will be solved for you in the first few 
paragraphs.


On 16/08/2015 15:38, Jinsong Zhao wrote:

Hi there,

I notice that write.csv is a wrap of write.table. However, I can't get
the same results using both functions. Here is a reproducible example:

> x <- matrix(1:6, nrow = 2)
> rownames(x) <- letters[1:2]
> colnames(x) <- LETTERS[1:3]
> write.csv(x, "")
"","A","B","C"
"a",1,3,5
"b",2,4,6
> write.table(x, "", sep = ",")
"A","B","C"
"a",1,3,5
"b",2,4,6

The difference of outputs from both functions is clear.

Is it possible to get the same results of write.csv using write.table?

Any suggestions will be really appreciated. Thanks in advance.

Best,
Jinsong




--
Michael
http://www.dewey.myzen.co.uk/home.html



[R] difference between write.csv(...) and write.table(..., sep=, )

2015-08-16 Thread Jinsong Zhao

Hi there,

I notice that write.csv is a wrap of write.table. However, I can't get 
the same results using both functions. Here is a reproducible example:


> x <- matrix(1:6, nrow = 2)
> rownames(x) <- letters[1:2]
> colnames(x) <- LETTERS[1:3]
> write.csv(x, "")
"","A","B","C"
"a",1,3,5
"b",2,4,6
> write.table(x, "", sep = ",")
"A","B","C"
"a",1,3,5
"b",2,4,6

The difference of outputs from both functions is clear.

Is it possible to get the same results of write.csv using write.table?

Any suggestions will be really appreciated. Thanks in advance.

Best,
Jinsong



[R] Difference between drop1() vs. anova() for Gaussian glm models

2015-07-20 Thread Karl Ove Hufthammer
Dear list members,

I’m having some problems understanding why drop1() and anova() give 
different results for *Gaussian* glm models. Here’s a simple example:

  d = data.frame(x=1:6,
                 group=factor(c(rep("A",2), rep("B", 4))))
  l = glm(x~group, data=d)

Running the following code gives *three* different p-values. (I would expect 
it to give two different p-values.)

  anova(l, test="F") # p = 0.04179
  anova(l, test="Chisq") # p = 0.00313
  drop1(l, test="Chisq") # p = 0.00841

I’m used to anova() and drop1() giving identical results for the same ‘test’ 
argument. However, it looks like the first two tests above use the F-
statistic as a test statistic, while the last one uses a ‘scaled deviance’ 
statistic:

  1-pf(8.7273, 1, 4)  # F-statistic
  1-pchisq(8.7273, 1) # F-statistic
  1-pchisq(6.9447, 1) # Scaled deviance

I couldn’t find any documentation on this difference. The help page for 
drop1() does say:

  The F tests for the glm methods are based on analysis of
  deviance tests, so if the dispersion is estimated it is based
  on the residual deviance, unlike the F tests of anova.glm.

But here it’s talking about *F* tests. And drop1() with test="F" actually 
gives the *same* p-value as anova() with test="F":

  drop1(l, test="F") # p = 0.04179

Any ideas why anova() and drop1() use different test statistics for the 
same ‘test’ arguments? And why the help page implies (?) that the results 
should differ for F-tests (while not mentioning chi-squared tests), but here 
they do not (and the chi-squared tests do)?

$ sessionInfo()
R version 3.2.1 (2015-06-18)
Platform: x86_64-suse-linux-gnu (64-bit)
Running under: openSUSE 20150714 (Tumbleweed) (x86_64)

locale:
 [1] LC_CTYPE=nn_NO.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=nn_NO.UTF-8        LC_COLLATE=nn_NO.UTF-8
 [5] LC_MONETARY=nn_NO.UTF-8    LC_MESSAGES=nn_NO.UTF-8
 [7] LC_PAPER=nn_NO.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=nn_NO.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats graphics  grDevices utils datasets 
[6] methods   base 

loaded via a namespace (and not attached):
[1] tools_3.2.1

-- 
Karl Ove Hufthammer
E-mail: k...@huftis.org
Jabber: huf...@jabber.no


Re: [R] Difference between drop1() vs. anova() for Gaussian glm models

2015-07-20 Thread peter dalgaard
I am somewhat surprised that _anything_ sensible comes out of anova.glm(l, 
test="Chisq"). I think it is mostly expected that you use F tests for that case.

What does seem to come out is the same as for drop1(l, test="Rao"), which gives 
the scaled score test, which would seem to be equivalent to scaled deviance in 
this case. 

drop1.glm(l, test="Chisq") appears to be calculating the real likelihood 
ratio test, evaluated in its asymptotic chi-square distribution:

 2*(logLik(l) - logLik(update(l,.~1)))
'log Lik.' 6.944717 (df=3)

(Apologies for the daft output there... Why does - not either subtract the df 
or unclass the whole thing?)

Notice that the scaled tests basically assume that the scale is known, even if 
it is estimated, so in that sense, the real LRT should be superior. However, in 
that case it is well known that the asymptotic approximation can be improved  
by transforming the LRT to the F statistic, whose exact distribution is known.

The remaining part of the riddle is why anova.glm doesn't do likelihood 
differences in the same fashion as drop1.glm. My best guess is that it tries to 
be consistent with anova.lm and anova.lm tries not to have to refit the 
sequence of submodels.
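
As a cross-check of the explanation above, the three p-values from the original question can be rebuilt by hand from the two fitted models. This is only a sketch using the example data from the thread; the names l0, dispersion, f_stat and lrt are introduced here for illustration and are not part of anova.glm or drop1.glm.

```r
d <- data.frame(x = 1:6,
                group = factor(c(rep("A", 2), rep("B", 4))))
l  <- glm(x ~ group, data = d)   # Gaussian family is the glm() default
l0 <- update(l, . ~ 1)           # intercept-only submodel

# F statistic: drop in deviance divided by the estimated dispersion
# (residual deviance over residual df of the larger model)
dispersion <- deviance(l) / df.residual(l)
f_stat <- (deviance(l0) - deviance(l)) / dispersion

# Likelihood ratio statistic; as.numeric() drops the "logLik" class
# and its df attribute
lrt <- as.numeric(2 * (logLik(l) - logLik(l0)))

p_f      <- pf(f_stat, 1, df.residual(l), lower.tail = FALSE)  # F test
p_scaled <- pchisq(f_stat, 1, lower.tail = FALSE)  # scaled statistic as chi-square
p_lrt    <- pchisq(lrt, 1, lower.tail = FALSE)     # likelihood ratio test
round(c(F = p_f, scaled = p_scaled, LRT = p_lrt), 5)
```

The values should correspond to the 0.04179, 0.00313 and 0.00841 quoted in the question, in that order.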

 

 On 20 Jul 2015, at 14:59 , Karl Ove Hufthammer k...@huftis.org wrote:
 
 Dear list members,
 
 I’m having some problems understanding why drop1() and anova() give 
 different results for *Gaussian* glm models. Here’s a simple example:
 
  d = data.frame(x=1:6,
                 group=factor(c(rep("A",2), rep("B", 4))))
  l = glm(x~group, data=d)
 
 Running the following code gives *three* different p-values. (I would expect 
 it to give two different p-values.)
 
  anova(l, test="F") # p = 0.04179
  anova(l, test="Chisq") # p = 0.00313
  drop1(l, test="Chisq") # p = 0.00841
 
 I’m used to anova() and drop1() giving identical results for the same ‘test’ 
 argument. However, it looks like the first two tests above use the F-
 statistic as a test statistic, while the last one uses a ‘scaled deviance’ 
 statistic:
 
  1-pf(8.7273, 1, 4)  # F-statistic
  1-pchisq(8.7273, 1) # F-statistic
  1-pchisq(6.9447, 1) # Scaled deviance
 
 I couldn’t find any documentation on this difference. The help page for 
 drop1() does say:
 
  The F tests for the glm methods are based on analysis of
  deviance tests, so if the dispersion is estimated it is based
  on the residual deviance, unlike the F tests of anova.glm.
 
 But here it’s talking about *F* tests. And drop1() with test="F" actually 
 gives the *same* p-value as anova() with test="F":
 
  drop1(l, test="F") # p = 0.04179
 
 Any ideas why anova() and drop1() use different test statistics for the 
 same ‘test’ arguments? And why the help page implies (?) that the results 
 should differ for F-tests (while not mentioning chi-squared tests), but here 
 they do not (and the chi-squared tests do)?
 
 $ sessionInfo()
 R version 3.2.1 (2015-06-18)
 Platform: x86_64-suse-linux-gnu (64-bit)
 Running under: openSUSE 20150714 (Tumbleweed) (x86_64)
 
 locale:
  [1] LC_CTYPE=nn_NO.UTF-8       LC_NUMERIC=C
  [3] LC_TIME=nn_NO.UTF-8        LC_COLLATE=nn_NO.UTF-8
  [5] LC_MONETARY=nn_NO.UTF-8    LC_MESSAGES=nn_NO.UTF-8
  [7] LC_PAPER=nn_NO.UTF-8       LC_NAME=C
  [9] LC_ADDRESS=C               LC_TELEPHONE=C
 [11] LC_MEASUREMENT=nn_NO.UTF-8 LC_IDENTIFICATION=C
 
 attached base packages:
 [1] stats graphics  grDevices utils datasets 
 [6] methods   base 
 
 loaded via a namespace (and not attached):
 [1] tools_3.2.1
 
 -- 
 Karl Ove Hufthammer
 E-mail: k...@huftis.org
 Jabber: huf...@jabber.no
 

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd@cbs.dk  Priv: pda...@gmail.com


Re: [R] Difference between 32-bit and 64-bit version

2015-06-04 Thread Thierry Onkelinx
Dear Duncan,

I had been thinking about FAQ 7.31. I tried to create a dummy dataset with
the same structure to replicate the problem without the need of sending my
dataset. However, all of them gave identical() results between 32-bit and
64-bit. Note that coef()$fRow is a 1266 x 6 data.frame. Is it correct to
infer that tiny differences between 32-bit and 64-bit are possible but have
a low probability of occurring?

signif() makes indeed more sense than round(). Using 20 digits gives
identical results, 21 digits gives non identical results.

Best regards,

ir. Thierry Onkelinx
Instituut voor natuur- en bosonderzoek / Research Institute for Nature and
Forest
team Biometrie & Kwaliteitszorg / team Biometrics & Quality Assurance
Kliniekstraat 25
1070 Anderlecht
Belgium

To call in the statistician after the experiment is done may be no more
than asking him to perform a post-mortem examination: he may be able to say
what the experiment died of. ~ Sir Ronald Aylmer Fisher
The plural of anecdote is not data. ~ Roger Brinner
The combination of some data and an aching desire for an answer does not
ensure that a reasonable answer can be extracted from a given body of data.
~ John Tukey

2015-06-03 18:09 GMT+02:00 Duncan Murdoch murdoch.dun...@gmail.com:

 On 03/06/2015 11:56 AM, Thierry Onkelinx wrote:
  Dear all,
 
  I'm a bit puzzled by the difference in an object when created in R 32-bit
  and R 64-bit.
 
  Consider the code below. test.rda is available at
 
 https://drive.google.com/file/d/0BzBrlGSuB9n-NFBWeC1TR093Sms/view?usp=sharing
 
  # Run in R 3.2.0 Windows 32-bit, lme4 1.1-8
  library(lme4)
  load("test.rda")
  coef.32 <- coef(test)
  save(coef.32, file = "32bit.rda")
 
  # Run in R 3.2.0 Windows 64-bit, lme4 1.1-8
  library(lme4)
  load("~/test.rda")
  coef.64 <- coef(test)
  save(coef.64, file = "64bit.rda")
 
 
  # Compare the results
  # Run in R 3.2.0 Windows 32-bit, lme4 1.1-8
  # Run in R 3.2.0 Windows 64-bit, lme4 1.1-8
  library(lme4)
  load("32bit.rda")
  load("64bit.rda")
  identical(coef.32, coef.64) # FALSE
  identical(coef.32$fRow, coef.64$fRow) # FALSE
  identical(coef.32$fLocation, coef.64$fLocation) # TRUE
  identical(coef.32$fSubLocation, coef.64$fSubLocation) # TRUE
 
  The first comparison is FALSE, because the second is FALSE. But why is
 the
  second FALSE and the third and fourth TRUE?
 
  My goal is the calculate a SHA1 hash on the coef(test) to track if the
  coefficients of test have changed. I'd like to get the same hash on a
  32-bit and 64-bit system. A simple hack would be to calculate the hash on
  round(coef(test), 20). Is that a good or bad idea?
 
  identical(round(coef.32$fRow, 20), round(coef.64$fRow, 20)) # TRUE

 Different math libraries round differently, so small differences are
 expected.  This is FAQ 7.31.  In many cases the 32 bit calculations are
 more accurate, because they tend to use more 80 bit extended precision
 intermediate values, but that is not guaranteed.

 Rounding before comparing makes sense, but I would use signif() instead
 of round(), I would choose a relatively small number of significant
 digits, and I would expect to see a few false positives:  if the true
 value is 0 but some random noise is added, I'd expect values rounded
 by signif() to be unequal.

 Duncan Murdoch

 
  Best regards,
 
  ir. Thierry Onkelinx
  Instituut voor natuur- en bosonderzoek / Research Institute for Nature
 and
  Forest
  team Biometrie  Kwaliteitszorg / team Biometrics  Quality Assurance
  Kliniekstraat 25
  1070 Anderlecht
  Belgium
 
  To call in the statistician after the experiment is done may be no more
  than asking him to perform a post-mortem examination: he may be able to
 say
  what the experiment died of. ~ Sir Ronald Aylmer Fisher
  The plural of anecdote is not data. ~ Roger Brinner
  The combination of some data and an aching desire for an answer does not
  ensure that a reasonable answer can be extracted from a given body of
 data.
  ~ John Tukey
 
 





Re: [R] Difference between 32-bit and 64-bit version

2015-06-04 Thread Thierry Onkelinx
"low probability of occurring" was just statistician's lingo for "rare" ;-)


ir. Thierry Onkelinx
Instituut voor natuur- en bosonderzoek / Research Institute for Nature and
Forest
team Biometrie & Kwaliteitszorg / team Biometrics & Quality Assurance
Kliniekstraat 25
1070 Anderlecht
Belgium

To call in the statistician after the experiment is done may be no more
than asking him to perform a post-mortem examination: he may be able to say
what the experiment died of. ~ Sir Ronald Aylmer Fisher
The plural of anecdote is not data. ~ Roger Brinner
The combination of some data and an aching desire for an answer does not
ensure that a reasonable answer can be extracted from a given body of data.
~ John Tukey

2015-06-04 11:53 GMT+02:00 Duncan Murdoch murdoch.dun...@gmail.com:

 On 04/06/2015 3:59 AM, Thierry Onkelinx wrote:
  Dear Duncan,
 
  I had been thinking about FAQ 7.31. I tried to create a dummy dataset
  with the same structure to replicate the problem with the need of
  sending my dataset. However all of them gave identical() results between
  32-bit and 64-bit. Note that coef()$fRow is a 1266 x 6 data.frame. Is it
  correct to infer that tiny difference between 32-bit and 64-bit are
  possible but have a low probability of occurring?

 Differences are rare, but it's hard to assign a probability to them.

 Duncan Murdoch

 
  signif() makes indeed more sense than round(). Using 20 digits gives
  identical results, 21 digits gives non identical results.
 
  Best regards,
 
  ir. Thierry Onkelinx
  Instituut voor natuur- en bosonderzoek / Research Institute for Nature
  and Forest
  team Biometrie  Kwaliteitszorg / team Biometrics  Quality Assurance
  Kliniekstraat 25
  1070 Anderlecht
  Belgium
 
  To call in the statistician after the experiment is done may be no more
  than asking him to perform a post-mortem examination: he may be able to
  say what the experiment died of. ~ Sir Ronald Aylmer Fisher
  The plural of anecdote is not data. ~ Roger Brinner
  The combination of some data and an aching desire for an answer does not
  ensure that a reasonable answer can be extracted from a given body of
  data. ~ John Tukey
 
  2015-06-03 18:09 GMT+02:00 Duncan Murdoch murdoch.dun...@gmail.com
  mailto:murdoch.dun...@gmail.com:
 
  On 03/06/2015 11:56 AM, Thierry Onkelinx wrote:
   Dear all,
  
   I'm a bit puzzled by the difference in an object when created in R
  32-bit
   and R 64-bit.
  
   Consider the code below. test.rda is available at
  
 
 https://drive.google.com/file/d/0BzBrlGSuB9n-NFBWeC1TR093Sms/view?usp=sharing
  
   # Run in R 3.2.0 Windows 32-bit, lme4 1.1-8
   library(lme4)
   load("test.rda")
   coef.32 <- coef(test)
   save(coef.32, file = "32bit.rda")
  
   # Run in R 3.2.0 Windows 64-bit, lme4 1.1-8
   library(lme4)
   load("~/test.rda")
   coef.64 <- coef(test)
   save(coef.64, file = "64bit.rda")
  
  
   # Compare the results
   # Run in R 3.2.0 Windows 32-bit, lme4 1.1-8
   # Run in R 3.2.0 Windows 64-bit, lme4 1.1-8
   library(lme4)
   load("32bit.rda")
   load("64bit.rda")
   identical(coef.32, coef.64) # FALSE
   identical(coef.32$fRow, coef.64$fRow) # FALSE
   identical(coef.32$fLocation, coef.64$fLocation) # TRUE
   identical(coef.32$fSubLocation, coef.64$fSubLocation) # TRUE
  
   The first comparison is FALSE, because the second is FALSE. But
  why is the
   second FALSE and the third and fourth TRUE?
  
   My goal is the calculate a SHA1 hash on the coef(test) to track if
 the
   coefficients of test have changed. I'd like to get the same hash
 on a
   32-bit and 64-bit system. A simple hack would be to calculate the
  hash on
   round(coef(test), 20). Is that a good or bad idea?
  
   identical(round(coef.32$fRow, 20), round(coef.64$fRow, 20)) # TRUE
 
  Different math libraries round differently, so small differences are
  expected.  This is FAQ 7.31.  In many cases the 32 bit calculations
 are
  more accurate, because they tend to use more 80 bit extended
 precision
  intermediate values, but that is not guaranteed.
 
  Rounding before comparing makes sense, but I would use signif()
 instead
  of round(), I would choose a relatively small number of significant
  digits, and I would expect to see a few false positives:  if the true
  value is 0 but some random noise is added, I'd expect values
 rounded
  by signif() to be unequal.
 
  Duncan Murdoch
 
  
   Best regards,
  
   ir. Thierry Onkelinx
   Instituut voor natuur- en bosonderzoek / Research Institute for
 Nature and
   Forest
   team Biometrie  Kwaliteitszorg / team Biometrics  Quality
 Assurance
   Kliniekstraat 25
   1070 Anderlecht
   Belgium
  
   To call in the statistician after the experiment is done may be no
 more
   than asking him to perform a post-mortem 

Re: [R] Difference between 32-bit and 64-bit version

2015-06-04 Thread Duncan Murdoch
On 04/06/2015 3:59 AM, Thierry Onkelinx wrote:
 Dear Duncan,
 
 I had been thinking about FAQ 7.31. I tried to create a dummy dataset
 with the same structure to replicate the problem with the need of
 sending my dataset. However all of them gave identical() results between
 32-bit and 64-bit. Note that coef()$fRow is a 1266 x 6 data.frame. Is it
 correct to infer that tiny difference between 32-bit and 64-bit are
 possible but have a low probability of occurring?

Differences are rare, but it's hard to assign a probability to them.

Duncan Murdoch

 
 signif() makes indeed more sense than round(). Using 20 digits gives
 identical results, 21 digits gives non identical results.
 
 Best regards,
 
 ir. Thierry Onkelinx
 Instituut voor natuur- en bosonderzoek / Research Institute for Nature
 and Forest
 team Biometrie  Kwaliteitszorg / team Biometrics  Quality Assurance
 Kliniekstraat 25
 1070 Anderlecht
 Belgium
 
 To call in the statistician after the experiment is done may be no more
 than asking him to perform a post-mortem examination: he may be able to
 say what the experiment died of. ~ Sir Ronald Aylmer Fisher
 The plural of anecdote is not data. ~ Roger Brinner
 The combination of some data and an aching desire for an answer does not
 ensure that a reasonable answer can be extracted from a given body of
 data. ~ John Tukey
 
 2015-06-03 18:09 GMT+02:00 Duncan Murdoch murdoch.dun...@gmail.com
 mailto:murdoch.dun...@gmail.com:
 
 On 03/06/2015 11:56 AM, Thierry Onkelinx wrote:
  Dear all,
 
  I'm a bit puzzled by the difference in an object when created in R
 32-bit
  and R 64-bit.
 
  Consider the code below. test.rda is available at
 
 
 https://drive.google.com/file/d/0BzBrlGSuB9n-NFBWeC1TR093Sms/view?usp=sharing
 
  # Run in R 3.2.0 Windows 32-bit, lme4 1.1-8
  library(lme4)
  load("test.rda")
  coef.32 <- coef(test)
  save(coef.32, file = "32bit.rda")
 
  # Run in R 3.2.0 Windows 64-bit, lme4 1.1-8
  library(lme4)
  load("~/test.rda")
  coef.64 <- coef(test)
  save(coef.64, file = "64bit.rda")
 
 
  # Compare the results
  # Run in R 3.2.0 Windows 32-bit, lme4 1.1-8
  # Run in R 3.2.0 Windows 64-bit, lme4 1.1-8
  library(lme4)
  load("32bit.rda")
  load("64bit.rda")
  identical(coef.32, coef.64) # FALSE
  identical(coef.32$fRow, coef.64$fRow) # FALSE
  identical(coef.32$fLocation, coef.64$fLocation) # TRUE
  identical(coef.32$fSubLocation, coef.64$fSubLocation) # TRUE
 
  The first comparison is FALSE, because the second is FALSE. But
 why is the
  second FALSE and the third and fourth TRUE?
 
  My goal is the calculate a SHA1 hash on the coef(test) to track if the
  coefficients of test have changed. I'd like to get the same hash on a
  32-bit and 64-bit system. A simple hack would be to calculate the
 hash on
  round(coef(test), 20). Is that a good or bad idea?
 
  identical(round(coef.32$fRow, 20), round(coef.64$fRow, 20)) # TRUE
 
 Different math libraries round differently, so small differences are
 expected.  This is FAQ 7.31.  In many cases the 32 bit calculations are
 more accurate, because they tend to use more 80 bit extended precision
 intermediate values, but that is not guaranteed.
 
 Rounding before comparing makes sense, but I would use signif() instead
 of round(), I would choose a relatively small number of significant
 digits, and I would expect to see a few false positives:  if the true
 value is 0 but some random noise is added, I'd expect values rounded
 by signif() to be unequal.
 
 Duncan Murdoch
 
 
  Best regards,
 
  ir. Thierry Onkelinx
  Instituut voor natuur- en bosonderzoek / Research Institute for Nature 
 and
  Forest
  team Biometrie  Kwaliteitszorg / team Biometrics  Quality Assurance
  Kliniekstraat 25
  1070 Anderlecht
  Belgium
 
  To call in the statistician after the experiment is done may be no more
  than asking him to perform a post-mortem examination: he may be able to 
 say
  what the experiment died of. ~ Sir Ronald Aylmer Fisher
  The plural of anecdote is not data. ~ Roger Brinner
  The combination of some data and an aching desire for an answer does not
  ensure that a reasonable answer can be extracted from a given body of 
 data.
  ~ John Tukey
 
 
 



Re: [R] Difference between 32-bit and 64-bit version

2015-06-03 Thread Duncan Murdoch
On 03/06/2015 11:56 AM, Thierry Onkelinx wrote:
 Dear all,
 
 I'm a bit puzzled by the difference in an object when created in R 32-bit
 and R 64-bit.
 
 Consider the code below. test.rda is available at
 https://drive.google.com/file/d/0BzBrlGSuB9n-NFBWeC1TR093Sms/view?usp=sharing
 
 # Run in R 3.2.0 Windows 32-bit, lme4 1.1-8
 library(lme4)
 load("test.rda")
 coef.32 <- coef(test)
 save(coef.32, file = "32bit.rda")
 
 # Run in R 3.2.0 Windows 64-bit, lme4 1.1-8
 library(lme4)
 load("~/test.rda")
 coef.64 <- coef(test)
 save(coef.64, file = "64bit.rda")
 
 
 # Compare the results
 # Run in R 3.2.0 Windows 32-bit, lme4 1.1-8
 # Run in R 3.2.0 Windows 64-bit, lme4 1.1-8
 library(lme4)
 load("32bit.rda")
 load("64bit.rda")
 identical(coef.32, coef.64) # FALSE
 identical(coef.32$fRow, coef.64$fRow) # FALSE
 identical(coef.32$fLocation, coef.64$fLocation) # TRUE
 identical(coef.32$fSubLocation, coef.64$fSubLocation) # TRUE
 
 The first comparison is FALSE, because the second is FALSE. But why is the
 second FALSE and the third and fourth TRUE?
 
 My goal is to calculate a SHA1 hash on the coef(test) to track if the
 coefficients of test have changed. I'd like to get the same hash on a
 32-bit and 64-bit system. A simple hack would be to calculate the hash on
 round(coef(test), 20). Is that a good or bad idea?
 
 identical(round(coef.32$fRow, 20), round(coef.64$fRow, 20)) # TRUE

Different math libraries round differently, so small differences are
expected.  This is FAQ 7.31.  In many cases the 32 bit calculations are
more accurate, because they tend to use more 80 bit extended precision
intermediate values, but that is not guaranteed.

Rounding before comparing makes sense, but I would use signif() instead
of round(), I would choose a relatively small number of significant
digits, and I would expect to see a few false positives:  if the true
value is 0 but some random noise is added, I'd expect values rounded
by signif() to be unequal.
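
The advice above can be illustrated without the original test.rda file. The sketch below uses the classic FAQ 7.31 numbers as a stand-in for the 32-bit/64-bit coefficients; the helper zap() and its threshold eps are hypothetical choices made for this example, not something proposed in the thread.

```r
a <- 0.1 + 0.2   # carries a rounding error in the last bit
b <- 0.3

identical(a, b)                           # exact comparison fails
identical(signif(a, 10), signif(b, 10))   # agrees after rounding to 10 digits
isTRUE(all.equal(a, b))                   # tolerance-based alternative

# Caveat from the message above: noise around a true zero keeps its
# leading digits under signif(), so the rounded values still differ
noise_1 <- 1.234e-17
noise_2 <- -5.678e-17
near_zero_equal <- identical(signif(noise_1, 10), signif(noise_2, 10))

# One possible workaround (an assumption, not from the thread):
# set values below an arbitrary threshold to exactly zero first
zap <- function(x, eps = 1e-12) ifelse(abs(x) < eps, 0, x)
zapped_equal <- identical(signif(zap(noise_1), 10), signif(zap(noise_2), 10))
```

A hash computed on the rounded (and possibly zapped) values would then be stable across platforms only as far as the chosen number of digits and threshold allow.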

Duncan Murdoch

 
 Best regards,
 
 ir. Thierry Onkelinx
 Instituut voor natuur- en bosonderzoek / Research Institute for Nature and
 Forest
 team Biometrie  Kwaliteitszorg / team Biometrics  Quality Assurance
 Kliniekstraat 25
 1070 Anderlecht
 Belgium
 
 To call in the statistician after the experiment is done may be no more
 than asking him to perform a post-mortem examination: he may be able to say
 what the experiment died of. ~ Sir Ronald Aylmer Fisher
 The plural of anecdote is not data. ~ Roger Brinner
 The combination of some data and an aching desire for an answer does not
 ensure that a reasonable answer can be extracted from a given body of data.
 ~ John Tukey
 
   [[alternative HTML version deleted]]
 




[R] Difference between 32-bit and 64-bit version

2015-06-03 Thread Thierry Onkelinx
Dear all,

I'm a bit puzzled by the difference in an object when created in R 32-bit
and R 64-bit.

Consider the code below. test.rda is available at
https://drive.google.com/file/d/0BzBrlGSuB9n-NFBWeC1TR093Sms/view?usp=sharing

# Run in R 3.2.0 Windows 32-bit, lme4 1.1-8
library(lme4)
load("test.rda")
coef.32 <- coef(test)
save(coef.32, file = "32bit.rda")

# Run in R 3.2.0 Windows 64-bit, lme4 1.1-8
library(lme4)
load("~/test.rda")
coef.64 <- coef(test)
save(coef.64, file = "64bit.rda")


# Compare the results
# Run in R 3.2.0 Windows 32-bit, lme4 1.1-8
# Run in R 3.2.0 Windows 64-bit, lme4 1.1-8
library(lme4)
load("32bit.rda")
load("64bit.rda")
identical(coef.32, coef.64) # FALSE
identical(coef.32$fRow, coef.64$fRow) # FALSE
identical(coef.32$fLocation, coef.64$fLocation) # TRUE
identical(coef.32$fSubLocation, coef.64$fSubLocation) # TRUE

The first comparison is FALSE, because the second is FALSE. But why is the
second FALSE and the third and fourth TRUE?

My goal is to calculate a SHA1 hash on the coef(test) to track if the
coefficients of test have changed. I'd like to get the same hash on a
32-bit and 64-bit system. A simple hack would be to calculate the hash on
round(coef(test), 20). Is that a good or bad idea?

identical(round(coef.32$fRow, 20), round(coef.64$fRow, 20)) # TRUE

Best regards,

ir. Thierry Onkelinx
Instituut voor natuur- en bosonderzoek / Research Institute for Nature and
Forest
team Biometrie & Kwaliteitszorg / team Biometrics & Quality Assurance
Kliniekstraat 25
1070 Anderlecht
Belgium

To call in the statistician after the experiment is done may be no more
than asking him to perform a post-mortem examination: he may be able to say
what the experiment died of. ~ Sir Ronald Aylmer Fisher
The plural of anecdote is not data. ~ Roger Brinner
The combination of some data and an aching desire for an answer does not
ensure that a reasonable answer can be extracted from a given body of data.
~ John Tukey



Re: [R] difference between max in summary table and max function

2015-02-16 Thread Franckx Laurent
Thanks for the clarification. The basic error on my side was that I had 
misunderstood the digits option of summary(): I had not understood that 
even integers might end up being rounded. The problem is solved now.

-Original Message-
From: Allen Bingham [mailto:aebingh...@gmail.com]
Sent: zaterdag 14 februari 2015 8:16
To: 'Martyn Byng'; r-help@r-project.org
Cc: Franckx Laurent
Subject: RE: [R] difference between max in summary table and max function

I thought I'd chime in ... submitting the following:

   ?summary

Provides the following documentation for the default method, which applies to 
general objects (other than class "data.frame", "factor", or "matrix"):

   ## Default S3 method:
   summary(object, ..., digits = max(3, getOption("digits")-3))

so passing along the object testrow w/o a corresponding argument for digits 
... defaults to digits=4 (assuming your system has the same default option of 
digits = 7 that mine does).

... and since later in the documentation it indicates that digits is an:

   integer, used for number formatting with signif()

so noting that all of the values you reported from summary(testrow) all have
4 significant digits (including the Max. of 131500) (excepting the min value of 
1), summary() is doing what it is documented to do.
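
The rounding described above can be reproduced directly with signif(), independent of how a given R version prints the summary. This is a small sketch; the name rounded_max is introduced here, and the as.numeric() conversion is only there to make the rounding explicit on a double. Note that more recent R releases may already display the unrounded maximum in summary(testrow).

```r
testrow <- 1:131509

max(testrow)                                         # the exact maximum

# Four significant digits, i.e. max(3, getOption("digits") - 3) when
# the "digits" option is at its default of 7
rounded_max <- signif(as.numeric(max(testrow)), 4)
rounded_max

# Asking for more digits avoids the rounding in the summary
summary(testrow, digits = 20)
```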

... sorry for being pedantic --- but doing so to point out how helpful the ? 
command can be sometimes.

Hope this helps.

__
Allen Bingham
Bingham Statistical Consulting
aebingh...@gmail.com
LinkedIn Profile: www.linkedin.com/pub/allen-bingham/3b/556/325





-Original Message-
From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Martyn Byng
Sent: Friday, February 13, 2015 3:15 AM
To: Franckx Laurent; r-help@r-project.org
Subject: Re: [R] difference between max in summary table and max function

It's a formatting thing, try

summary(testrow,digits=20)

-Original Message-
From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Franckx Laurent
Sent: 13 February 2015 11:00
To: r-help@r-project.org
Subject: [R] difference between max in summary table and max function

Dear all

I have found out that the max in the summary of an integer vector is not always 
equal to the actual maximum of that vector. For example:


 testrow <- c(1:131509)
 summary(testrow)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
      1   32880   65760   65760   98630  131500
 max(testrow)
[1] 131509

This has occurred both in a Windows and in a Linux environment.

Does this mean that the max value in the summary is only an approximation?

Best regards

Laurent Franckx, PhD
Senior researcher sustainable mobility
VITO NV | Boeretang 200 | 2400 Mol
Tel. ++ 32 14 33 58 22| mob. +32 479 25 59 07 | Skype: laurent.franckx | 
laurent.fran...@vito.be | Twitter @LaurentFranckx




VITO Disclaimer: http://www.vito.be/e-maildisclaimer





Re: [R] Difference in dates for unique ID

2015-02-15 Thread arun
HI Farnoosh,



Not sure I understand the expected output.  The difference between the first 2 
dates is 136 days.

Maybe this helps:

   library(data.table)
   dcast.data.table(setDT(df)[, list(Visit=.N, Diff=
       as.numeric(abs(diff(as.Date(Date, format='%d-%b-%y'))))),
     by = ID], ID+Visit~ Diff, value.var='Diff', length)

   ID Visit 136 255 857
1:  1     2   1   0   0
2:  2     3   0   1   1
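
For comparison, the per-ID gaps can also be computed in base R without a loop. This is only a sketch on the five rows posted in the question; the object names df and gaps are illustrative, and the call to Sys.setlocale() is an assumption made so that the English month abbreviations parse regardless of the system locale.

```r
# Parse "%b" month abbreviations as English names
Sys.setlocale("LC_TIME", "C")

df <- data.frame(
  ID   = c(1, 1, 2, 2, 2),
  Date = c("06-Sep-13", "20-Jan-14", "06-Mar-12", "25-Jun-11", "29-Oct-13"),
  stringsAsFactors = FALSE
)
df$Date <- as.Date(df$Date, format = "%d-%b-%y")

# Number of visits per ID
visits <- table(df$ID)

# Absolute difference in days between consecutive dates within each ID
gaps <- lapply(split(df$Date, df$ID),
               function(d) abs(as.numeric(diff(d))))
gaps
```

Each list element holds one fewer value than the number of visits for that ID, which is why the wide table above has a separate indicator column per distinct gap.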





On Wednesday, February 11, 2015 5:47 PM, farnoosh sheikhi 
farnoosh...@yahoo.com wrote:



Hi Arun,

I have a data set that looks like below. I wanted to get a difference in dates 
for each unique ID and record it as a new X and have binary input for each one. 

ID   Date
1    06-Sep-13
1    20-Jan-14
2    06-Mar-12
2    25-Jun-11
2    29-Oct-13

For example for the first two dates for ID=1 ( 20-Jan-14 - 06-Sep-13 ~ 121) and 
I want the data to be like follows:

ID  Visit   121
1   2       1
2   3       0


I really appreciate if you can help me with this. I know I need to write some 
kind of loop, but I don't know how to think of the logic behind it.
Thanks a lot.



Farnoosh



Re: [R] Difference in dates for unique ID

2015-02-15 Thread farnoosh sheikhi via R-help
That's exactly what I was thinking. Thanks tons.


[R] difference between max in summary table and max function

2015-02-13 Thread Franckx Laurent
Dear all

I have found out that the max in the summary of an integer vector is not always 
equal to the actual maximum of that vector. For example:


> testrow <- c(1:131509)
> summary(testrow)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
      1   32880   65760   65760   98630  131500
> max(testrow)
[1] 131509

This has occurred both in a Windows and in a Linux environment.

Does this mean that the max value in the summary is only an approximation?

Best regards

Laurent Franckx, PhD
Senior researcher sustainable mobility
VITO NV | Boeretang 200 | 2400 Mol
Tel. ++ 32 14 33 58 22| mob. +32 479 25 59 07 | Skype: laurent.franckx | 
laurent.fran...@vito.be | Twitter @LaurentFranckx





Re: [R] difference between max in summary table and max function

2015-02-13 Thread Martyn Byng
It's a formatting thing; try

summary(testrow,digits=20)


Re: [R] difference between max in summary table and max function

2015-02-13 Thread Allen Bingham
I thought I'd chime in ... submitting the following:

   ?summary

provides the following documentation for the default method (used for any
object whose class is not data.frame, factor, or matrix):

   ## Default S3 method:
   summary(object, ..., digits = max(3, getOption("digits")-3))

so passing along the object testrow w/o a corresponding argument for
digits ... defaults to digits=4 (assuming your system has the same default
option of digits = 7 that mine does).

... and since later in the documentation it indicates that digits is an:

   integer, used for number formatting with signif()

and noting that the values you reported from summary(testrow) all have
4 significant digits (including the Max. of 131500, but excepting the min value
of 1), summary() is doing what it is documented to do.

... sorry for being pedantic --- but doing so to point out how helpful the
? command can be sometimes.

Hope this helps.
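
To make the rounding concrete, here is a minimal sketch that applies signif() directly with the digits value described above. It assumes the default options(digits = 7); the names digits_used, exact_max and rounded_max are illustrative. Other R releases may format summary() output differently, so the sketch does not rely on what summary() itself prints.

```r
testrow <- 1:131509

# Default described in ?summary above: max(3, getOption("digits") - 3).
# With the assumed default options(digits = 7) this gives 4.
digits_used <- max(3, 7 - 3)

exact_max   <- max(testrow)                    # the true maximum
rounded_max <- signif(exact_max, digits_used)  # kept to 4 significant digits

print(c(exact = exact_max, rounded = rounded_max))
```

Asking for more digits, as in summary(testrow, digits = 20), avoids the rounding in the display.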

__
Allen Bingham
Bingham Statistical Consulting
aebingh...@gmail.com
LinkedIn Profile: www.linkedin.com/pub/allen-bingham/3b/556/325






[R] Difference between eigs() and eigen()

2014-12-29 Thread Pierrick Bruneau
Dear R users and contributors,

I recently observed a difference between the outputs of the classic
eigen() function and the Arnoldi variant eigs(), which extracts only
the first few eigenpairs. Here is some sample code illustrating the
problem:

library(rARPACK)
library(speccalt)
set.seed(1)

# compute kernel matrix from rows of synth5
# then its Laplacian
kern <- local.rbfdot(synth5)
diag(kern) <- 0
deg <- sapply(1:(dim(synth5)[1]), function(i) {
  return(sum(kern[i,]))
})
L <- diag(1/sqrt(deg)) %*% kern %*% diag(1/sqrt(deg))

eig1 <- eigs(L, 6)
eig2 <- eigen(L, symmetric=TRUE)

eig1$values then reads:
1.000 1.000 0.9993805 0.9992561 0.9985084 0.9975311

whereas eig2$values reads:
1.000 1.000 1.000 1.000 0.9993805 0.9992561
which is the correct result (eigenvalue 1 has multiplicity 4 in that example).

I guess there is an issue between Arnoldi methods and eigenvalues with
multiplicities greater than 1 (indeed, at the end of the series the
unique eigenvalues look identical), but as the problem is not documented
in the package PDF, I'm unsure whether this is
implementation-specific or general to Arnoldi methods... The issue is quite
important in my case, as the associated eigenvectors then differ quite
significantly, and this negatively affects my subsequent operations.

I guess the next step is to dig into the mathematical literature, but
before doing so I wondered whether someone has already encountered this issue.

Any help would be appreciated,
Pierrick


Re: [R] Difference between eigs() and eigen()

2014-12-29 Thread Uwe Ligges
eigs() is from a contributed package. No idea what it is about, but my 
guess is these are actually numerical differences coming from different 
algorithms used to calculate the eigenvalues.

For details, please ask the author of the corresponding contributed package.

Best,
Uwe Ligges



Re: [R] Difference in cummulative variance depending on print command

2014-12-07 Thread William Revelle
Dear Rena,

As Peter points out, it is better to ask the maintainer of the program for 
detailed questions.  

  As Peter correctly surmised, print.psych (which is used to print the output 
from the fa function), knows that you have an oblique solution and is reporting 
the amount of variance associated with the oblique factors (taking into account 
that they are correlated).  The default print method assumes orthogonal factors.

If you compare the total amount of variance accounted for (cumulative Var) for 
all of the factors (.59) , this will match that found using orthogonal 
rotations, while the default print method of the loadings does not.

Bill
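
A small numerical sketch may help show why the two tables can differ. It is not taken from the psych source; the loadings L and factor correlations Phi below are made-up numbers, and weighting by Phi is only one way of letting the factor correlations enter the bookkeeping. The point it illustrates is that the correlation-aware column totals add up to the sum of the communalities, while plain sums of squared loadings do not when the factors are correlated.

```r
# Hypothetical pattern loadings for 4 items on 2 factors (made-up numbers)
L <- matrix(c(0.8, 0.7, 0.1, 0.0,
              0.0, 0.1, 0.6, 0.9), ncol = 2,
            dimnames = list(paste0("item", 1:4), c("F1", "F2")))

# Hypothetical factor correlation matrix for an oblique solution
Phi <- matrix(c(1.0, 0.4,
                0.4, 1.0), ncol = 2)

# Orthogonal-style bookkeeping: plain column sums of squared loadings
ss_orthogonal <- colSums(L^2)

# Bookkeeping that takes the factor correlations into account
ss_oblique <- colSums((L %*% Phi) * L)

# Communalities implied by the oblique model: diag(L Phi L')
h2 <- diag(L %*% Phi %*% t(L))

print(rbind(orthogonal = ss_orthogonal, oblique = ss_oblique))
print(c(total_orthogonal = sum(ss_orthogonal),
        total_oblique    = sum(ss_oblique),
        total_h2         = sum(h2)))
```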

William Revelle               http://personality-project.org/revelle.html
Professor                     http://personality-project.org
Department of Psychology      http://www.wcas.northwestern.edu/psych/
Northwestern University       http://www.northwestern.edu/
Use R for psychology          http://personality-project.org/r
It is 5 minutes to midnight   http://www.thebulletin.org


Re: [R] Difference in cummulative variance depending on print command

2014-12-06 Thread peter dalgaard
Firstly, there is no fa() function in base R. There is one in package psych, 
which has a maintainer, etc.

I guess that it is because fa() does a non-orthogonal factor rotation and its 
print method knows about it, whereas the default print method for loadings 
assumes that rotations are orthogonal.

- Peter D.

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd@cbs.dk  Priv: pda...@gmail.com


[R] Difference in cummulative variance depending on print command

2014-12-05 Thread Rena Büsch

Hello,
I am trying a factor analysis via R.
When running the principal axis analysis I get different tables depending
on the print command.
This is my factor analysis:
fa.pa_cor_3_2 <- fa(ItemsCor_4, nfactors=3, fm="pa", rotate="oblimin")

To get the h2 I used the following print command:
print (fa.pa_cor_3_2, digits=2, cut=.3, sort=T)
To get just the loadings I used the following print command:
print (fa.pa_cor_3_2$loadings, digits=2, cutoff=.3, sort=T)

The result of the first print is the following Eigenvalue-cumulative
variance table:
                 PA1   PA2  PA3
SS loadings    20.59 18.16 5.03
Proportion Var  0.28  0.25 0.07
Cumulative Var  0.28  0.52 0.59

With the second print command I get a different table:
                 PA1   PA2  PA3
SS loadings    17.63 15.12 3.14
Proportion Var  0.24  0.20 0.04
Cumulative Var  0.24  0.44 0.49

The loadings are the same for both commands. There is just this slight
difference in the cumulative Var.

Does anyone have an idea of a cause for the difference? What can I report?
Did I post enough information to fully understand my problem?
Thanks in Advance
Rena


Re: [R] Difference betweeen cor.test() and formula everyone says to use

2014-10-17 Thread peter dalgaard
This is pretty much standard. I'm quite sure that other stats packages do 
likewise and I wouldn't know who everyone is. It is not unheard of that 
textbook authors give suboptimal formulas in order not to confuse students, 
though.

The basic point is that the t transformation gives the exact distribution under 
the null. Fisher's Z is only approximately normally distributed. 

The t transformation works because if beta is the regression coefficient of y 
on x, beta==0 iff rho==0, and we have exact theory for testing beta==0 by a 
t-test.

Off-null, the t-approach does not readily transfer, so confidence intervals 
tend to be based on the Z-transformation.

-Peter D.
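
The two calculations can be put side by side in a few lines. This is a sketch using the same ten rows of the built-in cars data as the original question; atanh() is used for Fisher's z transformation, and the variable names are illustrative.

```r
x <- cars$speed[1:10]
y <- cars$dist[1:10]
n <- length(x)
r <- cor(x, y)

# Exact test under rho = 0: t statistic with n - 2 degrees of freedom
t_stat <- sqrt(n - 2) * r / sqrt(1 - r^2)
p_t <- 2 * pt(-abs(t_stat), df = n - 2)

# Approximate test: Fisher's z = atanh(r), standard error 1 / sqrt(n - 3)
z_stat <- atanh(r) * sqrt(n - 3)
p_z <- 2 * pnorm(-abs(z_stat))

# The t-based p-value agrees with cor.test(); the z-based one is close but not equal
print(c(t = p_t, z = p_z, cor.test = cor.test(x, y)$p.value))
```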



-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd@cbs.dk  Priv: pda...@gmail.com


Re: [R] Difference betweeen cor.test() and formula everyone says to use

2014-10-17 Thread JLucke
When the true value is $\rho = 0$, the statistic $ndf * r^2 / (1-r^2)$ follows 
an $F(1,ndf)$ distribution.
So the t-test is the correct test for $\rho=0$. 
Fisher's z is an asymptotically normal transformation for any value of 
$\rho$. 
Thus Fisher's z is better for testing $\rho = \rho_0$ or $\rho_1 = \rho_2$.
The two tests will not give identical results at $\rho=0$ because the 
statistics are based on different assumptions.





[R] Difference betweeen cor.test() and formula everyone says to use

2014-10-16 Thread Jeremy Miles
I'm trying to understand how cor.test() is calculating the p-value of
a correlation. It gives a p-value based on t, but every text I've ever
seen gives the calculation based on z.

For example:
> data(cars)
> with(cars[1:10, ], cor.test(speed, dist))

        Pearson's product-moment correlation

data:  speed and dist
t = 2.3893, df = 8, p-value = 0.04391
alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval:
 0.02641348 0.90658582
sample estimates:
      cor
0.6453079

But when I use the regular formula:
> r <- cor(cars[1:10, ])[1, 2]
> r.z <- fisherz(r)
> se <- 1/sqrt(10 - 3)
> z <- r.z / se
> (1 - pnorm(z))*2
[1] 0.04237039

My p-value is different.  The help file for cor.test doesn't (seem to)
have any reference to this, and I can see in the source code that it
is doing something different. I'm just not sure what.

Thanks,

Jeremy


Re: [R] Difference betweeen cor.test() and formula everyone says to use

2014-10-16 Thread Joshua Wiley
Hi Jeremy,

I don't know about references, but this formula is around.  See for example:
http://afni.nimh.nih.gov/sscc/gangc/tr.html

The relevant line in cor.test is:

STATISTIC <- c(t = sqrt(df) * r/sqrt(1 - r^2))

You can convert *t*s to *r*s and vice versa.

Best,

Josh



-- 
Joshua F. Wiley
Ph.D. Student, UCLA Department of Psychology
http://joshuawiley.com/
Senior Analyst, Elkhart Group Ltd.
http://elkhartgroup.com
Office: 260.673.5518


[R] Difference in coefficients in Cox proportional hazard estimates between R and Stata, why?

2014-05-30 Thread Hiyoshi, Ayako
Dear R users,



Hi, thank you so much for your help in advance.

I have been using Stata but am new to R. For my paper revision using Aalen's 
survival analysis, I need to use R, as the command for Aalen's survival analysis 
seems to be available in R (32-bit, version 3.1.0 (2014-04-10)) but less readily 
available in Stata (version 13/SE).



To make sure that I can do the basics, I have fitted logistic regression and Cox 
proportional hazard regression using R and Stata.



The data I used were from UCLA's R textbook example page: 
http://www.ats.ucla.edu/stat/r/examples/asa/asa_ch1_r.htm. I used this in Stata 
too.



When I fitted logistic regression as below, the estimates were exactly same 
between R and Stata.



Example using logistic regression

R:

logistic1 <- glm(censor ~ age + drug, data=, family = binomial)

summary(logistic1)

exp(cbind(OR=coef(logistic1), confint(logistic1)))

                   OR      2.5 %    97.5 %
(Intercept) 1.0373731 0.06358296 16.797896
age         1.0436805 0.96801933  1.131233
drug        0.7192149 0.26042635  1.937502



Stata:

logistic censor age i.drug

              OR   CI_lower   CI_upper
  age |   1.043681   .9662388   1.127329
 drug |    .719215   .2665194   1.940835
_cons |   1.037373    .065847    16.3431



However, when I fitted Cox proportional hazard regression, there were some 
discrepancies in the coefficients (and exponentiated hazard ratios).



Example using Cox proportional hazard regression

R:

cox1 <- coxph(Surv(time, censor) ~ drug + age, data=)
summary(cox1)

Call:
coxph(formula = Surv(time, censor) ~ drug + age, data = )

  n= 100, number of events= 80

        coef exp(coef) se(coef)     z Pr(>|z|)
drug 1.01670   2.76405  0.25622 3.968 7.24e-05 ***
age  0.09714   1.10202  0.01864 5.211 1.87e-07 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

     exp(coef) exp(-coef) lower .95 upper .95
drug     2.764     0.3618     1.673     4.567
age      1.102     0.9074     1.062     1.143

Concordance= 0.711  (se = 0.042 )
Rsquare= 0.324   (max possible= 0.997 )
Likelihood ratio test= 39.13  on 2 df,   p=3.182e-09
Wald test            = 36.13  on 2 df,   p=1.431e-08
Score (logrank) test = 38.39  on 2 df,   p=4.602e-09

Stata:

stset time, f(censor)
stcox drug age
------------------------------------------------------------------------------
          _t | Haz. Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        drug |   2.563531   .6550089     3.68   0.000      1.55363    4.229893
         age |   1.095852     .02026     4.95   0.000     1.056854    1.136289
------------------------------------------------------------------------------




The HR estimate for drug was 2.76 from R, but 2.56 from Stata.

I searched the internet for an explanation, but could not find any.



In parametric survival regression with exponential distribution, R's and Stata's 
coefficients had opposite signs while the absolute values were exactly the same (i.e. 
say 0.08 for Stata and -0.08 for R). I suspected something like this 
(http://www.theanalysisfactor.com/ordinal-logistic-regression-mystery/) was going 
on, but for Cox proportional hazard regression, I could not find any resource 
to help me.



I would highly appreciate it if anyone could explain this to me, or suggest a resource 
that I can read.



Thank you so much for your help.



Best,

Ayako



Re: [R] Difference in coefficients in Cox proportional hazard estimates between R and Stata, why?

2014-05-30 Thread Göran Broström
In the Cox regression case, the probable explanation is that you have 
ties in your data; Stata and coxph may have different defaults for 
handling ties. Read the manuals!


The difference in sign in the other cases is simply due to different 
definitions of the models. I am sure it is well documented in relevant 
manuals.


Göran
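
As a sketch of what the tie-handling difference looks like in practice: coxph() in the survival package uses the Efron approximation by default, while, as far as I know, Stata's stcox defaults to the Breslow approximation. The example below uses the lung data that ships with survival, not the data from the original post, so the numbers are only illustrative.

```r
library(survival)

# lung ships with the survival package and contains tied event times
fit_efron   <- coxph(Surv(time, status) ~ age + sex, data = lung, ties = "efron")
fit_breslow <- coxph(Surv(time, status) ~ age + sex, data = lung, ties = "breslow")

# Hazard ratios under the two tie-handling methods, side by side
print(exp(cbind(efron = coef(fit_efron), breslow = coef(fit_breslow))))
```

Refitting the original model with ties = "breslow" would be a quick way to check whether this accounts for the 2.76 versus 2.56 discrepancy.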



Re: [R] Difference between comma separated values in column

2014-04-29 Thread arun
Hi,

It is better to show the example data using ?dput().  Here, it is not clear 
whether the columns are character columns or lists.
##If it is the latter case

dat1 <- data.frame(V1=I(list(1:3, c(1,2,4), c(2,3,4,5))), V2= I(list(c(3,6,5), 
c(7,10,9), 2:5)))
dat1$V3 <- mapply(`c`,mapply(`-`, lapply(dat1$V2, `[`,-1), 
lapply(dat1$V1,head,-1)), lapply(dat1$V1,tail,1))
dat1
#  V1 V2 V3
#1    1, 2, 3    3, 6, 5    5, 3, 3
#2    1, 2, 4   7, 10, 9    9, 7, 4
#3 2, 3, 4, 5 2, 3, 4, 5 1, 1, 1, 5


##If the columns are character vectors.

dat2 <- structure(list(V1 = c("1,2,3", "1,2,4", "2,3,4,5"), V2 = c("3,6,5", 
"7,10,9", "2,3,4,5")), .Names = c("V1", "V2"), row.names = c(NA, 
-3L), class = "data.frame")
lst1 <- sapply(dat2, function(x) lapply(strsplit(x, split=","),as.numeric))
dat2$V3 <- unlist(lapply(mapply(`c`,mapply(`-`, lapply(lst1[,2],`[`, -1), 
lapply(lst1[,1], head,-1)), lapply(lst1[,1], tail,1)), paste, collapse=","))
dat2
#   V1  V2  V3
#1   1,2,3   3,6,5   5,3,3
#2   1,2,4  7,10,9   9,7,4
#3 2,3,4,5 2,3,4,5 1,1,1,5


A.K.


> Hi,
>
> I have a quick question in R. I have a data frame with two columns, each
> containing multiple values separated by commas.
> Example:
>
>     V1            V2
> 1   1, 2, 3       3, 6, 5
> 2   1, 2, 4       7, 10, 9
> 3   2, 3, 4, 5    2, 3, 4, 5
>
> I want to calculate the difference between the two columns.
>
> Expected results (suppose the results are stored in V3): basically, the n-th
> value of column 1 is subtracted from the (n+1)-th value of column 2.
>
>     V3
> 1   6-1, 5-2, 3
> 2   10-1, 9-2, 4
> 3   3-2, 4-3, 5-4, 5
>
> which gives (the last value doesn't matter)
>
>     V3
> 1   5, 3, 3
> 2   9, 7, 4
> 3   1, 1, 1, 5
>
> I would greatly appreciate it if anyone could suggest how to proceed.




Re: [R] Difference between comma separated values in column

2014-04-29 Thread arun


HI,

I guess this should be a bit faster.
#1st case
dat1$V3 <- lapply(seq_along(dat1$V2),function(i) c(dat1$V2[[i]][-1] - 
head(dat1$V1[[i]],-1), tail(dat1$V1[[i]],1)))
#2nd case
dat2$V3 <- unlist(lapply(seq_along(lst1[,2]),function(i) 
paste(c(lst1[,2][[i]][-1] - head(lst1[,1][[i]], -1), 
tail(lst1[,1][[i]],1)),collapse=",")))
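
For the character-column case, the same per-row rule can also be written as a small self-contained sketch. The helper names `shifted_diff` and `split_num` are made up for illustration; Map() is used because it always returns a list, so rows that happen to have equal lengths are not collapsed into a matrix:

```r
# Hypothetical helper: subtract the n-th value of `a` from the (n+1)-th value
# of `b`, then append the last value of `a`
shifted_diff <- function(a, b) c(b[-1] - head(a, -1), tail(a, 1))

# Hypothetical helper: split comma-separated strings into numeric vectors
split_num <- function(x) lapply(strsplit(x, ",", fixed = TRUE), as.numeric)

dat2 <- data.frame(V1 = c("1,2,3", "1,2,4", "2,3,4,5"),
                   V2 = c("3,6,5", "7,10,9", "2,3,4,5"),
                   stringsAsFactors = FALSE)

# Apply the rule row by row and paste each result back into one string
res <- Map(shifted_diff, split_num(dat2$V1), split_num(dat2$V2))
dat2$V3 <- vapply(res, paste, character(1), collapse = ",")
dat2$V3
```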

A.K.





Re: [R] Difference between times

2014-04-20 Thread Prof Brian Ripley

On 18/04/2014 21:46, David Winsemius wrote:


> I don't suppose a warning could be issued by the as.POSIXct code when a tz
> argument is not found in the database to let people know that 'UTC' will be
> the default?


No, as the underlying POSIX function does not report this.  We could 
perhaps do this on platforms which use --with-internal-tzcode but not 
e.g. on Linux.


--
Brian D. Ripley,  rip...@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595



Re: [R] Difference between times

2014-04-20 Thread Prof Brian Ripley

On 20/04/2014 08:50, Prof Brian Ripley wrote:

On 18/04/2014 21:46, David Winsemius wrote:


>> Fellow Mac users may face a problem when using the Finder unless they
>> set it up to display hidden ('dot') files. The /usr/ folder is greyed
>> out but it still does open. If I restore my Finder defaults to not
>> show system files and folders, I no longer see that directory and
>> would not have been able to resolve my spelling error on my own:


You can always use the command-line or OlsonNames() in R.



>> dt2 = as.POSIXct("2014-04-18 09.00", format="%Y-%m-%d %H.%M", tz =
>> "America/New_York")
>> dt1 = as.POSIXct("2014-04-18 09.00", format="%Y-%m-%d %H.%M", tz =
>> "America/Los_Angeles")
>> dt1-dt2
>>
>> Time difference of 3 hours
>>
>> I don't suppose a warning could be issued by the as.POSIXct code when
>> a tz argument is not found in the database to let people know that
>> 'UTC' will be the default?
>
> No, as the underlying POSIX function does not report this.  We could
> perhaps do this on platforms which use --with-internal-tzcode but not
> e.g. on Linux.


In fact we already do: R 3.1.0 on a Mac shows

> as.POSIXct("2014-04-18 09.00", format="%Y-%m-%d %H.%M", tz = "Americas/New_York")
[1] "2014-04-18 09:00:00 GMT"
Warning messages:
1: In strptime(x, format, tz = tz) : unknown timezone 'Americas/New_York'
2: In as.POSIXct.POSIXlt(as.POSIXlt(x, tz, ...), tz, ...) :
  unknown timezone 'Americas/New_York'
3: In as.POSIXlt.POSIXct(x, tz) : unknown timezone 'Americas/New_York'


--
Brian D. Ripley,  rip...@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595



Re: [R] Difference between times

2014-04-20 Thread David Winsemius

On Apr 20, 2014, at 2:16 AM, Prof Brian Ripley wrote:

> You can always use the command-line or OlsonNames() in R.

I'm not finding an OlsonNames function on a Mac (but is that because I haven't 
updated?). Before seeing that the zone-checking feature had been added, I was 
building an OlsonNames function that extracts the sub-directories of the 
zoneinfo directory and appends the file names to them, as well as extracting 
the non-OlsonNames entries in zoneinfo. My plan had been to make my own 
warnings in strptime, but that appears to be unnecessary.

OlsonNames <- function(onlyOlson=FALSE) {
  # Note: on R >= 3.1.0 this definition masks base R's own OlsonNames()
  # Top-level entries of the zoneinfo database; -p marks directories with "/"
  MacOlsonDirs <- system('ls -p /usr/share/zoneinfo ', intern=TRUE)
  # "Area/Location" names: prefix each file name with its directory name
  OlsonNames <- unlist( lapply( MacOlsonDirs[grep("/$", MacOlsonDirs)],
                  function(dir) paste0( dir,
                      system( paste0('ls -p /usr/share/zoneinfo/', dir),
                              intern=TRUE) ) ) )
  # Capitalised top-level entries that are plain files, not directories
  nonOlsonNames <- MacOlsonDirs[grepl("^[A-Z]", MacOlsonDirs) &
                                !grepl("/$", MacOlsonDirs)]
  if ( !onlyOlson){ c(OlsonNames, nonOlsonNames)} else {OlsonNames}
}

Yes. It is because I haven't updated. I now see this in the NEWS that was 
posted in this list 10 days ago.

There is more support to explore the system's idea of time-zone
names.  Sys.timezone() tries to give the current system setting
by name (and succeeds at least on Linux, OS X, Solaris and
Windows), and OlsonNames() lists the names in the system's Olson
database. Sys.timezone(location = FALSE) gives the previous
behaviour.
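
For readers on R 3.1.0 or later, a quick sketch of the two functions mentioned in that NEWS entry. The search pattern and the variable names `zones` and `hits` are only examples:

```r
# Name of the current system time zone (can be NA if it cannot be determined)
Sys.timezone()

# All names in the system's Olson database
zones <- OlsonNames()
length(zones)

# Look up the correct spelling of a zone by searching for part of its name
hits <- grep("Los_Angeles", zones, value = TRUE)
hits
```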
 
I guess I will have fun comparing my efforts with those of the masters.

 
> In fact we already do: R 3.1.0 on a Mac shows

My apologies. And thank you to whomever added the feature and to you, Prof, for 
checking and letting us know. I was going to offer the code above as a patch 
but that seems not needed now. I have not yet updated to 3.1.0. The 
Mavericks/3.1.0 incompatibilities have been scaring me off from updating. Still 
on Lion/3.0.2.

 

Thank you, all of R-Core.



David Winsemius
Alameda, CA, USA



Re: [R] Difference between times

2014-04-19 Thread Nicola Sturaro Sommacal
Thank you for your reply.

I discovered the OlsonNames() function to get the time-zone names on my
system. Rui gets a warning message when using an unrecognized tz. On my
system this does not happen.

I solved it as follows:

dt1 = as.POSIXct("2014-04-18 09.00", format="%Y-%m-%d %H.%M", tz =
"Europe/Rome")

dt2 = as.POSIXct("2014-04-18 09.00", format="%Y-%m-%d %H.%M", tz =
"America/Los_Angeles")

> dt1
[1] "2014-04-18 09:00:00 CEST"
> dt2
[1] "2014-04-18 09:00:00 PDT"
> dt1-dt2
Time difference of -9 hours


Thank you,
Nicola





Re: [R] Difference between times

2014-04-19 Thread Nicola Sturaro Sommacal
I forgot:

sysname: Linux
release: 3.5.0-48-generic
version: #72-Ubuntu SMP Mon Mar 10 23:18:29 UTC 2014





[R] Difference between times

2014-04-18 Thread Nicola Sturaro Sommacal
Hi.

I am new to POSIX and I'd like to understand the reason for this difference.

dt1 = as.POSIXct("2014-03-29 09.00", format="%Y-%m-%d %H.%M")
dt2 = as.POSIXct("2014-03-30 09.00", format="%Y-%m-%d %H.%M")
dt2-dt1

> dt1
[1] "2014-03-29 09:00:00 CET"
> dt2
[1] "2014-03-30 09:00:00 CEST"
> dt2-dt1
Time difference of 23 hours

This is right, because on Mar 30 at 2 AM we jump directly to 3 AM (DST).
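
The 23-hour gap can be reproduced independently of the machine's locale by naming the zone explicitly. A small sketch, assuming the Olson name "Europe/Rome" (which observes CET/CEST) stands in for the poster's system time zone; the variable names are made up:

```r
fmt    <- "%Y-%m-%d %H.%M"
before <- as.POSIXct("2014-03-29 09.00", format = fmt, tz = "Europe/Rome")
after  <- as.POSIXct("2014-03-30 09.00", format = fmt, tz = "Europe/Rome")

# Elapsed time is computed on the underlying UTC instants, so the hour
# skipped at the start of DST is not counted
difftime(after, before, units = "hours")

# The same two instants displayed in UTC
format(before, tz = "UTC", usetz = TRUE)
format(after,  tz = "UTC", usetz = TRUE)
```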

On the contrary, I don't understand the following:

dt1 = as.POSIXct("2014-04-18 09.00", format="%Y-%m-%d %H.%M", tz = "CEST")
dt2 = as.POSIXct("2014-04-18 09.00", format="%Y-%m-%d %H.%M", tz = "GMT")

> dt1
[1] "2014-04-18 09:00:00 CEST"
> dt2
[1] "2014-04-18 09:00:00 GMT"
> dt1-dt2
Time difference of 0 secs


I would have expected a time difference of 2 hours, as CEST is GMT+2.

Can anyone help me?

Thank you,
Nicola



Re: [R] Difference between times

2014-04-18 Thread Rui Barradas

Hello,

The reason why is that you've misspelled CET (not CEST)

> dt1 = as.POSIXct("2014-04-18 09.00", format="%Y-%m-%d %H.%M", tz = "CEST")
Warning messages:
1: In strptime(x, format, tz = tz) : unknown timezone 'CEST'
2: In as.POSIXct.POSIXlt(as.POSIXlt(x, tz, ...), tz, ...) :
  unknown timezone 'CEST'
> dt2 = as.POSIXct("2014-04-18 09.00", format="%Y-%m-%d %H.%M", tz = "GMT")
> dt1 = as.POSIXct("2014-04-18 09.00", format="%Y-%m-%d %H.%M", tz = "CET")
> dt1-dt2
Time difference of -2 hours


Hope this helps,

Rui Barradas



Re: [R] Difference between times

2014-04-18 Thread Prof Brian Ripley

On 18/04/2014 19:46, Rui Barradas wrote:

> Hello,
>
> The reason why is that you've misspelled CET (not CEST)


Neither CET nor CEST are portable time-zone names.  We have not been 
given the 'at a minimum' information required by the posting guide, so 
please read ?Sys.timezone on your system.
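
A brief sketch of the portable alternative, assuming the poster's zone is "Europe/Rome" (an assumption; any "Continent/City" name from the Olson database works the same way). An unrecognised name such as "CEST" is typically treated as UTC, which is why the original difference came out as 0 secs:

```r
fmt  <- "%Y-%m-%d %H.%M"
rome <- as.POSIXct("2014-04-18 09.00", format = fmt, tz = "Europe/Rome")
utc  <- as.POSIXct("2014-04-18 09.00", format = fmt, tz = "UTC")

# 09:00 in Rome (CEST, UTC+2) is two hours earlier than 09:00 UTC
difftime(rome, utc, units = "hours")
```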








--
Brian D. Ripley,  rip...@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595



Re: [R] Difference between times

2014-04-18 Thread David Winsemius

On Apr 18, 2014, at 12:59 PM, Prof Brian Ripley wrote:

> On 18/04/2014 19:46, Rui Barradas wrote:
>> Hello,
>>
>> The reason why is that you've misspelled CET (not CEST)
>
> Neither CET nor CEST are portable time-zone names.  We have not been given
> the 'at a minimum' information required by the posting guide, so please read
> ?Sys.timezone on your system.

Dear Prof;

Thanks for the impetus to yet again read that page. Despite frequently reading 
help pages and in particular reading that one many times, I still was not 
getting the 'tz' arguments correct on a Mac. I do now see that I was spelling 
my TZ incorrectly (as Americas/Los_Angeles rather than America/Los_Angeles).

Fellow Mac users may face a problem when using the Finder unless they set it up 
to display hidden ('dot') files. The /usr/ folder is greyed out but it still 
does open. If I restore my Finder defaults to not show system files and 
folders, I no longer see that directory and would not have been able to resolve 
my spelling error on my own:


> dt2 = as.POSIXct("2014-04-18 09.00", format="%Y-%m-%d %H.%M", tz = "America/New_York")
> dt1 = as.POSIXct("2014-04-18 09.00", format="%Y-%m-%d %H.%M", tz = "America/Los_Angeles")
> dt1-dt2
Time difference of 3 hours

I don't suppose a warning could be issued by the as.POSIXct code when a tz 
argument is not found in the database to let people know that 'UTC' will be the 
default?
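
One way to get such a check at user level is to validate the name before converting. This is a sketch only, assuming R >= 3.1.0 so that OlsonNames() is available; the wrapper name `as_posixct_checked` is made up for illustration and is not part of R:

```r
# Hypothetical wrapper: refuse time-zone names that are not in the
# system's Olson database instead of silently falling back to UTC
as_posixct_checked <- function(x, format, tz) {
  if (!tz %in% OlsonNames()) {
    stop("unknown time zone: ", tz)
  }
  as.POSIXct(x, format = format, tz = tz)
}

fmt <- "%Y-%m-%d %H.%M"
la  <- as_posixct_checked("2014-04-18 09.00", fmt, "America/Los_Angeles")
ny  <- as_posixct_checked("2014-04-18 09.00", fmt, "America/New_York")
difftime(la, ny, units = "hours")

# The misspelled name now fails loudly; capture the error message
res <- tryCatch(
  as_posixct_checked("2014-04-18 09.00", fmt, "Americas/Los_Angeles"),
  error = function(e) conditionMessage(e)
)
res
```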

-- 
David.


 
 
 
  dt1 = as.POSIXct(2014-04-18 09.00, format=%Y-%m-%d %H.%M, tz =
 CEST)
 Warning messages:
 1: In strptime(x, format, tz = tz) : unknown timezone 'CEST'
 2: In as.POSIXct.POSIXlt(as.POSIXlt(x, tz, ...), tz, ...) :
   unknown timezone 'CEST'
  dt2 = as.POSIXct(2014-04-18 09.00, format=%Y-%m-%d %H.%M, tz =
 GMT)
  dt1 = as.POSIXct(2014-04-18 09.00, format=%Y-%m-%d %H.%M, tz =
 CET)
  dt1-dt2
 Time difference of -2 hours
 
 
 Hope this helps,
 
 Rui Barradas
 
 Em 18-04-2014 17:13, Nicola Sturaro Sommacal escreveu:
 Hi.
 
 I am new to POSIX and I'd like to understand the reason of this
 difference.
 
 dt1 = as.POSIXct(2014-03-29 09.00, format=%Y-%m-%d %H.%M)
 dt2 = as.POSIXct(2014-03-30 09.00, format=%Y-%m-%d %H.%M)
 dt2-dt1
 
 dt1[1] 2014-03-29 09:00:00 CET dt2[1] 2014-03-30 09:00:00 CEST
 dt2-dt1
 
 Time difference of 23 hours
 
 This is right, because on Mar 31 at 2 PM we jump directly to 3PM, DST.
 
 On the contrary, I don't understand the following:
 
 dt1 = as.POSIXct(2014-04-18 09.00, format=%Y-%m-%d %H.%M, tz =
 CEST)
 dt2 = as.POSIXct(2014-04-18 09.00, format=%Y-%m-%d %H.%M, tz = GMT)
 
 dt1[1] 2014-04-18 09:00:00 CEST dt2[1] 2014-04-18 09:00:00 GMT
 dt1-dt2Time difference of 0 secs
 
 
 I would have expected a time difference of 2 hours, as CEST is GMT+2.
 
 Can anyone help me?
 
 Thank you,
 Nicola
 
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
 
 
 
 
 -- 
 Brian D. Ripley,  rip...@stats.ox.ac.uk
 Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
 University of Oxford, Tel:  +44 1865 272861 (self)
 1 South Parks Road, +44 1865 272866 (PA)
 Oxford OX1 3TG, UK    Fax:  +44 1865 272595
 

David Winsemius
Alameda, CA, USA



[R] Difference between two datetimes

2014-01-28 Thread David Fox
I have a data frame with variable datetime which is of class POSIXct.
Consecutive observations are separated by 30 minutes.
However, some of the differences reported by R give unexpected results.
For example consider the following two consecutive entries:

> par.dat$datetime[5944]
[1] "2010-04-04 02:30:00 EST"

> par.dat$datetime[5945]
[1] "2010-04-04 03:00:00 EST"

When I examine the difference, R reports 1.5 hours instead of 30 minutes:

> par.dat$datetime[5945]-par.dat$datetime[5944]
Time difference of 1.5 hours

On further investigation it appears there's something peculiar to this
particular date. Other years work fine, e.g.:
> as.POSIXct("2011-04-04 03:00:00") - as.POSIXct("2011-04-04 02:30:00")
Time difference of 30 mins

> as.POSIXct("2012-04-04 03:00:00") - as.POSIXct("2012-04-04 02:30:00")
Time difference of 30 mins

> as.POSIXct("2009-04-04 03:00:00") - as.POSIXct("2009-04-04 02:30:00")
Time difference of 30 mins

But when I use 2010 I get a difference of 1.5 hours:

> as.POSIXct("2010-04-04 03:00:00") - as.POSIXct("2010-04-04 02:30:00")
Time difference of 1.5 hours






Any suggestions?
Thanks,
David Fox.



Re: [R] Difference between two datetimes

2014-01-28 Thread Daniel Nordlund
 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org]
 On Behalf Of David Fox
 Sent: Tuesday, January 28, 2014 5:15 PM
 To: r-help@r-project.org
 Subject: [R] Difference between two datetimes
 
 I have a data frame with variable datetime which is of class POSIXct.
 Consecutive observations are separated by 30 minutes.
 However, some of the differences reported by R give unexpected results.
 For example consider the following two consecutive entries:
 
 > par.dat$datetime[5944]
 [1] "2010-04-04 02:30:00 EST"
 
 > par.dat$datetime[5945]
 [1] "2010-04-04 03:00:00 EST"
 
 When I examine the difference, R reports 1.5 hours instead of 30 minutes:
 
 > par.dat$datetime[5945]-par.dat$datetime[5944]
 Time difference of 1.5 hours
 
 On further investigation it appears there's something peculiar to this
 particular date. Other years work fine, e.g.:
 > as.POSIXct("2011-04-04 03:00:00") - as.POSIXct("2011-04-04 02:30:00")
 Time difference of 30 mins
 
 > as.POSIXct("2012-04-04 03:00:00") - as.POSIXct("2012-04-04 02:30:00")
 Time difference of 30 mins
 
 > as.POSIXct("2009-04-04 03:00:00") - as.POSIXct("2009-04-04 02:30:00")
 Time difference of 30 mins
 
 But when I use 2010 I get a difference of 1.5 hours:
 
 > as.POSIXct("2010-04-04 03:00:00") - as.POSIXct("2010-04-04 02:30:00")
 Time difference of 1.5 hours
 
 
 
 
 
 
 Any suggestions?
 Thanks,
 David Fox.
 

Daylight savings time change in Australia?

Dan

Daniel Nordlund
Bothell, WA USA
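
A minimal sketch of how the daylight-saving guess can be checked. The zone "Australia/Sydney" is an assumption, since the original post shows only the abbreviation EST. In that zone clocks went back from 03:00 to 02:00 on 2010-04-04, so the local time 02:30 occurred twice that morning. Building the instants in UTC avoids the ambiguity:

```r
# Assumption: the data were recorded in the "Australia/Sydney" zone.
# Six instants, exactly 30 minutes apart, spanning the end of DST.
utc <- seq(as.POSIXct("2010-04-03 15:00:00", tz = "UTC"),
           by = "30 min", length.out = 6)

# The same instants shown as local wall-clock times
local <- format(utc, format = "%H:%M", tz = "Australia/Sydney")
local

# Local 02:30 appears twice, although the instants are evenly spaced
sum(local == "02:30")
as.numeric(diff(utc), units = "mins")
```

When a string such as "2010-04-04 02:30:00" is parsed in local time, R has to pick one of the two instants, and which one it picks can depend on the platform. Picking the earlier one would give the 1.5 hours reported above. Storing or parsing the timestamps in UTC (or in a fixed-offset zone) keeps consecutive differences at 30 minutes.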



[R] Difference between arima(1, 1, 1) of y and arima(1, 0, 1) of diff(y)

2013-07-18 Thread George Milunovich
Dear all,
When I run an arima(1,1,1) on an I(1) variable, y, I get different estimates to 
when I first difference the variable myself, e.g. y2 <- diff(y), and then run 
arima(1,0,1) on y2. Shouldn't these two approaches give the same output?
Any help will be much appreciated.
george



[R] Difference between arima(1, 1, 1) for y and arima(1, 0, 1) for diff(y)

2013-07-18 Thread george
Dear all,
When I run an arima(1,1,1) on an I(1) variable, e.g. y, I get different
estimates to when I first difference the variable myself, e.g. y2 <- diff(y),
and then run arima(1,0,1) on y2. Shouldn't these two approaches give the
same output?
Any help will be much appreciated.
george



--
View this message in context: 
http://r.789695.n4.nabble.com/Difference-between-arima-1-1-1-for-y-and-arima-1-0-1-for-diff-y-tp4671873.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] Difference between arima(1, 1, 1) of y and arima(1, 0, 1) of diff(y)

2013-07-18 Thread Mark Leeds
Hi George: Assuming it's still relevant, the link below will explain why.

http://www.stat.pitt.edu/stoffer/tsa2/Rissues.htm
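
The linked page is not reproduced here. One relevant point from the arima() help page is that include.mean defaults to TRUE for undifferenced series and is ignored for models with differencing. So arima(1,0,1) on diff(y) estimates an intercept that arima(1,1,1) on y does not. A minimal sketch on simulated data (the series and all object names below are made up for illustration):

```r
set.seed(1)
# Simulated I(1) series with drift: cumulative sum of an ARMA(1,1) plus 0.2
y <- cumsum(0.2 + arima.sim(list(ar = 0.5, ma = 0.3), n = 300))
y2 <- diff(y)

fit_d1          <- arima(y,  order = c(1, 1, 1))  # no constant is fitted
fit_diff_mean   <- arima(y2, order = c(1, 0, 1))  # intercept fitted by default
fit_diff_nomean <- arima(y2, order = c(1, 0, 1), include.mean = FALSE)

# Compare which coefficients each fit estimates
names(coef(fit_d1))
names(coef(fit_diff_mean))
names(coef(fit_diff_nomean))

# The AR and MA estimates of the first and third fits should agree closely
coef(fit_d1)
coef(fit_diff_nomean)
```

If the drift matters, a constant can be kept in the differenced model, or a trend regressor passed to arima() on the original series through xreg.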



On Thu, Jul 18, 2013 at 2:14 PM, George Milunovich 
george.milunov...@mq.edu.au wrote:

 Dear all,
 When I run an arima(1,1,1) on an I(1) variable, y, I get different
 estimates to when I first difference the variable myself, e.g. y2 <- diff(y),
 and then run arima(1,0,1) on y2. Shouldn't these two approaches give the
 same output?
 Any help will be much appreciated.
 george





Re: [R] Difference between arima(1, 1, 1) of y and arima(1, 0, 1) of diff(y)

2013-07-18 Thread George Milunovich
Hi Mark,
This is very helpful!!
Much appreciated


On Jul 19, 2013, at 3:51 AM, Mark Leeds marklee...@gmail.com wrote:

 Hi George: Assuming it's still relevant, the link below will explain why. 
 
 http://www.stat.pitt.edu/stoffer/tsa2/Rissues.htm
 
 
 
 On Thu, Jul 18, 2013 at 2:14 PM, George Milunovich 
 george.milunov...@mq.edu.au wrote:
 Dear all,
 When I run an arima(1,1,1) on an I(1) variable, y, I get different estimates 
 to when I first difference the variable myself, e.g. y2 <- diff(y), and then 
 run arima(1,0,1) on y2. Shouldn't these two approaches give the same output?
 Any help will be much appreciated.
 george
 
 



[R] Difference between Lloyd and Forgy algorithms used in R built-in kmeans clustering function

2013-06-20 Thread Safiye Celik
Hi,

Does anybody know the difference between the Lloyd and Forgy algorithms
specified for R's kmeans clustering options? I know how Lloyd works, but I
cannot access Forgy's paper and could not find any specific information on
the web about how it really differs from Lloyd's method.

I appreciate your help. Thanks!

-- 
-safiye
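
For what it is worth, the kmeans() help page notes that "Lloyd" and "Forgy" are alternative names for one algorithm, so within R the two options should behave identically. A minimal sketch that checks this on made-up data, using the same starting centers for both calls:

```r
set.seed(42)
# Two simulated groups of 50 points in two dimensions (illustrative data)
x <- rbind(matrix(rnorm(100, mean = 0), ncol = 2),
           matrix(rnorm(100, mean = 3), ncol = 2))

# Fix the starting centers so both runs begin from the same state
start <- x[c(1, 51), ]

fit_lloyd <- kmeans(x, centers = start, iter.max = 50, algorithm = "Lloyd")
fit_forgy <- kmeans(x, centers = start, iter.max = 50, algorithm = "Forgy")

# Same assignments and same centers from both option names
identical(fit_lloyd$cluster, fit_forgy$cluster)
all.equal(fit_lloyd$centers, fit_forgy$centers)
```

This says nothing about how the two original papers differ, only about how R treats the two option names.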


