Re: [R] Current version of R, 4.4.0 and patch to correct the bug fix related to the RStudio viewer pane on Windows systems

2024-05-16 Thread Ben Bolker
  Yes, but this sounds more like a bureaucratic requirement ("all 
available patches must be installed") and less like something someone 
has thought through.


  It's conceivable that one might be able to talk to a security officer 
and convince them that this is not in fact an important issue, but I'm 
not optimistic about that ...


  Ben Bolker

On 2024-05-16 1:38 p.m., CALUM POLWART wrote:

Do you receive RDS objects from unknown (untrusted) sources?

If not, the security issue is a non-issue as I understand it.


On Thu, 16 May 2024, 16:21 Vega, Ann (she/her/hers) via R-help, <
r-help@r-project.org> wrote:


I help to coordinate the USEPA's R user group.  We have over 500 members
and our security officer has required us to update to R version 4.4.0
because of the security vulnerability in prior versions.  However, we
cannot download the patched version because it does not have a signed
certificate, and Microsoft Defender won't allow us to install it.

Most of our users rely on the RStudio viewer pane so we are in a bit of a
quandary.  We suspect other government agencies are impacted by this as
well.

Can you give me an estimated time for when another official version will
be released with the patch included?  I may be able to ask our security
officer to allow us to delay our install until that official version is
released.  Alternatively, if the patched version could have a signed
certificate, that would allow us to install it.

Thank you.

Ann Vega, PSPO
She/Her/Hers (Learn More: https://intranet.ord.epa.gov/dei/gender-pronouns)

Office of Science Information Management, Data Architect
EPA Office of Research and Development
Cincinnati, OH

Mobile: 513-418-1922 - or reach out to me on Teams!
Hours:  Monday-Thursday, 7:30am - 6:00 pm, CDO:  Fridays
Email: vega@epa.gov




__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.








Re: [R] Print date on y axis with month, day, and year

2024-05-09 Thread Ben Bolker
gg0 <- ggplot(data=yyy[1:30,],aes(as.Date(jdate,format="%m-%d-%Y"),Sum)) 
+geom_point()

gg0 + scale_x_date(date_labels = "%m/%d/%Y")
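For completeness, the mapping from the stored day counts to formatted labels can be checked directly with base R's format() (a minimal sketch; the two dates shown are just the first two values from the poster's data):

```r
# jdate values are day counts since 1970-01-01; format() renders them
d <- as.Date(c(19052, 19053), origin = "1970-01-01")
format(d, "%m/%d/%Y")
# "03/01/2022" "03/02/2022"
```

The same %m/%d/%Y specification passed to scale_x_date(date_labels = ...) drives the axis labels.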


On 2024-05-09 7:58 p.m., Sorkin, John wrote:

I am trying to use ggplot to plot the data, with the R code, below. The dates 
(jdate) are printing as Mar 01, Mar 15, etc. I want to have the date printed as 
MMM DD YYYY (or any other way that will show month, day, and year, e.g. 
mm/dd/yy). How can I accomplish this?

yyy <- structure(list(
  jdate = structure(c(19052, 19053, 19054, 19055,
    19058, 19059, 19060, 19061, 19062, 19063, 19065, 19066, 19067,
    19068, 19069, 19072, 19073, 19074, 19075, 19076, 19077, 19083,
    19086, 19087, 19088, 19089, 19090, 19093, 19094, 19095), class = "Date"),
  Sum = c(1,  3,  9, 11, 13, 16, 18, 22, 26, 27, 30, 32, 35, 39, 41,
    43, 48, 51, 56, 58, 59, 63, 73, 79, 81, 88, 91, 93, 96, 103)),
  row.names = c(NA, 30L), class = "data.frame")
yyy
class(yyy$jdate)
ggplot(data=yyy[1:30,], aes(as.Date(jdate,format="%m-%d-%Y"), Sum)) + geom_point()


Thank you
John



John David Sorkin M.D., Ph.D.
Professor of Medicine, University of Maryland School of Medicine;
Associate Director for Biostatistics and Informatics, Baltimore VA Medical 
Center Geriatrics Research, Education, and Clinical Center;
PI Biostatistics and Informatics Core, University of Maryland School of 
Medicine Claude D. Pepper Older Americans Independence Center;
Senior Statistician University of Maryland Center for Vascular Research;

Division of Gerontology and Paliative Care,
10 North Greene Street
GRECC (BT/18/GR)
Baltimore, MD 21201-1524
Cell phone 443-418-5382







Re: [R] Get a copy of a matrix only for TRUE entries of a matching size boolean matrix?

2024-05-03 Thread Ben Bolker

  In two steps:

result <- matrix(NA_character_,
                 nrow = nrow(mat_letters), ncol = ncol(mat_letters))
result[mat_bools] <- mat_letters[mat_bools]
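The same idea can be collapsed into one step with ifelse(), which preserves the matrix shape of its test argument; a minimal sketch covering both fills the poster mocked up (the some_command() in the question is hypothetical):

```r
mat_letters <- matrix(c("A", "B", "C", "D"), ncol = 2, byrow = TRUE)
mat_bools   <- matrix(c(FALSE, TRUE, TRUE, FALSE), ncol = 2, byrow = TRUE)

# ifelse() keeps the dim attribute of its first (test) argument
na_fill    <- ifelse(mat_bools, mat_letters, NA_character_)  # NA where FALSE
empty_fill <- ifelse(mat_bools, mat_letters, "")             # "" where FALSE
```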

On 2024-05-03 8:47 a.m., DynV Montrealer wrote:

Is there a way to get a copy of a matrix only for TRUE entries of a
matching size boolean matrix? For *example*:

mat_letters <- matrix(data=c('A', 'B', 'C', 'D'), ncol=2, byrow=TRUE)
mat_letters

     [,1] [,2]
[1,] "A"  "B"
[2,] "C"  "D"

mat_bools <- matrix(data=c(FALSE, TRUE, TRUE, FALSE), ncol=2, byrow=TRUE)
mat_bools

      [,1]  [,2]
[1,] FALSE  TRUE
[2,]  TRUE FALSE
*Reminder:* The following is only an example; the solution might look very
different.
some_command(mat_letters, mat_bools, false=empty)
     [,1] [,2]
[1,] ""   "B"
[2,] "C"  ""
some_command(mat_letters, mat_bools, false=na)
     [,1] [,2]
[1,] NA   "B"
[2,] "C"  NA

Thank you kindly






Re: [R] x[0]: Can '0' be made an allowed index in R?

2024-04-21 Thread Ben Bolker
  Also https://cran.r-project.org/package=Oarray (which is older and 
hence possibly more stable)
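For illustration only, a toy 0-based wrapper can be built in a few lines with an S3 subscript method. This is a sketch of the idea, not a substitute for the features of Oarray or index0 (the class name zvec is made up here):

```r
# Hypothetical 0-based vector class: shift the subscript by one internally
zvec <- function(x) structure(x, class = "zvec")
"[.zvec" <- function(x, i) unclass(x)[i + 1]

v <- zvec(c(10, 20, 30))
v[0]  # 10: index 0 now means the first element
v[2]  # 30
```

A real implementation would also need methods for assignment (`[<-`), printing, ranges, and negative indices, which is exactly what the packages above provide.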


On 2024-04-21 3:55 a.m., Hans W wrote:

As we all know, in R indices for vectors start with 1, i.e., x[0] is not a
valid expression. Some algorithms, e.g. in graph theory or combinatorics,
are much easier to formulate and code if 0 is an allowed index pointing to
the first element of the vector.

Some programming languages, for instance Julia (where the index for normal
vectors also starts with 1), provide libraries/packages that allow the user
to define an index range for its vectors, say 0:9 or 10:20 or even negative
indices.

Of course, this notation would only be feasible for certain specially
defined vectors. Is there a library that provides this functionality?
Or is there a simple trick to do this in R? The expression 'x[0]' must
be possible, does this mean the syntax of R has to be twisted somehow?

Thanks, Hans W.






Re: [R] x[0]: Can '0' be made an allowed index in R?

2024-04-21 Thread Ben Bolker
https://cran.r-project.org/package=index0

On Sun, Apr 21, 2024, 3:56 AM Hans W  wrote:

> As we all know, in R indices for vectors start with 1, i.e, x[0] is not a
> correct expression. Some algorithms, e.g. in graph theory or combinatorics,
> are much easier to formulate and code if 0 is an allowed index pointing to
> the first element of the vector.
>
> Some programming languages, for instance Julia (where the index for normal
> vectors also starts with 1), provide libraries/packages that allow the user
> to define an index range for its vectors, say 0:9 or 10:20 or even negative
> indices.
>
> Of course, this notation would only be feasible for certain specially
> defined vectors. Is there a library that provides this functionality?
> Or is there a simple trick to do this in R? The expression 'x[0]' must
> be possible, does this mean the syntax of R has to be twisted somehow?
>
> Thanks, Hans W.
>




Re: [R] Printout and saved results

2024-03-26 Thread Ben Bolker

  Fragile and probably a bad idea, but:

"%.%" <- function(x,y) { assign(deparse(substitute(x)), y, 
parent.frame()); print(y) }



  > a %.% "hello"
  [1] "hello"
  > a
  [1] "hello"

 Not sure how much value this has over other idioms such as wrapping 
the assignment in parentheses, which makes the result into a new 
expression, which is then printed:


(a <- "hello")
[1] "hello"

On 2024-03-26 8:39 a.m., avi.e.gr...@gmail.com wrote:

Just FYI, the R interpreter typically saves the last value returned briefly
in a variable called .Last.value that can be accessed before you do anything
else.


sin(.5)

[1] 0.4794255

temp <- .Last.value
print(temp)

[1] 0.4794255

sin(.666)

[1] 0.6178457

.Last.value

[1] 0.6178457

temp

[1] 0.4794255

invisible(sin(0.2))
.Last.value

[1] 0.1986693

So perhaps if you grab it in time, you can call your function and let the
REPL display it (or not) and yet save the value.



-Original Message-
From: R-help  On Behalf Of Jeff Newmiller via
R-help
Sent: Tuesday, March 26, 2024 1:03 AM
To: r-help@r-project.org
Subject: Re: [R] Printout and saved results

Your desire is not unusual among novices... but it is really not a good idea
for your function to be making those decisions. Look at how R does things:

The lm function prints nothing... it returns an object containing the result
of a linear regression. If you happen to call it directly from the R command
prompt and don't assign it to a variable, then the command interpreter
notices that return value and prints it. Since the lm object has a dedicated
print method associated with it, that output looks different than a plain
list object would... but the fact that it has a special print method
(?print.lm) is just window dressing unrelated to your request.

The important part is that the lm function doesn't even consider printing
anything out... it is the code that calls the function that determines
whether it will get printed. So...

lm( hp ~ disp, data = mtcars )  # printed by command interpreter
z <- lm( hp ~ disp, data = mtcars ) # assignment operator returns the value
of z to the command processor, but invisibly
( z <- lm( hp ~ disp, data = mtcars ) ) # strips off the invisible marking
so the value gets printed

Another example:

f <- function() {
   x <- 4
   x  # doesn't print
   invisible( 5 ) # return invisible result
}

f()  # doesn't print 4 because there is no command prompt looking at x alone
on a line... it is inside f
# command prompt doesn't print 5 because that 5 has been marked as invisible
(f()) # command interpreter prints 5

Leaving it up to the calling code to decide whether to print gives you the
option of calling your analysis function possibly thousands of times and
figuring out some slick way to summarize all those runs without thousands of
printouts that you are not going to wade through anyway and would only slow
the computer down (printing really does slow the computer down!)
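The visibility flag described above can be inspected directly with base R's withVisible(), which reports both the value and whether the interpreter would have printed it; a small sketch:

```r
# invisible() marks a value as not-to-be-printed; withVisible() exposes the flag
f <- function() {
  x <- 4
  x              # evaluated but not printed: not at top level
  invisible(5)   # returned, flagged invisible
}
wv <- withVisible(f())
wv$value    # 5
wv$visible  # FALSE
```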


On March 25, 2024 9:00:49 PM PDT, Steven Yen  wrote:

I just like the subroutine to spit out results (Mean, Std.dev, etc.) and

also be able to access the results for further processing, i.e.,


v$Mean

v$Std.dev

On 3/26/2024 11:24 AM, Richard O'Keefe wrote:

Not clear what you mean by "saved".
If you call a function and the result is printed, the result is
remembered for a wee while in
the variable .Last.value, so you can do

function.with.interesting.result(...)
retained.interesting.result <- .Last.value

or even

.Last.value -> retained.interesting.result

If you know before you start writing the expression that you want to
save the value,
you can wrap the assignment in parentheses, making it an expression:


(retained.interesting.result <- function.with.interesting.result(...))


On Tue, 26 Mar 2024 at 15:03, Steven Yen  wrote:

How can I have both printout and saved results at the same time.

The subroutine first return "out" and the printout gets printed, but not
saved.

I then run the "invisible" line. Results got saved and accessible but no
printout.

How can I have both printout and also have the results saved? Thank you!

   > dstat4 <- function(data,digits=3){
+   Mean<- apply(data,2,mean,na.rm=TRUE)
+   Std.dev <- apply(data,2,sd,  na.rm=TRUE)
+   Min <- apply(data,2,min,na.rm=TRUE)
+   Max <- apply(data,2,max,na.rm=TRUE)
+   Obs <- dim(data)[1]
+   out <-round(cbind(Mean,Std.dev,Min,Max,Obs),digits)
+   out
+ # invisible(list(Mean=Mean,Std.dev=Std.dev,Min=Min,Max=Max))
+ }
   > x1<-rnorm(n=5,mean=5, sd=1)
   > x2<-rnorm(n=5,mean=10,sd=2)
   > w<-rnorm(n=5,mean=2,sd=0.3)
   > mydata<-data.frame(cbind(x1,x2))
   > v<-dstat4(mydata); v
     Mean Std.dev   Min    Max Obs
x1  5.000   0.922 3.900  6.282   5
x2 10.769   1.713 9.209 13.346   5
   > v$Mean
Error in v$Mean : $ operator is invalid for atomic vectors
   > dstat4 <- function(data,digits=3){
+   Mean<- apply(data,2,mean,na.rm=TRUE)
+   Std.dev <- apply(data,2,sd,  na.rm=TRUE)
+   Min <- apply(data,2,min,na.rm=TRUE)
+   Max <- 

Re: [R] as.complex()

2024-03-25 Thread Ben Bolker
  That's hard to define unambiguously at a mathematical level.  What 
definition did you have in mind?  Can you provide more context?  (Maybe 
you want to compare Mod(x) to Mod(y) ?)
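A sketch of the Mod() comparison suggested above (assuming magnitude is the ordering the poster wants, which is only one of several possible conventions):

```r
z1 <- 1 + 2i
z2 <- 3 - 1i
Mod(z1) < Mod(z2)      # TRUE: sqrt(5) < sqrt(10)
order(Mod(c(z2, z1)))  # ranks complex values by modulus
```

Note that z1 < z2 itself is an error in R, because no total order on the complex numbers is compatible with the field operations.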


On 2024-03-25 3:23 a.m., Thomas K wrote:

Needing a < , > comparison for imaginary numbers





Re: [R] an issue about subsetting a vector

2024-03-24 Thread Ben Bolker
   As with a lot of things in R, the behaviour is counterintuitive but 
(arguably) logical. The key (maybe) is that `a[-x]` does not mean "take 
all of the elements of a except those indicated by `x`", but rather, 
"negate x, then take all but the negative elements".  In other words,

 -integer(0)  is  integer(0)  (we multiply *each of the elements of 
integer(0)* by -1, but integer(0) has no elements)

  so that reduces to a[integer(0)],

and from there you get "select none of the elements".

  A related point is explained in the R Inferno 
https://www.burns-stat.com/pages/Tutor/R_inferno.pdf section 8.1.13, 
"negative nothing is something".


  See also

https://stackoverflow.com/questions/42615728/subsetting-vector-how-to-programatically-pass-negative-index-safely

https://stackoverflow.com/questions/40026975/subsetting-with-negative-indices-best-practices/40029485#40029485
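A defensive idiom discussed in the linked Stack Overflow threads is to subset by the positions you keep rather than negating the ones you drop; a minimal sketch (the variable name skip is arbitrary):

```r
a <- 1:5
skip <- integer(0)
a[-skip]                         # integer(0): -integer(0) selects nothing
a[setdiff(seq_along(a), skip)]   # 1 2 3 4 5: an empty "skip" drops nothing
```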

On 2024-03-24 2:19 p.m., Paulo Barata wrote:


To the R-Help list,

I would like to have a clarification about an issue that I observe when 
subsetting a vector. This is R version 4.3.3 on Windows 10.


-- Example with a vector:

 > a <- 1:5
 > a
[1] 1 2 3 4 5

So we have, with a negative index:
 > a[-3]
[1] 1 2 4 5

But when the index is integer(0), we have this:

 > a[integer(0)]
integer(0)
 > a[-integer(0)]
integer(0)

When it comes to the function integer(), the R Help file says:

"Value: integer creates a integer vector of the specified length. Each 
element of the vector is equal to 0."


But we see below that integer(0) is some kind of null vector, that is, 
no numbers are represented by integer(0):


 > class(integer(0))
[1] "integer"
 > length(integer(0))
[1] 0

So my question: in the expression a[-integer(0)], what happens exactly 
with the index -integer(0)? We see that integer(0), although of class 
integer, does not represent any numbers, as its length is zero, so it 
seems to me that it makes no sense to calculate its negative 
-integer(0). What exactly is -integer(0)?


In particular, why is a[-integer(0)] not the whole vector a, that is, 
[1] 1 2 3 4 5? In the example below, if the invalid index -99 is 
presented to a, the result is the whole vector:


 > a[-99]
[1] 1 2 3 4 5

If -integer(0) is an invalid index, why do we have this?

 > a[-integer(0)]
integer(0)

Why is a[-integer(0)] not the whole vector a?

Thank you very much.

Paulo Barata

Rio de Janeiro, Brazil





Re: [R] Building Packages.

2024-03-21 Thread Ben Bolker
  I think this might be a good conversation for someone to have with 
the Posit folks:

  * is there a more transparent way to do what they want?
    * either, long-term, by having utils::install.packages() add a 
      'hook' feature as mentioned by someone
    * or by using a similar method to bspm::enable(), which calls trace() 
      to add an external hook

  * is there a good place for them to put the documentation of what 
they're doing?

  * can they fix things that are broken (e.g. handling of paths), and 
allow optional arguments to override behaviours that are different from 
utils::install.packages (e.g. add a possibility to say force=TRUE to 
override the "don't reinstall if already installed")?

  cheers
   Ben

On 2024-03-21 11:51 a.m., Duncan Murdoch wrote:

I posted a description of their changes this morning.

Duncan Murdoch

On 21/03/2024 11:37 a.m., avi.e.gr...@gmail.com wrote:

With all this discussion, I shudder to ask this. I may have missed the
answers, but the discussion seems to have been about identifying and solving
the problem rapidly rather than about what is best going forward if all
parties agree.

What was the motivation for what RStudio did for their version, and the
decision to replace what came with utils unless someone very explicitly
over-rode them by asking for the original? Is their version better in other
ways? Is there a possibility the two implementations may someday merge into
something that meets several sets of needs, or are they incompatible?

Is there agreement that what broke with the substitution is a valid use, or
is it something that just happens to work on the utils version if not
patched?



-Original Message-
From: R-help  On Behalf Of Duncan Murdoch
Sent: Thursday, March 21, 2024 5:53 AM
To: peter dalgaard 
Cc: Jorgen Harmse ; r-help@r-project.org; Martin 
Maechler


Subject: Re: [R] Building Packages.

Yes, you're right.  The version found in the search list entry for
"package:utils" is the RStudio one; the ones found with two or three
colons are the original.

Duncan Murdoch

On 21/03/2024 5:48 a.m., peter dalgaard wrote:
Um, what's with the triple colon? At least on my install, double seems to
suffice:



identical(utils:::install.packages, utils::install.packages)

[1] TRUE

install.packages

function (...)
.rs.callAs(name, hook, original, ...)


-pd


On 21 Mar 2024, at 09:58, Duncan Murdoch wrote:


The good news for Jorgen (who may not be reading this thread any more) is
that one can still be sure of getting the original install.packages() by
using


 utils:::install.packages( ... )

with *three* colons, to get the internal (namespace) version of the
function.


Duncan Murdoch


On 21/03/2024 4:31 a.m., Martin Maechler wrote:

"Duncan Murdoch on Wed, 20 Mar 2024 13:20:12 -0400 writes:

  > On 20/03/2024 1:07 p.m., Duncan Murdoch wrote:
  >> On 20/03/2024 12:37 p.m., Ben Bolker wrote:
  >>> Ivan, can you give more detail on this? I've heard this
  >>> issue mentioned, but when I open RStudio and run
  >>> find("install.packages") it returns
  >>> "utils::install.packages", and running dump() from
  >>> within RStudio console and from an external "R
  >>> --vanilla" gives identical results.
  >>>
  >>> I thought at one point this might only refer to the GUI
  >>> package-installation interface, but you seem to be
  >>> saying it's the install.packages() function as well.
  >>>
  >>> Running an up-to-date RStudio on Linux, FWIW -- maybe
  >>> weirdness only happens on other OSs?
  >>
  >> On MacOS, I see this:
  >>
  >> > install.packages function (...)  .rs.callAs(name, hook,
  >> original, ...)  
  >>
  >> I get the same results as you from find().  I'm not sure
  >> what RStudio is doing to give a different value for the
  >> function than what find() sees.
  > Turns out that RStudio replaces the install.packages
  > object in the utils package.
  > Duncan Murdoch
Yes, and this has been the case for several years now, and I
have mentioned this several times, too  (though some of it
possibly not in a public R-* mailing list).
And yes, that they modify the package environment
    as.environment("package:utils")
but leave the
    namespace  asNamespace("utils")
unchanged, makes it harder to see what's
going on (but also has less severe consequences; if they kept to
the otherwise universal *rule* that the namespace and package must have
the same objects apart from those only in the namespace,
people would not even have access to R's true install.packages()
but only see the RStudio fake^Hsubstitute..
We are still not happy with their decision. Also
help(install.pac

Re: [R] Building Packages.

2024-03-21 Thread Ben Bolker
  Is your Fedora machine using the bspm package with bspm::enable() in 
the .Rprofile (to install binary packages from the r2u repository)? 
bspm adds a hook by using trace() on install.packages, which makes it 
look like this.


  My guess is that if you start with --vanilla *or* run bspm::disable() 
that you'll get back to the original-as-installed version.


  Even if you have RStudio installed you could change the association 
in your GUI file browser to open R files in emacs by default ...


  cheers
   Ben Bolker
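A quick way to check whether a session's install.packages has been replaced by a hook (RStudio's or bspm's) is to compare it against the namespace copy; in an unmodified session the two agree (a sketch; under RStudio the identical() test is expected to come back FALSE):

```r
# Compare the search-path binding with the utils namespace binding.
# Hooks (RStudio, bspm's trace()) replace the former, not the latter.
identical(install.packages, utils::install.packages)
environmentName(environment(utils::install.packages))  # "utils"
```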


On 2024-03-21 4:40 a.m., Martin Maechler wrote:

Ben Bolker
 on Wed, 20 Mar 2024 13:25:33 -0400 writes:


 >Hmm, looks platform-specific.  Under Linux both RStudio
 > and external R console return

 > a0b52513622c41c11e3ef57c7a485767

 > for digest::digest(install.packages)

Well, platform-specific maybe, notably probably the *RStudio*-version
matters (for once).

On one of our public compute-machines running Linux Fedora 38
   (I don't have RStudio installed on my desktop as I loathe it
badly to see RStudio start up when I click at an *R script in
the OS gui file browser ... !:!P:!)(*&))
  
I definitely see



R.version.string

[1] "R version 4.3.3 Patched (2024-02-29 r86162)"

RStudio.Version()$version

[1] ‘2023.12.1.402’

install.packages

function (...)
.rs.callAs(name, hook, original, ...)





No need for any hashes to see that install.packages is not the
one from R.

---
Concluding from your, Ben's, finding I'd guess that Posit
finally decided to move away from this very unfriendly idea of
sneakily replacing a base R function ?

That would actually give rise to some applause.

Martin



 > On 2024-03-20 1:20 p.m., Duncan Murdoch wrote:
 >> On 20/03/2024 1:07 p.m., Duncan Murdoch wrote:
     >>> On 20/03/2024 12:37 p.m., Ben Bolker wrote:
 >>>>  Ivan, can you give more detail on this? I've heard
 >>>> this issue mentioned, but when I open RStudio and run
 >>>> find("install.packages") it returns
 >>>> "utils::install.packages", and running dump() from
 >>>> within RStudio console and from an external "R
 >>>> --vanilla" gives identical results.
 >>>>
 >>>>  I thought at one point this might only refer to
 >>>> the GUI package-installation interface, but you seem to
 >>>> be saying it's the install.packages() function as well.
 >>>>
 >>>>  Running an up-to-date RStudio on Linux, FWIW --
 >>>> maybe weirdness only happens on other OSs?
 >>>
 >>> On MacOS, I see this:
 >>>
 >>>   > install.packages function (...)  .rs.callAs(name,
 >>> hook, original, ...)  
 >>>
 >>> I get the same results as you from find().  I'm not sure
 >>> what RStudio is doing to give a different value for the
 >>> function than what find() sees.
 >>
 >> Turns out that RStudio replaces the install.packages
 >> object in the utils package.
 >>
 >> Duncan Murdoch
 >>
 >>>
 >>> Duncan Murdoch
 >>>
 >>>>
 >>>>   Ben Bolker
 >>>>
 >>>> On 2024-03-20 12:13 p.m., Ivan Krylov via R-help wrote:
 >>>>> On Wed, 20 Mar 2024 16:02:27 + Jorgen Harmse via
 >>>>> R-help writes:
 >>>>>
 >>>>>>> install.packages(tar,type='source',repos=NULL)
 >>>>>> Error in library(jhBase) : there is no package called ‘jhBase’
 >>>>>> Execution halted
 >>>>>> Warning in install.packages(tar, type = "source", repos = NULL) :
 >>>>>>   installation of package
 >>>>>> ‘/Users/jharmse/Library/CloudStorage/OneDrive-RokuInc/jhBase_1.0.1.tar.gz’
 >>>>>> had non-zero exit status

 >>>>>
 >>>>> Using RStudio? It happens to override install.packages
 >>>>> with a function that doesn't quite handle file
 >>>>> paths. Try utils::install.packages(tar, type =
 >>>>> "source", repos = NULL).
 >>>>>
 >>>>

Re: [R] Building Packages.

2024-03-20 Thread Ben Bolker
  Hmm, looks platform-specific.  Under Linux both RStudio and external 
R console return


a0b52513622c41c11e3ef57c7a485767

for digest::digest(install.packages)

On 2024-03-20 1:20 p.m., Duncan Murdoch wrote:

On 20/03/2024 1:07 p.m., Duncan Murdoch wrote:

On 20/03/2024 12:37 p.m., Ben Bolker wrote:

 Ivan, can you give more detail on this? I've heard this issue
mentioned, but when I open RStudio and run find("install.packages") it
returns "utils::install.packages", and running dump() from within
RStudio console and from an external "R --vanilla" gives identical 
results.


 I thought at one point this might only refer to the GUI
package-installation interface, but you seem to be saying it's the
install.packages() function as well.

 Running an up-to-date RStudio on Linux, FWIW -- maybe weirdness only
happens on other OSs?


On MacOS, I see this:

  > install.packages
function (...)
.rs.callAs(name, hook, original, ...)


I get the same results as you from find().  I'm not sure what RStudio is
doing to give a different value for the function than what find() sees.


Turns out that RStudio replaces the install.packages object in the utils 
package.


Duncan Murdoch



Duncan Murdoch



  Ben Bolker

On 2024-03-20 12:13 p.m., Ivan Krylov via R-help wrote:

On Wed, 20 Mar 2024 16:02:27 +
Jorgen Harmse via R-help writes:


install.packages(tar,type='source',repos=NULL)


Error in library(jhBase) : there is no package called ‘jhBase’

Execution halted

Warning in install.packages(tar, type = "source", repos = NULL) :

 installation of package
‘/Users/jharmse/Library/CloudStorage/OneDrive-RokuInc/jhBase_1.0.1.tar.gz’
had non-zero exit status


Using RStudio? It happens to override install.packages with a function
that doesn't quite handle file paths. Try utils::install.packages(tar,
type = "source", repos = NULL).











Re: [R] Building Packages.

2024-03-20 Thread Ben Bolker
  Ivan, can you give more detail on this? I've heard this issue 
mentioned, but when I open RStudio and run find("install.packages") it 
returns "utils::install.packages", and running dump() from within 
RStudio console and from an external "R --vanilla" gives identical results.


  I thought at one point this might only refer to the GUI 
package-installation interface, but you seem to be saying it's the 
install.packages() function as well.


  Running an up-to-date RStudio on Linux, FWIW -- maybe weirdness only 
happens on other OSs?


   Ben Bolker

On 2024-03-20 12:13 p.m., Ivan Krylov via R-help wrote:

On Wed, 20 Mar 2024 16:02:27 +
Jorgen Harmse via R-help writes:


install.packages(tar,type='source',repos=NULL)


Error in library(jhBase) : there is no package called ‘jhBase’

Execution halted

Warning in install.packages(tar, type = "source", repos = NULL) :

   installation of package
‘/Users/jharmse/Library/CloudStorage/OneDrive-RokuInc/jhBase_1.0.1.tar.gz’
had non-zero exit status


Using RStudio? It happens to override install.packages with a function
that doesn't quite handle file paths. Try utils::install.packages(tar,
type = "source", repos = NULL).





Re: [R] Help

2024-02-22 Thread Ben Bolker

  I agree that the posting guide is due for updating.

  If the mailing list maintainers were willing I think r-consult might 
not be a terrible idea. I do think the center of gravity has moved to 
Cross Validated, and it might be minimally sufficient to point people 
there (or Math Overflow for probability/math questions) rather than 
starting a new group.


On 2024-02-21 12:53 p.m., Joakim Linde wrote:

Lisa, this seems fairly straightforward to do in R, and I'm happy to help 
you get started. However, please be aware that you do have to have knowledge of 
statistics to do the analysis/modeling.

Rolf, Jeff, I do appreciate your view that this is not an R problem. It's more a 'how to 
use R' / 'help me get started' problem. The posting guidelines point to "Usenet 
groups sci.stat.consult (applied statistics and consulting) and sci.stat.math 
(mathematical stat and probability)." Since Google announced [1] that Google Groups 
will not support new Usenet content starting tomorrow, would it make sense to have an 
r-consult mailing list or tag it [consult] on r-help?

Regards,
Joakim

[1]: https://support.google.com/groups/answer/11036538

On Wed, Feb 21, 2024, at 1:28 AM, Jeff Newmiller via R-help wrote:

Regarding 1 and 2, please read the Posting Guide mentioned at the
bottom of every R-help post. R does not equal statistics... and
education about statistics is way too ambitious to include in this
mailing list that is about a tool that happens to be useful for
statisticians.

There are forums online that do cater to statistical methods (e.g.
Cross Validated or many results from a search engine)... but such
conversations can be extensive so as Rolf suggests this is a good time
to learn what resources your educational institutions can provide...
online forums may be too limiting when your questions are so vague.

On February 20, 2024 2:14:58 PM PST, Rolf Turner  wrote:


On Mon, 19 Feb 2024 17:39:23 +0100
Lisa Hupfer via R-help  wrote:


I am writing my master's thesis, in which I compared two cultures. So
for my statistics I need to compare Age, Sex, and Culture, as well as have a
look at the task scores.

Anyone familiar with this?
I'd love to share my script so you can guide me on where I went wrong.


(1) This post is far too vague to be appropriate for this list.

(2) You should learn some statistics; probably linear modelling.

(3) You should talk to your thesis advisor.

(4) Please see fortunes::fortune(285).

cheers,

Rolf Turner




--
Sent from my phone. Please excuse my brevity.







Re: [R] Use of geometric mean for geochemical concentrations

2024-01-22 Thread Ben Bolker
  I think https://stats.stackexchange.com would be best: r-sig-ecology 
is pretty quiet these days


On 2024-01-22 11:05 a.m., Rich Shepard wrote:

On Mon, 22 Jan 2024, Bert Gunter wrote:


better posted on r-sig-ecology? -- or maybe even stack exchange?


Bert,

Okay.

Regards,

Rich





Re: [R] Strange results : bootrstrp CIs

2024-01-13 Thread Ben Bolker
  It took me a little while to figure this out, but: the problem is 
that if your resampling leaves out any countries (which is very likely), 
your model applied to the bootstrapped data will have fewer coefficients 
than your original model.


I tried this:

cc <- unique(e$Country)
func <- function(data, idx) {
   coef(lm(Score~ Time + factor(Country, levels =cc),data=data[idx,]))
}

but lm() automatically drops coefficients for missing countries (I 
didn't think about it too hard, but I thought they might get returned as 
NA and that boot() might be able to handle that ...)


  If you want to do this I think you'll have to find a way to do a 
*stratified* bootstrap, restricting the bootstrap samples so that they 
always contain at least one sample from each country ... (I would have 
expected "strata = as.numeric(e$Country)" to do this, but it doesn't 
work the way I thought ... it tries to compute the statistics for *each* 
stratum ...)
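One hand-rolled way to get the stratified behavior (a sketch that sidesteps boot()'s strata mechanism entirely, using simulated stand-in data rather than the thread's data frame `e`) is to resample row indices within each country, so every bootstrap data set keeps all factor levels and the coefficient vector always has the same length:

```r
# Minimal stratified-bootstrap sketch: resample rows within each country
# so no factor level is ever dropped from a bootstrap sample.
set.seed(101)
d <- data.frame(
  Score   = rnorm(24, 500, 50),
  Country = rep(c("Italy", "Turkey", "USA", "Korea",
                  "Portugal", "UK", "Poland", "Austria"), each = 3),
  Time    = rep(1:3, 8)
)
by_country <- split(seq_len(nrow(d)), d$Country)

one_rep <- function() {
  idx <- unlist(lapply(by_country,
                       function(i) sample(i, length(i), replace = TRUE)))
  coef(lm(Score ~ Time + factor(Country), data = d[idx, ]))
}
t_star <- t(replicate(200, one_rep()))
dim(t_star)  # 200 replicates x 9 coefficients, none missing
```

Percentile intervals can then be read off with apply(t_star, 2, quantile, c(0.025, 0.975)).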






 Debugging attempts:

set.seed(101)
options(error=recover)
B= boot(e, func, R=1000)


Error in t.star[r, ] <- res[[r]] :
  number of items to replace is not a multiple of replacement length

Enter a frame number, or 0 to exit

1: boot(e, func, R = 1000)



Selection: 1
Called from: top level
Browse[1]> print(r)
[1] 2
Browse[1]> t.star[r,]
[1] NA NA NA NA NA NA NA NA NA

i[2,]
 [1] 14 15 22 22 21 14  8  2 12 22 10 15  9  7  9 13 12 23  1 20 15  7 
5 10





On 2024-01-13 5:22 p.m., varin sacha via R-help wrote:

Dear Duncan,
Dear Ivan,

I really thank you a lot for your response.
So, if I correctly understand your answers the problem is coming from this line:

coef(lm(Score~ Time + factor(Country)),data=data[idx,])

This line should be:
coef(lm(Score~ Time + factor(Country),data=data[idx,]))

If yes, now I get an error message (code here below)! So, it still does not 
work.

Error in t.star[r, ] <- res[[r]] :
   number of items to replace is not a multiple of replacement length


##
Score=c(345,564,467,675,432,346,476,512,567,543,234,435,654,411,356,658,432,345,432,345,
 345,456,543,501)
  
Country=c("Italy", "Italy", "Italy", "Turkey", "Turkey", "Turkey", "USA", "USA", "USA", "Korea", "Korea", "Korea", "Portugal", "Portugal", "Portugal", "UK", "UK", "UK", "Poland", "Poland", "Poland", "Austria", "Austria", "Austria")
  
Time=c(1,2,3,1,2,3,1,2,3,1,2,3,1,2,3,1,2,3,1,2,3,1,2,3)
  
e=data.frame(Score, Country, Time)
  
  
library(boot)

func= function(data, idx) {
coef(lm(Score~ Time + factor(Country),data=data[idx,]))
}
B= boot(e, func, R=1000)
  
boot.ci(B, index=2, type="perc")

#








On Saturday, 13 January 2024 at 21:56:58 UTC+1, Ivan Krylov wrote:





On Sat, 13 Jan 2024 20:33:47 + (UTC)

varin sacha via R-help writes:


coef(lm(Score~ Time + factor(Country)),data=data[idx,])



Wrong place for the data=... argument. You meant to give it to lm(...),
but in the end it went to coef(...). Without the data=... argument, the
formula passed to lm() picks up the global variables inherited by the
func() closure.

Unfortunately, S3 methods really do have to ignore extra arguments they
don't understand if the class is to be extended, so coef.lm isn't
allowed to complain to you about it.





Re: [R] Obtaining a value of pie in a zero inflated model (fm-zinb2)

2024-01-07 Thread Ben Bolker

   A little more.

  As Christopher Ryan points out, as long as the zero-inflation model 
is non-trivial (i.e. more complex than a single intercept for the whole 
population), there's a pi(i) for every observation i (you could 
certainly average the pi(i) values if you wanted, or compute pi for an 
averaged set of covariates).


library(pscl)
fm_zinb2 <- zeroinfl(art ~ . | ., data = bioChemists, dist = "poisson")

pi_i <- predict(fm_zinb2, type = "zero")
head(pi_i)
##  1  2  3  4  5  6
## 0.13392803 0.21936320 0.21973454 0.24697313 0.01890023 0.34281196


To get the conditional probability that a zero observation is a 
structural or a sampling zero, compute the probability of a sampling 
zero (exp(-mu[i]) for a Poisson model) and use Bayes' rule, assuming 
equal prior probs


## probability of _sampling_ zero

psampz <- exp(-predict(fm_zinb2, type = "count"))
head(psampz)
##  1  2  3  4  5  6
## 0.09507376 0.18361241 0.18688646 0.14774602 0.08992667 0.27235498

pz <- 1 - (1-pi_i)*(1-psampz)  ## prob of zero of either type
## 1 2 3 4 5 6
## 0.2162687 0.3626978 0.366 0.3582299 0.1071273 0.5218004

psampz/pz
head(psampz)/head(pz)
## 1 2 3 4 5 6
## 0.4396093 0.5062408 0.5112395 0.4124336 0.8394378 0.5219524

   3. I'm not aware of a closed-form solution for zero-inflation models 
as long as the count and/or structural-zero components depend on 
covariates.  In any case, looking inside zeroinfl() you can see that it 
calls optim() [using BFGS by default, see ?pscl::zeroinfl.control]
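To illustrate the iterative-fitting point, here is a hedged, intercept-only sketch of fitting a zero-inflated Poisson by handing its negative log-likelihood to optim() directly (simulated data; this is an illustration of the idea, not pscl's actual internals):

```r
set.seed(1)
n   <- 5000
pi0 <- 0.3           # true structural-zero probability
mu0 <- 2             # true Poisson mean
z <- rbinom(n, 1, pi0)
y <- ifelse(z == 1, 0, rpois(n, mu0))

# Negative log-likelihood of an intercept-only ZIP model, parameterized
# as (logit(pi), log(mu)) so the search is unconstrained.
nll <- function(par) {
  p  <- plogis(par[1])
  mu <- exp(par[2])
  ll <- ifelse(y == 0,
               log(p + (1 - p) * exp(-mu)),   # zero: structural or sampling
               log(1 - p) + dpois(y, mu, log = TRUE))
  -sum(ll)
}
fit <- optim(c(0, 0), nll, method = "BFGS")
c(pi = plogis(fit$par[1]), mu = exp(fit$par[2]))  # near 0.3 and 2
```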


  cheers
   Ben Bolker


On 2024-01-04 9:38 a.m., Christopher W. Ryan via R-help wrote:

Are you referring to the zeroinfl() function in the countreg package? If
so, I think

predict(fm_zinb2, type = "zero", newdata = some.new.data)

will give you pi for each combination of covariate values that you
provide in some.new.data

where pi is the probability to observe a zero from the point mass component.

As to your second question, I'm not sure that's possible, for any
*particular, individual* subject. Others will undoubtedly know better
than I.

--Chris Ryan

Sorkin, John wrote:

I am running a zero inflated regression using the zeroinfl function similar to 
the model below:
   
  fm_zinb2 <- zeroinfl(art ~ . | ., data = bioChemists, dist = "poisson")

summary(fm_zinb2)

I have three questions:

1) How can I obtain a value for the parameter pie, which is the fraction of the 
population that is in the zero inflated model vs the fraction in the count 
model?

2) For any particular subject, how can I determine if the subject is in the 
portion of the population that contributes a zero count because the subject is 
in the group of subjects who have structural zero responses vs. the subject 
being in the portion of the population who can contribute a zero or a non-zero 
response?

3) zero inflated models can be solved using closed form solutions, or using 
iterative methods. Which method is used by fm_zinb2?

Thank you,
John

John David Sorkin M.D., Ph.D.
Professor of Medicine, University of Maryland School of Medicine;

Associate Director for Biostatistics and Informatics, Baltimore VA Medical 
Center Geriatrics Research, Education, and Clinical Center;

PI Biostatistics and Informatics Core, University of Maryland School of 
Medicine Claude D. Pepper Older Americans Independence Center;

Senior Statistician University of Maryland Center for Vascular Research;

Division of Gerontology and Palliative Care,
10 North Greene Street
GRECC (BT/18/GR)
Baltimore, MD 21201-1524
Cell phone 443-418-5382










Re: [R] ggplot2: Get the regression line with 95% confidence bands

2023-12-12 Thread Ben Bolker
Use scale_x_continuous() and specify your desired breaks
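For the posted example, that amounts to something like the following (a sketch using the data frame from the question; `breaks = df$year` labels only the observed years):

```r
library(ggplot2)

df <- data.frame(year = c(2012, 2015, 2018, 2022),
                 score = c(495, 493, 495, 474))

p <- ggplot(df, aes(x = year, y = score)) +
  geom_point() +
  geom_smooth(method = "lm", formula = y ~ x) +
  scale_x_continuous(breaks = df$year) +   # show only 2012, 2015, 2018, 2022
  scale_y_continuous(limits = c(470, 500), oob = scales::squish) +
  labs(title = "Standard linear regression for France",
       x = "Year", y = "PISA score in mathematics")
```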

On Tue, Dec 12, 2023, 4:19 PM varin sacha  wrote:

> Dear Ben,
> Dear Daniel,
> Dear Rui,
> Dear Bert,
>
> Here below my R code.
> I really appreciate all your comments. My R code is perfectly working but
> there is still something I would like to improve. The X-axis is showing
> 2012.5 ;   2015.0   ;   2017.5   ;  2020.0
> I would like to see on X-axis only the year (2012 ; 2015 ; 2017 ; 2020).
> How to do?
>
>
> #
> library(ggplot2)
>
> df=data.frame(year= c(2012,2015,2018,2022), score=c(495,493, 495, 474))
>
> ggplot(df, aes(x = year, y = score)) + geom_point() + geom_smooth(method =
> "lm", formula = y ~ x) +
>  labs(title = "Standard linear regression for France", x = "Year", y =
> "PISA score in mathematics") +
> scale_y_continuous(limits=c(470,500),oob=scales::squish)
> #
>
>
>
>
>
>
>
>
>
> On Monday, 11 December 2023 at 23:38:06 UTC+1, Ben Bolker
> wrote:
>
>
>
>
>
>
>
> On 2023-12-11 5:27 p.m., Daniel Nordlund wrote:
> > On 12/10/2023 2:50 PM, Rui Barradas wrote:
> >> At 22:35 on 10/12/2023, varin sacha via R-help wrote:
> >>>
> >>> Dear R-experts,
> >>>
> >>> Here below my R code, as my X-axis is "year", I must be missing one
> >>> or more steps! I am trying to get the regression line with the 95%
> >>> confidence bands around the regression line. Any help would be
> >>> appreciated.
> >>>
> >>> Best,
> >>> S.
> >>>
> >>>
> >>> #
> >>> library(ggplot2)
> >>>   df=data.frame(year=factor(c("2012","2015","2018","2022")),
> >>> score=c(495,493, 495, 474))
> >>>   ggplot(df, aes(x=year, y=score)) + geom_point( ) +
> >>> geom_smooth(method="lm", formula = score ~ factor(year), data = df) +
> >>> labs(title="Standard linear regression for France", y="PISA score in
> >>> mathematics") + ylim(470, 500)
> >>> #
> >>>
> >> Hello,
> >>
> >> I don't see a reason why year should be a factor and the formula in
> >> geom_smooth is wrong, it should be y ~ x, the aesthetics involved.
> >> It still doesn't plot the CI's though. There's a warning and I am not
> >> understanding where it comes from. But the regression line is plotted.
> >>
> >>
> >>
> >> ggplot(df, aes(x = as.numeric(year), y = score)) +
> >>   geom_point() +
> >>   geom_smooth(method = "lm", formula = y ~ x) +
> >>   labs(
> >> title = "Standard linear regression for France",
> >> x = "Year",
> >> y = "PISA score in mathematics"
> >>   ) +
> >>   ylim(470, 500)
> >> #> Warning message:
> >> #> In max(ids, na.rm = TRUE) : no non-missing arguments to max;
> >> returning -Inf
> >>
> >>
> >>
> >> Hope this helps,
> >>
> >> Rui Barradas
> >>
> >>
> >>
> > After playing with this for a little while, I realized that the problem
> > with plotting the confidence limits is the addition of ylim(470, 500).
> > The confidence values are outside the ylim values.  Remove the limits,
> > or increase the range, and the confidence curves will plot.
> >
> > Hope this is helpful,
> >
> > Dan
> >
>
>   Or use + scale_y_continuous(limits = c(470, 500), oob = scales::squish)
>
>
>

[[alternative HTML version deleted]]



Re: [R] ggplot2: Get the regression line with 95% confidence bands

2023-12-11 Thread Ben Bolker




On 2023-12-11 5:27 p.m., Daniel Nordlund wrote:

On 12/10/2023 2:50 PM, Rui Barradas wrote:

At 22:35 on 10/12/2023, varin sacha via R-help wrote:


Dear R-experts,

Here below my R code, as my X-axis is "year", I must be missing one 
or more steps! I am trying to get the regression line with the 95% 
confidence bands around the regression line. Any help would be 
appreciated.


Best,
S.


#
library(ggplot2)
  df=data.frame(year=factor(c("2012","2015","2018","2022")), 
score=c(495,493, 495, 474))
  ggplot(df, aes(x=year, y=score)) + geom_point( ) + 
geom_smooth(method="lm", formula = score ~ factor(year), data = df) + 
labs(title="Standard linear regression for France", y="PISA score in 
mathematics") + ylim(470, 500)

#


Hello,

I don't see a reason why year should be a factor and the formula in 
geom_smooth is wrong, it should be y ~ x, the aesthetics involved.
It still doesn't plot the CI's though. There's a warning and I am not 
understanding where it comes from. But the regression line is plotted.




ggplot(df, aes(x = as.numeric(year), y = score)) +
  geom_point() +
  geom_smooth(method = "lm", formula = y ~ x) +
  labs(
    title = "Standard linear regression for France",
    x = "Year",
    y = "PISA score in mathematics"
  ) +
  ylim(470, 500)
#> Warning message:
#> In max(ids, na.rm = TRUE) : no non-missing arguments to max; 
returning -Inf




Hope this helps,

Rui Barradas



After playing with this for a little while, I realized that the problem 
with plotting the confidence limits is the addition of ylim(470, 500). 
The confidence values are outside the ylim values.  Remove the limits, 
or increase the range, and the confidence curves will plot.


Hope this is helpful,

Dan



 Or use + scale_y_continuous(limits = c(470, 500), oob = scales::squish)



Re: [R] adding "Page X of XX" to PDFs

2023-12-02 Thread Ben Bolker
   Sorry, jumped into the thread too late. (On the other hand, once the 
document gets complicated enough, it may be worth it in the long run to 
convert to something that actually has a document-generating back-end, 
rather than reinventing everything from scratch ...)


On 2023-12-02 2:03 p.m., Jeff Newmiller via R-help wrote:

He clearly stated he was using the pdf() graphics device.

On December 2, 2023 10:36:44 AM PST, Ben Bolker  wrote:

  It's still not entirely clear to me what framework you're using to generate 
the PDF, but if it's rmarkdown/Rnw (Sweave)/Quarto-based, then as far as I know 
all of those frameworks use LaTeX as the last step in the script-to-PDF 
pipeline, and allow the inclusion of arbitrary LaTeX code, so the 'lastpage' 
package would do this for you:

https://stackoverflow.com/questions/70343001/how-to-show-the-total-number-of-pages-in-a-pdf-via-the-rmarkdown-i-e-display

https://tex.stackexchange.com/questions/227/how-can-i-add-page-of-on-my-document



On 2023-12-02 12:23 p.m., Ebert,Timothy Aaron wrote:

Would this work in general? Say I have a document with figures, special 
equations, text, and tables. The text and tables are relatively easy. The 
figures would need a conversion from pixels to lines, and the equations maybe 
printed out, counted as a figure, and then added to the line count. It would 
also be tricky if a title line was at 32 point font and the text at 12, and the 
more complex the formatting the harder to deal with rows as related to page 
size.

Thankfully I do not think I will have to do this, so the question is for 
theoretical interest on my part (at least for now).

Tim

-Original Message-
From: R-help  On Behalf Of Jeff Newmiller via 
R-help
Sent: Saturday, December 2, 2023 11:46 AM
To: r-help@r-project.org
Subject: Re: [R] adding "Page X of XX" to PDFs

[External Email]

One of the most fundamental characteristics of R programming is the use of data 
frames of column vectors, and one of the very first challenges I had as a 
then-Perl-programmer was coming to grips with the fact that unknown-length CSV 
files would be read completely into memory as rows and once the entire CSV was 
in memory it would be transposed into column vectors. I was resistant to this 
philosophy at first, but the advantages in computation speed and simplicity 
eventually won me over.

I would say that if you want to know how many pages you are going to produce 
with R, then you are going to have to count them before you create them. 
Building a dataframe that describes (in terms of parameters to be passed to a 
page-generating function in each row) what you are going to put on each page 
before you actually print it can make this pre-counting problem trivial, and 
the code that does the printing is likely to be more modular and testable as 
well.
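A minimal base-graphics sketch of that two-pass idea, with hypothetical page titles standing in for the real content: the data frame describing the pages is built first, so the total is known before any drawing happens.

```r
# Describe every page up front; nrow() gives the total before printing.
pages <- data.frame(title = c("Overview", "Results", "Appendix"))
n <- nrow(pages)

f <- tempfile(fileext = ".pdf")
pdf(f)
for (i in seq_len(n)) {
  plot.new()
  title(main = pages$title[i])
  # the total page count is available on the very first page
  mtext(sprintf("Page %d of %d", i, n), side = 1, line = 3)
}
dev.off()
```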

On December 1, 2023 12:53:25 PM PST, Dennis Fisher  wrote:

OS X
R 4.3.1

Colleagues

I often create multipage PDFs [pdf()] in which the text "Page X" appears in the 
margin.  These PDFs are created automatically using a massive R script.

One of my clients requested that I change this to:
Page X of XX
where XX is the total number of pages.

I don't know the number of expected pages so I can't think of any clever way to 
do this.  I suppose that I could create the PDF, find out the number of pages, 
then have a second pass in which the R script was fed the number of pages.  
However, there is one disadvantage to this -- the original PDF contains a 
timestamp on each page -- the new version would have a different timestamp -- 
so I would prefer to not use this approach.

Has anyone thought of some terribly clever way to solve this problem?

Dennis

Dennis Fisher MD
P < (The "P Less Than" Company)
Phone / Fax: 1-866-PLessThan (1-866-753-7784)
http://www.plessthan.com/



--
Sent from my phone. Please excuse my brevity.





Re: [R] adding "Page X of XX" to PDFs

2023-12-02 Thread Ben Bolker
  It's still not entirely clear to me what framework you're using to 
generate the PDF, but if it's rmarkdown/Rnw (Sweave)/Quarto-based, then 
as far as I know all of those frameworks use LaTeX as the last step in 
the script-to-PDF pipeline, and allow the inclusion of arbitrary LaTeX 
code, so the 'lastpage' package would do this for you:


https://stackoverflow.com/questions/70343001/how-to-show-the-total-number-of-pages-in-a-pdf-via-the-rmarkdown-i-e-display

https://tex.stackexchange.com/questions/227/how-can-i-add-page-of-on-my-document



On 2023-12-02 12:23 p.m., Ebert,Timothy Aaron wrote:

Would this work in general? Say I have a document with figures, special 
equations, text, and tables. The text and tables are relatively easy. The 
figures would need a conversion from pixels to lines, and the equations maybe 
printed out, counted as a figure, and then added to the line count. It would 
also be tricky if a title line was at 32 point font and the text at 12, and the 
more complex the formatting the harder to deal with rows as related to page 
size.

Thankfully I do not think I will have to do this, so the question is for 
theoretical interest on my part (at least for now).

Tim

-Original Message-
From: R-help  On Behalf Of Jeff Newmiller via 
R-help
Sent: Saturday, December 2, 2023 11:46 AM
To: r-help@r-project.org
Subject: Re: [R] adding "Page X of XX" to PDFs

[External Email]

One of the most fundamental characteristics of R programming is the use of data 
frames of column vectors, and one of the very first challenges I had as a 
then-Perl-programmer was coming to grips with the fact that unknown-length CSV 
files would be read completely into memory as rows and once the entire CSV was 
in memory it would be transposed into column vectors. I was resistant to this 
philosophy at first, but the advantages in computation speed and simplicity 
eventually won me over.

I would say that if you want to know how many pages you are going to produce 
with R, then you are going to have to count them before you create them. 
Building a dataframe that describes (in terms of parameters to be passed to a 
page-generating function in each row) what you are going to put on each page 
before you actually print it can make this pre-counting problem trivial, and 
the code that does the printing is likely to be more modular and testable as 
well.

On December 1, 2023 12:53:25 PM PST, Dennis Fisher  wrote:

OS X
R 4.3.1

Colleagues

I often create multipage PDFs [pdf()] in which the text "Page X" appears in the 
margin.  These PDFs are created automatically using a massive R script.

One of my clients requested that I change this to:
   Page X of XX
where XX is the total number of pages.

I don't know the number of expected pages so I can't think of any clever way to 
do this.  I suppose that I could create the PDF, find out the number of pages, 
then have a second pass in which the R script was fed the number of pages.  
However, there is one disadvantage to this -- the original PDF contains a 
timestamp on each page -- the new version would have a different timestamp -- 
so I would prefer to not use this approach.

Has anyone thought of some terribly clever way to solve this problem?

Dennis

Dennis Fisher MD
P < (The "P Less Than" Company)
Phone / Fax: 1-866-PLessThan (1-866-753-7784)
http://www.plessthan.com/



--
Sent from my phone. Please excuse my brevity.






Re: [R] Mann Kendall mutation package?

2023-12-01 Thread Ben Bolker

  Have you looked at the Kendall package?

https://stackoverflow.com/questions/62288340/mann-kendall-in-r

 (you've cross-posted a version of this question to SO as well:

https://stackoverflow.com/questions/77587426/what-is-the-algorithm-for-the-mann-kendall-mutation-test

)

Please don't cross-post between the R help lists and other forums like 
SO -- it dilutes/duplicates effort.




On 2023-12-01 6:58 a.m., Nick Wray wrote:

Hello - does anyone know whether there are any packages for Mann-Kendall
mutation tests in R available?  The only one I could find online is this
MK_mut_test: Mann-Kendall mutation test in Sibada/sibadaR: Sibada's
accumulated R scripts for next probably use to avoid reinventing the wheel.
(rdrr.io)  but
there doesn't seem to be a package corresponding to this.  I've tried
installing various permutations of the apparent name Sibada/sibadaR but
nothing comes up, so I'm not sure whether it even exists...

Thanks Nick Wray






Re: [R] Code editor for writing R code

2023-11-29 Thread Ben Bolker




  Presumably there's nothing stopping you *writing* LaTeX in comments 
-- do you want a code editor that will render and display the LaTeX as 
you write? (Or am I misunderstanding something?)


  Does anyone do classic literate programming *sensu* Knuth any more? 
https://rpubs.com/bbolker/3153 
https://cran.r-project.org/web/packages/noweb/vignettes/noweb.pdf



On 2023-11-29 10:57 a.m., Christofer Bogaso wrote:

Hi,

Currently I use VS-Code to write codes in R. While it is very good, it
does not allow me to write LaTeX expressions in comments, which I would
like in order to write corresponding mathematical expressions as
comments in my code files.

Does there exist any code editor for R that allows me to write LaTeX
in comments?

Any information will be appreciated.

Thanks,





Re: [R] make a lattice dotplot with symbol size proportional to a variable in the plotted dataframe

2023-11-08 Thread Ben Bolker

  Non-standard evaluation
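A tiny base-R illustration of non-standard evaluation, the mechanism lattice (and much of base R's plotting and modelling machinery) uses to capture arguments unevaluated:

```r
# substitute() captures the argument's expression rather than its value --
# this is how a call like dotplot(y ~ x, data = d) can record the formula
# and variable names before anything is evaluated.
grab <- function(x) substitute(x)

grab(Score ~ Time)        # returns the formula expression itself
deparse(grab(a + b))      # "a + b" (a and b need not even exist)
```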

On 2023-11-08 10:56 a.m., Christopher W. Ryan via R-help wrote:

Very helpful, Deepayan, and educational. Thank you.

What does NSE stand for?

Thanks,
Chris

Deepayan Sarkar wrote:


--Chris Ryan






Re: [R] How can I remove my packages from rdrr.io?

2023-11-01 Thread Ben Bolker
  There is a github site with an issues list: 
https://github.com/rdrr-io/rdrr-issues/issues


  It looks like people have successfully requested removal in the past, 
e.g. https://github.com/rdrr-io/rdrr-issues/issues/113


On 2023-11-01 9:06 a.m., Kim Emilia wrote:

Hello all,

I would like to take down my packages posted/created on the website rdrr.io.
[https://rdrr.io/] Is there any way to take down packages from the website?
It would be appreciated if you suggested/offered a way to remove the
package from the website.

Thank you.






Re: [R] weights vs. offset (negative binomial regression)

2023-10-31 Thread Ben Bolker

  [Please keep r-help in the cc: list]

  I don't quite know how to interpret the difference between specifying 
effort as an offset vs. as weights; I would have to spend more time 
thinking about it/working through it than I have available at the moment.


   I don't know that specifying effort as weights is *wrong*, but I 
don't know that it's right or what it is doing: if I were the reviewer 
of a paper (for example) I would require you to explain what the 
difference is and convince me that it was appropriate. (Furthermore, "I 
want to do it this way because it gives me significant effects" is 
automatically suspicious.)


  This would be a good question for CrossValidated 
(https://stats.stackexchange.com); you could try posting it there (I 
would be interested in the answer!)


  cheers
Ben Bolker


On 2023-10-30 8:19 p.m., 유준택 wrote:

Dear Mr. Bolker,

Thank you for the fast response.

I also know that a Poisson (or negative binomial) regression with glm() is 
generally modelled using an offset variable.


In this case, when a weights term is used instead of the offset, I get 
significant coefficients for the covariates.
I understand that the weights function for exponential family 
distributions in glm affects the variance of response variable.


I was just wondering whether my first model is a completely wrong model 
and the use of offset variable is valid in the case that
response variable is  not proportional to offset variable such as my 
dataset.


Sincerely,


Joon-Taek

On Sunday, October 29, 2023 at 3:25 AM, Ben Bolker <mailto:bbol...@gmail.com> wrote:


    Using an offset of log(Effort) as in your second model is the more
standard way to approach this problem; it corresponds to assuming that
catch is strictly proportional to effort. Adding log(Effort) as a
covariate (as illustrated below) tests whether a power-law model (catch
propto (Effort)^(b+1), b!=0) is a better description of the data.  (In
this case it is not, although the confidence intervals on b are very
wide, indicating that we have very little information -- this is not
surprising since the proportional range of effort is very small
(246-258) in this data set.)

    In general you should *not* check overdispersion of the raw data
(i.e., the *marginal distribution* of the data, you should check
overdispersion of a fitted (e.g. Poisson) model, as below.

    cheers
     Ben Bolker


edata <- data.frame(Catch, Effort, xx1, xx2, xx3)

## graphical exploration

library(ggplot2); theme_set(theme_bw())
library(tidyr)
edata_long <- edata |> pivot_longer(names_to="var", cols =-c("Catch",
"Effort"))
ggplot(edata_long, aes(value, Catch)) +
      geom_point(alpha = 0.2, aes(size = Effort)) +
      facet_wrap(~var, scale="free_x") +
      geom_smooth(method = "glm", method.args = list(family =
"quasipoisson"))
#

library(MASS)
g1 <- glm.nb(Catch~xx1+xx2+xx3+offset(log(Effort)), data=edata)
g2 <- update(g1, . ~ . + log(Effort))
g0 <- glm(Catch~xx1+xx2+xx3+offset(log(Effort)), data=edata,
            family = poisson)
performance::check_overdispersion(g0)
summary(g1)
summary(g2)
options(digits = 3)
confint(g2)
summary(g1)



On 2023-10-28 3:30 a.m., 유준택 wrote:
 > Colleagues,
 >
 >
 >
 > I have a dataset that includes five variables.
 >
 > - Catch: the catch number counted in some species (ind.)
 >
 > - Effort: fishing effort (the number of fishing vessels)
 >
 > - xx1, xx2, xx3: some environmental factors
 >
 > As an overdispersion test on the “Catch” variable, I modeled with
negative
 > binomial distribution using a GLM. The “Effort” variable showed a
gradually
 > decreasing trend during the study period. I was able to get the
results I
 > wanted when considered “Effort” function as a weights function in the
 > negative binomial regression as follows:
 >
 >
 >
 > library(qcc)
 >
 >
Catch=c(25,2,7,6,75,5,1,4,66,15,9,25,40,8,7,4,36,11,1,14,141,9,74,38,126,3)
 >
 >

Effort=c(258,258,258,258,258,258,258,254,252,252,252,252,252,252,252,252,252,252,252,248,246,246,246,246,246,246)
 >
 >

xx1=c(0.8,0.5,1.2,0.5,1.1,1.1,1.0,0.6,0.9,0.5,1.2,0.6,1.2,0.7,1.0,0.6,1.6,0.7,0.8,0.6,1.7,0.9,1.1,0.5,1.4,0.5)
 >
 >

xx2=c(1.7,1.6,2.7,2.6,1.5,1.5,2.8,2.5,1.7,1.9,2.2,2.4,1.6,1.4,3.0,2.4,1.4,1.5,2.2,2.3,1.7,1.7,1.9,1.9,1.4,1.4)
 >
 >

xx3=c(188,40,2,10,210,102,117,14,141,28,48,15,220,115,10,14,320,20,3,10,400,150,145,160,460,66)
 >
 > #
 >
 > edata <- data.frame(Catch, Effort, xx1, xx2, xx3)
 >
 > #
 >
 > qcc.ove

Re: [R] weights vs. offset (negative binomial regression)

2023-10-28 Thread Ben Bolker
  Using an offset of log(Effort) as in your second model is the more 
standard way to approach this problem; it corresponds to assuming that 
catch is strictly proportional to effort. Adding log(Effort) as a 
covariate (as illustrated below) tests whether a power-law model (catch 
propto (Effort)^(b+1), b!=0) is a better description of the data.  (In 
this case it is not, although the confidence intervals on b are very 
wide, indicating that we have very little information -- this is not 
surprising since the proportional range of effort is very small 
(246-258) in this data set.)


  In general you should *not* check overdispersion of the raw data 
(i.e., the *marginal distribution* of the data, you should check 
overdispersion of a fitted (e.g. Poisson) model, as below.


  cheers
   Ben Bolker


edata <- data.frame(Catch, Effort, xx1, xx2, xx3)

## graphical exploration

library(ggplot2); theme_set(theme_bw())
library(tidyr)
edata_long <- edata |> pivot_longer(names_to="var", cols =-c("Catch", 
"Effort"))

ggplot(edata_long, aes(value, Catch)) +
geom_point(alpha = 0.2, aes(size = Effort)) +
facet_wrap(~var, scale="free_x") +
geom_smooth(method = "glm", method.args = list(family = 
"quasipoisson"))

#

library(MASS)
g1 <- glm.nb(Catch~xx1+xx2+xx3+offset(log(Effort)), data=edata)
g2 <- update(g1, . ~ . + log(Effort))
g0 <- glm(Catch~xx1+xx2+xx3+offset(log(Effort)), data=edata,
  family = poisson)
performance::check_overdispersion(g0)
summary(g1)
summary(g2)
options(digits = 3)
confint(g2)
summary(g1)



On 2023-10-28 3:30 a.m., 유준택 wrote:

Colleagues,



I have a dataset that includes five variables.

- Catch: the catch number counted in some species (ind.)

- Effort: fishing effort (the number of fishing vessels)

- xx1, xx2, xx3: some environmental factors

As an overdispersion test on the “Catch” variable, I modeled with negative
binomial distribution using a GLM. The “Effort” variable showed a gradually
decreasing trend during the study period. I was able to get the results I
wanted when considered “Effort” function as a weights function in the
negative binomial regression as follows:



library(qcc)

Catch=c(25,2,7,6,75,5,1,4,66,15,9,25,40,8,7,4,36,11,1,14,141,9,74,38,126,3)

Effort=c(258,258,258,258,258,258,258,254,252,252,252,252,252,252,252,252,252,252,252,248,246,246,246,246,246,246)

xx1=c(0.8,0.5,1.2,0.5,1.1,1.1,1.0,0.6,0.9,0.5,1.2,0.6,1.2,0.7,1.0,0.6,1.6,0.7,0.8,0.6,1.7,0.9,1.1,0.5,1.4,0.5)

xx2=c(1.7,1.6,2.7,2.6,1.5,1.5,2.8,2.5,1.7,1.9,2.2,2.4,1.6,1.4,3.0,2.4,1.4,1.5,2.2,2.3,1.7,1.7,1.9,1.9,1.4,1.4)

xx3=c(188,40,2,10,210,102,117,14,141,28,48,15,220,115,10,14,320,20,3,10,400,150,145,160,460,66)

#

edata <- data.frame(Catch, Effort, xx1, xx2, xx3)

#

qcc.overdispersion.test(edata$Catch, type="poisson")

#

summary(glm.nb(Catch~xx1+xx2+xx3, weights=Effort, data=edata))

summary(glm.nb(Catch~xx1+xx2+xx3+offset(log(Effort)), data=edata))



I am not sure the application of the weights function to the negative
binomial regression is correct. Also I wonder if there is a better way
doing this. Can anyone help?

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.




Re: [R] Need help to resolve the Error: evaluation nested too deeply: infinite recursion / options(expressions=)? Execution halted

2023-10-26 Thread Ben Bolker
  Hmm, I can't replicate (i.e., it works fine for me).  What are the 
results of your sessionInfo() (from a *clean* R session)?


==
R Under development (unstable) (2023-10-25 r85410)
Platform: x86_64-pc-linux-gnu
Running under: Pop!_OS 22.04 LTS

Matrix products: default
BLAS:   /usr/local/lib/R/lib/libRblas.so
LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/liblapack.so.3; 
LAPACK version 3.10.0




attached base packages:
[1] stats graphics  grDevices datasets  utils methods   base

loaded via a namespace (and not attached):
 [1] desc_1.4.2        R6_2.5.1          bspm_0.5.3        remotes_2.4.2.1
 [5] ps_1.7.5          cli_3.6.1         processx_3.8.2    callr_3.7.3
 [9] compiler_4.4.0    prettyunits_1.2.0 rprojroot_2.0.3   tools_4.4.0
[13] pkgbuild_1.4.2    curl_5.0.1        crayon_1.5.2


On 2023-10-26 9:59 a.m., siddharth sahasrabudhe via R-help wrote:

Hello Colleagues,

I am trying to install a package from its GitHub repository using the
*remotes* package, via *remotes::install_github("dcl-docs/dcldata")*.

However, I am getting the following error message. I have absolutely no
idea what this error message means or how to get around it. Can anyone
please suggest a way out?

Here is the error message that i am getting:


> remotes::install_github("dcl-docs/dcldata")
Downloading GitHub repo dcl-docs/dcldata@HEAD
Running `R CMD build`...
* checking for file 'C:\Users\admin\AppData\Local\Temp\RtmpWmVvZI\remotes2ddc1b982b8\dcl-docs-dcldata-0a08cbb/DESCRIPTION' ... OK
* preparing 'dcldata':
* checking DESCRIPTION meta-information ... OK
* checking for LF line-endings in source and make files and shell scripts
* checking for empty or unneeded directories
* building 'dcldata_0.1.2.9000.tar.gz'
Installing package into ‘C:/Users/admin/AppData/Local/R/win-library/4.3’
(as ‘lib’ is unspecified)
Error: evaluation nested too deeply: infinite recursion / options(expressions=)?
Execution halted
Warning: installation of package
‘C:/Users/admin/AppData/Local/Temp/RtmpWmVvZI/file2ddc7bed1881/dcldata_0.1.2.9000.tar.gz’
had non-zero exit status


Many thanks for the prompt help as always!

Regards
Siddharth
--

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.




Re: [R] running crossvalidation many times MSE for Lasso regression

2023-10-23 Thread Ben Bolker
  For what it's worth it looks like spm2 is specifically for *spatial* 
predictive modeling; presumably its version of CV is doing something 
spatially aware.


  I agree that glmnet is old and reliable.  One might want to use a 
tidymodels wrapper to create pipelines where you can more easily switch 
among predictive algorithms (see the `parsnip` package), but otherwise 
sticking to glmnet seems wise.
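  For reference, the usual glmnet workflow looks like the sketch below. 
This is a minimal illustration with simulated placeholder data (not the 
poster's dataset); cv.glmnet() does the k-fold cross-validation itself, 
which avoids the manual resampling loop in the quoted code.

```r
## Minimal sketch: k-fold CV for the lasso via glmnet's own cv.glmnet()
## (simulated placeholder data -- substitute your own x matrix and y vector)
library(glmnet)
set.seed(1)
x <- matrix(rnorm(100 * 2), ncol = 2)  # predictor matrix (n x p)
y <- rnorm(100)                        # response vector
cvfit <- cv.glmnet(x, y, alpha = 1)    # alpha = 1 => lasso penalty
cvfit$lambda.min                       # lambda minimizing the CV error
## MSE at the CV-chosen lambda:
mean((y - predict(cvfit, newx = x, s = "lambda.min"))^2)
```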


On 2023-10-23 4:38 a.m., Martin Maechler wrote:

Jin Li on Mon, 23 Oct 2023 15:42:14 +1100 writes:


 > If you are interested in other validation methods (e.g., LOO or n-fold)
 > with more predictive accuracy measures, the function, glmnetcv, in the 
spm2
 > package can be directly used, and some reproducible examples are
 > also available in ?glmnetcv.

... and once you open that can of w..:   the  glmnet package itself
contains a function  cv.glmnet()  which we (our students) use when teaching.

What's the advantage of the spm2 package ?
At least, the glmnet package is authored by the same who originated and
first published (as in "peer reviewed" ..) these algorithms.



 > On Mon, Oct 23, 2023 at 10:59 AM Duncan Murdoch 

 > wrote:

 >> On 22/10/2023 7:01 p.m., Bert Gunter wrote:
 >> > No error message shown Please include the error message so that it is
 >> > not necessary to rerun your code. This might enable someone to see the
 >> > problem without running the code (e.g. downloading packages, etc.)
 >>
 >> And it's not necessarily true that someone else would see the same error
 >> message.
 >>
 >> Duncan Murdoch
 >>
 >> >
 >> > -- Bert
 >> >
 >> > On Sun, Oct 22, 2023 at 1:36 PM varin sacha via R-help
 >> >  wrote:
 >> >>
 >> >> Dear R-experts,
 >> >>
 >> >> Here below my R code with an error message. Can somebody help me to 
fix
 >> this error?
 >> >> Really appreciate your help.
 >> >>
 >> >> Best,
 >> >>
 >> >> 
 >> >> # MSE CROSSVALIDATION Lasso regression
 >> >>
 >> >> library(glmnet)
 >> >>
 >> >>
 >> >>
 >> 
x1=c(34,35,12,13,15,37,65,45,47,67,87,45,46,39,87,98,67,51,10,30,65,34,57,68,98,86,45,65,34,78,98,123,202,231,154,21,34,26,56,78,99,83,46,58,91)
 >> >>
 >> 
x2=c(1,3,2,4,5,6,7,3,8,9,10,11,12,1,3,4,2,3,4,5,4,6,8,7,9,4,3,6,7,9,8,4,7,6,1,3,2,5,6,8,7,1,1,2,9)
 >> >>
 >> 
y=c(2,6,5,4,6,7,8,10,11,2,3,1,3,5,4,6,5,3.4,5.6,-2.4,-5.4,5,3,6,5,-3,-5,3,2,-1,-8,5,8,6,9,4,5,-3,-7,-9,-9,8,7,1,2)
 >> >> T=data.frame(y,x1,x2)
 >> >>
 >> >> z=matrix(c(x1,x2), ncol=2)
 >> >> cv_model=glmnet(z,y,alpha=1)
 >> >> best_lambda=cv_model$lambda.min
 >> >> best_lambda
 >> >>
 >> >>
 >> >> # Create a list to store the results
 >> >> lst<-list()
 >> >>
 >> >> # This statement does the repetitions (looping)
 >> >> for(i in 1 :1000) {
 >> >>
 >> >> n=45
 >> >>
 >> >> p=0.667
 >> >>
 >> >> sam=sample(1 :n,floor(p*n),replace=FALSE)
 >> >>
 >> >> Training =T [sam,]
 >> >> Testing = T [-sam,]
 >> >>
 >> >> test1=matrix(c(Testing$x1,Testing$x2),ncol=2)
 >> >>
 >> >> predictLasso=predict(cv_model, newx=test1)
 >> >>
 >> >>
 >> >> ypred=predict(predictLasso,newdata=test1)
 >> >> y=T[-sam,]$y
 >> >>
 >> >> MSE = mean((y-ypred)^2)
 >> >> MSE
 >> >> lst[i]<-MSE
 >> >> }
 >> >> mean(unlist(lst))
 >> >> ##
 >> >>
 >> >>
 >> >>
 >> >>
 >> >> __
 >> >> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
 >> >> https://stat.ethz.ch/mailman/listinfo/r-help
 >> >> PLEASE do read the posting guide
 >> http://www.R-project.org/posting-guide.html
 >> >> and provide commented, minimal, self-contained, reproducible code.
 >> >
 >> > __
 >> > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
 >> > https://stat.ethz.ch/mailman/listinfo/r-help
 >> > PLEASE do read the posting guide
 >> http://www.R-project.org/posting-guide.html
 >> > and provide commented, minimal, self-contained, reproducible code.
 >>
 >> __
 >> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
 >> https://stat.ethz.ch/mailman/listinfo/r-help
 >> PLEASE do read the posting guide
 >> http://www.R-project.org/posting-guide.html
 >> and provide commented, minimal, self-contained, reproducible code.
 >>


 > --
 > Jin
 > --
 > Jin Li, PhD
 > Founder, Data2action, Australia
 > https://www.researchgate.net/profile/Jin_Li32
 > https://scholar.google.com/citations?user=Jeot53EJ=en

 > [[alternative 

Re: [R] [Tagged] Re: Fwd: r-stats: Geometric Distribution

2023-10-19 Thread Ben Bolker
  Jeff: I think you might be misdiagnosing the OP's problem; I'm not 
sure that the R parameterization is primarily intended to avoid FP 
problems, but rather is *one* possible sensible choice of definition. 
Note that the Wikipedia page on the geometric distribution: 
https://en.wikipedia.org/wiki/Geometric_distribution (obviously not an 
authoritative reference, but Wikipedia is generally very good for these 
things) gives *both* the starting-from-zero and starting-from-1 definitions.


  Sahil: this is not an error in R, it's simply a different (and 
equally sensible) choice of definition from yours. You can certainly 
write your code to do `dgeom(x-1, p)`: in fact, you could write a wrapper


my_dgeom <- function(x, ...) dgeom(x-1, ...)

so that you never have to think about it again ...
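A quick numerical check of the two conventions (base R only; the values 
match the ones quoted below):

```r
## dgeom(x, p) gives P(x failures before the first success);
## the "first success on trial x" convention is dgeom(x - 1, p)
p <- 0.2
my_dgeom <- function(x, ...) dgeom(x - 1, ...)             # wrapper from above
stopifnot(all.equal(my_dgeom(5, p), p * (1 - p)^(5 - 1)))  # 0.08192
stopifnot(all.equal(dgeom(5, p), p * (1 - p)^5))           # 0.065536
```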

  cheers
    Ben Bolker


On 2023-10-19 3:20 a.m., Jeff Newmiller via R-help wrote:

What makes sense in a math class is not necessarily the same as what makes 
sense in a floating point analysis environment.

You lose a lot of significant digits when you add 1 to a floating point number 
that is close to zero, and this implementation allows the user to avoid that 
structural deficiency inherent in your preferred formulation.

This is a case where you need to read the documentation and adapt your handling 
of numbers to get the most accurate results.

Or write your own version that destroys significant digits.

There are other functions that allow for similar maintenance of significant 
digits... like log1p and expm1. See i.e. 
https://en.m.wikipedia.org/wiki/Natural_logarithm#lnp1 for discussion.

On October 18, 2023 10:44:21 PM PDT, Sahil Sharma 
 wrote:

Hi, today I came across the same problem. And, I'm able to explain it with
an example as well.

Suppose I want to PDF or P(X=5) in Geometric Distribution with P = 0.2.

The theoretical formula is P * (1-P) ^ (x -1). But the R function dgeom(x,
p) works like P * (1-P) ^ x, it does not reduce 1 from x because in r the x
starts from 0. In that case, if I am writing x as 5 then in-principle it
should work like x = 4 because starting from zero, 4 is the 5th place of x.
E.g., 0,1,2,3,4 there are five digits.

However, the x in dgeom(x,p) is exactly working like 5.

Here are some codes that I used:


dgeom(5, 0.2)

[1] 0.065536

If I use the formula manually, i.e., p(1-P)^x-1, I get this.


0.2 * (1-0.2)^(5-1)

[1] 0.08192

Even if x starts from 0 in r, that's why we do not minus 1 from x, it
should work like 4 when I'm writing 5, but not, it is working exactly 5.
For example, if I manually put the 5 at the place of X, I get same results
as dgeom(x,p).


0.2 * (1-0.2)^(5)

[1] 0.065536



I guess there is a need for solution to this problem otherwise, it may
result in erroneous calculations. Either the function dgeom(x,p) can
perform and result as per the theoretical definition of PDF in Geometric
Distribution, or the user applying this function must be prompted about the
nature of this function so that the user manually minus one from x and then
enter it into the function dgeom(x,p).

Thanks, and Regards
Sahil





On Tue, Oct 17, 2023 at 6:39 PM Ivan Krylov  wrote:


On Tue, 17 Oct 2023 12:12:05 +0530,
Sahil Sharma wrote:


The original formula for Geometric Distribution PDF is
*((1-p)^x-1)*P*. However, the current r function *dgeom(x, p)* is
doing this: *((1-p)^x)*P, *it is not reducing 1 from x.


Your definition is valid for integer 'x' starting from 1. ('x'th trial
is the first success.)

The definition in help(dgeom):


p(x) = p (1-p)^x
for x = 0, 1, 2, ..., 0 < p <= 1.


...is valid for integer x starting from 0. ('x' failures until the
first success.)

They are equivalent, but they use the name 'x' for two subtly different
things.

Thank you for giving attention to this and best of luck in your future
research!

--
Best regards,
Ivan



[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.






Re: [R] Yamamoto test in BreakPoints package

2023-10-19 Thread Ben Bolker
  Maybe contact the package maintainer (maintainer("BreakPoints")) and 
ask?  (Normally I avoid bugging package maintainers if I can, but it 
seems you've looked everywhere else you can ...)


  Ben Bolker

On 2023-10-19 4:18 a.m., Richard O'Keefe wrote:

Visit the page at CRAN
https://cran.r-project.org/web/packages/BreakPoints/index.html
and download

BreakPoints_1.2.tar.gz
<https://cran.r-project.org/src/contrib/BreakPoints_1.2.tar.gz>
and you will find yamamoto.R in there.  Sadly, there are no
useful comments in there.

I tried example(yamamoto), and out of 4 actual break-points,
it found 5 of them.  In another test, modelled on that
example, it found 6 out of 3 actual breaks.

On Thu, 19 Oct 2023 at 20:30, Nick Wray  wrote:


Hello I’m not sure whether this strictly speaking counts as an R-help query
but anyway…  I have been using the Yamamoto test in the BreakPoints package
to find breakpoints in flow data for Scottish rivers.  However, I can’t
really just use the Yamamoto test as a “black box” ie data in, data out --
I need to find the actual algorithm which the Yamamoto test uses, either in
algebraic form or as R code, but despite exhaustive searching I can’t find
it.  I’ve tried to find a Github repository and various other things but
nothing comes up to give detailed information about the Yamamoto test as in
the package.  The original 1985 paper

  Climatic Jump: A Hypothesis in Climate Diagnosis

by Ryozaburo Yamamoto, Tatsuya Iwashima, Sanga-Ngoie Kazadi, and Makoto
Hoshiai, which is cited on the CRAN page for BreakPoints: Identify
Breakpoints in Series of Data
(https://cran.r-project.org/web/packages/BreakPoints/BreakPoints.pdf),
doesn't give any details.


There is another paper by Yamamoto et al (1987) Proc. NIPR Symp. Polar
Meteorol. Glaciol., 1, 91-102, 1987 but the method is not very clear and
whether it’s actually what the package does I can’t tell.

There is info about a Toda-Yamamoto causality test but this doesn’t seem to
be the same thing as the Yamamoto test in the R package BreakPoints

If anyone can point me to where either an algebraic algorithm or the R code
is I’d be v grateful

Thanks Nick Wray

 [[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.




Re: [R] Best way to test for numeric digits?

2023-10-18 Thread Ben Bolker

   There are some answers on Stack Overflow:

https://stackoverflow.com/questions/14984989/how-to-avoid-warning-when-introducing-nas-by-coercion
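
  One warning-free alternative from those answers is to test with a 
regular expression before coercing (a sketch; the pattern below only 
matches plain unsigned decimals, so extend it if you need signs or 
exponents):

```r
## Detect numeric-looking strings first, coerce only those -- no warning
x <- c("Li", "Na", "K",  "2", "Rb", "Ca", "3")
is_num <- grepl("^[0-9]+([.][0-9]+)?$", x)  # TRUE for plain decimal numbers
as.double(ifelse(is_num, x, NA))            # NA NA NA 2 NA NA 3
```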



On 2023-10-18 10:59 a.m., Leonard Mada via R-help wrote:

Dear List members,

What is the best way to test for numeric digits?

suppressWarnings(as.double(c("Li", "Na", "K",  "2", "Rb", "Ca", "3")))
# [1] NA NA NA  2 NA NA  3
The above requires the use of the suppressWarnings function. Are there 
any better ways?


I was working to extract chemical elements from a formula, something 
like this:

split.symbol.character = function(x, rm.digits = TRUE) {
    # Perl is partly broken in R 4.3, but this works:
    regex = "(?<=[A-Z])(?![a-z]|$)|(?<=.)(?=[A-Z])|(?<=[a-z])(?=[^a-z])";
    # stringi::stri_split(x, regex = regex);
    s = strsplit(x, regex, perl = TRUE);
    if(rm.digits) {
        s = lapply(s, function(s) {
            isNotD = is.na(suppressWarnings(as.numeric(s)));
            s = s[isNotD];
        });
    }
    return(s);
}

split.symbol.character(c("CCl3F", "Li4Al4H16", "CCl2CO2AlPO4SiO4Cl"))


Sincerely,


Leonard


Note:
# works:
regex = "(?<=[A-Z])(?![a-z]|$)|(?<=.)(?=[A-Z])|(?<=[a-z])(?=[^a-z])";
strsplit(c("CCl3F", "Li4Al4H16", "CCl2CO2AlPO4SiO4Cl"), regex, perl = T)


# broken in R 4.3.1
# only slightly "erroneous" with stringi::stri_split
regex = "(?<=[A-Z])(?![a-z]|$)|(?=[A-Z])|(?<=[a-z])(?=[^a-z])";
strsplit(c("CCl3F", "Li4Al4H16", "CCl2CO2AlPO4SiO4Cl"), regex, perl = T)

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide 
http://www.R-project.org/posting-guide.html

and provide commented, minimal, self-contained, reproducible code.




Re: [R] if-else that returns vector

2023-10-12 Thread Ben Bolker

 how about

if(T) c(1,2,3) else c(5,6)

?
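
In R, `if` is an expression whose value can be assigned directly, so the 
one-liner above returns the whole vector (unlike the vectorized `ifelse()`, 
which recycles its result to the length of the test):

```r
## if () ... else evaluates to one branch in full
x <- if (TRUE) c(1, 2, 3) else c(5, 6)
x                       # 1 2 3 -- the entire vector
ifelse(TRUE, c(1, 2, 3), c(5, 6))  # 1 -- only the first element
```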

On 2023-10-12 4:22 p.m., Christofer Bogaso wrote:

Hi,

Following expression returns only the first element

ifelse(T, c(1,2,3), c(5,6))

However I am looking for some one-liner expression like above which
will return the entire vector.

Is there any way to achieve this?

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.




Re: [R] Linear discriminant analysis

2023-10-12 Thread Ben Bolker

  It's possible that neither of these will help, but

(1) you can look at the source code of the predict method 
(MASS:::predict.lda)


(2) you can look at the source reference ("Modern Applied Statistics in 
S", Venables and Ripley) to see if it gives more information (although 
it might not); there's a chance that you can get the information you 
need via a google books search
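
  As a sketch of what reading MASS:::predict.lda reveals (illustrated on 
the built-in iris data; the centering step below matches my reading of 
the source, so verify it against your MASS version): the discriminant 
scores are a linear projection of the data, centered on the prior-weighted 
average of the group means, onto the columns of fit$scaling.

```r
## Reproduce the LDA discriminant scores by hand (sketch, iris example)
library(MASS)
fit <- lda(Species ~ ., data = iris)
X   <- as.matrix(iris[, 1:4])
## center on the prior-weighted average of the group means, then project
ctr    <- colSums(fit$prior * fit$means)
scores <- scale(X, center = ctr, scale = FALSE) %*% fit$scaling
all.equal(unname(scores), unname(predict(fit)$x))  # should be TRUE
```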


On 2023-10-12 10:25 a.m., Fernando Archuby wrote:

Hi.
I have successfully performed the discriminant analysis with the lda
function, I can classify new individuals with the predict function, but I
cannot figure out how the lda results translate into the classification
decision. That is, I don't realize how the classification equation for new
individuals is constructed from the lda output. I want to understand it but
also, I need to communicate it and provide a mechanism for other colleagues
to make classifications with their data.
Thank you very much,
Fernando



__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] R issue / No buffer space available

2023-10-05 Thread Ben Bolker
This looks like an RStudio issue; it might be better to post the question
on their forum

On Thu, Oct 5, 2023, 7:43 AM Ohad Oren, MD  wrote:

> Hello,
>
> I keep getting the following message about 'no buffer space available'. I
> am using R studio via connection to server. I verified that the connection
> to the server is good.
>
> 2023-10-04T20:26:25.698193Z [rsession-oo968] ERROR system error 105
> (No buffer space available) [host: localhost, uri: /log_message, path:
> /var/run/rstudio-server/rstudio-rserver/rserver-monitor.socket];
> OCCURRED AT void
> rstudio::core::http::LocalStreamAsyncClient::handleConnect(const
> rstudio_boost::system::error_code&)
> src/cpp/session/SessionModuleContext.cpp:124
>
>
> Will appreciate your help!
>
> Ohad
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] download.file strict certificate revocation check

2023-10-04 Thread Ben Bolker



  This is sad news indeed.

https://cran.r-project.org/web/checks/check_summary_by_maintainer.html

  lists Jim as the maintainer of clinsig, crank, eventInterval, 
plotrix, and prettyR.


> library(packageRank)
> packageRank(c("clinsig", "crank", "eventInterval", "plotrix", "prettyR"))
        date      packages downloads             rank percentile
1 2023-10-03       clinsig         1 14,454 of 18,024        0.0
2 2023-10-03         crank         2 11,344 of 18,024       22.8
3 2023-10-03 eventInterval         4  8,001 of 18,024       51.0
4 2023-10-03       plotrix     3,082    310 of 18,024       98.3
5 2023-10-03       prettyR        90  1,954 of 18,024       89.1

It seems that at least plotrix and prettyR would be worth rescuing ... 
volunteers ... ? (prettyR has 1 strong reverse dep, plotrix has many ...)


  Ben Bolker


On 2023-10-04 6:30 p.m., Jim Lemon wrote:

Hello,
I am very sad to let you know that my husband Jim died in 18th September. I
apologise for not letting you know earlier but I had trouble finding the
password for his phone.
Kind regards,
Juel Briggs

On Thu, 5 Oct 2023, 02:07 Ivan Krylov 
On Wed, 4 Oct 2023 14:32:49 +
John Neset wrote:


No change with True or False with R_LIBCURL_SSL_REVOKE_BEST_EFFORT


Judging by the screenshot, it looks like you've set an R variable
R_LIBCURL_SSL_REVOKE_BEST_EFFORT instead of setting an environment
variable using Sys.setenv:

  Sys.setenv('R_LIBCURL_SSL_REVOKE_BEST_EFFORT' = 'TRUE')

(Use Sys.getenv to verify the result.)

For the next time, most people on the R-help mailing list would
probably appreciate it if you copied and pasted the text from the R
console instead of attaching screenshots.

--
Best regards,
Ivan

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.




Re: [R] Question about R software and output

2023-10-03 Thread Ben Bolker
   It's conceivable that functions in a contributed package would 
communicate/transmit or receive data from a remote server, but base R 
does nothing like that (unless you explicitly ask it to).


  cheers
   Ben Bolker

On 2023-10-02 3:48 a.m., Ferguson Charity (CEMINFERGUSON) wrote:

To whom it may concern,



My understanding is that the R software is downloaded from the CRAN network and
data is imported into it from, for example, Microsoft Excel. Could I please just
double-check whether any data or results from the output are held on external
servers, or are they held only in local files on the computer?



Many thanks,



Charity







Re: [R] replace character by numeric value

2023-09-29 Thread Ben Bolker
  The reason you're getting the result as character is that you have 
'side' as your alternative result in the second ifelse().  If "BUY" and 
"SELL" are the only options you might try


   ifelse(side == 'BUY', 1, ifelse(side == 'SELL', -1, NA))

or

   c(1,-1)[match(side, c("BUY", "SELL"))]

or

vals <- c(BUY=1, SELL = -1)
vals[side]
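As a quick sketch of why any of the three is preferable (base R only; `side` here is a toy vector rather than the poster's data), all three give the same numeric result:

```r
side <- c("BUY", "SELL", "BUY")

v1 <- ifelse(side == "BUY", 1, ifelse(side == "SELL", -1, NA))
v2 <- c(1, -1)[match(side, c("BUY", "SELL"))]

vals <- c(BUY = 1, SELL = -1)
v3 <- unname(vals[side])  # unname() drops the "BUY"/"SELL" names

stopifnot(is.numeric(v1), identical(v1, v2), identical(v2, v3))
v1
# [1]  1 -1  1
```

The match()/lookup-vector versions also degrade gracefully if a third value ever appears in side: unmatched entries become NA automatically instead of surviving as character.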


On 2023-09-29 9:21 a.m., Ebert,Timothy Aaron wrote:

Does this work?
mynewdf$side <- as.numeric(mynewdf$side)

This code would be the next line after your mutate.

Tim

-Original Message-
From: R-help  On Behalf Of Enrico Schumann
Sent: Thursday, September 28, 2023 3:13 AM
To: arnaud gaboury 
Cc: r-help 
Subject: Re: [R] replace character by numeric value

[External Email]

On Wed, 27 Sep 2023, arnaud gaboury writes:


I have two data.frames:

mydf1 <- structure(list(symbol = "ETHUSDT", cummulative_quote_qty =
1999.9122, side = "BUY", time = structure(1695656875.805, tzone = "",
class = c("POSIXct", "POSIXt"))), row.names = c(NA, -1L), class =
c("data.table",
"data.frame"))

mydf2 <- structure(list(symbol = c("ETHUSDT", "ETHUSDT", "ETHUSDT"),
cummulative_quote_qty = c(1999.119408, 0, 2999.890985), side =
c("SELL", "BUY", "BUY"), time = structure(c(1695712848.487,
1695744226.993, 1695744509.082), class = c("POSIXct", "POSIXt"
), tzone = "")), row.names = c(NA, -3L), class = c("data.table",
"data.frame"))

I use this line to replace 'BUY' by numeric 1 and 'SELL' by numeric -1
in
mydf1 and mydf2:
mynewdf <- mydf |> dplyr::mutate(side = ifelse(side == 'BUY', 1,
ifelse(side == 'SELL', -1, side)))

This does the job but I am left with an issue: 1 and -1 are characters for
mynewdf2, whereas they are numeric for mynewdf1. The result I am expecting
is numeric values in both cases.
I can't solve this issue (using as.numeric(1) doesn't work) and don't
understand why I am left with num for mynewdf1 and characters for mynewdf2.


mynewdf1 <- mydf1 |> dplyr::mutate(side = ifelse(side == 'BUY', 1,

ifelse(side == 'SELL', -1, side)))

str(mynewdf1)

Classes 'data.table' and 'data.frame': 1 obs. of  4 variables:
  $ symbol   : chr "ETHUSDT"
  $ cummulative_quote_qty: num 2000
  $ side : num 1  <<<--
  $ time : POSIXct, format: "2023-09-25 17:47:55"
  - attr(*, ".internal.selfref")=


mynewdf2 <- mydf2 |> dplyr::mutate(side = ifelse(side == 'BUY', 1,

ifelse(side == 'SELL', -1, side)))

  str(mynewdf2)

Classes 'data.table' and 'data.frame': 3 obs. of  4 variables:
  $ symbol   : chr  "ETHUSDT" "ETHUSDT" "ETHUSDT"
  $ cummulative_quote_qty: num  1999 0 3000
  $ side : chr  "-1" "1" "1"   <<<--
  $ time : POSIXct, format: "2023-09-26 09:20:48"
"2023-09-26 18:03:46" "2023-09-26 18:08:29"
  - attr(*, ".internal.selfref")=

Thank you for help



I'd use something like this:

 map <- c(BUY = 1, SELL = -1)
 mydf1$side <- map[mydf1$side]
 str(mydf1)
 ## Classes 'data.table' and 'data.frame':   1 obs. of  4 variables:
 ##  $ symbol   : chr "ETHUSDT"
 ##  $ cummulative_quote_qty: num 2000
 ##  $ side : num 1

 mydf2$side <- map[mydf2$side]
 str(mydf2)
 ## Classes 'data.table' and 'data.frame':   3 obs. of  4 variables:
 ##  $ symbol   : chr  "ETHUSDT" "ETHUSDT" "ETHUSDT"
 ##  $ cummulative_quote_qty: num  1999 0 3000
 ##  $ side : num  -1 1 1
 ##  $ time : POSIXct, format: "2023-09-26 09:20:48" ...



--
Enrico Schumann
Lucerne, Switzerland
http://enricoschumann.net/



--
Dr. Benjamin Bolker
Professor, Mathematics & Statistics and Biology, McMaster University
Director, School of Computational Science and Engineering
(Acting) Graduate chair, Mathematics & Statistics
> E-mail is sent at my convenience; I don't expect replies outside of 
working hours.




Re: [R] Odd result

2023-09-24 Thread Ben Bolker
  For what it's worth the janitor::remove_empty() (which removes all-NA 
rows by default, can be set to remove columns instead) can be useful for 
this kind of cleanup.
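If installing janitor is not convenient, a base-R equivalent for dropping all-NA rows is a one-liner (a sketch on a toy data frame, not the poster's actual file):

```r
dd <- data.frame(date = c("5/31/22", NA, "6/28/22", NA),
                 flow = c(16, NA, 0, NA))

# keep only rows with at least one non-NA value
clean <- dd[rowSums(!is.na(dd)) > 0, ]
nrow(clean)
# [1] 2
```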


On 2023-09-24 5:58 a.m., Michael Dewey wrote:

Dear David

To get the first 46 rows just do KurtzData[1:46,]

However really you want to find out why it happened. It looks as though 
the .csv file you read has lots of blank lines at the end. I would open 
it in an editor to check that.


Michael

On 23/09/2023 23:55, Parkhurst, David wrote:
With help from several people, I used file.choose() to get my file 
name, and read.csv() to read in the file as KurtzData.  Then when I 
print KurtzData, the last several lines look like this:

39   5/31/22  16.0  341    1.75525 0.0201 0.0214   7.00
40   6/28/22  2:00 PM  0.0  215    0.67950 0.0156 0.0294 NA
41   7/25/22 11:00 AM  11.9   1943.5    NA NA 0.0500   7.80
42   8/31/22  0    220.5    NA NA 0.0700  30.50
43   9/28/22  0.067 10.9    NA NA 0.0700  10.20
44  10/26/22  0.086  237    NA NA 0.1550  45.00
45   1/12/23  1:00 PM 36.26    24196    NA NA 0.7500 283.50
46   2/14/23  1:00 PM 20.71   55    NA NA 0.0500   2.40
47  NA NA NA NA
48  NA NA NA NA
49  NA NA NA NA

Then the NAs go down to one numbered 973.  Where did those extras 
likely come from, and how do I get rid of them?  I assume I need to 
get rid of all the lines after #46,  to do calculations and graphics, no?


David







Re: [R] Theta from negative binomial regression and power_NegativeBinomiial from PASSED

2023-09-15 Thread Ben Bolker
   Yes, theta is the 'size' or overdispersion parameter.  Sometimes 
also denoted as k. Wikipedia discusses this parameterization in the 
paragraph starting "In negative binomial regression ..." (but they call 
this parameter r rather than theta or k).


  You can also see this in MASS on google books:

https://www.google.ca/books/edition/Modern_Applied_Statistics_with_S/CzwmBQAAQBAJ?hl=en=1=venables+ripley+negative+binomial=PA206=frontcover

  This parameterization was added to R in version 1.3.0 ...



On 2023-09-15 2:27 a.m., Ivan Krylov wrote:

On Fri, 15 Sep 2023 01:51:27 +
"Sorkin, John"  wrote:


What is theta, and how does it relate to the parameters of the
negative binomial distribution?


Plugging the p (the success probability) and the r (the number of
successes until the experiment is stopped) from the Wikipedia article
(where they are defined in terms of mean mu and variance sigma^2)
together with the variance from ?MASS::rnegbin (where it's defined as
mu + mu^2/theta) into Maxima and then solving for theta, I get:

solve(
  [
   p = mu / sigma^2,
   r = mu^2/(sigma^2-mu),
   sigma^2 = mu + mu^2/theta
  ],
  [mu, sigma, theta]
);
[
  mu = ((1-p)*r)/p,
  sigma = sqrt(r-p*r)/p,
  theta = r
]

That is, the theta from MASS seems to be equivalent to the number of
successes from the formulation in the Wikipedia article.
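This can be checked numerically with base R's dnbinom(), which offers the same size/mu parameterization as MASS (size playing the role of theta); the mean and variance recovered from the probabilities should be mu and mu + mu^2/theta:

```r
mu <- 5
theta <- 2
x <- 0:2000                      # support truncated far into the tail
p <- dnbinom(x, size = theta, mu = mu)
m <- sum(x * p)                  # numerical mean
v <- sum((x - m)^2 * p)          # numerical variance
all.equal(c(m, v), c(mu, mu + mu^2 / theta), tolerance = 1e-8)
# [1] TRUE
```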



--
Dr. Benjamin Bolker
Professor, Mathematics & Statistics and Biology, McMaster University
Director, School of Computational Science and Engineering
(Acting) Graduate chair, Mathematics & Statistics
> E-mail is sent at my convenience; I don't expect replies outside of 
working hours.




Re: [R] graph in R with grouping letters from the turkey test with agricolae package

2023-09-13 Thread Ben Bolker
  As a side note, I'm curious how often "Tukey test" is misspelled as 
"Turkey test".



Googling '"turkey test" mean comparison' gives 36.1K results (vs. 14.3M 
for '"tukey test" mean comparison') ...




On 2023-09-13 10:02 a.m., Richard O'Keefe wrote:

d <- read.table("data.txt", TRUE)
cor(d[, 3:6])

     VAR1 VAR2 VAR3 VAR4
VAR1    1    1    1    1
VAR2    1    1    1    1
VAR3    1    1    1    1
VAR4    1    1    1    1

VAR1 to VAR4 are, up to linear scaling,
exactly the same variable.  Why is that?


On Wed, 13 Sept 2023 at 07:38, Loop Vinyl  wrote:


I would like to produce the attached graph (graph1) with the R package
agricolae, could someone give me an example with the attached data (data)?

I expect an adapted graph (graph2) with the data (data)

Best regards





--
Dr. Benjamin Bolker
Professor, Mathematics & Statistics and Biology, McMaster University
Director, School of Computational Science and Engineering
(Acting) Graduate chair, Mathematics & Statistics
> E-mail is sent at my convenience; I don't expect replies outside of 
working hours.




Re: [R] Query on finding root

2023-08-28 Thread Ben Bolker
(I mean pdavies)

On Mon, Aug 28, 2023, 7:52 AM Ben Bolker  wrote:

> I would probably use the built in qdavies() function...
>
> On Mon, Aug 28, 2023, 7:48 AM Leonard Mada via R-help <
> r-help@r-project.org> wrote:
>
>> Dear R-Users,
>>
>> Just out of curiosity:
>> Which of the 2 methods is the better one?
>>
>> The results seem to differ slightly.
>>
>>
>> fun = function(u){((26104.50*u^0.03399381)/((1-u)^0.107)) - 28353.7}
>>
>> uniroot(fun, c(0,1))
>> # 0.6048184
>>
>> curve(fun(x), 0, 1)
>> abline(v=0.3952365, col="red")
>> abline(v=0.6048184, col="red")
>> abline(h=0, col="blue")
>>
>>
>>
>> fun = function(u){ (0.03399381*log(u) - 0.107*log(1-u)) -
>> log(28353.7/26104.50) }
>> fun = function(u){ (0.03399381*log(u) - 0.107*log1p(-u)) -
>> log(28353.7/26104.50) }
>>
>> uniroot(fun, c(0,1))
>> # 0.6047968
>>
>> curve(fun(x), 0, 1)
>> abline(v=0.3952365, col="red")
>> abline(v=0.6047968, col="red")
>> abline(h=0, col="blue")
>>
>> Sincerely,
>>
>> Leonard
>>
>> __
>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>



Re: [R] Query on finding root

2023-08-28 Thread Ben Bolker
I would probably use the built in qdavies() function...

On Mon, Aug 28, 2023, 7:48 AM Leonard Mada via R-help 
wrote:

> Dear R-Users,
>
> Just out of curiosity:
> Which of the 2 methods is the better one?
>
> The results seem to differ slightly.
>
>
> fun = function(u){((26104.50*u^0.03399381)/((1-u)^0.107)) - 28353.7}
>
> uniroot(fun, c(0,1))
> # 0.6048184
>
> curve(fun(x), 0, 1)
> abline(v=0.3952365, col="red")
> abline(v=0.6048184, col="red")
> abline(h=0, col="blue")
>
>
>
> fun = function(u){ (0.03399381*log(u) - 0.107*log(1-u)) -
> log(28353.7/26104.50) }
> fun = function(u){ (0.03399381*log(u) - 0.107*log1p(-u)) -
> log(28353.7/26104.50) }
>
> uniroot(fun, c(0,1))
> # 0.6047968
>
> curve(fun(x), 0, 1)
> abline(v=0.3952365, col="red")
> abline(v=0.6047968, col="red")
> abline(h=0, col="blue")
>
> Sincerely,
>
> Leonard
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



Re: [R] Query on finding root

2023-08-27 Thread Ben Bolker
   This doesn't look like homework to me -- too specific.  The posting 
guide  says that the list 
is not intended for "Basic statistics and classroom homework" -- again, 
this doesn't seem to fall into that category.


  tl;dr, I think the difference between the two approaches is just 
whether the lower or upper tail is considered (i.e., adding lower.tail = 
FALSE to pdavies(), or more simply taking (1-x), makes the two answers 
agree).


  If you look at the source code for pdavies() you'll see that it's 
essentially doing the same uniroot() calculation that you are.


## Q(u)=(c*u^lamda1)/((1-u)^lamda2)
mean <- 28353.7 # mean calculated from data
lambda1 <- .03399381 # estimates c, lambda1 and lambda2 calculated from data
lambda2 <- .107
c <- 26104.50
library(Davies)# using package
params <- c(c,lambda1,lambda2)
u <- pdavies(x = mean, params = params, lower.tail = FALSE)
u
fun <- function(u) {
with(as.list(params), (c*u^lambda1)/((1-u)^lambda2)) - mean
}
curve(fun, from = 0.01, to = 1)
uniroot <- uniroot(fun,c(0.01,1))
abline(h = 0)
uniroot$root
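One further wrinkle: the small disagreement between the two uniroot() answers quoted elsewhere in these threads (0.6048184 vs. 0.6047968) is largely an artifact of uniroot()'s default tolerance (.Machine$double.eps^0.25, about 1.2e-4); tightening tol reconciles the two formulations (a sketch with the thread's constants):

```r
f1 <- function(u) (26104.50 * u^0.03399381) / ((1 - u)^0.107) - 28353.7
f2 <- function(u) 0.03399381 * log(u) - 0.107 * log1p(-u) - log(28353.7 / 26104.50)

r1 <- uniroot(f1, c(0.01, 0.99), tol = 1e-12)$root
r2 <- uniroot(f2, c(0.01, 0.99), tol = 1e-12)$root
abs(r1 - r2)   # far below the ~2e-4 gap seen with the default tolerance
```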



On 2023-08-27 5:40 p.m., Rolf Turner wrote:


On Fri, 25 Aug 2023 22:17:05 +0530
ASHLIN VARKEY  wrote:


Sir,


Please note that r-help is a mailing list, not a knight!


I want to solve the equation Q(u)=mean, where Q(u) represents the
quantile function. Here my Q(u)=(c*u^lamda1)/((1-u)^lamda2), which is
the quantile function of Davies (Power-pareto) distribution.  Hence I
want to solve , *(c*u^lamda1)/((1-u)^lamda2)=28353.7(Eq.1)*
where lamda1=0.03399381, lamda2=0.107 and c=26104.50. When I used
the package 'Davies' and solved Eq 1, I got the answer u=0.3952365.
But when I use the function  'uniroot' to solve the Eq.1, I got a
different answer which is  u=0.6048157.  Why did this difference
happen?  Which is the correct method to solve Eq.1. Using the value
of *u *from the first method my further calculation was nearer to
empirical values.  The R-code I used is herewith. Kindly help me to
solve this issue.

R-code
Q(u)=(c*u^lamda1)/((1-u)^lamda2)
mean=28353.7 # mean calculated from data
lamda1=.03399381 # estimates c, lamda1 and lamda2 calculated from data
lamda2=.107
c=26104.50
library(Davies)# using package
params=c(c,lamda1,lamda2)
u=pdavies(28353.7,params)
u
fun=function(u){((26104.50*u^0.03399381)/((1-u)^0.107))-28353.7}
uniroot= uniroot(fun,c(0.01,1))
uniroot


As Prof. Nash has pointed out, this looks like homework.

Some general advice:  graphics can be very revealing, and are easy to
effect in R.  Relevant method: plot.function(); relevant utility:
abline().  Look at the help for these.

cheers,

Rolf Turner



--
Dr. Benjamin Bolker
Professor, Mathematics & Statistics and Biology, McMaster University
Director, School of Computational Science and Engineering
(Acting) Graduate chair, Mathematics & Statistics
> E-mail is sent at my convenience; I don't expect replies outside of 
working hours.




Re: [R] Interpreting Results from LOF.test() from qpcR package

2023-08-20 Thread Ben Bolker
  The p-values are non-significant by any standard cutoff (e.g. 
p<=0.05, p<=0.1) but note that this is a *lack-of-fit* test -- i.e., 
"does my function fit the data well enough?", **not** a "significant 
pattern" test (e.g., "does my function fit the data better than a 
reasonable null model?").  In other words, this test tells you that you 
*can't* reject the null hypothesis that the model is "good enough" in 
some sense.


  To test against a constant null model, you could do

nullmod <- nlsr(y ~ const,
data = mod14data2_random,
start = list(const = 0.45))
anova(nlregmod3, nullmod)


  (This question seems to be verging on "general question about 
statistics" rather than "question about R", so maybe better for a venue 
like https://stats.stackexchange.com ??)
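As a sketch of that null-model comparison using base R's nls() instead of nlsr (the data frame is rebuilt from the dput() quoted below; the starting values are taken from solutions reported elsewhere in this digest, so this is an illustration rather than the poster's exact fit):

```r
dd <- structure(list(
  index = c(14L, 27L, 37L, 33L, 34L, 16L, 7L, 1L, 39L, 36L, 40L, 19L, 28L, 38L, 32L),
  y = c(0.44, 0.4, 0.4, 0.4, 0.4, 0.43, 0.46, 0.49, 0.41, 0.41, 0.38, 0.42, 0.41, 0.4, 0.4),
  x = c(16, 24, 32, 30, 30, 16, 12, 8, 36, 32, 36, 20, 26, 34, 28)),
  row.names = c(NA, -15L), class = "data.frame")

# full exponential-decay model, started near the known solution
full <- nls(y ~ theta1 - theta2 * exp(-theta3 * x), data = dd,
            start = list(theta1 = 0.39, theta2 = -0.24, theta3 = 0.11))
# constant-mean null model
null <- nls(y ~ const, data = dd, start = list(const = mean(dd$y)))

anova(null, full)  # F test: does the exponential curve beat a constant mean?
```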


On 2023-08-20 9:01 p.m., Paul Bernal wrote:

I am using LOF.test() function from the qpcR package and got the following
result:


LOF.test(nlregmod3)

$pF
[1] 0.97686

$pLR
[1] 0.77025

Can I conclude from the LOF.test() results that my nonlinear regression
model is significant/statistically significant?

Where my nonlinear model was fitted as follows:
nlregmod3 <- nlsr(formula=y ~ theta1 - theta2*exp(-theta3*x), data =
mod14data2_random,
   start = list(theta1 = 0.37,
theta2 = -exp(-1.8),
theta3 = 0.05538))
And the data used to fit this model is the following:
dput(mod14data2_random)
structure(list(index = c(14L, 27L, 37L, 33L, 34L, 16L, 7L, 1L,
39L, 36L, 40L, 19L, 28L, 38L, 32L), y = c(0.44, 0.4, 0.4, 0.4,
0.4, 0.43, 0.46, 0.49, 0.41, 0.41, 0.38, 0.42, 0.41, 0.4, 0.4
), x = c(16, 24, 32, 30, 30, 16, 12, 8, 36, 32, 36, 20, 26, 34,
28)), row.names = c(NA, -15L), class = "data.frame")

Cheers,
Paul





Re: [R] Issues when trying to fit a nonlinear regression model

2023-08-20 Thread Ben Bolker
  I haven't looked to see whether you or Bert made an algebraic mistake 
in translating the parameters of the log-linear model to their 
equivalents for the nonlinear model, but nlxb() gives me the same answer 
as nls() in this case (I called my data 'dd2'):




n1 <- nlxb(y~theta1 - theta2*exp(-theta3*(x-8)),
 start = list(theta1 = 0.4, theta2 = -0.1, theta3 = 1/5),
 data = dd2)

cc <- coef(n1)
start2 <- with(as.list(cc),
   list(theta1 = theta1,
theta2 = theta2*exp(theta3*8),
theta3 = theta3))
unlist(start2)

n2 <- nlxb(y~theta1 - theta2*exp(-theta3*x),
 start = start2,
 data = dd2)
all.equal(unlist(start2), c(coef(n2)), tolerance = 1e-7)
cc2 <- coef(n2)

nlregmod2 <- nls(y ~ theta1 - theta2*exp(-theta3*x),
  start =
list(theta1 = cc2[["theta1"]],
 theta2 = cc2[["theta2"]],
 theta3 = cc2[["theta3"]]), data=dd2)


On 2023-08-20 2:50 p.m., Paul Bernal wrote:

Dear Bert,

Thank you so much for your kind and valuable feedback. I tried finding the
starting values using the approach you mentioned, then did the following to
fit the nonlinear regression model:
nlregmod2 <- nls(y ~ theta1 - theta2*exp(-theta3*x),
   start =
 list(theta1 = 0.37,
  theta2 = exp(-1.8),
  theta3 = -0.05538), data=mod14data2_random)
However, I got this error:
Error in nls(y ~ theta1 - theta2 * exp(-theta3 * x), start = list(theta1 =
0.37,  :
   step factor 0.000488281 reduced below 'minFactor' of 0.000976562
nlregmod2 <- nlxb(y ~ theta1 - theta2*exp(-theta3*x),
   start =
 list(theta1 = 0.37,
  theta2 = exp(-1.8),
  theta3 = -0.05538), data=mod14data2_random)
summary(nlregmod2)
Object has try-error or missing parameters
nlregmod2
And I get some NA values when retrieving the statistics for the fitted
model:
residual sumsquares =  0.0022973  on  15 observations
after  2235 Jacobian and 2861 function evaluations
  name          coeff        SE   tstat   pval     gradient   JSingval
theta1        9330.89        NA      NA     NA    5.275e-11     967470
theta2        9330.41        NA      NA     NA   -5.318e-11      1.772
theta3    -3.0032e-07        NA      NA     NA    1.389e-05  8.028e-12

Kind regards,
Paul


El dom, 20 ago 2023 a las 13:21, Bert Gunter ()
escribió:


I got starting values as follows:
Noting that the minimum data value is .38, I fit the linear model log(y -
.37) ~ x to get intercept = -1.8 and slope = -.055. So I used .37,
exp(-1.8)  and -.055 as the starting values for theta0, theta1, and theta2
in the nonlinear model. This converged without problems.
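Bert's procedure can be sketched as follows (0.37 is chosen just below min(y) = 0.38; the sign mapping back to y = theta1 - theta2*exp(-theta3*x) is where it is easy to slip):

```r
dd <- structure(list(
  index = c(14L, 27L, 37L, 33L, 34L, 16L, 7L, 1L, 39L, 36L, 40L, 19L, 28L, 38L, 32L),
  y = c(0.44, 0.4, 0.4, 0.4, 0.4, 0.43, 0.46, 0.49, 0.41, 0.41, 0.38, 0.42, 0.41, 0.4, 0.4),
  x = c(16, 24, 32, 30, 30, 16, 12, 8, 36, 32, 36, 20, 26, 34, 28)),
  row.names = c(NA, -15L), class = "data.frame")

# y - 0.37 ~ exp(a + b*x), so log(y - 0.37) is linear in x
fit0 <- lm(log(y - 0.37) ~ x, data = dd)
coef(fit0)  # roughly: (Intercept) -1.8, x -0.055

# map back to y = theta1 - theta2*exp(-theta3*x):
#   theta1 = 0.37, theta2 = -exp(a), theta3 = -b
start <- list(theta1 = 0.37,
              theta2 = -exp(coef(fit0)[[1]]),
              theta3 = -coef(fit0)[[2]])
```

Note the signs: with this model form theta2 comes out negative and theta3 positive, which matches the start list that converged elsewhere in this digest.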

Cheers,
Bert


On Sun, Aug 20, 2023 at 10:15 AM Paul Bernal 
wrote:


Dear friends,

This is the dataset I am currently working with:

dput(mod14data2_random)

structure(list(index = c(14L, 27L, 37L, 33L, 34L, 16L, 7L, 1L,
39L, 36L, 40L, 19L, 28L, 38L, 32L), y = c(0.44, 0.4, 0.4, 0.4,
0.4, 0.43, 0.46, 0.49, 0.41, 0.41, 0.38, 0.42, 0.41, 0.4, 0.4
), x = c(16, 24, 32, 30, 30, 16, 12, 8, 36, 32, 36, 20, 26, 34,
28)), row.names = c(NA, -15L), class = "data.frame")

I did the following to try to fit a nonlinear regression model:

#First, Procedure to Find Starting (initial) Values For Theta1, Theta2,
and
Theta3

mymod2 <- y ~ theta1 - theta2*exp(-theta3*x)

strt2 <- c(theta1 = 1, theta2 = 2, theta3 = 3)

trysol2<-nlxb(formula=mymod2, data=mod14data2_random, start=strt2,
trace=TRUE)
trysol2
trysol2$coefficients[[3]]

#Fitting nonlinear Regression Model Using Starting Values From Previous
Part
nonlinearmod2 <- nls(mymod2, start = list(theta1 =
trysol2$coefficients[[1]],
  theta2 = trysol2$coefficients[[2]],
  theta3 = trysol2$coefficients[[3]]), data =
mod14data2_random)

And I got this error:
Error in nlsModel(formula, mf, start, wts, scaleOffset = scOff, nDcentral
=
nDcntr) :
   singular gradient matrix at initial parameter estimates

Any idea on how to proceed in this situation? What could I do?

Kind regards,
Paul

 [[alternative HTML version deleted]]









Re: [R] Issues when trying to fit a nonlinear regression model

2023-08-20 Thread Ben Bolker

  My answer is WAY longer than Bert Gunter's but maybe useful nonetheless.

  (Q for John Nash:  why does the coef() method for nlmrt objects 
return the coefficient vector **invisibly**?  That seems confusing!)


  Here's what I did:

* as a preliminary step, adjust the formula so that I don't have to 
think as hard to eyeball starting values from the plot: in particular, 
replace (x) with (x-8) so that the data start from x = 0


* with this adjustment, we can approximate as follows:
  * theta1 - theta2 is the value at x' = 0 (i.e. x = 8)
  * theta1 is the value as x goes to infinity
  * so since y(0) is approx. 0.5 and y(inf) is approx. 0.4, we can use
    theta1 = 0.4, theta2 = -0.1
  * theta3 gives the rate of decline: since the curve drops by
    *approximately* a factor of e from x' = 0 to x' = 5, 1/5 should be a
    good starting value


n1 <- nlxb(y~theta1 - theta2*exp(-theta3*(x-8)),
 start = list(theta1 = 0.4, theta2 = -0.1, theta3 = 1/5),
 data = dd2)

residual sumsquares =  0.00076151  on  15 observations
after  6 Jacobian and 9 function evaluations
  name         coeff        SE   tstat       pval     gradient   JSingval
theta1      0.391967  0.006128   63.96          0   -2.629e-11      4.085
theta2    -0.0997234  0.007897  -12.63  2.733e-08   -1.035e-12     0.9956
theta3      0.107376   0.02365    4.54  0.0006777   -1.327e-11     0.3276


  Now if the formula is

theta1 - theta2*exp(-theta3*(x-8))

this is equivalent to

theta1 - theta2*exp(-theta3*x)*exp(8*theta3) =
theta1 - (theta2*exp(8*theta3))*exp(-theta3*x)

cc <- coef(n1)
start2 <- with(as.list(cc), list(theta1 = theta1, theta2 = 
theta2*exp(8*theta3), theta3 = theta3))

unlist(start2)
 theta1   theta2   theta3
 0.39197 -0.23543  0.10738

To confirm, rerun the fit with these starting values:

n1 <- nlxb(y~theta1 - theta2*exp(-theta3*x),
   start = start2,
  data = dd2)


nlmrt class object: x
residual sumsquares =  0.00076151  on  15 observations
after  4 Jacobian and 5 function evaluations
  name         coeff        SE   tstat       pval     gradient   JSingval
theta1      0.391967  0.006128   63.96          0   -4.066e-12      4.228
theta2     -0.235429   0.04553  -5.171  0.0002328    3.151e-12     0.8508
theta3      0.107376   0.02365    4.54  0.0006777   -5.795e-12     0.1569




On 2023-08-20 1:15 p.m., Paul Bernal wrote:

Dear friends,

This is the dataset I am currently working with:

dput(mod14data2_random)

structure(list(index = c(14L, 27L, 37L, 33L, 34L, 16L, 7L, 1L,
39L, 36L, 40L, 19L, 28L, 38L, 32L), y = c(0.44, 0.4, 0.4, 0.4,
0.4, 0.43, 0.46, 0.49, 0.41, 0.41, 0.38, 0.42, 0.41, 0.4, 0.4
), x = c(16, 24, 32, 30, 30, 16, 12, 8, 36, 32, 36, 20, 26, 34,
28)), row.names = c(NA, -15L), class = "data.frame")

I did the following to try to fit a nonlinear regression model:

#First, Procedure to Find Starting (initial) Values For Theta1, Theta2, and
Theta3

mymod2 <- y ~ theta1 - theta2*exp(-theta3*x)

strt2 <- c(theta1 = 1, theta2 = 2, theta3 = 3)

trysol2<-nlxb(formula=mymod2, data=mod14data2_random, start=strt2,
trace=TRUE)
trysol2
trysol2$coefficients[[3]]

#Fitting nonlinear Regression Model Using Starting Values From Previous Part
nonlinearmod2 <- nls(mymod2, start = list(theta1 =
trysol2$coefficients[[1]],
  theta2 = trysol2$coefficients[[2]],
  theta3 = trysol2$coefficients[[3]]), data =
mod14data2_random)

And I got this error:
Error in nlsModel(formula, mf, start, wts, scaleOffset = scOff, nDcentral =
nDcntr) :
   singular gradient matrix at initial parameter estimates

Any idea on how to proceed in this situation? What could I do?

Kind regards,
Paul





Re: [R] Determining Starting Values for Model Parameters in Nonlinear Regression

2023-08-19 Thread Ben Bolker

  Adding one more method:

glm(y~ x1 + x2 + x3 - 1, family = gaussian(link = "inverse"), data = mydata)

  will fit the exact model (including the desired error structure). 
The default GLM starting values usually work OK, but it is true that 
inverse-links can sometimes be more finicky than more typical links.	


On 2023-08-19 6:48 p.m., John Fox wrote:

Dear John, John, and Paul,

In this case, one can start values by just fitting

 > lm(1/y ~ x1 + x2 + x3 - 1, data=mydata)

Call:
lm(formula = 1/y ~ x1 + x2 + x3 - 1, data = mydata)

Coefficients:
  x1   x2   x3
0.00629  0.00868  0.00803

Of course, the errors enter this model differently, so this isn't the 
same as the nonlinear model, but the regression coefficients are very 
close to the estimates for the nonlinear model.


Best,
  John





Re: [R] MASS::mvrnorm() on MKL may produce different numbers even when the seed is the same?

2023-08-17 Thread Ben Bolker
  Thanks, I was missing the point that this was *non-repeatability on 
the same platform*.


On 2023-08-17 10:31 a.m., Bill Dunlap wrote:

MKL's results can depend on the number of threads running and perhaps other
things.  They blame it on the non-associativity of floating point
arithmetic.  This article gives a way to make results repeatable:

https://www.intel.com/content/www/us/en/developer/articles/technical/introduction-to-the-conditional-numerical-reproducibility-cnr.html
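The key control described there is the MKL_CBWR environment variable (e.g. MKL_CBWR=COMPATIBLE for the strictest mode); it must be set before MKL initializes, so set it in the shell that launches R rather than from a running session. A sketch, assuming an MKL-linked R build:

```r
# In the shell, before starting R:
#   export MKL_CBWR=COMPATIBLE   # strictest reproducible mode (slowest)
#   R --vanilla

# From inside R you can at least confirm what the process inherited:
Sys.getenv("MKL_CBWR", unset = "<not set>")
```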

-Bill

On Wed, Aug 16, 2023 at 8:11 PM Shu Fai Cheung 
wrote:


Hi All,

When addressing an error in one of my packages in the CRAN check at CRAN,
I found something strange which, I believe, is unrelated to my package.
I am not sure whether it is a bug or a known issue. Therefore, I would like
to have advice from experts here.

The error at CRAN check occurred in a test, and only on R-devel with MKL.
No
issues on all other platforms. The test compares two sets of random
numbers,
supposed to be identical because generated from the same seed. However,
when
I tried that in R-devel (2023-08-15 r84957) on Ubuntu 22.04.3 LTS, linked
to
MKL (2023.2.0), I found that this is not the case in one situation.

I don't know why but somehow only one particular set of means and
covariance matrix, among many others in the code, led to that problem.
Please find
below the code to reproduce the means and the covariance matrix (pardon me
for the long code):

mu0 <- structure(c(0.52252416853188655, 0.39883382931927774,
1.6296642535174521,
2.9045763671121816, -0.19816874840500939, 0.42610841566522556,
0.30155498531316366, 1.0339601619394503, 3.4125587827873192,
13.125481598475405, 19.275480386183503, 658.94225353462195,
1.0997193726987393,
9.9980286642877214, 6.4947188998602092, -12.952617780813679,
8.4991882784024799, 106.82209520278377, 0.19623712449219838), class =
c("numeric"))

sigma0 <- structure(c(0.0047010182215945009, 1.4688424622163808e-17,
-2.5269697696885822e-17,
-1.2451516929358655e-18, 3.6553475725253408e-19, 3.2092363914356227e-19,
-2.6341938382508503e-18, 8.6439582878287556e-19, -1.295240602054123e-17,
1.1154057497258702e-18, -8.2407621704807367e-33, -1.0891686096304908e-15,
-2.6800671450262819e-18, -0.0009225142979911, -1.4833018148344697e-16,
2.292242132203e-16, -3.5128615476876035e-18, 2.8004770623734002e-33,
2.0818099301348832e-34, 1.294907062174661e-17, 0.012788608283099183,
1.5603789410838518e-15, 1.0425182851251724e-16, 2.758318745465066e-17,
-9.7047963858148572e-18, -7.4685438142667792e-17, -1.9426723638193857e-17,
-7.8775307262071738e-17, -4.0773523140359949e-17, 4.1212975037936416e-18,
-1.199934828824835e-14, 2.301740948367742e-17, -2.5410883836579325e-18,
-0.129172198695584, -1.9922957470809647e-14, 1.7077690739402761e-16,
1.663702957392817e-30, 5.5362608501514495e-32, -2.5575000656645995e-17,
7.3879960338026333e-16, 0.052246980157830372, -0.007670030643844231,
-1.5468402204507243e-17, 1.0518757786432683e-17, 2.9992131812421253e-16,
1.6588830885081527e-17, -8.3276045185697992e-16, -3.3713008849128695e-15,
1.5823980389961169e-17, 2.2243227896948194e-15, -3.2071920299461956e-17,
5.0187645877462975e-18, -7.4622951185527265e-15, -0.44701112744184102,
-1.854066605407503e-16, -5.2511356793649759e-30, -6.7993385117653912e-32,
-1.5343740968067595e-18, -3.6785921616583949e-17, -0.0076700306438433133,
0.019231143599164325, 5.7818784939170702e-18, 1.7819897158350214e-17,
-2.3927039320969442e-17, -3.9630951242056644e-18, 1.9779107704573469e-15,
7.8103367010164485e-15, -1.2351275675875924e-17, -8.6959160183013082e-15,
-1.601002787863e-18, 3.0110116065267407e-19, 3.7155867714997651e-16,
-0.12490087179942209, -2.2029631919898945e-16, -8.0869824211018481e-31,
-5.058673986695e-33, 2.0667225433033359e-19, 1.0141735013231455e-18,
3.2845977347050714e-17, -2.9662734037779067e-19, 9.9595082331331027e-05,
-1.9982466114987023e-18, 2.6420113469249318e-18, 5.2415327077690342e-19,
3.007533948014542e-18, -5.1128158961673558e-17, 6.7791246559883483e-19,
-3.042327665547861e-15, 1.9523738271941284e-19, -4.0556768902105205e-20,
-1.024372770865438e-17, -0.010638955366526863, 2.5786757042756338e-17,
1.3546130005558746e-32, 1.7947249868721723e-33, 1.9228405363682382e-19,
-4.1209014001548407e-18, 1.5228291529275058e-17, 2.0443162438349059e-17,
-1.9572693345249383e-18, 0.0012709581028842473, -0.0018515998737074948,
9.3128024867175073e-20, -5.1895788671618993e-17, 2.7373981615289174e-16,
1.2812977711597223e-17, -2.2792319486263267e-15, 4.1599503721278813e-19,
-3.7733269771394201e-20, 4.16234419478765e-17, -1.5986158133468129e-16,
-0.016037670900735834, -5.0763064708173244e-33, -1.0176066166236902e-50,
-5.9296704797665404e-18, -8.0698801402861772e-17, 2.9646619457173492e-16,
-2.8879431119718227e-17, 2.9393253663483117e-18, -0.0018515998737074959,
0.090335687736246548, 5.6849412731758135e-19, 8.8934458836007799e-17,
-4.1390858756690365e-16, 4.120323677410211e-16, 2.8000915545933503e-15,
2.8094462743052983e-17, 1.1636214841356605e-18, 

Re: [R] MASS::mvrnorm() on MKL may produce different numbers even when the seed is the same?

2023-08-17 Thread Ben Bolker



> However, should the numbers
> generated identical if the same seed is used?

  I don't see how using the same seed can overcome floating-point 
differences across platforms (compilers etc.) stemming from differences 
in an eigen() computation (based on arcane details like use of 
registers, compiler optimizations, etc.).  set.seed() guarantees that 
the normal deviates generated by rnorm() will be identical, but those 
random numbers are then multiplied by the eigenvectors derived from 
eigen(), at which point the differences will crop up.


  This has been discussed on Twitter: 
https://twitter.com/wviechtb/status/1230078883317387264


Wolfgang Viechtbauer, 2020-02-18:

Another interesting reproducibility issue that came up. MASS::mvrnorm() 
can give different values despite setting the same seed. The problem: 
Results of eigen() (which is used by mvrnorm) can be machine dependent 
(help(eigen) does warn about this).


Interestingly, mvtnorm::rmvnorm() with method="eigen" gives the same 
values across all machines I tested this on (and method="svd" give the 
same values). With method="chol" I get different values, but again 
consistent across machines.


Ah, mvtnorm::rmvnorm() applies the results from eigen() in a different 
way that appears to be less (not?) affected by the indeterminacy in the 
eigenvectors. Quite clever.
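A small sketch of the indeterminacy being discussed: eigenvectors are unique only up to sign, and MASS::mvrnorm() multiplies fixed-seed rnorm() draws by them, so two equally valid eigen() results can give different "random" numbers.

```r
S <- matrix(c(2, 1, 1, 2), 2, 2)
e <- eigen(S, symmetric = TRUE)
V1 <- e$vectors
V2 <- V1 %*% diag(c(1, -1))   # flip the sign of the second eigenvector

## both factorizations reconstruct S ...
all.equal(V1 %*% diag(e$values) %*% t(V1), S)   # TRUE
all.equal(V2 %*% diag(e$values) %*% t(V2), S)   # TRUE

## ... but they map the same normal deviates to different points
set.seed(42)
z <- rnorm(2)
d1 <- V1 %*% diag(sqrt(e$values)) %*% z
d2 <- V2 %*% diag(sqrt(e$values)) %*% z
identical(d1, d2)                               # FALSE
```

Which sign convention eigen() happens to return can differ across LAPACK/BLAS builds, which is exactly the machine dependence help(eigen) warns about.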



On 2023-08-16 11:10 p.m., Shu Fai Cheung wrote:

Hi All,

When addressing an error in one of my packages in the CRAN check at CRAN,
I found something strange which, I believe, is unrelated to my package.
I am not sure whether it is a bug or a known issue. Therefore, I would like
to have advice from experts here.

The error at CRAN check occurred in a test, and only on R-devel with MKL. No
issues on all other platforms. The test compares two sets of random numbers,
supposed to be identical because generated from the same seed. However, when
I tried that in R-devel (2023-08-15 r84957) on Ubuntu 22.04.3 LTS, linked to
MKL (2023.2.0), I found that this is not the case in one situation.

I don't know why but somehow only one particular set of means and
covariance matrix, among many others in the code, led to that problem.
Please find
below the code to reproduce the means and the covariance matrix (pardon me
for the long code):

mu0 <- structure(c(0.52252416853188655, 0.39883382931927774, 1.6296642535174521,
2.9045763671121816, -0.19816874840500939, 0.42610841566522556,
0.30155498531316366, 1.0339601619394503, 3.4125587827873192,
13.125481598475405, 19.275480386183503, 658.94225353462195, 1.0997193726987393,
9.9980286642877214, 6.4947188998602092, -12.952617780813679,
8.4991882784024799, 106.82209520278377, 0.19623712449219838), class =
c("numeric"))

sigma0 <- structure(c(0.0047010182215945009, 1.4688424622163808e-17,
-2.5269697696885822e-17,
-1.2451516929358655e-18, 3.6553475725253408e-19, 3.2092363914356227e-19,
-2.6341938382508503e-18, 8.6439582878287556e-19, -1.295240602054123e-17,
1.1154057497258702e-18, -8.2407621704807367e-33, -1.0891686096304908e-15,
-2.6800671450262819e-18, -0.0009225142979911, -1.4833018148344697e-16,
2.292242132203e-16, -3.5128615476876035e-18, 2.8004770623734002e-33,
2.0818099301348832e-34, 1.294907062174661e-17, 0.012788608283099183,
1.5603789410838518e-15, 1.0425182851251724e-16, 2.758318745465066e-17,
-9.7047963858148572e-18, -7.4685438142667792e-17, -1.9426723638193857e-17,
-7.8775307262071738e-17, -4.0773523140359949e-17, 4.1212975037936416e-18,
-1.199934828824835e-14, 2.301740948367742e-17, -2.5410883836579325e-18,
-0.129172198695584, -1.9922957470809647e-14, 1.7077690739402761e-16,
1.663702957392817e-30, 5.5362608501514495e-32, -2.5575000656645995e-17,
7.3879960338026333e-16, 0.052246980157830372, -0.007670030643844231,
-1.5468402204507243e-17, 1.0518757786432683e-17, 2.9992131812421253e-16,
1.6588830885081527e-17, -8.3276045185697992e-16, -3.3713008849128695e-15,
1.5823980389961169e-17, 2.2243227896948194e-15, -3.2071920299461956e-17,
5.0187645877462975e-18, -7.4622951185527265e-15, -0.44701112744184102,
-1.854066605407503e-16, -5.2511356793649759e-30, -6.7993385117653912e-32,
-1.5343740968067595e-18, -3.6785921616583949e-17, -0.0076700306438433133,
0.019231143599164325, 5.7818784939170702e-18, 1.7819897158350214e-17,
-2.3927039320969442e-17, -3.9630951242056644e-18, 1.9779107704573469e-15,
7.8103367010164485e-15, -1.2351275675875924e-17, -8.6959160183013082e-15,
-1.601002787863e-18, 3.0110116065267407e-19, 3.7155867714997651e-16,
-0.12490087179942209, -2.2029631919898945e-16, -8.0869824211018481e-31,
-5.058673986695e-33, 2.0667225433033359e-19, 1.0141735013231455e-18,
3.2845977347050714e-17, -2.9662734037779067e-19, 9.9595082331331027e-05,
-1.9982466114987023e-18, 2.6420113469249318e-18, 5.2415327077690342e-19,
3.007533948014542e-18, -5.1128158961673558e-17, 6.7791246559883483e-19,
-3.042327665547861e-15, 1.9523738271941284e-19, -4.0556768902105205e-20,
-1.024372770865438e-17, -0.010638955366526863, 2.5786757042756338e-17,

Re: [R] X11 font

2023-08-16 Thread Ben Bolker

There's an ancient related question ... maybe it helps?

 https://stat.ethz.ch/pipermail/r-help//2016-October/442326.html


sudo apt-get install xfonts-100dpi
sudo apt-get install xfonts-75dpi

apt-cache search xfonts doesn't pull up anything else obvious 
(presumably you already have xfonts-base ... ???)


On 2023-08-16 5:08 p.m., Therneau, Terry M., Ph.D. via R-help wrote:

I get the following error out of R, on a newer Ubuntu installation.

Error in `axis()`:
! X11 font -adobe-helvetica-%s-%s-*-*-%d-*-*-*-*-*-*-*, face 1 at size 12 could 
not be loaded
Backtrace:
   1. graphics::matplot(...)
   3. graphics::plot.default(...)
   4. graphics (local) localAxis(...)
   6. graphics:::Axis.default(...)
   7. graphics::axis(side = side, at = at, labels = labels, ...)

The context is pretty simple:

library(survival)
matplot(60:100, 36525* survexp.us[61:101, 1:2, "2014"], col=2:1, lty=1, lwd=2,
      xlab="Age", ylab="Death rate per 100", log='y', type='l', yaxt='n',
      main="US Death Rates")
axis(2, c(1,2,5,10,20, 50), c(1,2,5,10, 20, 50), las=2)

This code works fine on my screen.   The error comes up when I put it into an 
.Rmd file
and apply rmarkdown::render() to it.

Likely some font file is needed, but what one?

Terry

Version:
R Under development (unstable) (2023-08-01 r84818) -- "Unsuffered Consequences"
Copyright (C) 2023 The R Foundation for Statistical Computing
Platform: aarch64-unknown-linux-gnu



__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Noisy objective functions

2023-08-13 Thread Ben Bolker

  This is a huge topic.

  Differential evolution (DEoptim package) would be one good starting 
point; there is a simulated annealing method built into optim() (method 
= "SANN") but it usually requires significant tuning.


  Also genetic algorithms.

  You could look at the NLopt list of algorithms 
https://nlopt.readthedocs.io/en/latest/NLopt_Algorithms/ , focusing on 
the options for derivative-free global optimization , and then use them 
via the nloptr package.
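For instance, the built-in simulated-annealing route on the poster's noisy Rosenbrock might look like this (a sketch using base R only; the maxit and temp values below are arbitrary, and SANN typically needs tuning to do well):

```r
## Noisy 5-D Rosenbrock, as in the original post
## (base-R version of adagio::fnRosenbrock)
rosen <- function(x) sum(100 * (x[-1] - x[-length(x)]^2)^2 +
                         (1 - x[-length(x)])^2)
fn <- function(x) (1 + rnorm(1, sd = 0.01)) * rosen(x)

set.seed(1)
res <- optim(rep(0.1, 5), fn, method = "SANN",
             control = list(maxit = 20000, temp = 1))
res$par    # SANN uses only (noisy) function values, no gradients
```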


  Good luck ...

On 2023-08-13 3:28 p.m., Hans W wrote:

While working on 'random walk' applications, I got interested in
optimizing noisy objective functions. As an (artificial) example, the
following is the Rosenbrock function, where Gaussian noise of standard
deviation `sd = 0.01` is added to the function value.

 fn <- function(x)
   (1+rnorm(1, sd=0.01)) * adagio::fnRosenbrock(x)

To smooth out the noise, define another function `fnk(x, k = 1)` that
calls `fn` k times and returns the mean value of those k function
applications.

 fnk <- function(x, k = 1) { # fnk(x) same as fn(x)
 rv = 0.0
 for (i in 1:k) rv <- rv + fn(x)
 return(rv/k)
 }

When we apply several optimization solvers to this noisy and smoothed
noise functions we get for instance the following results:
(Starting point is always `rep(0.1, 5)`, maximal number of iterations 5000,
  relative tolerance 1e-12, and the optimization is successful if the
function value at the minimum is below 1e-06.)

    k    nmk   anms   neldermead   ucminf   optim_BFGS
  ----------------------------------------------------
    1   0.21   0.32         0.13     0.00         0.00
    3   0.52   0.63         0.50     0.00         0.00
   10   0.81   0.91         0.87     0.00         0.00

Solvers: nmk = dfoptim::nmk, anms = pracma::anms [both Nelder-Mead codes]
  neldermead = nloptr::neldermead,
  ucminf = ucminf::ucminf, optim_BFGS = optim with method "BFGS"

Read the table as follows: `nmk` will be successful in 21% of the
trials, while for example `optim` will never come close to the true
minimum.

I think it is reasonable to assume that gradient-based methods do
poorly with noisy objectives, though I did not expect to see them fail
so clearly. On the other hand, Nelder-Mead implementations do quite
well if there is not too much noise.

In real-world applications, it will often not be possible to do the
same measurement several times. That is, we will then have to live
with `k = 1`. In my applications with long 'random walks', doing the
calculations several times in a row will really need some time.

QUESTION: What could be other approaches to minimize noisy functions?

I looked through some "Stochastic Programming" tutorials and did not
find them very helpful in this situation. Of course, I might have
looked into these works too superficially.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] group consecutive dates in a row

2023-08-07 Thread Ben Bolker

rle(as.numeric(diff(mydf$data_POSIX)))  should get you started, I think?
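Fleshing that hint out a little (a sketch; it assumes the dates are sorted and that "consecutive" means a gap of exactly one day):

```r
mydf <- data.frame(data_POSIX = as.POSIXct(
  c("2012-02-05", "2012-02-06", "2012-02-07", "2012-02-13", "2012-02-21"),
  format = "%Y-%m-%d", tz = "Etc/GMT-1"))

## start a new run whenever the gap to the previous date exceeds one day
grp  <- cumsum(c(TRUE, diff(as.Date(mydf$data_POSIX)) > 1))
runs <- split(mydf$data_POSIX, grp)

init <- do.call(c, lapply(runs, min))   # do.call(c, ...) keeps POSIXct
fin  <- do.call(c, lapply(runs, max))
fin[fin == init] <- NA                  # single-day runs get NA, as requested

out <- data.frame(data_POSIX_init = init, data_POSIX_fin = fin)
out
```

For the example data this gives one run 2012-02-05 to 2012-02-07 and two singleton rows with NA in the second column.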

On 2023-08-07 12:41 p.m., Stefano Sofia wrote:

Dear R users,

I have a data frame with a single column of POSIXct elements, like


mydf <- data.frame(data_POSIX=as.POSIXct(c("2012-02-05", "2012-02-06", "2012-02-07", "2012-02-13", 
"2012-02-21"), format = "%Y-%m-%d", tz="Etc/GMT-1"))


I need to transform it in a two-columns data frame where I can get rid of 
consecutive dates. It should appear like


data_POSIX_init data_POSIX_fin
2012-02-05      2012-02-07
2012-02-13      NA
2012-02-21      NA


I started with two "while cycles" and so on, but this is not an efficient way 
to do it.

Could you please give me a hint on how to proceed?


Thank you for your precious attention and help

Stefano


  (oo)
--oOO--( )--OOo--
Stefano Sofia PhD
Civil Protection - Marche Region - Italy
Meteo Section
Snow Section
Via del Colle Ameno 5
60126 Torrette di Ancona, Ancona (AN)
Uff: +39 071 806 7743
E-mail: stefano.so...@regione.marche.it
---Oo-oO



IMPORTANT NOTICE: This e-mail message is intended to be received only by 
persons entitled to receive the confidential information it may contain. E-mail 
messages to clients of Regione Marche may contain information that is 
confidential and legally privileged. Please do not read, copy, forward, or 
store this message unless you are an intended recipient of it. If you have 
received this message in error, please forward it to the sender and delete it 
completely from your computer system.

--
This message was scanned by Libraesva ESG and is believed to be clean.


[[alternative HTML version deleted]]


__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Plotting Fitted vs Observed Values in Logistic Regression Model

2023-08-01 Thread Ben Bolker

logistic_regmod2 <- glm(formula = ratio~x, family = binomial(logit), data =
random_mod12_data2, weights =n)

plot(ratio ~ x, data = random_mod12_data2)

pframe <- data.frame(x = sort(random_mod12_data2$x))
pframe$ratio <- predict(logistic_regmod2, newdata = pframe, type = 
"response")

with(pframe, lines(x, ratio))



On 2023-08-01 9:57 a.m., Paul Bernal wrote:

Dear friends,

I hope  this email finds you all well. This is the dataset I am working
with:
dput(random_mod12_data2)
structure(list(Index = c(1L, 5L, 11L, 3L, 2L, 8L, 9L, 4L), x = c(5,
13, 25, 9, 7, 19, 21, 11), n = c(500, 500, 500, 500, 500, 500,
500, 500), r = c(100, 211, 391, 147, 122, 310, 343, 176), ratio = c(0.2,
0.422, 0.782, 0.294, 0.244, 0.62, 0.686, 0.352)), row.names = c(NA,
-8L), class = "data.frame")

A brief description of the dataset:
Index: is just a column that shows the ID of each observation (row)
x: is a column which gives information on the discount rate of the coupon
n: is the sample or number of observations
r: is the count of redeemed coupons
ratio: is just the ratio of redeemed coupons to n (total number of
observations)

#Fitting a logistic regression model to response variable y for problem 13.4
logistic_regmod2 <- glm(formula = ratio~x, family = binomial(logit), data =
random_mod12_data2)

I would like to plot the value of r (in the y-axis) vs x (the different
discount rates) and then superimpose the logistic regression fitted values
all in the same plot.

How could I accomplish this?

Any help and/or guidance will be greatly appreciated.

Kind regards,
Paul

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


--
Dr. Benjamin Bolker
Professor, Mathematics & Statistics and Biology, McMaster University
Director, School of Computational Science and Engineering
(Acting) Graduate chair, Mathematics & Statistics
> E-mail is sent at my convenience; I don't expect replies outside of 
working hours.


__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Downloading a directory of text files into R

2023-07-25 Thread Ben Bolker

 Where is readtext() from?

  Some combination of scraping

http://home.brisnet.org.au/~bgreen/Data/Hanson1/

and

http://home.brisnet.org.au/~bgreen/Data/Hanson2/


to recover the required file names:

library(rvest)
read_html("http://home.brisnet.org.au/~bgreen/Data/Hanson1/") |>
  html_element("body") |> html_element("table") |> html_table()


will get you most of the way there ...

then an lapply() or for loop to download all the bits ...?
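Spelled out a little further, a base-R variant of the same scrape-then-download idea might look like this (a sketch only: the href filter is a guess at the index-page layout, and nothing here has been run against the live site):

```r
## pull relative hrefs out of an index page's HTML (pure, so testable)
extract_hrefs <- function(html) {
  m <- regmatches(html, gregexpr('href="[^"]+"', html))[[1]]
  h <- sub('^href="', "", sub('"$', "", m))
  h[!grepl("^[?/]", h)]    # drop sort links and parent-directory links
}

## list the files on one directory page, then download them all
fetch_dir <- function(base, dest = "hanson") {
  dir.create(dest, showWarnings = FALSE)
  html  <- paste(readLines(base, warn = FALSE), collapse = "\n")
  files <- extract_hrefs(html)
  for (f in files)
    download.file(paste0(base, f), file.path(dest, f), quiet = TRUE)
  invisible(files)
}

## usage (not run):
## for (d in c("http://home.brisnet.org.au/~bgreen/Data/Hanson1/",
##             "http://home.brisnet.org.au/~bgreen/Data/Hanson2/"))
##   fetch_dir(d)
## x <- readtext::readtext("hanson/*")
```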



On 2023-07-25 6:06 p.m., Bob Green wrote:

Hello,

I am seeking advice as to how I can download the 833 files from this 
site: "http://home.brisnet.org.au/~bgreen/Data/"


I want to be able to download them to perform a textual analysis.

If the 833 files, which are in a Directory with two subfolders were on 
my computer I could read them through readtext. Using readtext I get the 
error:


 > x = readtext("http://home.brisnet.org.au/~bgreen/Data/*")
Error in download_remote(file, ignore_missing, cache, verbosity) :
   Remote URL does not end in known extension. Please download the file 
manually.


 > x = readtext("http://home.brisnet.org.au/~bgreen/Data/Dir/()")
Error in download_remote(file, ignore_missing, cache, verbosity) :
   Remote URL does not end in known extension. Please download the file 
manually.


Any suggestions are appreciated.

Bob

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide 
http://www.R-project.org/posting-guide.html

and provide commented, minimal, self-contained, reproducible code.


--
Dr. Benjamin Bolker
Professor, Mathematics & Statistics and Biology, McMaster University
Director, School of Computational Science and Engineering
(Acting) Graduate chair, Mathematics & Statistics
> E-mail is sent at my convenience; I don't expect replies outside of 
working hours.


__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] col2rgb() function

2023-07-23 Thread Ben Bolker
  You could also adjustcolor for this approach (levels of red, green, 
blue, alpha can all be adjusted proportionally)


On 2023-07-23 5:35 p.m., David Stevens via R-help wrote:

Nick,

I've also made colors transparent by pasting the hex equivalent of, say,
round(0.3*255) = 77 to the hex color code, e.g. for black it might be
"#0000004d" and the 4d is 77 in hex. That way you don't need to convert
back and forth so much. If col is "#000000" the transparent version is

tcol <- paste0(col,"4d")

This would work in one step on a whole palette.
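A quick sketch of that equivalence (alpha 0.3 corresponds to round(0.3*255) = 77, i.e. hex 4D):

```r
cb <- c("#000000", "#E69F00", "#56B4E9")
t1 <- paste0(cb, "4D")                    # hand-pasted hex alpha
t2 <- adjustcolor(cb, alpha.f = 77/255)   # the same thing via adjustcolor()
rbind(t1, t2)
```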

David

David K Stevens, PhD, PE, Professor
Civil and Environmental Engineering
Utah Water Research Laboratory
Utah State University
8200 Old Main Hill
Logan, UT 84322-8200
david.stev...@usu.edu
(435) 797-3229 (office)

On 7/23/2023 1:00 PM, Nick Wray wrote:

Thanks That works nicely  Nick

On Sun, 23 Jul 2023 at 19:26, Ben Bolker  wrote:


 Does adjustcolor() help?

cb8<- c("#000000", "#E69F00", "#56B4E9", "#009E73","#F0E442", "#0072B2",
  "#D55E00", "#CC79A7")
plot(0,0,xlim=c(1,8),ylim=c(0,1))
points(1:8,rep(0.5,8),col=cb8,pch=19,cex=2)
points(1:8,rep(0.75,8),col=adjustcolor(cb8, alpha.f = 0.3), pch=19,cex=2)

On 2023-07-23 2:15 p.m., Nick Wray wrote:

Hello  I have a palette vector of colour blind colours (in hexadecimal)
which I’m using for plots, but they are not see-through, and as I wanted to
overlay some histograms I wanted to convert these colours to rgb, when you
can set the opacity.

I have found the function col2rgb(), which works in the sense that it gives
a vector of numbers but these don’t work directly in rgb because they are
too big.  If I divide through to make them all less than 1 I don’t get the
corresponding colour-blind hue, but something somewhat off.

Here is the colour-blind palette in a plot:


cb8<- c("#000000", "#E69F00", "#56B4E9", "#009E73","#F0E442", "#0072B2",
"#D55E00", "#CC79A7")

plot(0,0,xlim=c(1,8),ylim=c(0,1))

points(1:8,rep(0.5,8),col=cb8,pch=19,cex=2)



so if I try to convert the red dot ("#D55E00") (number 7) I get

col2rgb("#D55E00")

      [,1]
red    213
green   94
blue     0

points(7,0.25,col=rgb(rgb(213,94,0)),pch=19,cex=2)

gives me an error message and although if  I divide through

points(7,0.25,col=rgb(213/307,94/307,0),pch=19,cex=2)

gives me a reddish dot, but not the same as in the colour-blind palette



Somewhat mystified.  Can anyone help?? Thanks Nick Wray

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide

http://www.R-project.org/posting-guide.html

and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] col2rgb() function

2023-07-23 Thread Ben Bolker

  Does adjustcolor() help?

cb8<- c("#000000", "#E69F00", "#56B4E9", "#009E73","#F0E442", "#0072B2",
   "#D55E00", "#CC79A7")
plot(0,0,xlim=c(1,8),ylim=c(0,1))
points(1:8,rep(0.5,8),col=cb8,pch=19,cex=2)
points(1:8,rep(0.75,8),col=adjustcolor(cb8, alpha.f = 0.3), pch=19,cex=2)

On 2023-07-23 2:15 p.m., Nick Wray wrote:

Hello  I have a palette vector of colour blind colours (in hexadecimal)
which I’m using for plots, but they are not see-through, and as I wanted to
overlay some histograms I wanted to convert these colours to rgb, when you
can set the opacity.

I have found the function col2rgb(), which works in the sense that it gives
a vector of numbers but these don’t work directly in rgb because they are
too big.  If I divide through to make them all less than 1 I don’t get the
corresponding colour-blind hue, but something somewhat off.

Here is the colour-blind palette in a plot:


cb8<- c("#000000", "#E69F00", "#56B4E9", "#009E73","#F0E442", "#0072B2",
"#D55E00", "#CC79A7")

plot(0,0,xlim=c(1,8),ylim=c(0,1))

points(1:8,rep(0.5,8),col=cb8,pch=19,cex=2)



so if I try to convert the red dot ("#D55E00") (number 7) I get

col2rgb("#D55E00")

      [,1]
red    213
green   94
blue     0

points(7,0.25,col=rgb(rgb(213,94,0)),pch=19,cex=2)

gives me an error message and although if I divide through

points(7,0.25,col=rgb(213/307,94/307,0),pch=19,cex=2)

gives me a reddish dot, but not the same as in the colour-blind palette



Somewhat mystified.  Can anyone help?? Thanks Nick Wray

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] glmmPQL crashes on inclusion of corSpatial object

2016-07-25 Thread Ben Bolker
Patrick Johann Schratz  gmail.com> writes:

> 
> Link to data: <https://www.dropbox.com/s/yi3vf0bmqvydr8h/data.Rd?dl=0>  
> (1170 obs, 9 variables, .Rd file) [plain link in case sth goes wrong 
> with the hyperlink: https://www.dropbox.com/s/yi3vf0bmqvydr8h/data.Rd?dl=0]
> 

 By the way, this has been cross-posted both to StackOverflow
and to r-sig-mixed-mod...@r-project.org. Cross-posting between
R lists is explicitly deprecated; cross-posting between R lists
and StackOverflow is not *explicitly* deprecated anywhere, but is
probably a bad idea (tends to waste the time of answerers who don't
know the question has been commented on and/or answered already
elsewhere).  Try to pick the single best venue and stick to it.

  Ben Bolker

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] glmmLasso with interactions errors

2016-07-15 Thread Ben Bolker
Cade, Brian  usgs.gov> writes:

> 
> It has never been obvious to me that the lasso approach can handle
> interactions among predictor variables well at all. 
> I'ld be curious to see
> what others think and what you learn.
> 
> Brian
> 

  For what it's worth I think lasso *does* handle interactions
reasonably (although I forget where I read that) -- there is a
newer "hierarchical lasso" that tries to deal with marginality
concerns more carefully.

  Related questions asked on StackOverflow:

http://stackoverflow.com/questions/37910042/glmmlasso-warning-messages/37922918#37922918

My answer (in comments) there was

my guess is that you're going to have to build your own model
matrix/dummy variables; I think that as.factor() in formulas is
treated specially, so including the interaction term will probably
just confuse it. (It would be worth trying as.factor(Novelty:ROI) - I
doubt it'll work but if it does it would be the easiest way forward.)


> 
> On Wed, Jul 13, 2016 at 2:20 PM, Walker Pedersen  uwm.edu> wrote:

[snip]

> >
> > An abbreviated version of my dataset is here:
> >
> > https://drive.google.com/open?id=0B_LliPDGUoZbVVFQS2VOV3hGN3c
> >

[snip snip]

> > Before glmmLasso I am running:
> >
> > KNov$Subject <- factor(KNov$Subject)
> >
> > to ensure the subject ID is not treated as a continuous variable.
> >
> > If I run:
> >
> > glm1 <- glmmLasso(Activity~as.factor(Novelty) + as.factor(Valence) +
> > STAIt + as.factor(ROI)
> > + as.factor(Valence):as.factor(ROI), list(Subject=~1), data = KNov,
> > lambda=10)
> > summary(glm1)
> >
> > I don't get any warning messages, but the output contains b estimates
> > only, no SE or p-values.
> >
> > If I try to include a 3-way interaction, such as:
> >
> > glm2 <- glmmLasso(Activity~as.factor(Novelty) + as.factor(Valence) +
> > STAIt + as.factor(ROI)
> > + as.factor(Novelty):as.factor(Valence):as.factor(ROI),
> > list(Subject=~1), data = Nov7T, lambda=10)
> > summary(glm2)
> >
> > I get the warnings:
> >
> > Warning messages:
> > 1: In split.default((1:ncol(X))[-inotpen.which], ipen) :
> >   data length is not a multiple of split variable
> > 2: In lambda_vec * sqrt(block2) :
> >   longer object length is not a multiple of shorter object length
> >
> > And again, I do get parameter estimates, and no SE or p-values.
> >
> > If I include my continuous variable in any interaction, such as:
> >
> > glm3 <- glmmLasso(Activity~as.factor(Novelty) + as.factor(Valence) +
> > STAIt + as.factor(ROI)
> > + as.factor(Valence):as.factor(ROI) + as.factor(Novelty):STAIt,
> > list(Subject=~1), data = Nov7T, lambda=10)
> > summary(glm3)
> >
> > I get the error message:
> >
> > Error in rep(control$index[i], length.fac) : invalid 'times' argument
> >
> > and no output.
> >
> > If anyone has any input as to (1) why I am not getting SE or p-values
> > in my outputs, (2) the meaning of these warnings I get when I include a
> > 3-way variable, and if they are something to worry about, how to fix
> > them, and (3) how to fix the error message I get when I include my
> > continuous factor in an interaction, I would be very appreciative.


 [snip snip snip]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] dependent p.values in R

2016-07-10 Thread Ben Bolker
Fernando Marmolejo Ramos  psychology.su.se>
writes:

> 
> hi marc
> 
> say i have a vector with some x number of observations
> 
> x = c(23, 56, 123, . )
> 
> and i want to know how normal it is
> 
> as there are many normality tests, i want to combine their p.values
> 
> so, suppose i use shapiro.wilk, anderson darling and
>jarque bera and each will give a pvalue
> 
> i could simply average those p-values but to my knowledge 
>that approach is biased
> 
> so i thought, in the same way there is a method to combine
>independent pvalues (e.g. stouffer method); is
> there a way to combine dependent pvalues?
> 
> best
> 
> f
> 

  Yikes.  There is extensive discussion, e.g. at
http://tinyurl.com/normtests , that suggests that much of the
time (if not always) formal statistical hypothesis tests for
normality are misguided.  Combining p-values from different tests
feels like compounding the issue.  In any case, I would definitely
say that this a question for CrossValidated 
(http://stats.stackexchange.com), rather than r-help ...

  Ben Bolker
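P.S. For reference, the Stouffer method mentioned above (for *independent*
p-values) is a one-liner; this is only a sketch, and the dependent case the
poster asks about would need a covariance correction in the denominator:

```r
## Stouffer's (inverse-normal) method for independent p-values:
## convert each p to a z-score, sum, rescale, convert back to a p-value
stouffer <- function(p) {
  z <- qnorm(p, lower.tail = FALSE)
  pnorm(sum(z) / sqrt(length(p)), lower.tail = FALSE)
}
stouffer(c(0.04, 0.10, 0.07))
```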



Re: [R] Fixed Effects in lme function

2016-07-09 Thread Ben Bolker
li li  gmail.com> writes:

> 
>  Dear all,

> For the data below, I would like to fit a model with common
> random slope and common random intercept as shown below. I am
> interested in obtaining separate fixed effect estimates (intercept
> and slope and corresponding hypothesis test) for each
> method. Instead of performing the analysis method by method, can I
> use the syntax as below, specifically, the part "fixed = response ~
> method/time"?  I know this is legitimate specification if there is
> no random effects involved.  Thanks so much in advance!  Hanna

  This is probably a better question for the r-sig-mixed-models
mailing list ... followups there, please.  (Also, please don't post in HTML)

Yes, that seems as though it should work.

dat <- read.table(header=TRUE,
text="response individual time method
102.9 3 0 3
103.0 3 3 3
103.0 3 6 3
102.8 3 9 3
102.2 3 12 3
102.5 3 15 3
103.0 3 18 3
102.0 3 24 3
102.8 1 0 3
102.7 1 3 3
103.0 1 6 3
102.2 1 9 3
103.0 1 12 3
102.8 1 15 3
102.8 1 18 3
102.9 1 24 3
102.2 2 0 3
102.6 2 3 3
103.4 2 6 3
102.3 2 9 3
101.3 2 12 3
102.1 2 15 3
102.1 2 18 3
102.2 2 24 3
102.7 4 0 3
102.3 4 3 3
102.6 4 6 3
102.7 4 9 3
102.8 4 12 3
102.5 5 0 3
102.4 5 3 3
102.1 5 6 3
102.3 6 0 3
102.3 6 3 3
101.9 7 0 3
102.0 7 3 3
107.4 3 0 1
101.3 3 12 1
92.8 3 15 1
73.7 3 18 1
104.7 3 24 1
92.6 1 0 1
101.9 1 12 1
106.3 1 15 1
104.1 1 18 1
95.6 1 24 1
79.8 2 0 1
89.7 2 12 1
97.0 2 15 1
108.4 2 18 1
103.5 2 24 1
96.4 4 0 1
89.3 4 12 1
112.6 5 0 1
93.3 6 0 1
99.6 7 0 1
109.5 3 0 2
98.5 3 12 2
103.5 3 24 2
113.5 1 0 2
94.5 1 12 2
88.5 1 24 2
99.5 2 0 2
97.5 2 12 2
98.5 2 24 2
103.5 4 0 2
89.5 5 0 2
87.5 6 0 2
82.5 7 0 2
")

library(nlme)

## check design
with(dat,table(method,time))
with(dat,table(method,individual))
with(dat,table(time,individual))
dat <- transform(dat,
method=factor(method),
individual=factor(individual))
library(ggplot2); theme_set(theme_bw())
ggplot(dat,aes(time,response,colour=method))+geom_point()+
stat_summary(fun.y=mean,geom="line",
aes(group=individual),colour="black",alpha=0.6)

library(nlme)

Is 'lot' supposed to be 'individual' here? Or is there another variable
we don't know about?

m1 <- lme(fixed= response ~ method/time, random=~ 1+time | individual,
data=dat,
  weights= varIdent(form=~1|method),   control = lmeControl(opt = "optim"),
  na.action = na.exclude)

summary(m1)$tTable



Re: [R] Effect size measures for GLM

2016-06-24 Thread Ben Bolker
Gianfranco Lovison  unipa.it> writes:

> 
> Is there a library for (friendly) calculation of effect size measures for
> Generalized Linear Models? I have found "compute.es", but it seems to be
> suitable only for Linear Models. A library taking a glm object and 
> computing
> partial R^2-type statistics, appropriate for GLMs, would be enough, but I have
> been unable to find it!
> 
> 

  I don't know about *partial* R^2 statistics, but as a starting point try
library("sos"); findFn("nagelkerke") for example.
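If it helps, Nagelkerke's (pseudo-)R^2 can also be computed directly from the
deviances of a fitted glm; a sketch (the helper name nagelkerke_r2 is mine,
not from any package):

```r
## Nagelkerke's R^2: Cox-Snell R^2 rescaled so its maximum is 1
nagelkerke_r2 <- function(fit) {
  n  <- nobs(fit)
  cs <- 1 - exp((fit$deviance - fit$null.deviance) / n)  # Cox-Snell R^2
  cs / (1 - exp(-fit$null.deviance / n))                 # rescale to [0, 1]
}
fit <- glm(am ~ wt + hp, family = binomial, data = mtcars)
nagelkerke_r2(fit)
```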



Re: [R] model specification using lme

2016-05-29 Thread Ben Bolker
li li  gmail.com> writes:

> 
> Hi all,
>   For the following data, I consider the following random intercept and
> random slope model. Denote as y_ijk the response value from *j*th
> individual within *i*th method at time point *k*. Assume the following
> model for y_ijk:
> 
>   y_ijk= (alpha_0+ tau_i +a_j(i))+(beta_i+b_j(i)) T_k + e_ijk
> 

  If someone answers here, that's great, but I'm guessing that you
will have better luck re-posting this to r-sig-mixed-mod...@r-project.org
(maybe wait a day or so before re-posting, and when you do, mention that 
you're reposting)

  cheers
Ben Bolker



Re: [R] code to provoke a crash running rterm.exe on windows

2016-05-28 Thread Ben Bolker
Anthony Damico  gmail.com> writes:

> 
> hi, here's a minimal reproducible example that crashes my R 3.3.0 console
> on a powerful windows server.  below the example, i've put the error (not
> crash) that occurs on R 3.2.3.
> 
> should this be reported to http://bugs.r-project.org/ or am i doing
> something silly?  thanx


From the R FAQ (9.1):

If R executes an illegal instruction, or dies with an operating system
error message that indicates a problem in the program (as opposed to
something like “disk full”), then it is certainly a bug.

  So you could submit a bug report, *or* open a discussion on
r-de...@r-project.org  (which I'd have said was a more appropriate
venue for this question in any case) ...

  Ben Bolker

Re: [R] error in lmerTest after updating to R 3.3.0

2016-05-10 Thread Ben Bolker
Pascal A. Niklaus  ieu.uzh.ch> writes:

> 
> Dear all,
> 
> After updating to R 3.3.0 (inadvertently, via apt-get), I get an error 
> when using lmerTest. Here is an example:
> 
> library(lmerTest)
> library(MASS)
> data(oats)
> m <- lmer(Y ~ N*V + (1|B/V), data=oats)
> summary(m)
> 
> summary from lme4 is returned
> some computational error has occurred in lmerTest
> 
> I have removed all old libraries and tried to have as clean an 
> installation as possible, but this did not fix the problem.
> 
> Is this an incompatibility or a problem with my installation?
> 

 Hard to say.  Works for me on a clean MacOS install of R 3.3/lme4/
lmerTest.

  You might want to send followups to r-sig-mixed-mod...@r-project.org



Platform: x86_64-apple-darwin13.4.0 (64-bit)
Running under: OS X 10.9.5 (Mavericks)

locale:
[1] en_CA.UTF-8/en_CA.UTF-8/en_CA.UTF-8/C/en_CA.UTF-8/en_CA.UTF-8

attached base packages:
[1] stats graphics  grDevices utils datasets  methods  
[7] base 

other attached packages:
[1] MASS_7.3-45 lmerTest_2.0-30 lme4_1.1-12 Matrix_1.2-6   

loaded via a namespace (and not attached):
 [1] Rcpp_0.12.4 Formula_1.2-1   cluster_2.0.4  
 [4] splines_3.3.0   munsell_0.4.3   colorspace_1.2-6   
 [7] lattice_0.20-33 minqa_1.2.4 plyr_1.8.3 
[10] tools_3.3.0 nnet_7.3-12 grid_3.3.0 
[13] data.table_1.9.6gtable_0.2.0nlme_3.1-127   
[16] latticeExtra_0.6-28 survival_2.39-3 gridExtra_2.2.1
[19] RColorBrewer_1.1-2  nloptr_1.0.4ggplot2_2.1.0  
[22] acepack_1.3-3.3 rpart_4.1-10scales_0.4.0   
[25] Hmisc_3.17-4chron_2.3-47foreign_0.8-66



Re: [R] Coverage Probability

2016-05-10 Thread Ben Bolker
Muhammad  Kashif  uaf.edu.pk> writes:

> 
> Dears
> Can anyone help me to solve the issue.
> 
> By using the "boot" and "boot.ci" functions in R we can construct bootstrap
> confidence intervals. How do we
> calculate the coverage probability of these intervals?

  Calculating coverage probability for any but the simplest
cases requires simulations.  You need to simulate a particular
scenario; for each simulation, use your estimation and confidence-interval
calculation procedure; and then score the particular simulation
as 1 (estimated CI included the true parameter value) or 0
(true parameter value fell outside the estimated CI). Across
a large number of simulations, the proportion of ones is
an estimate of the coverage.

This is discussed more here:

http://ms.mcmaster.ca/~bolker/emdbook/chap5A.pdf
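The scoring loop described above can be sketched in a few lines; here with a
plain t-interval for speed (a bootstrap version would just call boot() and
boot.ci() inside replicate()):

```r
set.seed(101)
mu <- 5                          # true parameter value
covered <- replicate(2000, {
  x  <- rnorm(30, mean = mu)     # simulate one dataset under the scenario
  ci <- t.test(x)$conf.int       # compute the 95% CI for that dataset
  ci[1] <= mu && mu <= ci[2]     # score 1 if the CI covers the truth
})
mean(covered)                    # proportion of 1s = estimated coverage
```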



Re: [R] Regarding to R

2016-05-10 Thread Ben Bolker
shweta shukla  gmail.com> writes:

> 
> Dear,
> I recently started using R for my analysis but the concepts are still not
> clear. Please guide me to some informative material for learning R.
> 
> I am also confused about which one I should prefer, R or RStudio, and what
> the differences are between them.
> 

R and RStudio are two different types of thing; it's not a question of
"which one to use".  RStudio is a front-end (interface) for R. You
probably *should* use RStudio (rather than the standard "R console"
interface that comes with R), it's well supported and has lots of
useful features.  Most of the time when you're asking questions,
though, they will be questions about R (unless they are specific
issues about the *interface*, they won't be RStudio-specific).

  It's hard to tell you where to start with "informative material".
What is most useful will depend on your background; your field/desired
type of analyses; language; personality; etc. etc..  There are literally
thousands of R tutorials and books, many available for free on the
internet.  I'd suggest you google "learning R programming", inspect
a dozen or so of the top hits, and see which ones seem to suit you
best.

  good luck,
Ben Bolker



Re: [R] How to solve an NLME problem?

2016-03-29 Thread Ben Bolker
David C Blouin  lsu.edu> writes:

>  I have a nonlinear model where I want to include random
> coefficients for a sample of random SUBJECTS. The fixed effects part
> of the model is Y ~ B + (T - B) / (1 + 10**(X - C)) and I would like
> to include random coefficients b for B, t for T, and c for C. The
> model is a fairly well-known three-parameter log-inhibitor model
> where Y is a measure of growth at X = log(CONCENTRATION), B is the
> unknown lower asymptote, T is the unknown upper asymptote, and C is
> the unknown log(IC50), where IC50 is the half maximal inhibitory
> concentration. I do not know how to solve this problem in R, where I
> am assuming NLME contains the appropriate methodology.
> [[alternative HTML version deleted]]

  something like

  nlme(Y ~ B + (T - B) / (1 + 10**(X - C)),
   random=B+T+C~group,
   data=your_data,
   start=list(fixed=c(B=something,T=something2,C=something3)))

 Follow-ups to r-sig-mixed-mod...@r-project.org, please ...



Re: [R] Regression with factor having1 level

2016-03-10 Thread Ben Bolker
Robert McGehee  gmail.com> writes:

> 
> Hello R-helpers,
> I'd like a function that given an arbitrary formula and a data frame
> returns the residual of the dependent variable, and maintains all
>  NA values.
> 
> Here's an example that will give me what I want if my formula is y~x1+x2+x3
> and my data frame is df:
> 
> resid(lm(y~x1+x2+x3, data=df, na.action=na.exclude))
> 
> Here's the catch, I do not want my function to ever fail due to a factor
> with only one level. A one-level factor may appear because 1) the user
> passed it in, or 2) (more common) only one factor in a term is left after
> na.exclude removes the other NA values.
> 

 [snip to try to make Gmane happy]
> 
> Can anyone provide me a straight forward recommendation for how 
> to do this?

  The only approach I can think of is to screen for single-level factors
yourself and remove these factors from the
formula. It's a little tricky; you can't call model.frame() with a single-level
factor (that's where the error comes from), and you have to strip out NA
values yourself so you can see which factors end up with only a single
level after NA removal.
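A rough sketch of that screening step (all names here are made up for
illustration, and it only handles predictors that are plain factor columns,
not factors constructed inside the formula):

```r
## Residuals from a formula, dropping factor predictors that are left with
## fewer than 2 levels after NA removal (which would otherwise make lm() fail)
resid_drop1 <- function(formula, data) {
  vars  <- all.vars(formula)
  resp  <- vars[1]
  preds <- vars[-1]
  cc <- complete.cases(data[vars])             # rows lm() would actually use
  keep <- vapply(preds, function(v) {
    x <- data[[v]][cc]
    !is.factor(x) || nlevels(droplevels(x)) >= 2
  }, logical(1))
  f2 <- reformulate(if (any(keep)) preds[keep] else "1", response = resp)
  resid(lm(f2, data = data, na.action = na.exclude))
}
```

For example, if every non-NA row falls in a single level of a factor, lm() on
the full formula stops with "contrasts can be applied only to factors with 2
or more levels", while this helper drops that factor and still returns a
residual vector with NAs preserved.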



Re: [R] Ruofei Mo - How can I generate correlated data with non-normal distribution?

2016-03-02 Thread Ben Bolker
Ruofei Mo【莫若飞】 <911mruofei  tongji.edu.cn> writes:

> 
> Hi, All,
> 
> I have a question about how to generate correlated data with non-normal
> distribution? Basic, I have a variable a that follows a normal distribution,
> a ~ N(0,1), then I want to generate another variable b that follows a
> uniform distribution, b ~ U(0, 1). Most importantly, I want the correlation
> between a and b to be fixed at -.9, cor(a,b) = -.90
> 
> I tried the following code,
> 
> ### Correlation matrix rmvnorm() function ###
> 

  I don't know that there's a closed-form solution to this problem.
Here's an attempt to do it by brute force.  By eyeball, you need to
set the nominal rho to about -0.92 to get a realized rho of -0.9.

simfun <- function(rho, n=1000) {
cormat <- matrix(c(1, rho, rho, 1), ncol = 2)
dd <- setNames(data.frame(MASS::mvrnorm(n, mu=c(0,0), Sigma=cormat)),
   c("a","trans"))
dd <- transform(dd,
   b=pnorm(trans,mean(trans),sd(trans)))
dd[,c("a","b")]
}

cvec <- seq(-0.999,-0.85,length=51)
res <- expand.grid(rho=cvec,rep=1:10)
set.seed(101)
res$cor <- sapply(res$rho,
  function(r) cor(simfun(rho=r))[1,2])

par(las=1,bty="l")
plot(cor~rho,data=res)
abline(a=0,b=1,col=2)
abline(h=-0.9,col=4)
abline(v=-0.92,col=4)

Re: [R] MCMCglmm and iteration

2016-02-16 Thread Ben Bolker
Rémi Lesmerises  yahoo.ca> writes:

>  Hi everyone, I'm running a bayesian regression using the package
> MCMCglmm (Hadfield 2010) and to reach a normal posterior
> distribution of estimates, I increased the number of iteration as
> well as the burnin threshold. However, it had unexpected
> outcomes. Although it improved the posterior distribution, it also
> dramatically increased the value of the estimates and decreased the DIC.
> Here's an example:

head(spring)

pres large_road  small_road  cab 
0  2011 32 78 
1   102179204 
0  1256654984 
1   187986756 
021438 57 
113  5439 

# pres is presence/absence data and other variable are distance to these
features
# with 200,000 iteration and 30,000 burnin

prior <- list(R = list(V = 1, nu=0.002))
sp.simple <- MCMCglmm(pres ~ large_road + cab + small_road,
   family = "categorical", nitt = 200000, thin = 200,
burnin = 30000, data = spring, prior = prior, verbose = FALSE, pr = TRUE)


(1) you will do much better with this kind of question on r-sig-mixed-models.
(2) it looks like your chain is mixing very, very badly.  If I'm reading
the output correctly, it looks like your effective sample sizes for the
first run (200K iterations) are 1-3 (!) -- you should be aiming for 
effective sample sizes of 100s to 1000s.  Even with a million iterations
you're only getting up to effective sample sizes of ~150 for some
parameters.  I would recommend (a) centring and scaling your parameters
to improve mixing and (b) cross-checking with a different method
(e.g. lme4 or glmmADMB) to make sure you're in the right ballpark.

  You shouldn't necessarily expect a Normal posterior as you increase
the number of iterations; the posterior distributions are only 
asymptotically Normal as the number of *observations* increases.

Re: [R] installing tikzDevices for R 3.2.3

2016-01-16 Thread Ben Bolker
Ranjan Maitra  inbox.com> writes:

> 
> On Fri, 15 Jan 2016 21:25:06 -0600 Ranjan Maitra 
inbox.com> wrote:
> 
> > Hi,
> > 
> > I wanted to install tikzDevices on a installation of 
> R (3.2.3) on a new machine. However, I am getting:
> > 
> > > install.packages('tikzDevices')
> > 
> > Warning message:
> > package ‘tikzDevices’ is not available (for R version 3.2.3) 
> > 
> > Is there any way out for me other than wait for the tikzDevices to be
updated in the repos? I am on Fedora 23
> with everything up to date.
> > 
> > Many thanks in advance for any suggestions and best wishes,
> > Ranjan
> 
> Sorry to answer my own question, but I found a way out 
> (only specific to tikzDevices):
> 
> install.packages("tikzDevice", repos="http://R-Forge.R-project.org")

  This is all a little bit surprising since tikzDevice seems
perfectly fine on CRAN: 
https://cran.rstudio.com/web/packages/tikzDevice/index.html

  What are getOption("repos") and sessionInfo() ?
  Perhaps there is some temporary binary-building delay because
R 3.2.3 just came out?

   Did you try install.packages("tikzDevice", type="source") ?


Re: [R] Find Crossover Points of Two Spline Functions

2015-09-28 Thread Ben Bolker
Dario Strbenac  uni.sydney.edu.au> writes:

> 
> Good day,
> 
> I have two probability densities, each with a function determined
> by splinefun(densityResult[['x']],
> densityResult[['y']], "natural"), where densityResult is the
> output of the density function in stats.
> How can I determine all of the x values at which the densities cross ?
> 

  My initial thought was that this is non-trivial, because the two densities
could cross (or nearly-but-not-quite cross) at an unlimited number of points.
I thought it would essentially boil down to "how do I find all
the roots of an arbitrary (continuous, smooth) function?"

  However, after thinking about it for a few more seconds I realize
that at least the functions are piecewise cubic.  I still don't see
a *convenient* way to do it ... if the knots were all coincident between
the two densities (maybe you could constrain them to be so?) then you
just have a difference of cubics within each segment, and you can use
polyroot() to find the roots (and throw out any that are complex or
don't fall within the segment).
  If the knots are not coincident it's more of a pain but you should
still be able to do it by considering overlapping segments ...
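Within a single segment the problem is just the real roots of a cubic, which
polyroot() handles; a toy example with a known cubic (the coefficients are
invented for illustration, not taken from an actual spline):

```r
## difference polynomial x^3 - 2x^2 - x + 2 = (x + 1)(x - 1)(x - 2);
## polyroot() takes coefficients in order of increasing degree
cf <- c(2, -1, -2, 1)
r  <- polyroot(cf)
real_r <- Re(r)[abs(Im(r)) < 1e-8]                 # discard complex roots
in_seg <- sort(real_r[real_r >= 0 & real_r <= 3])  # keep roots inside [0, 3]
in_seg                                             # 1 and 2
```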



Re: [R] Find Crossover Points of Two Spline Functions

2015-09-28 Thread Ben Bolker
Bert Gunter  gmail.com> writes:

> 
> Use ?uniroot to do it numerically instead of polyroot()?
> 
> Cheers,
> Bert
> Bert Gunter

  The problem with uniroot() is that we don't know how many intersections/
roots we might be looking for. With polyroot(), we know that there can
be at most 3 roots that lie within a particular cubic segment.

  (I'm not arguing that doing this is necessarily a good idea in
the larger context.)
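(When the number of roots is unknown, a common workaround is to bracket sign
changes on a fine grid and refine each bracket with uniroot(); a sketch with a
toy function standing in for the density difference:)

```r
f  <- function(x) sin(x) - 0.5         # stand-in for the difference of densities
g  <- seq(0, 10, length.out = 1001)    # fine grid over the region of interest
fv <- f(g)
idx <- which(diff(sign(fv)) != 0)      # grid intervals containing a sign change
roots <- vapply(idx,
                function(i) uniroot(f, c(g[i], g[i + 1]))$root,
                numeric(1))
roots                                  # near pi/6, 5*pi/6, pi/6 + 2*pi, 5*pi/6 + 2*pi
```

This can still miss tangential (touch-but-not-cross) intersections that fall
between grid points, which is exactly the pathology mentioned above.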



Re: [R] Accessing defunct package

2015-09-25 Thread Ben Bolker
Dennis Fisher  plessthan.com> writes:


> Colleagues,
> 
> In the past, I used a package:
>   SASxport
> to output files to SAS's XPT format.  This was useful because FDA requests
> that data be submitted in that format (even though they typically must
> reconvert to some other format before the data are used).
> 


  [snip]

  Try this?

library("devtools")
install_version("SASxport","1.5.0")

or download the last archived version from

https://cran.r-project.org/src/contrib/Archive/SASxport/SASxport_1.5.0.tar.gz

and use R CMD INSTALL from the command line or
install.packages("SASxport_1.5.0.tar.gz", repos=NULL)

Re: [R] maximum-likelihood-estimation with mle()

2015-09-12 Thread Ben Bolker
peter dalgaard  gmail.com> writes:

> 
> You are being over-optimistic with your starting values, and/or 
> with constrains on the parameter space. 
> Your fit is diverging in sigma for some reason known
>  only to nonlinear-optimizer gurus...
> 
> For me, it works either to put in an explicit
>  constraint or to reparametrize with log(sigma).

  [snip]

  Another few strategies:

set.seed(101)
x <- 1:10
y <- 3*x - 1 + rnorm(length(x), mean=0, sd=0.5)

library("stats4")
nLL <- function(a, b, sigma) {
  -sum(dnorm(y, mean=a*x+b, sd=sigma, log=TRUE))
}

fit <- mle(nLL, start=list(a=0, b=0, sigma=1), nobs=length(y))


   Nelder-Mead is generally more robust than BFGS, although slower:


fit2 <- mle(nLL, start=list(a=0, b=0, sigma=1),
 method="Nelder-Mead",nobs=length(y))

   bbmle can be used to simplify things slightly (this repeats
Peter Dalgaard's transformation of sigma):

library("bbmle")
fit3 <- mle2(y~dnorm(mean=mu,sd=exp(logsigma)),
   parameters=list(mu~x),
   data=data.frame(x,y),
   start=list(mu=0, logsigma=0))

  You can also profile out the sigma:

## dnorm with sd profiled out
dnorm2 <- function(x,mean,log=FALSE) {
  ssq <- sum((x-mean)^2)
  dnorm(x,mean,sd=sqrt(ssq/length(x)),log=log)
}
fit4 <- mle2(y~dnorm2(mean=mu),
   parameters=list(mu~x),
   data=data.frame(x,y),
   start=list(mu=0))



Re: [R] urgent question about Lmer models

2015-09-03 Thread Ben Bolker
Bert Gunter  gmail.com> writes:

> 
> You need to consult a local statistical expert or post on a statistics
> list like stats.stackexchange.com .  This is a statistical question
> not an R question.
> 
> The answer is: Because the design is balanced and the treatment error
> is obtained from the (within site) rsd, but you will probably need
> more explanation than this.
> 
> BTW, if you haven't already done so, try making some informative
> trellised plots to understand what is going on. Formal statistical
> analysis alone can be very misleading.
> 
> Cheers,
> Bert
> 
> Bert Gunter

   That's a good explanation, very similar to the one I posted
at http://stackoverflow.com/questions/32383259/
   lmertest-package-standard-errors  (broken URL for Gmane).

  I know most people who read this message will already know this,
but (to the original poster) *please don't cross-post to R lists and
StackOverflow*; it causes wasted effort, as in this case.

  Ben Bolker



Re: [R] equivalent of repeated measures anova using lmer

2015-08-20 Thread Ben Bolker
asudar aruna.sudarshan at gmail.com writes:

 
 I am trying to run a linear mixed effects model similar to the 2*2*2 anova
 design. My DV is reaction time and fixed factors are time (pre vs.
 post:within-subject), condition (congruent vs. incongruent: within subject)
 and stimulation (vertex vs. DLPFC: between subject)
 My concerns are:
 a)I have very few participants: 7 in the vertex condition and 7 in the
 DLPFC condition.
 b)2 out of the 7 participants participated in both vertex and DLPFC
 condition.
  How do I compute a nested lmer model with reaction time as DV and time,
 condition and stimulation as factors? and how do i account for random
 intercepts and slopes with few number of participants?
 Thanks for all the help

  You'll probably get more useful answers to this question on
the r-sig-mixed-mod...@r-project.org mailing list.

   7 participants is small but doesn't seem horribly small
(but it might preclude estimating a full random-slopes model).
You'll get more information on r-sig-mixed-models ...



Re: [R] Jaccard index

2015-08-14 Thread Ben Bolker
sreenath sreenath.rajur at macfast.ac.in writes:

 
 sim(file_name, method = "jaccard")
 this command is giving the row-wise similarity matrix
 how can i find the column-wise similarity matrix?
 what is the command?
 please help me
 

  Based on a search 

library(sos); findFn("jaccard sim")

I'm guessing that you're using the simba package (which you really
should have said in your message ...).  In general transposing the
matrix  

  sim(t(file_name), method = "jaccard")

should work.
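For a binary 0/1 matrix you can also cross-check the result with a few lines
of base R (jaccard_cols is my own helper, not part of simba):

```r
## Jaccard similarity between all pairs of columns of a binary matrix:
## |rows where both are 1| / |rows where at least one is 1|
jaccard_cols <- function(m) {
  K <- ncol(m)
  out <- diag(1, K)
  for (i in seq_len(K - 1)) {
    for (j in (i + 1):K) {
      both   <- sum(m[, i] & m[, j])
      either <- sum(m[, i] | m[, j])
      out[i, j] <- out[j, i] <- both / either
    }
  }
  out
}
m <- cbind(c(1, 1, 0, 0), c(1, 0, 1, 0), c(1, 1, 1, 1))
jaccard_cols(m)
```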

If you have ecology-related questions you might want to check out
the r-sig-ecol...@r-project.org mailing list ...



Re: [R] Predict Function use with GLM

2015-08-12 Thread Ben Bolker
trisgutt trisgutt at hotmail.com writes:

 
 I am currently using a GLM with Gaussian family to model fish depth~length +
 distance from shore: 
 
 model1 <- glm(Depth ~ length + distance_from_shore,
 family = gaussian(link = "log"))
 
 There are no zero depths. I would like to use the above model with the
 predict function in R to generate three lines (with confidence intervals)
 for the depth at size for three distances, say 100 m, 500 m and 1000m.
 Problem is I am unable to figure out how to do this? 
 
 Any advice / assistance would be gratefully received...
 
 Thank you
 
 Tris
 

Something like:

pp <- expand.grid(distance=c(100,500,1000),
   length=seq(min_length,max_length,length.out=51))
## 51 is arbitrary -- you just need enough points to make the curve
## appear smooth

predvals <- predict(model1,newdata=pp,se.fit=TRUE,type="link")
pp <- transform(pp,est=exp(predvals$fit),
lwr=exp(predvals$fit-1.96*predvals$se.fit),
upr=exp(predvals$fit+1.96*predvals$se.fit))



Re: [R] Error with predict and newdata

2015-08-08 Thread Ben Bolker
David Winsemius dwinsemius at comcast.net writes:

 
 
 On Aug 7, 2015, at 2:41 PM, kira taylor wrote:
 
  Hi!
  
  I am trying to use predict to apply my model to data from
  one time period to
  see what might be the values for another time period.  I did this
  successfully for one dataset, and then tried on another
  with identical code
  and got the following error:
  
  Error in eval(predvars, data, env) :
   numeric 'envir' arg not of length one
  

  Please don't cross-post on StackOverflow and the r-help lists
(it will usually lead to duplicated/wasted effort).  If you must,
at least post a link/indicate in each venue that you have cross-posted
to the other: 

(line-broken link, reassemble to visit)

http://stackoverflow.com/questions/31887043/
error-with-predict-and-newdata-dependent-on-number-of-
predictor-variable-in-mod/31887398#31887398



Re: [R] help plotting glmm values

2015-08-07 Thread Ben Bolker
Luis Fernando García luysgarcia at gmail.com writes:

 
 Dear Fellows,
 
 I´m sorry if my question is very basic, but I'm still new in the field of R
 and the GLMM.

  In future, you might consider posting to r-sig-mixed-mod...@r-project.org
instead ...

 I have made a following glmm, and I need to plot the predicted mean values
 from the model and the CI, with this barplot +/- the CI, using the prey as
 explanatory factor.
 
 Although I have been looking for it, I have not found a website which shows
 the way to do this, so I'm unsure about the right procedure. If any of you
 could help me with the correct procedure I would really appreciate it!

See:

http://glmm.wikidot.com/faq#predconf
http://ms.mcmaster.ca/~bolker/R/misc/foxchapter/  

Some modified analysis:

library(lme4)
glmm1 <- glmer(Acc ~ Prey + (1 | Sp), data=Lac, family=binomial)
summary(glmm1)
deviance(glmm1)/df.residual(glmm1)
glmm2 <- update(glmm1,nAGQ=10)

## aggregate
library(plyr)
Lac_agg <- ddply(Lac,c("Prey","Sp"),
 summarise,
 N=length(Acc),
 p=mean(Acc))
Lac_agg <- aggregate(Acc~Prey*Sp,FUN=function(x) c(k=sum(x),N=length(x)),
 data=Lac)
glmm3 <- glmer(p ~ Prey + (1 | Sp), weights=N,
   data=Lac_agg, family=binomial)
deviance(glmm3)/df.residual(glmm3)
## model with ind-level obs blows up: leave it alone ...
__

Re: [R] Contr.sum and coefficient tests

2015-07-23 Thread Ben Bolker
François Collin stxfc at nottingham.ac.uk writes:

 
 Dear all,
 
 I would like to run a linear model which includes two factors:

 - The first one has two levels, including a reference level. Thus I
 have to use the treatment contrast (contr.treatment, reference level
 effect = 0, then the intercept).

 - The second is a 6-level factor without reference contrast nor
 order. So, I would like to use sum contrat: sum of the effects = 0.

 The problem arises when it comes to the coefficient test. I
 understand it is not relevant to test the reference level for the
 first factor as the reference level is set to 0. However, using sum
 contrast for the second factor, I would have expected the test of
 each level to be included in the classical summary print of the lm
 function result but it is not. And here is my problem, how can I
 have every coefficients tested and printed in the summary output
 when my factor is studied from this sum.contrast standpoint?

  [some context snipped to make gmane happy -- sorry ]

  I think you should look at the effects package or the lsmeans
package (or possibly the multcomp package, if you want to be careful
about the number of (non-orthogonal) tests) -- the issue is that
summary.lm always reports on the *parameters* estimated.  If some
parameters are not independently estimable (e.g. the effect corresponding
to the last level can be reconstructed from the parameter values
for all of the preceding levels), then summary.lm() won't give
them to you.

  In fact, these packages will (probably) give you the results you
want even if the original model uses default treatment contrasts.
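The sum-to-zero constraint can be illustrated in base R with simulated data: summary() reports only the first k-1 coefficients, and the last level's effect is recoverable as minus their sum.

```r
set.seed(101)
## simulated 6-level factor (hypothetical data, for illustration only)
d <- data.frame(f = gl(6, 10), y = rnorm(60))
m <- lm(y ~ f, data = d, contrasts = list(f = "contr.sum"))
co <- coef(m)[-1]   # summary(m) reports only f1..f5
eff6 <- -sum(co)    # the 6th level's effect is determined by the others
all.equal(sum(c(co, eff6)), 0)   # effects sum to zero by construction
```
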
__

Re: [R] Statistical distribution not fitting

2015-07-23 Thread Ben Bolker
Amelia Marsh via R-help r-help at r-project.org writes:

 
 Dear Sir,
 

 [snip]
 
 Lastly using the command rsnorm(1, mean = m, sd = s, xi = x)
 where m, s and x are the estimated parameters obtained from loss
 data. The usual procedure is to arrange these simulated values in
 descending order and select an observation representing (say 99.9%)
 and this is Value at Risk (VaR) which is say 'p'.

 My understanding is to this value 'p', I need to apply the
 transformation 10^p to arrive at the value which is in line with my
 original loss data. Am I right?

 [snip; sorry to remove context, but Gmane doesn't like it]

(1) you can probably calculate the 0.999 quantile directly from
qsnorm(0.999, [params]) rather than by simulating ...
(2) ... I believe that my original example used log(), so you
would need to use exp() (not 10^x) to get back to the original scale ...
(3) ... if you're concerned about extreme events it would be
a very good idea to use the skew-t rather than the skew-Normal
(4) you should certainly consider Boris Steipe's concerns about
non-independence (although I have to admit that without more
information and further time/effort/thought I don't have any
simple suggestions how ...)
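Point (1) can be illustrated in base R with a plain normal in place of the skew-normal (made-up parameters; substitute qsnorm() from fGarch for the fitted skew-normal):

```r
set.seed(1)
m <- 3; s <- 1.2                              # hypothetical fitted parameters
q_direct <- qnorm(0.999, mean = m, sd = s)    # direct 99.9% quantile
q_sim <- quantile(rnorm(1e6, m, s), 0.999)    # simulate-and-sort version
c(direct = q_direct, simulated = unname(q_sim))  # agree up to Monte Carlo noise
exp(q_direct)   # back-transform if the data were log()-transformed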

  cheers
Ben Bolker

__


Re: [R] Statistical distribution not fitting

2015-07-22 Thread Ben Bolker
Amelia Marsh amelia_marsh08 at yahoo.com writes:

 
 Hello!  (I don't know if I can raise this query on this forum,
 but I had already raised it on the finance forum and have not received
 any suggestion, so now I am raising it on this list. Sorry for the
 duplication. The query is about what to do if no statistical
 distribution fits the data.)
 
 I am into risk management and deal with operational risk. As a part
 of the BASEL II guidelines, we need to arrive at the capital charge the
 banks must set aside to counter any operational risk, if it
 happens. As a part of the Loss Distribution Approach (LDA), we need to
 collate past loss events and use these loss amounts. The usual
 process as practised in the industry is as follows -
 
 Using these historical loss amounts and using the various
 statistical tests like KS test, AD test, PP plot, QQ plot etc, we
 try to identify best statistical (continuous) distribution fitting
 this historical loss data. Then using these estimated parameters
 w.r.t. the statistical distribution, we simulate say 1 million loss
 amounts and then, taking the appropriate percentile (say 99.9%), we
 arrive at the capital charge.
 
 However, many a times, loss data is such that fitting of
 distribution to loss data is not possible. May be loss data is
 multimodal or has significant variability, making the fitting of
 distribution impossible.  Can someone guide me how to deal with such
 data and what can be done to simulate losses using this historical
 loss data in R.
 
A skew-(log)-normal fit doesn't look too bad ... (whenever you
have positive data that are this strongly skewed, log-transforming
is a good step)

hist(log10(mydat), col="gray", breaks="FD", freq=FALSE)
## default breaks are much coarser:
## hist(log10(mydat), col="gray", breaks="Sturges", freq=FALSE)
lines(density(log10(mydat)), col=2, lwd=2)
library(fGarch)
ss <- snormFit(log10(mydat))
xvec <- seq(2, 6.5, length=101)
lines(xvec, do.call(dsnorm, c(list(x=xvec), as.list(ss$par))),
      col="blue", lwd=2)
## or try a skew-Student-t: not very different:
ss2 <- sstdFit(log10(mydat))
lines(xvec, do.call(dsstd, c(list(x=xvec), as.list(ss2$estimate))),
      col="purple", lwd=2)

There are more flexible distributional families (Johnson,
log-spline ...)

Multimodal data are a different can of worms -- consider
fitting a finite mixture model ...

__


Re: [R] glm help - final predictor variable NA

2015-07-22 Thread Ben Bolker
matthewjones43 matthew.jones at kellogg.ox.ac.uk writes:

 
 Hi, I am not a statistician and so I am sure that whatever I am doing
 wrong must be an obvious error for those who are... Basically, I cannot
 understand why I get NA for the variable 'CDSTotal' when running a glm.
 Does anyone have an idea of why this might be happening?

It might be useful to look at 
http://stackoverflow.com/questions/7337761/
 linear-regression-na-estimate-just-for-last-coefficient/7341074#7341074

(broken URL).  You are overfitting the model by including
a predictor that can be expressed as a linear combination of
other predictors, and R is trying to handle it automatically.
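This aliasing is easy to reproduce in base R: when one predictor is an exact linear combination of the others, lm() drops it and reports NA for its coefficient.

```r
set.seed(1)
x1 <- rnorm(20); x2 <- rnorm(20)
x3 <- x1 + x2                    # exact linear combination of x1 and x2
y  <- 1 + x1 - x2 + rnorm(20)
coef(lm(y ~ x1 + x2 + x3))       # x3 comes back NA: it is aliased
```
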

__


Re: [R] Why does dredge() rank models differently for lmer() and MCMCglmm()?

2015-07-18 Thread Ben Bolker
Corina itsme at CorinaLogan.com writes:

 
 Hello,
 I am running my full model (fm) through lmer() and MCMCglmm() using the
 default settings:
 
 model.lmer - lmer(fm)
 model.MCMCglmm - MCMCglmm(fm)
 

  [snip]

 However, when I run the models through dredge():
 
 dredge(model.lmer)
 dredge(model.MCMCglmm)
 
 It ranks the models very differently for lmer() and MCMCglmm() in
 the model selection tables, even though I used the default settings
 for dredge() as well (ranked by AICc). I would assume the difference
 is because lmer() uses REML and MCMCglmm() uses Bayes Rule, however
 if this is the case then why weren’t the summary outputs different
 as well?
 
  [snip]

   This question *might* be better on r-sig-mixed-mod...@r-project.org ...
You shouldn't be comparing models with fixed effects that are fitted
with REML -- try  refitting with REML=FALSE.  The model comparison
functions in the lme4 package (e.g. anova()) try to stop you from
making this mistake, but I'm not sure that MuMIn::dredge does.
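The refit can be sketched as follows (a hedged sketch using lme4's sleepstudy example rather than the poster's model; only the ML fit should feed information-criterion comparisons of fixed effects):

```r
library(lme4)
fm_reml <- lmer(Reaction ~ Days + (Days | Subject), sleepstudy)  # REML (default)
fm_ml   <- update(fm_reml, REML = FALSE)                         # ML refit
c(REML = AIC(fm_reml), ML = AIC(fm_ml))  # use the ML value when comparing
                                         # models with different fixed effects
```
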

__

Re: [R] A question about nlmer in lme4 package

2015-07-11 Thread Ben Bolker
Patty Haaem via R-help r-help at r-project.org writes:


 Dear all,I have studied “Mixed models in R using the lme4 package
 Part 6: Nonlinear mixed models” by Douglas Bates. In this tutorial,
 there are some codes to fit nonlinear mixed models for Theoph
 data. The codes are as [follows:]

Th.start <- c(lKe = -2.5, lKa = 0.5, lCl = -3)
nm1 <- nlmer(conc ~ SSfol(Dose, Time, lKe, lKa, lCl) ~
    0 + lKe + lKa + lCl + (0 + lKe | Subject) + (0 + lKa | Subject) +
    (0 + lCl | Subject), nAGQ = 0, Theoph, start = Th.start, verbose = TRUE)


   I want to add a covariate (like age) to the CL parameter. How should I
 modify the above code? What's more, how were the initial
 values (lKe = -2.5, lKa = 0.5, lCl = -3) selected?  Thanks in advance. Elham

  You'll do better asking this question on r-sig-mixed-mod...@r-project.org
The short answer (to the first question) is that it's not easy, but it
can be done, see e.g.

http://stackoverflow.com/questions/15141952/nlmer-longitudinal-data
http://rpubs.com/bbolker/3423

  Ben Bolker
__

Re: [R] Hypergeometric Function seems to give wrong results

2015-07-08 Thread Ben Bolker
Carlos Nasher via R-help r-help at r-project.org writes:



[snip]

 I need to evaluate the Hypergeometric Function of the 2nd kind (Tricomi
 confluent hypergeometric function). Therefore I'm using the kummerU
 function from the fAsianOptions package. It seems to me that kummerU is
 giving wrong results. Here's an example:
 
 library(fAsianOptions)
 kummerU(a=19, b=19, x = 10)
 
 R gives 1838.298 for the real part.
 
 If I use Mathematica via the wolfram site (
 http://functions.wolfram.com/webMathematica
   /FunctionEvaluation.jsp?name=HypergeometricU)
 the result is 3.52603e-20 which is more reasonable in the context of my
 analysis.
 

 [snip]

  Your best bet is probably to contact the package maintainer
(use maintainer("fAsianOptions") to see who it is, or look on
the CRAN page for the package).  If this functionality is very
commonly used in finance you might try the r-sig-finance mailing
list (indicating that you've already tried here).

  good luck
Ben Bolker

__


Re: [R] Cross-over Data with Kenward-Roger correction

2015-06-09 Thread Ben Bolker
knouri nouri4 at yahoo.com writes:

 
 Dear all: for the following data, a two-period, two-treatment (A=1 vs. B=2)
 cross-over is fitted
 using the following SAS code.
 data one;

[snip]

 run;
 proc mixed data=one method=reml;
 class Sbj Per Trt;
    model PEF = Per Trt /ddfm=kr;
    repeated Trt / sub=Sbj type=un r;
    lsmeans Trt / cl alpha=0.05;
    estimate 'B vs. A' Trt -1  1 / alpha=0.1 cl;
 run;

 (where the kr option is for the Kenward-Roger method). I need to use R to
 reproduce results similar to what the above SAS code generates.
 I have used several R functions including lme and lmer with no success
 so far. Any advice will be greatly appreciated. Sincerely, Keramat


This is more appropriate for r-sig-mixed-mod...@r-project.org.
Please post followups there.

The lmerTest and lsmeans packages will probably be useful.

As a statistical point, I don't understand why you can't just
do a paired t-test on these data??

dat <- read.table(header=TRUE, text="
Sbj Seq Per Trt PEF
1 1 1 1 310
1 1 2 2 270
4 1 1 1 310
4 1 2 2 260
6 1 1 1 370
6 1 2 2 300
7 1 1 1 410
7 1 2 2 390
10 1 1 1 250
10 1 2 2 210
11 1 1 1 380
11 1 2 2 350
14 1 1 1 330
14 1 2 2 365
2 2 1 2 370
2 2 2 1 385
3 2 1 2 310
3 2 2 1 400
5 2 1 2 380
5 2 2 1 410
9 2 1 2 290
9 2 2 1 320
12 2 1 2 260
12 2 2 1 340
13 2 1 2 90
13 2 2 1 220")

library(lmerTest)
library(ggplot2); theme_set(theme_bw())
ggplot(dat, aes(x=Per, y=PEF, colour=factor(Trt))) + geom_point() +
    geom_line(colour="gray", aes(group=Sbj)) +
    scale_x_continuous(breaks=c(1,2))

m1 <- lmer(PEF ~ Per + Trt + (Trt|Sbj), data=dat)

## warning about unidentifiability
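The paired t-test suggested above can be sketched in base R. For a self-contained example this uses only a hypothetical subset of the data (the first three sequence-1 subjects); with the full dat from above, reshape to wide and test the within-subject A-B difference the same way:

```r
## hedged sketch: paired t-test on a small hypothetical subset of the data
dat2 <- read.table(header = TRUE, text = "
Sbj Trt PEF
1 1 310
1 2 270
4 1 310
4 2 260
6 1 370
6 2 300
")
wide <- reshape(dat2, direction = "wide", idvar = "Sbj", timevar = "Trt")
t.test(wide$PEF.1, wide$PEF.2, paired = TRUE)  # test of the Trt 1 - Trt 2 difference
```
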

__

Re: [R] Different random intercepts but same random slope for groups

2015-06-09 Thread Ben Bolker
li li hannah.hlx at gmail.com writes:

 

[snip]

   I'd like to fit a random intercept and random slope model. In my
 data, there are three groups. I want to have different random
 intercept for each group but the same random slope effect for all
 three groups. I used the following R command.
 However, there seems to be some problem. Any suggestions?

  Please do not cross-post to more than one R list (in this
case, r-sig-mixed-models is more appropriate, and you've already
gotten some answers there).
 
  Ben Bolker

__


Re: [R] Why I am not able to load library(R.matlab)? Other packages are fine.

2015-05-29 Thread Ben Bolker
C W tmrsg11 at gmail.com writes:

 
 Hi Henrik,
 
 I don't quite get what I should do here.  I am not familiar with
 R.methodS3.  Can you tell me what command exactly do I need to do?
 
 Thanks,
 
 Mike

install.packages("R.methodsS3")
install.packages("R.matlab")
library(R.matlab)



  [snip snip snip]

__


Re: [R] Why I am not able to load library(R.matlab)? Other packages are fine.

2015-05-29 Thread Ben Bolker

 I think Henrik's point (which I merely clarified) was that something
funky (we'll probably never know what, and it's not worth figuring out
unless it happens again/to other people) had gone wrong and that the
easiest thing to do was just to reinstall.

References:
* https://www.youtube.com/watch?v=t2F1rFmyQmY
*
http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.208.9970rep=rep1type=pdf


On 15-05-29 05:11 PM, C W wrote:
 Wow, thanks Ben.  That worked very well.
 
 I guess I didn't have R.methodS3?  But that doesn't make sense,
 because I was using R.matlab few weeks ago.  I believe I was on R
 3.1.
 
 Maybe it's in R 3.1 folder?  I am using a Mac, btw.
 
 Cheers,
 
 -M
 
 On Fri, May 29, 2015 at 1:55 PM, Ben Bolker bbol...@gmail.com
 wrote:
 
 C W tmrsg11 at gmail.com writes:
 
 
 Hi Henrik,
 
 I don't quite get what I should do here.  I am not familiar
 with R.methodS3.  Can you tell me what command exactly do I
 need to do?
 
 Thanks,
 
 Mike
 
 install.packages("R.methodsS3"); install.packages("R.matlab")
 library(R.matlab)
 
 
 
 [snip snip snip]
 
 __ 
 
 


__


Re: [R] [R-sig-ME] different results from lme and lmer function

2015-05-26 Thread Ben Bolker
  These actually aren't terribly different from each other.  I suspect
that lmer is slightly closer to the correct answer, because lme
reports a log-likelihood (really -1/2 times the REML criterion) of
49.30021, while lmer reports a REML criterion of  -98.8 - slightly
better fit at -R/2 = 49.4.  The residual sds are 0.0447 (lme) vs.
0.0442 (lmer); the intercept sd estimates are 0.0089 (lme) vs. 0.0169
(lmer), admittedly a bit different, and both month sds are very small.  lmer
indicates a singular fit (correlation of -1). If you look at the
confidence intervals on these estimates (confint(fitted_model) in
lme4; intervals(fitted_model) in lme) I think you'll find that the
confidence intervals are much wider than these differences (you may
even find that lme reports that it can't give you the intervals
because the Hessian [curvature] matrix is not positive definite).

  Both should be comparable to SAS PROC MIXED results, I think, if
you get the syntax right ...

On Tue, May 26, 2015 at 7:09 PM, li li hannah@gmail.com wrote:
 Hi all,
   I am fitting a random slope and random intercept model using R. I
 used both lme and lmer funciton for the same model. However I got
 different results as shown below (different variance component
 estimates and so on). I think that is really confusing. They should
 produce close results. Anyone has any thoughts or suggestions. Also,
 which one should be comparable to sas results?
  Thanks!
   Hanna

 ## using lme function
 mod_lme <- lme(ti ~ type*months, random = ~ 1+months|lot, na.action=na.omit,
+   data=one, control = lmeControl(opt = "optim"))
 summary(mod_lme)
 Linear mixed-effects model fit by REML
  Data: one
 AIC   BIC   logLik
   -82.60042 -70.15763 49.30021

 Random effects:
  Formula: ~1 + months | lot
  Structure: General positive-definite, Log-Cholesky parametrization
 StdDev   Corr
 (Intercept) 8.907584e-03 (Intr)
 months  6.039781e-05 -0.096
 Residual4.471243e-02

 Fixed effects: ti ~ type * months
  Value   Std.Error DF   t-value p-value
 (Intercept) 0.25831245 0.016891587 31 15.292373  0.
 type0.13502089 0.026676101  4  5.061493  0.0072
 months  0.00804790 0.001218941 31  6.602368  0.
 type:months -0.00693679 0.002981859 31 -2.326329  0.0267
  Correlation:
(Intr) typPPQ months
 type   -0.633
 months -0.785  0.497
 type:months  0.321 -0.762 -0.409

 Standardized Within-Group Residuals:
   MinQ1   MedQ3   Max
 -2.162856e+00 -1.962972e-01 -2.771184e-05  3.749035e-01  2.088392e+00

 Number of Observations: 39
 Number of Groups: 6




 ###Using lmer function
 mod_lmer <- lmer(ti ~ type*months + (1+months|lot), na.action=na.omit,
  data=one)
 summary(mod_lmer)
 Linear mixed model fit by REML t-tests use Satterthwaite approximations to
   degrees of freedom [merModLmerTest]
 Formula: ti ~ type * months + (1 + months | lot)
Data: one

 REML criterion at convergence: -98.8

 Scaled residuals:
 Min  1Q  Median  3Q Max
 -2.1347 -0.2156 -0.0067  0.3615  2.0840

 Random effects:
  Groups   NameVariance  Std.Dev.  Corr
  lot  (Intercept) 2.870e-04 0.0169424
   months  4.135e-07 0.0006431 -1.00
  Residual 1.950e-03 0.0441644
 Number of obs: 39, groups:  lot, 6

 Fixed effects:
 Estimate Std. Error df t value Pr(>|t|)
 (Intercept) 0.258312   0.018661  4.82  13.842 4.59e-05 ***
 type 0.135021   0.028880  6.802000   4.675  0.00245 **
 months  0.008048   0.001259 11.943000   6.390 3.53e-05 ***
 type:months -0.006937   0.002991 28.91  -2.319  0.02767 *
 ---
 Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

 Correlation of Fixed Effects:
 (Intr) typPPQ months
 type -0.646
 months  -0.825  0.533
 type:month  0.347 -0.768 -0.421

 ___
 r-sig-mixed-mod...@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models

__

Re: [R] lme function to obtain pvalue for fixed effect

2015-05-26 Thread Ben Bolker
li li hannah.hlx at gmail.com writes:

 
 Hi all,
   I am using the lme function to run a random coefficient model.
 Please see
 output (mod1) as below.

  Please don't cross-post to different R lists (this is explicitly
deprecated by the list policy: http://www.r-project.org/mail.html, 
cross-posting is considered to be impolite).  r-sig-mixed-models
seems to be more appropriate for these questions.

  sincerely
Ben Bolker

__


Re: [R] Cannot run install binary package on Windows 8

2015-04-11 Thread Ben Bolker
Maciej Węgorkiewicz wegorkie at gmail.com writes:

 
 Hi all,
 
 I am just starting with R and cannot use lmtest package on Windows 8.
 This package contains dwtest function that I am interested in.
 
 I began with installing the core of R (with rgui) - version 3.1.3 64-bit.
 
 Then I installed lmtest package using GUI menu, but after success
 message I cannot run dwtest() getting message that the it cannot find
 the function.
 

  This sounds like R FAQ 7.30
http://cran.r-project.org/doc/FAQ/R-FAQ.html#
I-installed-a-package-but-the-functions-are-not-there

(URL broken to make gmane happy)

  Try library(lmtest).  Everything else you've done after 
initially installing the package sounds like a wild goose chase ...
__

Re: [R] Question on CCA and RDA analysis

2015-04-10 Thread Ben Bolker
Luis Fernando García luysgarcia at gmail.com writes:

 
 Dear R experts,
 
 I wanted to know if you can suggest me any website or tutorial just to
 learn about how to make a RDA or CDA in R
 
 Thanks in advance!

  I hate to ask, but did you try Googling

"canonical correspondence analysis R"

... ?


__

Re: [R] Package build system adds line break in DESCRIPTION URL

2015-04-03 Thread Ben Bolker
Daniel Lewandowski daniel at nextpagesoft.net writes:

 
 Has anybody noticed that if field URL in DESCRIPTION contains a uri with 
 66 or more characters, then file DESCRIPTION in the resulting package 
 includes a line break at the beginning?
 
 So this (source DESCRIPTION):
 
 URL: 
 http://ecdc.europa.eu/en/data-tools/
   seroincidence-calculator-tool/Pages/default.aspx

 
 becomes (again file DESCRIPTION, but inside the package)
 
 URL:
 http://ecdc.europa.eu/en/data-tools/seroincidence-calculator-tool/
   Pages/default.aspx
 
 This has been tested with R on Windows 8.1 (devel 01/04/2015 and 3.1.3)
 and Linux Mint (3.1.3). It has many sad implications, including such
 packages not being accepted to CRAN.
 

  (links line-broken above to make gmane happy).

  Haven't tested, but seems it would not be *too* hard to trace
through the code to see what's happening in the package-building
process.  Two thoughts:

 (1) as a workaround, could you use a URL-shortener such as tinyurl?
tinyurl allows you to specify a somewhat meaningful name for the
shortened URL, e.g. http://tinyurl.com/reproducible-000

 (2) this feels like it is more suitable for r-de...@r-project.org

  Ben Bolker

__


Re: [R] temporal autocorrelation in MCMCglmm

2015-03-30 Thread Ben Bolker
David Villegas Ríos chirleu at gmail.com writes:

 Hi,
 For a number of individuals, I have measured several behavioral traits in
 the wild. Those traits (e.g. home range) can be estimated on different
 temporal scales, for example daily, weekly or monthly. I want to estimate
 repeatability of those traits, assuming that the daily/weekly/monthly
 measurements represent replicates. I have 3 months (90 days) of data for
 each trait. Two questions:
 
 1) How can I assess if there is temporal autocorrelation in my model? I guess
 that if I consider daily measurements as replicates (90 replicates), I will
 have some autocorrelation, but if I use just monthly measurements (3
 replicates) maybe I avoid it.
 
 2) How can I account for temporal autocorrelation in MCMCglmm?
 
 Sorry for these pretty basic questions, but I haven't found an answer so far.

You'll probably be better off asking this question at r-sig-mixed-models
(at) r-project.org.

As a first pass, you might be able take the residuals from your fit and use
acf() to compute the autocorrelation function.  Actually, though, you'll
probably be better off fitting a 'null' lme() model (fixed=resid~1,
random=~1|individual) and then using the ACF() method (not the same
thing as acf()) on the resulting model fit.
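That null-model-plus-ACF() workflow can be sketched with nlme (a hedged sketch on simulated AR(1) data with hypothetical variable names standing in for the poster's traits):

```r
library(nlme)
set.seed(1)
## simulated stand-in: 10 individuals, 90 daily measurements each,
## with genuine lag-1 autocorrelation (AR coefficient 0.5)
dd <- data.frame(individual = gl(10, 90), day = rep(1:90, 10))
dd$resid <- as.vector(replicate(10, arima.sim(list(ar = 0.5), 90)))
fit0 <- lme(resid ~ 1, random = ~ 1 | individual, data = dd)
ACF(fit0, maxLag = 10)   # within-group autocorrelation by lag (lag 0 is 1)
```
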

  Ben Bolker

__

Re: [R] Error in lm() with very small (close to zero) regressor

2015-03-29 Thread Ben Bolker
RiGui raluca.gui at business.uzh.ch writes:



[snip]
 
 I am terribly sorry for the code not being reproducible; it is the
 first time I am posting here. I ran the code several times before I
 posted, but... I forgot about the library used.

  Thanks for updating.
 
 To answer to your questions:
 
 How do you know this answer is correct? 
 
 What I am doing is actually a fixed effect estimation. I apply a
 projection matrix to the data, both dependent and independent
 variables, projection which renders the regressors that do not vary,
 equal to basically zero - the x1 from the post.
 
 Once I apply the projection, I need to run OLS to get the estimates,
 so x1 should be zero.

  Yes, but not *exactly* zero.
 
 Therefore, the results with the scaled regressor is not correct. 
 
 Besides, I do not see why the bOLS is wrong, since is the formula of
 the OLS estimator from any Econometrics book.

 Because it's numerically unstable.
 Unfortunately, you can't always translate formulas directly from
books into code and expect them to be reliable.

  Based on Peter's comments, I believe that as expected lm()
is actually getting closer to the 'correct' answer.

 Here again the corrected code: 

 library(corpcor)
 
 n_obs <- 1000
 y  <- rnorm(n_obs, 10, 2.89)
 x1 <- rnorm(n_obs, 0.01235657, 0.45)
 x2 <- rnorm(n_obs, 10, 3.21)
 X  <- cbind(x1, x2)

As Peter points out, one example uses an intercept and the
other doesn't: you should either use X <- cbind(1, x1, x2) or
lm(y ~ x1 + x2 - 1) for compatibility.

  bFE <- lm(y ~ x1 + x2)
  bFE
 
  bOLS <- pseudoinverse(t(X) %*% X) %*% t(X) %*% y
  bOLS

[snip: Gmane doesn't like it if I quote too much]

__


Re: [R] Help w/ variable names in loop with lmer

2015-03-29 Thread Ben Bolker
David Crow david.crow at cide.edu writes:

 
 Hi, R users-
 
 I'm estimating random effects models with cross-level interactions; I want
 to interact each of a vector of level-1 variables with each of a vector of
 level-2 variables.  Here's the code:
 
 
 #create data frame with level-1 variables
 k <- as.data.frame(cbind(robo, asalto, secuestro, asesinato))
 
 #create data frame with level-2 variables
 l <- as.data.frame(cbind(IDH_IDH, IDH_ingpc, eco_pb, IM_indice, tasa_robo,
 hom_tasa, totdelitos1, totdelitos2, total, pri, pan, prd))
 
 #get cross-level interactions
 
 for (i in 1:length(k)) {
 for (j in 1:length(l)) {
 print(summary(lmer(hrprotcrim ~ k[,i]*l[,j] + (k[,i] | Municipio))))
 }
 }
 ==
 
 The code works and produces 48 (4 level-1 x 12 level-2) sets of output.
 The problem is, the output is illegible because instead of the variable
 names, I get the indices:
 
 [output]
 ==
 Linear mixed model fit by REML ['lmerMod']
 Formula: hrprotcrim ~ k[, i] * l[, j] + (k[, i] | Municipio)
 

[snip]

 Two questions:
 
 1)  How can I get variable names instead of indices in the above output
 2)  How can I estimate this with mapply instead of the double loop?
 
 Here's the code for mapply
 
 M4 <- mapply(function(k,l){summary(lmer(hrprotcrim ~ k*l + (k |
 Municipio)))})
 
 And here's what I get:
 
 list()
 

I would suggest appropriate use of ?reformulate

Assume dd is a data frame containing all variables

lev1vars <- c("robo", "asalto", "secuestro", "asesinato")
lev2vars <- c("IDH_IDH", "IDH_ingpc", ...)
ffun <- function(L1var, L2var) {
    ff <- reformulate(c(paste(L1var, L2var, sep = "*"),
                        paste0("(", L1var, " | Municipio)")),
                      response = "hrprotcrim")
    environment(ff) <- parent.frame()  ## trickiness
    return(summary(lmer(ff, data = dd)))
}
mapply(ffun, lev1vars, lev2vars)

?

If you had given a reproducible example I could have tested this ...
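The formula-building step itself is base R and can be checked on its own (hypothetical variable names from the question):

```r
## what reformulate() builds from one level-1/level-2 pair
ff <- reformulate(c("robo*IDH_IDH", "(robo | Municipio)"),
                  response = "hrprotcrim")
ff   # a formula: hrprotcrim ~ robo * IDH_IDH + (robo | Municipio)
```
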

__

