Re: [R] serialize does not work as expected

2020-08-29 Thread Jeff King
Compact sequences are actually ALTREP objects. I do not know whether there is
any standard way to do it, but here is a trick that does what you want.

```
> x <- 1:3
> .Internal(inspect(x))
@0x0196bed8dd78 13 INTSXP g0c0 [NAM(7)]  1 : 3 (compact)
> x[1] <- x[1]
> .Internal(inspect(x))
@0x0196bef90b60 13 INTSXP g0c2 [NAM(7)] (len=3, tl=0) 1,2,3
```
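
If you need this repeatedly, the trick can be wrapped in a small helper (a
sketch assuming only base R; `normalize` is a hypothetical name):

```
normalize <- function(x) {
  # subassignment forces R to expand the compact ALTREP representation
  if (length(x) > 0L) x[1] <- x[1]
  x
}
x <- 1:3
.Internal(inspect(normalize(x)))  # no longer prints "(compact)"
```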

Best,
Jiefei

On Sat, Aug 29, 2020 at 1:10 PM Sigbert Klinke wrote:

> Hi,
>
> is there in R a way to "normalize" a vector from
> compact_intseq/compact_realseq to a "normal" vector?
>
> Sigbert
>
> Am 29.08.20 um 18:13 schrieb Duncan Murdoch:
> > Element 1
> > A
> > 3
> > 262146
> > 197888
> > 5
> > UTF-8
> > 238
> > 2
> > 1
> > 262153
> > 14
> > compact_intseq
> > 2
> > 1
> > 262153
> > 4
> > base
> > 2
> > 13
> > 1
> > 13
> > 254
> > 14
> > 3
> > 3
> > 1
> > 1
> > 254
> >
> > Element 2
> > A
> > 3
> > 262146
> > 197888
> > 5
> > UTF-8
> > 238
> > 2
> > 1
> > 262153
> > 15
> > compact_realseq
> > 2
> > 1
> > 262153
> > 4
> > base
> > 2
> > 13
> > 1
> > 14
> > 254
> > 14
> > 3
> > 3
> > 1
> > 1
> > 254
> >
> > Element 3
> > A
> > 3
> > 262146
> > 197888
> > 5
> > UTF-8
> > 14
> > 3
> > 1
> > 2
> > 3
>
>
> --
> https://hu.berlin/sk
> https://hu.berlin/mmstat3
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



Re: [R] Solving derivates, getting the minimum of a function, and helpful documentation of the deriv function

2020-08-29 Thread Rolf Turner


On Sat, 29 Aug 2020 21:15:56 +
"Sorkin, John"  wrote:

> I am trying to find the minimum of a linear function:

Quadratic function???
 
> y <- (-0.0263*B) + (0.0010*B^2)
> 
> I am having GREAT difficulty with the documentation of the deriv
> function. I have (after playing for two hours) been able to get the
> following to work:
> 
> zoop <- deriv(expression((-0.0263*B)+(0.0010*B^2)),"B",func=TRUE)
> class(zoop)
> zoop(2)
> 
> which appears to give me the value of the derivative of my expression
> w.r.t. B (I am not certain what the func argument does, but it
> appears to be necessary)

It causes deriv() to return a *function* rather than an *expression*.
> 
> Following what one learns in calculus 1, I now need to set the
> derivative equal to 0 and solve for B. I have no idea how to do this
> 
> Can someone point me in the right direction. Additionally can someone
> suggest documentation for deriv that is easily intelligible to
> someone who wants to learn how to use the function, rather than
> documentation that helps one who is already familiar with the
> function. (I have a need for derivatives that is beyond finding the
> minimum of a function)
> 
> Thank you
> John
> 
> P.S. Please don't flame. I spent a good deal of time looking at
> documentation and searching the internet. There may be something online,
> but I clearly am not using the correct search terms.

A couple of things that you could play around with:

y <- expression(-0.0263*B + 0.0010*B^2)
z <- deriv(y,"B",func=TRUE)
f <- function(x,z){as.vector(attr(z(x),"gradient"))}

(1) uniroot(f,c(5,15),z=z)$root
# 13.15 --- right answer!!! :-)

(2) library(polynom) # You may need to install this package.
p <- poly.calc(x=1:2,y=f(1:2,z=z))
polyroot(p)
# 13.15+0i You can get rid of the extraneous imaginary part
# by using Re(polyroot(p))
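
For this particular function one can also bypass the derivative entirely and
minimize numerically, e.g. with base R's optimize() (a sketch; the search
interval c(0, 100) is an arbitrary bracket around the minimum):

```
g <- function(B) -0.0263*B + 0.0010*B^2
optimize(g, interval = c(0, 100))$minimum
# about 13.15, agreeing with the root of the derivative
```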

HTH

cheers,

Rolf

P.S. It's irritating the way that one has to fiddle about in order to
get a function that returns the value of the derivative, rather than the
value of the function being differentiated!

R.

-- 
Honorary Research Fellow
Department of Statistics
University of Auckland
Phone: +64-9-373-7599 ext. 88276



Re: [R] tempdir() does not respect TMPDIR

2020-08-29 Thread Jinsong Zhao
I read the help page, but I don't understand it very well, since I set the 
environment variable TMPDIR in .Renviron. What confused me is that when 
I double-clicked the *.RData file to launch R, tempdir() did not respect 
the environment variable TMPDIR, but launching R by double-clicking the 
Rgui icon did.


Best,
Jinsong

On 2020/8/30 0:36, Henrik Bengtsson wrote:

It is too late to set TMPDIR in .Renviron.  It is one of the
environment variables that has to be set prior to launching R.  From
help("tempfile", package = "base"):

The environment variables TMPDIR, TMP and TEMP are checked in turn and
the first found which points to a writable directory is used: if none
succeeds ‘/tmp’ is used. The path should not contain spaces. **Note
that setting any of these environment variables in the R session has
no effect on tempdir(): the per-session temporary directory is created
before the interpreter is started.**
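
A quick way to see this from a live session (a sketch; "X:/rtmp" is a
hypothetical path used only for illustration):

```
tempdir()                      # fixed when the session started
Sys.setenv(TMPDIR = "X:/rtmp") # hypothetical path, set too late to matter
tempdir()                      # unchanged: still the startup directory
```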

/Henrik

On Sat, Aug 29, 2020 at 6:40 AM Jinsong Zhao  wrote:


Hi there,

When I started R by double-clicking the Rgui icon (I am on Windows),
tempdir() returned a tmpdir inside the directory I set in .Renviron. If I
started R by double-clicking a *.RData file, tempdir() returned a tmpdir
inside the directory set by the Windows system. I don't know whether
this is by design.

  > sessionInfo()
R version 4.0.2 (2020-06-22)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 18363)
...

Best,
Jinsong



Re: [R] Solving derivates, getting the minimum of a function, and helpful documentation of the deriv function

2020-08-29 Thread Roy Mendelssohn - NOAA Federal via R-help
Hi John:

Can I ask whether this is the specific problem you are after, or a test for a
more general problem?  If the former, the derivative is

 -0.0263 + 0.002 * B

so the solution for B is:

B = 0.0263/0.002

If you are after a more general way of doing this:

?solve
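
The arithmetic can be checked in a line or two:

```
B <- 0.0263 / 0.002    # 13.15
-0.0263 + 0.002 * B    # derivative vanishes at the minimum
```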

-Roy

> On Aug 29, 2020, at 2:15 PM, Sorkin, John  wrote:
> 
> I am trying to find the minimum of a linear function:
> 
> y <- (-0.0263*B) + (0.0010*B^2)
> 
> I am having GREAT difficulty with the documentation of the deriv function. I 
> have (after playing for two hours) been able to get the following to work:
> 
> zoop <- deriv(expression((-0.0263*B)+(0.0010*B^2)),"B",func=TRUE)
> class(zoop)
> zoop(2)
> 
> which appears to give me the value of the derivative of my expression w.r.t. B
> (I am not certain what the func argument does, but it appears to be necessary)
> 
> Following what one learns in calculus 1, I now need to set the derivative 
> equal to 0 and solve for B. I have no idea how to do this
> 
> Can someone point me in the right direction. Additionally can someone suggest 
> documentation for deriv that is easily intelligible to someone who wants to 
> learn how to use the function, rather than documentation that helps one 
> is already familiar with the function. (I have a need for derivatives that is 
> beyond finding the minimum of a function)
> 
> Thank you
> John
> 
> P.S. Please don't flame. I spent a good deal of time looking at documentation 
> and searching the internet. There may be something online, but I clearly am 
> not using the correct search terms.
> 
> 
> 
> 
> 
> 
> 

**
"The contents of this message do not reflect any position of the U.S. 
Government or NOAA."
**
Roy Mendelssohn
Supervisory Operations Research Analyst
NOAA/NMFS
Environmental Research Division
Southwest Fisheries Science Center
***Note new street address***
110 McAllister Way
Santa Cruz, CA 95060
Phone: (831)-420-3666
Fax: (831) 420-3980
e-mail: roy.mendelss...@noaa.gov www: https://www.pfeg.noaa.gov/

"Old age and treachery will overcome youth and skill."
"From those who have been given much, much will be expected" 
"the arc of the moral universe is long, but it bends toward justice" -MLK Jr.



[R] Solving derivates, getting the minimum of a function, and helpful documentation of the deriv function

2020-08-29 Thread Sorkin, John
I am trying to find the minimum of a linear function:

y <- (-0.0263*B) + (0.0010*B^2)

I am having GREAT difficulty with the documentation of the deriv function. I 
have (after playing for two hours) been able to get the following to work:

zoop <- deriv(expression((-0.0263*B)+(0.0010*B^2)),"B",func=TRUE)
class(zoop)
zoop(2)

which appears to give me the value of the derivative of my expression w.r.t. B
(I am not certain what the func argument does, but it appears to be necessary)

Following what one learns in calculus 1, I now need to set the derivative equal 
to 0 and solve for B. I have no idea how to do this

Can someone point me in the right direction. Additionally can someone suggest 
documentation for deriv that is easily intelligible to someone who wants to 
learn how to use the function, rather than documentation that helps one who is 
already familiar with the function. (I have a need for derivatives that is 
beyond finding the minimum of a function)

Thank you
John

P.S. Please don't flame. I spent a good deal of time looking at documentation 
and searching the internet. There may be something online, but I clearly am 
not using the correct search terms.









Re: [R] Would Like Some Advise

2020-08-29 Thread Jan Galkowski
Hi Philip,

This ends up being a pretty personal decision, but here's my advice.  

I have used Windows of various flavors, and Linux in a couple of versions.  I 
have also used four or five Unixen, in addition to Linux. I've never spent a 
lot of time using a Mac, although in many instances most of my colleagues at 
companies have.  It's invariably a cubicle-like environment, so when they have 
problems, you know.   I also have a Chromebook, which is what I am using to 
write this, and while awaiting the arrival of a new Windows 10 system. 

I have used R heavily on both Windows and Linux. On Linux I used it on my 
desktop, and I still use it on various large servers, now via RStudio, 
previously from the shell. In the case of the servers, I don't have to maintain 
them, 
although I sometimes need to put up with peculiarities of their being 
maintained by others. (I rarely have sudo access, and sometimes someone has to 
install something for me, or help me install an R package, because the 
configuration of libraries on the server isn't quite what R expects.)

My experience with Linux desktops is that they seem fine initially, but then, 
inevitably, one day you need to upgrade to the next version of Ubuntu or 
whatever, and, for me, then the hell begins. In the last two times I did it, 
even with help of co-workers, it was so problematic, that I turned the desktop 
in, and stopped using the Linux. 

Prior to my last Linux version, I also seemed to need to spend an increasingly 
large amount of time doing maintenance and moving things around ... I ran out 
of R library space once and had to move the entire installation elsewhere.  I 
did, but it took literally two days to figure it out. 

Yes, if Linux runs out of physical store -- a moment which isn't always 
predictable -- R freezes.  Memory is of course an issue with Windows, but it 
simply does what, in my opinion, any modern system does and pages out to 
virtual memory, up to some limit of course.  (I always begin my  Windows R 
workspaces with 16 GB of RAM, and have expanded to 40 GB at times.)  I have 
just purchased a new Windows 10 system, was going to get 64 GB of RAM, but, for 
economy, settled on 32 GB. (I'm semi-retired as well.) My practice on the old 
Windows 7 system (with 16 GB RAM) was that I purchased a 256 GB SSD and put the 
paging file there.  That's not quite as good as RAM, but it's much better than 
a mechanical magnetic drive. My new Windows 10 has a 1 TB SSD.  I may move my 
old 256 GB SSD over to the new just as a side store, but will need to observe 
system cooling limits.  The new system is an 8 core Intel I7. 

Windows updates are a pain, mostly because they almost always involve a reboot. 
I *loved* using my Windows 7 past end of support because there were no updates. 
 I always found Windows Office programs to be incredibly annoying, tolerating 
them because if you exchange documents with the rest of the world, some 
appreciable fraction will be Word and Excel spreadsheets.  That said, I got rid 
of all my official Microsoft Office and moved to Open Office, which is fine. I 
also primarily use LaTeX and MikTeX for my own documents authored, and often 
use R to generate tables and other things for including in the LaTeX. 

On the other hand, when using Linux, ultimately YOU are responsible for keeping 
your libraries and everything else updated. When R updates, and new packages 
need to be updated, too, the update mechanism for Linux is recompiling from 
source. You sometimes need to do that for Windows, and Rtools gives you the 
way, but generally packages are in binary form. This means they are independent 
of the particular configuration of libraries you have on your system. That's 
great in my opinion. And easy.  Occasionally you'll find an R package which is 
source only and for some reason doesn't work with Rtools.  Then you are 
sometimes out of luck or need to run the source version of the package, if it's 
supported, which can be slow.  Sometimes, but rarely, source versions aren't 
supported.  I have also found in server environments that administrators are 
sometimes sloppy about keeping their gcc and other things updated. So at times 
I couldn't compile R packages because the admin on the server had an 
out-of-date gcc which produced a buggy version. 
 
Whether Linux or Windows, I often use multi-core for the Monte Carlo 
calculations I run, whether bootstraps, random forests, or MCMC.  I have used 
JAGS quite a lot but I don't believe it supports multi-core (unless something 
has changed recently).  I use MCMCpack and others. 

The media support for Windows is much better than on Linux.  (At least Ubuntu 
now *has* some.) And it is work to keep Linux media properly updated.  Still, I 
don't use Windows Media Player, preferring VLC.

And there are a wealth of programs and software available for Windows.  

No doubt, you need a good anti-virus and a good firewall. (Heck, I have that on 
my Google Pixel 2, too.)  I'm moving 

Re: [R] serialize does not work as expected

2020-08-29 Thread Duncan Murdoch

On 29/08/2020 1:10 p.m., Sigbert Klinke wrote:

Hi,

is there in R a way to "normalize" a vector from
compact_intseq/compact_realseq to a "normal" vector?


I don't know if there's a function specifically designed to do that, but 
as Henrik proposed, this works:


 l_normalized <- unserialize(serialize(l, connection=NULL, version=2))
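
For example (a sketch; .Internal(inspect()) is only for poking around and is
not part of the supported API):

```
l  <- list(1:3, as.numeric(1:3), c(1,2,3))
l2 <- unserialize(serialize(l, connection = NULL, version = 2))
identical(l, l2)             # TRUE: the values are unchanged
.Internal(inspect(l2[[1]]))  # no "(compact)" annotation any more
```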

Duncan Murdoch



Re: [R] How to obtain individual log-likelihood value from glm?

2020-08-29 Thread John Fox

Dear John,

If you look at the code for logitreg() in the MASS text, you'll see that 
the casewise components of the log-likelihood are multiplied by the 
corresponding weights. As far as I can see, this only makes sense if the 
weights are binomial trials. Otherwise, while the coefficients 
themselves will be the same as obtained for proportionally similar 
integer weights (e.g., using your weights rather than weights/10), 
quantities such as the maximized log-likelihood, deviance, and 
coefficient standard errors will be uninterpretable.


logitreg() is simply another way to compute the MLE, using a 
general-purpose optimizer rather than than iteratively weighted 
least-squares, which is what glm() uses. That the two functions provide 
the same answer within rounding error is unsurprising -- they're solving 
the same problem. A difference between the two functions is that glm() 
issues a warning about non-integer weights, while logitreg() doesn't. As 
I understand it, the motivation for writing logitreg() is to provide a 
function that could easily be modified, e.g., to impose parameter 
constraints on the solution.


I think that this discussion has gotten unproductive. If you feel that 
proceeding with noninteger weights makes sense, for a reason that I 
don't understand, then you should go ahead.


Best,
 John

On 2020-08-29 1:23 p.m., John Smith wrote:
In the book Modern Applied Statistics with S, 4th edition, 2002, by 
Venables and Ripley, there is a function logitreg on page 445, which 
does provide the weighted logistic regression I asked, judging by the 
loss function. And interestingly enough, logitreg provides the same 
coefficients as glm in the example I provided earlier, even with weights 
< 1. Also for residual deviance, logitreg yields the same number as glm. 
Unless I misunderstood something, I am convinced that glm is a 
valid tool for weighted logistic regression despite the description on 
weights and the somewhat questionable logLik value in the case of non-integer 
weights < 1. Perhaps this is a bold claim: the description of weights 
can be modified and logLik can be updated as well.


The stackexchange inquiry I provided is what I find interesting, not the 
link in that post. Sorry for the confusion.


On Sat, Aug 29, 2020 at 10:18 AM John Smith wrote:


Thanks for very insightful thoughts. What I am trying to achieve
with the weights is actually not new, something like

https://stats.stackexchange.com/questions/44776/logistic-regression-with-weighted-instances.
I thought my inquiry was not too strange, and I could utilize some
existing codes. It is just an optimization problem at the end of
day, or not? Thanks

On Sat, Aug 29, 2020 at 9:02 AM John Fox <j...@mcmaster.ca> wrote:

Dear John,

On 2020-08-29 1:30 a.m., John Smith wrote:
 > Thanks Prof. Fox.
 >
 > I am curious: what is the model estimated below?

Nonsense, as Peter explained in a subsequent response to your
prior posting.

 >
 > I guess my inquiry seems more complicated than I thought:
with y being 0/1, how to fit weighted logistic regression with
weights <1, in the sense of weighted least squares? Thanks

What sense would that make? WLS is meant to account for non-constant
error variance in a linear model, but in a binomial GLM, the variance is
purely a function of the mean.

If you had binomial (rather than binary 0/1) observations (i.e.,
binomial trials exceeding 1), then you could account for overdispersion,
e.g., by introducing a dispersion parameter via the quasibinomial
family, but that isn't equivalent to variance weights in a LM, rather to
the error-variance parameter in a LM.

I guess the question is what are you trying to achieve with the
weights?

Best,
   John

 >
 >> On Aug 28, 2020, at 10:51 PM, John Fox <j...@mcmaster.ca> wrote:
 >>
 >> Dear John
 >>
 >> I think that you misunderstand the use of the weights
argument to glm() for a binomial GLM. From ?glm: "For a binomial
GLM prior weights are used to give the number of trials when the
response is the proportion of successes." That is, in this case
y should be the observed proportion of successes (i.e., between
0 and 1) and the weights are integers giving the number of
trials for each binomial observation.
 >>
 >> I hope this helps,
 >> John
 >>
 >> John Fox, Professor Emeritus
 >> McMaster University
 >> Hamilton, Ontario, Canada
 >> web: https://socialsciences.mcmaster.ca/jfox/
 >>
 >>> On 2020-08-28 9:28 p.m., John Smith wrote:
 >>> If the weights < 1, then we have different values! See an
   

Re: [R] serialize does not work as expected

2020-08-29 Thread William Dunlap via R-help
For some reason l[[2]] is serialized as a 'compact_realseq' and l[[3]]
is not.  They both unserialize to the same thing.  On Windows I get:

> lapply(l, function(x)rawToChar(serialize(x, connection=NULL, ascii=TRUE)))
[[1]]
[1] 
"A\n3\n262146\n197888\n6\nCP1252\n238\n2\n1\n262153\n14\ncompact_intseq\n2\n1\n262153\n4\nbase\n2\n13\n1\n13\n254\n14\n3\n3\n1\n1\n254\n"

[[2]]
[1] 
"A\n3\n262146\n197888\n6\nCP1252\n238\n2\n1\n262153\n15\ncompact_realseq\n2\n1\n262153\n4\nbase\n2\n13\n1\n14\n254\n14\n3\n3\n1\n1\n254\n"

[[3]]
[1] "A\n3\n262146\n197888\n6\nCP1252\n14\n3\n1\n2\n3\n"
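
Despite the different byte streams, the elements round-trip to identical
objects; a quick check (sketch):

```
l <- list(1:3, as.numeric(1:3), c(1,2,3))
s <- lapply(l, serialize, connection = NULL)
identical(unserialize(s[[2]]), unserialize(s[[3]]))  # TRUE
```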

Bill Dunlap
TIBCO Software
wdunlap tibco.com

On Sat, Aug 29, 2020 at 8:37 AM Sigbert Klinke
 wrote:
>
> Hi,
>
> if I create a list with
>
> l <- list(1:3, as.numeric(1:3), c(1,2,3))
>
> and applying
>
> lapply(l, 'class')
> lapply(l, 'mode')
> lapply(l, 'storage.mode')
> lapply(l, 'typeof')
> identical(l[[2]], l[[3]])
>
> then I would believe that as.numeric(1:3) and c(1,2,3) are identical
> objects. However,
>
> lapply(l, serialize, connection=NULL)
>
> returns different results for each list element :(
>
> Any ideas, why it is like that?
>
> Best Sigbert
>
> --
> https://hu.berlin/sk
> https://hu.berlin/mmstat3
>


Re: [R] How to obtain individual log-likelihood value from glm?

2020-08-29 Thread John Smith
In the book Modern Applied Statistics with S, 4th edition, 2002, by
Venables and Ripley, there is a function logitreg on page 445, which does
provide the weighted logistic regression I asked, judging by the loss
function. And interestingly enough, logitreg provides the same coefficients
as glm in the example I provided earlier, even with weights < 1. Also for
residual deviance, logitreg yields the same number as glm. Unless I
misunderstood something, I am convinced that glm is a valid tool for
weighted logistic regression despite the description on weights and the somewhat
questionable logLik value in the case of non-integer weights < 1. Perhaps
this is a bold claim: the description of weights can be modified and logLik
can be updated as well.

The stackexchange inquiry I provided is what I find interesting, not the
link in that post. Sorry for the confusion.

On Sat, Aug 29, 2020 at 10:18 AM John Smith  wrote:

> Thanks for very insightful thoughts. What I am trying to achieve with the
> weights is actually not new, something like
> https://stats.stackexchange.com/questions/44776/logistic-regression-with-weighted-instances.
> I thought my inquiry was not too strange, and I could utilize some existing
> codes. It is just an optimization problem at the end of day, or not? Thanks
>
> On Sat, Aug 29, 2020 at 9:02 AM John Fox  wrote:
>
>> Dear John,
>>
>> On 2020-08-29 1:30 a.m., John Smith wrote:
>> > Thanks Prof. Fox.
>> >
>> > I am curious: what is the model estimated below?
>>
>> Nonsense, as Peter explained in a subsequent response to your prior
>> posting.
>>
>> >
>> > I guess my inquiry seems more complicated than I thought: with y being
>> 0/1, how to fit weighted logistic regression with weights <1, in the sense
>> of weighted least squares? Thanks
>>
>> What sense would that make? WLS is meant to account for non-constant
>> error variance in a linear model, but in a binomial GLM, the variance is
>> purely a function of the mean.
>>
>> If you had binomial (rather than binary 0/1) observations (i.e.,
>> binomial trials exceeding 1), then you could account for overdispersion,
>> e.g., by introducing a dispersion parameter via the quasibinomial
>> family, but that isn't equivalent to variance weights in a LM, rather to
>> the error-variance parameter in a LM.
>>
>> I guess the question is what are you trying to achieve with the weights?
>>
>> Best,
>>   John
>>
>> >
>> >> On Aug 28, 2020, at 10:51 PM, John Fox  wrote:
>> >>
>> >> Dear John
>> >>
>> >> I think that you misunderstand the use of the weights argument to
>> glm() for a binomial GLM. From ?glm: "For a binomial GLM prior weights are
>> used to give the number of trials when the response is the proportion of
>> successes." That is, in this case y should be the observed proportion of
>> successes (i.e., between 0 and 1) and the weights are integers giving the
>> number of trials for each binomial observation.
>> >>
>> >> I hope this helps,
>> >> John
>> >>
>> >> John Fox, Professor Emeritus
>> >> McMaster University
>> >> Hamilton, Ontario, Canada
>> >> web: https://socialsciences.mcmaster.ca/jfox/
>> >>
>> >>> On 2020-08-28 9:28 p.m., John Smith wrote:
>> >>> If the weights < 1, then we have different values! See an example
>> below.
>> >>> How  should I interpret logLik value then?
>> >>> set.seed(135)
>> >>>   y <- c(rep(0, 50), rep(1, 50))
>> >>>   x <- rnorm(100)
>> >>>   data <- data.frame(cbind(x, y))
>> >>>   weights <- c(rep(1, 50), rep(2, 50))
>> >>>   fit <- glm(y~x, data, family=binomial(), weights/10)
>> >>>   res.dev <- residuals(fit, type="deviance")
>> >>>   res2 <- -0.5*res.dev^2
>> >>>   cat("loglikelihood value", logLik(fit), sum(res2), "\n")
>>  On Tue, Aug 25, 2020 at 11:40 AM peter dalgaard 
>> wrote:
>>  If you don't worry too much about an additive constant, then half the
>>  negative squared deviance residuals should do. (Not quite sure how
>> weights
>>  factor in. Looks like they are accounted for.)
>> 
>>  -pd
>> 
>> > On 25 Aug 2020, at 17:33 , John Smith  wrote:
>> >
>> > Dear R-help,
>> >
>> > The function logLik can be used to obtain the maximum log-likelihood
>>  value
>> > from a glm object. This is an aggregated value, a summation of
>> individual
>> > log-likelihood values. How do I obtain individual values? In the
>>  following
>> > example, I would expect 9 numbers since the response has length 9. I
>>  could
>> > write a function to compute the values, but there are lots of
>> > family members in glm, and I am trying not to reinvent wheels.
>> Thanks!
>> >
>> > counts <- c(18,17,15,20,10,20,25,13,12)
>> >  outcome <- gl(3,1,9)
>> >  treatment <- gl(3,3)
>> >  data.frame(treatment, outcome, counts) # showing data
>> >  glm.D93 <- glm(counts ~ outcome + treatment, family =
>> poisson())
>> >  (ll <- logLik(glm.D93))
>> >
>> >[[alternative HTML version deleted]]
>> >
>> 
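
Peter's suggestion above can be checked against the exact casewise values for
the Poisson example (a sketch; the deviance-residual version matches up to
each observation's saturated-model term):

```
counts    <- c(18,17,15,20,10,20,25,13,12)
outcome   <- gl(3,1,9)
treatment <- gl(3,3)
fit <- glm(counts ~ outcome + treatment, family = poisson())

ll.exact <- dpois(counts, fitted(fit), log = TRUE)  # exact casewise log-likelihoods
all.equal(sum(ll.exact), as.numeric(logLik(fit)))   # TRUE

res.dev <- residuals(fit, type = "deviance")
# -0.5 * d_i^2 equals ll.exact minus the saturated casewise log-likelihood
all.equal(-0.5 * res.dev^2, ll.exact - dpois(counts, counts, log = TRUE))
```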

Re: [R] serialize does not work as expected

2020-08-29 Thread Henrik Bengtsson
Does serialize(..., version = 2L) do what you want?

/Henrik

On Sat, Aug 29, 2020 at 10:10 AM Sigbert Klinke
 wrote:
>
> Hi,
>
> is there in R a way to "normalize" a vector from
> compact_intseq/compact_realseq to a "normal" vector?
>
> Sigbert
>
> Am 29.08.20 um 18:13 schrieb Duncan Murdoch:
> > Element 1
> > A
> > 3
> > 262146
> > 197888
> > 5
> > UTF-8
> > 238
> > 2
> > 1
> > 262153
> > 14
> > compact_intseq
> > 2
> > 1
> > 262153
> > 4
> > base
> > 2
> > 13
> > 1
> > 13
> > 254
> > 14
> > 3
> > 3
> > 1
> > 1
> > 254
> >
> > Element 2
> > A
> > 3
> > 262146
> > 197888
> > 5
> > UTF-8
> > 238
> > 2
> > 1
> > 262153
> > 15
> > compact_realseq
> > 2
> > 1
> > 262153
> > 4
> > base
> > 2
> > 13
> > 1
> > 14
> > 254
> > 14
> > 3
> > 3
> > 1
> > 1
> > 254
> >
> > Element 3
> > A
> > 3
> > 262146
> > 197888
> > 5
> > UTF-8
> > 14
> > 3
> > 1
> > 2
> > 3
>
>
> --
> https://hu.berlin/sk
> https://hu.berlin/mmstat3
>


Re: [R] serialize does not work as expected

2020-08-29 Thread Sigbert Klinke

Hi,

is there in R a way to "normalize" a vector from 
compact_intseq/compact_realseq to a "normal" vector?


Sigbert

Am 29.08.20 um 18:13 schrieb Duncan Murdoch:

Element 1
A
3
262146
197888
5
UTF-8
238
2
1
262153
14
compact_intseq
2
1
262153
4
base
2
13
1
13
254
14
3
3
1
1
254

Element 2
A
3
262146
197888
5
UTF-8
238
2
1
262153
15
compact_realseq
2
1
262153
4
base
2
13
1
14
254
14
3
3
1
1
254

Element 3
A
3
262146
197888
5
UTF-8
14
3
1
2
3



--
https://hu.berlin/sk
https://hu.berlin/mmstat3



Re: [R] How to obtain individual log-likelihood value from glm?

2020-08-29 Thread John Fox

Dear John,

On 2020-08-29 11:18 a.m., John Smith wrote:

Thanks for very insightful thoughts. What I am trying to achieve with the
weights is actually not new, something like
https://stats.stackexchange.com/questions/44776/logistic-regression-with-weighted-instances.
I thought my inquiry was not too strange, and I could utilize some existing
codes. It is just an optimization problem at the end of day, or not? Thanks


So the object is to fit a regularized (i.e., penalized) logistic 
regression rather than to fit by ML. glm() won't do that.


I took a quick look at the stackexchange link that you provided and the 
document referenced in that link.  The penalty proposed in the document 
is just a multiple of the sum of squared regression coefficients, what is 
usually called an L2 penalty in the machine-learning literature.  There 
are existing implementations of regularized logistic regression in R -- 
see the machine learning CRAN task view. I believe 
that the penalized package will fit a regularized logistic regression 
with an L2 penalty.
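
As a hedged sketch of that route (assuming the glmnet package is installed;
alpha = 0 gives a pure L2/ridge penalty, and the lambda value here is an
arbitrary illustrative choice, as are the simulated data):

```
library(glmnet)  # install.packages("glmnet") if needed
set.seed(1)
x <- matrix(rnorm(200), ncol = 2)
y <- rbinom(100, 1, plogis(x[, 1]))
fit <- glmnet(x, y, family = "binomial", alpha = 0, lambda = 0.1)
coef(fit)  # L2-penalized logistic regression coefficients
```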


As well, unless my quick reading was inaccurate, I think that you, and 
perhaps the stackexchange poster, might have been confused by the 
terminology used in the document: What's referred to as "weights" in the 
document is what statisticians more typically call "regression 
coefficients," and the "bias weight" is the "intercept" or "regression 
constant." Perhaps I'm missing some connection -- I'm not the best 
person to ask about machine learning.


Best,
 John



On Sat, Aug 29, 2020 at 9:02 AM John Fox  wrote:


Dear John,

On 2020-08-29 1:30 a.m., John Smith wrote:

Thanks Prof. Fox.

I am curious: what is the model estimated below?


Nonsense, as Peter explained in a subsequent response to your prior
posting.



I guess my inquiry seems more complicated than I thought: with y being
0/1, how to fit a weighted logistic regression with weights < 1, in the sense
of weighted least squares? Thanks

What sense would that make? WLS is meant to account for non-constant
error variance in a linear model, but in a binomial GLM, the variance is
purely a function of the mean.

If you had binomial (rather than binary 0/1) observations (i.e.,
binomial trials exceeding 1), then you could account for overdispersion,
e.g., by introducing a dispersion parameter via the quasibinomial
family, but that isn't equivalent to variance weights in a LM, rather to
the error-variance parameter in a LM.

I guess the question is what are you trying to achieve with the weights?

Best,
   John




On Aug 28, 2020, at 10:51 PM, John Fox  wrote:

Dear John

I think that you misunderstand the use of the weights argument to glm()

for a binomial GLM. From ?glm: "For a binomial GLM prior weights are used
to give the number of trials when the response is the proportion of
successes." That is, in this case y should be the observed proportion of
successes (i.e., between 0 and 1) and the weights are integers giving the
number of trials for each binomial observation.


I hope this helps,
John

John Fox, Professor Emeritus
McMaster University
Hamilton, Ontario, Canada
web: https://socialsciences.mcmaster.ca/jfox/


On 2020-08-28 9:28 p.m., John Smith wrote:
If the weights < 1, then we have different values! See an example

below.

How  should I interpret logLik value then?
set.seed(135)
   y <- c(rep(0, 50), rep(1, 50))
   x <- rnorm(100)
   data <- data.frame(cbind(x, y))
   weights <- c(rep(1, 50), rep(2, 50))
   fit <- glm(y~x, data, family=binomial(), weights/10)
   res.dev <- residuals(fit, type="deviance")
   res2 <- -0.5*res.dev^2
   cat("loglikelihood value", logLik(fit), sum(res2), "\n")

On Tue, Aug 25, 2020 at 11:40 AM peter dalgaard 
wrote:

If you don't worry too much about an additive constant, then half the
negative squared deviance residuals should do. (Not quite sure how weights
factor in. Looks like they are accounted for.)

-pd


On 25 Aug 2020, at 17:33 , John Smith  wrote:

Dear R-help,

The function logLik can be used to obtain the maximum log-likelihood value
from a glm object. This is an aggregated value, a summation of individual
log-likelihood values. How do I obtain individual values? In the following
example, I would expect 9 numbers since the response has length 9. I could
write a function to compute the values, but there are lots of
family members in glm, and I am trying not to reinvent wheels.

Thanks!


counts <- c(18,17,15,20,10,20,25,13,12)
  outcome <- gl(3,1,9)
  treatment <- gl(3,3)
  data.frame(treatment, outcome, counts) # showing data
  glm.D93 <- glm(counts ~ outcome + treatment, family = poisson())
  (ll <- logLik(glm.D93))
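
For this particular example, the individual contributions can be obtained 
directly from the density function (a sketch; dpois() is specific to the 
poisson family, which is why a fully general solution would need 
per-family code):

```r
counts <- c(18,17,15,20,10,20,25,13,12)
outcome <- gl(3,1,9)
treatment <- gl(3,3)
glm.D93 <- glm(counts ~ outcome + treatment, family = poisson())
# 9 individual log-likelihood terms, one per observation
ll.i <- dpois(counts, fitted(glm.D93), log = TRUE)
all.equal(sum(ll.i), as.numeric(logLik(glm.D93)))  # they sum to logLik
```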

[[alternative HTML version deleted]]



Re: [R] tempdir() does not respect TMPDIR

2020-08-29 Thread Henrik Bengtsson
It is too late to set TMPDIR in .Renviron.  It is one of the
environment variables that has to be set prior to launching R.  From
help("tempfile", package = "base"):

The environment variables TMPDIR, TMP and TEMP are checked in turn and
the first found which points to a writable directory is used: if none
succeeds ‘/tmp’ is used. The path should not contain spaces. **Note
that setting any of these environment variables in the R session has
no effect on tempdir(): the per-session temporary directory is created
before the interpreter is started.**
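
A quick way to see both behaviours (the directory below is a hypothetical
placeholder -- use any writable path):

```r
## Inside a running session this has no effect on tempdir():
Sys.setenv(TMPDIR = "/some/writable/dir")  # hypothetical path
tempdir()  # unchanged: the per-session directory already exists

## It must instead be set before launch, e.g. from a shell:
##   TMPDIR=/some/writable/dir R --vanilla -e 'cat(tempdir(), "\n")'
```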

/Henrik

On Sat, Aug 29, 2020 at 6:40 AM Jinsong Zhao  wrote:
>
> Hi there,
>
> When I started R by double clicking on Rgui icon (I am on Windows), the
> tempdir() returned the tmpdir in the directory I set in .Renviron. If I
> started R by double clicking on a *.RData file, the tempdir() return the
> tmpdir in the directory setting by Windows system. I don't know whether
> it's designed.
>
>  > sessionInfo()
> R version 4.0.2 (2020-06-22)
> Platform: x86_64-w64-mingw32/x64 (64-bit)
> Running under: Windows 10 x64 (build 18363)
> ...
>
> Best,
> Jinsong
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.



Re: [R] serialize does not work as expected

2020-08-29 Thread Duncan Murdoch

On 29/08/2020 11:34 a.m., Sigbert Klinke wrote:

Hi,

if I create a list with

l <- list(1:3, as.numeric(1:3), c(1,2,3))

and applying

lapply(l, 'class')
lapply(l, 'mode')
lapply(l, 'storage.mode')
lapply(l, 'typeof')
identical(l[[2]], l[[3]])

then I would believe that as.numeric(1:3) and c(1,2,3) are identical
objects. However,

lapply(l, serialize, connection=NULL)

returns different results for each list element :(

Any ideas, why it is like that?


Objects like 1:3 are stored in a special compact form, where 1:3 takes 
up the same space as 1:100.  Apparently as.numeric() knows how to work 
with that special form, and produces the numeric version of it.


You can confirm this by looking at the results of

serialize(l[[i]], connection=stdout(), ascii=TRUE)

for each of i=1,2,3:

> for (i in 1:3) {
+  cat("\nElement", i, "\n")
+  serialize(l[[i]], connection=stdout(), ascii=TRUE)
+ }

Element 1
A
3
262146
197888
5
UTF-8
238
2
1
262153
14
compact_intseq
2
1
262153
4
base
2
13
1
13
254
14
3
3
1
1
254

Element 2
A
3
262146
197888
5
UTF-8
238
2
1
262153
15
compact_realseq
2
1
262153
4
base
2
13
1
14
254
14
3
3
1
1
254

Element 3
A
3
262146
197888
5
UTF-8
14
3
1
2
3

Notice how element 1 is a "compact_intseq" and element 2 is a 
"compact_realseq".
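
A compact and an ordinary vector compare equal with identical(), yet 
serialize differently -- which is exactly what was observed:

```r
x <- 1:3            # ALTREP compact integer sequence
y <- c(1L, 2L, 3L)  # ordinary integer vector
identical(x, y)                                    # TRUE: same values
identical(serialize(x, NULL), serialize(y, NULL))  # FALSE: different storage
```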


Duncan Murdoch



Re: [R] Would Like Some Advice

2020-08-29 Thread Robert D. Bowers M.A.
Besides monetization, Windows has a few other things that infuriate 
me... (1) VERY hard to control updates, (2) "sneaker" updates - things 
installed that people don't want (like trying to force Windows computer 
owners to update - and sometimes wrecking the computer when it does), 
(3) bad updates - suddenly you find features or programs you use all the 
time not working, and you have to find out which update (or combination) 
broke your software and remove them, and worst of all IMO - there used 
to be things you could do with Windows that have been completely stopped 
- configurations and ways of increasing REAL security (not just 
protecting profits) and making the system that much more efficient.  
When I purchase a computer - I don't want some corporation forcing me to 
fit ITS stereotypes - I want total control over it and I will be the one 
making the final decisions about it (including what I do with it and the 
software I use).  (I should add that I often have to spend hours helping 
my wife with her work computer - W10, because of updates breaking her 
work software or other problems.)


I dumped Windows in 2010 - although I'd been using Linux off and on 
before then.  You see, I was finishing up my Thesis using Office, and no 
matter what I did - Office would scramble the format into what IT 
thought was right (and creating all sorts of "Widows and Orphans" and 
other format errors not matching my school's requirements).  I had to 
export the thesis into text format and load it into OpenOffice - instant 
cure of the headaches.  Also, back in 2007, there was a "security" 
update to Windows media player (I forget the name).  I was using it to 
save a video I'd taken the summer before while camping - of a black bear 
walking through our campsite.  Their software popped up a very nasty 
message that I didn't have the right to a video I HAD TAKEN MYSELF... 
and deleted every copy and version from my computer.  No backups yet... 
total loss.  Microsoft suddenly sent out a new update after I'd lost the 
video and that problem vanished, but we had friends at that time who 
also experienced the same thing (even music one person had written and 
recorded).  I think you can see why I support Linux.


Linux - for the most part, you have total control over what goes in your 
computer - which can be both good and bad (if you're not careful).  I 
myself prefer Ubuntu with the Gnome (old style) desktop - I'm a firm 
believer of "If it's not broken, don't fix it!!!".  The desktop is a 
personal preference thing.  I also very much like stability in my 
computer - so if that is important, avoid the experimental and stick 
with the LTS (Long Term Support) versions. (There are people who are 
always after the "latest and greatest" and they sometimes forget that 
not everyone has the same interests they do!)


Another drawback of Linux... software can lose support (the author gets 
tired of it) - as I've experienced a few times, or "updates" to core 
modules in the OS itself (more of the "if it's not broken don't fix 
it!!!" stuff) that break entire packages because of internal changes.  
Sometimes programmers forget about backwards compatibility... and that 
not everyone wants "the latest and greatest" at all.  I also firmly 
believe that if equipment does the job to your satisfaction, it is NOT 
'obsolete'.  I don't support throwaway culture.


There is also this problem - many software authors don't think to export 
their program to Linux, or don't want to bother.  Some may even be 
pressured into only doing Windows.  I use Windows 7 (I absolutely HATE 
10) in a virtual machine when I have software that is Windows only 
(often Paid-for software, where nothing else will do the job or where 
the equipment it runs will only work with specific software).  That's 
the only case where I willingly use Windows.


I would finish by saying I use Ubuntu LTS with Gnome because of the wide 
variety of programs I use, besides the usual Word 
Processor/Email/Browser that usually comes with the OS.  I use my 
computer for Amateur Radio, research (document research, but also doing 
things like XRF calibration curves, radiocarbon dating correction, 
optical spectroscopy work, and so on), and rarely for games - plus I do 
like to watch videos and movies now and then. There are specialized 
"flavors" of Linux that might fit one's need better.  Oh, and using 
Linux often requires a bit more knowledge (to really be able to utilize 
it) than Windows - but then, that also depends on the flavor.


BTW - I don't use R like I used to, but have always had good luck with 
it running under Linux.  I don't know how it works under Windows - maybe 
someone can speak to that.


I hope this is helpful!

Bob


Re: [R] serialize does not work as expected

2020-08-29 Thread Jeff Newmiller
Did you really conclude from looking at class that they were identical?

Numeric mode sometimes makes it hard to distinguish integers from doubles, but 
they are different.

On August 29, 2020 8:34:29 AM PDT, Sigbert Klinke  
wrote:
>Hi,
>
>if I create a list with
>
>l <- list(1:3, as.numeric(1:3), c(1,2,3))
>
>and applying
>
>lapply(l, 'class')
>lapply(l, 'mode')
>lapply(l, 'storage.mode')
>lapply(l, 'typeof')
>identical(l[[2]], l[[3]])
>
>then I would believe that as.numeric(1:3) and c(1,2,3) are identical 
>objects. However,
>
>lapply(l, serialize, connection=NULL)
>
>returns different results for each list element :(
>
>Any ideas, why it is like that?
>
>Best Sigbert

-- 
Sent from my phone. Please excuse my brevity.



[R] serialize does not work as expected

2020-08-29 Thread Sigbert Klinke

Hi,

if I create a list with

l <- list(1:3, as.numeric(1:3), c(1,2,3))

and applying

lapply(l, 'class')
lapply(l, 'mode')
lapply(l, 'storage.mode')
lapply(l, 'typeof')
identical(l[[2]], l[[3]])

then I would believe that as.numeric(1:3) and c(1,2,3) are identical 
objects. However,


lapply(l, serialize, connection=NULL)

returns different results for each list element :(

Any ideas, why it is like that?

Best Sigbert

--
https://hu.berlin/sk
https://hu.berlin/mmstat3



Re: [R] How to obtain individual log-likelihood value from glm?

2020-08-29 Thread John Smith
Thanks for the very insightful thoughts. What I am trying to achieve with the
weights is actually not new; something like
https://stats.stackexchange.com/questions/44776/logistic-regression-with-weighted-instances.
I thought my inquiry was not too strange, and I could utilize some existing
code. It is just an optimization problem at the end of the day, or is it not? Thanks

On Sat, Aug 29, 2020 at 9:02 AM John Fox  wrote:

> Dear John,
>
> On 2020-08-29 1:30 a.m., John Smith wrote:
> > Thanks Prof. Fox.
> >
> > I am curious: what is the model estimated below?
>
> Nonsense, as Peter explained in a subsequent response to your prior
> posting.
>
> >
> > I guess my inquiry seems more complicated than I thought: with y being
> 0/1, how to fit weighted logistic regression with weights <1, in the sense
> of weighted least squares? Thanks
>
> What sense would that make? WLS is meant to account for non-constant
> error variance in a linear model, but in a binomial GLM, the variance is
> purely a function of the mean.
>
> If you had binomial (rather than binary 0/1) observations (i.e.,
> binomial trials exceeding 1), then you could account for overdispersion,
> e.g., by introducing a dispersion parameter via the quasibinomial
> family, but that isn't equivalent to variance weights in a LM, rather to
> the error-variance parameter in a LM.
>
> I guess the question is what are you trying to achieve with the weights?
>
> Best,
>   John
>
> >
> >> On Aug 28, 2020, at 10:51 PM, John Fox  wrote:
> >>
> >> Dear John
> >>
> >> I think that you misunderstand the use of the weights argument to glm()
> for a binomial GLM. From ?glm: "For a binomial GLM prior weights are used
> to give the number of trials when the response is the proportion of
> successes." That is, in this case y should be the observed proportion of
> successes (i.e., between 0 and 1) and the weights are integers giving the
> number of trials for each binomial observation.
> >>
> >> I hope this helps,
> >> John
> >>
> >> John Fox, Professor Emeritus
> >> McMaster University
> >> Hamilton, Ontario, Canada
> >> web: https://socialsciences.mcmaster.ca/jfox/
> >>
> >>> On 2020-08-28 9:28 p.m., John Smith wrote:
> >>> If the weights < 1, then we have different values! See an example
> below.
> >>> How  should I interpret logLik value then?
> >>> set.seed(135)
> >>>   y <- c(rep(0, 50), rep(1, 50))
> >>>   x <- rnorm(100)
> >>>   data <- data.frame(cbind(x, y))
> >>>   weights <- c(rep(1, 50), rep(2, 50))
> >>>   fit <- glm(y~x, data, family=binomial(), weights/10)
> >>>   res.dev <- residuals(fit, type="deviance")
> >>>   res2 <- -0.5*res.dev^2
> >>>   cat("loglikelihood value", logLik(fit), sum(res2), "\n")
>  On Tue, Aug 25, 2020 at 11:40 AM peter dalgaard 
> wrote:
>  If you don't worry too much about an additive constant, then half the
>  negative squared deviance residuals should do. (Not quite sure how
> weights
>  factor in. Looks like they are accounted for.)
> 
>  -pd
> 
> > On 25 Aug 2020, at 17:33 , John Smith  wrote:
> >
> > Dear R-help,
> >
> > The function logLik can be used to obtain the maximum log-likelihood
>  value
> > from a glm object. This is an aggregated value, a summation of
> individual
> > log-likelihood values. How do I obtain individual values? In the
>  following
> > example, I would expect 9 numbers since the response has length 9. I
>  could
> > write a function to compute the values, but there are lots of
> > family members in glm, and I am trying not to reinvent wheels.
> Thanks!
> >
> > counts <- c(18,17,15,20,10,20,25,13,12)
> >  outcome <- gl(3,1,9)
> >  treatment <- gl(3,3)
> >  data.frame(treatment, outcome, counts) # showing data
> >  glm.D93 <- glm(counts ~ outcome + treatment, family = poisson())
> >  (ll <- logLik(glm.D93))
> >
> >[[alternative HTML version deleted]]
> >
> > __
> > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide
>  http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
> 
>  --
>  Peter Dalgaard, Professor,
>  Center for Statistics, Copenhagen Business School
>  Solbjerg Plads 3, 2000 Frederiksberg, Denmark
>  Phone: (+45)38153501
>  Office: A 4.23
>  Email: pd@cbs.dk  Priv: pda...@gmail.com
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> >>> [[alternative HTML version deleted]]
> >>> __
> >>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> >>> https://stat.ethz.ch/mailman/listinfo/r-help
> >>> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> >>> and provide commented, minimal, 

Re: [R] Base package being deleted recurrently

2020-08-29 Thread J C Nash
Possibly way off target, but I know some of our U of O teaching
systems boot by reverting to a standard image, i.e., you get back
to a vanilla system. That would certainly kill any install.

JN

On 2020-08-28 10:22 a.m., Rene J Suarez-Soto wrote:
> Hi,
> 
> I have a very strange issue. I am currently running R 4.0.2. The files in
> my library/base/ are being deleted for some unknown reason. I have had to
> install R over 20 times in the last 2 months. I have installed using user
> privileges and admin. I have installed it to different directories but the
> same issue repeats. I have checked the history of the antivirus program and
> it does not seem to be a problem. This is in an enterprise environment but
> IT checked and it does not seem to be related to any security processes. Any
> ideas? Thanks.
> 
>   [[alternative HTML version deleted]]
> 
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



Re: [R] How to obtain individual log-likelihood value from glm?

2020-08-29 Thread John Fox

Dear John,

On 2020-08-29 1:30 a.m., John Smith wrote:

Thanks Prof. Fox.

I am curious: what is the model estimated below?


Nonsense, as Peter explained in a subsequent response to your prior posting.



I guess my inquiry seems more complicated than I thought: with y being 0/1, how to 
fit a weighted logistic regression with weights < 1, in the sense of weighted 
least squares? Thanks


What sense would that make? WLS is meant to account for non-constant 
error variance in a linear model, but in a binomial GLM, the variance is 
purely a function of the mean. 


If you had binomial (rather than binary 0/1) observations (i.e., 
binomial trials exceeding 1), then you could account for overdispersion, 
e.g., by introducing a dispersion parameter via the quasibinomial 
family, but that isn't equivalent to variance weights in a LM, rather to 
the error-variance parameter in a LM.


I guess the question is what are you trying to achieve with the weights?

Best,
 John




On Aug 28, 2020, at 10:51 PM, John Fox  wrote:

Dear John

I think that you misunderstand the use of the weights argument to glm() for a binomial 
GLM. From ?glm: "For a binomial GLM prior weights are used to give the number of 
trials when the response is the proportion of successes." That is, in this case y 
should be the observed proportion of successes (i.e., between 0 and 1) and the weights 
are integers giving the number of trials for each binomial observation.

I hope this helps,
John

John Fox, Professor Emeritus
McMaster University
Hamilton, Ontario, Canada
web: https://socialsciences.mcmaster.ca/jfox/


On 2020-08-28 9:28 p.m., John Smith wrote:
If the weights < 1, then we have different values! See an example below.
How  should I interpret logLik value then?
set.seed(135)
  y <- c(rep(0, 50), rep(1, 50))
  x <- rnorm(100)
  data <- data.frame(cbind(x, y))
  weights <- c(rep(1, 50), rep(2, 50))
  fit <- glm(y~x, data, family=binomial(), weights/10)
  res.dev <- residuals(fit, type="deviance")
  res2 <- -0.5*res.dev^2
  cat("loglikelihood value", logLik(fit), sum(res2), "\n")

On Tue, Aug 25, 2020 at 11:40 AM peter dalgaard  wrote:
If you don't worry too much about an additive constant, then half the
negative squared deviance residuals should do. (Not quite sure how weights
factor in. Looks like they are accounted for.)

-pd


On 25 Aug 2020, at 17:33 , John Smith  wrote:

Dear R-help,

The function logLik can be used to obtain the maximum log-likelihood value
from a glm object. This is an aggregated value, a summation of individual
log-likelihood values. How do I obtain individual values? In the following
example, I would expect 9 numbers since the response has length 9. I could
write a function to compute the values, but there are lots of
family members in glm, and I am trying not to reinvent wheels. Thanks!

counts <- c(18,17,15,20,10,20,25,13,12)
 outcome <- gl(3,1,9)
 treatment <- gl(3,3)
 data.frame(treatment, outcome, counts) # showing data
 glm.D93 <- glm(counts ~ outcome + treatment, family = poisson())
 (ll <- logLik(glm.D93))

   [[alternative HTML version deleted]]



--
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd@cbs.dk  Priv: pda...@gmail.com















[R] tempdir() does not respect TMPDIR

2020-08-29 Thread Jinsong Zhao

Hi there,

When I started R by double clicking on Rgui icon (I am on Windows), the 
tempdir() returned the tmpdir in the directory I set in .Renviron. If I 
started R by double clicking on a *.RData file, the tempdir() return the 
tmpdir in the directory setting by Windows system. I don't know whether 
it's designed.


> sessionInfo()
R version 4.0.2 (2020-06-22)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 18363)
...

Best,
Jinsong



Re: [R] How to obtain individual log-likelihood value from glm?

2020-08-29 Thread peter dalgaard
Briefly, you shouldn't. One way of seeing this: if you switch the model to y~1, 
you still get logLik == 0.

The root cause is the rounding in binomial()$aic:

> binomial()$aic
function (y, n, mu, wt, dev) 
{
    m <- if (any(n > 1)) 
        n
    else wt
    -2 * sum(ifelse(m > 0, (wt/m), 0) * dbinom(round(m * y), 
        round(m), mu, log = TRUE))
}

which, if wt is small enough, ends up calculating dbinom(0, 0, p, log = TRUE), 
which is zero. 

(Not rounding gives you NaN, because you're trying to fit a model with a 
non-integer number of observations.)
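
The zero can be reproduced in isolation, taking dbinom() at the boundary 
case the rounding produces:

```r
# A binomial with zero trials observes zero successes with probability 1,
# so its log-density is log(1) = 0 -- hence every term, and the sum, is 0.
dbinom(0, 0, 0.3, log = TRUE)  # 0
```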

-pd

> On 29 Aug 2020, at 03:28 , John Smith  wrote:
> 
> If the weights < 1, then we have different values! See an example below. How  
> should I interpret logLik value then?
> 
> set.seed(135)
>  y <- c(rep(0, 50), rep(1, 50))
>  x <- rnorm(100)
>  data <- data.frame(cbind(x, y))
>  weights <- c(rep(1, 50), rep(2, 50))
>  fit <- glm(y~x, data, family=binomial(), weights/10)
>  res.dev <- residuals(fit, type="deviance")
>  res2 <- -0.5*res.dev^2
>  cat("loglikelihood value", logLik(fit), sum(res2), "\n")
> 
> On Tue, Aug 25, 2020 at 11:40 AM peter dalgaard  wrote:
> If you don't worry too much about an additive constant, then half the 
> negative squared deviance residuals should do. (Not quite sure how weights 
> factor in. Looks like they are accounted for.)
> 
> -pd
> 
> > On 25 Aug 2020, at 17:33 , John Smith  wrote:
> > 
> > Dear R-help,
> > 
> > The function logLik can be used to obtain the maximum log-likelihood value
> > from a glm object. This is an aggregated value, a summation of individual
> > log-likelihood values. How do I obtain individual values? In the following
> > example, I would expect 9 numbers since the response has length 9. I could
> > write a function to compute the values, but there are lots of
> > family members in glm, and I am trying not to reinvent wheels. Thanks!
> > 
> > counts <- c(18,17,15,20,10,20,25,13,12)
> > outcome <- gl(3,1,9)
> > treatment <- gl(3,3)
> > data.frame(treatment, outcome, counts) # showing data
> > glm.D93 <- glm(counts ~ outcome + treatment, family = poisson())
> > (ll <- logLik(glm.D93))
> > 
> >   [[alternative HTML version deleted]]
> > 
> > __
> > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
> 
> -- 
> Peter Dalgaard, Professor,
> Center for Statistics, Copenhagen Business School
> Solbjerg Plads 3, 2000 Frederiksberg, Denmark
> Phone: (+45)38153501
> Office: A 4.23
> Email: pd@cbs.dk  Priv: pda...@gmail.com
> 
> 
> 
> 
> 
> 
> 
> 
> 

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd@cbs.dk  Priv: pda...@gmail.com



Re: [R] How to obtain individual log-likelihood value from glm?

2020-08-29 Thread peter dalgaard



> On 25 Aug 2020, at 18:40 , peter dalgaard  wrote:
> 
> If you don't worry too much about an additive constant, then half the 
> negative squared deviance residuals should do. (Not quite sure how weights 
> factor in. Looks like they are accounted for.)
> 
> -pd
> 
>> On 25 Aug 2020, at 17:33 , John Smith  wrote:
>> 
>> Dear R-help,
>> 
>> The function logLik can be used to obtain the maximum log-likelihood value
>> from a glm object. This is an aggregated value, a summation of individual
>> log-likelihood values. How do I obtain individual values? In the following
>> example, I would expect 9 numbers since the response has length 9. I could
>> write a function to compute the values, but there are lots of
>> family members in glm, and I am trying not to reinvent wheels. Thanks!
>> 
>> counts <- c(18,17,15,20,10,20,25,13,12)
>>outcome <- gl(3,1,9)
>>treatment <- gl(3,3)
>>data.frame(treatment, outcome, counts) # showing data
>>glm.D93 <- glm(counts ~ outcome + treatment, family = poisson())
>>(ll <- logLik(glm.D93))
>> 
>>  [[alternative HTML version deleted]]
>> 
>> __
>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
> 
> -- 
> Peter Dalgaard, Professor,
> Center for Statistics, Copenhagen Business School
> Solbjerg Plads 3, 2000 Frederiksberg, Denmark
> Phone: (+45)38153501
> Office: A 4.23
> Email: pd@cbs.dk  Priv: pda...@gmail.com
> 
> 
> 
> 
> 
> 
> 
> 
> 

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd@cbs.dk  Priv: pda...@gmail.com



Re: [R] Would Like Some Advise

2020-08-29 Thread Duy Tran
I've worked with a laptop with 16 GB of RAM and it's been plenty for me. If you
need to work with larger data, I think you should look into packages like
sparklyr, which is basically dplyr running on a Spark cluster. Hope that
helps!
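
A minimal sparklyr sketch (this assumes sparklyr, dplyr, and a local Spark
installation, e.g. via sparklyr::spark_install(), are available):

```r
library(sparklyr)
library(dplyr)
sc <- spark_connect(master = "local")        # local Spark cluster
mtcars_tbl <- copy_to(sc, mtcars, "mtcars")  # push an R data frame to Spark
# Ordinary dplyr verbs are translated and executed inside Spark:
mtcars_tbl %>% group_by(cyl) %>% summarise(n = n())
spark_disconnect(sc)
```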
Duy


On Fri, Aug 28, 2020 at 9:09 AM Philip  wrote:

> I need a new computer.  I have a friend who is convinced that I have an aura
> about me that just kills electronic devices.
>
> Does anyone out there have an opinion about Windows vs. Linux?
>
> I’m retired so this is just for my own enjoyment but I’m crunching some
> large National Weather Service files and will move on to baseball data and
> a few other things.  I'd like some advice about how much RAM and stuff like
> that.  I understand there is something called zones of computer memory. Can
> someone direct me to a good source so I can learn more?   I really don’t
> understand stuff like this.  Does anyone think I need to upgrade my wifi?
>
> Thanks,
> Philip
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



[R] Base package being deleted recurrently

2020-08-29 Thread Rene J Suarez-Soto
Hi,

I have a very strange issue. I am currently running R 4.0.2. The files in
my library/base/ are being deleted for some unknown reason. I have had to
install R over 20 times in the last 2 months. I have installed using user
privileges and admin. I have installed it to different directories but the
same issue repeats. I have checked the history of the antivirus program and
it does not seem to be a problem. This is in an enterprise environment but
IT checked and it does not seem to be related to any security processes. Any
ideas? Thanks.



[R] PROBLEM: quickly downloading 10,000 articles to sift through

2020-08-29 Thread Fraedrich, John



To analyze 10,000+ articles within several journals to determine the major theories 
used, empirical research on models, constructs, and variables, differences in 
standard definitions by discipline, etc. Does R have a software package for this?



Sent from Mail for Windows 10

