Re: [R] How long to wait for process?

2017-07-26 Thread Bert Gunter
Dunno. You might wish to email the maintainer (see ?maintainer), who
may not monitor this list, if you do not get a satisfactory reply
here.

Cheers,
Bert


Bert Gunter

"The trouble with having an open mind is that people keep coming along
and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Wed, Jul 26, 2017 at 7:14 AM, john polo wrote:
> UseRs,
>
> I have a dataframe with 2547 rows and several hundred columns in R 3.1.3. I
> am trying to run a small logistic regression with a subset of the data.
>
> know_fin ~ comp_grp2+age+gender+education+employment+income+ideol+home_lot+home+county
>
> > str(knowf3)
> 'data.frame':   2033 obs. of  18 variables:
> $ userid: Factor w/ 2542 levels "FNCNM1639","FNCNM1642",..: 1857 157 965 1967 164 315 849 1017 699 189 ...
> $ round_id   : Factor w/ 1 level "Round 11": 1 1 1 1 1 1 1 1 1 1 ...
> $ age   : int  67 66 44 27 32 67 36 76 70 66 ...
> $ county: Factor w/ 80 levels "Adair","Alfalfa",..: 75 75 75 75 75 75 64 64 64 64 ...
> $ gender: Factor w/ 2 levels "0","1": 1 2 1 1 2 1 2 1 2 2 ...
> $ education : Factor w/ 8 levels "1","2","3","4",..: 6 7 6 8 2 4 2 4 2 6 ...
> $ employment: Factor w/ 9 levels "1","2","3","4",..: 8 4 4 4 3 8 5 8 4 4 ...
> $ income: num  55 8 9 19000 42000 3 18000 5 80 1 ...
> $ home: num  0 0 0 0 0 0 0 0 0 0 ...
> $ ideol : Factor w/ 7 levels "1","2","3","4",..: 2 7 4 3 2 4 2 3 2 6 ...
> $ home_lot  : Factor w/ 3 levels "1","2","3": 2 2 2 2 2 2 3 3 1 2 ...
> $ hispanic  : Factor w/ 2 levels "0","1": 1 1 1 1 1 1 1 1 1 1 ...
> $ comp_grp2 : Factor w/ 16 levels "Cr_Gr","Cr_Ot",..: 13 13 13 13 13 13 10 10 10 10 ...
> $ know_fin : Factor w/ 3 levels "0","1","2": 2 2 2 2 2 2 2 2 2 2 ...
>
>
> With the regular glm() function, I get a warning about "perfect or
> quasi-perfect separation"[1]. I looked for a method to deal with this and a
> penalized GLM is an accepted method[2]. This is implemented in logistf(). I
> used the default settings for the function.
>
> Just before I run the model, memory.size() for my session is ~4500 (MB).
> memory.limit() is ~25500. When I start the model, R immediately becomes
> non-responsive. This is in a Windows environment and in Task Manager, the
> instance of R is, and has been, using ~13% of CPU and ~4997 MB of RAM. It's
> been ~24 hours now in that state and I don't have any idea of how long this
> should take. If I run the same model in the same setting with the base
> glm(), the model runs in about 60 seconds. Is there a way to know if the
> process is going to produce something useful after all this time or if it's
> hanging on some kind of problem?
>
>
>   [1]:
> https://stats.stackexchange.com/questions/11109/how-to-deal-with-perfect-separation-in-logistic-regression#68917
>   [2]:
> https://academic.oup.com/biomet/article-abstract/80/1/27/228364/Bias-reduction-of-maximum-likelihood-estimates
>
>
> --
> Men occasionally stumble
> over the truth, but most of them
> pick themselves up and hurry off
> as if nothing had happened.
> -- Winston Churchill
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.



[R] How long to wait for process?

2017-07-26 Thread john polo

UseRs,

I have a dataframe with 2547 rows and several hundred columns in R 
3.1.3. I am trying to run a small logistic regression with a subset of 
the data.


know_fin ~ comp_grp2+age+gender+education+employment+income+ideol+home_lot+home+county


> str(knowf3)
'data.frame':   2033 obs. of  18 variables:
$ userid: Factor w/ 2542 levels "FNCNM1639","FNCNM1642",..: 1857 157 965 1967 164 315 849 1017 699 189 ...
$ round_id   : Factor w/ 1 level "Round 11": 1 1 1 1 1 1 1 1 1 1 ...
$ age   : int  67 66 44 27 32 67 36 76 70 66 ...
$ county: Factor w/ 80 levels "Adair","Alfalfa",..: 75 75 75 75 75 75 64 64 64 64 ...
$ gender: Factor w/ 2 levels "0","1": 1 2 1 1 2 1 2 1 2 2 ...
$ education : Factor w/ 8 levels "1","2","3","4",..: 6 7 6 8 2 4 2 4 2 6 ...
$ employment: Factor w/ 9 levels "1","2","3","4",..: 8 4 4 4 3 8 5 8 4 4 ...
$ income: num  55 8 9 19000 42000 3 18000 5 80 1 ...
$ home: num  0 0 0 0 0 0 0 0 0 0 ...
$ ideol : Factor w/ 7 levels "1","2","3","4",..: 2 7 4 3 2 4 2 3 2 6 ...
$ home_lot  : Factor w/ 3 levels "1","2","3": 2 2 2 2 2 2 3 3 1 2 ...
$ hispanic  : Factor w/ 2 levels "0","1": 1 1 1 1 1 1 1 1 1 1 ...
$ comp_grp2 : Factor w/ 16 levels "Cr_Gr","Cr_Ot",..: 13 13 13 13 13 13 10 10 10 10 ...
$ know_fin : Factor w/ 3 levels "0","1","2": 2 2 2 2 2 2 2 2 2 2 ...


With the regular glm() function, I get a warning about "perfect or 
quasi-perfect separation"[1]. I looked for a method to deal with this 
and a penalized GLM is an accepted method[2]. This is implemented in 
logistf(). I used the default settings for the function.


Just before I run the model, memory.size() for my session is ~4500 (MB). 
memory.limit() is ~25500. When I start the model, R immediately becomes 
non-responsive. This is in a Windows environment and in Task Manager, 
the instance of R is, and has been, using ~13% of CPU and ~4997 MB of 
RAM. It's been ~24 hours now in that state and I don't have any idea of 
how long this should take. If I run the same model in the same setting 
with the base glm(), the model runs in about 60 seconds. Is there a way 
to know if the process is going to produce something useful after all 
this time or if it's hanging on some kind of problem?
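
One way to get a feel for the expected run time, offered as a rough sketch rather than a diagnosis: count how many coefficients the formula implies, then time the fit on growing subsets and extrapolate. Going by the str() output above, the factors in the formula (80 counties, 16 comparison groups, and so on) expand to roughly 120 coefficients. As far as I know, logistf() by default (pl = TRUE) also computes a profile penalized likelihood confidence interval for every coefficient, which may account for much of the waiting; calling it with pl = FALSE is worth trying first. The data frame toy and its columns below are hypothetical stand-ins for knowf3, and glm() is used only so that the sketch runs with base R.

```r
# Toy data standing in for knowf3; all names and values here are hypothetical
set.seed(42)
n <- 400
toy <- data.frame(
  y      = rbinom(n, 1, 0.5),
  age    = sample(20:80, n, replace = TRUE),
  county = factor(sample(sprintf("county%02d", 1:20), n, replace = TRUE))
)

# Number of coefficients = number of columns of the design matrix
n_coef <- ncol(model.matrix(y ~ age + county, data = toy))
n_coef

# Time the fit on growing subsets to see how the run time scales
sizes <- c(100, 200, 400)
elapsed <- sapply(sizes, function(m) {
  sub <- droplevels(toy[seq_len(m), ])   # drop factor levels absent from the subset
  system.time(glm(y ~ age + county, family = binomial, data = sub))[["elapsed"]]
})
data.frame(rows = sizes, seconds = elapsed)
```

Swapping glm() for the slow fitting function in the timing loop gives a few points from which to judge whether the full data set is a matter of minutes or of days.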



  [1]: 
https://stats.stackexchange.com/questions/11109/how-to-deal-with-perfect-separation-in-logistic-regression#68917
  [2]: 
https://academic.oup.com/biomet/article-abstract/80/1/27/228364/Bias-reduction-of-maximum-likelihood-estimates



--
Men occasionally stumble
over the truth, but most of them
pick themselves up and hurry off
as if nothing had happened.
-- Winston Churchill



Re: [R] fill out a PDF form in R

2017-07-26 Thread Ulrik Stervbo
On second thought, you could also use pdftk to fill out the pdf form with
data generated in R.
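
A minimal sketch of that idea, assuming pdftk is installed and on the PATH. The file name form.pdf and the field names are hypothetical, and the helpers fdf_escape() and make_fdf() are made up for this illustration, not part of any package. The sketch writes the values to an FDF file and hands it to pdftk's fill_form operation; dump_data_fields lists the field names the form actually uses.

```r
# Escape characters that are special inside PDF string literals: \ ( )
fdf_escape <- function(x) gsub("([\\\\()])", "\\\\\\1", x, perl = TRUE)

# Build the lines of an FDF file from a named character vector of field values
make_fdf <- function(values) {
  fields <- sprintf("<< /T (%s) /V (%s) >>",
                    fdf_escape(names(values)), fdf_escape(values))
  c("%FDF-1.2", "1 0 obj",
    "<< /FDF << /Fields [", fields, "] >> >>",
    "endobj", "trailer", "<< /Root 1 0 R >>", "%%EOF")
}

values <- c(name = "Jane Doe", comment = "generated in R (test)")  # hypothetical fields
fdf_file <- tempfile(fileext = ".fdf")
writeLines(make_fdf(values), fdf_file)

form <- "form.pdf"   # hypothetical input form
if (nzchar(Sys.which("pdftk")) && file.exists(form)) {
  # List the form's field names, then write a filled copy
  system2("pdftk", c(form, "dump_data_fields"))
  system2("pdftk", c(form, "fill_form", fdf_file, "output", "filled.pdf"))
}
```

The field names in values have to match those reported by dump_data_fields, so it is worth running that step first on the real form.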

On Wed, 26 Jul 2017 at 14:01 Ulrik Stervbo wrote:

> Hi Elahe,
>
> I have no clue, but maybe you can dump the data fields using pdftk, and
> work with those in R.
>
> HTH
> Ulrik
>
> On Wed, 26 Jul 2017 at 13:50 Elahe chalabi via R-help <
> r-help@r-project.org> wrote:
>
>> Hi all,
>>
>> I would like to know whether it is possible to fill out a PDF form from R
>> and, if so, how. I could not find anything helpful on the Internet.
>>
>> Does anyone know a good link for that or have experience in this?
>> Thanks for any help!
>>
>> Elahe
>>
>



Re: [R] fill out a PDF form in R

2017-07-26 Thread Ulrik Stervbo
Hi Elahe,

I have no clue, but maybe you can dump the data fields using pdftk, and
work with those in R.

HTH
Ulrik

On Wed, 26 Jul 2017 at 13:50 Elahe chalabi via R-help wrote:

> Hi all,
>
> I would like to know whether it is possible to fill out a PDF form from R
> and, if so, how. I could not find anything helpful on the Internet.
>
> Does anyone know a good link for that or have experience in this?
> Thanks for any help!
>
> Elahe
>



[R] fill out a PDF form in R

2017-07-26 Thread Elahe chalabi via R-help
Hi all,

I would like to know whether it is possible to fill out a PDF form from R
and, if so, how. I could not find anything helpful on the Internet.

Does anyone know a good link for that or have experience in this?
Thanks for any help!

Elahe



Re: [R] axis() after image.plot() does not work except if points() is inserted between

2017-07-26 Thread Marc Girondot via R-help
Thanks... I agree that the problem is explained in the documentation, 
but I can't find a way to make axis() work, even by manipulating 
par("plt") or by using graphics.reset = TRUE:

- adding graphics.reset = TRUE does not make axis() draw anything;
- I see that par()$plt is involved, but it is not sufficient to 
explain why axis() works after points(): if par("plt") is changed by 
hand, the axes are still not shown.


Thanks for the trick about range(). I hadn't noticed that it also has an 
na.rm option. It is more elegant.
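
For the record, a small illustration of that point, using a copy of the matrix below with one value replaced by NA (my own modification, only to show the effect of na.rm):

```r
# Same values as D below, with the second entry replaced by NA
D_na <- matrix(c(10, NA, 25, 30, 12, 22, 32, 35, 13, 25, 38, 40), nrow = 3)

range(D_na)                  # NA NA, because of the missing value
range(D_na, na.rm = TRUE)    # 10 40
# The longer form used previously gives the same limits
c(min(D_na, na.rm = TRUE), max(D_na, na.rm = TRUE))
```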


Here is the code showing the various suggestions and checks:


library(fields)
D <- matrix(c(10, 20, 25, 30, 12, 22, 32, 35, 13, 25, 38, 40), nrow=3)
(pplt <- par()$plt)
### [1] 0.08844944 0.95469663 0.14253275 0.88541485

# Original problem: axis() draws nothing

image.plot(D, col=rev(heat.colors(128)),bty="n", xlab="Lines",
   ylab="Columns", cex.lab = 0.5, zlim=range(D, na.rm=TRUE),
   las=1, axes=FALSE)
# Check the value of par()$plt; it is indeed modified
par()$plt
### [1] 0.08844944 0.86408989 0.14253275 0.88541485
# axis() does not work
axis(1, at=seq(from=0, to=1, length=nrow(D)), labels=0:2, cex.axis=0.5)
axis(2, at=seq(from=0, to=1, length=ncol(D)), labels=0:3, las=1, 
cex.axis=0.5)


# Restore par("plt") to its original value
par(plt=pplt)

# With graphics.reset = TRUE, axis() still draws nothing

par()$plt
### [1] 0.08844944 0.95469663 0.14253275 0.88541485
image.plot(D, col=rev(heat.colors(128)),bty="n", xlab="Lines",
   ylab="Columns", cex.lab = 0.5, zlim=range(D, na.rm=TRUE),
   las=1, axes=FALSE, graphics.reset = TRUE)
# Check the effect on par()$plt: with graphics.reset = TRUE, par()$plt is indeed restored

par()$plt
### [1] 0.08844944 0.95469663 0.14253275 0.88541485
# But the axes are not shown
axis(1, at=seq(from=0, to=1, length=nrow(D)), labels=0:2, cex.axis=0.5)
axis(2, at=seq(from=0, to=1, length=ncol(D)), labels=0:3, las=1, 
cex.axis=0.5)


# Check an effect of points() on par()$plt
# There is an effect
# axes are shown

par()$plt
### [1] 0.08844944 0.95469663 0.14253275 0.88541485
image.plot(D, col=rev(heat.colors(128)),bty="n", xlab="Lines",
   ylab="Columns", cex.lab = 0.5, zlim=range(D, na.rm=TRUE),
   las=1, axes=FALSE)
points(1.5, 1.5, type="p")
par()$plt
### [1] 0.08844944 0.86408989 0.14253275 0.88541485
axis(1, at=seq(from=0, to=1, length=nrow(D)), labels=0:2, cex.axis=0.5)
axis(2, at=seq(from=0, to=1, length=ncol(D)), labels=0:3, las=1, 
cex.axis=0.5)


# Try to reproduce the effect of points() by setting par("plt") by hand
# The axes are not shown, so points() must be doing something else!

image.plot(D, col=rev(heat.colors(128)),bty="n", xlab="Lines",
   ylab="Columns", cex.lab = 0.5, zlim=range(D, na.rm=TRUE),
   las=1, axes=FALSE)
par(plt=c(0.08844944, 0.86408989, 0.14253275, 0.88541485))
axis(1, at=seq(from=0, to=1, length=nrow(D)), labels=0:2, cex.axis=0.5)
axis(2, at=seq(from=0, to=1, length=ncol(D)), labels=0:3, las=1, 
cex.axis=0.5)

# Check that par("plt") is set correctly
par()$plt
### [1] 0.08844944 0.86408989 0.14253275 0.88541485

I think it will remain a mystery!
At least the trick with points() works.
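
A possible alternative that avoids relying on points(), given only as a sketch: draw the image and the axes with base graphics first, then ask image.plot() for the legend alone. This assumes the fields package is installed and that its legend.only argument behaves as I understand it from the help page; the margin setting is the one from my earlier code, and the names cols, zl, at_x and at_y are introduced here for the illustration.

```r
# Sketch: draw the image and axes with base graphics, then add only the legend
D <- matrix(c(10, 20, 25, 30, 12, 22, 32, 35, 13, 25, 38, 40), nrow = 3)
cols <- rev(heat.colors(128))
zl <- range(D, na.rm = TRUE)

op <- par(mar = c(5, 4.5, 4, 7))   # leave room on the right for a legend
image(D, col = cols, zlim = zl, axes = FALSE,
      xlab = "Lines", ylab = "Columns", cex.lab = 0.5)

# The axes are drawn while the image's plot region is still the current one
at_x <- seq(from = 0, to = 1, length.out = nrow(D))
at_y <- seq(from = 0, to = 1, length.out = ncol(D))
axis(1, at = at_x, labels = 0:2, cex.axis = 0.5)
axis(2, at = at_y, labels = 0:3, las = 1, cex.axis = 0.5)

# Assumption: fields is installed; legend.only = TRUE adds just the colour bar
if (requireNamespace("fields", quietly = TRUE)) {
  fields::image.plot(zlim = zl, col = cols, legend.only = TRUE)
}
par(op)
```

Because the axes are added before image.plot() touches the graphics parameters, the side effect discussed in this thread should not come into play.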

Thanks
Marc


On 25/07/2017 at 13:03, Martin Maechler wrote:

Marc Girondot via R-help 
 on Mon, 24 Jul 2017 09:35:06 +0200 writes:

 > Thanks for the suggestion. As you see below, par("usr") is the same
 > before and after the points() (the full code is below):
 > 
 >> par("usr")
 > [1] -0.250  1.250 -0.167  1.167
 >> # if you remove this points() function, axis will show nothing.
 >>
 >> points(1.5, 1.5, type="p")
 >> p2 <- par(no.readonly=TRUE)
 >> par("usr")
 > [1] -0.250  1.250 -0.167  1.167
 > ...

 > I can reproduce it in Ubuntu, in the Mac OS X R GUI and in RStudio (R 3.4.1).

 > Marc

 > Here is the code:
 > library(fields)
 > par(mar=c(5,4.5,4,7))
 > D <- matrix(c(10, 20, 25, 30, 12, 22, 32, 35, 13, 25, 38, 40), nrow=3)

 > p0 <- par(no.readonly=TRUE)
 > image.plot(D, col=rev(heat.colors(128)),bty="n", xlab="Lines",
 >   ylab="Columns", cex.lab = 0.5,
 >   zlim=c(min(D, na.rm=TRUE),max(D, na.rm=TRUE)),
 >   las=1, axes=FALSE)
 > p1 <- par(no.readonly=TRUE)

 > par("usr")
 > par("xpd")

 > # if you remove this points() function, axis will show nothing.

 > points(1.5, 1.5, type="p")
 > p2 <- par(no.readonly=TRUE)
 > par("usr")
 > par("xpd")

 > ##
 > axis(1, at=seq(from=0, to=1, length=nrow(D)), labels=0:2, cex.axis=0.5)
 > axis(2, at=seq(from=0, to=1, length=ncol(D)), labels=0:3, las=1,
 > cex.axis=0.5)

 > identical(p1, p2)

Have you ever carefully read the detailed help page about image.plot()?

I haven't, but a cursory reading already shows me that the
author of the function did this partly on purpose:

   > Side Effects:
   >
   >  After exiting, the plotting region may be changed to make it
   >