Re: [R] Coefficients of Logistic Regression from bootstrap - how to get them?

Michal Figurski Tue, 22 Jul 2008 12:49:09 -0700

Dear Marc and all,

Thank you for all the due respect.

I tried to explain as explicitly as I could what I am trying to do in my first email. I did not invent this procedure; it was already published in the paper:

T. Pawinski, M. Hale, M. Korecka, W.E. Fitzsimmons, L.M. Shaw. Limited Sampling Strategy for the Estimation of Mycophenolic Acid Area under the Curve in Adult Renal Transplant Patients Treated with Concomitant Tacrolimus. Clinical Chemistry 2002(48:9), 1497-1504

I only adopted this methodology to work under SAS and now I try to do it under R, because I like R. I need practical advice because I have a practical problem, and I do not understand much of the theoretical discussion on what bootstrap is suitable for or not. Apparently I am trying to use it for something else than the experts are used to...

Honestly, I did not learn anything from this discussion so far, I am just disappointed.

Though, since the discussion has already started, I'd welcome your criticism on this procedure - I just ask that you express it in human language.


--
Michal J. Figurski

Marc Schwartz wrote:

Michal,
With all due respect, you have openly acknowledged that you don't know enough about the subject at hand.
If that is the case, on what basis are you in a position to challenge the collective wisdom of those professionals who have voluntarily offered *expert* level statistical advice to you?
You have erected a wall around your thinking.
You may choose to use R or any other software application to "Git-R-Done". But that does not make it correct.
There are other methods to consider that could be used during the model building process itself, rather than on a post-hoc basis and I would specifically refer you to Frank's book, Regression Modeling Strategies:
  http://biostat.mc.vanderbilt.edu/twiki/bin/view/Main/RmS

Marc Schwartz

on 07/22/2008 09:43 AM Michal Figurski wrote:
Hmm...
It sounds like ideology to me. I was asking for technical help. I know what I want to do, just don't know how to do it in R. I'll go back to SAS then. Thank you.
--
Michal J. Figurski

Doran, Harold wrote:
I think the answer has been given to you. If you want to continue to
ignore that advice and use bootstrap for point estimates rather than the
properties of those estimates (which is what bootstrap is for) then you
are on your own.
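The distinction drawn here, using the bootstrap to describe the variability of estimates rather than to produce the estimates themselves, can be sketched in base R. This is only an illustration: the data are simulated and the column names (x1, x2, y) are hypothetical. The point estimate stays the original maximum likelihood fit, and the resamples are used only for standard errors and percentile intervals.

```r
set.seed(2)
n <- 109
# Simulated stand-in data (hypothetical; not the data discussed in the thread)
dat <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
dat$y <- rbinom(n, 1, plogis(0.7 * dat$x1 - 0.5 * dat$x2))

fit <- glm(y ~ x1 + x2, data = dat, family = binomial)

# Refit on rows drawn with replacement; each column of `reps` is one replicate
reps <- replicate(500, {
  idx <- sample(nrow(dat), replace = TRUE)
  coef(glm(y ~ x1 + x2, data = dat[idx, ], family = binomial))
})

point_est <- coef(fit)              # point estimate: the original fit
boot_se   <- apply(reps, 1, sd)     # bootstrap standard errors
boot_ci   <- apply(reps, 1, quantile, probs = c(0.025, 0.975))  # percentile intervals
```

Here `boot_ci` has one column per coefficient, with the lower and upper percentile limits in its two rows.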
-----Original Message-----
From: [EMAIL PROTECTED][mailto:[EMAIL PROTECTED] On Behalf Of Michal Figurski
Sent: Tuesday, July 22, 2008 9:52 AM
To: r-help@r-project.org
Subject: Re: [R] Coefficients of Logistic Regression from bootstrap - how to get them?
Dear all,
I don't want to argue with anybody about words or about what bootstrap is suitable for - I know too little for that.
All I need is help to get the *equation coefficients* optimized by bootstrap - either by one of the functions or by simple median.
Please help,

--
Michal J. Figurski
HUP, Pathology & Laboratory Medicine
Xenobiotics Toxicokinetics Research Laboratory
3400 Spruce St. 7 Maloney
Philadelphia, PA 19104
tel. (215) 662-3413
Frank E Harrell Jr wrote:
Michal Figurski wrote:
Frank,

"How does bootstrap improve on that?"
I don't know, but I have an idea. Since the data in my set are just a small sample of a big population, then if I use my whole dataset to obtain max likelihood estimates, these estimates may be best for this dataset, but far from ideal for the whole population.
The bootstrap, being a resampling procedure from your sample, has the same issues about the population as MLEs.
I used bootstrap to virtually increase the size of my dataset, it should result in estimates more close to that from the population - isn't it the purpose of bootstrap?
No
When I use such median coefficients on another dataset (another sample from population), the predictions are better, than using max likelihood estimates. I have already tested that and it worked!
Then your testing procedure is probably not valid.
I am not a statistician and I don't feel what "overfitting" is, but it may be just another word for the same idea.
Nevertheless, I would still like to know how can I get the coefficients for the model that gives the "nearly unbiased estimates".
I greatly appreciate your help.
More info in my book Regression Modeling Strategies.

Frank
--
Michal J. Figurski
HUP, Pathology & Laboratory Medicine
Xenobiotics Toxicokinetics Research Laboratory
3400 Spruce St. 7 Maloney
Philadelphia, PA 19104
tel. (215) 662-3413
Frank E Harrell Jr wrote:
Michal Figurski wrote:
Hello all,
I am trying to optimize my logistic regression model by using bootstrap. I was previously using SAS for this kind of task, but I am now switching to R.
My data frame consists of 5 columns and has 109 rows. Each row is a single record composed of the following values: Subject_name, numeric1, numeric2, numeric3 and outcome (yes or no). All three numerics are used to predict outcome using LR.
In SAS I have written a macro, that was splitting the dataset, running LR on one half of data and making predictions on second half. Then it was collecting the equation coefficients from each iteration of bootstrap. Later I was just taking medians of these coefficients from all iterations, and used them as an optimal model - it really worked well!
Why not use maximum likelihood estimation, i.e., the coefficients from the original fit? How does the bootstrap improve on that?
Now I want to do the same in R. I tried to use the 'validate' or 'calibrate' functions from package "Design", and I also experimented with function 'sm.binomial.bootstrap' from package "sm". I tried also the function 'boot' from package "boot", though without success - in my case it randomly selected _columns_ from my data frame, while I wanted it to select _rows_.
validate and calibrate in Design do resampling on the rows.
Resampling is mainly used to get a nearly unbiased estimate of the model performance, i.e., to correct for overfitting.
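To make "correct for overfitting" concrete, here is a hand-rolled base R sketch of a bootstrap optimism correction. It is not the code behind 'validate' in Design. The data are simulated, the column names (x1, x2, x3, y) are hypothetical, and the Brier score is used as the performance measure purely as an assumption for illustration. The resamples estimate how much the apparent performance flatters the model; the coefficients themselves still come from the single fit on all rows.

```r
set.seed(3)
n <- 109
# Simulated stand-in data (hypothetical; not the data discussed in the thread)
dat <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))
dat$y <- rbinom(n, 1, plogis(0.8 * dat$x1 - 0.6 * dat$x2))

# Brier score: mean squared difference between outcome and predicted probability
brier <- function(fit, data) {
  p <- predict(fit, newdata = data, type = "response")
  mean((data$y - p)^2)
}

fit_all  <- glm(y ~ x1 + x2 + x3, data = dat, family = binomial)
apparent <- brier(fit_all, dat)   # performance on the data used for fitting

# For each resample of rows: fit on the resample, then compare its score on
# the resample itself with its score on the original data
optimism <- replicate(200, {
  bs    <- dat[sample(nrow(dat), replace = TRUE), ]
  fit_b <- glm(y ~ x1 + x2 + x3, data = bs, family = binomial)
  brier(fit_b, bs) - brier(fit_b, dat)
})

corrected <- apparent - mean(optimism)   # optimism-corrected Brier score
coef(fit_all)                            # the reported model is the original fit
```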

Frank Harrell
Though the main point here is the optimized LR equation. I would appreciate any help on how to extract the LR equation coefficients from any of these bootstrap functions, in the same form as given by 'glm' or 'lrm'.

Many thanks in advance!
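A minimal sketch of how 'boot' can be made to resample rows and return coefficients in the form given by 'glm'. The data are simulated stand-ins for the 109-row data frame described above, with the outcome coded 0/1 rather than yes/no, so all values are hypothetical. The original code is not shown in the thread, so the cause of the column selection is an assumption: indexing a data frame as `data[idx]` selects columns, while `data[idx, ]` selects rows.

```r
library(boot)  # recommended package, shipped with standard R installations

set.seed(1)
n <- 109
# Simulated stand-in data (hypothetical values)
dat <- data.frame(numeric1 = rnorm(n), numeric2 = rnorm(n), numeric3 = rnorm(n))
dat$outcome <- rbinom(n, 1, plogis(0.5 * dat$numeric1 - 0.8 * dat$numeric2))

# boot() calls the statistic with the data and a vector of resampled row indices
coef_fun <- function(data, idx) {
  fit <- glm(outcome ~ numeric1 + numeric2 + numeric3,
             data = data[idx, ], family = binomial)  # note the comma: rows
  coef(fit)
}

b <- boot(dat, coef_fun, R = 200)

b$t0      # coefficients from the fit on the original data
dim(b$t)  # one row of coefficients per bootstrap replicate
```

Each column of `b$t` holds the replicates for one coefficient, in the same order as `b$t0`, so any summary of the replicates can be computed column by column. The replies in this thread argue that the model to report is `b$t0`, with the replicates describing its variability.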


______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
