Re: Murray/Hernstein

2000-10-29 Thread Bryan Caplan

Chris Rasch wrote:
 
 My current level of understanding of econometrics and statistics is such that I
 don't feel qualified to evaluate the arguments presented in the recent exchange
 between Bryan and Chris regarding the merits (or lack thereof) of Murray and
 Herrnstein's research in The Bell Curve.  Assuming I wanted to remedy that
 situation, what texts would you recommend I study to learn the vocabulary and to
 at least recognize when a good (or bad) argument is being made?

I'm afraid you'd really need to go through a graduate econometrics
textbook to get the lingo down, and then hang around with practitioners
for a while to get a feel for what they actually do.  I'm not too happy
with any graduate econometrics text, but I use Johnston and DiNardo.

-- 
  Prof. Bryan Caplan   [EMAIL PROTECTED] 
 
  http://www.gmu.edu/departments/economics/bcaplan 
 
  "[W]hen we attempt to prove by direct argument, what is really
   self-evident, the reasoning will always be inconclusive; for it
   will either take for granted the thing to be proved, or something
   not more evident; and so, instead of giving strength to the
   conclusion, will rather tempt those to doubt of it, who never
   did so before."  
-- Thomas Reid, _Essays on the Active Powers of the Human Mind_



Re: Murray/Hernstein

2000-10-29 Thread Bryan Caplan

Chris Auld wrote:
 
 Bryan Caplan wrote:
 
  Most of these wind up being dependent variables at some point in their
  book.
 
 I'm not sure why that helps their case at all -- it's as if they've
 produced a whole bunch of reduced form equations, treating age,
 and, probably inappropriately, SES and AFQT as the only exogenous
 regressors.  Take any one of the outcomes they consider and the
 relevant related discussion.  Cast it as a study on that particular
 outcome.  Would it be accepted by any economics journal?  I don't
 think so -- so why should I think that if we put a couple of dozen
 of these together, we arrive at something compelling?

I think so.  Anyone can get one marginally convincing result.  But
getting hundreds shows something.  There is even a formal test using
this intuition - the p-lambda test.   Eight results (out of 10)
significant at the 20% level, for example, could almost never arise from
pure chance. 
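
A back-of-the-envelope check of that last figure, as a sketch only: it assumes the ten tests are independent (results drawn from one sample generally are not), and it is a plain binomial tail calculation, not an implementation of the p-lambda test itself.  The function name is made up for illustration.

```python
from math import comb


def binomial_tail(n: int, k: int, p: float) -> float:
    """P(at least k successes in n independent trials, each with probability p)."""
    return sum(comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(k, n + 1))


# Under the null, each test comes out "significant at the 20% level"
# with probability 0.20.  Chance of 8 or more such results out of 10:
prob = binomial_tail(10, 8, 0.20)
print(f"P(at least 8 of 10 significant by chance) = {prob:.6f}")
```

Under those assumptions the probability comes to about 0.00008, or roughly 1 in 13,000.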

  variable.  I think there are good practical reasons not to do this, and
  there is a wealth of research that uses simplified indices in place of
  the kitchen sink (e.g. "Democracy" indices, "Rule of Law" indices, "Bank
  Failure" indices, etc.)
 
 Sure, but here there was no reason not to simply put the four variables
 in each regression, rather than the ad hoc index.

There's as much or as little reason to use an index in TBC as anywhere. 
Should they have put in each test sub-scale separately, too?

 
  My memory is not too good here - I read a few pieces by Heckman on this,
  but nothing that I remember reaching results that were "dramatically"
  different.
 
 You don't take it as problematic for M/H's conclusions that Heckman
 found there was no way to separate the effects of intelligence from
 education?  

I don't think it's possible for them to show there's "no way" to do it. 
They could certainly point out data limitations, and offer their
judgment that these are insuperable.  But I wouldn't call that judgment
a "finding."

 That he and coauthors showed the returns to intelligence
 vary markedly across subpopulations?

If there's "no way" to do it for the whole population, how did they
manage? :-)  Seriously, I'd expect that you could re-do almost anyone's
results this way.  In each case, you would learn more, but unless the
whole-sample results drastically reversed I don't see why it's so
interesting.

  What you call an "endogeneity problem" doesn't fit the usual textbook
  description.  Normally correlation among independent variables isn't a
  problem, though it complicates the interpretation if changing one
  variable almost always changes the other.
 
 I am referring to correlation between the error term and regressors as
 "endogeneity," which is of course the usual textbook definition.

OK.

 Correlation amongst the independent variables is actually a problem (it
 reduces precision of the estimates), it's just not a problem that causes
 inconsistency.  

OK.
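
To put a number on the precision loss, a minimal sketch assuming a linear regression with exactly two regressors whose sample correlation is r: the sampling variance of each slope is inflated by 1/(1 - r^2) relative to the uncorrelated case.  The estimates stay consistent; only their standard errors grow.  The function name is illustrative.

```python
def variance_inflation(r: float) -> float:
    """Variance inflation factor for a two-regressor linear model.

    r is the correlation between the two regressors; the result is the
    factor by which each slope's sampling variance grows relative to r = 0.
    """
    if not -1.0 < r < 1.0:
        raise ValueError("r must lie strictly between -1 and 1")
    return 1.0 / (1.0 - r * r)


for r in (0.0, 0.5, 0.9, 0.99):
    vif = variance_inflation(r)
    # Standard errors scale with the square root of the variance.
    print(f"r = {r:4.2f}: variance x {vif:6.2f}, std. error x {vif ** 0.5:5.2f}")
```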

   That said, I'm not sure education, at least, is particularly
   well measured in most datasets, as we generally ignore quality measures.
 
  Well-measured compared to what?
 
 Lots of things (to take some obvious examples: gender, age, nationality,
 region of residence).  

OK, if that's your benchmark of quality.  It doesn't leave much.

   And recall M/H "solve" the collinearity problem they have between education
   and IQ by *dropping* education.  Some solution!
 
  Though it may appall you, many textbooks recommend it.
 
 Many textbooks recommend dropping a regressor, then interpreting the
 coefficients on the remaining regressors _as if_ the dropped regressor
 was still in the equation?  That's clearly not true.  What are these
 "many textbooks," out of curiosity (I can't recall ever reading anyone
 recommending that one "solve" collinearity problems by just tossing
 out regressors)?  And surely they note that such an exclusion affects how
 one should interpret the coefficients on the remaining regressors, no?

I'm sure they put it more circumspectly, but that's what it amounts to.
If you did a regression with both "what you said your education is" and
"what your brother said your education is" and the SEs on both exploded,
what would any econometrics prof tell you?  That "there's no way to
decide" which matters, and we have to be agnostic?  That's the purist
answer, but I think most practitioners would tell you to drop the second
measure and call the results from the new specification the "return to
education."

   In spite of all
  of his reservations about TBC, Bill Dickens felt pretty comfortable with
  their omission.
 
 Perhaps Bill can explain why?

Alas, we are bereft of his wisdom here because of his other commitments.
-- 
  Prof. Bryan Caplan   [EMAIL PROTECTED] 
 
  http://www.gmu.edu/departments/economics/bcaplan 
 

Re: Murray/Hernstein

2000-10-27 Thread Bryan Caplan

On the adequacy of M/H's SES measure, I know list member Bill Dickens
has done a lot of work on this.  He revised their measure to include
more information.  Better if Bill summarizes, but on the whole I'd say
he concluded that M/H's SES moderately understates the importance of
SES, but intelligence still matters a great deal.

On the other hand, as behavioral geneticists (and Chris?) will point
out, since SES partially reflects genotype, this measure *understates*
the real importance of intelligence. 
-- 
Prof. Bryan Caplan   [EMAIL PROTECTED]
http://www.gmu.edu/departments/economics/bcaplan

  "We may be dissatisfied with television for two quite different 
   reasons: because our set does not work, or because we dislike 
   the program we are receiving.  Similarly, we may be dissatisfied 
   with ourselves for two quite different reasons: because our body 
   does not work (bodily illness), or because we dislike our 
   conduct (mental illness)."
   --Thomas Szasz, *The Untamed Tongue*



Re: Murray/Hernstein

2000-10-27 Thread Chris Auld




On Fri, 27 Oct 2000, Bryan Caplan wrote:

 more information.  Better if Bill summarizes, but on the whole I'd say
 he concluded that M/H's SES moderately understates the importance of
 SES, but intelligence still matters a great deal.

Don't get me wrong: I'm not claiming that intelligence doesn't matter.
What I'm claiming is that disentangling the marginal effect of
intelligence from other factors that influence various outcomes is
not "pretty easy."  In particular, the approach taken by M/H is so
statistically inept that it's impossible to discern from their work
whether intelligence matters directly, and if it does, by how much. 


Chris Auld  (403)220-4098
Economics, University of Calgary   mailto:[EMAIL PROTECTED]
Calgary, Alberta, Canada           URL:http://jerry.ss.ucalgary.ca/




Re: Murray/Hernstein

2000-10-26 Thread Chris Auld



 Chris, could you summarize the alleged deficiencies of the Bell Curve?
 -fabio

Many others have critiqued their methods, their interpretation of
the psychometric literature, and their analysis of their own
original results.  You can find lengthy criticism in:

Kincheloe et al (1996) Measured Lies: the Bell Curve Examined.

Devlin et al (1997) Intelligence and Success: Is it All in the Genes?
Scientists Respond to the Bell Curve.

Here's my take on some aspects: almost all of the original results are
based on logit models that look like

y* = constant + b(SES) + c(AFQT) + d(age) + extreme value noise.

There are rarely other covariates, and where there are, they aren't exactly
exhaustive.  SES is simply a weighted sum of father's and mother's education,
family income (of the respondent's parents), and an index of
parents' occupational status. The weights are more or less ad hoc, so
including SES is the same as including all these covariates, but then
adding three ad hoc linear restrictions.  
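
A small sketch of why one index amounts to three restrictions.  The equal weights below are hypothetical placeholders (the actual weights are not given in this thread), and the variable names are only illustrative labels for the four components.  A single coefficient on the index fixes all four component coefficients up to one free scale, so their ratios are pinned down by the weights.

```python
# Hypothetical equal weights, for illustration only.
weights = {"mothersed": 0.25, "fathersed": 0.25, "faminc": 0.25, "occind": 0.25}


def implied_coefficients(b_ses: float, weights: dict) -> dict:
    """Coefficients on the components implied by b_ses * (weighted sum)."""
    return {name: b_ses * w for name, w in weights.items()}


def count_restrictions(weights: dict) -> int:
    """Free coefficients without the index, minus the one left with it."""
    return len(weights) - 1


implied = implied_coefficients(2.0, weights)
print(implied)                      # every component gets b_ses * its weight
print(count_restrictions(weights))  # four components, one free scale
```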

Now, the outcomes y* include: unemployment, levels of educational
attainment, poverty, marital status, illegitimate births, welfare
dependency, low birth weight children, criminal activities, and so on. 
Consider any one of these outcomes, say, "the subject was in the top
decile on an index of self-reported crime."  Suppose anyone on this list
took NLS-Y data, constructed an indicator for that condition, and typed: 

 logit criminal age mothersed fathersed faminc occind afqt

into an econometric package (remembering income and occupation refer to
the family where the respondent grew up, not to the respondent).  We then
arbitrarily add three restrictions to get

 logit criminal age ses afqt.

We then interpret the coefficient on AFQT as the causal effect of higher
cognitive ability on propensity to be a criminal, all else equal, write up
our result, and send it off to a journal. 

Of course, it barely stops moving across the editor's desk before being
popped back in the mail, rejected.  Where to start documenting the
problems with this interpretation of the regression above?  The
respondent's own income, gender, occupation, marital status, health, and
so on have been excluded.  Since all of these outcomes are related to
the distribution of intelligence, the coefficient on AFQT reflects all
these effects, not the marginal impact of intelligence.  It is highly
doubtful SES is controlling for background adequately, and we haven't 
even controlled for education!  Education!  It is true that the authors
stratify by coarse educational groupings, but that's nowhere near good
enough.  And remember that, as my little Monte Carlo showed, even a little
endogeneity causes big problems in this context, even if they'd gone to
the trouble of properly trying to hold all else equal. 
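
The omitted-variable point can be sketched with a toy simulation.  This is a linear stand-in rather than a logit, it is not the Monte Carlo mentioned above, and every number in it is made up: an "ability" score with a true direct effect of 1.0, and an omitted factor (think of the respondent's own education) that depends on ability and has its own effect of 2.0.

```python
import random


def slope(x, y):
    """OLS slope from regressing y on x alone (with an intercept)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


def simulate_short_regression(n=20000, b_ability=1.0, b_omitted=2.0,
                              link=0.5, seed=0):
    """Slope on ability when the omitted factor is left out of the regression."""
    rng = random.Random(seed)
    ability = [rng.gauss(0, 1) for _ in range(n)]
    # The omitted factor is partly driven by ability.
    omitted = [link * a + rng.gauss(0, 1) for a in ability]
    outcome = [b_ability * a + b_omitted * z + rng.gauss(0, 1)
               for a, z in zip(ability, omitted)]
    return slope(ability, outcome)


estimate = simulate_short_regression()
print(f"true direct effect 1.0, short-regression estimate {estimate:.2f}")
```

In this setup the short regression recovers roughly b_ability + b_omitted * link = 2.0, twice the direct effect, because the coefficient absorbs everything that travels through the omitted factor.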

So, take a whole bunch of more or less uninterpretable logit regressions,
make some lousy conclusions from them, and write a book: that's the bulk
of the Bell Curve.  The other part concerns how race factors into all this,
and I'm not even going to go there. 

Individually, none of Herrnstein and Murray's results would pass muster
as an undergraduate term paper in economics, much less a study in a 
refereed journal.  And if you sum junk, you just get aggregate junk. 



Chris Auld  (403)220-4098
Economics, University of Calgary   mailto:[EMAIL PROTECTED]
Calgary, Alberta, Canada           URL:http://jerry.ss.ucalgary.ca/