On Thu, 22 Apr 2010 08:34:42 -0700, Rick Froman wrote:
>I agree with Jim that such results are not "ludicrous" (my poor choice of
>words) and that they should be treated as estimates that went outside of the
>range of what is possible (for example, a z-score associated with a raw score
>so high or low that such a raw test score wouldn't be possible because you
>can't get less than 0 items correct or you would have to get more items
>correct than were on the test). If such results show, for example, a p
>greater than 1 or an F less than 0, I think those results should not just be
>reproduced without comment. I was mainly concerned with cases where a person
>would think that the negative sign was actually providing useful information
>and was not commented on.
I'm not sure what this last comment is based on. The incorrect regression
results in pre-2003 Excel were probably known to statisticians and serious
data analysts, which is why the usual advice was not to use Excel for any
serious data analysis (advice that is still being given today). The
programmers of Excel clearly didn't understand the problem, and the naive
user of Excel with no background in statistics probably would not realize
the magnitude of the error. Outside of these folks, who do you think would
actually report a negative F-value?

>Also, the use of the term r-squared to refer to a negative number
>just seems entirely confusing and reporting such a thing uncommented or
>unrevised would indicate to me that the person didn't realize Jim's point,
>that the number is an estimate that is not demonstrating a real effect. I
>wasn't actually advocating ignoring the misleading numbers but just not
>reporting them without realizing they indicate something.

Again, I don't know why you would think that someone would do something like
this. I brought up the example of a negative R-square, which can occur in
multilevel or hierarchical linear modeling (HLM) as well as in structural
equation modeling (SEM), and in regression analysis where the equation is
specified to have a zero intercept but in actuality the intercept is not
zero. Only the most naive or unknowledgeable person would think that a
negative R-square is a reportable result rather than a red flag that
something is wrong. Earlier I referred to Kreft & de Leeuw's multilevel
analysis text, where they point out that when one includes a random factor
or random coefficient in an HLM analysis, the concept of "total variance"
is somewhat problematic, which is why R-square can sometimes be negative in
value (see their p. 118).
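To make the zero-intercept case concrete, here is a minimal sketch (hypothetical data, Python/NumPy -- not from any of the sources above) showing that when a regression is forced through the origin but the true intercept is far from zero, R-square computed the usual way against the mean of y comes out negative:

```python
import numpy as np

# Hypothetical data whose true intercept is far from zero
rng = np.random.default_rng(0)
x = np.linspace(1, 10, 50)
y = 100 - 0.5 * x + rng.normal(0, 1, size=50)  # true intercept near 100

# Least-squares fit constrained through the origin: y-hat = b*x
b = (x @ y) / (x @ x)
resid = y - b * x

# R-square computed the standard way, relative to the mean of y
ss_res = np.sum(resid ** 2)
ss_tot = np.sum((y - y.mean()) ** 2)
r_squared = 1 - ss_res / ss_tot
print(r_squared)  # negative: the origin-constrained fit is worse than the mean
```

The negative value is the red flag: the constrained model fits worse than simply predicting the mean of y, so the zero-intercept specification is wrong for these data.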
This can also occur in SEM, as one is cautioned in the "SAS/STAT 9.1 User's
Guide" in the use of their Calis procedure -- see page 681 in this book on
books.google.com:
http://tinyurl.com/zduao5
Actual researchers do come up with negative R-square values in these
analyses and will ask folks on the HLM or SEM mailing lists what this means
and what to do about it. I don't think anyone would ever report a negative
R-square without pointing out that the model they were fitting or analyzing
was a poor or inappropriate one (only the most naive would report this as a
valid result without explanation).

A comparable situation is when one obtains a negative Cronbach's alpha,
which is the topic of this thread on the SPSSX mailing list; see:
http://www.listserv.uga.edu/cgi-bin/wa?A2=ind0708&L=spssx-l&P=19997

There is also the problem of Heywood cases in factor analysis, where
communalities, usually estimated by R-square (though if memory serves, it
has been proven that R-square is just the lower bound for the communality
or common variance), come out greater than 1.00. The problem of an R-square
greater than 1.00 is further complicated by the implication that there has
to be negative error or unique variance to balance it out. Again, SAS
provides a warning about this condition:
http://support.sas.com/documentation/cdl/en/statug/63033/HTML/default/statug_factor_sect022.htm

In all of these cases, pathological values for the statistics can occur,
and they provide information about problems with the data and/or the method
of analysis. Only the most naive would report these as legitimate results
outside of, say, showing that a particular theory implied a specific
mathematical model but that model fails to adequately account for the data.

>Also, I think you would have to be
>concerned with rounding or calculation error if you ever got a negative F
>value. It was my point that such a result should be pointed out, examined
>and adjusted instead of just being reported unaltered.
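As an aside, the negative Cronbach's alpha mentioned above is easy to reproduce. A minimal sketch with hypothetical data (two negatively correlated "items", as when one item was not reverse-scored), using the standard alpha formula (k/(k-1)) * (1 - sum of item variances / variance of the total score):

```python
import numpy as np

# Hypothetical two-item "scale" where the items point in opposite directions,
# e.g. one item was never reverse-scored
rng = np.random.default_rng(1)
item1 = rng.normal(0, 1, 200)
item2 = -item1 + rng.normal(0, 0.5, 200)
items = np.column_stack([item1, item2])

k = items.shape[1]
sum_item_vars = items.var(axis=0, ddof=1).sum()
total_var = items.sum(axis=1).var(ddof=1)  # variance of the total score
alpha = (k / (k - 1)) * (1 - sum_item_vars / total_var)
print(alpha)  # strongly negative: a red flag, not a reportable reliability
```

Here the item variances sum to far more than the variance of the total score, so alpha goes negative, exactly the kind of pathological value that should send one back to the data rather than into a results section.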
Again, I have no idea why you think someone would do so, outside of the most
naive or someone showing that a particular model fails.

Oh, not to stick a thumb in your eye or anything, but you might want to take
a look at this Wikipedia entry on negative probabilities. ;-)
http://en.wikipedia.org/wiki/Negative_probability
and here's an example by physicist Richard Feynman on negative
probabilities:
http://tinyurl.com/zdrwk9

One is reminded of Hamlet: "There are more things in heaven and earth,
Horatio, than are dreamt of in your philosophy."

-Mike Palij
New York University
[email protected]
