----- Original Message -----
From: <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
Sent: Monday, June 19, 2000 4:50 PM
Subject: Adjusting Variable Distributions
> I am reading a book called Data Preparation for Data Mining by Dorian
> Pyle. I am having a problem understanding Section 7.2.3, called
> Adjusting Distributions:
>
> "The easiest way to adjust a distribution density is simply to displace
> the high density points into the low density areas until all points
> are at the mean density for the variable. Such a process ends up
> in a rectangular distribution. This simple approach can only be
> completely successful if none of the instance values is duplicated...
> In effect every point is displaced in a particular direction and
> distance. Any point in the variable's range could be used as a
> reference. The zero point is as convenient as any other. Using this as
> a reference every other point can be specified as moving away from or
> toward the reference point."
>
>
>
> Can anyone elaborate on Dorian's explanation?
I imagine this refers to replacing data values by their quantiles [or,
equivalently, their ranks]. Whether it is the right thing to do or not I do
not know... the clarity (or lack of same) of the author's explanation does
not fill me with optimism.
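
For what it is worth, here is a minimal sketch of that rank-based reading,
assuming it is what the book intends. Each value is replaced by its rank,
scaled into (0, 1), so the transformed values are evenly spaced (a
"rectangular" distribution) when there are no duplicates. The name
rank_transform and the (rank - 0.5)/n scaling are illustrative choices of
mine, not taken from the book. Tied values share one averaged rank, which is
why duplicates keep the result from being exactly rectangular.

```python
def rank_transform(values):
    """Replace each value by its mid-rank, scaled into the interval (0, 1)."""
    n = len(values)
    # Indices of the data, ordered from smallest value to largest.
    order = sorted(range(n), key=lambda i: values[i])
    out = [0.0] * n
    i = 0
    while i < n:
        # Find the run of tied values starting at sorted position i.
        j = i
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        # Average 1-based rank over the run; ties all get this same rank.
        mid_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            out[order[k]] = (mid_rank - 0.5) / n
        i = j + 1
    return out


# A strongly skewed sample with no duplicates.
skewed = [1.0, 2.0, 4.0, 8.0, 100.0]
print(rank_transform(skewed))

# A sample with a duplicate: the two 3s receive the same transformed value.
print(rank_transform([3, 3, 7]))
```

In the first example the outlier at 100 ends up no farther from its
neighbour than any other point, which matches the quoted description of
points being displaced from dense regions into sparse ones.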
-Robert Dawson