I see your point Robert, and I hope you didn't think I was curt in my
response (which I may have been).  Your message was quite informative.  I
did meet an on-site statistician who has pledged to help me.  I'm
particularly interested in the square root transformation.  I'm guessing it
will compress the data.  I'll look into it.
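
A minimal sketch of what "compress" could mean in practice, assuming the data
behave like Poisson counts (an assumption for illustration only; nothing here
comes from the real dataset). The helper names `poisson_sample` and
`variance_before_and_after` are hypothetical. For raw counts the variance grows
with the mean, while after a square root it should stay roughly constant near
0.25 once the mean is not too small.

```python
import math
import random
import statistics


def poisson_sample(lam, rng):
    """Draw one Poisson(lam) count using Knuth's multiplication method."""
    threshold = math.exp(-lam)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def variance_before_and_after(lam, n=5000, seed=0):
    """Return (variance of raw counts, variance of their square roots)."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    counts = [poisson_sample(lam, rng) for _ in range(n)]
    roots = [math.sqrt(c) for c in counts]
    return statistics.variance(counts), statistics.variance(roots)


if __name__ == "__main__":
    for lam in (5, 20, 80):
        raw, rooted = variance_before_and_after(lam)
        print(f"mean {lam:>2}: variance of counts {raw:6.2f}, "
              f"variance of square roots {rooted:.3f}")
```

The raw variance tracks the mean, so the spread depends on the level of the
data; the square-rooted values have a spread that barely changes with the mean.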

I could have provided an example related to the problem I'm trying to solve.
It is best represented by maintenance intervals on an automobile.  Much of the
data is random (depending on what you drive).  At the same time, some of the
data is non-random, in that you must change the oil periodically and that the
older the car is, the more likely it is to break down, and for a longer period
of time.

You have been quite helpful and I appreciate your time and interest in my
problem.  I will investigate your leads.  You all can use me as an example
of ignorant arrogance.

-dnult

Robert Dawson wrote in message
<071a01bf5c3e$a131dab0$[EMAIL PROTECTED]>...
>Dave Nulton wrote:
>
>> Quite frankly Robert the details are proprietary.  I suppose I could have
>> been more descriptive, but I don't see what the shape of my distribution
>> has to do with what it represents
>
>    To take the second point first, the origin of a dataset often contains
>valuable information relating to the plausibility of various models. For
>instance, it is a truism that "it takes money to make money". If I buy 100
>shares of Wombat.Com and you buy 1000 shares, and the price goes up by $5
>per share, I make $500 and you make $5000. Because of this inherently
>multiplicative structure, it is *very* common for financial data to respond
>well to a logarithmic transformation.
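
A small sketch of the multiplicative point above, using made-up numbers (the
stakes, the returns, and the helper `grow` are all hypothetical). Two holdings
that experience the same percentage returns show dollar changes that differ by
the ratio of the stakes, but identical changes on the log scale.

```python
import math


def grow(value, returns):
    """Apply a sequence of fractional returns multiplicatively."""
    path = [value]
    for r in returns:
        path.append(path[-1] * (1 + r))
    return path


def steps(path, transform=lambda x: x):
    """Differences between successive values, optionally after a transform."""
    return [transform(b) - transform(a) for a, b in zip(path, path[1:])]


returns = [0.05, -0.02, 0.10]      # hypothetical period returns
small = grow(1_000.0, returns)     # smaller stake
large = grow(10_000.0, returns)    # stake ten times larger

dollar_small, dollar_large = steps(small), steps(large)
log_small, log_large = steps(small, math.log), steps(large, math.log)

if __name__ == "__main__":
    print("dollar changes:", dollar_small, dollar_large)
    print("log changes:   ", log_small, log_large)
```

On the raw scale the larger holding looks ten times more volatile; after the
log transformation the two series are indistinguishable, which is the sense in
which such data "respond well" to it.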
>
>    On the other hand, "count" data may - depending on what's being counted
>and how - follow a "Poisson" model. In such a model, the events being
>counted happen independently and at random in a "window" of fixed size -
>calls per day to a help line, flaws per 1000 meters in recording tape,
>snowflakes landing on your tongue per minute...  Such data, if the numbers
>are small, may require specialized regression techniques; with more data, a
>square root transformation often helps.
>
>    If the data set is small or has any unusual features, it may be
>difficult to tell which transformation is appropriate just by looking at the
>data.  The "story" of the data is important.
>
>    There are many other examples. For instance, even with a simple 2x2
>table in which the frequencies of two outcomes are compared under two
>situations, you need to know whether the trials are independent (in which
>case a two-sample z test would typically be used) or paired across
>treatments, in which case McNemar's test would be more appropriate.
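
To illustrate why the distinction above matters, here is a rough sketch that
runs both tests on the same hypothetical 2x2 counts. The numbers and function
names are invented for the example, and both tests are written in their plain
large-sample form with no continuity correction. In this made-up case the
answer changes depending on whether the trials are treated as independent or
as paired.

```python
import math


def two_sample_z_pvalue(successes_1, n_1, successes_2, n_2):
    """Two-sided p-value for a pooled two-proportion z test."""
    p1, p2 = successes_1 / n_1, successes_2 / n_2
    pooled = (successes_1 + successes_2) / (n_1 + n_2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_1 + 1 / n_2))
    z = (p1 - p2) / se
    return math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail


def mcnemar_pvalue(only_first, only_second):
    """McNemar's test from the two discordant counts (chi-square, 1 df)."""
    chi2 = (only_first - only_second) ** 2 / (only_first + only_second)
    return math.erfc(math.sqrt(chi2 / 2))    # upper tail of chi-square(1)


# Hypothetical paired outcomes for 100 subjects under treatments A and B:
both, only_a, only_b, neither = 45, 15, 5, 35
n = both + only_a + only_b + neither

# Wrongly treating the two columns as independent samples of size n each:
p_independent = two_sample_z_pvalue(both + only_a, n, both + only_b, n)
# Respecting the pairing, only the discordant pairs carry information:
p_paired = mcnemar_pvalue(only_a, only_b)

if __name__ == "__main__":
    print(f"two-sample z test p-value: {p_independent:.3f}")
    print(f"McNemar's test p-value:    {p_paired:.3f}")
```

The marginal totals (60 versus 50 successes) are the same in both analyses;
only the knowledge of how the data were collected differs.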
>
>    For such reasons, it is often impossible to give reliable statistical
>advice based on numbers _in_vacuo_. I cannot imagine members of many other
>professions attempting to do the equivalent - indeed, I would hazard a guess
>that in many cases professional associations would take a dim view of giving
>a professional opinion to a client/patient/whatever who insisted on
>withholding relevant information.
>
>    I would suggest that if this dataset is important enough to warrant this
>level of secrecy, you find a statistician who is willing to sign an NDA, and
>that you pay the going rate for the consultation. (Don't ask me, I'm neither
>a professional statistician nor interested.) Trying to get advice, free or
>not, from people whom you do not trust enough to give even a basic
>explanation seems to me like a waste of your time and ours.
>
>    -Robert Dawson
>
