[EMAIL PROTECTED] (Bob Roberts) wrote in message news:<[EMAIL PROTECTED]>...
> Following is the data:
>
>           [As]    Log10[As]
>           2.500    0.3979
>           0.784   -0.1057
>           0.015   -1.8182
>           3.540    0.5490
>           0.005   -2.3010
>           0.005   -2.3010
>           0.016   -1.7959
>           0.397   -0.4012
>           0.017   -1.7696
>           0.392   -0.4067
>           0.636   -0.1965
>           0.062   -1.2076
>
> Max.      3.540    0.5490
> Median    0.227   -0.8072
> Min.      0.005   -2.3010
>
> Avg.      0.697   -0.946
> s.d.      1.139    1.036
> Skew.     1.984    0.028
> Kurt.     3.179   -1.629
>
> The two 0.005 values were reported as <0.010 mg/L, which is the
> detection limit of the analytical procedure.
>
> Histogram of [As]
> **
> ********
> .
> **
>
> Histogram of Log10[As]
> **
> ***
> ***
> ****
>
> The question to be answered is:
> What fraction of the 12,800 bundles of pallets/crates exceed the
> regulatory standard of 5.0 mg/L for TCLP arsenic? (If TCLP As is
> > 5.0 mg/L, the wood is hazardous waste and cannot be placed in a
> municipal solid waste landfill.)
>
> If the regulations were to be read strictly, each pallet/crate bundle
> with As > 5.0 mg/L would have to be handled as hazardous waste, not
> sent to the MSW. As a practical matter, no one could afford to test
> every bundle. If we can expect the large majority to be < 5.0 mg/L,
> they will be allowed to send all 12,800 bundles to the landfill.
>
> I'm inclined to allow the material to be sent to the MSW landfill.
> What do you think? Are there any statistical analyses you would
> suggest performing on this very small amount of data?

You have some kind of mixture distribution (which you can't begin to
identify with the data at hand). Consequently it would require some
pretty heroic assumptions to say much of anything. Another way to look
at it is that you can try a variety of assumptions, but the result is
going to depend pretty strongly on the assumptions.

You might be able to make some progress by assuming (say) that the
mixture is continuous and unimodal (though I don't know that there's
any evidence that even this assumption is reasonable), and then
attempting to get bounds on the proportion above 5.

Without assuming a distribution, about all you can say about the
proportion is that it's probably not much bigger than 1/12. If you can
assume independence, then the chance of observing your sample (no
values above 5 in 12 measurements) if the proportion above 5 were,
say, 0.3 is pretty small.

Looking at your histogram, and this one (use a monospace font):

   <.01    |##
  .01-.099 |####
  .10-.99  |####
   1.0+    |##

seems to imply that something near a lognormal may not be too bad, but
you also should take the parameter uncertainty into account when
trying to get bounds on your estimates, and lognormal is not a
particularly unusual assumption when dealing with tiny proportions
like this.

I'd try a variety of plausible distributions though, to try to get a
handle on how sensitive your estimates of the proportions are to the
assumptions.

Glen
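
As a rough illustration of the two lines of reasoning above, here is a minimal Python sketch using the twelve values from the quoted post. It is not an analysis anyone in the thread actually ran. The function names, the 95% level, the use of the sample standard deviation, and the substitution of 0.005 for the two results below the detection limit are all assumptions made for illustration. The first part is the distribution-free argument: if bundles are independent and a fraction p exceed 5.0 mg/L, the chance of seeing no exceedances in n samples is (1 - p)^n, which for p = 0.3 and n = 12 is roughly 0.014. The second part is a plug-in lognormal tail estimate, which ignores parameter uncertainty and so should be read as optimistic.

```python
import math
from statistics import NormalDist, mean, stdev

# TCLP arsenic results (mg/L) from the quoted post. The two 0.005 values
# stand in for results reported as <0.010 mg/L (an assumption carried
# over from the original data table).
arsenic = [2.500, 0.784, 0.015, 3.540, 0.005, 0.005,
           0.016, 0.397, 0.017, 0.392, 0.636, 0.062]
LIMIT = 5.0  # regulatory standard, mg/L

n = len(arsenic)
n_exceed = sum(x > LIMIT for x in arsenic)  # no sample exceeds the limit


def prob_no_exceedance(p, n):
    """Chance of zero exceedances in n independent samples when a
    fraction p of the population exceeds the limit."""
    return (1.0 - p) ** n


def upper_bound_zero_exceedances(n, alpha=0.05):
    """One-sided upper confidence bound on p when 0 of n samples exceed
    the limit: the p that solves (1 - p)^n = alpha."""
    return 1.0 - alpha ** (1.0 / n)


# Distribution-free view: how surprising is the sample if p were 0.3?
p_sample_given_30pct = prob_no_exceedance(0.3, n)
p_upper_95 = upper_bound_zero_exceedances(n)

# Plug-in lognormal view: fit a normal to log10 concentrations and read
# off the tail area above log10(5). This treats the fitted mean and s.d.
# as exact, so it understates the uncertainty.
log_as = [math.log10(x) for x in arsenic]
mu = mean(log_as)
sigma = stdev(log_as)  # sample s.d. (n - 1 denominator), an assumption
z = (math.log10(LIMIT) - mu) / sigma
p_lognormal = 1.0 - NormalDist().cdf(z)

print(f"samples above limit:            {n_exceed} of {n}")
print(f"P(no exceedances | p = 0.3):    {p_sample_given_30pct:.4f}")
print(f"95% upper bound on p (0 of {n}): {p_upper_95:.3f}")
print(f"plug-in lognormal P(As > 5):    {p_lognormal:.3f}")
```

Swapping the lognormal for other plausible distributions in the last step, and comparing the resulting tail fractions, is one way to see how strongly the answer depends on the assumed shape.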
