Re: AI-GEOSTATS: Log versus nscore transform

Yetta Jager Thu, 10 Aug 2006 11:08:43 -0700

Regardless of how well a lognormal model represents the distribution of the data (a single realization),
there are still significant issues in interpreting back-transformed kriging predictions and their back-transformed
variances. For example, because the back-transformed mean is a function of both the transformed mean and the kriging variance, higher estimates result where kriging variances are higher (i.e., in areas with a lower density of sampling data). Does this make sense? One alternative is simply to model the median instead of the mean, since the back-transformed median does not depend on the kriging variance.
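To make the dependence on the kriging variance concrete, here is a minimal sketch. It assumes the simple lognormal kriging back-transform, exp(estimate + variance/2) for the mean and exp(estimate) for the median; ordinary kriging adds a Lagrange-multiplier term that is left out here. The function names and the numbers are purely illustrative, not taken from any dataset.

```python
import math

def backtransform_mean(log_estimate, kriging_variance):
    # Assumed simple lognormal kriging form: the mean of a lognormal
    # variable is exp(mu + sigma^2 / 2), so the variance inflates it.
    return math.exp(log_estimate + 0.5 * kriging_variance)

def backtransform_median(log_estimate):
    # The median of a lognormal variable is exp(mu); no variance term.
    return math.exp(log_estimate)

# Same log-space estimate, increasing kriging variance (illustrative values,
# standing in for locations progressively farther from the sample data).
log_estimate = 1.0
for variance in (0.1, 0.5, 1.0):
    mean_value = backtransform_mean(log_estimate, variance)
    median_value = backtransform_median(log_estimate)
    print(variance, round(mean_value, 3), round(median_value, 3))
```

With the log-space estimate held fixed, the back-transformed mean rises with the kriging variance while the median stays put, which is the effect described above.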

I was advised to consider identifying different sub-populations with possible different means and variances as separate strata, which can be standardized by their individual variances and the residuals kriged (or simulated) together. We have one example (on my website) of where we developed a method to do this. Also, consider using covariates to reduce the variation in residuals first.
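A rough sketch of the standardization step, under my own assumptions: each stratum is centered on its own mean and scaled by its own standard deviation before the residuals are pooled. The centering step, the function name, and the toy data are hypothetical; this is not the method from the website example.

```python
from collections import defaultdict
from statistics import mean, pstdev

def standardize_by_stratum(values, strata):
    # Group observations by stratum label.
    groups = defaultdict(list)
    for value, label in zip(values, strata):
        groups[label].append(value)
    # Per-stratum mean and (population) standard deviation.
    # Note: a stratum with constant values has zero spread and would
    # cause a division by zero below; this sketch does not handle that.
    stats = {label: (mean(g), pstdev(g)) for label, g in groups.items()}
    # Standardized residuals, returned in the original observation order.
    residuals = [(value - stats[label][0]) / stats[label][1]
                 for value, label in zip(values, strata)]
    return residuals, stats

# Toy data: two strata with very different means and variances.
values = [1.0, 2.0, 3.0, 10.0, 20.0, 30.0]
strata = ["a", "a", "a", "b", "b", "b"]
residuals, stats = standardize_by_stratum(values, strata)
print([round(r, 3) for r in residuals])
```

The pooled residuals would then be kriged (or simulated) together, and predictions mapped back using the stratum statistics kept in `stats`.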

Yetta

At 11:37 AM 8/10/2006, you wrote:

Mike,

   I can't speak to EPA UCLs, and I'm too far removed from the literature at this point to make a cogent argument... but I do remember my work characterizing the hydraulic properties of artificial soils, and there was no doubt that the soil water retention curves (tension vs. water content) were lognormal. I also remember Wilford Gardner (UW-Madison) commenting on how often that functional form appeared in soil water physics.

   While digging through an old folder I found a classic reference ...

Spatial Variability of Field-Measured Soil-Water Properties
DR Nielsen, JW Biggar and KT Erh
Hilgardia Vol 42, Number 7, pp 215-260, Nov 1973

Maribeth

At 08:50 AM 8/10/2006 -0400, Michael Grant wrote:

My apologies. The email below accidentally went only to Gregoire. It turns out that I haven't quite reconnected to the list correctly. So...

------- Original Message -----
From: Michael Grant
To: Gregoire Dubois
Sent: Wednesday, August 09, 2006 8:48 AM
Subject: Re: AI-GEOSTATS: Log versus nscore transform

Hi Gregoire,

Please forgive the rambling philosophical response but I find your question interestingly provocative.

Is a preference for lognormality a matter of mathematical elegance or of tradition? I remember an era of virtually automatic assumption of lognormality for two key classes of variables in our business (nuclear/environmental): contaminant concentrations and hydraulic conductivity. That practice lingers.

    By the early and mid 1990s, many human and ecological risk assessors assumed lognormality of contaminant concentrations in environmental media as an article of faith: 'The data are skewed and hence lognormal.' In the US, I suspect that this state of affairs reflected in part the issuance of a single document--the USEPA's approachable supplemental guidance on calculating the UCL for human health risk assessment (May 1992). While the EPA clearly evolved beyond that point, e.g., the agency's work on bootstrapping UCLs, the numerical/computational savvy of many but not all 'street' assessors probably lagged.

    This lag was due in part to a mix of professional focus (toxicology versus numbers), availability of tools, and convenience. Also, the commercial environmental business has matured significantly as a class of business, and we all know that it is crowded. Competitive pressures are significant, and thorough data analysis--an expensive endeavor--is often a loser. The convenience and economy of sanctioned lognormality (no one reads the fine print) beckons. For me, going beyond nominal practice(?) has almost always been on my own time. However, that is the nature of things and as long as we learn...:O) I think that the wider development, elucidation, and/or implementation of computationally intensive techniques, e.g., bootstrap, Monte Carlo, is changing at a fundamental level how we formulate our approaches to many problems, vis-a-vis simulation. (Consider the transparency in the formulation of resampling methods relative to the 'obscurity' of traditional parametric statistics.)
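As an illustration of that transparency, a percentile-bootstrap upper confidence limit on the mean can be sketched in a few lines. This is a generic percentile bootstrap under my own assumptions, not the EPA's procedure; the function name, the resample count, and the concentration values are made up for illustration.

```python
import random
from statistics import mean

def bootstrap_ucl(data, level=0.95, n_boot=2000, seed=0):
    # Fixed seed so the sketch is reproducible.
    rng = random.Random(seed)
    n = len(data)
    # Resample with replacement and record the mean of each resample.
    boot_means = sorted(mean(rng.choices(data, k=n)) for _ in range(n_boot))
    # One-sided percentile UCL: the `level` quantile of the bootstrap means.
    index = min(n_boot - 1, int(level * n_boot))
    return boot_means[index]

# Hypothetical skewed concentration data (arbitrary units).
concentrations = [0.2, 0.5, 0.7, 1.1, 1.3, 2.0, 2.4, 3.8, 6.5, 14.0]
ucl = bootstrap_ucl(concentrations)
print(round(mean(concentrations), 3), round(ucl, 3))
```

No distributional form is assumed anywhere in the calculation, which is the contrast with a lognormal-based UCL. For small, strongly skewed samples the plain percentile interval can undercover, so refinements may be needed in practice.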

    Now regarding hydraulic conductivity. Again, lognormality is a long-standing tradition of nominal practice. Certainly the last 25 years have witnessed a real evolution of concepts and understanding with respect to hydraulic conductivity. And that evolution certainly continues. But again, a mature, over-crowded environmental business dictates nominal practice. Not everyone is a numbers-oriented (hydro)geologist, and many who compile/interpret conductivity data have other duties/interests. The convenience of long-standing tradition--all theory aside--is powerful when faced with a need for a 'quick' characterization. BTW, is there a hydraulic conductivity analogy to the 1992 EPA supplemental guidance for concentration UCLs? Sort of. I suspect that early co-kriging of water levels (H) and K (T) has had a cementing impact on the perception of K as lognormal.

    Is this pessimistic? Well, not really. There are both academic and business opportunities here, and some individuals will recognize those opportunities. 'Justification' is the sort of issue that leads to progress both in the advance of theory and the application of theory (technology). Also, I do not mean for any of my remarks to be judgemental or disparaging as to how others approach their work. I am just trying to communicate what I perceive as a (commercial and government sector) participant in the environmental business for over 25 years.

    In closing, some related scrap thoughts: We operate (or should operate) in the context of the decision or decisions being made, and sometimes 'nominal' practice may suffice--although that has to be reasonably demonstrated. I have never understood why decision analysis has not had a better reception over the years. Also, how are things going to play out as some attempt to weave equifinality more into our consciousness? Finally, all work has a finite shelf-life.

Best regards,

Mike

----- Original Message -----

From: Gregoire Dubois
To: [email protected]
Sent: Wednesday, August 09, 2006 4:15 AM
Subject: AI-GEOSTATS: Log versus nscore transform

Dear list,
I am puzzled about the use of logarithmic and nscore transforms in geostatistics.

Given the apparent advantages of nscore transforms over the logarithmic transform (nscore has no problem dealing with 0 values and "manages" the tails of the distribution very (more?) efficiently), why would one still want to use log-normal kriging? Only because of the mathematical elegance of using a model?
Moreover, one frequently cannot be "sure" about the lognormality of the analysed dataset, so why would one still take the risk of using log-normal kriging?
Thank you in advance for any feedback on this issue.
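For readers unfamiliar with the two transforms, the zero-value point can be sketched as follows. This is a bare rank-based normal score transform under simplifying assumptions (no declustering weights, ties broken by position, plotting position (rank + 0.5)/n); the function name and the data are illustrative only.

```python
import math
from statistics import NormalDist

def nscore(values):
    # Rank the data, then map each rank to a standard normal quantile.
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    std_normal = NormalDist()  # mean 0, standard deviation 1
    scores = [0.0] * n
    for rank, i in enumerate(order):
        # Plotting position (rank + 0.5) / n keeps quantiles off 0 and 1.
        scores[i] = std_normal.inv_cdf((rank + 0.5) / n)
    return scores

# Illustrative skewed data that includes a zero.
data = [0.0, 0.3, 1.2, 4.5, 30.0]
scores = nscore(data)
print([round(s, 3) for s in scores])

# The log transform is undefined at zero, so it fails on the same data.
try:
    logs = [math.log(v) for v in data]
except ValueError:
    print("log transform fails on the zero value")
```

The normal scores are Gaussian by construction whatever the shape of the data, whereas the log transform yields Gaussian values only if the data really are lognormal.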

Best regards,

Gregoire

__________________________________________
Gregoire Dubois (Ph.D.)

European Commission (EC)
Joint Research Centre Directorate (DG JRC)
Institute for Environment and Sustainability (IES)

TP 441, Via Fermi 1
21020 Ispra (VA)
ITALY

Tel. +39 (0)332 78 6360
Fax. +39 (0)332 78 5466
Email: [EMAIL PROTECTED]

WWW: http://www.ai-geostats.org
WWW: http://rem.jrc.cec.eu.int

"The views expressed are purely those of the writer and may not in any circumstances be regarded as stating an official position of the European Commission."

------------------------------------------------------
Yetta Jager
Environmental Sciences Division
Oak Ridge National Laboratory
P.O. Box 2008, MS 6036
Oak Ridge, TN 37831-6036 USA

For packages, please replace "P.O. Box 2008" with "Bethel Valley Road".

OFFICE: 865/574-8143 FAX: 865/576-3989
Work email: [EMAIL PROTECTED] Home email: [EMAIL PROTECTED]
My webpage: http://www.esd.ornl.gov/~zij/
Fish and Wildlife Modelling: http://www.esd.ornl.gov/research/ecol_management/fish_wildlife_modeling
