Re: [ccp4bb] statistical or systematic? bias or noise?

2013-03-14 Thread Ian Tickle
Ed, no, the fact that you don't, can't or won't estimate the precision
doesn't change anything (only, as you say, it becomes a poorly designed
experiment).  A measurement has a standard deviation regardless of whether
you possess an estimate of its value or not.  The exact true value of the
standard deviation can never be known, just as the true value of any
physical quantity can never be known, even after measuring it umpteen
times!  The measurements are only estimates of the true value, sampled from
the error distribution of the true value.

The experimental estimate of the standard deviation is called the 'standard
uncertainty' (indeed I remember when it was called the 'estimated standard
deviation' or e.s.d.), again sampled from the error distribution of the
SD.  Sometimes I see the term 'estimated standard uncertainty' in the
literature, but this term does not appear anywhere in the statistics
literature (it seems to be peculiar to protein crystallography!).  Also it
would then be the 'estimated estimated standard deviation', which is one
more level of estimation than you need (an estimate of an estimate is
still an estimate - it just has a bigger uncertainty than the previous
estimate!).

See http://physics.nist.gov/cgi-bin/cuu/Info/Constants/definitions.html for
the terminology approved by NIST.
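
To see this numerically, here is a minimal Python sketch (the 'true' SD
and the number of replicates are made-up values); it shows that the sample
SD is itself a random quantity with its own spread:

import numpy as np

rng = np.random.default_rng(0)
true_sd = 0.04   # assumed 'true' SD of a single measurement
n = 5            # assumed number of replicates per experiment

# Repeat the whole experiment many times; each sample SD is only an
# estimate of true_sd, drawn from the error distribution of the SD.
sample_sds = [rng.normal(0.0, true_sd, n).std(ddof=1) for _ in range(10000)]
print(f"mean of the sample SDs: {np.mean(sample_sds):.4f}")  # slightly below 0.04
print(f"SD of the sample SDs:   {np.std(sample_sds):.4f}")   # the SD of the SD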

Cheers

-- Ian






Re: [ccp4bb] statistical or systematic? bias or noise?

2013-03-13 Thread Ian Tickle
Ed, sorry for delay.  I was not trying to make any significant distinction
between 'controllable' and 'potentially controllable': from a statistical
POV they are the same thing.  The distinction is purely one of
practicality, i.e. within the current experimental parameters is it
possible to eliminate the systematic error, for example via a
calibration step where you determine the systematic error by use of a
standard of known concentration.  The error is still controllable
regardless of whether you actually take the trouble to control it!  Note
that the experimental setup has not changed, you are merely using the same
apparatus in a different way, but any random errors associated with the
measurements will still be present.

Of course if you change the experimental setup (note that this potentially
includes the experimenter!) then all bets are off!  It's very important to
describe the experimental setup precisely before you attempt to
characterise the errors associated with a particular setup.

BTW I agree completely with Kay's analysis of the problem: as he said, you
are sampling (once!) a statistical error component.  This is what I was
trying to say; he just said it in a much more concise way!  This random
(uncontrollable) error then gets propagated through the sequence of steps
in the experiment along with all the other uncontrollable errors.

Cheers

-- Ian






Re: [ccp4bb] statistical or systematic? bias or noise?

2013-03-13 Thread Ed Pozharski
Pete,

 Actually, I was trying to say the opposite - that the decision to 
 include something in the model (or not) could change the nature of the 
 error.  

Duly noted

 Pete
 
 PS - IIUC := ?
 

IIUC - If I Understand Correctly

-- 
Bullseye!  Excellent shot, Maurice.
  Julian, King of Lemurs.


Re: [ccp4bb] statistical or systematic? bias or noise?

2013-03-13 Thread Ed Pozharski
Kay,

  the latter is _not_ a systematic error; rather, you are sampling (once!) a 
 statistical error component. 

OK.  In other words, whatever error is potentially removable is always a
statistical error, whether it is sampled or not.

So is it fair to say that if there are some factors that I either do not
know about, willfully choose to ignore or just cannot sample, then I am
underestimating the precision of the experiment?

Cheers,

Ed.


-- 
After much deep and profound brain things inside my head, 
I have decided to thank you for bringing peace to our home.
Julian, King of Lemurs


Re: [ccp4bb] statistical or systematic? bias or noise?

2013-03-13 Thread Ed Pozharski
Ian,

thanks - I think I had it backwards after reading your first post, and
thought of controllable errors as those that can be brought under
control by sampling, whereas uncontrollable ones would be those that
cannot be sampled and whose amplitude is therefore unknown.

Yet you also seem to agree that the characterization depends on the
specifics of the experimental setup, leaving the door open to the
possibility that the noise-vs-bias choice may be driven by experimental
circumstance.

And in practice, wouldn't it be more consistent to stick with the
definition that statistical error/noise/precision is defined by what is
really sampled?  Because if some factor is not sampled, I have zero
knowledge of the corresponding error magnitude.  I agree with Tim that
not sampling what can easily be sampled makes for a poorly designed
experiment, but it can also be characterized (probably a nicer term) as an
experiment with a large systematic error (due to poor design).

Cheers,

Ed.


-- 
Edwin Pozharski, PhD, Assistant Professor
University of Maryland, Baltimore
--
When the Way is forgotten duty and justice appear;
Then knowledge and wisdom are born along with hypocrisy.
When harmonious relationships dissolve then respect and devotion arise;
When a nation falls to chaos then loyalty and patriotism are born.
--   / Lao Tse /


Re: [ccp4bb] statistical or systematic? bias or noise?

2013-03-13 Thread Ian Tickle
The precision must be obtained either from multiple measurements which must
be representative of the measurements you propose to make, or if the
measurement consists of a count (say of photons) then from counting
statistics, or a combination of the two.  This must be done either by
prior calibration (by, say, the manufacturer or by you) of the experimental
setup, or in the course of making the measurements themselves.  Either way
there will be an experimental estimate of the standard deviation of the
quantity you are trying to measure, against which you can compare
individual or averaged measurements for significance using P values,
confidence intervals etc.
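
For example, here is a minimal sketch of how a counting (Poisson)
component and a prior-calibrated component might be combined in quadrature
and used for a significance test (the photon count, the readout SD and the
reference value are all hypothetical numbers):

import math

counts = 10000                       # hypothetical observed photon count
sigma_counting = math.sqrt(counts)   # Poisson: SD of a count is ~ sqrt(N)
sigma_readout = 40.0                 # hypothetical SD from prior calibration

# Independent error components combine in quadrature.
sigma_total = math.sqrt(sigma_counting**2 + sigma_readout**2)

# Compare the measurement against a reference value using the estimate.
reference = 9750.0
z = (counts - reference) / sigma_total
print(f"sigma_total = {sigma_total:.1f}, z = {z:.2f}")  # |z| > ~3 would be significant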

Now of course there may be variances that are not being explored by the
current setup, but if the setup is redefined it must be recalibrated so the
new estimates of the SDs are applicable to the new setup.  To answer the
question from your email just in: if the experimental setup is changed in
any significant way, the experimental precision is likely to change and
recalibration is likely to be required.

So I don't see there's a question of wilfully choosing to ignore or not
sampling certain factors: if the experiment is properly calibrated to get
the SD estimate you can't ignore it.

-- Ian





Re: [ccp4bb] statistical or systematic? bias or noise?

2013-03-13 Thread Ed Pozharski
 OK.  In other words, whatever error is potentially removable is always a
 statistical error, whether it is sampled or not.

Clarification - what I meant is 'potentially removable by proper sampling,
reducing the standard error to zero with an infinite number of
measurements', not 'removable by better calibration or a better
experimental setup'.

-- 
I don't know why the sacrifice thing didn't work.  
Science behind it seemed so solid.
Julian, King of Lemurs


Re: [ccp4bb] statistical or systematic? bias or noise?

2013-03-13 Thread Ed Pozharski
Ian,

On Wed, 2013-03-13 at 19:46 +0000, Ian Tickle wrote:
 So I don't see there's a question of wilfully choosing to ignore or
 not sampling certain factors: if the experiment is properly calibrated
 to get the SD estimate you can't ignore it.
 

So perhaps I can explain better by using the same example of protein
concentration measurement.  It is certainly true that only doing one
dilution is poor design (although in crystallization practice it may
not matter, given that it is not imperative to have a protein at exactly
10 mg/ml; 9.7 will do).  If I don't bother including pipetting precision
in my error estimate, either by direct experiment or by using the
manufacturer's specification, I am willfully ignoring this source of
error.  That would be wrong.

But what if I only have one measurement's worth of sample?  And pipetting
precision cannot be calibrated (I know it can be, so this is hypothetical
- say the pipettor was stolen and the company that made it is out of
business, their offices burned down by a raging mob).  Is the pipetting
error now systematic because the experimental situation (not the design)
prevents it from being sampled or estimated?

I actually like the immutable error type better for my own purposes, but
I am trying to see whether some argument might stand that allows an error
that can be sampled to be called inaccuracy nonetheless.

Cheers and thanks,

Ed.


-- 
I don't know why the sacrifice thing didn't work.  
Science behind it seemed so solid.
Julian, King of Lemurs


Re: [ccp4bb] statistical or systematic? bias or noise?

2013-03-13 Thread Alexander Aleshin

On Mar 13, 2013, at 1:36 PM, Ed Pozharski wrote:

But what if I only have one measurement's worth of sample?

Is it proper to use statistical analysis for a single measurement? I thought 
statistics, by definition, means multiple measurements.

Alex



Re: [ccp4bb] statistical or systematic? bias or noise?

2013-03-13 Thread mjvdwoerd
I think that in statistics you can build a model that describes (and predicts) 
the uncertainty. So if you have done similar (!) replicate experiments, from 
which you can build the model, you can apply it to a single observation and 
provide a reasonably good guess for the value that you were measuring and its 
variance. Of course that guess would not be as good as the average value and 
variance from true replicates.

With protein crystals (or solutions for that matter), the sample is often too 
precious to redo the experiment, and it is worth thinking about doing replicate 
experiments with a cheap one, building the model, and then applying it to 
single expensive observations. That would be statistically justified (provided that 
the model is valid for all sets of experiments). I have not built such models, 
but we know that pipetting isn't really as good as we believe. If you randomly 
dial to a particular value on your pipetteman (say 5 uL), you will get a 
certain pattern of errors (which is really not a good word for it), while if 
you consistently dial either from a low (1uL) or a high (10uL) value towards 
the value you want, you will get another pattern. Those two patterns are not 
representative of each other, I don't think, and you would need to understand 
how to do experiments consistently to stay within your error-model (bad word). 

Among many other things, statisticians try to come up with models that explain 
the uncertainty so that you know what to think, even if your set of observations 
is too small to say for sure, with n=1 being the ultimate too small. (Maybe not 
the ultimate - n=0 is really too small.)
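
A minimal sketch of what I mean, with invented numbers for the calibration
replicates and the single observation:

import numpy as np

# Calibration on a cheap surrogate (replicate dilutions of a known stock):
replicates = np.array([9.6, 10.3, 9.9, 10.4, 9.8, 10.1])  # mg/ml, invented
relative_sd = replicates.std(ddof=1) / replicates.mean()
print(f"relative SD of the procedure: {relative_sd:.1%}")

# Apply the model to a single observation of the precious sample:
single_value = 8.7                   # mg/ml, measured once
sigma = single_value * relative_sd   # model-based error estimate
print(f"estimate: {single_value:.1f} +/- {sigma:.1f} mg/ml")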

Mark

 

 

 



Re: [ccp4bb] statistical or systematic? bias or noise?

2013-03-13 Thread Tom Peat
Slightly off the topic, but still potentially relevant in terms of realistic 
experimental error: when dealing with the small volumes typically used in 
crystallization (say 1 uL + 1 uL drops) and using a 10 uL pipette, the errors 
are fairly high (more like 30% than 5-10%), leading to a lot of 
non-reproducibility in the experiment, even when setting up the exact same 
solution many times.  Going to robotics helps with the reproducibility of the 
liquid transfer, but doesn't necessarily help with the reproducibility of 
crystallization (an example of this can be found in 
http://journals.iucr.org/d/issues/2007/07/00/bw5202/ ).

Cheers,  tom




Re: [ccp4bb] statistical or systematic? bias or noise?

2013-03-13 Thread Alexander Aleshin
I googled the subject and found that a discipline that deals with this type 
of problem (measurements) is called decision theory. It uses statistics to 
estimate the probability of certain events (results of measurements). So 
everything depends on the decision that someone needs to make. A single 
observation may be justifiable for some decisions and not for others. The 
purpose should be kept in mind while discussing these types of problems.

As a matter of fact, measuring protein concentration just once is not a truly 
single observation, because the experimenter knows something about the sample, 
and s/he makes a decision based on the consistency of the new observation with 
previous ones (the so-called model in your example).

Alex






Re: [ccp4bb] statistical or systematic? bias or noise?

2013-03-12 Thread Kay Diederichs
On Mon, 11 Mar 2013 11:46:03 -0400, Ed Pozharski epozh...@umaryland.edu wrote:
...
Notice that I
only prepared one sample, so if on that particular instance I picked up
4.8ul and not 5.0ul, this will translate into systematically
underestimating the protein concentration, even though it could equally
likely have been 5.2ul.

Within the framework of such a short Materials and Methods description, the 
latter is _not_ a systematic error; rather, you are sampling (once!) a 
statistical error component. If possible, you should repeat the experiment and 
find out the magnitude of your pipetting error. If impossible, you can only 
estimate it (making reasonable assumptions based on past experiments).

Of course, if - in repeated experiments - your pipetting more often gives 
(e.g.) lower volumes than 5.0 ul, then some kind of systematic error must be 
the reason.

Systematic error in most cases has the property that it (on average) changes 
the result towards one side, whereas statistical error should not change the 
mean value.

It is one of the goals of an experiment to identify all sources of systematic 
error, and to either model or eliminate them. If you are able to identify and 
model the systematic error, then you can convert noise to signal.

best,

Kay
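
PS a small simulation illustrates the distinction (the pipetting SD and
the calibration offset are assumed values):

import numpy as np

rng = np.random.default_rng(1)
target = 5.0        # ul we intend to pipette
random_sd = 0.2     # assumed statistical (pipetting) component
bias = -0.15        # assumed systematic (calibration) offset

# Over many repeats the random component averages away; the bias does not.
volumes = target + bias + rng.normal(0.0, random_sd, 100000)
print(f"mean delivered volume: {volumes.mean():.3f} ul")        # ~4.85, shifted one way
print(f"SD of delivered volume: {volumes.std(ddof=1):.3f} ul")  # ~0.20, zero-mean noise

# A single experiment samples the statistical component once; downstream it
# looks like a fixed offset, but its expectation over repeats is zero.
print(f"one pipetting: {target + rng.normal(0.0, random_sd):.3f} ul")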


[ccp4bb] statistical or systematic? bias or noise?

2013-03-11 Thread Ed Pozharski
Salve,

I would like to solicit opinions on a certain question about the
relationship between statistical and systematic error. Please read and
consider the following in its entirety before commenting.

Statistical error (experiment precision) is determined by the degree to
which an experimental measurement is reproducible. It is derived from the
variance of the data when an experiment is repeated multiple times under
otherwise identical conditions. Statistical error is by its very nature
irremovable and originates from various sources of random noise, which
can be reduced but not entirely eliminated.

Systematic error (experiment accuracy) reflects the degree to which the
precise average deviates from the true value. Theoretically, corrections
can be introduced into the experimental method that eliminate various
sources of bias. Systematic error refers to some disconnect between the
quantities one tries to determine and what is actually measured.

The issue is whether the classification of various sources of error into
the two types depends on procedure. Let me explain using an example.

To determine the concentration of a protein stock, I derive the extinction
coefficient from its sequence, dilute the stock 20x and take an OD
measurement. The OD value is then divided by the extinction coefficient and
inflated 20 times to calculate the concentration.

So what is the statistical error of this when I am at the
spectrophotometer? I can cycle the sample cuvette in and out of the holder
to correct for the reproducibility of its position and for instrument
noise. This gives me the estimated statistical error of the OD measurement.
Scaled by the extinction coefficient and the dilution factor, this number
corresponds to the statistical error (precision) of the protein
concentration.

There are two sources of systematic error, originating from the two
factors used to convert OD to concentration. The first is the irremovable
inaccuracy of the extinction coefficient.

Second: the dilution factor. Here the main contribution to the systematic
error is pipetting. Importantly, this includes both systematic (pipettor
calibration) and statistical (pipetting precision) errors. Notice that I
only prepared one sample, so if on that particular instance I picked up
4.8ul and not 5.0ul, this will translate into systematically
underestimating the protein concentration, even though it could equally
likely have been 5.2ul.

So if the pipetting error could have contributed ~4% to the overall
systematic error while the spectrophotometer measures with 0.1%
precision, it makes sense to consider how this systematic error can be
eliminated. The experiment can be modified to include multiple samples
prepared for OD determination from the same protein stock.
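
To put numbers on this, a back-of-the-envelope propagation in Python (the
0.1% and ~4% figures are from above; the OD value and extinction
coefficient are hypothetical):

import math

rel_od = 0.001        # 0.1% spectrophotometer precision (from above)
rel_dilution = 0.04   # ~4% pipetting error in the 20x dilution (from above)

# Relative errors of a product/quotient combine in quadrature:
rel_conc = math.sqrt(rel_od**2 + rel_dilution**2)
print(f"relative error of concentration: {rel_conc:.2%}")  # ~4.00%: pipetting dominates

od, epsilon, dilution = 0.50, 1.0, 20.0   # hypothetical OD and extinction coeff
conc = od / epsilon * dilution
print(f"c = {conc:.1f} +/- {conc * rel_conc:.1f} mg/ml")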

An interesting thing happens when I do that. What used to be a
systematic error of pipetting now becomes a statistical error, because my
experiment now includes reproducing the dilution of the stock. In a
nutshell,

Whether a particular source of error contributes to the accuracy or the
precision of an experiment depends on how the experiment is conducted.

And one more thing. There is no need to waste precious protein on
evaluating the pipetting error. I can determine it from a separate
calibration experiment using a lysozyme solution of comparable
concentration/surface tension. Technically, a single measurement has an
accuracy of the said ~4% (padded by whatever the error in the extinction
coefficient is). But one can also project that with actual dilution
repeats the precision would be this same 4% (assuming that this is the
dominant source of error).

So, is there anything wrong with this? Naturally, the question is really
not about extinction coefficients, but rather about the semantics of what
is accuracy and what is precision, and whether a given source of
experimental error is rigidly assigned to one of the two categories. There
is, of course, the Wikipedia article on accuracy vs precision, and section
3.1 of Ian's paper (Acta D 68:454) can be used as a point of reference.

Cheers,

Ed.

-- 
Edwin Pozharski, PhD, Assistant Professor
University of Maryland, Baltimore
--
When the great Tao is abandoned, Ideas of humanitarianism and 
   righteousness appear.
When intellectualism arises It is accompanied by great hypocrisy.
When there is strife within a family Ideas of brotherly love appear.
When nation is plunged into chaos Politicians become patriotic.
--   / Lao Tse /


Re: [ccp4bb] statistical or systematic? bias or noise?

2013-03-11 Thread Tim Gruene

Hi Ed,


 only prepared one sample, so if on that particular instance I
 picked up 4.8ul and not 5.0ul, this will translate into
 systematically

I don't share your opinion about a single measurement translating into
a systematic error. I would call it a poorly designed experiment if
you were actually interested in how accurately you had determined the
protein concentration.

Best,
Tim



--
Dr Tim Gruene
Institut fuer anorganische Chemie
Tammannstr. 4
D-37077 Goettingen

GPG Key ID = A46BEE1A



Re: [ccp4bb] statistical or systematic? bias or noise?

2013-03-11 Thread Ian Tickle
On 11 March 2013 15:46, Ed Pozharski epozh...@umaryland.edu wrote:

 Notice that I
 only prepared one sample, so if on that particular instance I picked up
 4.8ul and not 5.0ul, this will translate into systematically
 underestimating the protein concentration, even though it could equally
 likely have been 5.2ul


Ed, surely the point is that you don't know that you only picked up 4.8ul -
as you say, what you actually picked up could for all you know equally well
have been 5.2ul (I'm assuming that you don't conduct a separate, more
accurate experiment to measure what was actually picked up by each
pipetting).

Statistics is about expectation as distinct from actuality, and the
expected error is 0.2ul (or whatever: you would have to repeat the
pipetting several times to estimate the standard deviation), regardless of
what the actual error is.  This expected error then feeds into the expected
error of the measured concentration which results from performing the
experiment in its entirety, using the usual rules of error propagation.
Again the actual error in the concentration from a single experiment is
unrelated to its expected error, except insofar as you would normally
expect it to fall within (say) a +- 3 sigma envelope.
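
A minimal sketch of expectation vs actuality (taking 0.2ul as the assumed
expected SD):

import numpy as np

rng = np.random.default_rng(2)
expected_sd = 0.2   # ul, the expected error estimated from repeated pipetting

# The actual error of any one experiment is a single unknown draw from the
# distribution that the expected SD describes.
for actual_error in rng.normal(0.0, expected_sd, 5):
    inside = abs(actual_error) <= 3 * expected_sd
    print(f"actual error {actual_error:+.3f} ul, within 3 sigma: {inside}")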

Personally I tend to avoid the systematic vs random error distinction and
think instead in terms of controllable and uncontrollable errors:
systematic errors are potentially under your control (given a particular
experimental setup), whereas random errors aren't.

Cheers

-- Ian


Re: [ccp4bb] statistical or systematic? bias or noise?

2013-03-11 Thread Pete Meyer

Hi Ed,

Ed Pozharski wrote:


An interesting thing happens when I do that. What used to be a
systematic error of pipetting now becomes a statistical error, because my
experiment now includes reproducing the dilution of the stock. In a
nutshell,

Whether a particular source of error contributes to the accuracy or the
precision of an experiment depends on how the experiment is conducted.


My take on it is slightly different - the difference seems to be more a 
matter of how the source of error is modeled (although that may dictate 
changes to the experiment) than of how the experiment was conducted.


Or (possibly) more clearly: systematic error is a result of the model of 
the experiment incorrectly reflecting the actual experiment; measurement 
error is due to living in a non-deterministic universe.
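
As a made-up illustration: fit a straight line to data that were actually 
generated by a quadratic, and the model mismatch shows up as structured 
(systematic) residuals rather than zero-mean noise:

import numpy as np

rng = np.random.default_rng(3)
x = np.linspace(0.0, 1.0, 60)
y = 1.0 + 2.0 * x + 0.5 * x**2 + rng.normal(0.0, 0.02, x.size)  # truth: quadratic

# Model the experiment as a straight line: the missing x^2 term becomes a
# systematic error, visible as structure in the residuals.
slope, intercept = np.polyfit(x, y, 1)
residuals = y - (slope * x + intercept)
print(f"residual mean, middle third: {residuals[20:40].mean():+.4f}")  # systematically low
print(f"residual mean, outer thirds: {np.delete(residuals, np.arange(20, 40)).mean():+.4f}")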


Of course, there could be better ways of looking at it that I'm missing.

Pete


Re: [ccp4bb] statistical or systematic? bias or noise?

2013-03-11 Thread Ed Pozharski
Tim,

On Mon, 2013-03-11 at 18:51 +0100, Tim Gruene wrote:
 I don't share your opinion about a single measurement translating into
 a systematic error. I would call it a poorly designed experiment if
 you were actually interested in how accurately you had determined the
 protein concentration.
 
OK.  As I said, this is not about protein concentration, but let's say I
only have about 6ul of protein sample, so that I can do only *one*
dilution.  Would the pipetting uncertainty then be considered a systematic
error or a statistical error?

I am afraid this is a matter of unsettled definitions.  By the way, it
wasn't an opinion, more of an option in interpretation.  I can say that
whatever is not sampled in a particular experimental setup is systematic
error.  Or I can say (as you seem to suggest, and I like this
option better) that whenever there is a theoretical possibility of
sampling something, it is statistical error even though the particular
setup does not allow accounting for it.

Ed.

-- 
Hurry up before we all come back to our senses!
   Julian, King of Lemurs


Re: [ccp4bb] statistical or systematic? bias or noise?

2013-03-11 Thread Ed Pozharski
Ian,

thanks for the quick suggestion.

On Mon, 2013-03-11 at 18:34 +0000, Ian Tickle wrote:
 Personally I tend to avoid the systematic vs random error distinction
 and think instead in terms of controllable and uncontrollable errors:
 systematic errors are potentially under your control (given a
 particular experimental setup), whereas random errors aren't.
 
Should you make a distinction then between controllable (cycling the
cuvette in and out of the holder) and potentially controllable (dilution)
errors?  And might the latter then become controllable with a different
experimental setup?

Cheers,

Ed.

-- 
I don't know why the sacrifice thing didn't work.  
Science behind it seemed so solid.
Julian, King of Lemurs


Re: [ccp4bb] statistical or systematic? bias or noise?

2013-03-11 Thread Ed Pozharski
Pete,

On Mon, 2013-03-11 at 13:42 -0500, Pete Meyer wrote:
 My take on it is slightly different - the difference seems to be more a
 matter of how the source of error is modeled (although that may dictate
 changes to the experiment) than of how the experiment was conducted.
 
 Or (possibly) more clearly: systematic error is a result of the model
 of the experiment incorrectly reflecting the actual experiment;
 measurement error is due to living in a non-deterministic universe.

I see your point. 

I want to clarify that reproducing an experiment as far back as possible
is best.  Of course it's possible to design the experiment better and
account for pipetting errors.  The question is not whether that has to be
done (certainly yes) but whether the pipetting error should be considered
an inaccuracy or an imprecision when the experiment is not repeated.

One can say it's inaccuracy when it is not estimated and imprecision
when it is.  Or one can accept Ian's suggestion and notice that there is
no fundamental difference between things you can control and things you
can potentially control.

IIUC, you are saying that the nature of the error should be independent of
my decision to model it or not.  In other words, if I can potentially
sample some additional random variable in my experiment, it contributes
to the precision whether I do it or not.  When it's not sampled, the
precision is simply underestimated.  Does that make more sense?

Cheers,

Ed.


-- 
After much deep and profound brain things inside my head, 
I have decided to thank you for bringing peace to our home.
Julian, King of Lemurs


Re: [ccp4bb] [Err] Re: [ccp4bb] statistical or systematic? bias or noise?

2013-03-11 Thread Ed Pozharski
By the way, am I the only one who gets this thing with every post?  If
anyone can ask Jin Kwang (liebe...@korea.ac.kr) to either clean up his
mailbox or unsubscribe, that would be truly appreciated.  Delete button
is easy and fun to use, but this has been going on for quite some time.



On Tue, 2013-03-12 at 04:16 +0900, spam_mas...@korea.ac.kr wrote:
 Transmit Report:
 
 liebe...@korea.ac.kr: 554 Transaction failed. 402 Local User Inbox Full
 (liebe...@korea.ac.kr) 4,61440,370609 (163.152.6.98)
 

-- 
Bullseye!  Excellent shot, Maurice.
  Julian, King of Lemurs.


Re: [ccp4bb] [Err] Re: [ccp4bb] statistical or systematic? bias or noise?

2013-03-11 Thread Andrey Lebedev

I've just had the same thing.
I'll write to Jin Kwang and remove him from the bb-list if he does not
respond by tomorrow evening.

Andrey

On 11 Mar 2013, at 19:27, Ed Pozharski wrote:

 By the way, am I the only one who gets this thing with every post?  If
 anyone can ask Jin Kwang (liebe...@korea.ac.kr) to either clean up his
 mailbox or unsubscribe, that would be truly appreciated.  Delete button
 is easy and fun to use, but this has been going on for quite some time.
 
 
 



Re: [ccp4bb] [Err] Re: [ccp4bb] statistical or systematic? bias or noise?

2013-03-11 Thread Andrey Lebedev

It should stop. I'll see after sending this message.

On 11 Mar 2013, at 19:27, Ed Pozharski wrote:

 By the way, am I the only one who gets this thing with every post?  If
 anyone can ask Jin Kwang (liebe...@korea.ac.kr) to either clean up his
 mailbox or unsubscribe, that would be truly appreciated.  Delete button
 is easy and fun to use, but this has been going on for quite some time.
 
 
 



Re: [ccp4bb] statistical or systematic? bias or noise?

2013-03-11 Thread Pete Meyer

Ed,

Ed Pozharski wrote:
 IIUC, you are saying that the nature of the error should be independent of
 my decision to model it or not.  In other words, if I can potentially
 sample some additional random variable in my experiment, it contributes
 to the precision whether I do it or not.  When it's not sampled, the
 precision is simply underestimated.  Does that make more sense?


Actually, I was trying to say the opposite - that the decision to 
include something in the model (or not) could change the nature of the 
error.  Too bad that what I was thinking doesn't apply to the situation 
you described - my intuition was assuming that there was some kind of 
optimization/refinement/fitting going on.  By analogy to profile 
fitting, modeling a spot as a circle or an ellipsoid will have an effect on 
the standard deviation attributed to that spot.  But that wasn't the 
situation you were describing.


Pete

PS - IIUC := ?