[spctools-discuss] Re: how is xtandem fval calculated

Brendan Tue, 06 Oct 2009 05:44:37 -0700

Hi Bill,
I think you are going to have to build it yourself, and do some printf
debugging to get yourself through this one.  I looked over what you
wrote, and the only issue I can see is that next_score appears to be
221 and not 210, but that only makes the final value further from what
is reported.
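
For anyone repeating the arithmetic, here is a minimal C++ sketch that mirrors the quoted getDiscriminantScore with use_expect_ off. The struct KscoreWeights and the helpers weightsAtIndex and discriminantScore are hypothetical names for illustration, not TPP code. The weight tables are copied from the quoted TandemKscoreDF.cxx snippet, expect_wt and len_wt are set to zero as described for the base class, and treating hyper and next as doubles is an assumption.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>

// Hypothetical container for the weights used by the quoted function.
struct KscoreWeights {
    double constant;   // const_
    double score_wt;   // score_wt_
    double delta_wt;   // delta_wt_
    double expect_wt;  // expect_wt_ (zero in the base class)
    double len_wt;     // len_wt_ (zero in the base class)
};

// Look up weights by table index; tables copied from the quoted snippet.
KscoreWeights weightsAtIndex(std::size_t index) {
    static const double consts[] = {-13.287, -28.708, -31.083, -31.083, -31.083};
    static const double score_wts[] = {2.256, 4.91, 4.983, 4.983, 4.983};
    static const double delta_wts[] = {14.346, 10.882, 18.091, 18.091, 18.091};
    assert(index < 5);
    return KscoreWeights{consts[index], score_wts[index], delta_wts[index], 0.0, 0.0};
}

// Same arithmetic as the quoted getDiscriminantScore with use_expect_ == false.
double discriminantScore(const KscoreWeights& w, double hyper, double next,
                         double expect, std::size_t peptideLength) {
    assert(hyper > 0.0 && expect > 0.0);
    double disc = w.score_wt * std::log(hyper)
                + w.expect_wt * (0.0 - std::log(expect))
                + w.delta_wt * (1.0 - next / hyper);
    if (w.len_wt != 0.0)
        disc /= w.len_wt * std::sqrt(static_cast<double>(peptideLength));
    return w.constant + disc;
}
```

Calling discriminantScore(weightsAtIndex(1), 253.0, 221.0, 0.79, 11) gives roughly -0.163, which is further from the reported 1.0108 than the 0.311 in the manual calculation. With weightsAtIndex(0), the first entry in each table, the same inputs give roughly 1.0108. That may mean a 1+ spectrum maps to index 0, but nothing in this thread confirms how the charge index is defined, so it is something to verify with printf output.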


At this point I'd start having the code print out values for me to
check my assumptions.  I am afraid I can't help you with that, though.

By the way, I can't take any credit for the k-score discriminant
function.  It is the same function created for the original Keller
OMICS paper where the score was introduced, and mimics the Comet
discriminant score.  I created the discriminant score for X! Tandem
native scoring.

The expect_wt_ and len_wt_ values are initialized to zero in the
TandemDiscrimFunction base class.
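
Assuming those zero weights, and with use_expect_ false, the quoted function reduces to the constant plus two terms. The symbols are shorthand introduced here: c for const_, w_s for score_wt_, w_Delta for delta_wt_, h for hyper_ and n for next_.

```latex
\[
\mathrm{fval} = c + w_s \ln h + w_\Delta \left(1 - \frac{n}{h}\right)
\]
```

Under that assumption, neither the expect value nor the peptide length enters the k-score fval.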

--Brendan

On Oct 5, 12:19 pm, bill <[email protected]> wrote:
> Thanks again for the feedback! We compared plain old expect values to
> those with Brendan's discriminant scoring and our samples
> performed better with Brendan's scoring. So, now I'm back to figuring
> out how fval is calculated.
>
> I'm trying to manually work through calculating the fval for an xtandem
> (k-score) result.
>
> My stripped down pep.xml entry looks like this:
>
> <spectrum_query spectrum="010319_f16.00923.00923.1"
>   <search_result>
>     <search_hit hit_rank="1" peptide="AISDAMFANPK" >
>       <search_score name="hyperscore" value="253"/>
>       <search_score name="nextscore" value="221"/>
>       <search_score name="expect" value="0.79"/>
>       <analysis_result analysis="peptideprophet">
>         <peptideprophet_result probability="0.1732"
> all_ntt_prob="(0.0000,0.0000,0.1732)">
>           <search_score_summary>
>             <parameter name="fval" value="1.0108"/>
>
> I'm plugging the values into the getDiscriminantScore method in
> TandemDiscrimFunction:
>
> double TandemDiscrimFunction::getDiscriminantScore(SearchResult* result)
> {
>   TandemResult* tresult = (TandemResult*)(result);
>   double tot = const_;
>   double disc = score_wt_ * log((double)tresult->hyper_)
>               + expect_wt_ * (0-log((double)tresult->expect_))
>               + delta_wt_ * (1.0 - (tresult->next_ / tresult->hyper_));
>
>   if (len_wt_)
>       disc /= len_wt_ * sqrt((double)strlen(tresult->peptide_));
>   tot += disc;
>   if (use_expect_) {
>     tot = 3 * tot - 8;
>   }
>   return tot;
> }
>
> I'm initializing through TandemKscoreDF.cxx with TandemKscoreDF(1, false)
>
> static double consts[] = {-13.287, -28.708, -31.083, -31.083, -31.083};
> static double score_wts[] = {2.256, 4.91, 4.983, 4.983, 4.983};
> static double delta_wts[] = {14.346, 10.882, 18.091, 18.091, 18.091};
>
> if (!use_expect)
> {
>     const_ = consts[charge];
>     score_wt_ = score_wts[charge];
>     delta_wt_ = delta_wts[charge];
> }
>
> So my manual calculation looks like this:
> tot = -28.708
> disc = 4.91 * ln(253) + 0 * (0 - ln(0.79)) + 10.882 * (1.0 - (210/253)) =
>        4.91 * 5.533 + 0 * 0.236 + 10.882 * 0.170 =
>        27.169 + 0 + 1.8495 =
>        29.0185
>
> tot = -28.708 + 29.0185 = 0.311
>
> But, I'm trying to reproduce the fval of 1.0108. I'm not an
> experienced C++ programmer. Am I missing where expect_wt_ and len_wt_
> are initialized? Is there another step to the calculation?
>
> Thanks,
> Bill
>
> On Sep 17, 1:05 pm, David Shteynberg <[email protected]>
> wrote:
>
> > The EXPECTSCORE option was something that was added after f-val at
> > Alexey's request.  I believe the reason for the inclusion of this option
> > is in the usage statement:
>
> > E [only use Expect Score as the Discriminant(applies only to X!Tandem data,
> >      helpful for data with homologous top hits e.g. phospho or glyco)]
>
> > -David
>
> > On Thu, Sep 17, 2009 at 4:54 AM, Brendan <[email protected]> wrote:
>
> > > Hi Jimmy and Bill,
> > > I am sure using EXPECTSCORE in place of the fval has received less
> > > testing for the veracity of its probabilities, something Alexei was
> > > adamant about when I developed the fval for the X! Tandem native
> > > score.  You have to run searches with decoys and look at q-q plots for
> > > this.  Also, Alexei was initially dubious whether PeptideProphet would
> > > work at all on X! Tandem native, because the expect scoring
> > > distribution has a significant left skew, and is far from normal,
> > > which I believe I was able to mitigate somewhat with the variables I
> > > added in the fval.  And, finally, my own ROC plots showed this fval
> > > doing a better job at discriminating between true- and false-positive
> > > hits.
>
> > > So, go cautiously into throwing that switch, and make your own
> > > estimations of benefit v. cost.
>
> > > --Brendan
>
> > > On Sep 16, 9:36 am, Jimmy Eng <[email protected]> wrote:
> > >> Bill,
>
> > >> If you're interested in Tandem's E-value, there is an option that
> > >> David added a long time ago that allows Tandem's E-value to be
> > >> used in place of the discriminant function.  The xinteract option to
> > >> invoke this is "-OE".  Looks like this adds "EXPECTSCORE" to the
> > >> PeptideProphetParser command line.
>
> > >> On Wed, Sep 16, 2009 at 5:52 AM, Brendan <[email protected]> wrote:
>
> > >> > Hi Bill,
> > >> > I am the one who wrote it several years back.  I derived the fval
> > >> > somewhat by voodoo, using analysis tools I created in CPAS.  It tested
> > >> > quite well, and if anything produces slightly conservative
> > >> > probabilities, where I felt the k-score fval was slightly optimistic
> > >> > in its estimates (but only slightly).  Note that the k-score fval was
> > >> > derived for the scoring function originally published by Keller, et
> > >> > al, before the score was incorporated into X! Tandem, and therefore
> > >> > makes no use of X! Tandem's expect value.
>
> > > >> > These fval calculations can be found in
> > > >> > <tpproot>/src/Validation/DiscriminateFunction/Tandem (for native
> > > >> > scoring) and Comet (for k-score).
>
> > >> > I remember that a peptide length correction (I think sqrt(length)
> > >> > actually) was important in the final native fval.  Because of how
> > >> > aggressively X! Tandem weights the presence of matching ions, a larger
> > >> > peptide is more likely to produce a wider spread between its best and
> > >> > second best score, the key factor in the expectation value.
>
> > >> > Hope that helps.
>
> > >> > --Brendan
>
> > >> > On Sep 15, 2:30 pm, bill <[email protected]> wrote:
> > >> >> If this is not easily available can you direct me to the class/script/
> > >> >> method that creates the xtandem fval?
> > >> >> Thanks,
> > >> >> Bill
>
> > >> >> On Sep 14, 4:16 pm, Bill Nelson <[email protected]> wrote:
>
> > > >> >> > Can you please explain how ProteinProphet calculates the fval from the
> > > >> >> > xtandem output?
> > >> >> > It doesn't seem to be tracking the xtandem E-value, what else is
> > >> >> > included?
> > >> >> > Thanks,
> > >> >> > Bill