[spctools-discuss] Re: how is xtandem fval calculated

Brendan Wed, 07 Oct 2009 06:37:10 -0700

Hi Bill,
Depending on how much you hate it, you could also just use some simple
printf statements to better understand how the values are being
calculated.  I am a huge fan of debuggers, but this doesn't seem, yet,
like a complicated enough problem to truly require one.  And using
print statements is surely your quickest, least intrusive route to
getting more information, since it can be done immediately, if you can
already build successfully.


--Brendan
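
A minimal sketch of the kind of print-statement check suggested above, written as a standalone function rather than as a patch to the TPP. The helper name traceKscoreFval and the array names kConsts, kScoreWts and kDeltaWts are hypothetical; the constants are the ones quoted further down this thread from TandemKscoreDF.cxx. It assumes use_expect is false and that expect_wt_ and len_wt_ are zero, as stated in the thread, so those terms drop out.

```cpp
#include <cmath>
#include <cstdio>

// Constants as quoted in this thread from TandemKscoreDF.cxx (use_expect == false).
static const double kConsts[]   = {-13.287, -28.708, -31.083, -31.083, -31.083};
static const double kScoreWts[] = {2.256, 4.91, 4.983, 4.983, 4.983};
static const double kDeltaWts[] = {14.346, 10.882, 18.091, 18.091, 18.091};

// Hypothetical helper, not part of the TPP: recomputes the k-score fval for one
// search hit and prints every intermediate term so each assumption can be checked.
// 'index' selects the array slot and must be in the range 0..4. Whether the real
// code passes the charge itself or a zero-based index is an open assumption here.
// The expect and length terms are omitted because their weights are zero.
double traceKscoreFval(double hyper, double next, int index) {
    const double scoreTerm = kScoreWts[index] * std::log(hyper);
    const double deltaTerm = kDeltaWts[index] * (1.0 - next / hyper);
    const double tot = kConsts[index] + scoreTerm + deltaTerm;
    std::printf("index=%d const=%.4f score_term=%.4f delta_term=%.4f fval=%.4f\n",
                index, kConsts[index], scoreTerm, deltaTerm, tot);
    return tot;
}
```

For the hit quoted below (hyperscore 253, nextscore 221), traceKscoreFval(253, 221, 1) comes out near -0.163, while traceKscoreFval(253, 221, 0) comes out near 1.0108, which matches the reported fval. That hints that the arrays may be indexed with a zero-based value for a 1+ spectrum, but it is only an inference from the arithmetic; print statements in the real getDiscriminantScore would be needed to confirm it.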

On Oct 6, 10:37 am, Natalie Tasman <[email protected]>
wrote:
> Hi Bill,
>
> While the currently supported Visual Studio version for the TPP is the
> (not-free) 2005 Professional version, you might want to first give the  
> (free) 2008 Express version a try.  You would not be able to build the  
> SPC raw file converters (readw, etc), and you will probably turn up a  
> few places in the code that you'd need to change to build under 2008  
> Express.
>
> Hope this helps,
>
> Natalie
>
> On Oct 6, 2009, at 6:13 AM, bill wrote:
>
>
>
> > Thanks for your help. My next step was to run it in a debugger but I
> > don't have a visual studio license. I hate to do it but I guess I'll
> > have to give Microsoft more money. I'll post the results if I figure it
> > out.
> > Bill
>
> > On Oct 6, 8:44 am, Brendan <[email protected]> wrote:
> >> Hi Bill,
> >> I think you are going to have to build it yourself, and do some printf
> >> debugging to get yourself through this one.  I looked over what you
> >> wrote, and the only issue I can see is that next_score appears to be
> >> 221 and not 210, but that only makes the final value further from what
> >> is reported.
>
> >> At this point I'd start having the code print out values for me to
> >> check my assumptions.  I am afraid I can't help you with that, though.
>
> >> By the way, I can't take any credit for the k-score discriminant
> >> function.  It is the same function created for the original Keller
> >> OMICS paper where the score was introduced, and mimics the Comet
> >> discriminant score.  I created the discriminant score for X! Tandem
> >> native scoring.
>
> >> The expect_wt_ and len_wt_ values are initialized to zero in the
> >> TandemDiscrimFunction base class.
>
> >> --Brendan
>
> >> On Oct 5, 12:19 pm, bill <[email protected]> wrote:
>
> >>> Thanks again for the feedback! We compared plain old expect values to
> >>> those with Brendan's discriminant scoring and our samples
> >>> performed better with Brendan's scoring. So, now I'm back to figuring
> >>> out how fval is calculated.
>
> >>> I'm trying to manually work through calculating the fval for an xtandem
> >>> (k-score) result.
>
> >>> My stripped down pep.xml entry looks like this:
>
> >>> <spectrum_query spectrum="010319_f16.00923.00923.1"
> >>>   <search_result>
> >>>     <search_hit hit_rank="1" peptide="AISDAMFANPK" >
> >>>       <search_score name="hyperscore" value="253"/>
> >>>       <search_score name="nextscore" value="221"/>
> >>>       <search_score name="expect" value="0.79"/>
> >>>       <analysis_result analysis="peptideprophet">
> >>>         <peptideprophet_result probability="0.1732"
> >>> all_ntt_prob="(0.0000,0.0000,0.1732)">
> >>>           <search_score_summary>
> >>>             <parameter name="fval" value="1.0108"/>
>
> >>> I'm plugging the values into the getDiscriminantScore method in
> >>> TandemDiscrimFunction:
>
> >>> double TandemDiscrimFunction::getDiscriminantScore(SearchResult* result)
> >>> {
> >>>   TandemResult* tresult = (TandemResult*)(result);
> >>>   double tot = const_;
> >>>   double disc = score_wt_ * log((double)tresult->hyper_)
> >>>               + expect_wt_ * (0-log((double)tresult->expect_))
> >>>               + delta_wt_ * (1.0 - (tresult->next_ / tresult->hyper_));
>
> >>>   if (len_wt_)
> >>>       disc /= len_wt_ * sqrt((double)strlen(tresult->peptide_));
> >>>   tot += disc;
> >>>   if (use_expect_) {
> >>>     tot = 3 * tot  - 8;
> >>>   }
> >>>   return tot;
> >>> }
>
> >>> I'm initializing through TandemKscoreDF.cxx with TandemKscoreDF(1, false)
>
> >>> static double consts[] = {-13.287, -28.708, -31.083, -31.083, -31.083};
> >>> static double score_wts[] = {2.256, 4.91, 4.983, 4.983, 4.983};
> >>> static double delta_wts[] = {14.346, 10.882, 18.091, 18.091, 18.091};
>
> >>> if (!use_expect)
> >>> {
> >>>         const_ = consts[charge];
> >>>         score_wt_ = score_wts[charge];
> >>>         delta_wt_ = delta_wts[charge];
> >>> }
>
> >>> So my manual calculation looks like this:
> >>> tot = -28.708
> >>> disc = 4.91 * ln(253) + 0 * (0 - ln(0.79)) + 10.882 * (1.0 - (210/253)) =
> >>>        4.91 * 5.533 + 0 * 0.236 + 10.882 * 0.170 =
> >>>        27.169 + 0 + 1.8495 =
> >>>        29.0185
>
> >>> tot = -28.708 + 29.0185 = 0.311
>
> >>> But, I'm trying to reproduce the fval of 1.0108. I'm not an
> >>> experienced C++ programmer. Am I missing where expect_wt_ and len_wt_
> >>> are initialized? Is there another step to the calculation?
>
> >>> Thanks,
> >>> Bill
>
> >>> On Sep 17, 1:05 pm, David Shteynberg <[email protected]> wrote:
>
> >>>> The EXPECTSCORE option was something that was added after f-val at
> >>>> Alexey's request.  I believe the reason for the inclusion of this option
> >>>> is in the usage statement:
>
> >>>> E [only use Expect Score as the Discriminant(applies only to X! Tandem data,
> >>>>      helpful for data with homologous top hits e.g. phospho or glyco)]
>
> >>>> -David
>
> >>>> On Thu, Sep 17, 2009 at 4:54 AM, Brendan <[email protected]> wrote:
>
> >>>>> Hi Jimmy and Bill,
> >>>>> I am sure using EXPECTSCORE in place of the fval has received less
> >>>>> testing for the veracity of its probabilities, something Alexei was
> >>>>> adamant about when I developed the fval for the X! Tandem native
> >>>>> score.  You have to run searches with decoys and look at q-q plots for
> >>>>> this.  Also, Alexei was initially dubious whether PeptideProphet would
> >>>>> work at all on X! Tandem native, because the expect scoring
> >>>>> distribution has a significant left skew, and is far from normal,
> >>>>> which I believe I was able to mitigate somewhat with the variables I
> >>>>> added in the fval.  And, finally, my own ROC plots showed this fval
> >>>>> doing a better job at discriminating between true- and false-positive
> >>>>> hits.
>
> >>>>> So, go cautiously into throwing that switch, and make your own
> >>>>> estimations of benefit v. cost.
>
> >>>>> --Brendan
>
> >>>>> On Sep 16, 9:36 am, Jimmy Eng <[email protected]> wrote:
> >>>>>> Bill,
>
> >>>>>> If you're interested in Tandem's E-value, there is an option that
> >>>>>> David added a long time ago that allows Tandem's E-value to be
> >>>>>> used in place of the discriminant function.  The xinteract option to
> >>>>>> invoke this is "-OE".  Looks like this adds "EXPECTSCORE" to the
> >>>>>> PeptideProphetParser command line.
>
> >>>>>> On Wed, Sep 16, 2009 at 5:52 AM, Brendan <[email protected]> wrote:
>
> >>>>>>> Hi Bill,
> >>>>>>> I am the one who wrote it several years back.  I derived the fval
> >>>>>>> somewhat by voodoo, using analysis tools I created in CPAS.  It tested
> >>>>>>> quite well, and if anything produces slightly conservative
> >>>>>>> probabilities, where I felt the k-score fval was slightly optimistic
> >>>>>>> in its estimates (but only slightly).  Note that the k-score fval was
> >>>>>>> derived for the scoring function originally published by Keller et
> >>>>>>> al., before the score was incorporated into X! Tandem, and therefore
> >>>>>>> makes no use of X! Tandem's expect value.
>
> >>>>>>> These fval calculations can be found in
> >>>>>>> <tpproot>/src/Validation/DiscriminateFunction/Tandem (for native
> >>>>>>> scoring) and Comet (for k-score).
>
> >>>>>>> I remember that a peptide length correction (I think sqrt(length)
> >>>>>>> actually) was important in the final native fval.  Because of how
> >>>>>>> aggressively X! Tandem weights the presence of matching ions, a larger
> >>>>>>> peptide is more likely to produce a wider spread between its best and
> >>>>>>> second best score, the key factor in the expectation value.
>
> >>>>>>> Hope that helps.
>
> >>>>>>> --Brendan
>
> >>>>>>> On Sep 15, 2:30 pm, bill <[email protected]> wrote:
> >>>>>>>> If this is not easily available can you direct me to the
> >>>>>>>> class/script/method that creates the xtandem fval?
> >>>>>>>> Thanks,
> >>>>>>>> Bill
>
> >>>>>>>> On Sep 14, 4:16 pm, Bill Nelson <[email protected]> wrote:
>
> >>>>>>>>> Can you please explain how ProteinProphet calculates the fval from the
> >>>>>>>>> xtandem output?
> >>>>>>>>> It doesn't seem to be tracking the xtandem E-value, what else is
> >>>>>>>>> included?
> >>>>>>>>> Thanks,
> >>>>>>>>> Bill
