Hi Natalie,

Thanks for the info, but my question is still not completely answered.

My question was *how* the TPP does what it does. We all know it extracts more information from the results, but I wanted to know how it calculates a statistically valid p-value (the chance of a match being incorrect given the null hypothesis) if the search engine has already done so. If you know the database size, you can always calculate the p-value from the e-value and vice versa. So how does the TPP fit in?

Let's also focus on another serious problem that is not much discussed for search engines: in the higher mass ranges there are too few candidates to accurately calculate a p-value or e-value (or any statistical measure, for that matter). Unless you have a null model (a supposedly bell-shaped curve for random hits), you cannot calculate a statistical confidence. And you do not have enough candidates (let's assume just 5) to draw a distribution from which Tandem can calculate the e-value correctly. Curve fitting will have a drastic effect here. How does the TPP correct for this, if it does so at all?

Regards,
Amit Kumar Yadav
Senior Research Fellow (SRF-CSIR)
IGIB, New Delhi (India)
http://masswiz.igib.res.in

On Thu, Mar 11, 2010 at 11:55 AM, Natalie Tasman <[email protected]> wrote:

> Good questions and worth explaining for those new to the field. X!Tandem
> is a program that assigns peptide sequences ("ID"s) to ms/ms spectra. We
> call this type of program a "search engine" (for "peptide ID search engine"
> or similar). Other programs in this class are OMSSA, Sequest, Mascot, and
> others. Each of these programs can be run on its own, and outputs a score
> for each "assigned" peptide. This score reflects the search engine's
> estimation of how likely or confident that assignment is (I use those terms
> not necessarily as true stats meanings here.)
>
> So now you have input of peptide IDs and "scores". The TPP tools (Pep and
> Prot Prophet) do two major things of interest.
> One, they take the
> variously derived scores, compute additional information not necessarily
> accounted for by the search engine (asking such questions as "how reasonable is
> this sequence, given its termini, length, hydrophobicity, and so on?"), and
> combine these numbers to arrive at a statistically valid p value. Two, the
> TPP tools do this for many supported search engines, which allows the
> possibility of comparing the peptide assignments to other results, i.e. in a
> publication context. (For a more complex approach to the last point, look
> at the TPP's InterProphet tool.)
>
> Regarding FDR, I will leave that to someone else to answer in detail.
>
> Hope this helps,
> Natalie
>
> On 3/10/10 8:46 PM, Amit Kumar Yadav wrote:
>
>> Dear All,
>>
>> I was just browsing the gpm site and reading about tandem. It says
>> that peptideprophet and proteinprophet need not be used with X!Tandem.
>> Can someone explain (in simple terms please) how TPP uses tandem
>> result data to assign probabilities?
>>
>> Another naive question is about calculation of FDR from X!Tandem
>> results? How should it be done?
>>
>> For reference-
>> Quoting from http://www.thegpm.org/TANDEM/index.html "Unlike
>> some... ... Therefore, separate assembly and statistical analysis
>> software, e.g. PeptideProphet and ProteinProphet, do not need to be
>> used."
>
> --
> You received this message because you are subscribed to the Google Groups
> "spctools-discuss" group.
> To post to this group, send email to [email protected].
> To unsubscribe from this group, send email to
> [email protected].
> For more options, visit this group at
> http://groups.google.com/group/spctools-discuss?hl=en.
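
For concreteness, the e-value/p-value conversion mentioned in the reply at the top of this thread can be sketched as below. This is only an illustration under stated assumptions, not how X!Tandem or the TPP actually compute their numbers: it assumes the usual convention that the e-value is the per-candidate p-value multiplied by the number of candidates scored, and that chance hits follow a Poisson distribution. The function names and the figure of 50,000 candidates are hypothetical.

```python
import math


def evalue_from_pvalue(p_value, n_candidates):
    """Expected number of chance hits scoring at least this well.

    Assumes E = N * p, where N is the number of candidate peptides
    scored against the spectrum (an assumption, not a TPP/X!Tandem API).
    """
    return p_value * n_candidates


def pvalue_from_evalue(e_value, n_candidates):
    """Inverse of evalue_from_pvalue under the same E = N * p assumption."""
    return e_value / n_candidates


def prob_at_least_one_chance_hit(e_value):
    """Chance of seeing one or more random hits this good.

    Assumes chance hits are Poisson distributed with mean E,
    so P(at least one) = 1 - exp(-E).
    """
    return -math.expm1(-e_value)


# Hypothetical example: per-candidate p-value of 1e-6, 50,000 candidates.
e = evalue_from_pvalue(1e-6, 50_000)
print(e, pvalue_from_evalue(e, 50_000), prob_at_least_one_chance_hit(e))
```

Note that the conversion only rescales whatever p-value the null model produced. With only a handful of candidates, as in the 5-candidate case above, N is small and the fitted null distribution is poorly constrained, so the converted value is no more reliable than the fit it came from.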
