Hi David,

You are right, searching all of UniProt + decoys does take a while: about 6 hours for a relatively small data set. Unfortunately, in my case, there were only a few sequences in there from the species being studied. Apparently only reviewed sequences are in the "full" UniProt database, or at least in the version I downloaded.
Anyway, I guessed correctly on the "contaminating" species in the sample. I concatenated that database with the database for the species under study and, voilà, the pep.xml files run nicely through the TPP, with many hundreds matching proteins from the offending species database. In the end, only two proteins matched the database for the species under study and 650 matched the offending species. So no wonder TPP couldn't give me any validated results.

I did this using the latest version of TPP (4.7), as you suggested. I like the beta version of the viewer for the prot.xml files, though I haven't found a link from that view to the MS/MS spectra. Also, the iProphet results are now fully viewable (on Linux); before, only the first line would be displayed.

Thanks a bunch for your help!

Cheers!
Brian

On Sat, Mar 1, 2014 at 12:01 PM, David Shteynberg < [email protected]> wrote:

> Hi Brian,
>
> I have recently been confirming that incomplete databases could pose a big
> problem to properly modeling the data. One solution is to search against
> all of uniprot plus decoys. The good part about this is you don't have to
> include contaminants as they should already be represented ;-). The bad
> part is the searches will take longer.
>
> -David
>
> On Sat, Mar 1, 2014 at 6:44 AM, Brian Hampton <[email protected]> wrote:
>
>> Thanks David, I'll try your suggestions. Also, I'm thinking there might
>> be proteins from another species in the sample that are not represented in
>> the database I'm searching.
>>
>> Thanks,
>> Brian
>>
>> On Feb 28, 2014, at 6:18 PM, David Shteynberg <
>> [email protected]> wrote:
>>
>> Hi Brian,
>>
>> This message is quite telling: 10751 Decoys, and 11603 Non-Decoys
>>
>> Assuming 50% of the wrong ones are Decoys there are fewer than 900 correct
>> matches in this dataset.
>>
>> That said, I don't think using the default parametric option is the best
>> choice here due to the extended tail of the negative distribution and the
>> relatively small number of correct IDs. I would suggest you try the
>> semi-parametric (non-parametric option), also run with minimum probability
>> 0 and perhaps set a CLEVEL of 1 with advanced option -c1 or -c2 (higher is
>> more conservative) to help keep the positive distribution of the model on
>> the shoulder.
>>
>> Finally, I would recommend you try the latest version 4.7.0, which is due
>> out today.
>>
>> Cheers,
>> -David
>>
>> On Fri, Feb 28, 2014 at 3:09 PM, Brian Hampton <[email protected]> wrote:
>>
>>> I am trying to squeak additional mass accuracy out of an LTQ by
>>> collecting MS1 data with Enhanced Scan and in Profile mode so the data will
>>> have isotopic resolution. This had the expected result of shifting the
>>> frequency of peptides observed at a particular error down from ~0.75 (when
>>> collected using normal scan rate and centroid mode) to center the curve
>>> over an error of 0 (when collected using enhanced scan rate and profile
>>> mode). Nice.
>>>
>>> I processed the raw files with msconvert using the --filter
>>> "peakPicking true 1-1" argument and ran them through TPP v 4.6.2 with Tandem
>>> as the search engine. This has worked fine on samples until today: with
>>> this latest data set (which is a pull-down experiment), xinteract fails to
>>> produce a pep.xml file and results in warnings that say:
>>>
>>> WARNING: Mixture model quality test failed for charge (1+).
>>> WARNING: Mixture model quality test failed for charge (2+).
>>> WARNING: Mixture model quality test failed for charge (3+).
>>>
>>> The tandem.pep.xml file contains many high-scoring peptides, and there
>>> are good +1, +2 and +3 spectral matches.
>>>
>>> I am wondering if there is an argument to msconvert that I am missing.
>>> Or maybe my approach to collecting the data is incompatible with TPP? Or
>>> could this be a problem with K-score vs native Tandem scoring? I haven't
>>> tried another search using native Tandem scoring yet.
>>>
>>> Below is the output from TPP.
>>>
>>> Thanks in advance for any help.
>>>
>>> Cheers,
>>> Brian
>>>
>>> EXECUTING: cd /usr/local/tpp/data/projects/lindsey/xiap-ide12;
>>> /usr/local/tpp/bin/xinteract -Ninteract-XIAP-IDE12.pep.xml -p0.05 -l5 -Op
>>> -dDECOY 140226-XIAP-IDE12-5pct.tandem.pep.xml
>>>
>>> /usr/local/tpp/bin/xinteract (TPP v4.6 OCCUPY rev 2, Build 201302151642
>>> (linux))
>>>
>>> running: "/usr/local/tpp/bin/InteractParser
>>> 'interact-XIAP-IDE12.pep.xml' '140226-XIAP-IDE12-5pct.tandem.pep.xml' -L'5'"
>>> file 1: 140226-XIAP-IDE12-5pct.tandem.pep.xml
>>> SUCCESS: CORRECTED data file
>>> /usr/local/tpp/data/projects/lindsey/xiap-ide12/140226-XIAP-IDE12-5pct.mzML
>>> in msms_run_summary tag ...
>>> processed altogether 22365 results
>>> INFO: Results written to file:
>>> /usr/local/tpp/data/projects/lindsey/xiap-ide12/interact-XIAP-IDE12.pep.xml
>>> command completed in 6 sec
>>>
>>> running: "/usr/local/tpp/bin/DatabaseParser
>>> 'interact-XIAP-IDE12.pep.xml'"
>>> command completed in 1 sec
>>>
>>> running: "/usr/local/tpp/bin/RefreshParser 'interact-XIAP-IDE12.pep.xml'
>>> '/usr/local/tpp/data/dbase/ixodes-plus-crap_DECOY.fasta'"
>>> - Building Commentz-Walter keyword tree... - Searching the tree...
>>> - Linking duplicate entries... - Printing results...
>>>
>>> command completed in 6 sec
>>>
>>> running: "/usr/local/tpp/bin/PeptideProphetParser
>>> 'interact-XIAP-IDE12.pep.xml' MINPROB=0.05 DECOY=DECOY"
>>> Using Decoy Label "DECOY".
>>> (X! Tandem (k-score))
>>> init with X! Tandem (k-score) trypsin
>>> MS Instrument info: Manufacturer: Thermo Scientific, Model: UNKNOWN,
>>> Ionization: nanoelectrospray, Analyzer: radial ejection linear ion trap,
>>> Detector: electron multiplier
>>>
>>> PeptideProphet (TPP v4.6 OCCUPY rev 2, Build 201302151642 (linux))
>>> AKeller@ISB
>>> read in 2445 1+, 9959 2+, 9950 3+, 0 4+, 0 5+, 0 6+, and 0 7+ spectra.
>>> Initialising statistical models ...
>>> Found 10751 Decoys, and 11603 Non-Decoys
>>> Iterations: .........10.........20........
>>> WARNING: Mixture model quality test failed for charge (1+).
>>> WARNING: Mixture model quality test failed for charge (2+).
>>> WARNING: Mixture model quality test failed for charge (3+).
>>> model complete after 29 iterations
>>> command completed in 4 sec
>>>
>>> running: "/usr/local/tpp/bin/ProphetModels.pl -i
>>> interact-XIAP-IDE12.pep.xml -d DECOY"
>>> Analyzing interact-XIAP-IDE12.pep.xml ...
>>> Parsing search results
>>> "/usr/local/tpp/data/projects/lindsey/xiap-ide12/140226-XIAP-IDE12-5pct (X!
>>> Tandem (k-score))"...
>>> => Total of 0 hits.
>>> => Total of 0 decoy hits.
>>> => Total of 0 excluded hits.
>>> Warning: empty y range [0:0], adjusting to [0:1]
>>> Warning: empty y range [0:0], adjusting to [0:1]
>>>
>>> plot "interact-XIAP-IDE12.pep_FVAL.tsv" using 1:17 title "Observed" with
>>> line lc -1 , "interact-XIAP-IDE12.pep_FVAL.tsv" using 1:18 title "Model
>>> Pos" with line lc 3 , "interact-XIAP-IDE12.pep_FVAL.tsv" using 1:19 title
>>> "Model Neg" with line lc 1
>>>
>>> ^
>>> "interact-XIAP-IDE12.pep_FVAL.gp", line 23: warning: Skipping data file
>>> with no valid points
>>> Warning: empty y range [0:0], adjusting to [0:1]
>>>
>>> plot "interact-XIAP-IDE12.pep_FVAL.tsv" using 1:20 title "Observed" with
>>> line lc -1 , "interact-XIAP-IDE12.pep_FVAL.tsv" using 1:21 title "Model
>>> Pos" with line lc 3 , "interact-XIAP-IDE12.pep_FVAL.tsv" using 1:22 title
>>> "Model Neg" with line lc 1
>>>
>>> ^
>>> "interact-XIAP-IDE12.pep_FVAL.gp", line 25: warning: Skipping data file
>>> with no valid points
>>> Warning: empty y range [0:0], adjusting to [0:1]
>>>
>>> plot "interact-XIAP-IDE12.pep_PPPROB.tsv" using 2:1 title
>>> "PeptideProphet" with line lt 1 lc 3 , x notitle with line lt 0 lc -1
>>>
>>> ^
>>> "interact-XIAP-IDE12.pep_PPPROB.gp", line 17: warning: Skipping data
>>> file with no valid points
>>>
>>> plot "interact-XIAP-IDE12.pep_IPPROB.tsv" using 2:1 title "iProphet"
>>> with line lt 1 lc 3 , "interact-XIAP-IDE12.pep_PPPROB.tsv" using 2:1 title
>>> "PeptideProphet" with line lt 1 lc 1 , x notitle with line lt 0 lc -1
>>>
>>> ^
>>> "interact-XIAP-IDE12.pep_IPPROB.gp", line 17: warning: Skipping data
>>> file with no valid points
>>>
>>> plot "interact-XIAP-IDE12.pep_IPPROB.tsv" using 2:1 title "iProphet"
>>> with line lt 1 lc 3 , "interact-XIAP-IDE12.pep_PPPROB.tsv" using 2:1 title
>>> "PeptideProphet" with line lt 1 lc 1 , x notitle with line lt 0 lc -1
>>>
>>> ^
>>> "interact-XIAP-IDE12.pep_IPPROB.gp", line 17: warning: Skipping data
>>> file with no valid points
>>> command completed in 0 sec
>>>
>>> running: "/usr/local/tpp/cgi-bin/PepXMLViewer.cgi -I
>>> /usr/local/tpp/data/projects/lindsey/xiap-ide12/interact-XIAP-IDE12.pep.xml"
>>> Segmentation fault (core dumped)
>>>
>>> command "/usr/local/tpp/cgi-bin/PepXMLViewer.cgi -I
>>> /usr/local/tpp/data/projects/lindsey/xiap-ide12/interact-XIAP-IDE12.pep.xml"
>>> exited with non-zero exit code: 35584
>>> QUIT - the job is incomplete
>>>
>>> Command FAILED
>>> RETURN CODE:35584
>>>
>>> --
>>> You received this message because you are subscribed to the Google
>>> Groups "spctools-discuss" group.
>>> To unsubscribe from this group and stop receiving emails from it, send
>>> an email to [email protected].
>>> To post to this group, send email to [email protected].
>>> Visit this group at http://groups.google.com/group/spctools-discuss.
>>> For more options, visit https://groups.google.com/groups/opt_out.
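For anyone reading this thread later, David's back-of-the-envelope estimate from the PeptideProphet line "Found 10751 Decoys, and 11603 Non-Decoys" can be reproduced with a few lines of Python. This is only a rough sketch, not anything TPP itself computes: the function name and the `decoy_fraction` parameter are made up for illustration, and the calculation rests on David's stated assumption that 50% of the wrong matches hit decoys.

```python
def estimate_correct_matches(decoys, non_decoys, decoy_fraction=0.5):
    """Rough estimate of the number of correct matches.

    Assumption (from the thread, not from TPP): a fixed fraction
    `decoy_fraction` of all wrong matches land on decoy entries.
    """
    if not 0 < decoy_fraction <= 1:
        raise ValueError("decoy_fraction must be in (0, 1]")
    # All wrong matches, decoy and non-decoy together.
    wrong_total = decoys / decoy_fraction
    # Wrong matches that landed on non-decoy (target) entries.
    wrong_targets = wrong_total - decoys
    # Whatever is left among the non-decoys is presumed correct.
    return max(non_decoys - wrong_targets, 0.0)


# Counts reported by PeptideProphet in the log quoted in this thread.
decoys, non_decoys = 10751, 11603
correct = estimate_correct_matches(decoys, non_decoys)
total = decoys + non_decoys
print(f"estimated correct matches: {correct:.0f} of {total} ({correct / total:.1%})")
```

With the counts from the log this gives 852 presumed-correct matches out of 22354, which agrees with the "fewer than 900" figure and shows how small the positive distribution is relative to the negative one.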
