I just committed the change. Please update from trunk, recompile PeptideProphetParser and try it again.
Thanks,
-David

On Wed, Nov 10, 2010 at 2:12 PM, ira cooke <[email protected]> wrote:
> Hi David,
> Thanks very much for fixing this. Would you recommend that we check
> out the latest source and just recompile PeptideProphet? I guess this
> would be safer than upgrading our whole TPP install to the latest
> code?
>
> We'll contact Phenyx support about the pepXML dump and hopefully they
> can fix that also. For now we can just clean the files.
>
> Ira
>
> On Nov 11, 7:51 am, David Shteynberg <[email protected]>
> wrote:
>> Hello again,
>>
>> I realized you sent me the db and each protein listed only once, so
>> the problem is with the Phenyx pepXML dumper writing these incorrect
>> tags. Temporarily, I have removed these entries using the sed
>> command: sed -i 's/^<alternative_protein.*$//g'
>>
>> Then I processed the file with PeptideProphetParser and options
>> MINPROB=0 DECOYPROBS NONPARAM DECOY=decoy . Your PeptideProphet
>> processed file is posted here:
>>
>> http://groups.google.com/group/spctools-discuss/web/interact.pep.xml4Ira
>>
>> For the time being, please run RefreshParser as a separate step to map
>> the IDed peptides to the proteins in the database.
>>
>> My code changes will be committed later today.
>>
>> -David
>>
>> On Wed, Nov 10, 2010 at 12:35 PM, David Shteynberg
>> <[email protected]> wrote:
>> > Hi Ira,
>> >
>> > Thanks for providing your Phenyx search results. I was able to
>> > resolve a bug in PeptideProphet, which was preventing it from reading
>> > in any Phenyx search results. At this point I still cannot process
>> > these files with PeptideProphet because of the alternative_protein
>> > entries.
>> > Somehow, the Phenyx pepXML exporter is writing each protein
>> > twice (once listing it as an alternative protein), as follows:
>> >
>> > <search_hit hit_rank="1" peptide="QVFKQFENYVMQFNFPEEEYIDNLHK"
>> > peptide_prev_aa="R" peptide_next_aa="M" protein="decoy_12505"
>> > num_tot_proteins="1" num_matched_ions="3" tot_num_ions="153"
>> > calc_neutral_pep_mass="3335.559715" massdiff="0.1563615"
>> > num_tol_term="2" num_missed_cleavages="1" is_rejected="0">
>> > <modification_info modified_peptide="QVFKQFENYVMQFNFPEEEYIDNLHK">
>> > </modification_info>
>> > <search_score name="zscore" value="3.98486"/>
>> > <search_score name="zvalue" value="3.375993e-05"/>
>> > <search_score name="origScore" value="-30.25199"/>
>> > <alternative_protein protein="sp_human%decoy_12505"/>
>> > </search_hit>
>> >
>> > Do you really have each protein listed twice in the database or is
>> > this pointing to a bug in the Phenyx pepXML dump? As far as
>> > PeptideProphet is concerned the second entry <alternative_protein
>> > protein="sp_human%decoy_12505"/> is not a DECOY since it doesn't begin
>> > with "decoy", and PeptideProphet doesn't consider any hit a decoy
>> > when at least one of the matched proteins in the DB (that contains that
>> > peptide sequence) is not a DECOY.
>> >
>> > I think if this second issue can be resolved I will be able to compile
>> > a binary PeptideProphetParser that will not choke on this data.
>> >
>> > Cheers,
>> > -David
>> >
>> > On Tue, Nov 9, 2010 at 6:31 PM, ira cooke <[email protected]> wrote:
>> >> Hi David,
>> >> I've uploaded the database and original file .. the file is called
>> >> iracooke_phenyx_pepXML.tar.gz
>> >>
>> >> The files I've uploaded are all unmodified output from the search
>> >> (performed on a different machine to where the TPP is installed). I'd
>> >> previously tried changing some of the paths in the files to try and
>> >> fix them, but had no success.
>> >>
>> >> Also, the output of the commands I'm running is;
>> >>
>> >> /usr/local/tpp-4-4-0/bin/xinteract -Ninteract.pep.xml -p0 -eS -l7 -D'sphuman_20101013_DECOY.fasta' -OdP -ddecoy 475_pepxml.xml
>> >>
>> >> /usr/local/tpp-4-4-0/bin/xinteract (TPP v4.4 VUVUZELA rev 0, Build 201010010955 (linux))
>> >>
>> >> running: "/usr/local/tpp-4-4-0/bin/InteractParser 'interact.pep.xml' '475_pepxml.xml' -D'sphuman_20101013_DECOY.fasta' -L'7' -E'stricttrypsin'"
>> >> file 1: 475_pepxml.xml
>> >> processed altogether 116 results
>> >>
>> >> results written to file /var/www/ISB/data/Projects/Test/interact.pep.shtml
>> >>
>> >> command completed in 1 sec
>> >>
>> >> running: "/usr/local/tpp-4-4-0/bin/PeptideProphetParser 'interact.pep.xml' MINPROB=0 DECOYPROBS NONPARAM DECOY=decoy"
>> >> Using Decoy Label "decoy".
>> >> Decoy Probabilities will be reported.
>> >> Using non-parametric distributions
>> >> (PHENYX) (minprob 0)
>> >> WARNING!! The discriminant function for Phenyx is not yet complete.
>> >> It is presented here to help facilitate trial and discussion.
>> >> Reliance on this code for publishable scientific results is not
>> >> recommended.
>> >> init with PHENYX stricttrypsin
>> >> MS Instrument info: Manufacturer: ThermoFinnigan, Model: default, Ionization: FIXME, Analyzer: FIXME, Detector: FIXME
>> >>
>> >> PeptideProphet (TPP v4.4 VUVUZELA rev 0, Build 201010010955 (linux)) akel...@isb
>> >> read in 0 1+, 0 2+, 0 3+, 0 4+, 0 5+, 0 6+, and 0 7+ spectra.
>> >> read in no data
>> >>
>> >> command "/usr/local/tpp-4-4-0/bin/PeptideProphetParser 'interact.pep.xml' MINPROB=0 DECOYPROBS NONPARAM DECOY=decoy" exited with non-zero exit code: 256
>> >> QUIT - the job is incomplete
>> >>
>> >> Thanks for your help
>> >> Ira
>> >>
>> >> On Nov 10, 3:08 am, David Shteynberg <[email protected]>
>> >> wrote:
>> >>> It would be helpful to see all of the output from the latest command.
>> >>> You might have to unravel the pipeline and run the steps separately if
>> >>> the Phenyx pepXML is missing some information. Would it be possible
>> >>> for you to post your Phenyx pepXML file and the database so I can try it
>> >>> in a debugger?
>> >>>
>> >>> Thanks,
>> >>> -David
>> >>>
>> >>> On Mon, Nov 8, 2010 at 6:19 PM, ira cooke <[email protected]>
>> >>> wrote:
>> >>> > Thanks for your quick response.
>> >>> > I've modified my command to
>> >>> >
>> >>> > /usr/local/tpp-4-4-0/bin/xinteract -Ninteract.pep.xml -p0 -eT -l7 -D/var/www/ISB/data/Databases/OnMascot/SPHuman/sphuman_20101013_DECOY.fasta -OdP -ddecoy 206_pepxml.xml
>> >>> >
>> >>> > (also tried without the -Od option).
>> >>> >
>> >>> > Unfortunately I still get the same error.
>> >>> > I've checked my 206_pepxml.xml file .. and the decoys are named as
>> >>> > follows
>> >>> >
>> >>> > protein="decoy_9817"
>> >>> >
>> >>> > Is it possible that the error is related to not having the raw
>> >>> > spectra? Or should I not worry about that?
>> >>> >
>> >>> > Thanks for your help.
>> >>> >
>> >>> > On Nov 9, 11:06 am, David Shteynberg <[email protected]>
>> >>> > wrote:
>> >>> >> Phenyx search results can only be processed with the semi-parametric
>> >>> >> modeling based on decoys. Judging from the name of your database, it does
>> >>> >> have decoys in there. Now you must tell PeptideProphet to use the
>> >>> >> semi-parametric model with xinteract option -OP and the decoy tag that
>> >>> >> all your decoy proteins begin with using xinteract flag e.g. -dDECOY
>> >>> >> if all your decoy proteins begin with DECOY. You can also use option
>> >>> >> -Od to have PeptideProphet assign non-zero probabilities to the decoy
>> >>> >> hits.
>> >>> >>
>> >>> >> -David
>> >>> >>
>> >>> >> On Mon, Nov 8, 2010 at 2:36 PM, ira cooke <[email protected]>
>> >>> >> wrote:
>> >>> >> > Hi,
>> >>> >> >
>> >>> >> > I've been struggling to run PeptideProphet on phenyx generated
>> >>> >> > pepXML files.
>> >>> >> >
>> >>> >> > The error I get from PeptideProphet is "read in no data". Full
>> >>> >> > output from the tool is as follows;
>> >>> >> >
>> >>> >> > ------------------ <BEGIN COMMANDLINE OUTPUT>-----------------
>> >>> >> > /usr/local/tpp-4-4-0/bin/xinteract -Ninteract.pep.xml -p0 -eT -l7 -D/var/www/ISB/data/Databases/OnMascot/SPHuman/sphuman_20101013_DECOY.fasta 206_pepxml.xml
>> >>> >> >
>> >>> >> > /usr/local/tpp-4-4-0/bin/xinteract (TPP v4.4 VUVUZELA rev 0, Build 201010010955 (linux))
>> >>> >> >
>> >>> >> > running: "/usr/local/tpp-4-4-0/bin/InteractParser 'interact.pep.xml' '206_pepxml.xml' -D'/var/www/ISB/data/Databases/OnMascot/SPHuman/sphuman_20101013_DECOY.fasta' -L'7' -E'trypsin'"
>> >>> >> > file 1: 206_pepxml.xml
>> >>> >> > processed altogether 182 results
>> >>> >> >
>> >>> >> > results written to file /var/www/ISB/data/Projects/TRegs/SP/Phenyx/interact.pep.shtml
>> >>> >> >
>> >>> >> > command completed in 3 sec
>> >>> >> >
>> >>> >> > running: "/usr/local/tpp-4-4-0/bin/PeptideProphetParser 'interact.pep.xml' MINPROB=0"
>> >>> >> > (PHENYX) (minprob 0)
>> >>> >> > WARNING!! The discriminant function for Phenyx is not yet complete.
>> >>> >> > It is presented here to help facilitate trial and discussion.
>> >>> >> > Reliance on this code for publishable scientific results is not
>> >>> >> > recommended.
>> >>> >> > init with PHENYX trypsin
>> >>> >> > MS Instrument info: Manufacturer: ThermoFinnigan, Model: default, Ionization: FIXME, Analyzer: FIXME, Detector: FIXME
>> >>> >> >
>> >>> >> > PeptideProphet (TPP v4.4 VUVUZELA rev 0, Build 201010010955 (linux)) akel...@isb
>> >>> >> > read in 0 1+, 0 2+, 0 3+, 0 4+, 0 5+, 0 6+, and 0 7+ spectra.
>> >>> >> > read in no data
>> >>> >> >
>> >>> >> > command "/usr/local/tpp-4-4-0/bin/PeptideProphetParser 'interact.pep.xml' MINPROB=0" exited with non-zero exit code: 256
>> >>> >> > QUIT - the job is incomplete
>> >>> >> >
>> >>> >> > ------------------ <END COMMANDLINE OUTPUT>-----------------
>> >>> >> >
>> >>> >> > Note that without the -eT option I get a crash (segmentation fault).
>> >>> >> > The enzyme specified inside the phenyx pepXML is
>> >>> >> >
>> >>> >> > <sample_enzyme name="Trypsin_(KR_noP)">
>> >>> >> > </sample_enzyme>
>> >>> >> >
>> >>> >> > (to be honest I'm not sure if this means trypsin or stricttrypsin
>> >>> >> > ... but that's probably another issue as I get the error with -eS option
>> >>> >> > as well).
>> >>> >> >
>> >>> >> > Could this error be caused by a lack of raw data files? I ran the
>> >>> >> > phenyx searches on another computer and the file contains paths to
>> >>> >> > raw data on that computer. I did try fixing the paths (and copying the
>> >>> >> > raw data to a TPP accessible location) ... but that didn't work
>> >>> >> > either.
>> >>> >> >
>> >>> >> > I guess my question is what does "read in no data" mean? Does data
>> >>> >> > refer to the original spectra, or does it refer to something in the
>> >>> >> > output of InteractParser (ie interact.pep.xml).
>> >>> >> >
>> >>> >> > Any help at all on this issue would be much appreciated
>> >>> >> >
>> >>> >> > Thanks
>> >>> >> >
>> >>> >> > --
>> >>> >> > You received this message because you are subscribed to the Google
>> >>> >> > Groups "spctools-discuss" group.
>> >>> >> > To post to this group, send email to
>> >>> >> > [email protected].
>> >>> >> > To unsubscribe from this group, send email to
>> >>> >> > [email protected].
>> >>> >> > For more options, visit this group
--
You received this message because you are subscribed to the Google Groups "spctools-discuss" group.
To post to this group, send email to [email protected].
To unsubscribe from this group, send email to [email protected].
For more options, visit this group at http://groups.google.com/group/spctools-discuss?hl=en.
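
For anyone who needs the temporary cleanup from this thread but cannot use sed, here is a minimal Python sketch of the same idea as `sed -i 's/^<alternative_protein.*$//g'`: every line that begins with `<alternative_protein` is emptied. The function name `strip_alternative_proteins` and the shortened sample hit are made up for illustration and are not part of the TPP. Like the sed expression, the pattern only matches tags at the very start of a line, so an indented file would need a looser pattern.

```python
import re

# Same pattern as the sed expression quoted in the thread: a line that
# starts with "<alternative_protein" is replaced by an empty line.
ALT_PROTEIN_LINE = re.compile(r"^<alternative_protein.*$", re.MULTILINE)


def strip_alternative_proteins(pepxml_text: str) -> str:
    """Return the pepXML text with <alternative_protein ...> lines emptied."""
    return ALT_PROTEIN_LINE.sub("", pepxml_text)


# Abbreviated, illustrative version of the search_hit quoted in the thread.
sample = "\n".join([
    '<search_hit hit_rank="1" protein="decoy_12505">',
    '<search_score name="zscore" value="3.98486"/>',
    '<alternative_protein protein="sp_human%decoy_12505"/>',
    '</search_hit>',
])

cleaned = strip_alternative_proteins(sample)
print(cleaned)
```

To clean a real file, read it with `open(path).read()`, pass the text through the function, and write the result to a new file so the original Phenyx export stays untouched.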

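
The decoy rule David describes (a hit is only treated as a decoy if every protein it matches begins with the decoy label) can be illustrated with a short Python sketch. This is not PeptideProphet's actual code; `is_decoy_hit` is a hypothetical helper, and the two protein names are taken from the search_hit quoted in the thread.

```python
DECOY_PREFIX = "decoy"  # the label passed as DECOY=decoy in the thread


def is_decoy_hit(proteins, prefix=DECOY_PREFIX):
    """True only if the hit has proteins and all of them start with prefix."""
    return bool(proteins) and all(name.startswith(prefix) for name in proteins)


# Names from the search_hit quoted in the thread.
primary = "decoy_12505"
alternative = "sp_human%decoy_12505"  # written by the Phenyx exporter

print(is_decoy_hit([primary]))               # True
print(is_decoy_hit([primary, alternative]))  # False
```

The second call shows why the extra `<alternative_protein>` entry matters: `sp_human%decoy_12505` does not begin with `decoy`, so under this rule the hit stops counting as a decoy.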