Hi David,
Thanks very much for fixing this.  Would you recommend that we check
out the latest source and just recompile PeptideProphet?  I guess this
would be safer than upgrading our whole TPP install to the latest
code?

We'll contact Phenyx support about the pepXML dump and hopefully they
can fix that also.  For now we can just clean the files.
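
For anyone else hitting this, here is a minimal sketch of the cleanup,
assuming each <alternative_protein> tag sits on its own line as in
David's example below.  The function name strip_alt_proteins is just
illustrative.  Unlike David's sed command, which blanks the lines in
place, this deletes them and keeps a .bak copy of the original:

```shell
#!/bin/sh
# Remove the <alternative_protein .../> lines written by the Phenyx exporter.
# Assumption: each tag occupies a whole line (optionally indented).
# The original file is preserved as FILE.bak.
strip_alt_proteins() {
    pepxml="$1"
    cp "$pepxml" "$pepxml.bak" || return 1
    # Delete matching lines entirely instead of leaving blank lines behind.
    sed '/^[[:space:]]*<alternative_protein[^>]*>[[:space:]]*$/d' \
        "$pepxml.bak" > "$pepxml"
}
```

Usage would be something like: strip_alt_proteins 475_pepxml.xml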

Ira



On Nov 11, 7:51 am, David Shteynberg <[email protected]>
wrote:
> Hello again,
>
> I realized you sent me the db and each protein is listed only once, so
> the problem is with the Phenyx pepXML dumper writing these incorrect
> tags.  Temporarily, I have removed these entries using the sed
> command: sed -i 's/^<alternative_protein.*$//g'
>
> Then I processed the file with PeptideProphetParser and options
> MINPROB=0 DECOYPROBS NONPARAM DECOY=decoy .  Your PeptideProphet
> processed file is posted here:
>
> http://groups.google.com/group/spctools-discuss/web/interact.pep.xml4Ira
>
> For the time being, please run RefreshParser as a separate step to map
> the IDed peptides to the proteins in the database.
>
> My code changes will be committed later today.
>
> -David
>
> On Wed, Nov 10, 2010 at 12:35 PM, David Shteynberg
> <[email protected]> wrote:
> > Hi Ira,
>
> > Thanks for providing your Phenyx search results.  I was able to
> > resolve a bug in PeptideProphet, which was preventing it from reading
> > in any Phenyx search results.  At this point I still cannot process
> > these files with PeptideProphet because of the alternative_protein
> > entries.  Somehow, the Phenyx pepXML exporter is writing each protein
> > twice (once listing it as an alternative protein), as follows:
>
> > <search_hit hit_rank="1" peptide="QVFKQFENYVMQFNFPEEEYIDNLHK"
> > peptide_prev_aa="R" peptide_next_aa="M" protein="decoy_12505"
> > num_tot_proteins="1" num_matched_ions="3" tot_num_ions="153"
> > calc_neutral_pep_mass="3335.559715" massdiff="0.1563615"
> > num_tol_term="2" num_missed_cleavages="1" is_rejected="0">
> > <modification_info modified_peptide="QVFKQFENYVMQFNFPEEEYIDNLHK">
> > </modification_info>
> > <search_score name="zscore" value="3.98486"/>
> > <search_score name="zvalue" value="3.375993e-05"/>
> > <search_score name="origScore" value="-30.25199"/>
> > <alternative_protein protein="sp_human%decoy_12505"/>
> > </search_hit>
>
> > Do you really have each protein listed twice in the database or is
> > this pointing to a bug in the Phenyx pepXML dump?  As far as
> > PeptideProphet is concerned the second entry <alternative_protein
> > protein="sp_human%decoy_12505"/> is not a DECOY since it doesn't begin
> > with "decoy",  and PeptideProphet doesn't consider any hit a decoy
> > when at least one of the matched proteins in the DB (that contains that
> > peptide sequence) is not a DECOY.
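
The decoy rule described above can be sketched in shell as follows.
This is only an illustration of the rule as stated in this message, not
PeptideProphet's actual code, and the function name is_decoy_hit is
hypothetical:

```shell
#!/bin/sh
# Succeeds (exit 0) only if every protein name begins with the decoy prefix.
# Usage: is_decoy_hit PREFIX PROTEIN [PROTEIN...]
is_decoy_hit() {
    prefix="$1"
    shift
    # A hit with no proteins at all is not treated as a decoy here.
    [ "$#" -gt 0 ] || return 1
    for protein in "$@"; do
        case "$protein" in
            "$prefix"*) ;;        # still consistent with a decoy hit
            *) return 1 ;;        # one non-decoy protein makes it a target hit
        esac
    done
    return 0
}
```

For the hit shown above, is_decoy_hit decoy decoy_12505 'sp_human%decoy_12505'
fails because the alternative protein name starts with "sp_human", which
is why the hit is not counted as a decoy.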
>
> > I think if this second issue can be resolved I will be able to compile
> > a binary PeptideProphetParser that will not choke on this data.
>
> > Cheers,
> > -David
>
> > On Tue, Nov 9, 2010 at 6:31 PM, ira cooke <[email protected]> wrote:
> >> Hi David,
> >> I've uploaded the database and original file .. the file is called
> >> iracooke_phenyx_pepXML.tar.gz
>
> >> The files I've uploaded are all unmodified output from the search
> >> (performed on a different machine to where the TPP is installed). I'd
> >> previously tried changing some of the paths in the files to try and
> >> fix them, but had no success.
>
> >> Also, the output of the commands I'm running is;
>
> >> /usr/local/tpp-4-4-0/bin/xinteract  -Ninteract.pep.xml -p0 -eS -l7
> >> -D'sphuman_20101013_DECOY.fasta' -OdP -ddecoy 475_pepxml.xml
>
> >> /usr/local/tpp-4-4-0/bin/xinteract (TPP v4.4 VUVUZELA rev 0, Build
> >> 201010010955 (linux))
>
> >> running: "/usr/local/tpp-4-4-0/bin/InteractParser 'interact.pep.xml'
> >> '475_pepxml.xml' -D'sphuman_20101013_DECOY.fasta' -L'7'
> >> -E'stricttrypsin'"
> >>  file 1: 475_pepxml.xml
> >>  processed altogether 116 results
>
> >>  results written to file /var/www/ISB/data/Projects/Test/interact.pep.shtml
>
> >> command completed in 1 sec
>
> >> running: "/usr/local/tpp-4-4-0/bin/PeptideProphetParser
> >> 'interact.pep.xml' MINPROB=0 DECOYPROBS NONPARAM DECOY=decoy"
> >> Using Decoy Label "decoy".
> >> Decoy Probabilities will be reported.
> >> Using non-parametric distributions
> >>  (PHENYX) (minprob 0)
> >> WARNING!! The discriminant function for Phenyx is not yet complete.
> >> It is presented here to help facilitate trial and discussion.
> >> Reliance on this code for publishable scientific results is not
> >> recommended.
> >> init with PHENYX stricttrypsin
> >> MS Instrument info: Manufacturer: ThermoFinnigan, Model: default,
> >> Ionization: FIXME, Analyzer: FIXME, Detector: FIXME
>
> >>  PeptideProphet  (TPP v4.4 VUVUZELA rev 0, Build 201010010955 (linux))
> >> akel...@isb
> >>  read in 0 1+, 0 2+, 0 3+, 0 4+, 0 5+, 0 6+, and 0 7+ spectra.
> >>  read in no data
>
> >> command "/usr/local/tpp-4-4-0/bin/PeptideProphetParser
> >> 'interact.pep.xml' MINPROB=0 DECOYPROBS NONPARAM DECOY=decoy" exited
> >> with non-zero exit code: 256
> >> QUIT - the job is incomplete
>
> >> Thanks for your help
> >> Ira
>
> >> On Nov 10, 3:08 am, David Shteynberg <[email protected]>
> >> wrote:
> >>> It would be helpful to see all of the output from the latest command.
> >>> You might have to unravel the pipeline and run the steps separately if
> >>> the Phenyx pepXML is missing some information.  Would it be possible
> >>> for you to post your Phenyx pepXML file and the database so I can try it
> >>> in a debugger?
>
> >>> Thanks,
> >>> -David
>
> >>> On Mon, Nov 8, 2010 at 6:19 PM, ira cooke <[email protected]> wrote:
> >>> > Thanks for your quick response.
> >>> > I've modified my command to
>
> >>> > /usr/local/tpp-4-4-0/bin/xinteract  -Ninteract.pep.xml -p0 -eT -l7
> >>> > -D/var/www/ISB/data/Databases/OnMascot/SPHuman/sphuman_20101013_DECOY.fasta
> >>> > -OdP -ddecoy 206_pepxml.xml
>
> >>> > (also tried without the -Od option).
>
> >>> > Unfortunately I still get the same error.
> >>> > I've checked my 206_pepxml.xml file .. and the decoys are named as
> >>> > follows
>
> >>> > protein="decoy_9817"
>
> >>> > Is it possible that the error is related to not having the raw
> >>> > spectra? Or should I not worry about that?
>
> >>> > Thanks for your help.
>
> >>> > On Nov 9, 11:06 am, David Shteynberg <[email protected]>
> >>> > wrote:
> >>> >> Phenyx search results can only be processed with the semi-parametric
> >>> >> modeling based on decoys.  Judging from the name of your database, it
> >>> >> does have decoys in there.  You must tell PeptideProphet to use the
> >>> >> semi-parametric model with the xinteract option -OP, and give it the
> >>> >> tag that all your decoy proteins begin with using the xinteract flag
> >>> >> -d, e.g. -dDECOY if all your decoy proteins begin with DECOY.  You can
> >>> >> also use option -Od to have PeptideProphet assign non-zero
> >>> >> probabilities to the decoy hits.
>
> >>> >> -David
>
> >>> >> On Mon, Nov 8, 2010 at 2:36 PM, ira cooke <[email protected]> wrote:
> >>> >> > Hi,
>
> >>> >> > I've been struggling to run PeptideProphet on phenyx generated pepXML
> >>> >> > files.
>
> >>> >> > The error I get from PeptideProphet is "read in no data".  Full output
> >>> >> > from the tool is as follows:
>
> >>> >> > ------------------ <BEGIN COMMANDLINE OUTPUT>-----------------
> >>> >> > /usr/local/tpp-4-4-0/bin/xinteract  -Ninteract.pep.xml -p0 -eT -l7
> >>> >> > -D/var/www/ISB/data/Databases/OnMascot/SPHuman/sphuman_20101013_DECOY.fasta
> >>> >> > 206_pepxml.xml
>
> >>> >> > /usr/local/tpp-4-4-0/bin/xinteract (TPP v4.4 VUVUZELA rev 0, Build
> >>> >> > 201010010955 (linux))
>
> >>> >> > running: "/usr/local/tpp-4-4-0/bin/InteractParser 'interact.pep.xml'
> >>> >> > '206_pepxml.xml' -D'/var/www/ISB/data/Databases/OnMascot/SPHuman/
> >>> >> > sphuman_20101013_DECOY.fasta' -L'7' -E'trypsin'"
> >>> >> >  file 1: 206_pepxml.xml
> >>> >> >  processed altogether 182 results
>
> >>> >> >  results written to file /var/www/ISB/data/Projects/TRegs/SP/Phenyx/
> >>> >> > interact.pep.shtml
>
> >>> >> > command completed in 3 sec
>
> >>> >> > running: "/usr/local/tpp-4-4-0/bin/PeptideProphetParser
> >>> >> > 'interact.pep.xml' MINPROB=0"
> >>> >> >  (PHENYX) (minprob 0)
> >>> >> > WARNING!! The discriminant function for Phenyx is not yet complete.
> >>> >> > It is presented here to help facilitate trial and discussion.
> >>> >> > Reliance on this code for publishable scientific results is not
> >>> >> > recommended.
> >>> >> > init with PHENYX trypsin
> >>> >> > MS Instrument info: Manufacturer: ThermoFinnigan, Model: default,
> >>> >> > Ionization: FIXME, Analyzer: FIXME, Detector: FIXME
>
> >>> >> >  PeptideProphet  (TPP v4.4 VUVUZELA rev 0, Build 201010010955 (linux))
> >>> >> > akel...@isb
> >>> >> >  read in 0 1+, 0 2+, 0 3+, 0 4+, 0 5+, 0 6+, and 0 7+ spectra.
> >>> >> >  read in no data
>
> >>> >> > command "/usr/local/tpp-4-4-0/bin/PeptideProphetParser
> >>> >> > 'interact.pep.xml' MINPROB=0" exited with non-zero exit code: 256
> >>> >> > QUIT - the job is incomplete
>
> >>> >> > ------------------ <END COMMANDLINE OUTPUT>-----------------
>
> >>> >> > Note that without the -eT option I get a crash (segmentation fault).
> >>> >> > The enzyme specified inside the phenyx pepXML is
>
> >>> >> > <sample_enzyme name="Trypsin_(KR_noP)">
> >>> >> > </sample_enzyme>
>
> >>> >> > (to be honest I'm not sure if this means trypsin or stricttrypsin ...
> >>> >> > but that's probably another issue as I get the error with -eS option
> >>> >> > as well).
>
> >>> >> > Could this error be caused by a lack of raw data files?  I ran the
> >>> >> > phenyx searches on another computer and the file contains paths to raw
> >>> >> > data on that computer.  I did try fixing the paths (and copying the
> >>> >> > raw data to a TPP accessible location) ... but that didn't work
> >>> >> > either.
>
> >>> >> > I guess my question is what does "read in no data" mean?  Does data
> >>> >> > refer to the original spectra, or does it refer to something in the
> >>> >> > output of InteractParser (i.e. interact.pep.xml)?
>
> >>> >> > Any help at all on this issue would be much appreciated
>
> >>> >> > Thanks
>
> >>> >> > --
> >>> >> > You received this message because you are subscribed to the Google 
> >>> >> > Groups "spctools-discuss" group.
> >>> >> > To post to this group, send email to 
> >>> >> > [email protected].
> >>> >> > To unsubscribe from this group, send email to 
> >>> >> > [email protected].
> >>> >> > For more options, visit this group
>

