Hi Ira,

Thanks for providing your Phenyx search results.  I was able to
resolve a bug in PeptideProphet, which was preventing it from reading
in any Phenyx search results.  At this point I still cannot process
these files with PeptideProphet because of the alternative_protein
entries.  Somehow, the Phenyx pepXML exporter is writing each protein
twice (once listing it as an alternative protein), as follows:

<search_hit hit_rank="1" peptide="QVFKQFENYVMQFNFPEEEYIDNLHK"
peptide_prev_aa="R" peptide_next_aa="M" protein="decoy_12505"
num_tot_proteins="1" num_matched_ions="3" tot_num_ions="153"
calc_neutral_pep_mass="3335.559715" massdiff="0.1563615"
num_tol_term="2" num_missed_cleavages="1" is_rejected="0">
<modification_info modified_peptide="QVFKQFENYVMQFNFPEEEYIDNLHK">
</modification_info>
<search_score name="zscore" value="3.98486"/>
<search_score name="zvalue" value="3.375993e-05"/>
<search_score name="origScore" value="-30.25199"/>
<alternative_protein protein="sp_human%decoy_12505"/>
</search_hit>
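
To see how widespread this is, something like the Python sketch below
might help.  It parses one search_hit and reports any
alternative_protein whose name, after dropping everything up to a "%",
equals the primary protein.  The "%" prefix convention is only my guess
from the single hit above, the helper names are made up for
illustration, and the hit is trimmed to the attributes that matter.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the hit shown above (only the relevant attributes kept).
HIT = """<search_hit hit_rank="1" peptide="QVFKQFENYVMQFNFPEEEYIDNLHK"
 protein="decoy_12505" num_tot_proteins="1">
<alternative_protein protein="sp_human%decoy_12505"/>
</search_hit>"""


def local_name(tag):
    """Drop an XML namespace prefix such as '{uri}search_hit', if present."""
    return tag.rsplit("}", 1)[-1]


def redundant_alternatives(hit):
    """Return alternative_protein names that repeat the primary protein.

    Assumption: the exporter prepends '<database>%' to the same accession,
    so the text after the last '%' is compared with the primary name.
    """
    primary = hit.get("protein")
    repeats = []
    for child in hit:
        if local_name(child.tag) != "alternative_protein":
            continue
        name = child.get("protein", "")
        if name.rsplit("%", 1)[-1] == primary:
            repeats.append(name)
    return repeats


hit = ET.fromstring(HIT)
dupes = redundant_alternatives(hit)
print(dupes)
```

Run over every search_hit in the file, an empty result would suggest the
database itself is fine and the duplication comes from the export step.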

Do you really have each protein listed twice in the database, or does
this point to a bug in the Phenyx pepXML export?  As far as
PeptideProphet is concerned, the second entry <alternative_protein
protein="sp_human%decoy_12505"/> is not a decoy because its name doesn't
begin with "decoy", and PeptideProphet doesn't consider a hit a decoy
when at least one of the matched proteins in the DB (that contains that
peptide sequence) is not a decoy.
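
In case it helps to see the rule spelled out, here is a small Python
sketch of the logic as I described it.  This is not the actual
PeptideProphet source; the function name and the default prefix are
just for illustration.

```python
def is_decoy_hit(proteins, decoy_prefix="decoy"):
    """A hit counts as a decoy only if every matched protein name
    starts with the decoy prefix (illustrative rule, not TPP code)."""
    proteins = list(proteins)
    return bool(proteins) and all(p.startswith(decoy_prefix) for p in proteins)


# Hit as exported: primary protein plus the prefixed alternative entry.
as_exported = is_decoy_hit(["decoy_12505", "sp_human%decoy_12505"])

# Same hit if the alternative_protein entry were absent.
without_alternative = is_decoy_hit(["decoy_12505"])

print(as_exported, without_alternative)
```

With the alternative entry present the hit is not treated as a decoy,
since "sp_human%decoy_12505" starts with "sp_human" rather than "decoy".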

I think if this second issue can be resolved I will be able to compile
a binary PeptideProphetParser that will not choke on this data.

Cheers,
-David





On Tue, Nov 9, 2010 at 6:31 PM, ira cooke <iraco...@googlemail.com> wrote:
> Hi David,
> I've uploaded the database and original file .. the file is called
> iracooke_phenyx_pepXML.tar.gz
>
> The files I've uploaded are all unmodified output from the search
> (performed on a different machine to where the TPP is installed). I'd
> previously tried changing some of the paths in the files to try and
> fix them, but had no success.
>
> Also, the output of the commands I'm running is;
>
> /usr/local/tpp-4-4-0/bin/xinteract  -Ninteract.pep.xml -p0 -eS -l7 -D'sphuman_20101013_DECOY.fasta' -OdP -ddecoy 475_pepxml.xml
>
> /usr/local/tpp-4-4-0/bin/xinteract (TPP v4.4 VUVUZELA rev 0, Build
> 201010010955 (linux))
>
> running: "/usr/local/tpp-4-4-0/bin/InteractParser 'interact.pep.xml' '475_pepxml.xml' -D'sphuman_20101013_DECOY.fasta' -L'7' -E'stricttrypsin'"
>  file 1: 475_pepxml.xml
>  processed altogether 116 results
>
>
>  results written to file /var/www/ISB/data/Projects/Test/interact.pep.shtml
>
>
>
> command completed in 1 sec
>
> running: "/usr/local/tpp-4-4-0/bin/PeptideProphetParser
> 'interact.pep.xml' MINPROB=0 DECOYPROBS NONPARAM DECOY=decoy"
> Using Decoy Label "decoy".
> Decoy Probabilities will be reported.
> Using non-parametric distributions
>  (PHENYX) (minprob 0)
> WARNING!! The discriminant function for Phenyx is not yet complete.
> It is presented here to help facilitate trial and discussion.
> Reliance on this code for publishable scientific results is not
> recommended.
> init with PHENYX stricttrypsin
> MS Instrument info: Manufacturer: ThermoFinnigan, Model: default,
> Ionization: FIXME, Analyzer: FIXME, Detector: FIXME
>
>  PeptideProphet  (TPP v4.4 VUVUZELA rev 0, Build 201010010955 (linux))
> akel...@isb
>  read in 0 1+, 0 2+, 0 3+, 0 4+, 0 5+, 0 6+, and 0 7+ spectra.
>  read in no data
>
> command "/usr/local/tpp-4-4-0/bin/PeptideProphetParser
> 'interact.pep.xml' MINPROB=0 DECOYPROBS NONPARAM DECOY=decoy" exited
> with non-zero exit code: 256
> QUIT - the job is incomplete
>
>
> Thanks for your help
> Ira
>
>
>
> On Nov 10, 3:08 am, David Shteynberg <dshteynb...@systemsbiology.org>
> wrote:
>> It would be helpful to see all of the output from the latest command.
>> You might have to unravel the pipeline and run the steps separately if
>> the Phenyx pepXML is missing some information.  Would it be possible
>> for you to post your Phenyx pepXML file and the database so I can try
>> it in a debugger?
>>
>> Thanks,
>> -David
>>
>>
>>
>>
>>
>>
>>
>> On Mon, Nov 8, 2010 at 6:19 PM, ira cooke <iraco...@googlemail.com> wrote:
>> > Thanks for your quick response.
>> > I've modified my command to
>>
>> > /usr/local/tpp-4-4-0/bin/xinteract  -Ninteract.pep.xml -p0 -eT -l7 -D/var/www/ISB/data/Databases/OnMascot/SPHuman/sphuman_20101013_DECOY.fasta -OdP -ddecoy 206_pepxml.xml
>>
>> > (also tried without the -Od option).
>>
>> > Unfortunately I still get the same error.
>> > I've checked my 206_pepxml.xml file .. and the decoys are named as
>> > follows
>>
>> > protein="decoy_9817"
>>
>> > Is it possible that the error is related to not having the raw
>> > spectra? Or should I not worry about that?
>>
>> > Thanks for your help.
>>
>> > On Nov 9, 11:06 am, David Shteynberg <dshteynb...@systemsbiology.org>
>> > wrote:
>> >> Phenyx search results can only be processed with the semi-parametric
>> >> modeling based on decoys.  Judging from the name of your database, it
>> >> does have decoys in there.  Now you must tell PeptideProphet to use
>> >> the semi-parametric model with xinteract option -OP, and give it the
>> >> tag that all your decoy protein names begin with using the xinteract
>> >> flag -d, e.g. -dDECOY if all your decoy proteins begin with DECOY.
>> >> You can also use option -Od to have PeptideProphet assign non-zero
>> >> probabilities to the decoy hits.
>>
>> >> -David
>>
>> >> On Mon, Nov 8, 2010 at 2:36 PM, ira cooke <iraco...@googlemail.com> wrote:
>> >> > Hi,
>>
>> >> > I've been struggling to run PeptideProphet on phenyx generated pepXML
>> >> > files.
>>
>> >> > The error I get from PeptideProphet is "read in no data".  Full output
>> >> > from the tool is as follows;
>>
>> >> > ------------------ <BEGIN COMMANDLINE OUTPUT>-----------------
>> >> > /usr/local/tpp-4-4-0/bin/xinteract  -Ninteract.pep.xml -p0 -eT -l7 -D/var/www/ISB/data/Databases/OnMascot/SPHuman/sphuman_20101013_DECOY.fasta 206_pepxml.xml
>>
>> >> > /usr/local/tpp-4-4-0/bin/xinteract (TPP v4.4 VUVUZELA rev 0, Build
>> >> > 201010010955 (linux))
>>
>> >> > running: "/usr/local/tpp-4-4-0/bin/InteractParser 'interact.pep.xml' '206_pepxml.xml' -D'/var/www/ISB/data/Databases/OnMascot/SPHuman/sphuman_20101013_DECOY.fasta' -L'7' -E'trypsin'"
>> >> >  file 1: 206_pepxml.xml
>> >> >  processed altogether 182 results
>>
>> >> >  results written to file /var/www/ISB/data/Projects/TRegs/SP/Phenyx/interact.pep.shtml
>>
>> >> > command completed in 3 sec
>>
>> >> > running: "/usr/local/tpp-4-4-0/bin/PeptideProphetParser
>> >> > 'interact.pep.xml' MINPROB=0"
>> >> >  (PHENYX) (minprob 0)
>> >> > WARNING!! The discriminant function for Phenyx is not yet complete.
>> >> > It is presented here to help facilitate trial and discussion.
>> >> > Reliance on this code for publishable scientific results is not
>> >> > recommended.
>> >> > init with PHENYX trypsin
>> >> > MS Instrument info: Manufacturer: ThermoFinnigan, Model: default,
>> >> > Ionization: FIXME, Analyzer: FIXME, Detector: FIXME
>>
>> >> >  PeptideProphet  (TPP v4.4 VUVUZELA rev 0, Build 201010010955 (linux))
>> >> > akel...@isb
>> >> >  read in 0 1+, 0 2+, 0 3+, 0 4+, 0 5+, 0 6+, and 0 7+ spectra.
>> >> >  read in no data
>>
>> >> > command "/usr/local/tpp-4-4-0/bin/PeptideProphetParser
>> >> > 'interact.pep.xml' MINPROB=0" exited with non-zero exit code: 256
>> >> > QUIT - the job is incomplete
>>
>> >> > ------------------ <END COMMANDLINE OUTPUT>-----------------
>>
>> >> > Note that without the -eT option I get a crash (segmentation fault).
>> >> > The enzyme specified inside the phenyx pepXML is
>>
>> >> > <sample_enzyme name="Trypsin_(KR_noP)">
>> >> > </sample_enzyme>
>>
>> >> > (to be honest I'm not sure if this means trypsin or stricttrypsin ...
>> >> > but that's probably another issue as I get the error with -eS option
>> >> > as well).
>>
>> >> > Could this error be caused by a lack of raw data files?  I ran the
>> >> > phenyx searches on another computer and the file contains paths to raw
>> >> > data on that computer.  I did try fixing the paths (and copying the
>> >> > raw data to a TPP accessible location) ... but that didn't work
>> >> > either.
>>
>> >> > I guess my question is what does "read in no data" mean?  Does data
>> >> > refer to the original spectra, or does it refer to something in the
>> >> > output of InteractParser (ie interact.pep.xml).
>>
>> >> > Any help at all on this issue would be much appreciated
>>
>> >> > Thanks
>>
>> >> > --
>> >> > You received this message because you are subscribed to the Google 
>> >> > Groups "spctools-discuss" group.
>> >> > To post to this group, send email to spctools-disc...@googlegroups.com.
>> >> > To unsubscribe from this group, send email to 
>> >> > spctools-discuss+unsubscr...@googlegroups.com.
>> >> > For more options, visit this group at
>> >> > http://groups.google.com/group/spctools-discuss?hl=en.
>>
>

