Hello again,

I realized you had sent me the db, and each protein is listed only once
in it, so the problem is that the Phenyx pepXML dumper is writing these
incorrect tags.  As a temporary workaround, I removed these entries using
the sed command: sed -i 's/^<alternative_protein.*$//g'
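
For anyone without sed, the same cleanup can be sketched in Python. This
is only an illustration: it assumes, as the sed pattern does, that each
alternative_protein element sits alone on a line that begins with the
tag, and the function name strip_alternative_proteins is made up for this
example. Unlike the sed command, which blanks the matching lines, this
drops them entirely.

```python
def strip_alternative_proteins(text: str) -> str:
    """Drop every line that starts with an <alternative_protein ...> tag.

    Assumes each tag sits alone on a line starting in column 0, the same
    assumption the sed pattern ^<alternative_protein makes.
    """
    kept = [
        line
        for line in text.splitlines(keepends=True)
        if not line.startswith("<alternative_protein")
    ]
    return "".join(kept)


# Small sample modelled on the search_hit shown later in this thread.
sample = (
    '<search_score name="zscore" value="3.98486"/>\n'
    '<alternative_protein protein="sp_human%decoy_12505"/>\n'
    "</search_hit>\n"
)
cleaned = strip_alternative_proteins(sample)
```

To apply it to a real file, read the file contents, pass them through the
function, and write the result to a new file so the original is kept as a
backup.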

Then I processed the file with PeptideProphetParser using the options
MINPROB=0 DECOYPROBS NONPARAM DECOY=decoy.  Your PeptideProphet-processed
file is posted here:

http://groups.google.com/group/spctools-discuss/web/interact.pep.xml4Ira

For the time being, please run RefreshParser as a separate step to map
the identified peptides to the proteins in the database.
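
To make the mapping step concrete, here is a rough sketch of what
matching identified peptides against a protein database involves,
together with the decoy rule described in the quoted message below (a hit
counts as a decoy only if every protein containing the peptide begins
with the decoy tag). This is not the actual RefreshParser or
PeptideProphet code; the function names and the tiny in-memory database
are hypothetical, and the sketch ignores enzyme specificity.

```python
from __future__ import annotations


def proteins_containing(peptide: str, database: dict[str, str]) -> list[str]:
    """Return the names of all database proteins whose sequence contains the peptide."""
    return [name for name, seq in database.items() if peptide in seq]


def is_decoy_hit(protein_names: list[str], decoy_tag: str = "decoy") -> bool:
    """A hit is a decoy only if every matched protein name begins with the tag."""
    return bool(protein_names) and all(
        name.startswith(decoy_tag) for name in protein_names
    )


# Hypothetical toy database: protein name -> sequence.
database = {
    "decoy_1": "MKQVFKAAAR",
    "decoy_2": "GGQVFKLLK",
    "sp_human_3": "MTTPEPTIDEK",
}

# Both decoy entries contain the peptide, so the hit is a decoy.
matched = proteins_containing("QVFK", database)

# The situation from this thread: the alternative_protein name does not
# begin with "decoy", so the hit is no longer treated as a decoy.
thread_case = is_decoy_hit(["decoy_12505", "sp_human%decoy_12505"])
```

With the extra sp_human%decoy_12505 entry removed, the only remaining
protein name starts with the decoy tag and the hit is classified as a
decoy again.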


My code changes will be committed later today.

-David




On Wed, Nov 10, 2010 at 12:35 PM, David Shteynberg
<[email protected]> wrote:
> Hi Ira,
>
> Thanks for providing your Phenyx search results.  I was able to
> resolve a bug in PeptideProphet, which was preventing it from reading
> in any Phenyx search results.  At this point I still cannot process
> these files with PeptideProphet because of the alternative_protein
> entries.  Somehow, the Phenyx pepXML exporter is writing each protein
> twice (once listing it as an alternative protein), as follows:
>
> <search_hit hit_rank="1" peptide="QVFKQFENYVMQFNFPEEEYIDNLHK"
> peptide_prev_aa="R" peptide_next_aa="M" protein="decoy_12505"
> num_tot_proteins="1" num_matched_ions="3" tot_num_ions="153"
> calc_neutral_pep_mass="3335.559715" massdiff="0.1563615"
> num_tol_term="2" num_missed_cleavages="1" is_rejected="0">
> <modification_info modified_peptide="QVFKQFENYVMQFNFPEEEYIDNLHK">
> </modification_info>
> <search_score name="zscore" value="3.98486"/>
> <search_score name="zvalue" value="3.375993e-05"/>
> <search_score name="origScore" value="-30.25199"/>
> <alternative_protein protein="sp_human%decoy_12505"/>
> </search_hit>
>
> Do you really have each protein listed twice in the database, or does
> this point to a bug in the Phenyx pepXML dump?  As far as
> PeptideProphet is concerned, the second entry <alternative_protein
> protein="sp_human%decoy_12505"/> is not a decoy since its name doesn't
> begin with "decoy", and PeptideProphet doesn't consider a hit a decoy
> when at least one of the matched proteins in the DB (i.e. one that
> contains that peptide sequence) is not a decoy.
>
> I think if this second issue can be resolved I will be able to compile
> a binary PeptideProphetParser that will not choke on this data.
>
> Cheers,
> -David
>
>
>
>
>
> On Tue, Nov 9, 2010 at 6:31 PM, ira cooke <[email protected]> wrote:
>> Hi David,
>> I've uploaded the database and original file .. the file is called
>> iracooke_phenyx_pepXML.tar.gz
>>
>> The files I've uploaded are all unmodified output from the search
>> (performed on a different machine to where the TPP is installed). I'd
>> previously tried changing some of the paths in the files to try and
>> fix them, but had no success.
>>
>> Also, the output of the commands I'm running is;
>>
>> /usr/local/tpp-4-4-0/bin/xinteract  -Ninteract.pep.xml -p0 -eS -l7
>> -D'sphuman_20101013_DECOY.fasta' -OdP -ddecoy 475_pepxml.xml
>>
>> /usr/local/tpp-4-4-0/bin/xinteract (TPP v4.4 VUVUZELA rev 0, Build
>> 201010010955 (linux))
>>
>> running: "/usr/local/tpp-4-4-0/bin/InteractParser 'interact.pep.xml'
>> '475_pepxml.xml' -D'sphuman_20101013_DECOY.fasta' -L'7'
>> -E'stricttrypsin'"
>>  file 1: 475_pepxml.xml
>>  processed altogether 116 results
>>
>>
>>  results written to file
>> /var/www/ISB/data/Projects/Test/interact.pep.shtml
>>
>>
>>
>> command completed in 1 sec
>>
>> running: "/usr/local/tpp-4-4-0/bin/PeptideProphetParser
>> 'interact.pep.xml' MINPROB=0 DECOYPROBS NONPARAM DECOY=decoy"
>> Using Decoy Label "decoy".
>> Decoy Probabilities will be reported.
>> Using non-parametric distributions
>>  (PHENYX) (minprob 0)
>> WARNING!! The discriminant function for Phenyx is not yet complete.
>> It is presented here to help facilitate trial and discussion.
>> Reliance on this code for publishable scientific results is not
>> recommended.
>> init with PHENYX stricttrypsin
>> MS Instrument info: Manufacturer: ThermoFinnigan, Model: default,
>> Ionization: FIXME, Analyzer: FIXME, Detector: FIXME
>>
>>  PeptideProphet  (TPP v4.4 VUVUZELA rev 0, Build 201010010955 (linux))
>> akel...@isb
>>  read in 0 1+, 0 2+, 0 3+, 0 4+, 0 5+, 0 6+, and 0 7+ spectra.
>>  read in no data
>>
>> command "/usr/local/tpp-4-4-0/bin/PeptideProphetParser
>> 'interact.pep.xml' MINPROB=0 DECOYPROBS NONPARAM DECOY=decoy" exited
>> with non-zero exit code: 256
>> QUIT - the job is incomplete
>>
>>
>> Thanks for your help
>> Ira
>>
>>
>>
>> On Nov 10, 3:08 am, David Shteynberg <[email protected]>
>> wrote:
>>> It would be helpful to see all of the output from the latest command.
>>> You might have to unravel the pipeline and run the steps separately if
>>> the Phenyx pepXML is missing some information.  Would it be possible
>>> for you to post your Phenyx pepXML file and the database so I can try
>>> them in a debugger?
>>>
>>> Thanks,
>>> -David
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>> On Mon, Nov 8, 2010 at 6:19 PM, ira cooke <[email protected]> wrote:
>>> > Thanks for your quick response.
>>> > I've modified my command to
>>>
>>> > /usr/local/tpp-4-4-0/bin/xinteract  -Ninteract.pep.xml -p0 -eT -l7
>>> > -D/var/www/ISB/data/Databases/OnMascot/SPHuman/sphuman_20101013_DECOY.fasta
>>> > -OdP -ddecoy 206_pepxml.xml
>>>
>>> > (also tried without the -Od option).
>>>
>>> > Unfortunately I still get the same error.
>>> > I've checked my 206_pepxml.xml file .. and the decoys are named as
>>> > follows
>>>
>>> > protein="decoy_9817"
>>>
>>> > Is it possible that the error is related to not having the raw
>>> > spectra? Or should I not worry about that?
>>>
>>> > Thanks for your help.
>>>
>>> > On Nov 9, 11:06 am, David Shteynberg <[email protected]>
>>> > wrote:
>>> >> Phenyx search results can only be processed with the semi-parametric
>>> >> modeling based on decoys.  Judging from the name of your database, it
>>> >> does have decoys in it.  Now you must tell PeptideProphet to use the
>>> >> semi-parametric model with xinteract option -OP, and give it the tag
>>> >> that all your decoy protein names begin with using the xinteract -d
>>> >> flag, e.g. -dDECOY if all your decoy proteins begin with DECOY.  You
>>> >> can also use option -Od to have PeptideProphet assign non-zero
>>> >> probabilities to the decoy hits.
>>>
>>> >> -David
>>>
>>> >> On Mon, Nov 8, 2010 at 2:36 PM, ira cooke <[email protected]> 
>>> >> wrote:
>>> >> > Hi,
>>>
>>> >> > I've been struggling to run PeptideProphet on Phenyx-generated pepXML
>>> >> > files.
>>>
>>> >> > The error I get from PeptideProphet is "read in no data".  Full output
>>> >> > from the tool is as follows;
>>>
>>> >> > ------------------ <BEGIN COMMANDLINE OUTPUT>-----------------
>>> >> > /usr/local/tpp-4-4-0/bin/xinteract  -Ninteract.pep.xml -p0 -eT -l7
>>> >> > -D/var/www/ISB/data/Databases/OnMascot/SPHuman/sphuman_20101013_DECOY.fasta
>>> >> > 206_pepxml.xml
>>>
>>> >> > /usr/local/tpp-4-4-0/bin/xinteract (TPP v4.4 VUVUZELA rev 0, Build
>>> >> > 201010010955 (linux))
>>>
>>> >> > running: "/usr/local/tpp-4-4-0/bin/InteractParser 'interact.pep.xml'
>>> >> > '206_pepxml.xml'
>>> >> > -D'/var/www/ISB/data/Databases/OnMascot/SPHuman/sphuman_20101013_DECOY.fasta'
>>> >> > -L'7' -E'trypsin'"
>>> >> >  file 1: 206_pepxml.xml
>>> >> >  processed altogether 182 results
>>>
>>> >> >  results written to file
>>> >> > /var/www/ISB/data/Projects/TRegs/SP/Phenyx/interact.pep.shtml
>>>
>>> >> > command completed in 3 sec
>>>
>>> >> > running: "/usr/local/tpp-4-4-0/bin/PeptideProphetParser
>>> >> > 'interact.pep.xml' MINPROB=0"
>>> >> >  (PHENYX) (minprob 0)
>>> >> > WARNING!! The discriminant function for Phenyx is not yet complete.
>>> >> > It is presented here to help facilitate trial and discussion.
>>> >> > Reliance on this code for publishable scientific results is not
>>> >> > recommended.
>>> >> > init with PHENYX trypsin
>>> >> > MS Instrument info: Manufacturer: ThermoFinnigan, Model: default,
>>> >> > Ionization: FIXME, Analyzer: FIXME, Detector: FIXME
>>>
>>> >> >  PeptideProphet  (TPP v4.4 VUVUZELA rev 0, Build 201010010955 (linux))
>>> >> > akel...@isb
>>> >> >  read in 0 1+, 0 2+, 0 3+, 0 4+, 0 5+, 0 6+, and 0 7+ spectra.
>>> >> >  read in no data
>>>
>>> >> > command "/usr/local/tpp-4-4-0/bin/PeptideProphetParser
>>> >> > 'interact.pep.xml' MINPROB=0" exited with non-zero exit code: 256
>>> >> > QUIT - the job is incomplete
>>>
>>> >> > ------------------ <END COMMANDLINE OUTPUT>-----------------
>>>
>>> >> > Note that without the -eT option I get a crash (segmentation fault).
>>> >> > The enzyme specified inside the phenyx pepXML is
>>>
>>> >> > <sample_enzyme name="Trypsin_(KR_noP)">
>>> >> > </sample_enzyme>
>>>
>>> >> > (to be honest I'm not sure if this means trypsin or stricttrypsin ...
>>> >> > but that's probably another issue as I get the error with -eS option
>>> >> > as well).
>>>
>>> >> > Could this error be caused by a lack of raw data files?  I ran the
>>> >> > phenyx searches on another computer and the file contains paths to raw
>>> >> > data on that computer.  I did try fixing the paths (and copying the
>>> >> > raw data to a TPP accessible location) ... but that didn't work
>>> >> > either.
>>>
>>> >> > I guess my question is what does "read in no data" mean?  Does data
>>> >> > refer to the original spectra, or does it refer to something in the
>>> >> > output of InteractParser (i.e. interact.pep.xml)?
>>>
>>> >> > Any help at all on this issue would be much appreciated
>>>
>>> >> > Thanks
>>>
>>> >> > --
>>> >> > You received this message because you are subscribed to the Google 
>>> >> > Groups "spctools-discuss" group.
>>> >> > To post to this group, send email to [email protected].
>>> >> > To unsubscribe from this group, send email to 
>>> >> > [email protected].
>>> >> > For more options, visit this group at
>>> >> > http://groups.google.com/group/spctools-discuss?hl=en.
>>>
>>
>>
>>
>
