I just committed the change.  Please update from trunk, recompile
PeptideProphetParser and try it again.

Thanks,
-David

On Wed, Nov 10, 2010 at 2:12 PM, ira cooke <[email protected]> wrote:
> Hi David,
> Thanks very much for fixing this.  Would you recommend that we check
> out the latest source and just recompile PeptideProphet?  I guess this
> would be safer than upgrading our whole TPP install to the latest
> code?
>
> We'll contact Phenyx support about the pepXML dump and hopefully they
> can fix that also.  For now we can just clean the files.
>
> Ira
>
>
>
> On Nov 11, 7:51 am, David Shteynberg <[email protected]>
> wrote:
>> Hello again,
>>
>> I realized that in the db you sent me each protein is listed only once,
>> so the problem is the Phenyx pepXML dumper writing these incorrect
>> tags.  As a temporary workaround, I removed these entries with the sed
>> command: sed -i 's/^<alternative_protein.*$//g'
>>
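The same cleanup can be sketched in Python for anyone without GNU sed. This is only an illustration. It assumes, based on the sample hit quoted further down, that every alternative_protein element sits on its own line starting in the first column. strip_alternative_proteins is a made-up helper name, not part of the TPP.

```python
import re


def strip_alternative_proteins(pepxml_text: str) -> str:
    """Blank out every line that starts with <alternative_protein.

    Mirrors the sed expression 's/^<alternative_protein.*$//g': the line's
    content is removed but the line break itself is kept.
    """
    return re.sub(r"^<alternative_protein.*$", "", pepxml_text, flags=re.MULTILINE)
```

To use it, read the pepXML file into a string, pass it through the function, and write the result back out. Keep a copy of the original file first.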
>> Then I processed the file with PeptideProphetParser and options
>> MINPROB=0 DECOYPROBS NONPARAM DECOY=decoy .  Your PeptideProphet
>> processed file is posted here:
>>
>> http://groups.google.com/group/spctools-discuss/web/interact.pep.xml4Ira
>>
>> For the time being, please run RefreshParser as a separate step to map
>> the IDed peptides to the proteins in the database.
>>
>> My code changes will be committed later today.
>>
>> -David
>>
>> On Wed, Nov 10, 2010 at 12:35 PM, David Shteynberg <[email protected]> wrote:
>> > Hi Ira,
>>
>> > Thanks for providing your Phenyx search results.  I was able to
>> > resolve a bug in PeptideProphet, which was preventing it from reading
>> > in any Phenyx search results.  At this point I still cannot process
>> > these files with PeptideProphet because of the alternative_protein
>> > entries.  Somehow, the Phenyx pepXML exporter is writing each protein
>> > twice (once listing it as an alternative protein), as follows:
>>
>> > <search_hit hit_rank="1" peptide="QVFKQFENYVMQFNFPEEEYIDNLHK"
>> > peptide_prev_aa="R" peptide_next_aa="M" protein="decoy_12505"
>> > num_tot_proteins="1" num_matched_ions="3" tot_num_ions="153"
>> > calc_neutral_pep_mass="3335.559715" massdiff=
>> > "0.1563615" num_tol_term="2" num_missed_cleavages="1" is_rejected="0">
>> > <modification_info modified_peptide="QVFKQFENYVMQFNFPEEEYIDNLHK">
>> > </modification_info>
>> > <search_score name="zscore" value="3.98486"/>
>> > <search_score name="zvalue" value="3.375993e-05"/>
>> > <search_score name="origScore" value="-30.25199"/>
>> > <alternative_protein protein="sp_human%decoy_12505"/>
>> > </search_hit>
>>
>> > Do you really have each protein listed twice in the database or is
>> > this pointing to a bug in the Phenyx pepXML dump?  As far as
>> > PeptideProphet is concerned the second entry <alternative_protein
>> > protein="sp_human%decoy_12505"/> is not a DECOY since it doesn't begin
>> > with "decoy",  and PeptideProphet doesn't consider any hit a decoy
>> > when at least one of the matched proteins in the DB (that contains that
>> > peptide sequence) is not a DECOY.
>>
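The decoy rule described above can be sketched in a few lines of Python. This only illustrates the rule as worded in this message and is not PeptideProphet's actual code. The function name is_decoy_hit and the treatment of an empty protein list are assumptions.

```python
def is_decoy_hit(protein_names, decoy_prefix="decoy"):
    """Return True only if every matched protein name starts with the decoy prefix.

    A single non-decoy protein (for example an alternative_protein entry
    whose name starts with "sp_human%") makes the whole hit a non-decoy.
    An empty list is treated as a non-decoy (an assumption of this sketch).
    """
    names = list(protein_names)
    return bool(names) and all(name.startswith(decoy_prefix) for name in names)
```

With the search_hit shown above, the protein list is ["decoy_12505", "sp_human%decoy_12505"], so the hit is not counted as a decoy even though both names refer to the same decoy entry.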
>> > I think if this second issue can be resolved I will be able to compile
>> > a binary PeptideProphetParser that will not choke on this data.
>>
>> > Cheers,
>> > -David
>>
>> > On Tue, Nov 9, 2010 at 6:31 PM, ira cooke <[email protected]> wrote:
>> >> Hi David,
>> >> I've uploaded the database and original file .. the file is called
>> >> iracooke_phenyx_pepXML.tar.gz
>>
>> >> The files I've uploaded are all unmodified output from the search
>> >> (performed on a different machine to where the TPP is installed). I'd
>> >> previously tried changing some of the paths in the files to try and
>> >> fix them, but had no success.
>>
>> >> Also, the output of the commands I'm running is;
>>
>> >> /usr/local/tpp-4-4-0/bin/xinteract -Ninteract.pep.xml -p0 -eS -l7 -D'sphuman_20101013_DECOY.fasta' -OdP -ddecoy 475_pepxml.xml
>>
>> >> /usr/local/tpp-4-4-0/bin/xinteract (TPP v4.4 VUVUZELA rev 0, Build
>> >> 201010010955 (linux))
>>
>> >> running: "/usr/local/tpp-4-4-0/bin/InteractParser 'interact.pep.xml' '475_pepxml.xml' -D'sphuman_20101013_DECOY.fasta' -L'7' -E'stricttrypsin'"
>> >>  file 1: 475_pepxml.xml
>> >>  processed altogether 116 results
>>
>> >>  results written to file /var/www/ISB/data/Projects/Test/interact.pep.shtml
>>
>> >> command completed in 1 sec
>>
>> >> running: "/usr/local/tpp-4-4-0/bin/PeptideProphetParser
>> >> 'interact.pep.xml' MINPROB=0 DECOYPROBS NONPARAM DECOY=decoy"
>> >> Using Decoy Label "decoy".
>> >> Decoy Probabilities will be reported.
>> >> Using non-parametric distributions
>> >>  (PHENYX) (minprob 0)
>> >> WARNING!! The discriminant function for Phenyx is not yet complete.
>> >> It is presented here to help facilitate trial and discussion.
>> >> Reliance on this code for publishable scientific results is not
>> >> recommended.
>> >> init with PHENYX stricttrypsin
>> >> MS Instrument info: Manufacturer: ThermoFinnigan, Model: default,
>> >> Ionization: FIXME, Analyzer: FIXME, Detector: FIXME
>>
>> >>  PeptideProphet  (TPP v4.4 VUVUZELA rev 0, Build 201010010955 (linux))
>> >> akel...@isb
>> >>  read in 0 1+, 0 2+, 0 3+, 0 4+, 0 5+, 0 6+, and 0 7+ spectra.
>> >>  read in no data
>>
>> >> command "/usr/local/tpp-4-4-0/bin/PeptideProphetParser
>> >> 'interact.pep.xml' MINPROB=0 DECOYPROBS NONPARAM DECOY=decoy" exited
>> >> with non-zero exit code: 256
>> >> QUIT - the job is incomplete
>>
>> >> Thanks for your help
>> >> Ira
>>
>> >> On Nov 10, 3:08 am, David Shteynberg <[email protected]>
>> >> wrote:
>> >>> It would be helpful to see all of the output from the latest command.
>> >>> You might have to unravel the pipeline and run the steps separately if
>> >>> the Phenyx pepXML is missing some information.  Would it be possible
>> >>> for you to post your Phenyx pepXML file and the database so I can try it
>> >>> in a debugger?
>>
>> >>> Thanks,
>> >>> -David
>>
>> >>> On Mon, Nov 8, 2010 at 6:19 PM, ira cooke <[email protected]> wrote:
>> >>> > Thanks for your quick response.
>> >>> > I've modified my command to
>>
>> >>> > /usr/local/tpp-4-4-0/bin/xinteract -Ninteract.pep.xml -p0 -eT -l7 -D/var/www/ISB/data/Databases/OnMascot/SPHuman/sphuman_20101013_DECOY.fasta -OdP -ddecoy 206_pepxml.xml
>>
>> >>> > (also tried without the -Od option).
>>
>> >>> > Unfortunately I still get the same error.
>> >>> > I've checked my 206_pepxml.xml file .. and the decoys are named as
>> >>> > follows
>>
>> >>> > protein="decoy_9817"
>>
>> >>> > Is it possible that the error is related to not having the raw
>> >>> > spectra? Or should I not worry about that?
>>
>> >>> > Thanks for your help.
>>
>> >>> > On Nov 9, 11:06 am, David Shteynberg <[email protected]>
>> >>> > wrote:
>> >>> >> Phenyx search results can only be processed with the semi-parametric
>> >>> >> modeling based on decoys.  Judging from the name of your database, it
>> >>> >> does have decoys in there.  Now you must tell PeptideProphet to use the
>> >>> >> semi-parametric model with xinteract option -OP, and give it the tag
>> >>> >> that all your decoy protein names begin with using the -d flag, e.g.
>> >>> >> -dDECOY if all your decoy proteins begin with DECOY.  You can also use
>> >>> >> option -Od to have PeptideProphet assign non-zero probabilities to the
>> >>> >> decoy hits.
>>
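As a rough sketch, the options described above can be assembled into an xinteract command line from Python. The flags are copied from the commands quoted in this thread (-N, -p0, -D, -O, -d). The helper name build_xinteract_args and its parameters are hypothetical, and the sketch only builds the argument list without running anything.

```python
import shlex


def build_xinteract_args(pepxml, database, decoy_prefix, report_decoy_probs=True):
    """Assemble an xinteract argument list from the options discussed above."""
    # -O takes a run of option letters: P is the decoy-based model option
    # named above, d additionally asks for probabilities on the decoy hits.
    model_opts = "-O" + ("d" if report_decoy_probs else "") + "P"
    return [
        "xinteract",
        "-Ninteract.pep.xml",  # output file name, as used in this thread
        "-p0",                 # minimum probability 0, as used in this thread
        f"-D{database}",       # search database (FASTA)
        model_opts,
        f"-d{decoy_prefix}",   # tag that all decoy protein names begin with
        pepxml,
    ]


command = build_xinteract_args(
    "206_pepxml.xml", "sphuman_20101013_DECOY.fasta", "decoy"
)
# Print the command for review; run it by hand once it looks right.
print(shlex.join(command))
```

The decoy tag has to match the protein names in the pepXML file exactly, including case, so "decoy" rather than "DECOY" for names like decoy_9817.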
>> >>> >> -David
>>
>> >>> >> On Mon, Nov 8, 2010 at 2:36 PM, ira cooke <[email protected]> wrote:
>> >>> >> > Hi,
>>
>> >>> >> > I've been struggling to run PeptideProphet on Phenyx-generated pepXML
>> >>> >> > files.
>>
>> >>> >> > The error I get from PeptideProphet is "read in no data".  Full output
>> >>> >> > from the tool is as follows:
>>
>> >>> >> > ------------------ <BEGIN COMMANDLINE OUTPUT>-----------------
>> >>> >> > /usr/local/tpp-4-4-0/bin/xinteract -Ninteract.pep.xml -p0 -eT -l7 -D/var/www/ISB/data/Databases/OnMascot/SPHuman/sphuman_20101013_DECOY.fasta 206_pepxml.xml
>>
>> >>> >> > /usr/local/tpp-4-4-0/bin/xinteract (TPP v4.4 VUVUZELA rev 0, Build
>> >>> >> > 201010010955 (linux))
>>
>> >>> >> > running: "/usr/local/tpp-4-4-0/bin/InteractParser 'interact.pep.xml' '206_pepxml.xml' -D'/var/www/ISB/data/Databases/OnMascot/SPHuman/sphuman_20101013_DECOY.fasta' -L'7' -E'trypsin'"
>> >>> >> >  file 1: 206_pepxml.xml
>> >>> >> >  processed altogether 182 results
>>
>> >>> >> >  results written to file /var/www/ISB/data/Projects/TRegs/SP/Phenyx/interact.pep.shtml
>>
>> >>> >> > command completed in 3 sec
>>
>> >>> >> > running: "/usr/local/tpp-4-4-0/bin/PeptideProphetParser
>> >>> >> > 'interact.pep.xml' MINPROB=0"
>> >>> >> >  (PHENYX) (minprob 0)
>> >>> >> > WARNING!! The discriminant function for Phenyx is not yet complete.
>> >>> >> > It is presented here to help facilitate trial and discussion.
>> >>> >> > Reliance on this code for publishable scientific results is not
>> >>> >> > recommended.
>> >>> >> > init with PHENYX trypsin
>> >>> >> > MS Instrument info: Manufacturer: ThermoFinnigan, Model: default,
>> >>> >> > Ionization: FIXME, Analyzer: FIXME, Detector: FIXME
>>
>> >>> >> >  PeptideProphet  (TPP v4.4 VUVUZELA rev 0, Build 201010010955 (linux))
>> >>> >> > akel...@isb
>> >>> >> >  read in 0 1+, 0 2+, 0 3+, 0 4+, 0 5+, 0 6+, and 0 7+ spectra.
>> >>> >> >  read in no data
>>
>> >>> >> > command "/usr/local/tpp-4-4-0/bin/PeptideProphetParser
>> >>> >> > 'interact.pep.xml' MINPROB=0" exited with non-zero exit code: 256
>> >>> >> > QUIT - the job is incomplete
>>
>> >>> >> > ------------------ <END COMMANDLINE OUTPUT>-----------------
>>
>> >>> >> > Note that without the -eT option I get a crash (segmentation fault).
>> >>> >> > The enzyme specified inside the phenyx pepXML is
>>
>> >>> >> > <sample_enzyme name="Trypsin_(KR_noP)">
>> >>> >> > </sample_enzyme>
>>
>> >>> >> > (to be honest I'm not sure if this means trypsin or stricttrypsin ...
>> >>> >> > but that's probably another issue as I get the error with the -eS
>> >>> >> > option as well).
>>
>> >>> >> > Could this error be caused by a lack of raw data files?  I ran the
>> >>> >> > Phenyx searches on another computer and the file contains paths to
>> >>> >> > raw data on that computer.  I did try fixing the paths (and copying
>> >>> >> > the raw data to a TPP accessible location) ... but that didn't work
>> >>> >> > either.
>>
>> >>> >> > I guess my question is what does "read in no data" mean?  Does data
>> >>> >> > refer to the original spectra, or does it refer to something in the
>> >>> >> > output of InteractParser (i.e. interact.pep.xml)?
>>
>> >>> >> > Any help at all on this issue would be much appreciated
>>
>> >>> >> > Thanks
>>
>> >>> >> > --
>> >>> >> > You received this message because you are subscribed to the Google 
>> >>> >> > Groups "spctools-discuss" group.
>> >>> >> > To post to this group, send email to 
>> >>> >> > [email protected].
>> >>> >> > To unsubscribe from this group, send email to 
>> >>> >> > [email protected].
>> >>> >> > For more options, visit this group
>>

-- 
You received this message because you are subscribed to the Google Groups 
"spctools-discuss" group.
To post to this group, send email to [email protected].
To unsubscribe from this group, send email to 
[email protected].
For more options, visit this group at 
http://groups.google.com/group/spctools-discuss?hl=en.
