The DECOY= option of the PeptideProphetParser command specifies the text string that identifies your decoy entries. That string has to match the text that directly follows the ">" character in the protein description line. In case A your decoy entries begin with "sp|decoy", so the matching command-line argument would be DECOY=sp|decoy. Instead you told PeptideProphetParser to look for "decoy" with the option DECOY=decoy, which is why it failed to find any decoys in case A. Hope this makes sense.
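One rough way to check a label against your database before running PeptideProphetParser is to count the FASTA headers that begin with it. The sketch below assumes, as described above, that the label is compared with the text directly following ">". The function name count_decoy_headers and the small demo file are made up for illustration and are not part of TPP. Note the single quotes around 'sp|decoy': without them the shell would read "|" as a pipe.

```shell
#!/bin/sh
# Hypothetical helper (not part of TPP): count FASTA headers whose text
# directly after ">" starts with the given label.
count_decoy_headers() {
    label=$1
    fasta=$2
    # index() is a literal substring search, so "|" in the label is not
    # treated as regular-expression alternation. A result of 1 means the
    # line starts with ">" followed by the label.
    awk -v label="$label" 'index($0, ">" label) == 1 { n++ } END { print n + 0 }' "$fasta"
}

# Made-up three-entry FASTA file in the style of case A and case B.
demo_fasta=$(mktemp)
cat > "$demo_fasta" <<'EOF'
>sp|decoy_Q197F8|002R_IIV3 Reverse sequence
CDDESDSD
>sp|Q197F8|002R_IIV3 Uncharacterized protein 002R
MASNTVSA
>decoy_1
AAAA
EOF

count_decoy_headers 'decoy' "$demo_fasta"      # matches only ">decoy_1"
count_decoy_headers 'sp|decoy' "$demo_fasta"   # matches only the case A style header
```

If the count for the label you plan to pass to DECOY= is zero on your real database, that label will most likely not pick up any decoys either.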
However, you will also run into problems passing the "|" character in the decoy string, because the shell treats an unquoted "|" as a pipe. You would have to quote or escape it (for example, 'DECOY=sp|decoy' in single quotes) and hope PeptideProphetParser handles it. Better yet, get rid of the "|" character in your decoy string altogether.

On Sun, Aug 1, 2010 at 10:49 PM, Jagan Kommineni <[email protected]> wrote:
> Dear All,
>
> I have created decoy database for the latest uniprot_sprot_April2010
> database by modifying the original Matrix Science decoy.pl script to get
> customized header format.
>
> -----------------------------------------------------------------------------------------------------------------
> Case A: Here are few decoy entries in the database created with decoy.pl:
> -----------------------------------------------------------------------------------------------------------------
> >sp|decoy_Q197F8|002R_IIV3 Reverse sequence, was Uncharacterized protein 002R OS=Invertebrate iridescent virus 3 GN=IIV3-002R PE=4 SV=1
> CDDESDSDDDESDSDYEFDEDESNYNSDLSCFYRDISQIKKPKPKFGLSKMLNTITLKGK
> ELIRAAVSPEEETETEWDSDDSDPAWTPDDEDSSDEEGSSFSEDDFHENEPDSDSEYGTG
> DRNLQSIRLLFKYMEVQAPTKLPNNEVWDTDRKHLITSYMYRRHFKNDEVRYGHWEDEQF
> LVWVYIPRPVNEYRYSNTCYFTEINDTCWQVFPKNVGYTEFIYLYKLTSQMSKIGDMVPG
> FWRPHMDLNKITLHTLSPAWFDTFGCADEVMLHTLFELDDLGLDGHTNYNMDDIRRIVLR
> TVNPTTELIAKIEEARTDTFHIEQIKEPFAYERQFDEFTIASVRVVNDKYWRTWLWPHKC
> WSIQEWSLYQMIDLKIELPLLELSEVPGSYDESPESTQLEPYREALAQERNMKWPVISGP
> QENWIPDFLLFQAVDQINSFDRVPRNSGGQASVTNSAM
> >sp|decoy_Q197F7|003L_IIV3 Reverse sequence, was Uncharacterized protein 003L OS=Invertebrate iridescent virus 3 GN=IIV3-003L PE=4 SV=1
> IGYTLPELRCTGYNNKRTKSNTILLRYTDPATNSTTSPRDSAVCECRQPSKASGFDFCLT
> GIRRPPNIHPPSDVMGLSTPPTCAALSPPTCTTLSPTTTLSRANLSTDFWAGGLANPHVP
> YYNPYHPAGSMKCVIERELQPSGYWSQPCPNIAQYM
>
> --------------------------------------------------------------------------------------------------------------------------
> The original entries corresponding to the above entries are as follows .....
> --------------------------------------------------------------------------------------------------------------------------
> >sp|Q197F8|002R_IIV3 Uncharacterized protein 002R OS=Invertebrate iridescent virus 3 GN=IIV3-002R PE=4 SV=1
> MASNTVSAQGGSNRPVRDFSNIQDVAQFLLFDPIWNEQPGSIVPWKMNREQALAERYPEL
> QTSEPSEDYSGPVESLELLPLEIKLDIMQYLSWEQISWCKHPWLWTRWYKDNVVRVSAIT
> FEDFQREYAFPEKIQEIHFTDTRAEEIKAILETTPNVTRLVIRRIDDMNYNTHGDLGLDD
> LEFLTHLMVEDACGFTDFWAPSLTHLTIKNLDMHPRWFGPVMDGIKSMQSTLKYLYIFET
> YGVNKPFVQWCTDNIETFYCTNSYRYENVPRPIYVWVLFQEDEWHGYRVEDNKFHRRYMY
> STILHKRDTDWVENNPLKTPAQVEMYKFLLRISQLNRDGTGYESDSDPENEHFDDESFSS
> GEEDSSDEDDPTWAPDSDDSDWETETEEEPSVAARILEKGKLTITNLMKSLGFKPKPKKI
> QSIDRYFCSLDSNYNSEDEDFEYDSDSEDDDSDSEDDC
> >sp|Q197F7|003L_IIV3 Uncharacterized protein 003L OS=Invertebrate iridescent virus 3 GN=IIV3-003L PE=4 SV=1
> MYQAINPCPQSWYGSPQLEREIVCKMSGAPHYPNYYPVHPNALGGAWFDTSLNARSLTTT
> PSLTTCTPPSLAACTPPTSLGMVDSPPHINPPRRIGTLCFDFGSAKSPQRCECVASDRPS
> TTSNTAPDTYRLLITNSKTRKNNYGTCRLEPLTYGI
> --------------------------------------------------------------------------------------------
> Case B: Here is the format decoy entries when I use decoyFasta of TPP
> -----------------------------------------------------------------------------------------------
> >decoy_1
> ...............................
> ..............................
> >decoy_2
> ............................
> ............................
> -------------------------------------------------------------------------------------------------
>
> I have used same parameters and input files for running OMSSA search and
> PeptideProphet but I notice the segmentation fault in the case A however
> PeptideProphet runs OK for the Case B.
> --------------------------------------------------------------------------------------------
> Here is the STDOUT display for Case A( when I use decoy.pl)
> -------------------------------------------------------------------
> [r...@apcf-hn3 jagan-J442]# /mnt/sanfs/APCF/APCF_WEB/tpp/bin/InteractParser 'jagan-J442.pepprophet.xml' 'jagan-J442.pep.xml' -L'7' -E'trypsin' -C -P
> file 1: jagan-J442.pep.xml
> processed altogether 123 results
>
>
> results written to file /mnt/sanfs/APCF/results/omssa/2010-08-02/jagan-J442/jagan-J442.pepprophet.shtml
>
> [r...@apcf-hn3 jagan-J442]# /mnt/sanfs/APCF/APCF_WEB/tpp/bin/PeptideProphetParser 'jagan-J442.pepprophet.xml' DECOY=decoy MINPROB=0 NONPARAM
> Using Decoy Label "decoy".
> Using non-parametric distributions
> (OMSSA) (minprob 0)
> WARNING!! The discriminant function for OMSSA is not yet complete. It is presented here to help facilitate trial and discussion. Reliance on this code for publishable scientific results is not recommended.
> init with OMSSA trypsin
> MS Instrument info: Manufacturer: UNKNOWN, Model: UNKNOWN, Ionization: UNKNOWN, Analyzer: UNKNOWN, Detector: UNKNOWN
>
> PeptideProphet (TPP v4.4 JETSTREAM (unstable development prerelease) rev 0, Build 201007011135 (linux)) akel...@isb
> read in 0 1+, 78 2+, 45 3+, 0 4+, 0 5+, 0 6+, and 0 7+ spectra.
> Initialising statistical models ...
> Found 0 Decoys, and 123 Non-Decoys
> WARNING: No decoys with label decoy were found in this dataset. reverting to fully unsupervised method.
> Iterations: .........10.........20
> Segmentation fault
> [r...@apcf-hn3 jagan-J442]#
> -----------------------------------------------------------------------------------------------------------
> In Case B (decoyFasta of TPP has been used), here is the STDOUT ....
> -----------------------------------------------------------------------------------------------------------
> [r...@apcf-hn3 jagan-J443]# /mnt/sanfs/APCF/APCF_WEB/tpp/bin/InteractParser 'jagan-J443.pepprophet.xml' 'jagan-J443.pep.xml' -L'7' -E'trypsin' -C -P
> file 1: jagan-J443.pep.xml
> processed altogether 123 results
>
>
> results written to file /mnt/sanfs/APCF/results/omssa/2010-08-02/jagan-J443/jagan-J443.pepprophet.shtml
>
>
>
> [r...@apcf-hn3 jagan-J443]# /mnt/sanfs/APCF/APCF_WEB/tpp/bin/PeptideProphetParser 'jagan-J443.pepprophet.xml' DECOY=decoy MINPROB=0 NONPARAM
> Using Decoy Label "decoy".
> Using non-parametric distributions
> (OMSSA) (minprob 0)
> WARNING!! The discriminant function for OMSSA is not yet complete. It is presented here to help facilitate trial and discussion. Reliance on this code for publishable scientific results is not recommended.
> init with OMSSA trypsin
> MS Instrument info: Manufacturer: UNKNOWN, Model: UNKNOWN, Ionization: UNKNOWN, Analyzer: UNKNOWN, Detector: UNKNOWN
>
> PeptideProphet (TPP v4.4 JETSTREAM (unstable development prerelease) rev 0, Build 201007011135 (linux)) akel...@isb
> read in 0 1+, 78 2+, 45 3+, 0 4+, 0 5+, 0 6+, and 0 7+ spectra.
> Initialising statistical models ...
> Found 2 Decoys, and 121 Non-Decoys
> Iterations: .........10.........20.....
> WARNING: Mixture model quality test failed for charge (1+).
> WARNING: Mixture model quality test failed for charge (2+).
> WARNING: Mixture model quality test failed for charge (4+).
> WARNING: Mixture model quality test failed for charge (5+).
> WARNING: Mixture model quality test failed for charge (6+).
> WARNING: Mixture model quality test failed for charge (7+).
> model complete after 26 iterations
> [r...@apcf-hn3 jagan-J443]#
> ------------------------------------------------------------------------------------------------------------------------------
>
> Which is the best way to make the Case A sematics to work with TPP pipeline
> ....
>
> Here is the difference in the contents of the pepXML files from the Case A
> to Case B
> ------------------------------------------------------------------------------------------------------------------------------
>
> < date="2010-08-02T11:15:41" summary_xml="/home/APCF/omssa/results/2b78d709e0fc1276e3bdff7faa1c95a8/jagan-J443.pep.xml">
> < <msms_run_summary base_name="/home/APCF/omssa/results/2b78d709e0fc1276e3bdff7faa1c95a8/jagan-j443_62928" raw_data_type="raw" raw_data=".mzXML">
> ---
> > date="2010-08-02T10:52:40" summary_xml="/home/APCF/omssa/results/091b4664500d7e67d0eba75ef9170064/jagan-J442.pep.xml">
> > <msms_run_summary base_name="/home/APCF/omssa/results/091b4664500d7e67d0eba75ef9170064/jagan-j442_62921" raw_data_type="raw" raw_data=".mzXML">
> 11c11
> < <search_summary base_name="/home/APCF/omssa/results/2b78d709e0fc1276e3bdff7faa1c95a8/jagan-j443_62928" search_engine="OMSSA" precursor_mass_type="monoisotopic" fragment_mass_type="monoisotopic" out_data_type="n/a" out_data="n/a" search_id="1">
> ---
> > <search_summary base_name="/home/APCF/omssa/results/091b4664500d7e67d0eba75ef9170064/jagan-j442_62921" search_engine="OMSSA" precursor_mass_type="monoisotopic" fragment_mass_type="monoisotopic" out_data_type="n/a" out_data="n/a" search_id="1">
> 17c17
> < <search_hit hit_rank="1" peptide="KENNNNNNNK" peptide_prev_aa="K" peptide_next_aa="N" protein="285922" num_tot_proteins="1" num_matched_ions="16" tot_num_ions="18" calc_neutral_pep_mass="1201.545" massdiff="-0.862000000000007" is_rejected="0" protein_descr="sp|Q54UC0|PRKDC_DICDI DNA-dependent protein kinase catalytic subunit OS=Dictyostelium discoideum GN=dnapkcs PE=3 SV=2">
> ---
> > <search_hit hit_rank="1" peptide="KENNNNNNNK" peptide_prev_aa="K" peptide_next_aa="N" protein="Q54UC0" num_tot_proteins="1" num_matched_ions="16" tot_num_ions="18" calc_neutral_pep_mass="1201.545" massdiff="-0.862000000000007" is_rejected="0" protein_descr="DNA-dependent protein kinase catalytic subunit OS=Dictyostelium discoideum GN=dnapkcs PE=3 SV=2">
> 25c25
> < <search_hit hit_rank="1" peptide="WQGHEGDIDK" peptide_prev_aa="K" peptide_next_aa="G" protein="132542" num_tot_proteins="1" num_matched_ions="13" tot_num_ions="18" calc_neutral_pep_mass="1183.526" massdiff="-0.000999999999909" is_rejected="0" protein_descr="sp|O95395|GCNT3_HUMAN Beta-1,3-galactosyl-O-glycosyl-glycoprotein beta-1,6-N-acetylglucosaminyltransferase 3 OS=Homo sapiens GN=GCNT3 PE=2 SV=1">
> ---
> > <search_hit hit_rank="1" peptide="WQGHEGDIDK" peptide_prev_aa="K" peptide_next_aa="G" protein="O95395" num_tot_proteins="1" num_matched_ions="13" tot_num_ions="18" calc_neutral_pep_mass="1183.526" massdiff="-0.000999999999909" is_rejected="0" protein_descr="Beta-1,3-galactosyl-O-glycosyl-glycoprotein beta-1,6-N-acetylglucosaminyltransferase 3 OS=Homo sapiens GN=GCNT3 PE=2 SV=1">
> 33,34c33,34
> < <search_hit hit_rank="1" peptide="SKAEAESLYQSK" peptide_prev_aa="K" peptide_next_aa="Y" protein="183292" num_tot_proteins="2" num_matched_ions="19" tot_num_ions="22" calc_neutral_pep_mass="1339.663" massdiff="-0.001999999999942" is_rejected="0" protein_descr="sp|P04264|K2C1_HUMAN Keratin, type II cytoskeletal 1 OS=Homo sapiens GN=KRT1 PE=1 SV=6">
> < <alternative_protein protein="183294" protein_descr="sp|A5A6M6|K2C1_PANTR Keratin, type II cytoskeletal 1 OS=Pan troglodytes GN=KRT1 PE=2 SV=1"/>
> ---
> > <search_hit hit_rank="1" peptide="SKAEAESLYQSK" peptide_prev_aa="K" peptide_next_aa="Y" protein="P04264" num_tot_proteins="2" num_matched_ions="19" tot_num_ions="22" calc_neutral_pep_mass="1339.663" massdiff="-0.001999999999942" is_rejected="0" protein_descr="Keratin, type II cytoskeletal 1 OS=Homo sapiens GN=KRT1 PE=1 SV=6">
> > <alternative_protein protein="A5A6M6" protein_descr="Keratin, type II cytoskeletal 1 OS=Pan troglodytes GN=KRT1 PE=2 SV=1"/>
> 42c42
> < <search_hit hit_rank="1" peptide="NQNESVSEIGGK" peptide_prev_aa="R" peptide_next_aa="I" protein="394680" num_tot_proteins="1" num_matched_ions="18" tot_num_ions="22" calc_neutral_pep_mass="1260.595" massdiff="-0.001999999999925" is_rejected="0" protein_descr="sp|Q68CR1|SE1L3_HUMAN Protein sel-1 homolog 3 OS=Homo sapiens GN=SEL1L3 PE=1 SV=2">
> ---
> > <search_hit hit_rank="1" peptide="NQNESVSEIGGK" peptide_prev_aa="R" peptide_next_aa="I" protein="Q68CR1" num_tot_proteins="1" num_matched_ions="18" tot_num_ions="22" calc_neutral_pep_mass="1260.595" massdiff="-0.001999999999925" is_rejected="0" protein_descr="Protein sel-1 homolog 3 OS=Homo sapiens GN=SEL1L3 PE=1 SV=2">
> 50c50
> < <search_hit hit_rank="1" peptide="LVGATATSSPPPK" peptide_prev_aa="R" peptide_next_aa="A" protein="452919" num_tot_proteins="1" num_matched_ions="15" tot_num_ions="24" calc_neutral_pep_mass="1224.672" massdiff="-0.002999999999904" is_rejected="0" protein_descr="sp|Q96QD9|UIF_HUMAN UAP56-interacting factor OS=Homo sapiens GN=FYTTD1 PE=1 SV=3">
> ---
> > <search_hit hit_rank="1" peptide="LVGATATSSPPPK" peptide_prev_aa="R" peptide_next_aa="A" protein="Q96QD9" num_tot_proteins="1" num_matched_ions="15" tot_num_ions="24" calc_neutral_pep_mass="1224.672" massdiff="-0.002999999999904" is_rejected="0" protein_descr="UAP56-interacting factor OS=Homo sapiens GN=FYTTD1 PE=1 SV=3">
> 58c58
> < <search_hit hit_rank="1" peptide="LHQDTFNQLHK" peptide_prev_aa="K" peptide_next_aa="V" protein="136270" num_tot_proteins="1" num_matched_ions="20" tot_num_ions="20" calc_neutral_pep_mass="1379.696" massdiff="-0.003000000000016" is_rejected="0" protein_descr="sp|Q8NCI6|GLBL3_HUMAN Beta-galactosidase-1-like protein 3 OS=Homo sapiens GN=GLB1L3 PE=2 SV=3">
> ---
> > <search_hit hit_rank="1" peptide="LHQDTFNQLHK" peptide_prev_aa="K" peptide_next_aa="V" protein="Q8NCI6" num_tot_proteins="1" num_matched_ions="20" tot_num_ions="20" calc_neutral_pep_mass="1379.696" massdiff="-0.003000000000016" is_rejected="0" protein_descr="Beta-galactosidase-1-like protein 3 OS=Homo sapiens GN=GLB1L3 PE=2 SV=3">
> 66c66
> < <search_hit hit_rank="1" peptide="SVGLGTESTGR" peptide_prev_aa="R" peptide_next_aa="G" protein="136270" num_tot_proteins="1" num_matched_ions="17" tot_num_ions="20" calc_neutral_pep_mass="1062.53" massdiff="0.000999999999949" is_rejected="0" protein_descr="sp|Q8NCI6|GLBL3_HUMAN Beta-galactosidase-1-like protein 3 OS=Homo sapiens GN=GLB1L3 PE=2 SV=3">
> ---
> > <search_hit hit_rank="1" peptide="SVGLGTESTGR" peptide_prev_aa="R" peptide_next_aa="G" protein="Q8NCI6" num_tot_proteins="1" num_matched_ions="17" tot_num_ions="20" calc_neutral_pep_mass="1062.53" massdiff="0.000999999999949" is_rejected="0" protein_descr="Beta-galactosidase-1-like protein 3 OS=Homo sapiens GN=GLB1L3 PE=2 SV=3">
> 74c74
> < <search_hit hit_rank="1" peptide="NQNESVSEIGGK" peptide_prev_aa="R" peptide_next_aa="I" protein="394680" num_tot_proteins="1" num_matched_ions="12" tot_num_ions="22" calc_neutral_pep_mass="1260.595" massdiff="-0.006000000000058" is_rejected="0" protein_descr="sp|Q68CR1|SE1L3_HUMAN Protein sel-1 homolog 3 OS=Homo sapiens GN=SEL1L3 PE=1 SV=2">
> ---
> > <search_hit hit_rank="1" peptide="NQNESVSEIGGK" peptide_prev_aa="R" peptide_next_aa="I" protein="Q68CR1" num_tot_proteins="1" num_matched_ions="12" tot_num_ions="22" calc_neutral_pep_mass="1260.595" massdiff="-0.006000000000058" is_rejected="0" protein_descr="Protein sel-1 homolog 3 OS=Homo sapiens GN=SEL1L3 PE=1 SV=2">
>
> -----------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
>
>
> with regards,
>
> Dr. Jagan Kommineni
> Ludwig Institute for Cancer research
> Pakville VIC 3145
> Australia.
>
> --
> You received this message because you are subscribed to the Google Groups
> "spctools-discuss" group.
> To post to this group, send email to [email protected].
> To unsubscribe from this group, send email to
> [email protected].
> For more options, visit this group at
> http://groups.google.com/group/spctools-discuss?hl=en.
>
