Hello everyone,

I'm new to TPP, and I have run into some difficulties with the X!Tandem
database search.

Because our lab is not familiar with TPP, I started with a test search on
mass spectrometry data from BSA, so I know with certainty which protein
should be found. Our instrument is a Waters G2, and I converted its raw
data to mzXML with ProteoWizard MSConvert. I then loaded the mzXML file,
the parameter file, and the NCBI Bos taurus FASTA file to run the database
search, followed by PeptideProphet and ProteinProphet. However, the
ProteinProphet result showed only 25.4% coverage, which seems unreasonably
low. I wonder whether I made a mistake in my parameter file, so I have
included it below. Please take a look and tell me what I should change.

I would also like to ask why the default precursor mass tolerance is -2 to
4 Da, which seems like an extremely wide window, and why an asymmetric
tolerance is preferable. In addition, is k-score or native X!Tandem scoring
(hyperscore) preferable? Thank you for taking the time to read this; any
advice would be appreciated.

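
To put the tolerance question in concrete terms, here is a small sketch that
converts a Dalton window into ppm. It is plain arithmetic and not anything
taken from TPP or X!Tandem; the function name da_to_ppm and the example
precursor mass of 1500 Da are assumptions chosen only for illustration.

```python
def da_to_ppm(tolerance_da: float, precursor_mass_da: float) -> float:
    """Convert an absolute mass tolerance (Da) to a relative one (ppm)."""
    if precursor_mass_da <= 0:
        raise ValueError("precursor mass must be positive")
    return tolerance_da / precursor_mass_da * 1e6


# Hypothetical peptide mass used only to compare the settings.
EXAMPLE_MASS_DA = 1500.0

for label, tol_da in [
    ("default minus", 2.0),   # from the template comment
    ("default plus", 4.0),    # from the template comment
    ("my setting", 0.1),      # value used in the parameter file below
]:
    ppm = da_to_ppm(tol_da, EXAMPLE_MASS_DA)
    print(f"{label}: {tol_da} Da is about {ppm:.0f} ppm at {EXAMPLE_MASS_DA} Da")
```

At the assumed mass of 1500 Da, the 0.1 Da setting works out to roughly 67 ppm,
while the 4 Da default is roughly 2667 ppm.
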
<?xml version="1.0" encoding="UTF-8"?>
<bioml>
<note> DEFAULT PARAMETERS. The value of "isb_default_input_kscore.xml" is
recommended. Change to "isb_default_input_native.xml" for native X!Tandem
scoring.</note>
<note type="input" label="list path, default
parameters">C:\Inetpub\wwwroot\ISB\data\parameters\isb_default_input_native.xml</note>
<note> FILE LOCATIONS. Replace them with your input (.mzXML) file and
output file -- these are REQUIRED. Optionally a log file and a sequence
output file of all protein sequences identified in the first-pass can be
specified. Use of FULL (not relative) paths is recommended. </note>
<note type="input" label="spectrum,
path">C:\Inetpub\wwwroot\ISB\data\demo2009\tandem</note>
<note type="input" label="output,
path">C:\Inetpub\wwwroot\ISB\data\demo2009\tandem</note>
<note type="input" label="output, log path"></note>
<note type="input" label="output, sequence path"></note>
<note type="input" label="spectrum, threads">2</note>
<note> TAXONOMY FILE. This is a file containing references to the sequence
databases. Point it to your own taxonomy.xml if needed.</note>
<note type="input" label="list path, taxonomy
information">C:\Inetpub\wwwroot\ISB\data\parameters\taxonomy.xml</note>
<note> PROTEIN SEQUENCE DATABASE. This refers to identifiers in the
taxonomy.xml, not the .fasta files themselves! Make sure the database you
want is present as an entry in the taxonomy.xml referenced above. This is
REQUIRED. </note>
<note type="input" label="protein, taxon">yeast_orfs_all_REV01_short</note>
<note type="input" label="protein, cleavage site">[RK]|{P}</note>
<note> PRECURSOR MASS TOLERANCES. In the example below, a -2.0 Da to 4.0 Da
(monoisotopic mass) window is searched for peptide candidates. Because this
is a monoisotopic mass, for non-accurate-mass instruments, for which the
precursor is often taken nearer to the isotopically averaged mass, an
asymmetric tolerance (-2.0 Da to 4.0 Da) is preferable. This somewhat
imitates a (-3.0 Da to 3.0 Da) window for averaged mass (but not
exactly)</note>
<note type="input" label="spectrum, parent monoisotopic mass error
minus">0.1</note>
<note type="input" label="spectrum, parent monoisotopic mass error
plus">0.1</note>
<note type="input" label="spectrum, parent monoisotopic mass error
units">Daltons</note>
<note>The value for this parameter may be 'Daltons' or 'ppm': all other
values are ignored</note>
<note type="input" label="spectrum, parent monoisotopic mass isotope
error">no</note>
<note>This allows peptide candidates in windows around -1 Da and -2 Da from
the acquired mass to be considered. Only applicable when the minus/plus
window above is set to less than 0.5 Da. Good for accurate-mass instruments
for which the reported precursor mass is not corrected to the monoisotopic
mass. </note>
<note> MODIFICATIONS. In the example below, there is a static
(carbamidomethyl) modification on C, and variable modifications on M
(oxidation). Multiple modifications can be separated by commas, as in
"80.0@S,80.0@T". Peptide terminal modifications can be specified with the
symbol '[' for N-terminus and ']' for C-terminus, such as 42.0@[ . </note>
<note type="input" label="residue, modification mass"></note>
<note type="input" label="residue, potential modification
mass">57.02146@C,0.984016@N,0.984016@Q,15.99492@M,43.0581@K,43.0581@]</note>
<note type="input" label="residue, potential modification motif"></note>
<note> You can specify a variable modification only when present in a
motif. For instance, 0.998@N!{P}[ST] is a deamidation modification on N
only if it is present in an N[any but P][S or T] motif (N-glycosite).
</note>
<note type="input" label="protein, N-terminal residue modification
mass"></note>
<note type="input" label="protein, C-terminal residue modification
mass"></note>
<note> These are *static* modifications on the PROTEINS' N or C-termini.
</note>
<note> SEMI-TRYPTICS AND MISSED CLEAVAGES. In the example below,
semitryptic peptides are allowed, and up to 2 missed cleavages are allowed.
</note>
<note type="input" label="protein, cleavage semi">no</note>
<note type="input" label="scoring, maximum missed cleavage sites">1</note>
<note> REFINEMENT. Do not use unless you know what you are doing. Set
"refine" to "yes" and specify what you want to search in the refinement.
For non-confusing results, repeat the same modifications you set above for
the first-pass here.</note>
<note type="input" label="refine">no</note>
<note type="input" label="refine, maximum valid expectation
value">0.1</note>
<note type="input" label="refine, potential modification
mass">15.994915@M,8.014199@K,10.008269@R</note>
<note type="input" label="refine, potential modification motif"></note>
<note type="input" label="refine, cleavage semi">yes</note>
<note type="input" label="refine, unanticipated cleavage">no</note>
<note type="input" label="refine, potential N-terminus
modifications"></note>
<note type="input" label="refine, potential C-terminus
modifications"></note>
<note type="input" label="refine, point mutations">no</note>
<note type="input" label="refine, use potential modifications for full
refinement">no</note>
</bioml>
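
Since the file above mixes free-text comments with actual settings, a short
script can list only the settings that are really in effect. This is a minimal
sketch using Python's standard xml.etree.ElementTree module; it does not
reproduce how X!Tandem itself reads the file. The function name
read_tandem_inputs and the shortened SAMPLE text are assumptions made for
illustration.

```python
import xml.etree.ElementTree as ET


def read_tandem_inputs(xml_text: str) -> dict:
    """Return {label: value} for every <note type="input"> element."""
    root = ET.fromstring(xml_text)
    params = {}
    for note in root.iter("note"):
        if note.get("type") != "input":
            continue  # skip free-text comment notes
        # Collapse whitespace so labels wrapped across lines still match.
        label = " ".join(note.get("label", "").split())
        params[label] = (note.text or "").strip()
    return params


# Shortened, hypothetical excerpt of a parameter file like the one above.
SAMPLE = """<bioml>
<note>Free-text comment, ignored.</note>
<note type="input" label="protein, taxon">yeast_orfs_all_REV01_short</note>
<note type="input" label="spectrum, parent monoisotopic mass error
minus">0.1</note>
<note type="input" label="output, log path"></note>
</bioml>"""

for label, value in read_tandem_inputs(SAMPLE).items():
    print(f"{label} = {value!r}")
```

A listing like this makes it easy to confirm, for example, that
"protein, taxon" names the database you intended to search.
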