[spctools-discuss] basic problems about X!tandem database search

陳胤中 Tue, 23 Aug 2016 23:05:26 -0700

Hello everyone:
I'm the novice of TPP, and I encountered some difficulties when I use 
x!tandem database search.
Because our lab was not famaliar with TPP, I first used the mass data of 
BSA (which I can 100% confirm what protein I should find in this data) to 
do the proteomic search.The mass of our lab is waters G2, and I convert its 
raw data into mzXML with proteowizard MSconvert. Then I loaded the mzXML 
file, parameters and NCBI Bos Taurus fasta file to do the database search, 
followed by PeptideProphet and ProteinProphet. However, the result of 
ProteinProphet showed only 25.4% coverage which is unreasonably low. I 
wonder whether I made some mistake when I write my parameters file. So I 
show my parameters below, please read this and help me find whether I 
should modify. 
Moreover, I want to ask why the default mass tolerace is -2 and 4 Da, which 
are incredibly large scale for mass. And why an assymetric tolerance is 
preferable. I also wonder whether k score or native X!tandem scoring 
(hypersocre) is preferable. Thanks a lot for your time to read this and 
please give me some advises.


<?xml version="1.0" encoding="UTF-8"?>

<bioml>
 
<note> DEFAULT PARAMETERS. The value of "isb_default_input_kscore.xml" is 
recommended. Change to "isb_default_input_native.xml" for native X!Tandem 
scoring.</note> 
<note type="input" label="list path, default 
parameters">C:\Inetpub\wwwroot\ISB\data\parameters\isb_default_input_native.xml</note>

<note> FILE LOCATIONS. Replace them with your input (.mzXML) file and 
output file -- these are REQUIRED. Optionally a log file and a sequence 
output file of all protein sequences identified in the first-pass can be 
specified. Use of FULL path (not relative) paths is recommended. </note>
<note type="input" label="spectrum, 
path">C:\Inetpub\wwwroot\ISB\data\demo2009\tandem</note>
<note type="input" label="output, 
path">C:\Inetpub\wwwroot\ISB\data\demo2009\tandem</note>
<note type="input" label="output, log path"></note>
<note type="input" label="output, sequence path"></note>
    <note type="input" label="spectrum, threads">2</note>

<note> TAXONOMY FILE. This is a file containing references to the sequence 
databases. Point it to your own taxonomy.xml if needed.</note>
<note type="input" label="list path, taxonomy 
information">C:\Inetpub\wwwroot\ISB\data\parameters\taxonomy.xml</note>

<note> PROTEIN SEQUENCE DATABASE. This refers to identifiers in the 
taxomony.xml, not the .fasta files themselves! Make sure the database you 
want is present as an entry in the taxonomy.xml referenced above. This is 
REQUIRED. </note>
<note type="input" label="protein, taxon">yeast_orfs_all_REV01_short</note>

<note type="input" label="protein, cleavage site">[RK]|{P}</note>

 
<note> PRECURSOR MASS TOLERANCES. In the example below, a -2.0 Da to 4.0 Da 
(monoisotopic mass) window is searched for peptide candidates. Since this 
is monoisotopic mass, so for non-accurate-mass instruments, for which the 
precursor is often taken nearer to the isotopically averaged mass, an 
asymmetric tolerance (-2.0 Da to 4.0 Da) is preferable. This somewhat 
imitates a (-3.0 Da to 3.0 Da) window for averaged mass (but not 
exactly)</note>
  <note type="input" label="spectrum, parent monoisotopic mass error 
minus">0.1</note>
  <note type="input" label="spectrum, parent monoisotopic mass error 
plus">0.1</note>
<note type="input" label="spectrum, parent monoisotopic mass error 
units">Daltons</note>
<note>The value for this parameter may be 'Daltons' or 'ppm': all other 
values are ignored</note>
<note type="input" label="spectrum, parent monoisotopic mass isotope 
error">no</note>
<note>This allows peptide candidates in windows around -1 Da and -2 Da from 
the acquired mass to be considered. Only applicable when the minus/plus 
window above is set to less than 0.5 Da. Good for accurate-mass instruments 
for which the reported precursor mass is not corrected to the monoisotopic 
mass. </note>


<note> MODIFICATIONS. In the example below, there is a static 
(carbamidomethyl) modification on C, and variable modifications on M 
(oxidation). Multiple modifications can be separated by commas, as in 
"80.0@S,80.0@T". Peptide terminal modifications can be specified with the 
symbol '[' for N-terminus and ']' for C-terminus, such as 42.0@[ .  </note>
  <note type="input" label="residue, modification mass"></note>
      <note type="input" label="residue, potential modification 
mass">57.02146@C,0.984016@N,0.984016@Q,15.99492@M,43.0581@K,43.0581@]</note>
<note type="input" label="residue, potential modification motif"></note>
<note> You can specify a variable modification only when present in a 
motif. For instance, 0.998@N!{P}[ST] is a deamidation modification on N 
only if it is present in an N[any but P][S or T] motif (N-glycosite). 
</note>
<note type="input" label="protein, N-terminal residue modification 
mass"></note>
  <note type="input" label="protein, C-terminal residue modification 
mass"></note>
<note> These are *static* modifications on the PROTEINS' N or C-termini. 
</note>

<note> SEMI-TRYPTICS AND MISSED CLEAVAGES. In the example below, 
semitryptic peptides are allowed, and up to 2 missed cleavages are allowed. 
</note>
  <note type="input" label="protein, cleavage semi">no</note>
<note type="input" label="scoring, maximum missed cleavage sites">1</note>

<note> REFINEMENT. Do not use unless you know what you are doing. Set 
"refine" to "yes" and specify what you want to search in the refinement. 
For non-confusing results, repeat the same modifications you set above for 
the first-pass here.</note>
<note type="input" label="refine">no</note>
<note type="input" label="refine, maximum valid expectation 
value">0.1</note>
      <note type="input" label="refine, potential modification 
mass">15.994915@M,8.014199@K,10.008269@R</note>
<note type="input" label="refine, potential modification motif"></note>
  <note type="input" label="refine, cleavage semi">yes</note>
<note type="input" label="refine, unanticipated cleavage">no</note>
<note type="input" label="refine, potential N-terminus 
modifications"></note>
<note type="input" label="refine, potential C-terminus 
modifications"></note>
<note type="input" label="refine, point mutations">no</note>
<note type="input" label="refine, use potential modifications for full 
refinement">no</note>

  


</bioml>

-- 
You received this message because you are subscribed to the Google Groups 
"spctools-discuss" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To post to this group, send email to [email protected].
Visit this group at https://groups.google.com/group/spctools-discuss.
For more options, visit https://groups.google.com/d/optout.

[spctools-discuss] basic problems about X!tandem database search

Reply via email to