Thank you both for the advice. That helps a lot!

subsetdb.exe is just what I'm after, as I'm trying to extract
human-only sequences from Swiss-Prot/UniProt. I've been trying to use
sed in bash to extract the identifiers to then create a new database,
but have had no luck so far. I'll run subsetdb and then append a decoy
library for Tandem searching.
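
By analogy with the Drosophila example quoted below, the human subset would presumably be something like "subsetdb.exe -MOS=Homo^sapiens -ohuman.fasta uniprot_sprot.fasta" (untested). For anyone who would rather script the whole thing instead of fighting with sed, here is a minimal Python sketch of the same idea: keep every entry whose description line contains "OS=Homo sapiens", then append one decoy per kept entry. The assumptions are mine, not from the TPP: the function names are made up, decoys are plain reversed sequences, and the "DECOY_" prefix is arbitrary, so the TPP decoy tool may well do this differently. The two sample entries are invented for the demonstration.

```python
import io


def read_fasta(handle):
    """Yield (header, sequence) pairs from a FASTA text stream."""
    header, chunks = None, []
    for line in handle:
        line = line.rstrip("\r\n")
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None and line:
            chunks.append(line.strip())
    if header is not None:
        yield header, "".join(chunks)


def write_record(out, header, sequence, width=60):
    """Write one FASTA entry, wrapping the sequence at `width` residues."""
    out.write(">" + header + "\n")
    for i in range(0, len(sequence), width):
        out.write(sequence[i:i + width] + "\n")


def subset_with_decoys(src, out, match="OS=Homo sapiens", decoy_prefix="DECOY_"):
    """Copy entries whose header contains `match`, then append decoys.

    Decoys here are simply the reversed target sequences with
    `decoy_prefix` added to the header (an assumption of this sketch).
    Returns the number of target entries kept.
    """
    kept = [(h, s) for h, s in read_fasta(src) if match in h]
    for header, seq in kept:
        write_record(out, header, seq)
    for header, seq in kept:
        write_record(out, decoy_prefix + header, seq[::-1])
    return len(kept)


# Tiny in-memory demonstration with two invented entries.
sample = io.StringIO(
    ">sp|P00001|A_HUMAN Protein A OS=Homo sapiens GN=A\n"
    "MKTAY\n"
    ">sp|P00002|B_DROME Protein B OS=Drosophila melanogaster GN=B\n"
    "MSSLL\n"
)
result = io.StringIO()
count = subset_with_decoys(sample, result)
print(result.getvalue(), end="")
```

With real files, the same function would take open file handles in place of the StringIO objects, e.g. open("uniprot_sprot.fasta") for reading and open("human_decoy.fasta", "w") for writing (file names are only examples).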

Thanks again!

On Oct 22, 10:31 am, Jimmy Eng <[email protected]> wrote:
> If there's some set of unique identifier in the original database that
> denotes all the proteins you want in the subset database, you can use
> the subsetdb program.  It's distributed as part of the TPP (typically
> binary exists at c:\inetpub\tpp-bin\subsetdb.exe) but there's no web
> interface to it that I'm aware of.
>
> As an example, to create a drosophila subset of the uniprot database,
> you do something like:
>
>    subsetdb.exe -MOS=Drosophila^melanogaster -ofly.fasta uniprot_sprot.fasta
>
> This creates an output file "fly.fasta" that contains all entries with
> the text "OS=Drosophila melanogaster" in the protein description line.
>  The caret (^) character replaces a space.  You can have multiple -M
> match text string options, no match -N strings, etc.  Typing the
> executable w/o input arguments will show the usage statement.
>
> On Thu, Oct 21, 2010 at 4:41 PM, Kristian <[email protected]> wrote:
> > Okay.  To do that, all you can really do is make a smaller database.
> > There's no function in the TPP that will allow you to select a subset
> > of your database.  However, it's really easy to edit your database.
> > Open your database in a text editor (e.g. WordPad) and you'll see the
> > format the entries have.  Use this format to create a new database
> > that only contains the entries you are interested in.  Note that
> > searching against a small database will compromise your statistics
> > (partly because if you're only searching against a small number
> > of possible matches, X!Tandem will probably find something that
> > matches, even if poorly; and partly because PeptideProphet's
> > error model works best if there is a large number of incorrect hits as
> > well as correct hits).  For the best results, add decoys to your
> > database.  You can add decoys using the tool in the TPP, or you can
> > simply embed your proteins of interest in a database for another
> > organism whose proteins should not give you any positive hits.
>
> > On Oct 21, 3:24 pm, James Broadbent <[email protected]> wrote:
> >> Thanks Kristian. I think my concept of databases and specifying
> >> taxonomy is a little underdeveloped. I think what I really want is a
> >> smaller, specific database.
>
> >> On Oct 22, 2:59 am, Kristian <[email protected]> wrote:
>
> >> > Do you mean search a specific database?  The taxonomy file specifies
> >> > the location of a database.
> >> > The GUI automatically generates a taxonomy file based on the database
> >> > and location you specify.
> >> > If you're going to run things in command line, there are other things
> >> > you can do.
>
> >> > What are you trying to do?
>
> >> > To specify the taxonomy, modify the line
> >> > <note type="input" label="list path, taxonomy information">C:\Inetpub\wwwroot\ISB\data\parameters\taxonomy.xml</note>
> >> > in your tandem.params file.
>
> >> > The line I have above is, I believe, the default location.
>
> >> > On Oct 20, 8:20 pm, James Broadbent <[email protected]> wrote:
>
> >> > > Hi Everyone!
>
> >> > > Can anyone tell me how to search a specific taxonomy by specifying it
> >> > > in the tandem.params file when running searches in the TPP GUI?
>
> >> > > Thanks,
>
> >> > > James
>
> > --
> > You received this message because you are subscribed to the Google Groups 
> > "spctools-discuss" group.
> > To post to this group, send email to [email protected].
> > To unsubscribe from this group, send email to 
> > [email protected].
> > For more options, visit this group 
> > at http://groups.google.com/group/spctools-discuss?hl=en.
