subsetdb worked perfectly! Thanks again!
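For anyone reading this thread without the TPP binaries to hand, the kind of description-line filtering discussed in the quoted messages below can be sketched in a few lines of Python. This is only an illustration, not how subsetdb is implemented: the function names (`read_fasta`, `subset_fasta`, `format_fasta`) are made up for this example, and treating multiple match strings as "keep the entry if any of them is present" is an assumption, not a statement about how subsetdb combines its -M options.

```python
from typing import Iterable, Iterator, List, Sequence, Tuple


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (description, sequence) pairs; description omits the leading '>'."""
    header = None
    chunks: List[str] = []
    for raw in lines:
        line = raw.strip()
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None and line:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def subset_fasta(
    lines: Iterable[str],
    match: Sequence[str],
    no_match: Sequence[str] = (),
) -> Iterator[Tuple[str, str]]:
    """Keep entries whose description contains any `match` string
    and none of the `no_match` strings (assumed semantics)."""
    for header, seq in read_fasta(lines):
        wanted = any(m in header for m in match)
        excluded = any(n in header for n in no_match)
        if wanted and not excluded:
            yield header, seq


def format_fasta(entries: Iterable[Tuple[str, str]], width: int = 60) -> str:
    """Render entries as FASTA text, wrapping sequences at `width` residues."""
    out: List[str] = []
    for header, seq in entries:
        out.append(">" + header)
        out.extend(seq[i:i + width] for i in range(0, len(seq), width))
    return "".join(line + "\n" for line in out)


# Tiny in-memory example; the two entries are invented for illustration.
sample = [
    ">sp|X00001|AAA_HUMAN Example protein A OS=Homo sapiens GN=AAA",
    "MKTAYIAK",
    "QRQISFVK",
    ">sp|X00002|BBB_DROME Example protein B OS=Drosophila melanogaster GN=BBB",
    "MGGSLLNP",
]
human_only = list(subset_fasta(sample, match=["OS=Homo sapiens"]))
print(format_fasta(human_only), end="")
```

To run this on a real database, pass an open file handle instead of the list (for example `open("uniprot_sprot.fasta")`) and write the result of `format_fasta` to a new file. Matching on the plain text `OS=Homo sapiens` needs no caret, since the string is not passed through a command line.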
On Fri, Oct 22, 2010 at 12:36 PM, James Broadbent <[email protected]> wrote:
> Thank you both for the advice. That helps a lot!
>
> The subsetdb.exe program is just what I'm after, as I'm trying to extract
> human-only sequences from swissprot/uniprot. I've been trying to use sed in
> bash to extract the identifiers to then create a new database, but have had
> no luck so far. I'll run subsetdb and then append a decoy library for Tandem
> searching.
>
> Thanks again!
>
> On Fri, Oct 22, 2010 at 10:31 AM, Jimmy Eng <[email protected]> wrote:
>
>> If there's some set of unique identifiers in the original database that
>> denotes all the proteins you want in the subset database, you can use
>> the subsetdb program. It's distributed as part of the TPP (the binary
>> typically exists at c:\inetpub\tpp-bin\subsetdb.exe) but there's no web
>> interface to it that I'm aware of.
>>
>> As an example, to create a drosophila subset of the uniprot database,
>> you do something like:
>>
>> subsetdb.exe -MOS=Drosophila^melanogaster -ofly.fasta uniprot_sprot.fasta
>>
>> This creates an output file "fly.fasta" that contains all entries with
>> the text "OS=Drosophila melanogaster" in the protein description line.
>> The caret (^) character replaces a space. You can have multiple -M
>> match text string options, no-match -N strings, etc. Running the
>> executable without input arguments will show the usage statement.
>>
>> On Thu, Oct 21, 2010 at 4:41 PM, Kristian <[email protected]> wrote:
>> > Okay. To do that, all you can really do is make a smaller database.
>> > There's no function in the TPP that will allow you to select a subset
>> > of your database. However, it's really easy to edit your database.
>> > Open your database in a text editor (e.g. WordPad) and you'll see the
>> > format the entries have. Use this format to create a new database
>> > that only contains the entries you are interested in.
>> > Note that searching against a small database will compromise your
>> > statistics, partly because if you're only searching against a small
>> > number of possible matches, X!Tandem will probably find something that
>> > matches, even if poorly, and partly because PeptideProphet's error
>> > model works best if there is a large number of incorrect hits as well
>> > as correct hits. For the best results, add decoys to your database.
>> > You can add decoys using the tool in the TPP, or you can simply embed
>> > your proteins of interest in a database for another organism whose
>> > proteins should not give you any positive hits.
>> >
>> > On Oct 21, 3:24 pm, James Broadbent <[email protected]> wrote:
>> >> Thanks Kristian. I think my concept of databases and specifying
>> >> taxonomy is a little underdeveloped. I think what I really want is a
>> >> smaller, specific database.
>> >>
>> >> On Oct 22, 2:59 am, Kristian <[email protected]> wrote:
>> >>
>> >> > Do you mean search a specific database? The taxonomy file specifies
>> >> > the location of a database.
>> >> > The GUI automatically generates a taxonomy file based on the database
>> >> > and location you specify.
>> >> > If you're going to run things from the command line, there are other
>> >> > things you can do.
>> >>
>> >> > What are you trying to do?
>> >>
>> >> > To specify the taxonomy, modify the line
>> >> > <note type="input" label="list path, taxonomy information">C:\Inetpub\wwwroot\ISB\data\parameters\taxonomy.xml</note>
>> >> > in your tandem.params file.
>> >>
>> >> > The line I have above is, I believe, the default location.
>> >>
>> >> > On Oct 20, 8:20 pm, James Broadbent <[email protected]> wrote:
>> >>
>> >> > > Hi Everyone!
>> >>
>> >> > > Can anyone tell me how to search a specific taxonomy by specifying it
>> >> > > in the tandem.params file when running searches in the TPP GUI?
>> >>
>> >> > > Thanks,
>> >>
>> >> > > James
>> >
>> > --
>> > You received this message because you are subscribed to the Google Groups "spctools-discuss" group.
>> > To post to this group, send email to [email protected].
>> > To unsubscribe from this group, send email to [email protected].
>> > For more options, visit this group at http://groups.google.com/group/spctools-discuss?hl=en.
