Hi Bernd,

Bernd Web wrote:
> Hi,
>
> Sometimes I use an EMBOSS command directly on a FastA file.
> I wonder if it is possible to select the ID used in the output, esp
> for FastA records with an NCBI defline.
>
>> gi|248166|g|AA21972.1| description...
>
> in the output of an EMBOSS command becomes:
> AA21972.1|
>
> It would be very easy if the ID could be chosen to be the GI number.
> Now the ID used depends on the GI record (sp, pdb, pir) show different
> IDs in EMBOSS output.

Did you mistype the defline? There is a defined set of database names that can appear in NCBI deflines. If the "|g|" is really "gb" then the ID will be AA21972, which is what I would expect.

If the database name is invalid (or a new one unknown to EMBOSS) then we could try to use the GI number, but the "EMBOSS way" would be to use the accession number from the sequence version. Unfortunately, at present it is using the last part of the sequence version ("1") as the ID in your example. I will fix it for the next release.

You can use -sid on the command line to give an ID to a sequence that does not have one, but not to replace an existing ID. That seems strange. It may change for the next release so that you can always use -sid to define the ID.

Hope that helps

Peter
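
As a stopgap when the GI number is wanted as the ID, the deflines can be rewritten before the file is passed to an EMBOSS program. The following is only a minimal sketch in Python, not anything EMBOSS provides: the names parse_ncbi_defline and use_gi_as_id are hypothetical, and it assumes the simple "gi|number|db|accession.version| description" shape from the example above. Deflines with extra fields (for instance a pdb chain) are not handled specially.

```python
import re

# Hypothetical helper, not part of EMBOSS. Assumes a defline of the form
#   >gi|<number>|<db>|<accession.version>| description
DEFLINE_RE = re.compile(r"^>?\s*gi\|(\d+)\|([^|]*)\|([^|]*)\|?\s*(.*)$")


def parse_ncbi_defline(line):
    """Return the defline fields as a dict, or None if the shape does not match."""
    m = DEFLINE_RE.match(line.strip())
    if m is None:
        return None
    gi, db, seqversion, description = m.groups()
    return {
        "gi": gi,
        "db": db,
        # Accession is the sequence version without the ".N" suffix.
        "accession": seqversion.split(".", 1)[0],
        "seqversion": seqversion,
        "description": description,
    }


def use_gi_as_id(fasta_lines):
    """Yield FASTA lines, rewriting gi-style deflines to '>GI description'."""
    for line in fasta_lines:
        if line.startswith(">"):
            fields = parse_ncbi_defline(line)
            if fields is not None:
                yield ">{} {}".format(fields["gi"], fields["description"]).rstrip()
                continue
        # Sequence lines and unrecognised deflines pass through unchanged.
        yield line.rstrip("\n")
```

For example, "\n".join(use_gi_as_id(open("input.fasta"))) gives the rewritten text for a hypothetical file input.fasta, which can then be saved and used as the EMBOSS input.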
