Dear All,

Many thanks for your reply.

Best regards.

Sincerely,
Zheng, Yun

On 12/8/06, [EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote:
>
> Dear Yun Zheng,
>
> > Are there any tools for finding unique sequences from a large database?
> > Many thanks.
> >
> > I need to find unique DNA sequences from a large database. A short piece
> > is given as follows.
> >
> >
> > All these seem to be the same sequence, since BLASTN gives very small
> > e-values for their alignments.
>
> Remember that BLASTN is a local alignment tool. The small e-values
> indicate that some part of your 001 query sequence is similar to some part
> of a sequence in the database.
>
> You need to check what is matching in the alignments reported by BLASTN.
> One useful test is whether the whole length of your query matches any of
> the sequences in the database. For DNA, also check whether it matches in
> one or both directions (as sequences can have biologically significant
> inverted repeats).
>
> There are tools (not in EMBOSS) available for building non-redundant
> databases, either by excluding sequences which are subsequences of others
> in the database, or by selecting one of a set of sequences that match
> closely over their whole length. But you do have to decide what you mean
> by redundancy and make sure that the methods you apply are appropriate.
>
> Hope that helps,
>
> Peter Rice
>
_______________________________________________
EMBOSS mailing list
[email protected]
http://lists.open-bio.org/mailman/listinfo/emboss
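
As a rough illustration of the strictest definition of redundancy mentioned above (a sequence that is an exact subsequence of another, checked on both strands), a minimal Python sketch might look like the following. The function names and the example records are hypothetical and are not part of EMBOSS or BLAST. The sketch only catches exact containment, so it will not merge sequences that merely match closely over their whole length, and the all-against-all comparison is only practical for small sets.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T/N only)."""
    complement = str.maketrans("ACGTN", "TGCAN")
    return seq.upper().translate(complement)[::-1]


def non_redundant(records: dict[str, str]) -> dict[str, str]:
    """Drop sequences that occur exactly inside a kept sequence, on either strand."""
    kept: dict[str, str] = {}
    # Visit the longest sequences first, so that any sequence able to
    # contain another one has already been kept when the shorter one is tested.
    by_length = sorted(records.items(), key=lambda item: len(item[1]), reverse=True)
    for name, seq in by_length:
        seq = seq.upper()
        rc = reverse_complement(seq)
        # Redundant if the sequence, or its reverse complement, is a
        # substring of something already kept (this includes exact duplicates).
        if any(seq in other or rc in other for other in kept.values()):
            continue
        kept[name] = seq
    return kept


# Hypothetical example records, for illustration only.
records = {
    "seq1": "ACGTTGCAAT",
    "seq2": "GTTGC",       # exact subsequence of seq1
    "seq3": "ATTGCAACGT",  # reverse complement of seq1
    "seq4": "GGGCCCAAA",   # unrelated
}
print(list(non_redundant(records)))  # ['seq1', 'seq4']
```

Whether dropping seq3 is correct depends on the definition of redundancy chosen: if strand orientation matters for the analysis, the reverse-complement test should be removed.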
