Re: [EMBOSS] how to find unique DNA sequences from a large database

yun zheng Fri, 08 Dec 2006 20:09:39 -0800

Dear All,

Many thanks for your reply.


Best regards.

sincerely
zheng, yun


On 12/8/06, [EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote:
>
> Dear Yun Zheng,
>
> > Are there any tools for finding unique sequences in a large database?
> > Many thanks.
> >
> > I need to find unique DNA sequences from a large database. A short piece
> > is given as follows.
> >
>
> > All these seem to be the same sequence, since BLASTN gives very small
> > e-values for their alignments.
>
> Remember that BLASTN is a local alignment tool. The small e-values
> indicate that some part of your 001 query sequence is similar to some part
> of a sequence in the database.
>
> You need to check what is matching in the alignments reported by BLASTN.
> One useful test is whether the whole length of your query matches any of
> the sequences in the database and, for DNA, whether it matches in one or
> both directions (as sequences can have biologically significant inverted
> repeats).
>
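
The whole-length and direction checks described above can be sketched in
Python. This is only an illustration, assuming the BLASTN hits were saved in
the 12-column tabular format (-m 8 in blastall, -outfmt 6 in BLAST+: qseqid
sseqid pident length mismatch gapopen qstart qend sstart send evalue
bitscore). The function name, the query_lengths dictionary and the 0.95
threshold are hypothetical choices, not part of BLAST or EMBOSS. It looks at
each reported alignment separately, so a query covered only by several
shorter alignments would not be flagged.

```python
def full_length_hits(blast_lines, query_lengths, min_coverage=0.95):
    """Return (query, subject, coverage, strand) for near full-length hits.

    blast_lines   : iterable of tab-separated BLASTN tabular lines
    query_lengths : dict of query id -> query length (the tabular format
                    does not carry it by default, so it is supplied here)
    min_coverage  : illustrative cut-off on the fraction of the query aligned
    """
    hits = []
    for line in blast_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        qseqid, sseqid = fields[0], fields[1]
        qstart, qend, sstart, send = (int(x) for x in fields[6:10])
        # Fraction of the query covered by this single alignment.
        coverage = (abs(qend - qstart) + 1) / query_lengths[qseqid]
        # In BLASTN tabular output a minus-strand hit has sstart > send.
        strand = "plus" if sstart <= send else "minus"
        if coverage >= min_coverage:
            hits.append((qseqid, sseqid, round(coverage, 3), strand))
    return hits


# Made-up hit lines for a 100 bp query called 001.
example = [
    "001\t002\t100.00\t100\t0\t0\t1\t100\t1\t100\t1e-50\t185",
    "001\t003\t98.00\t50\t1\t0\t10\t59\t200\t151\t1e-20\t90",
    "001\t004\t99.00\t100\t1\t0\t1\t100\t100\t1\t1e-48\t180",
]
for hit in full_length_hits(example, {"001": 100}):
    print(hit)
```

Here the hit against 003 is dropped because only half of the query aligns,
while 002 and 004 match over the whole query, on the plus and minus strand
respectively.
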
> There are tools (not in EMBOSS) available for building non-redundant
> databases - excluding sequences which are subsequences of others in the
> database, or selecting one of a set of sequences that match closely over
> their whole length. But you do have to decide what you mean by redundancy
> and make sure that the methods you apply are appropriate.
>
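
To make the point about defining redundancy concrete, here is a small Python
sketch under one possible definition: a sequence is redundant if it is
identical to, or an exact subsequence of, a sequence already kept, on either
strand. This is an assumption made for illustration and not one of the tools
mentioned above. The function names are hypothetical, near-identical
sequences with mismatches are not handled, and the all-against-all
comparison will be too slow for a genuinely large database.

```python
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence (A, C, G, T, N only)."""
    return seq.translate(_COMPLEMENT)[::-1]


def non_redundant(records):
    """Keep one representative per redundant group.

    records : dict of sequence name -> DNA sequence
    Returns a dict of the kept names -> upper-cased sequences.
    """
    # Visit the longest sequences first, so that a shorter sequence can
    # only ever be contained in something that has already been kept.
    ordered = sorted(records.items(), key=lambda kv: (-len(kv[1]), kv[0]))
    kept = {}
    for name, seq in ordered:
        seq = seq.upper()
        rc = reverse_complement(seq)
        # Redundant if found in a kept sequence on either strand.
        if any(seq in k or rc in k for k in kept.values()):
            continue
        kept[name] = seq
    return kept


# Made-up sequences: b duplicates a, c is inside a, d is a's reverse
# complement, e is unrelated.
records = {
    "a": "ACGTACGTAA",
    "b": "ACGTACGTAA",
    "c": "GTACG",
    "d": "TTACGTACGT",
    "e": "GGGGCCCCAT",
}
print(sorted(non_redundant(records)))
```

With a different definition of redundancy (for example, a percent identity
threshold over the whole length) the comparison inside the loop is the part
that would change.
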
> Hope that helps,
>
> Peter Rice
>
>
_______________________________________________
EMBOSS mailing list
[email protected]
http://lists.open-bio.org/mailman/listinfo/emboss
