Many thanks for all the replies. Unless it is at the beginning (or end?)
of a sequence, "X" seems to be treated as a pseudo-residue that counts
towards the homology score (i.e., if two aligned sequences have matching
disordered loops denoted by X, this region is treated as identical). I've
tried a bunch of non-alphanumeric characters, which either cause a crash or
are stripped out. I'd like a wildcard that keeps residues in register but
flags the region for exclusion from the homology analysis.
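
Failing a server-side wildcard, one possible workaround is to keep "X" as
the placeholder and recompute the identity afterwards from the returned
alignment, skipping every column where either sequence has an X or a gap.
The sketch below is only an illustration of that idea: the function name
and its arguments are made up for this example and are not part of BLAST
or any other tool.

```python
def identity_excluding_masked(aln_a, aln_b, mask="X", gap="-"):
    """Percent identity over aligned columns, ignoring masked/gapped ones.

    aln_a and aln_b are two rows of a pairwise alignment (equal length).
    Returns (percent_identity, number_of_columns_compared).
    """
    if len(aln_a) != len(aln_b):
        raise ValueError("aligned sequences must have the same length")

    compared = 0
    matches = 0
    for a, b in zip(aln_a.upper(), aln_b.upper()):
        # Skip columns flagged for exclusion in either sequence.
        if a in (mask, gap) or b in (mask, gap):
            continue
        compared += 1
        if a == b:
            matches += 1

    if compared == 0:
        return 0.0, 0
    return 100.0 * matches / compared, compared


# Hypothetical example: the XXX loop no longer counts as three matches.
pid, n = identity_excluding_masked("MKXXXLV", "MKXXXLI")
print(f"{pid:.1f}% identity over {n} columns")
```

This leaves the alignment itself (and hence the numbering) untouched and
changes only how the score is counted.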

The other suggestions about using the sequence information in the PDB
header are great but, as mentioned, some PDB files don't have this
information. Also, if you have the structure of, say, the C-terminal domain
of a protein, the N-terminal residues will not be present in the header.
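
For files without usable header information, a sequence that stays in
register with the PDB numbering can in principle be built from the ATOM
records alone, padding any missing residue numbers with "X". The following
is a minimal sketch under several assumptions: standard fixed-column PDB
format, only the 20 standard residues, first model only, insertion codes
ignored, and numbering starting at 1 (residues numbered below `start` are
dropped). The function name and parameters are hypothetical.

```python
THREE_TO_ONE = {
    "ALA": "A", "ARG": "R", "ASN": "N", "ASP": "D", "CYS": "C",
    "GLN": "Q", "GLU": "E", "GLY": "G", "HIS": "H", "ILE": "I",
    "LEU": "L", "LYS": "K", "MET": "M", "PHE": "F", "PRO": "P",
    "SER": "S", "THR": "T", "TRP": "W", "TYR": "Y", "VAL": "V",
}


def sequence_in_register(pdb_lines, chain="A", start=1, placeholder="X"):
    """Build a one-letter sequence whose i-th letter is PDB residue start+i.

    Residue numbers with no CA atom in the chosen chain become `placeholder`.
    """
    seen = {}
    for line in pdb_lines:
        if line.startswith("ENDMDL"):
            break  # assumption: use the first model only
        if not line.startswith("ATOM"):
            continue
        # Fixed columns: atom name 13-16, residue name 18-20,
        # chain ID 22, residue number 23-26.
        if line[12:16].strip() != "CA" or line[21] != chain:
            continue
        resnum = int(line[22:26])
        resname = line[17:20].strip()
        # Keep the first residue seen for a number (ignores insertion codes).
        seen.setdefault(resnum, THREE_TO_ONE.get(resname, placeholder))

    if not seen:
        return ""
    return "".join(seen.get(i, placeholder) for i in range(start, max(seen) + 1))
```

Called as `sequence_in_register(open("model.pdb"))` (the file name is
illustrative), this would return a string in which position 1 corresponds
to residue 1, with X wherever the structure has no coordinates. Given the
observation above that leading X characters may be handled differently,
`start` can be set to the first observed residue number instead.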

Many thanks in advance for any further tips/advice!


On Sat, Sep 21, 2013 at 8:42 AM, Mo Wong <[email protected]> wrote:

> Hi,
>
> I'm trying to do sequence alignments that are generated using PDB files as
> the sequence source so are often missing residues within the sequence. Is
> there any way to run BLAST (or related server - not a local program) that
> accepts wildcards so I can keep the numbering in the resulting alignment in
> register with the PDB? I've Googled round, and I'm surprised that I can't
> find this addressed anywhere.
>
> Thanks!
>
