Hi David,
it seems the index is OK; it is just that the database query code cannot
handle the ":", which has a special meaning in USAs. As a workaround you
can replace the ":" with a "*".
$ entret -stdout -auto 'imgthla-key:A*02*364'
will return the entry HLA08011.
But be aware that this actually generates a wildcard query, so
the * matches any run of characters at that position.
Unfortunately that is not going to work for this case since the HLA
alleles use a somewhat nested nomenclature, for example:
a*01:01:02
a*01:02
a*02:01:02
a*02:101:02
However, a little experimentation indicates that EMBOSS supports the
single-character wildcard '?', so something like:
$ entret -stdout -auto 'imgthla-key:A?01?02'
appears to do what I want in most cases.
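
To illustrate why the two wildcards behave differently on nested allele
names, here is a minimal Python sketch. It uses the standard library's
shell-style matcher (fnmatch) as a stand-in; that EMBOSS index queries
follow the same '*' and '?' semantics is an assumption based on the
behaviour described above, not something checked against the EMBOSS
source. The names are the four examples listed earlier, upper-cased.

```python
from fnmatch import fnmatchcase

# Example allele names from the list above (upper-cased).
names = ["A*01:01:02", "A*01:02", "A*02:01:02", "A*02:101:02"]

def matches(pattern, candidates):
    """Return the candidates that match a shell-style wildcard pattern.

    Assumption: '*' matches any run of characters and '?' matches exactly
    one character, as in shell globbing.
    """
    return [name for name in candidates if fnmatchcase(name, pattern)]

# Replacing the special characters with '*' is far too permissive.
star_hits = matches("A*01*02", names)

# Replacing them with '?' keeps the length of the term fixed.
qmark_hits = matches("A?01?02", names)

print(star_hits)
print(qmark_hits)
```

Under these assumptions the '*' pattern matches all four names, while
the '?' pattern matches only A*01:02.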
That said, it would be better to have a way to escape the special
characters (i.e. '*', ':' and '?') in the search term when an exact
match is required (as in this case).
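
Until such an escape mechanism exists, the '?' substitution can be
scripted. The following is only a sketch: the helper names
(to_wildcard_term, build_usa) are hypothetical, and the set of special
characters is simply the three mentioned above.

```python
# Characters reported above as special in a USA query term.
SPECIAL_CHARS = "*:?"

def to_wildcard_term(term):
    """Replace each special character with the single-character wildcard '?'."""
    return "".join("?" if ch in SPECIAL_CHARS else ch for ch in term)

def build_usa(db_field, term):
    """Build a 'database-field:term' USA using the '?' workaround."""
    return f"{db_field}:{to_wildcard_term(term)}"

usa = build_usa("imgthla-key", "A*02:364")
print(usa)  # imgthla-key:A?02?364
```

Because '?' matches any single character, the entries returned would
still need to be checked against the original allele name if an exact
match is required.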
Thanks,
Hamish
Kind regards, David.
-----Original Message-----
From: emboss-boun...@lists.open-bio.org
[mailto:emboss-boun...@lists.open-bio.org] On Behalf Of Hamish McWilliam
Sent: 23 August 2013 11:25
To: emboss@lists.open-bio.org
Subject: [EMBOSS] Escaping query terms in a USA
Hi folks,
In the IMGT/HLA database (http://www.ebi.ac.uk/ipd/imgt/hla/) the
keywords field in the EMBL-Bank format flat-file contains allele
names like:
A*02:364
While I can build an index containing the keywords, it does not
appear to be possible to search the index with the allele names. For
example:
$ entret -stdout -auto 'imgthla-key:Allele'
works as expected, but:
$ entret -stdout -auto 'imgthla-key:A*02:364'
just gives errors:
Error: Failed to open filename 'imgthla-key'
Error: Unable to read sequence 'imgthla-key:A*02:364'
Died: entret terminated: Bad value for '-sequence' with -auto defined
I am guessing that the problem is the '*' and ':' characters in the
term... so is there some way to escape these, or are the terms in the
index mangled in some way?
All the best,
Hamish
--
Mr Hamish McWilliam,
Web Production,
European Bioinformatics Institute (EMBL-EBI),
European Molecular Biology Laboratory,
Wellcome Trust Genome Campus,
Hinxton, Cambridge, CB10 1SD
United Kingdom
URL: http://www.ebi.ac.uk/
___
EMBOSS mailing list
EMBOSS@lists.open-bio.org
http://lists.open-bio.org/mailman/listinfo/emboss