#138: Introduce timeout mechanism for long queries
-----------------------+----------------------------------------------------
 Reporter:  lmarian    |       Owner:  lmarian
     Type:  task       |      Status:  new
 Priority:  major      |   Milestone:  v1.0
Component:  WebSearch  |     Version:
 Keywords:             |
-----------------------+----------------------------------------------------
Wildcards are currently allowed for words longer than N letters. This is too
simplistic, because cern* can have lots of variants, while xy* may have fewer.
So the wildcard should be allowed for the term xy, but not for the term cern.

We should therefore use COUNT() to see how many matching terms there may be,
and allow the wildcard if there are fewer than a reasonable limit, or remove
the wildcard if there are more. Example:

  mysql> select count(*) from idxWORD01F where term like 'cern%';

Note that this limiting technique does not work well for every kind of query,
e.g. this one would be very slow to check due to a full table scan:

  mysql> select count(*) from idxWORD01F where term like '%cern%';

Similarly for span queries of the kind:

  mysql> select count(*) from idxWORD01F where term between 'a' and 'y';

For these queries, we had better use an explicit LIMIT clause:

  mysql> select term from idxWORD01F where term between 'a' and 'y' limit 1001;

If the resulting list does contain 1001 terms, then we know we have hit the
limit, and we should remove the wildcards from the term and warn the user that
they were removed because there were too many matching words.

(P.S. Timing out would also have to kill the query on the MySQL side.)

--
Ticket URL: <http://cdswaredev.cern.ch/invenio/ticket/138>
Invenio <http://invenio-software.org>
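
The LIMIT-based check described in the ticket could be sketched roughly as follows. This is only an illustration under stated assumptions: it uses Python's built-in sqlite3 module as a stand-in for MySQL, the helper names `wildcard_allowed` and `make_demo_index` are hypothetical, and the default limit of 1000 is inferred from the `limit 1001` in the example query. It is not the actual WebSearch code.

```python
import sqlite3


def wildcard_allowed(conn, prefix, limit=1000):
    """Return True if at most `limit` index terms start with `prefix`.

    Hypothetical helper: fetches at most limit + 1 rows, so seeing
    limit + 1 rows means the limit was exceeded, without having to
    count every match.
    """
    # Note: '%' or '_' inside `prefix` are not escaped in this sketch.
    rows = conn.execute(
        "SELECT term FROM idxWORD01F WHERE term LIKE ? LIMIT ?",
        (prefix + "%", limit + 1),
    ).fetchall()
    return len(rows) <= limit


def make_demo_index(terms):
    """Build an in-memory table that mimics idxWORD01F (assumed schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE idxWORD01F (term TEXT PRIMARY KEY)")
    conn.executemany(
        "INSERT INTO idxWORD01F (term) VALUES (?)", [(t,) for t in terms]
    )
    return conn


# Five terms start with "cern", one starts with "xy".
demo = make_demo_index(["cern%d" % i for i in range(5)] + ["xylophone"])
cern_ok = wildcard_allowed(demo, "cern", limit=3)
xy_ok = wildcard_allowed(demo, "xy", limit=3)
```

With a demo limit of 3, the wildcard would be dropped for `cern` and kept for `xy`. The same query shape should carry over to MySQL through any DB-API driver, though the placeholder style (`?` versus `%s`) depends on the driver.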