Thanks Peter, and I think the client actually attaches specific meanings to particular prefixes (e.g., "J13* = 'batch'), which we may even factor into a separate field, so we can search on two fields, one exact on a category and the other as you suggest. At the very least, when indexed, this would avoid a complete table walk. In practice, I also used KinoSearch, and I have moved some of these searches into KinoSearch, using a phrase search on character tokens (!). Since these names are generally eight characters or less, this is not a huge overhead (compared to the other KinoSearch fields) and seems to have good performance, for filtering at least. Initially I did this because we were using this as a filter on other KinoSearch fields, and it was easier to handle everything in Kino rather than get Kino hits lists and then filter against an SQL query. So yes, there is definitely opportunity to optimize, and we are doing so wherever we can.

On the SQL side, we do use some similar techniques, but we are hit by several issues at low levels in Perl/IIS. In particular, DBI calls to retrieve a row are blocking. alarm seems to be able to cancel a database request, at least for some drivers, so we are not too worried about this.

The issue is that big searches may return thousands of hits which then need to be rendered. Because IIS loves buffering and won't tell us when it cancelled, IIS just seems to enjoy taking all the web server resources, possibly for minutes at a time. If these resources were being used to display information in a browser, we would be doing what the client expects. To do them anyway, when a user may have refined their search (as is typical) and re-executed it, is just an enormous waste of resources.

Most of this is because IIS is a total pain to get working with FastCGI/Catalyst. I have been trying for days, and generally get

Error Details:

    * The FastCGI process exited unexpectedly
    * Error Number: -2147467259 (0x80004005).
    * Error Description: Unspecified error

more or less whatever I try. I am sure most of these are permissions issues, but without any real feedback from IIS, they are more or less impossible to find.

I must admit I am kind of hoping against hope that FastCGI will allow us to pick up more from IIS than PerlEx does, and given the close support for PHP with Microsoft and Zend, it looks like a huge amount more resource is thrown at FastCGI rather than PerlEx. I love Catalyst (as a long-term Perl developer, who also used the Java Spring framework, it gives me just what I need, quickly and well. And if i could get IIS/FastCGI working I would happily write a nice howto on it, but I am nowhere close, and nor can I find that anyone else is, apart from the current documentation, which as far as I am aware is simply:
It is possible to run Catalyst under IIS with FastCGI, but we do not yet have detailed instructions.
All the best
Stuart


Peter Corlett wrote:
On Wed, Oct 29, 2008 at 10:11:35AM -0400, Stuart Watt wrote:
[...]
To give context, the queries that are an issue are SQL queries against a
database that contains millions of components, where users may construct
wildcard queries of the form "*A*", with additional filtering constraints.

If that was a reasonably popular query, I would certainly consider building
a table which contains substrings of that field, so that it becomes a much
more optimisable "LIKE 'A%'" query.


_______________________________________________
List: [email protected]
Listinfo: http://lists.scsys.co.uk/cgi-bin/mailman/listinfo/catalyst
Searchable archive: http://www.mail-archive.com/[email protected]/
Dev site: http://dev.catalyst.perl.org/

--
This message was scanned by ESVA and is believed to be clean.
Click here to report this message as spam. http://antispam.infobal.com/cgi-bin/learn-msg.cgi?id=ABB412806C.0DEDB



_______________________________________________
List: [email protected]
Listinfo: http://lists.scsys.co.uk/cgi-bin/mailman/listinfo/catalyst
Searchable archive: http://www.mail-archive.com/[email protected]/
Dev site: http://dev.catalyst.perl.org/

Reply via email to