On 12.06.2014 20:50, Jan Ulrich Hasecke wrote:
> Hi,
>
> I have a project with 600+ html files. The built-in search of Sphinx
> cannot handle these amount of files in a reasonable time.
>
> I never had a project with such an amount of files, so I don't know
> whether 600+ files are way too much for Sphinx to handle or whether
> there is a glitch in my configuration.
>
> Has anyone used Sphinx with more than 600 files?

Only rarely. Sphinx gets dog slow at that size, especially when you
combine includes (in particular includes in the prefix/suffix text that
is added to every file) with many small files, because docutils parses
those includes again for every single file. There is nothing like
precompiled headers to avoid the repeated work. Even our old, totally
broken DocBook/XML toolchain was faster.
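
To illustrate the point about per-file re-parsing, here is a minimal
conf.py sketch. It assumes the prefix/suffix text is set through
Sphinx's rst_prolog and rst_epilog settings, which are added to every
source file before parsing. The substitution name and the product name
are made up for the example; this is not our actual configuration.

```python
# conf.py (fragment) - a sketch under the assumptions stated above.

# rst_prolog is prepended to every source file, so anything in it is
# parsed once per file. Leaving it empty costs nothing.
rst_prolog = ""

# rst_epilog is appended to every source file. Keep it to the few
# substitutions that are needed almost everywhere, and do not pull in
# other files from here.
# "|product|" and "ExampleProduct" are hypothetical placeholders.
rst_epilog = """
.. |product| replace:: ExampleProduct
"""
```

Substitutions that only a handful of pages need can be defined in those
pages directly, so the other files do not pay for them.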

I split our project into smaller chunks (around 300 files each), but we
don't use the built-in search much, as we have Solr/Lucene installed
anyway.

If you patch out the search index creation, builds get about 30% faster
(for older Sphinx versions). Additionally, you should use the absolute
minimum of rst substitutions and no includes in rst_prolog/rst_epilog,
as those get parsed again for every file. That can easily double the
build time if you have lots of small files, as we had from an
auto-generated DocBook -> rst conversion of some huge books (around
1500 pages in PDF).

But none of that helps if you need the built-in search. It is no wonder
it is so slow: it tries to hold the inverted word index in memory, and
the usual wisdom is that such an index can be as large as or larger
than the original documents.

Try using an external search engine.

Michael

-- 
Michael Schlenker
Software Architect

CONTACT Software GmbH           Tel.:   +49 (421) 20153-80
Wiener Straße 1-3               Fax:    +49 (421) 20153-41
28359 Bremen
http://www.contact.de/          E-Mail: [email protected]

Registered office: Bremen
Managing directors: Karl Heinz Zachries, Ralf Holtgrefe
Registered in the commercial register of the Amtsgericht Bremen under HRB 13215
