On 12.06.2014 20:50, Jan Ulrich Hasecke wrote:
> Hi,
> 
> I have a project with 600+ html files. The built-in search of Sphinx
> cannot handle this number of files in a reasonable time.
> 
> I never had a project with that many files, so I don't know
> whether 600+ files are way too much for Sphinx to handle or whether
> there is a glitch in my configuration.
> 
> Has anyone used Sphinx with more than 600 files?
Only rarely. It is dog slow, especially when using prefix/suffix
includes and many small files, as those are parsed over and over again
by docutils. There is nothing like precompiled headers to avoid that.
Even our old, totally broken docbook/xml toolchain was faster.

I split our project into smaller chunks (around 300 files each), but
we don't really use the built-in search much, as we have Solr/Lucene
installed anyway.

If you patch out the search index creation, things get faster by about
30% (for older Sphinx versions).

Additionally, you should use the absolute minimal set of rst
substitutions and no includes in rst_prolog/rst_epilog, as those get
parsed over and over again, which can easily double the time needed if
you have lots of small files (like we had from an auto-generated
docbook -> rst conversion of some huge books, around 1500 pages in PDF).
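
As a rough sketch of what I mean, assuming the prefix/suffix settings
above are Sphinx's rst_prolog and rst_epilog values in conf.py. The
substitution name and project title are made up for illustration:

```python
# conf.py (sketch; the substitution and project name are hypothetical)
#
# Whatever is placed in rst_prolog / rst_epilog is added to every source
# file that is read, so it is parsed once per file. Keep both short and
# avoid ".. include::" directives in them.
rst_prolog = """
.. |project| replace:: Example Project
"""

# Nothing appended to the end of each file.
rst_epilog = ""
```

With several hundred small files, every line saved here is saved
several hundred times per build.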

But all that doesn't help if you need the built-in search. It is no
wonder it is so slow: it tries to store the inverted word index in
memory, and the usual wisdom is that such an index can be as large as
or larger than the original documents. Try using an external search
engine.
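
To illustrate why such an index grows with the documents, here is a
minimal inverted index in Python. This is only a sketch of the general
technique, not Sphinx's actual implementation; the function name and
the sample documents are invented for the example:

```python
import re
from collections import defaultdict


def build_inverted_index(documents):
    """Map each lowercased word to the set of document names containing it."""
    index = defaultdict(set)
    for name, text in documents.items():
        # Split the text into word tokens; punctuation is dropped.
        for word in re.findall(r"\w+", text.lower()):
            index[word].add(name)
    return index


# Hypothetical sample documents, keyed by file name.
docs = {
    "intro.rst": "Sphinx builds documentation from reStructuredText",
    "search.rst": "The built-in search uses an index of words",
}
index = build_inverted_index(docs)
```

Every distinct word gets an entry, and every entry lists all files
that contain the word, so the whole structure has to be held at once
while it is being built.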

Michael

-- 
Michael Schlenker
Software Architect

CONTACT Software GmbH           Tel.:   +49 (421) 20153-80
Wiener Straße 1-3               Fax:    +49 (421) 20153-41
28359 Bremen
http://www.contact.de/          E-Mail: [email protected]

Sitz der Gesellschaft: Bremen
Geschäftsführer: Karl Heinz Zachries, Ralf Holtgrefe
Eingetragen im Handelsregister des Amtsgerichts Bremen unter HRB 13215
