Anders Aagaard wrote:
> 
> Why not hold back updates, but force flush to disk if a search is called?

It would be too slow, I guess: with 1000s of files cached up, the flush 
could take long enough to cause a bus timeout for the search.

We would only hold back updates for new files. Existing files would be 
indexed straight away, because our indexer is differential: only the 
changed parts are indexed, not the whole content. (You are also unlikely 
to have a large number of files change at the same time.)
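
To illustrate what "differential" means here, a rough Python sketch. The 
function names and the whitespace word splitting are made up for the 
example; this is not the actual indexer code. It only shows that an update 
touches the words whose counts changed, not the whole file.

```python
from collections import Counter


def word_counts(text):
    """Count occurrences of each lower-cased, whitespace-separated word."""
    return Counter(text.lower().split())


def index_delta(old_text, new_text):
    """Return (added, removed) word counts between two versions of a file.

    Hypothetical helper: only these deltas would need to be applied to
    the index, rather than re-indexing the full content.
    """
    old, new = word_counts(old_text), word_counts(new_text)
    added = new - old      # Counter subtraction keeps only positive counts
    removed = old - new
    return dict(added), dict(removed)


if __name__ == "__main__":
    added, removed = index_delta("the quick brown fox", "the quick red fox")
    print("added:", added)
    print("removed:", removed)
```

For a one-word edit, only two index entries would be touched however 
large the file is.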

It is tempting, on the first-run index, to hold back all updates to QDBM 
until it has finished. This should eliminate the performance hits, disk 
thrashing and fragmentation, but you would get no search results back 
until the first run completes.

Subsequent indexing of new files can then be spooled into sqlite and 
flushed periodically into QDBM to minimise the above.
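
A minimal sketch of the spool-and-flush idea, in Python for brevity. 
Everything named here is an assumption for illustration: the IndexSpool 
class, the table layout, the file-count threshold, and the plain dict 
standing in for the QDBM index. It is not the real implementation.

```python
import sqlite3


class IndexSpool:
    """Spool word hits for new files in sqlite, then bulk-flush them."""

    def __init__(self, path=":memory:", flush_threshold=1000):
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS spool ("
            "word TEXT NOT NULL, file_id INTEGER NOT NULL, "
            "score INTEGER NOT NULL)"
        )
        self.flush_threshold = flush_threshold  # files to hold before flushing
        self.pending_files = 0

    def add_file(self, file_id, word_scores, index):
        """Queue one new file's words; flush once enough files are pending."""
        with self.db:  # commits the inserts as a single transaction
            self.db.executemany(
                "INSERT INTO spool VALUES (?, ?, ?)",
                [(word, file_id, score) for word, score in word_scores.items()],
            )
        self.pending_files += 1
        if self.pending_files >= self.flush_threshold:
            self.flush(index)

    def flush(self, index):
        """Move all spooled rows into the index, grouped by word."""
        rows = self.db.execute(
            "SELECT word, file_id, score FROM spool ORDER BY word, file_id"
        ).fetchall()
        for word, file_id, score in rows:
            # 'index' is a dict standing in for the QDBM word index.
            index.setdefault(word, []).append((file_id, score))
        with self.db:
            self.db.execute("DELETE FROM spool")
        self.pending_files = 0


if __name__ == "__main__":
    index = {}
    spool = IndexSpool(flush_threshold=2)
    spool.add_file(1, {"foo": 3}, index)            # held back, not searchable yet
    spool.add_file(2, {"foo": 1, "bar": 2}, index)  # threshold reached, flushed
    print(index)
```

Sorting by word before the flush means each word's entry in the index is 
written in one go rather than once per file, which is where the saving in 
disk thrashing would come from.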

I don't think it's a must-have for newly indexed content to be searchable 
right away.

> 
>> As we are memory conservative, I am planning to do something similar 
>> but using sqlite (instead of precious memory) to cache new files and 
>> then bulk upload. We could easily cache the data for many thousands of 
>> files before uploading them.
> 
> If I remember correctly sqlite3 has some built in cache stuff, you might 
> wanna tweak the standard values a bit.

They are mostly used for sorting and the like. We will be relying on the 
OS dirty write cache instead: in most cases, writing the values to sqlite 
would be instantaneous on any competent platform, and machines with 
extra RAM will benefit from the contents effectively being in memory via 
the disk cache anyhow.
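
For reference, a sketch of the two sqlite knobs involved, again in Python. 
The open_spool helper and its default value are hypothetical, and turning 
off synchronous writes is only my guess at one way to lean on the OS write 
cache. It trades durability for speed, which seems acceptable for a spool 
that can be rebuilt by re-indexing.

```python
import sqlite3


def open_spool(path, cache_kib=2000):
    """Open a spool database tuned to rely on the OS write cache.

    Hypothetical helper; the cache_kib default is illustrative only.
    """
    db = sqlite3.connect(path)
    # Do not wait for data to reach the disk: writes return once they are
    # handed to the OS, which flushes its dirty pages later.
    db.execute("PRAGMA synchronous = OFF")
    # A negative cache_size is interpreted by sqlite as a size in KiB
    # rather than a number of pages.
    db.execute(f"PRAGMA cache_size = {-int(cache_kib)}")
    return db


if __name__ == "__main__":
    db = open_spool("spool.db")
    print(db.execute("PRAGMA synchronous").fetchone())
    print(db.execute("PRAGMA cache_size").fetchone())
```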

-- 
Mr Jamie McCracken
http://jamiemcc.livejournal.com/

_______________________________________________
tracker-list mailing list
[email protected]
http://mail.gnome.org/mailman/listinfo/tracker-list
