On Fri, 2007-06-29 at 18:18 +1000, Simon Cullen wrote:
> I have been looking for a desktop search that will index *all* of my
> PDFs---some of which are thousands of pages long.  Google only turns
> up one hit when I search for MaxTextToIndex -- the variable in the
> tracker.cfg file which sets the "maximum size of text in bytes to
> index from a file's text contents". I have tried setting this to
> various things---but I cannot seem to get tracker to go further than
> (say) 20 pages into a document... 
> 
> Is there a way I can just disable the limit---so that it will go on
> until the end of any file it finds?  I only use tracker to index
> pdfs---so it doesn't have to deal with any of the other junk on my
> system. I don't mind if the index is large---but if there is a better
> system I should consider for this, I'd love to hear about it.  
> 
> I've recently been forced into trying Google Desktop, which is fine,
> but suffers from an even stricter limit in this regard (about 10
> pages?). 
> 
> Any handy hints will be VERY much appreciated,


We only index the first 10,000 unique words in any one document and/or
only the first 1 MB of text.

I believe the maximum text size limit can be adjusted in the config
file, but the word limit is hardcoded (it needs to be changed to use a
config variable). Note that these settings greatly affect the memory
usage of trackerd when indexing.
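
To illustrate how the two limits interact, here is a minimal Python
sketch. It is not tracker's actual code: the function name, the
tokenizer, and the choice to stop at the first new word past the cap are
all assumptions made for illustration. Only the two default values (1 MB
of text, 10,000 unique words) come from the description above.

```python
import re

MAX_TEXT_BYTES = 1024 * 1024   # "first 1 MB of text" (adjustable in the config file)
MAX_UNIQUE_WORDS = 10_000      # "first 10,000 unique words" (hardcoded)


def words_to_index(text, max_bytes=MAX_TEXT_BYTES, max_unique=MAX_UNIQUE_WORDS):
    """Hypothetical helper: return the unique words kept under both caps."""
    # Apply the byte cap first; drop any partial character left by the cut.
    clipped = text.encode("utf-8")[:max_bytes].decode("utf-8", errors="ignore")

    seen = {}  # dict used as an insertion-ordered set
    for word in re.findall(r"\w+", clipped.lower()):
        if word not in seen:
            if len(seen) >= max_unique:
                break  # assumption: indexing stops once the word cap is hit
            seen[word] = True
    return list(seen)


if __name__ == "__main__":
    doc = " ".join(f"w{i}" for i in range(50))
    # Raising the byte limit alone does not help if the word cap is hit first.
    print(len(words_to_index(doc, max_bytes=10**9, max_unique=10)))
```

Under these assumptions, whichever cap is reached first decides where
indexing of a document ends, so a larger text size limit would not by
itself get further into a long PDF while the word cap stays fixed.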

I will look at adding support for these in the tracker-preferences UI in
the near future.

jamie


