It'd be nice if this were changed so that, if a PDF has no title, the
first xx words of the document become the title.
(It seems that Google's title process is more advanced than this, though.)
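
A rough sketch of the fallback I have in mind, in Java since that is what
Nutch is written in. This is only an illustration: TitleFallback and
chooseTitle are made-up names, not part of Nutch or its PDF parser, and the
word count is a parameter because I have not settled on a number. It assumes
the metadata title, the extracted text and the URL are already available as
strings.

```java
import java.util.Arrays;

// Hypothetical helper, not part of Nutch: picks a display title for a document.
final class TitleFallback {

    // Order of preference: metadata title, then the first maxWords words of
    // the extracted text, then the URL (the behaviour described below).
    static String chooseTitle(String metadataTitle, String bodyText,
                              String url, int maxWords) {
        if (metadataTitle != null && !metadataTitle.trim().isEmpty()) {
            return metadataTitle.trim();
        }
        if (bodyText != null && maxWords > 0) {
            String trimmed = bodyText.trim();
            if (!trimmed.isEmpty()) {
                // Split on any run of whitespace, including line breaks.
                String[] words = trimmed.split("\\s+");
                int n = Math.min(maxWords, words.length);
                return String.join(" ", Arrays.copyOfRange(words, 0, n));
            }
        }
        // Nothing usable in the document: keep showing the URL.
        return url;
    }
}
```

The URL stays as the last resort, so a PDF with neither a title nor any
extractable text would be displayed exactly as it is today.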
Jérôme Charron wrote:
When searching with Nutch, the title of PDF documents is the URL of the
file, like:
http://www.ists.dartmouth.edu/library/wse0901.pdf
In Nutch, the title of a PDF file is displayed if a title is available;
otherwise, the URL of the document is displayed.
Regards
Jérôme
--
http://motrech.free.fr/
http://www.frutch.org/
_______________________________________________
Nutch-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-general