Very interesting :)

On Tue, Nov 24, 2009 at 8:04 PM, Sébastien Launay <[email protected]> wrote:
> Hi Paco,
>
> If you are not afraid to get your hands dirty, you can use Luke [1]
> and analyze the indexes found in repository/workspaces/*/index.
> You might want to search the field named '_:FULLTEXT' (told you it
> would get dirty ;)).
>
> [1] http://code.google.com/p/luke/
>
> 2009/11/24 Paco Avila <[email protected]>:
>> Thanks, this is the expected answer :(
>>
>> Anyway, is there any way to detect a failed text extraction? I know
>> I can look at the log, but the failure is not associated with a file
>> or path.
>>
>> Sometimes when I upload a document (Word, PDF, etc.) to my DMS built
>> on Jackrabbit, it is not indexed. Office documents seem to be
>> especially problematic due to their proprietary format. And the
>> problem is that I don't know which document had problems in its text
>> extraction, especially if I use extractorPoolSize > 1.
>>
>> Perhaps this question should be sent to the development list? I think
>> this could be a very useful improvement to Jackrabbit.
>
> --
> Sébastien Launay

-- 
Paco Avila
OpenKM
http://www.openkm.com
http://www.guia-ubuntu.org
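
The index locations mentioned in the quoted reply (repository/workspaces/*/index) can be listed with a short script before opening each one in Luke. This is only a minimal sketch: it assumes the repository home is a directory laid out as in that path, and the function name find_index_dirs is hypothetical, not part of Jackrabbit or Luke. It only locates the directories; inspecting the '_:FULLTEXT' field still has to be done in Luke.

```python
from pathlib import Path


def find_index_dirs(repo_home):
    """Return the per-workspace index directories under a repository home.

    Assumes the layout <repo_home>/workspaces/<workspace>/index, as in the
    path quoted in the thread. Entries named "index" that are not
    directories are skipped.
    """
    root = Path(repo_home)
    return sorted(p for p in root.glob("workspaces/*/index") if p.is_dir())


if __name__ == "__main__":
    # "repository" is the directory name used in the thread; adjust as needed.
    for index_dir in find_index_dirs("repository"):
        print(index_dir)
```

Each printed path is a directory that can be opened in Luke as a Lucene index.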
