Hi Florin

You can create a custom Tika config file:

<properties>
    <parsers>
        <!-- Load all available parsers -->
        <parser class="org.apache.tika.parser.DefaultParser"/>
        <!-- Override parsing of all types supported by CustomParser -->
        <parser class="com.digitalpebble.tika.pdf.TETPDFParser"/>
    </parsers>
</properties>


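then load it (customConf being e.g. a File or InputStream for the file
above) and pass the resulting config to the parser you use, for instance:

TikaConfig conf = new TikaConfig(customConf);
Parser parser = new AutoDetectParser(conf);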


Your custom parser will then be used for PDF documents, assuming that it
declares the PDF media type with something like:

private static final Set<MediaType> SUPPORTED_TYPES =
        Collections.singleton(MediaType.application("pdf"));

public Set<MediaType> getSupportedTypes(ParseContext context) {
    return SUPPORTED_TYPES;
}
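
To illustrate why the order of the <parser> entries and the supported types
matter, here is a toy model in plain Java. It is not Tika code: SketchParser
and resolve() are hypothetical names, and the "last entry wins" rule is my
understanding of how the config above behaves, so treat it as a sketch only.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class ParserOverrideSketch {

    /** Hypothetical stand-in for a parser: a name plus the media types it claims. */
    static final class SketchParser {
        final String name;
        final Set<String> supportedTypes;

        SketchParser(String name, Set<String> supportedTypes) {
            this.name = name;
            this.supportedTypes = supportedTypes;
        }
    }

    /**
     * Maps each media type to the name of the parser that handles it.
     * Parsers are visited in config order, so a later parser replaces an
     * earlier one for every type that both of them claim (assumed rule).
     */
    static Map<String, String> resolve(List<SketchParser> parsersInConfigOrder) {
        Map<String, String> byType = new LinkedHashMap<String, String>();
        for (SketchParser parser : parsersInConfigOrder) {
            for (String type : parser.supportedTypes) {
                byType.put(type, parser.name);
            }
        }
        return byType;
    }

    public static void main(String[] args) {
        // Stand-in for the default parser: claims PDF and HTML in this toy example.
        SketchParser defaults = new SketchParser("DefaultParser",
                new LinkedHashSet<String>(Arrays.asList("application/pdf", "text/html")));
        // Stand-in for the custom parser: claims only PDF.
        SketchParser custom = new SketchParser("TETPDFParser",
                Collections.singleton("application/pdf"));

        // Same order as in the config: defaults first, custom parser second.
        Map<String, String> resolved = resolve(Arrays.asList(defaults, custom));
        System.out.println(resolved);
        // prints {application/pdf=TETPDFParser, text/html=DefaultParser}
    }
}
```

In this model only application/pdf moves to the custom parser. Every other
type stays with the default one, which is why the custom parser should claim
exactly the types you want it to take over.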


HTH

Julien

On 13 July 2011 14:11, Florin P <[email protected]> wrote:

>  Hello!
>    We would like to replace the existing PDFParser with
>  our custom one. Moreover, we would like our
>  CustomPDFParser to be used for all PDF documents that we are
>  parsing. How can we achieve this using the Java API? We
>  are using Apache Tika 0.9.
>
>  Thank you,
>
>  Florin


-- 
Open Source Solutions for Text Engineering

http://digitalpebble.blogspot.com/
http://www.digitalpebble.com
