I just added the ability to wrap a content handler via tika-config.xml and
it will be out in 2.4.1 shortly.  Let me document it on our wiki.  I've
started a stub here:
https://cwiki.apache.org/confluence/display/TIKA/ModifyingContentWithHandlersAndMetadataFilters
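
In the meantime, here is a minimal sketch of the kind of word-filtering handler discussed in the quoted thread below. It uses only the JDK's SAX classes, since Tika works with regular SAX content handlers. The class name `WordFilterHandler` and the blocked-word set are hypothetical, for illustration only. With Tika on the classpath, extending ContentHandlerDecorator and overriding characters() should give much the same result.

```java
import java.util.Locale;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

import org.xml.sax.ContentHandler;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.XMLFilterImpl;

// Hypothetical example class: drops blocked words from text events and
// forwards every other SAX event unchanged to the wrapped handler.
class WordFilterHandler extends XMLFilterImpl {
    private static final Pattern WORD = Pattern.compile("\\S+");
    private final Set<String> blocked;

    WordFilterHandler(ContentHandler delegate, Set<String> blocked) {
        setContentHandler(delegate);
        this.blocked = blocked;
    }

    @Override
    public void characters(char[] ch, int start, int length) throws SAXException {
        // Caveat: a parser may split one run of text across several
        // characters() calls, so a word that straddles two calls is not
        // matched by this simple sketch.
        String text = new String(ch, start, length);
        Matcher m = WORD.matcher(text);
        StringBuilder out = new StringBuilder();
        int last = 0;
        while (m.find()) {
            out.append(text, last, m.start()); // keep the original whitespace
            if (!blocked.contains(m.group().toLowerCase(Locale.ROOT))) {
                out.append(m.group());
            }
            last = m.end();
        }
        out.append(text, last, text.length());
        char[] filtered = out.toString().toCharArray();
        super.characters(filtered, 0, filtered.length);
    }
}
```

Wrapping an existing handler would then look like `new WordFilterHandler(downstream, Set.of("foo"))`, where `downstream` is whatever handler you would otherwise have passed to the parser.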

On Fri, Jun 3, 2022 at 1:41 PM Cihad Guzel <[email protected]> wrote:

> Hi Nick,
>
> Thanks for your information.
>
> If I use embedded Tika, I think I can set the custom content handler
> using the API.
>
> On the other hand, if I use Tika server, how can I set the custom content
> handler on the server? Is there a way to do it from the config file?
>
> Regards,
> Cihad Guzel
>
>
> On Fri, Jun 3, 2022 at 19:09, Nick Burch <[email protected]> wrote:
>
>> On Fri, 3 Jun 2022, Cihad Guzel wrote:
>> > I want to pass the content's words through some filters while parsing
>> > in Tika. How can I add custom filtering?
>> >
>> > Does the content handler work for this? Is there a document about this?
>>
>> A custom content handler is a pretty good way to do that. Tika just uses
>> regular Java XML content handlers, so you don't need a Tika-specific
>> tutorial on writing one
>>
>> Depending on what you're wanting to do, you can use Tika's
>> TeeContentHandler to send the events to both your custom handler and a
>> normal one. ContentHandlerDecorator can also be used to override just
>> some bits.
>>
>> Nick
>>
>>
