I just added the ability to wrap a content handler via tika-config.xml and it will be out in 2.4.1 shortly. Let me document it on our wiki. I've started a stub here: https://cwiki.apache.org/confluence/display/TIKA/ModifyingContentWithHandlersAndMetadataFilters
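
For the handler itself, here is a minimal sketch of the kind of word-filtering content handler discussed in this thread. It uses only the JDK's SAX classes, since Tika handlers are regular Java XML content handlers. The class name WordFilterHandler, the filter() helper and the blocked-word list are hypothetical names made up for illustration; they are not part of Tika, and this is not necessarily how the tika-config.xml wrapping will look.

```java
import java.io.StringReader;
import java.util.Locale;
import java.util.Set;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical example class (not part of Tika): collects text content
// and drops any word found in a blocked-word set.
public class WordFilterHandler extends DefaultHandler {
    private final Set<String> blocked;
    private final StringBuilder out = new StringBuilder();

    public WordFilterHandler(Set<String> blocked) {
        this.blocked = blocked;
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // Note: SAX may deliver one text run in several chunks, so a
        // production filter would need to buffer across word boundaries.
        String chunk = new String(ch, start, length);
        for (String word : chunk.split("\\s+")) {
            if (!word.isEmpty() && !blocked.contains(word.toLowerCase(Locale.ROOT))) {
                if (out.length() > 0) {
                    out.append(' ');
                }
                out.append(word);
            }
        }
    }

    // Returns the filtered text collected so far.
    public String text() {
        return out.toString();
    }

    // Convenience helper: runs a plain JDK SAX parser over an XML string
    // with this handler, standing in for the events a parser would emit.
    public static String filter(String xml, Set<String> blocked) throws Exception {
        WordFilterHandler handler = new WordFilterHandler(blocked);
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return handler.text();
    }

    public static void main(String[] args) throws Exception {
        String xhtml = "<html><body><p>the quick brown fox</p></body></html>";
        // prints: quick fox
        System.out.println(filter(xhtml, Set.of("the", "brown")));
    }
}
```

With embedded Tika, a handler like this could be passed to the parser directly, or combined with a normal handler through TeeContentHandler as Nick suggests below.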
On Fri, Jun 3, 2022 at 1:41 PM Cihad Guzel <[email protected]> wrote:

> Hi Nick,
>
> Thanks for your information.
>
> If I use embedded Tika, I think I can set the custom content handler
> using the API.
>
> On the other hand, if I use Tika server, how can I set the custom content
> handler on the server? Is there a way to do it from the config file?
>
> Regards,
> Cihad Guzel
>
>
> On Fri, Jun 3, 2022 at 19:09, Nick Burch <[email protected]> wrote:
>
>> On Fri, 3 Jun 2022, Cihad Guzel wrote:
>> > I want to pass the content's words through some filters while parsing in
>> > Tika. How can I add custom filtering?
>> >
>> > Does the content handler work for this? Is there a document about this?
>>
>> A custom content handler is a pretty good way to do that. Tika just uses
>> regular Java XML content handlers, so you don't need a Tika-specific
>> tutorial on writing one.
>>
>> Depending on what you're wanting to do, you can use Tika's
>> TeeContentHandler to send the events to both your custom handler and a
>> normal one. ContentHandlerDecorator can also be used to override just some
>> bits.
>>
>> Nick
