Hi Tim,

This document looks pretty good. Maybe an example could be added for TeeContentHandler as well.

Regards,
Cihad Guzel

On Fri, Jun 3, 2022 at 22:24, Tim Allison <[email protected]> wrote:

> First draft of that page is up. Let me know if you have any questions.
>
> On Fri, Jun 3, 2022 at 2:03 PM Tim Allison <[email protected]> wrote:
>
>> I just added the ability to wrap a content handler via tika-config.xml
>> and it will be out in 2.4.1 shortly. Let me document it on our wiki. I've
>> started a stub here:
>> https://cwiki.apache.org/confluence/display/TIKA/ModifyingContentWithHandlersAndMetadataFilters
>>
>> On Fri, Jun 3, 2022 at 1:41 PM Cihad Guzel <[email protected]> wrote:
>>
>>> Hi Nick,
>>>
>>> Thanks for your information.
>>>
>>> If I use embedded Tika, I think I can set the custom content
>>> handler using the API.
>>>
>>> On the other hand, if I use Tika server, how can I set the custom
>>> content handler on the server? Is there a way to do it from the
>>> config file?
>>>
>>> Regards,
>>> Cihad Guzel
>>>
>>> On Fri, Jun 3, 2022 at 19:09, Nick Burch <[email protected]> wrote:
>>>
>>>> On Fri, 3 Jun 2022, Cihad Guzel wrote:
>>>> > I want to pass the content's words through some filters while
>>>> > parsing in Tika. How can I add custom filtering?
>>>> >
>>>> > Does the content handler work for this? Is there a document
>>>> > about this?
>>>>
>>>> A custom content handler is a pretty good way to do that. Tika just
>>>> uses regular Java XML content handlers, so you don't need a
>>>> Tika-specific tutorial on writing one.
>>>>
>>>> Depending on what you're wanting to do, you can use Tika's
>>>> TeeContentHandler to send the events to both your custom handler
>>>> and a normal one. ContentHandlerDecorator can also be used to
>>>> override just some bits.
>>>>
>>>> Nick
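
For reference, the tee-plus-custom-handler idea discussed in this thread can be sketched with plain JDK SAX classes. This is only a rough illustration, not Tika's implementation: `WordFilterSketch`, `TextCollector`, `WordFilter`, `Tee`, and `run` are hypothetical names made up for this example. With Tika on the classpath, `org.apache.tika.sax.TeeContentHandler` and `org.apache.tika.sax.ContentHandlerDecorator` would presumably take the place of the hand-rolled `Tee` and the manual delegation in `WordFilter`. The sketch also assumes each `characters` call carries whole words, which a real parser does not guarantee.

```java
import java.util.Locale;
import java.util.Set;
import org.xml.sax.ContentHandler;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical demo class; none of these names come from Tika. */
class WordFilterSketch {

    /** Stand-in for a "normal" handler: collects all character events as text. */
    static class TextCollector extends DefaultHandler {
        final StringBuilder text = new StringBuilder();

        @Override
        public void characters(char[] ch, int start, int length) {
            text.append(ch, start, length);
        }
    }

    /** Custom handler: drops blocked words, forwards the rest downstream. */
    static class WordFilter extends DefaultHandler {
        private final ContentHandler downstream;
        private final Set<String> blocked;

        WordFilter(ContentHandler downstream, Set<String> blocked) {
            this.downstream = downstream;
            this.blocked = blocked;
        }

        @Override
        public void characters(char[] ch, int start, int length) throws SAXException {
            StringBuilder kept = new StringBuilder();
            // Assumption: words are not split across characters() calls.
            for (String word : new String(ch, start, length).split("\\s+")) {
                if (word.isEmpty() || blocked.contains(word.toLowerCase(Locale.ROOT))) {
                    continue;
                }
                if (kept.length() > 0) {
                    kept.append(' ');
                }
                kept.append(word);
            }
            char[] out = kept.toString().toCharArray();
            downstream.characters(out, 0, out.length);
        }
    }

    /** Minimal tee: sends each character event to every wrapped handler. */
    static class Tee extends DefaultHandler {
        private final ContentHandler[] handlers;

        Tee(ContentHandler... handlers) {
            this.handlers = handlers;
        }

        @Override
        public void characters(char[] ch, int start, int length) throws SAXException {
            for (ContentHandler handler : handlers) {
                handler.characters(ch, start, length);
            }
        }
    }

    /** Returns { unfiltered text, filtered text } for a single character event. */
    static String[] run(String input, Set<String> blocked) {
        TextCollector plain = new TextCollector();
        TextCollector filtered = new TextCollector();
        ContentHandler tee = new Tee(plain, new WordFilter(filtered, blocked));
        char[] chars = input.toCharArray();
        try {
            tee.characters(chars, 0, chars.length);
        } catch (SAXException e) {
            throw new IllegalStateException(e);
        }
        return new String[] { plain.text.toString(), filtered.text.toString() };
    }

    public static void main(String[] args) {
        String[] out = run("The quick brown fox", Set.of("the"));
        System.out.println("plain:    " + out[0]);
        System.out.println("filtered: " + out[1]);
    }
}
```

In embedded use, a handler composed like `tee` above would be the one handed to the parser, so the unfiltered and filtered views are both built in a single pass over the document.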
