Forgot to mention -- if there are features that you think our default
parser should include, please let us know (the AutoDetectParser already has
a number of configuration options [0]). I realize, though, that some
use-case-specific items might not be a great fit for our DefaultParser. :D

      Best,

            Tim

[0]
https://cwiki.apache.org/confluence/display/TIKA/ModifyingContentWithHandlersAndMetadataFilters#ModifyingContentWithHandlersAndMetadataFilters-4.AutoDetectParserConfig

On Sun, Feb 4, 2024 at 2:27 PM Tim Allison <[email protected]> wrote:

> W00t! Let us know when you have more questions!
>
> On Fri, Feb 2, 2024 at 10:27 AM Slava G <[email protected]> wrote:
>
>> Thanks a lot!!
>>
>> On Thu, Feb 1, 2024 at 6:36 PM Tim Allison <[email protected]> wrote:
>>
>>> https://tika.apache.org/3.0.0-BETA/parser_guide.html
>>>
>>> You should be able to add your parser in a services file, and the way
>>> the class-loading sort works, non-Tika parsers should have a higher
>>> priority automatically. If that doesn't work, we can update the
>>> documentation to show what that would look like via the tika-config.xml.
>>> The best example I can quickly find is this:
>>> https://github.com/apache/tika/blob/main/tika-core/src/test/resources/org/apache/tika/config/mock-exclude.xml
>>>
>>> Add the Tika parser that handles the file format to the exclude clause,
>>> and put your new parser below the DefaultParser.
>>>
>>> On Wed, Jan 31, 2024 at 12:36 PM Slava G <[email protected]> wrote:
>>>
>>>> Hi,
>>>> If I want to create a new default parser that makes some changes to the
>>>> existing default parser (it will inherit from the default parser), what is the
>>>> proper way to configure Tika server to use my parser as the default?
>>>>
>>>> Thanks
>>>>
>>>
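
For reference, the tika-config.xml approach described in the Feb 1 reply
above (exclude the stock parser inside the DefaultParser, then list your own
parser below it) might look roughly like the sketch below. It follows the
layout of the linked mock-exclude.xml. The class com.example.MyPdfParser is a
hypothetical placeholder for your own parser, and PDFParser is only an example
of a stock parser you might want to replace; substitute whichever parser
handles your file format.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<properties>
  <parsers>
    <!-- Keep all of the stock parsers except the one being replaced -->
    <parser class="org.apache.tika.parser.DefaultParser">
      <!-- Example only: exclude the stock parser for your file format -->
      <parser-exclude class="org.apache.tika.parser.pdf.PDFParser"/>
    </parser>
    <!-- Hypothetical custom parser, listed below the DefaultParser -->
    <parser class="com.example.MyPdfParser"/>
  </parsers>
</properties>
```

For the services-file route, the usual Java ServiceLoader convention applies:
a file named META-INF/services/org.apache.tika.parser.Parser on the classpath
containing the fully qualified class name of your parser, one per line.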
