[ https://issues.apache.org/jira/browse/TIKA-4545?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tim Allison updated TIKA-4545:
------------------------------
    Description: 
Follow on for TIKA-4544.

Steps:
 * Add annotations to components (parsers, etc.) and unit tests to confirm they 
work (finished this today)
 * Modify components (parsers, etc.), or at least a few of them, so that they 
are actually configurable. We don't have to modify all of them, just the most 
important ones: PDFParser, Tesseract, MSOffice, and possibly others
 * Move to tika-config.json in tika-pipes client/server, tika-async-cli, 
tika-app, and tika-server, one by one

  was:
Follow on for TIKA-4544.

Steps:
 * Add annotations to components (parsers, etc.) and unit tests to confirm they 
work
 * Move to tika-config.json in tika-async-cli, tika-app and tika-server one by 
one


> Fully integrate new json based deserializer in 4.x
> --------------------------------------------------
>
>                 Key: TIKA-4545
>                 URL: https://issues.apache.org/jira/browse/TIKA-4545
>             Project: Tika
>          Issue Type: Task
>            Reporter: Tim Allison
>            Priority: Major
>
> Follow on for TIKA-4544.
> Steps:
>  * Add annotations to components (parsers, etc.) and unit tests to confirm 
> they work (finished this today)
>  * Modify components (parsers, etc.), or at least a few of them, so that they 
> are actually configurable. We don't have to modify all of them, just the most 
> important ones: PDFParser, Tesseract, MSOffice, and possibly others
>  * Move to tika-config.json in tika-pipes client/server, tika-async-cli, 
> tika-app, and tika-server, one by one



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
