Hi,
I'm looking for information on whether it is possible to configure
FileSystemFetcher for tika-pipes to only process certain files, e.g. based on
extension, a match on file name/path, or a similar pattern.
This way it would be possible to point to a specific root folder and only
process matching files.
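For illustration only: the reply in this thread says this kind of filtering is not possible in FileSystemFetcher yet, so the sketch below is not Tika configuration or a Tika API. It only shows the selection being asked for (extension or file name/path pattern under a root folder) as a standalone pre-filter step. The function name `select_files` and the glob-style patterns are assumptions made for this example.

```python
from fnmatch import fnmatchcase
from pathlib import Path


def select_files(root, patterns):
    """Return files under root, as POSIX-style paths relative to root,
    whose file name or relative path matches any of the glob patterns.

    Hypothetical helper for illustration; not part of Tika.
    Note: in fnmatch-style globs, '*' also matches '/' characters.
    """
    root = Path(root)
    selected = []
    # Walk the whole tree below root in a stable, sorted order.
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        rel = path.relative_to(root).as_posix()
        # Match against the bare file name (e.g. "*.pdf") or against
        # the path relative to root (e.g. "reports/*").
        if any(fnmatchcase(path.name, p) or fnmatchcase(rel, p) for p in patterns):
            selected.append(rel)
    return selected
```

Calling `select_files("/data/docs", ["*.pdf", "reports/*"])` would return the relative paths of every PDF below the root plus everything under `reports/`. How that list is then handed to a pipeline is left open here.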
[
https://issues.apache.org/jira/browse/TIKA-4243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17841242#comment-17841242
]
Tim Allison edited comment on TIKA-4243 at 4/26/24 1:32 PM:
I really, really
[
https://issues.apache.org/jira/browse/TIKA-4245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17841220#comment-17841220
]
Tim Allison commented on TIKA-4245:
---
This is an ongoing area for improvement in Tika.
The algorithm is
That's not possible yet. Please open an issue on our JIRA...you may need to
request an account(?).
On Fri, Apr 26, 2024 at 6:01 AM Emil Zegers
wrote:
> Hi,
>
> I'm looking for information on whether it is possible to configure
> FileSystemFetcher for tika-pipes to only process certain files, e.g. based
[
https://issues.apache.org/jira/browse/TIKA-4246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Emil Zegers updated TIKA-4246:
--
Description:
Would be useful to have the possibility to configure FileSystemFetcher for
tika-pipes to
[
https://issues.apache.org/jira/browse/TIKA-4245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17841221#comment-17841221
]
Tim Allison commented on TIKA-4245:
---
Oops, sorry. I didn't realize you sent your tika-config.xml. Y, one
[
https://issues.apache.org/jira/browse/TIKA-4243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17841252#comment-17841252
]
Tim Allison commented on TIKA-4243:
---
https://json-schema.org/learn/getting-started-step-by-step
Yes,
[
https://issues.apache.org/jira/browse/TIKA-4245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17841209#comment-17841209
]
Xiaohong Yang commented on TIKA-4245:
-
[~tilman] Can you detect the right charset (utf-8) and fix the
Emil Zegers created TIKA-4246:
-
Summary: tika-pipes FileSystemFetcher configuration option for
file name/path pattern selection
Key: TIKA-4246
URL: https://issues.apache.org/jira/browse/TIKA-4246
[
https://issues.apache.org/jira/browse/TIKA-4245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17841221#comment-17841221
]
Tim Allison edited comment on TIKA-4245 at 4/26/24 1:23 PM:
Oops, sorry. I
[
https://issues.apache.org/jira/browse/TIKA-4243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17841243#comment-17841243
]
Tim Allison commented on TIKA-4243:
---
Oh, sorry. Does this break anything? Can we add this as a new
[
https://issues.apache.org/jira/browse/TIKA-4243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17841242#comment-17841242
]
Tim Allison commented on TIKA-4243:
---
I really, really want to clean up our configuration, and moving to
Worst case scenario, or if you're building older releases:
mvn clean install -Dossindex.skip
On Mon, Apr 22, 2024 at 10:35 AM Nicholas DiPiazza <
nicholas.dipia...@gmail.com> wrote:
> thanks I'll pull latest
> appreciate your help.
>
> On Mon, Apr 22, 2024 at 9:30 AM Tilman Hausherr
> wrote: