Hi,

Is it possible to configure the FileSystemFetcher for tika-pipes to process 
only certain files, e.g. by extension or by matching a pattern against the 
file name or path?

That way one could point the fetcher at a root folder and process only the 
matching files, e.g. certain extensions or names (a GIS shapefile, for 
instance, consists of several files that share a base name but have different 
extensions).

Something like:

<properties>
  <fetchers>
    <fetcher class="org.apache.tika.pipes.fetcher.fs.FileSystemFetcher">
      <params>
        <name>fsf</name>
        <basePath>/my/base/path1</basePath>
        <pattern>myshapefilename.*</pattern>
      </params>
    </fetcher>
  </fetchers>
</properties>

Or:

        <pattern>*.doc*,*.pdf</pattern>
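
To make the intended matching concrete, here is a rough sketch in plain Java 
NIO. It is not an existing Tika API: the class and method names are made up 
for illustration, and the "pattern" semantics (comma-separated globs matched 
against the file name only) are just my assumption of how such a parameter 
could behave.

```java
import java.io.IOException;
import java.nio.file.FileSystems;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.PathMatcher;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper, not part of Tika: shows the filtering I am asking about.
public class PatternFilterSketch {

    // Turns a comma-separated list such as "*.doc*,*.pdf" into glob matchers.
    static List<PathMatcher> compile(String patterns) {
        List<PathMatcher> matchers = new ArrayList<>();
        for (String p : patterns.split(",")) {
            matchers.add(FileSystems.getDefault().getPathMatcher("glob:" + p.trim()));
        }
        return matchers;
    }

    // Walks basePath recursively and keeps the regular files whose file name
    // (not the full path) matches at least one of the patterns.
    static List<Path> matchingFiles(Path basePath, String patterns) throws IOException {
        List<PathMatcher> matchers = compile(patterns);
        try (Stream<Path> walk = Files.walk(basePath)) {
            return walk.filter(Files::isRegularFile)
                       .filter(f -> matchers.stream()
                                            .anyMatch(m -> m.matches(f.getFileName())))
                       .sorted()
                       .collect(Collectors.toList());
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: PatternFilterSketch <basePath> <patterns>");
            return;
        }
        for (Path p : matchingFiles(Paths.get(args[0]), args[1])) {
            System.out.println(p);
        }
    }
}
```

With the pattern myshapefilename.* this would pick up the .shp, .shx, .dbf 
etc. files of one shapefile and skip everything else under the base path.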

Regards,

Emil

