Thanks for this thread! It saves me time :)

On Wed, 21 Jul 2021 at 21:43, Tim Allison <[email protected]> wrote:
>
> Hi David,
>   W00t! You should definitely also look into the async/pipes option
> for FSCrawler once I get the documentation in order. I'm in the
> process of putting together the minimal config files for
> fileshare->fileshare, and then I'll put together an example of
> fileshare->OpenSearch, which, um, should work for a bit at least with
> Elasticsearch. If it doesn't work with Elasticsearch, it should be
> fairly easy to write your own emitter.
>   The benefit of the pipes package is that all of the parsing is done
> in isolated JVMs so that catastrophic problems aren't catastrophic for
> the indexing process or the indexer. :D The other benefit is that we
> have fetchers for fileshare, S3 and HTTP so that you can easily add
> new data sources.
>   The new pipes module takes a bit of explanation (in lieu of TBD
> documentation), but not much. I'm always happy to chat.
>
> Cheers,
>
> Tim
>
>
> On Wed, Jul 21, 2021 at 10:16 AM David Pilato <[email protected]> wrote:
> >
> > Ha. Found it...
> >
> > <dependency>
> >   <groupId>org.apache.tika</groupId>
> >   <artifactId>tika-parsers-standard-package</artifactId>
> > </dependency>
> > <dependency>
> >   <groupId>org.apache.tika</groupId>
> >   <artifactId>tika-parser-scientific-module</artifactId>
> > </dependency>
> > <dependency>
> >   <groupId>org.apache.tika</groupId>
> >   <artifactId>tika-parser-sqlite3-module</artifactId>
> > </dependency>
> >
> >
> >
> > I guess we just need to update the documentation?
> >
> > David
> > On 21 Jul 2021 at 16:10 +0200, David Pilato <[email protected]> wrote:
> >
> > Hey team
> >
> >
> > I'm trying to upgrade my project to 2.0.0.
> > I'm confused. The doc says to include:
> >
> > <dependency>
> >   <groupId>org.apache.tika</groupId>
> >   <artifactId>tika-parsers</artifactId>
> >   <version>2.0.0</version>
> > </dependency>
> >
> >
> > But the release note says to include modules like:
> >
> > <dependency>
> >   <groupId>org.apache.tika</groupId>
> >   <artifactId>tika-parsers-standard</artifactId>
> >   <version>2.0.0</version>
> > </dependency>
> > <dependency>
> >   <groupId>org.apache.tika</groupId>
> >   <artifactId>tika-parsers-extended</artifactId>
> >   <version>2.0.0</version>
> > </dependency>
> > <dependency>
> >   <groupId>org.apache.tika</groupId>
> >   <artifactId>tika-parser-scientific-module</artifactId>
> >   <version>2.0.0</version>
> > </dependency>
> > <dependency>
> >   <groupId>org.apache.tika</groupId>
> >   <artifactId>tika-parser-sqlite3-module</artifactId>
> >   <version>2.0.0</version>
> > </dependency>
> >
> >
> >
> > But AFAICS all those modules are marked as pom, not as jar. So Maven is
> > failing when I try to use them.
> >
> > What am I missing here?
> >
> >
> > David

--
Best regards,
Maxim
