[
https://issues.apache.org/jira/browse/OODT-630?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Tyler Palsulich updated OODT-630:
---------------------------------
Summary: Upgrade OODT components from using Tika 0.8 to Tika 1.6 (was:
Upgrade OODT components from using Tika 0.8 to Tika 1.3)
> Upgrade OODT components from using Tika 0.8 to Tika 1.6
> -------------------------------------------------------
>
> Key: OODT-630
> URL: https://issues.apache.org/jira/browse/OODT-630
> Project: OODT
> Issue Type: Improvement
> Components: file manager, metadata container, product server
> Affects Versions: 0.6
> Reporter: Rishi Verma
> Assignee: Rishi Verma
> Fix For: 0.8
>
>
> Currently, OODT makes use of Tika v0.8 (tika-core) for mime-detection
> purposes. This version is quite out-of-date, and is incompatible with the use
> of a tika-core or tika-app v1.3 JAR.
> Tika v1.3 contains numerous upgrades since 0.8 (see [1]), some of which
> include improved metadata generation for common files. These improved
> features are extremely useful for metadata gathering.
> If a project using OODT needs features provided by the v1.3 tika-core or
> tika-app JAR (e.g. for a custom met extractor), it currently cannot use that
> version when interacting with OODT server-side components such as filemgr,
> crawler, etc., since it is incompatible with OODT's use of v0.8.
> One of the incompatibilities is the removal of the 'getMimeType(URL)' method
> from org.apache.tika.mime.MimeTypes. It has been superseded by
> Tika.detect(URL.getPath()) and MimeTypes.getRegisteredMimeType(String).
> See the example exception below, thrown when crawler 0.6-SNAPSHOT was invoked
> while a 'tika-app-1.3.jar' was placed in the crawler's lib directory:
> ---
> Jun 18, 2013 3:40:07 PM org.apache.oodt.cas.crawl.ProductCrawler ingest
> INFO: ProductCrawler: Ready to ingest product: [/data/staging/IMG_2590.jpg]: ProductType: [GenericFile]
> Jun 18, 2013 3:40:07 PM org.apache.oodt.cas.filemgr.ingest.StdIngester setFileManager
> INFO: StdIngester: connected to file manager: [http://localhost:9000]
> Jun 18, 2013 3:40:07 PM org.apache.oodt.cas.filemgr.datatransfer.InPlaceDataTransferer setFileManagerUrl
> INFO: In Place Data Transfer to: [http://localhost:9000] enabled
> Exception in thread "main" java.lang.NoSuchMethodError: org.apache.tika.mime.MimeTypes.getMimeType(Ljava/net/URL;)Lorg/apache/tika/mime/MimeType;
> at org.apache.oodt.cas.filemgr.structs.Reference.<init>(Reference.java:115)
> at org.apache.oodt.cas.filemgr.versioning.VersioningUtils.addRefsFromUris(VersioningUtils.java:251)
> at org.apache.oodt.cas.filemgr.ingest.StdIngester.ingest(StdIngester.java:189)
> at org.apache.oodt.cas.crawl.ProductCrawler.ingest(ProductCrawler.java:304)
> at org.apache.oodt.cas.crawl.ProductCrawler.handleFile(ProductCrawler.java:188)
> at org.apache.oodt.cas.crawl.ProductCrawler.crawl(ProductCrawler.java:108)
> at org.apache.oodt.cas.crawl.ProductCrawler.crawl(ProductCrawler.java:75)
> at org.apache.oodt.cas.crawl.daemon.CrawlDaemon.startCrawling(CrawlDaemon.java:82)
> at org.apache.oodt.cas.crawl.cli.action.CrawlerLauncherCliAction.execute(CrawlerLauncherCliAction.java:55)
> at org.apache.oodt.cas.cli.CmdLineUtility.execute(CmdLineUtility.java:331)
> at org.apache.oodt.cas.cli.CmdLineUtility.run(CmdLineUtility.java:187)
> at org.apache.oodt.cas.crawl.CrawlerLauncher.main(CrawlerLauncher.java:36)
> ---
> This JIRA issue seeks to document efforts to upgrade OODT's use of Tika
> from 0.8 to 1.3.
> ---
> [1] http://www.apache.org/dist/tika/CHANGES-1.3.txt
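
The NoSuchMethodError in the trace above means that the MimeTypes class loaded at runtime has no getMimeType(URL) method. As a rough diagnostic sketch (not part of OODT or Tika), plain Java reflection can report which of the methods named in this issue the Tika JAR on the classpath actually exposes. The class name TikaApiProbe and the helper hasMethod are hypothetical names chosen for illustration. The Tika class and method names are taken from the description above.

```java
import java.net.URL;

// Hypothetical diagnostic helper; it is not part of OODT or Tika.
final class TikaApiProbe {

    /**
     * Returns true if the named class can be loaded and exposes a public
     * method with the given name and parameter types; false otherwise.
     */
    static boolean hasMethod(String className, String methodName, Class<?>... paramTypes) {
        try {
            // initialize=false: inspect the class without running its static initializers.
            Class<?> cls = Class.forName(className, false, TikaApiProbe.class.getClassLoader());
            cls.getMethod(methodName, paramTypes);
            return true;
        } catch (ClassNotFoundException | NoSuchMethodException | LinkageError e) {
            // Class missing, method missing, or class could not be linked.
            return false;
        }
    }

    public static void main(String[] args) {
        String mimeTypes = "org.apache.tika.mime.MimeTypes";
        // The method OODT 0.6 calls, per the stack trace in this issue.
        System.out.println("MimeTypes.getMimeType(URL): "
                + hasMethod(mimeTypes, "getMimeType", URL.class));
        // The replacements named in the issue description.
        System.out.println("MimeTypes.getRegisteredMimeType(String): "
                + hasMethod(mimeTypes, "getRegisteredMimeType", String.class));
        System.out.println("Tika.detect(String): "
                + hasMethod("org.apache.tika.Tika", "detect", String.class));
    }
}
```

Running this with the same classpath as the crawler (for example, with the crawler's lib directory on the classpath) shows which Tika API the runtime actually sees. The result depends entirely on which JARs are present.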
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)