Hi Jukka,

On 11.11.2009 at 15:42, Jukka Zitting wrote:
Here's a complete code sample I'd use:

    File file = new File("document.pdf");
    Metadata metadata = new Metadata();
    metadata.set(Metadata.RESOURCE_NAME_KEY, file.getName());
    InputStream stream = new FileInputStream(file);
    try {
        String content = new Tika().parseToString(stream, metadata);
        System.out.println("Title: " + metadata.get(Metadata.TITLE));
        System.out.println("Content: " + content);
    } finally {
        stream.close(); // note that you need to explicitly close the stream
    }

Thanks for the code sample. When I try to run it, it throws the following exception:
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/pdfbox/pdmodel/PDDocument
    at org.apache.tika.parser.pdf.PDFParser.parse(PDFParser.java:54)
    at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:121)
    at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:108)
    at org.apache.tika.Tika.parseToString(Tika.java:260)
    at prog.main(prog.java:32)
Caused by: java.lang.ClassNotFoundException: org.apache.pdfbox.pdmodel.PDDocument
    at java.net.URLClassLoader$1.run(URLClassLoader.java:200)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:188)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:319)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:330)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:254)
    at java.lang.ClassLoader.loadClassInternal(ClassLoader.java:399)
    ... 5 more

I tried to install Tika again with Maven, but the result is an exception too (after the tests for "Building Apache Tika application"):
[INFO] [bundle:bundle {execution: default-bundle}]
[INFO] ------------------------------------------------------------------------
[ERROR] FATAL ERROR
[INFO] ------------------------------------------------------------------------
[INFO] Java heap space
[INFO] ------------------------------------------------------------------------
[INFO] Trace
java.lang.OutOfMemoryError: Java heap space
    at java.util.zip.ZipEntry.initFields(Native Method)
    at java.util.zip.ZipEntry.<init>(ZipEntry.java:96)
    at java.util.zip.ZipFile$2.nextElement(ZipFile.java:329)
    at java.util.zip.ZipFile$2.nextElement(ZipFile.java:299)
    at aQute.lib.osgi.ZipResource.build(ZipResource.java:41)
    at aQute.lib.osgi.ZipResource.build(ZipResource.java:32)
    at aQute.lib.osgi.Jar.<init>(Jar.java:36)
    at aQute.lib.osgi.Jar.<init>(Jar.java:55)
    at aQute.lib.osgi.Analyzer.getJarFromName(Analyzer.java:864)
    at aQute.lib.osgi.Builder.extractFromJar(Builder.java:767)
    at aQute.lib.osgi.Builder.doIncludeResource(Builder.java:682)
    at aQute.lib.osgi.Builder.doIncludeResources(Builder.java:668)
    at aQute.lib.osgi.Builder.build(Builder.java:73)
    at org.apache.felix.bundleplugin.BundlePlugin.buildOSGiBundle(BundlePlugin.java:391)
    at org.apache.felix.bundleplugin.BundlePlugin.execute(BundlePlugin.java:282)
    at org.apache.felix.bundleplugin.BundlePlugin.execute(BundlePlugin.java:236)
    at org.apache.felix.bundleplugin.BundlePlugin.execute(BundlePlugin.java:227)
    at org.apache.maven.plugin.DefaultPluginManager.executeMojo(DefaultPluginManager.java:490)
    at org.apache.maven.lifecycle.DefaultLifecycleExecutor.executeGoals(DefaultLifecycleExecutor.java:694)
    at org.apache.maven.lifecycle.DefaultLifecycleExecutor.executeGoalWithLifecycle(DefaultLifecycleExecutor.java:556)
    at org.apache.maven.lifecycle.DefaultLifecycleExecutor.executeGoal(DefaultLifecycleExecutor.java:535)
    at org.apache.maven.lifecycle.DefaultLifecycleExecutor.executeGoalAndHandleFailures(DefaultLifecycleExecutor.java:387)
    at org.apache.maven.lifecycle.DefaultLifecycleExecutor.executeTaskSegments(DefaultLifecycleExecutor.java:348)
    at org.apache.maven.lifecycle.DefaultLifecycleExecutor.execute(DefaultLifecycleExecutor.java:180)
    at org.apache.maven.DefaultMaven.doExecute(DefaultMaven.java:328)
    at org.apache.maven.DefaultMaven.execute(DefaultMaven.java:138)
    at org.apache.maven.cli.MavenCli.main(MavenCli.java:362)
    at org.apache.maven.cli.compat.CompatibleMain.main(CompatibleMain.java:60)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 49 seconds
[INFO] Finished at: Wed Nov 11 16:03:32 CET 2009
[INFO] Final Memory: 31M/81M
[INFO] ------------------------------------------------------------------------

I'm working on a MacBook with Mac OS X 10.6.1 and Maven version 2.2.1. Any idea what's wrong?

Regards,
Daniel