I've been using Tika 0.9 for about a week and absolutely love it. One
issue I noticed is that Tika ignores column divisions when extracting
text from a PDF, which was hampering my project. After a quick search I
found that a fix for this had already been committed, so I checked out
the source from SVN, ran 'mvn -e clean install', and got the output
below. There is more output than what I have pasted, but it all
essentially complains about an XML metadata file, in a very odd
location, that Maven cannot read. Who knows, I may be missing more!
Any assistance would be greatly appreciated.
[INFO] [bundle:bundle]
[INFO] [install:install]
[INFO] Installing /home/<removed>/tika/tika_svn/tika-core/target/tika-core-1.0-SNAPSHOT.jar to /home/administrator/.m2/repository/org/apache/tika/tika-core/1.0-SNAPSHOT/tika-core-1.0-SNAPSHOT.jar
[INFO] [bundle:install]
[INFO] Installing org/apache/tika/tika-core/1.0-SNAPSHOT/tika-core-1.0-SNAPSHOT.jar
[INFO] Writing OBR metadata
[INFO] ------------------------------------------------------------------------
[INFO] Building Apache Tika parsers
[INFO] task-segment: [clean, install]
[INFO] ------------------------------------------------------------------------
[INFO] [clean:clean]
[INFO] ------------------------------------------------------------------------
[ERROR] BUILD ERROR
[INFO] ------------------------------------------------------------------------
[INFO] Failed to resolve artifact.
Unable to read local copy of metadata: Cannot read metadata from '/home/<removed>/.m2/repository/org/apache/tika/tika-core/1.0-SNAPSHOT/maven-metadata-apache.snapshots.xml': expected START_TAG or END_TAG not TEXT (position: TEXT seen ...<extension>jar</... @14:25)
org.apache.tika:tika-core:jar:1.0-SNAPSHOT
Path to dependency:
1) org.apache.tika:tika-parsers:bundle:1.0-SNAPSHOT
2) org.apache.tika:tika-core:jar:1.0-SNAPSHOT
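
In case it helps with diagnosis, here is a minimal sketch for looking at the spot the parser complains about. It assumes the "@14:25" at the end of the error means line 14, column 25 of the metadata file, which is my reading of the message rather than something I have confirmed. The helper names (parse_position, show_position) are made up for this example and are not part of Maven or Tika.

```python
import re
from pathlib import Path


def parse_position(message):
    """Extract (line, column) from an '@line:column' marker in an error message.

    Returns None if no such marker is present. Treating the marker as
    line and column is an assumption about the error format.
    """
    match = re.search(r"@(\d+):(\d+)", message)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def show_position(xml_path, line, column):
    """Return the reported line of the file with a caret under the column."""
    lines = Path(xml_path).read_text(encoding="utf-8", errors="replace").splitlines()
    if not 1 <= line <= len(lines):
        return f"line {line} is outside the file ({len(lines)} lines)"
    text = lines[line - 1]
    # Columns are assumed to be 1-based; tabs in the line may shift the caret.
    caret = " " * max(column - 1, 0) + "^"
    return f"{text}\n{caret}"
```

To use it, pass the error text to parse_position and then call show_position with the full path of the maven-metadata-apache.snapshots.xml file named in the error. This only shows what the file contains at that position; it does not say why Maven rejects it.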