Hi,

I've tried exactly the same code in two scenarios:

Tika tika = new Tika();
Metadata metadata = new Metadata();

Reader reader = tika.parse(new File("..."));
FileWriter fw = new FileWriter(new File("..."));

int data = reader.read();
StringBuilder sb = new StringBuilder();
while (data != -1) {
    char dataChar = (char) data;
    sb.append(dataChar);
    fw.write(dataChar);
    data = reader.read();
}

When I put this code in a simple Java project with tika-app-1.4.jar as a
dependency, it generates UTF-8 output (correct).
When I put this code inside a bundle with *tika-bundle* and *tika-core* as
dependencies and deploy it inside Karaf, it generates ANSI output (wrong).
Both projects are managed with Maven and Eclipse 4.2.
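
A possibly relevant detail, offered as an assumption rather than a confirmed
diagnosis: FileWriter always encodes with the JVM's default charset, so the
same code can write different bytes if the two environments start with
different defaults. The sketch below prints that default and copies a Reader
to a file with an explicit UTF-8 encoder instead. The class and method names
(ExplicitCharsetWriter, writeUtf8) are made up for illustration and are not
part of Tika or Karaf.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.Writer;
import java.nio.charset.Charset;

// Hypothetical helper, not part of Tika: writes text with a fixed charset
// instead of relying on the JVM default that FileWriter uses.
class ExplicitCharsetWriter {

    // Copies every character from the reader into the target file as UTF-8.
    // Both the reader and the writer are closed when the copy finishes.
    static void writeUtf8(Reader reader, File target) throws IOException {
        Writer out = new OutputStreamWriter(
                new FileOutputStream(target), Charset.forName("UTF-8"));
        try {
            char[] buffer = new char[4096];
            int n;
            while ((n = reader.read(buffer)) != -1) {
                out.write(buffer, 0, n);
            }
        } finally {
            out.close();
            reader.close();
        }
    }

    public static void main(String[] args) {
        // This is the charset FileWriter would use in the current JVM.
        System.out.println("Default charset: " + Charset.defaultCharset());
    }
}
```

Running the main method in both environments would show whether the default
charset differs between them. If it does, passing the Reader returned by
tika.parse(...) to writeUtf8 should make the output independent of that
default.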

Do I have to additionally set something, or should I embed tika-app inside
my bundle (using maven-bundle-plugin)?

I'm using Tika 1.4, Java 1.6.45, Win 7 x64 and Karaf 2.3.3.


-- 
Bratislav Stojanovic, M.Sc.
