On Tue, 17 Apr 2012, Taylor, Wade wrote:
> Since I couldn't get that to work I went back to basics and tried a
> simple XML string:
>
>     new Tika().detect(new ByteArrayInputStream(("<?xml version=\"1.0\" "
>         + "encoding=\"UTF-8\"?><root><child>text</child></root>").getBytes()));
>
> but this gets detected as "text/plain" too and I can't figure out why it's
> not coming back as "application/xml".

I've just tried this with a very simple test class:

    import org.apache.tika.*;
    import java.io.*;

    public class Test {
        public static void main(String[] a) throws Exception {
            // Detect the type from the bytes of a small XML document
            System.out.println(
                new Tika().detect(new ByteArrayInputStream(
                    ("<?xml version=\"1.0\" "
                     + "encoding=\"UTF-8\"?><root><child>text</child></root>").getBytes()))
            );
        }
    }

When I run it, it works fine:

    java -classpath tika-core-1.2-SNAPSHOT.jar:. Test
    application/xml

Looks to me like you've managed to miss some key parts of Tika out when
you added it to your application. I'm not sure which bits you missed, and
how it hasn't blown up complaining, but it does seem to me that it's your
environment that's stuffed...
Nick
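
One way to narrow down which parts are missing is to ask the classloader where Tika's pieces are actually coming from. The sketch below is only a diagnostic under stated assumptions, not part of Tika: the class name ClasspathCheck and the helper locate are made up for illustration, and the resource path org/apache/tika/mime/tika-mimetypes.xml is where tika-core normally keeps its MIME type definitions as far as I know, so check it against your own jar. It uses only the JDK, so it compiles and runs even when Tika is absent.

```java
import java.net.URL;

class ClasspathCheck {
    // Hypothetical helper: reports where a classpath resource was found,
    // or "NOT FOUND" if the system classloader cannot see it.
    static String locate(String resource) {
        URL url = ClassLoader.getSystemResource(resource);
        return url == null ? "NOT FOUND" : url.toString();
    }

    public static void main(String[] args) {
        // The Tika facade class itself
        System.out.println(locate("org/apache/tika/Tika.class"));
        // Assumed location of the MIME type definitions inside tika-core
        System.out.println(locate("org/apache/tika/mime/tika-mimetypes.xml"));
    }
}
```

Run it with the same classpath as the failing application. If either line prints NOT FOUND, or the two URLs point at different jars, the packaging is the likely culprit rather than the detection code.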