On 7/17/22 10:24 AM, Tilman Hausherr wrote:
That is in pdfbox, not in tika.

There's also a PDFParser.parse() in tika, which then calls PDDocument.load(). 
However I don't know if this will use the InputStream call, or the one with 
File. If it uses the one with the file, then check the length and content of 
the file (tika does sometimes store streams into a temporary file).
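
A minimal sketch of that length-and-content check, using only the JDK. The class name PdfFileCheck and the method looksLikePdf are hypothetical, not part of tika or pdfbox, and the path to inspect is whatever file ends up being handed to PDDocument.load(). It is only a rough check: it reports the file size and whether the first bytes are the "%PDF-" marker.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper, not part of tika or pdfbox.
public class PdfFileCheck {

    /** Returns true if the file is non-empty and starts with the "%PDF-" marker. */
    public static boolean looksLikePdf(Path file) throws IOException {
        if (Files.size(file) == 0) {
            return false; // an empty temp file would explain an empty parse result
        }
        byte[] head = new byte[5];
        try (InputStream in = Files.newInputStream(file)) {
            // readNBytes keeps reading until the buffer is full or the stream ends
            int n = in.readNBytes(head, 0, head.length);
            return n == head.length
                    && new String(head, StandardCharsets.US_ASCII).equals("%PDF-");
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java PdfFileCheck <file>");
            return;
        }
        Path file = Path.of(args[0]);
        System.out.println("length: " + Files.size(file));
        System.out.println("starts with %PDF-: " + looksLikePdf(file));
    }
}
```

Run it against the temporary file (if one is used) while the parser is paused; a zero length or a missing marker would point at how the stream was spooled to disk rather than at the PDF parsing itself.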

i see the same results -- i.e., nada -- with explicit stop in PDFParser.parse

Re the failed build: remove the segment with ossindex-maven-plugin from the 
parent pom.xml . That plugin (or rather, the company behind it) has gone crazy, 
we've partly disabled it in the current trunk.

no idea what specifically to do there.

i tried building 'main' with those partial disables, rather than '2.4.1'; that 
also fails:

INFO  [pool-6-thread-1] 10:59:03,890 org.apache.tika.pipes.PipesClient pipesClientId=2 parse success: myId in 58 ms
ERROR [main] 10:59:03,907 org.apache.tika.pipes.PipesServer oom: myId
java.lang.OutOfMemoryError: oom message
        at jdk.internal.reflect.DirectConstructorHandleAccessor.newInstance(DirectConstructorHandleAccessor.java:67) ~[?:?]
        at java.lang.reflect.Constructor.newInstanceWithCaller(Constructor.java:499) ~[?:?]
        at java.lang.reflect.Constructor.newInstance(Constructor.java:483) ~[?:?]
        at org.apache.tika.parser.mock.MockParser.throwIt(MockParser.java:428) ~[test-classes/:?]
        at org.apache.tika.parser.mock.MockParser.throwIt(MockParser.java:374) ~[test-classes/:?]
        at org.apache.tika.parser.mock.MockParser.executeAction(MockParser.java:155) ~[test-classes/:?]
        at org.apache.tika.parser.mock.MockParser.parse(MockParser.java:134) ~[test-classes/:?]
        at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:298) ~[classes/:?]
        at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:298) ~[classes/:?]
        at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:167) ~[classes/:?]
        at org.apache.tika.parser.RecursiveParserWrapper.parse(RecursiveParserWrapper.java:163) ~[classes/:?]
        at org.apache.tika.pipes.PipesServer.parseRecursive(PipesServer.java:540) ~[classes/:?]
        at org.apache.tika.pipes.PipesServer.parse(PipesServer.java:473) ~[classes/:?]
        at org.apache.tika.pipes.PipesServer.parseIt(PipesServer.java:420) ~[classes/:?]
        at org.apache.tika.pipes.PipesServer.actuallyParse(PipesServer.java:340) ~[classes/:?]
        at org.apache.tika.pipes.PipesServer.parseOne(PipesServer.java:311) ~[classes/:?]
        at org.apache.tika.pipes.PipesServer.processRequests(PipesServer.java:232) ~[classes/:?]
        at org.apache.tika.pipes.PipesServer.main(PipesServer.java:168) ~[classes/:?]

my 1st priority is a stable dovecot search env, so i've removed tika from use & 
its config.

for now, i'll have to pass this^ on to an admin here that works regularly in a 
full java env, and won't have to keep guessing at how to debug the app.
