On 7/19/22 10:46 PM, Tilman Hausherr wrote:
You can also put a breakpoint in PDFBox, then go to
org.apache.pdfbox.pdfparser.PDFParser.parse()
and when it does breakpoint-stop there (it definitively passes that point!), 
then look into your /tmp directory for the file that is mentioned in the tika 
debug output and copy it somewhere else.

the breakpoint guessing and the various builds i've attempted (see the other 
thread) haven't been fruitful

at this point, it'd be helpful to be specific about the correct breakpoint

what exactly do you mean by "put a breakpoint" and "go to"?

at the moment, i download the snapshot jar

        D="/srv/tika"
        F="tika-server-standard-2.4.2-20220720.025305-98.jar"
        cd ${D}
        rm -rf TMP
        mkdir -p TMP/mod
        cd TMP
        rm -f ${F}*
        wget https://repository.apache.org/content/groups/snapshots/org/apache/tika/tika-server-standard/2.4.2-SNAPSHOT/${F}
        cd mod

extract & turn debug logging ON

        jar -xfv ../${F}
        perl -pi -e 's|Root level="info"|Root level="debug"|g' log4j2.xml

repack

        jar -cvmf META-INF/MANIFEST.MF ../mod.jar *

get main class

        cd ${D}
        unzip -p TMP/mod.jar META-INF/MANIFEST.MF | grep Main-Class
                Main-Class: org.apache.tika.server.core.TikaServerCli

launch under jdb

        sudo -u tika /usr/bin/jdb \
         -classpath /srv/tika/TMP/mod.jar \
         org.apache.tika.server.core.TikaServerCli \
         -c /etc/tika/tika-server-config-custom.xml

now, which specific breakpoint(s) should i set here

        > stop in ???

so that on

        > run

and, once it stops on/after email receipt and the failed scan, i will find that 
trapped file in /tmp?
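
for the /tmp part, here is a minimal sketch of what i have in mind once the JVM is suspended. the breakpoint shown in the comments is only my guess, built from the class and method named in the quoted advice, and i have not confirmed that it fires in this setup. `rescue_tmp` and its arguments are hypothetical names of my own, and the sketch assumes a `find` that supports -maxdepth and -mmin (GNU or BSD):

```shell
#!/bin/sh
# Assumed jdb commands (a guess from the quoted class/method, unverified here):
#   > stop in org.apache.pdfbox.pdfparser.PDFParser.parse
#   > run
# While jdb holds the JVM suspended at that point, run rescue_tmp from a
# second terminal so the temp file cannot be cleaned up first.

# rescue_tmp SRC DEST [MINUTES] [PATTERN]
# Copy regular files directly inside SRC that were modified less than
# MINUTES minutes ago (default 5) and whose names match PATTERN
# (default *) into DEST, preserving timestamps and modes.
rescue_tmp() {
    src="$1"
    dest="$2"
    mins="${3:-5}"
    pattern="${4:-*}"

    # Create the destination first; give up if that is not possible.
    mkdir -p "$dest" || return 1

    # -maxdepth 1 keeps the search out of subdirectories of SRC.
    find "$src" -maxdepth 1 -type f -name "$pattern" -mmin "-$mins" \
        -exec cp -p {} "$dest"/ \;
}
```

i would call it as something like `rescue_tmp /tmp /srv/tika/TMP/rescued 5 '<name from the tika debug output>'`, possibly under sudo since the files belong to the tika user.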



