Within PDFBox, set a breakpoint here:

If execution stops at that point, then the file still exists.
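
A minimal sketch of that procedure, assuming the breakpoint meant here is the org.apache.pdfbox.pdfparser.PDFParser.parse() method named in the earlier message quoted below. The helper name save_trapped_file, the source file name and the destination directory are placeholders for illustration, not values from this thread; the real file name has to be taken from the Tika debug output.

```shell
# In the jdb session (these two lines are jdb commands, not shell):
#   > stop in org.apache.pdfbox.pdfparser.PDFParser.parse
#   > run
# If the class is not loaded yet, jdb defers the breakpoint until it is.

# In a second terminal, while the JVM is suspended at the breakpoint,
# copy the temporary file out of /tmp before it is cleaned up.
# save_trapped_file is a hypothetical helper, not part of Tika or PDFBox.
save_trapped_file() {
    src="$1"        # path of the temp file named in the Tika debug output
    dest_dir="$2"   # any directory outside /tmp
    mkdir -p "$dest_dir" && cp -p "$src" "$dest_dir"/
}

# Example call (placeholder file name):
#   save_trapped_file /tmp/NAME-FROM-TIKA-DEBUG-OUTPUT "$HOME/trapped"
```

The copy may need to run as the same user as the server (or via sudo) if the temporary file is not world-readable.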
Tilman

On 20.07.2022 at 12:32, PGNet Dev wrote:
On 7/19/22 10:46 PM, Tilman Hausherr wrote:
You can also put a breakpoint in PDFBox, then go to
org.apache.pdfbox.pdfparser.PDFParser.parse()
and when execution stops at the breakpoint there (it definitely passes that point!), look in your /tmp directory for the file that is mentioned in the Tika debug output and copy it somewhere else.

the breakpoint guessing and the various builds I've attempted (other thread) haven't been fruitful

at this point, it'd be helpful to be specific about the correct breakpoint

what exactly do you mean by "set a breakpoint" and "go to"?

at the moment, I download

    D="/srv/tika"
    F="tika-server-standard-2.4.2-20220720.025305-98.jar"
    cd ${D}
    rm -rf TMP
    mkdir -p TMP/mod
    cd TMP
    rm -f ${F}*
    wget https://repository.apache.org/content/groups/snapshots/org/apache/tika/tika-server-standard/2.4.2-SNAPSHOT/${F}
    cd mod

extract and turn debug logging on

    jar -xfv ../${F}
    perl -pi -e 's|Root level="info"|Root level="debug"|g' log4j2.xml

repack

    jar -cvmf META-INF/MANIFEST.MF ../mod.jar *

get main class

    cd ${D}
    unzip -p TMP/mod.jar META-INF/MANIFEST.MF | grep Main-Class
        Main-Class: org.apache.tika.server.core.TikaServerCli

launch under jdb

    sudo -u tika /usr/bin/jdb \
     -classpath /srv/tika/TMP/mod.jar \
     org.apache.tika.server.core.TikaServerCli \
     -c /etc/tika/tika-server-config-custom.xml

now, which specific breakpoint(s) should be set here

    > stop in ???

so that on

    > run

and a stop on or after email receipt and the failed scan, I will find that trapped file in /tmp?



