On 02.02.2017 at 20:26, Pulkit Kapur wrote:
Thanks. That's what I would expect to read. Also, thanks for pointing to the latest version. I pointed to the pdfbox-app-2.0.4.jar and the fontbox-2.0.4.jar files. Since I want to read over 1000 PDF documents programmatically in MATLAB, I am not using the command line but the Java library from within MATLAB. I am not sure why I am still *not* getting the text using getText():

{code}
pdfdoc = org.pdfbox.pdmodel.PDDocument;
pdfdoc.close;
reader = org.pdfbox.util.PDFTextStripper;
% list all the pdf files in the current folder
% listing = dir('**/*.pdf');
listing = dir('*.pdf');
pdfdoc = pdfdoc.load(fullfile(listing(i).folder,listing(i).name));
pdfdoc.isEncrypted;
%% text, with planty of padding
pdfstr = reader.getText(pdfdoc); %#ok
pdfdoc.close
{\code}
Are you getting nothing at all? Or just not all? Make sure you cleaned your class path.

Tilman
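
One way to check the class path from inside MATLAB is sketched below. It is only a rough diagnostic, not a fix. It assumes a MATLAB session with the JVM enabled and relies on the fact that PDFBox 2.0.x ships its classes under `org.apache.pdfbox` (for example `org.apache.pdfbox.text.PDFTextStripper`), whereas `org.pdfbox` is the old pre-Apache package name. If the old name resolves, an older PDFBox jar is probably still on the class path. The variable names are illustrative.

```matlab
% List every class path entry (static and dynamic) that mentions PDFBox or FontBox.
allPaths = javaclasspath('-all');
isPdfbox = ~cellfun(@isempty, regexpi(allPaths, 'pdfbox|fontbox', 'once'));
disp(allPaths(isPdfbox));

% Check which package names MATLAB can resolve.
% exist(..., 'class') returns 8 when the name is a visible class.
oldName = 'org.pdfbox.util.PDFTextStripper';        % pre-Apache package
newName = 'org.apache.pdfbox.text.PDFTextStripper'; % PDFBox 2.0.x package
fprintf('%s visible: %d\n', oldName, exist(oldName, 'class') == 8);
fprintf('%s visible: %d\n', newName, exist(newName, 'class') == 8);

% If the 2.0.x class is visible, ask the JVM which jar it was loaded from.
% Assumption: the class comes from a jar, so its code source is not null.
if exist(newName, 'class') == 8
    stripper = javaObject(newName);
    disp(stripper.getClass().getProtectionDomain().getCodeSource().getLocation());
end
```

If more than one PDFBox jar shows up in the listing, removing the stale entry (with `javarmpath` for the dynamic path, or by editing `javaclasspath.txt` for the static path) and restarting MATLAB would be the next thing to try.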

