Re: How can I reduce the size of scanned PDF files?

2023-02-27 Thread Lachezar Dobrev
Typical scanners scan to PDF by creating a page with one image spanning the whole page. Some scanners that have built-in OCR may put transparent text over the scanned image for select-and-copy purposes. Typically, though, the compression of the large image leaves room for improvement. Easiest

Merging multiple PDFs in single page

2019-07-30 Thread Lachezar Dobrev
Hello, I am evaluating the possibility of allowing users to choose multiple PDFs that contain blocks of content to be coalesced into a single PDF. I know that pages can be coalesced into a single PDF with all the pages, but in this case the PDFs would contain less than a page, and I would like to

Re: QR Image Detection Down 2.0.13 -> 2.0.14

2019-04-05 Thread Lachezar Dobrev
Maybe this is a bit off, but my experience with PDFBox and looking for bar-codes led me to completely avoid rendering the PDF. Instead I iterate the resources in the pages, and look for images which I process one by one by calling PDImage.getImage(). This seems to avoid a few problems. Also that

Re: Draw arc with given data(start point, start angle, sweep angle, major axis, minor axis)

2019-03-11 Thread Lachezar Dobrev
I believe the OP means that the arc is part of an ellipse, hence minor and major axis. On Mon, 11.03.2019 at 9:28, Tilman Hausherr wrote: > > Hi, > > I entered "sweep angle arc" into google images so now I see what you > mean with "sweep angle". > > When you write "major axis" and "minor

Re: Extract embedded SVG image from PDF file

2019-03-07 Thread Lachezar Dobrev
Now that it is clear that there is no SVG in the PDF, why not simply render the PDF to PNG and crop the graph part? Unless there is some weird requirement to use SVG rather than PNG… If there is a hard requirement for SVG one could use the PDFRenderer.renderPageToGraphics[1] in

Re: Detecting if PDF contains only/mostly images.

2017-11-06 Thread Lachezar Dobrev
ary and the command > line tools are for convenience :-) > > Tilman > > > On 31.10.2017 at 11:18, Lachezar Dobrev wrote: >> >>Ahh... You mean use the tool as a *ahm* tool? >>I'm so used to seeing these as parts of the command-line tools that >>

Re: Detecting if PDF contains only/mostly images.

2017-10-31 Thread Lachezar Dobrev
space is written to it, and use the writeText(PDDocument,Writer) to quickly cancel processing when non-white space is found. 2017-10-30 19:54 GMT+02:00 Tilman Hausherr <thaush...@t-online.de>: > On 30.10.2017 at 16:52, Lachezar Dobrev wrote: >> >>I have been looking at it
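
The cancelling idea in this snippet can be sketched as a small Writer that rejects anything other than white space. This is only an illustration under assumptions: the class name WhitespaceOnlyWriter and its nested exception NonWhitespaceFound are hypothetical, and the start of the original message is truncated, so the author's actual approach may differ.

```java
import java.io.IOException;
import java.io.Writer;

// Hypothetical helper: accepts white space silently and aborts on the
// first non-white-space character written to it.
final class WhitespaceOnlyWriter extends Writer {

    // Hypothetical marker exception used to signal "real text was found".
    static final class NonWhitespaceFound extends IOException {
        NonWhitespaceFound() {
            super("non-white-space character written");
        }
    }

    @Override
    public void write(char[] buf, int off, int len) throws IOException {
        for (int i = off; i < off + len; i++) {
            if (!Character.isWhitespace(buf[i])) {
                // Throwing here stops whoever is writing to us.
                throw new NonWhitespaceFound();
            }
        }
    }

    @Override
    public void flush() {
        // Nothing is buffered, so there is nothing to flush.
    }

    @Override
    public void close() {
        // No underlying resource to release.
    }
}
```

An instance would be passed as the Writer argument of writeText(PDDocument,Writer); catching NonWhitespaceFound then means the document contains extractable text, while a normal return means only white space was produced.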

Re: Detecting if PDF contains only/mostly images.

2017-10-30 Thread Lachezar Dobrev
tell me that there is additional content on the page other than the single image? 2017-10-30 15:53 GMT+02:00 Tilman Hausherr <thaush...@t-online.de>: > On 30.10.2017 at 14:04, Lachezar Dobrev wrote: >> >>I have to process PDF files, that (supposedly) contain one big ima

Detecting if PDF contains only/mostly images.

2017-10-30 Thread Lachezar Dobrev
I have to process PDF files that (supposedly) contain one big image per page, which is the output of a document scanner. I'd like to avoid performing PDF-to-image conversion in these cases, and use the underlying image instead. I am not well-versed in all things PDF and have no idea how to detect if a

Re: converting hex to PDColor

2017-03-13 Thread Lachezar Dobrev
Hmm... 1. java.awt.Color.decode(colorStr); 2. You're using integer division: "rgb.getRed()/255" will yield 0 or 1, which is then cast to float. Use "getRed()/255f" to get a float result. Your integer division code will only yield a red colour with #FF8000, which I suspect gets
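
The difference between the two divisions can be shown with a short sketch. The class and method names below are hypothetical and only java.awt.Color is used; handing the resulting floats to a PDColor is left out.

```java
import java.awt.Color;

// Hypothetical helper illustrating int versus float division of colour components.
final class HexColorComponents {

    // Converts "#RRGGBB" into three floats in the range 0..1.
    static float[] toUnitRgb(String hex) {
        Color rgb = Color.decode(hex);
        return new float[] {
            rgb.getRed() / 255f,   // float division keeps the fraction
            rgb.getGreen() / 255f,
            rgb.getBlue() / 255f
        };
    }

    // The problematic variant: int / int truncates to 0 or 1 before
    // the result is widened to float.
    static float[] toUnitRgbTruncated(String hex) {
        Color rgb = Color.decode(hex);
        return new float[] {
            rgb.getRed() / 255,
            rgb.getGreen() / 255,
            rgb.getBlue() / 255
        };
    }
}
```

For #FF8000 the truncated variant yields pure red (1, 0, 0), because the green value 128 divided by 255 truncates to 0, while the float variant keeps green at roughly 0.5.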

Re: Rendering of a Devanagari text

2017-01-20 Thread Lachezar Dobrev
Apologies for being blunt, but seeing that you're mixing string literals and UNICODE escape sequences, I have to ask: are you *sure* you're using the same character set when editing the .java file and when compiling it? I've had discrepancies when editing the java file in one encoding (say

Re: Bolding, enhancing fonts when converting to image.

2016-07-18 Thread Lachezar Dobrev
> > -- John > > > Tilman > > > > > >> On 28.06.2016 at 14:06, Lachezar Dobrev wrote: > >> I am not expecting anything to do with the library code changing. > >> I was hoping there is some bulk-font-change technique that I can do > with > >>

Re: Bolding, enhancing fonts when converting to image.

2016-06-28 Thread Lachezar Dobrev
scaling-down that image using lanczos (just Google for a Java > implementation). Then threshold that image to binary - an adaptive > threshold might work well here. > > -- John > > > On 27 Jun 2016, at 09:10, Lachezar Dobrev <l.dob...@gmail.com> wrote: > > > >

Re: Bolding, enhancing fonts when converting to image.

2016-06-28 Thread Lachezar Dobrev
rendering the pages. 2016-06-27 19:39 GMT+03:00 Tilman Hausherr <thaush...@t-online.de>: > On 27.06.2016 at 18:10, Lachezar Dobrev wrote: > >>Hey all, >> >>I need to print PDFs to images to be forwarded to a low-resolution >> printer (200 dpi). >>

Bolding, enhancing fonts when converting to image.

2016-06-27 Thread Lachezar Dobrev
Hey all, I need to print PDFs to images to be forwarded to a low-resolution printer (200 dpi). Printing works (Yay!), and the ability to specify colour space and resolution helps immensely. When reading the PDFs I get these error messages: VI 27, 2016 5:53:39 PM

Re: Replacing images in PDFs.

2015-09-02 Thread Lachezar Dobrev
In-line 2015-09-01 19:31 GMT+03:00 Tilman Hausherr <thaush...@t-online.de>: > On 01.09.2015 at 11:26, Lachezar Dobrev wrote: > >>Hello all. >>I'm tasked with providing a service to generate PDFs from template PDFs >> by replacing text place holders and

Replacing images in PDFs.

2015-09-01 Thread Lachezar Dobrev
Hello all. I'm tasked with providing a service to generate PDFs from template PDFs by replacing text place holders and image place holders with data from a database. For replacing text we decided to use Form Fields to keep minimal effect on the page layout, and to avoid problems with texts

Re: Bouncycastle Provider Suddenly Stopped Working when I went to PDFBox 2.0

2015-07-02 Thread Lachezar Dobrev
You might want to check whether you're being plagued by transitive dependency woes. Use mvn dependency:tree to check the dependencies in your project. You may be surprised. 2015-07-02 17:09 GMT+03:00 Evan Williams <evan.willi...@zapprx.com>: I tried updating to bouncycastle 1.52 (I was using

Re: Whether pdfbox can extract the picture from pdf?

2015-05-29 Thread Lachezar Dobrev
Yes, it is possible. PDDocument pdf = PDDocument.load(input_stream); List pages = pdf.getDocumentCatalog().getAllPages(); for (int pageNo = 0; pageNo < pdf.getNumberOfPages(); pageNo ++) { PDPage page = (PDPage) pages.get(pageNo); PDResources rsc = page.getResources(); if

Re: org.apache.pdfbox.rendering.PDFRenderer logs

2015-03-16 Thread Lachezar Dobrev
In that case I suspect you need to check JULI [1]. It's not short, and at places it is cryptic, but is the definitive source. A quick-and-dirty cork-in: 1. create a logging.properties file (name is arbitrary) 2. put in that file the following two lines handlers= .level=OFF 3.
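
The two property lines above are ordinary java.util.logging settings, so the same effect can also be sketched programmatically. This is an assumption-laden alternative, not the truncated third step of the message: the class name QuietLogging is hypothetical, and it only helps when the log output actually goes through java.util.logging.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.logging.LogManager;

// Hypothetical helper that applies "handlers=" and ".level=OFF" at runtime.
final class QuietLogging {

    static void silenceJavaUtilLogging() throws IOException {
        // Same two lines as in the logging.properties file described above.
        String config = "handlers=\n.level=OFF\n";
        // readConfiguration resets the current setup and applies the new one.
        LogManager.getLogManager().readConfiguration(
                new ByteArrayInputStream(config.getBytes(StandardCharsets.ISO_8859_1)));
    }
}
```

Calling QuietLogging.silenceJavaUtilLogging() early in the application, before any rendering starts, switches the root logger off for the rest of the run.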

Re: WARNING: Using fallback font

2015-03-16 Thread Lachezar Dobrev
I'd rather 'alias' the font in either: - /etc/fonts/fonts.conf - ~/.config/fontconfig/fonts.conf 2015-03-15 13:08 GMT+02:00 Tilman Hausherr <thaush...@t-online.de>: Install the Times New Roman TT font on your system. Tilman On 15.03.2015 at 03:16, Andrew Munn wrote: I am running

Re: Private key and mark

2014-02-19 Thread Lachezar Dobrev
Well... Typically when using a hardware device one does not use the key material directly, but rather uses a PKCS#11 library to interface with the device and call on it to perform any cryptographic operations it needs to: signatures, verification, encryption, decryption, generation etc. For this

Re: [SURVEY] PDFBox Uses Cases

2014-01-06 Thread Lachezar Dobrev
Main use: extract images from PDFs created by scanner/fax. 2014/1/6 Maruan Sahyoun sahy...@fileaffairs.de: Dear PDFBox users, we’d love to hear from you how you are using PDFBox in your PDF applications. Do you use it for rendering, merging, creation … - what is the main application?

Re: Problem loading large pdf files

2013-10-30 Thread Lachezar Dobrev
This smells of Exception overwriting. BaseParser.java:610 is actually a clean-up procedure, and if it crashes it's quite possible that the original error is lost. I have a gut feeling that there is an OOME somewhere above, that gets wiped out by a crashed clean-up procedure. That said: did you
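
How a failing clean-up can wipe out the original error is easy to reproduce in isolation. The sketch below uses stand-in exceptions and hypothetical names; it does not reproduce BaseParser, only the general masking pattern and one way to keep the original error visible.

```java
// Hypothetical demonstration of exception overwriting by a failing clean-up.
final class CleanupMasking {

    // Stand-in for a clean-up procedure that itself crashes.
    static void cleanup() {
        throw new UnsupportedOperationException("clean-up failure");
    }

    // The clean-up runs in finally; its exception replaces the original one.
    static void parseMasking() {
        try {
            throw new IllegalStateException("original failure");
        } finally {
            cleanup();
        }
    }

    // Keeps the original exception and attaches the clean-up failure to it.
    static void parsePreserving() {
        RuntimeException primary = null;
        try {
            throw new IllegalStateException("original failure");
        } catch (RuntimeException e) {
            primary = e;
            throw e;
        } finally {
            try {
                cleanup();
            } catch (RuntimeException cleanupError) {
                if (primary != null) {
                    primary.addSuppressed(cleanupError);
                } else {
                    throw cleanupError;
                }
            }
        }
    }
}
```

With parseMasking() the caller only ever sees the clean-up failure; with parsePreserving() the caller sees the original failure, and the clean-up failure is listed under getSuppressed().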

Re: Images in pdf not the same RGB

2013-09-10 Thread Lachezar Dobrev
Possibly the images get compressed as JPEG or similar. I've had problems with other PDF generators too, occasionally when adding a Bar Code to the PDF it gets severely distorted. As it turns out the encoder compressed the image using JPEG, which results in severe errors in the image.

Unable to read PDF with embedded Black-And-White TIFF.

2013-07-29 Thread Lachezar Dobrev
Hello colleagues. For the past month or two I've been using PDFBox to read PDF files received from a scanner. Recently some of the users started receiving this error: java.lang.RuntimeException: EOL encountered in black run. at