Re: [poppler] Big image splash support

2020-10-04 Thread Martin (gzlist)
On Thu, 1 Oct 2020 at 23:48, Albert Astals Cid  wrote:
>
> If it's guarded, it's not wrong.
>
> It's like saying that code this code
>
> if (!a)
>   return;
>
> a->doSomething();
>
> is technically wrong because if a is null it will crash and is only guarded 
> by the if.
>
> Yes, that's what guards do, they make subsequent code free to assume stuff.

The issue is the guard is distant from all the rest of the code
depending on it. At the moment there are lots of methods with implicit
preconditions, some obvious (height is greater than zero), others less
so (width times components_of_mode plus row_pad times height is less
than some_large_number). It's just easier to work with code that is
trivially correct due to the types and operations being used, and the
compiler does more verification for you.

> Make the patches correct (to by possibly wrong perception).

Rather than make the prerequisite patches pointless, I've put up the
full set in a PR that you can build and try. You can also try removing
the type casts you don't see the reason for, and see where the program
then segfaults.

https://gitlab.freedesktop.org/poppler/poppler/-/merge_requests/645

Thanks,

Martin
___
poppler mailing list
poppler@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/poppler


Re: [poppler] Big image splash support

2020-09-27 Thread Martin (gzlist)
I didn't directly respond to this point, sorry William.

On Thu, 20 Aug 2020 at 13:57, William Bader  wrote:
>
> If the big image patches are committed, is it worth adding a command line 
> option to enable or disable big images or to set the max image size so 
> applications that should never see big images don't have to worry about DOS?

So, DoS is an issue regardless of the current allocator restrictions,
because of the features of the PDF format. It would be reasonable to
have some documentation about using the various sandboxing features of
platforms when handing untrusted PDFs, which in practice most uses of
the library will already do (previewing a PDF should not crash your
shell).

The cases I'm most worried about come from the current code around
buffer offsets, which need to be fixed before the allocator change
could land. With all those cases adapted to use the correct (64 bit)
operations, the binary Just Works for larger sizes, and the code
already does (something) in response to malloc failure.

Martin
___
poppler mailing list
poppler@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/poppler


Re: [poppler] Big image splash support

2020-09-27 Thread Martin (gzlist)
Thank you for coming back to this Albert, and I hope you had a good conference.

On Sat, 26 Sep 2020 at 21:18, Albert Astals Cid  wrote:
>
> Or there is nothing to fix because we know the code works, which is the case 
> here.

To clarify, the current code is technically wrong in many places, but
guarded by the hard limit in the allocator, as described in my initial
message.

> Doing changes for the sake of changing things is sometimes something we do,

The changes are not "for the sake of [it]", they are to support the
goal "big images" as named in the title of this thread.

> but we need to feel it, on those patches i wasn't feelint it.

I don't know how to move forward in response to this.

I'm unclear if your "feel it" has specific objections. You've not
stated that you would not accept any patches to the current allocator
limit, but perhaps that's the case?

Martin
___
poppler mailing list
poppler@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/poppler


Re: [poppler] Big image splash support

2020-09-19 Thread Martin (gzlist)
On Tue, 1 Sep 2020 at 11:34, Martin (gzlist)  wrote:
>
> Albert has mentioned in review that he's not clear on why the casts
> are there, and is not convinced about using size_t for height and
> width values. So, I'll have another go at explaining this.

Have you had a chance to read over this last post Albert? I believe
I've answered all your questions on the two preliminary PRs, so would
appreciate knowing if you have any further concerns.

Martin
___
poppler mailing list
poppler@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/poppler


Re: [poppler] Big image splash support

2020-09-01 Thread Martin (gzlist)
On Thu, 20 Aug 2020 at 12:43, Martin (gzlist)  wrote:
>
> There is an argument that all widths, heights, and coordinates should
> be size_t instead of int, but to limit the size and pain of the
> change, keeping the current interfaces using int seems reasonable.
...
> For the splash code, what I've implemented and propose landing is
> casting to size_t on index into a buffer, generally where
> multiplication happens.

I've posted a couple of initial patches to gitlab, with most of the
rest of the changes depending on the one that adds new buffer
accessors to SplashBitmap. One extra change is that to support the
negative rowSize convention where bottom-up data layout stores a
pointer to the start of the last pixel row rather than to the first,
the buffer offset type uses ptrdiff_t instead of size_t.

Albert has mentioned in review that he's not clear on why the casts
are there, and is not convinced about using size_t for height and
width values. So, I'll have another go at explaining this.

int w = 1 << 16;
int h = 1 << 16;
memset(some_ptr, 0, w * h);

What happens? The third parameter to [memset] is size_t but the
[multiplication] is happening as int. On a LP64 machine, this is
signed [integer overflow] and undefined behaviour.

This can be fixed by assigning into a wider type:

- int w = 1 << 16;
+ size_t w = (int)(1 << 16);

Or by casting to the result type in the operation:

- memset(some_ptr, 0, w * h);
+ memset(some_ptr, 0, (size_t)w * h);

Assigning or casting both values would be a little clearer, but the
promotion to the wider type happens regardless.

Because this is fiddly to get right, and verbose when spelled out with
c++ static_cast<>() syntax, I've [introduced helpers] for getting at
the buffer data to minimise the changes required. The general rule is
sizes should be promoted to size_t and memory offsets should be
promoted to ptrdiff_t.

One final note, I don't expect any performance impact, but
microbenchmarks aren't going to be very informative as the compiler
optimisations have a big effect here.

Martin


[memset] memset manpage
https://man7.org/linux/man-pages/man3/memset.3.html
[multiplication] How to Write Multiplies Correctly in C Code
https://www.ti.com/lit/an/spra683/spra683.pdf
[integer overflow] Understanding Integer Overflow in C/C++
https://wdtz.org/files/tosem15.pdf
[introduced helpers] splash: New buffer accessors for SplashBitmap
https://gitlab.freedesktop.org/poppler/poppler/-/merge_requests/623
___
poppler mailing list
poppler@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/poppler


Re: [poppler] Big image splash support

2020-08-27 Thread Martin (gzlist)
On Thu, 20 Aug 2020 at 16:53, Albert Astals Cid  wrote:
>
> What's your reason for needing such huge images?

Main point is they are not *that* huge? The [largest digital photo in
1999] was bigger.

Essentially, it's not unreasonable to have a working set larger than
2GiB. With some sample inputs my final output then is ~10MiB as png.
It's fine for the allocator to say you can't have N bytes because the
machine can't provide it, rather than because it doesn't fit in int32?

On Thu, 20 Aug 2020 at 22:28, Albert Astals Cid  wrote:
>
> You will always hit number limits at some point, my suggestion is to change 
> pdftoppm to just render tiles and then stitch it together if they don't fit 
> the current size limits.

In practice I am also tiling above the level of pdftoppm, but in
development wanted the flexibility of picking the tiling parameters to
find the best combination. It's likely to be smaller, say 8192x8192,
in deployment.

Martin


[largest digital photo in 1999]: 1.7 gigapixels, or a bit under 5GiB
of pixel data
https://en.wikipedia.org/wiki/List_of_largest_photographs#Portrait_of_a_Coral_Reef_(1999)
___
poppler mailing list
poppler@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/poppler


[poppler] Big image splash support

2020-08-20 Thread Martin (gzlist)
I have a patchset to support writing images with more than 2GiB pixel
data using the splash backend, which raises a few questions worth
running past this list.

Rendering a pdf (using for instance the pdftoppm utility) allocates a
bitmap to draw into and assemble the output. The bitmap dimensions and
coordinates are of type int, and the product is also treated as an
int, both in the gmem allocation, and in drawing logic. On an LP64
system, this means the max image size is under 27500x27500 and quite a
bit less than could fit in memory.

Large outputs result is this error:

$ pdftoppm -scale-to 32768 -png ~/sample.ai > /tmp/out.png
Bogus memory allocation size
$ file /tmp/out.png
/tmp/out.png: PNG image data, 1 x 1, 8-bit/color RGB, non-interlaced

The logged line is from the allocator level check, and after there is
a slightly surprising fallback to 1 pixel square output. From looking
at other [allocator errors] this probably exists so  *some* output is
produced for documents with only some bogus content?

There is an argument that all widths, heights, and coordinates should
be size_t instead of int, but to limit the size and pain of the
change, keeping the current interfaces using int seems reasonable.

So two distinct changes to make:
* Allocations larger than 2GiB must be allowed
* Buffer offsets must use a type larger than int
Also, optionally:
* Handle failed allocations a little better

For allocation, what I have currently is (A) change use of int in
gmem.c to long so all allocations can be >2GiB on 64-bit machines.
Alternatively I could (B) find just the uses of the allocator that may
need large areas and make them use a new gmem function.

Do you prefer (A) or (B) here? Given existing [pdf deflate bombs]
there's probably not much DoS protection provided by the 2GiB limit?

For the splash code, what I've implemented and propose landing is
casting to size_t on index into a buffer, generally where
multiplication happens. The tradeoff is it's harder to be sure of
correctness - my method of finding where to make changes was a
combination of grep, and gdb with real world inputs.

Is this acceptable? It's possible this method will introduce memory
safety issues specifically for large output sizes.

Note the [cairo max size] is explicitly (1<<15)-1 for surface
dimensions, and not addressed by my changes.

Martin


[allocator errors]: Some problems currently blocked at gmalloc level?
https://gitlab.freedesktop.org/poppler/poppler/-/issues/532
https://gitlab.freedesktop.org/poppler/poppler/-/issues/825

[pdf deflate bombs]: PDF Deflate bombs may cause crashes or resource exhaustion
https://gitlab.freedesktop.org/poppler/poppler/-/issues/878

[cairo max size]: Limits for the cairo backed
https://gitlab.freedesktop.org/cairo/cairo/-/blob/1.16/src/cairo-image-surface.c#L59
There's been some work in pixman, but it's incomplete?
https://gitlab.freedesktop.org/pixman/pixman/-/issues/6
___
poppler mailing list
poppler@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/poppler


Re: [poppler] [PATCH] Add -jpegopt optimize option support to utils

2018-07-20 Thread Martin (gzlist)
On 20 July 2018 at 22:43, Albert Astals Cid  wrote:
>
> Pushed, I removed the changes to the man pages that were not related to this 
> specific change, we can discuss them in a different thread if you think makes 
> sense (I kind of agree your definition of progressive was probably better, 
> but I didn't want to have "unrelated" changes in the commit)

Thanks! Yes, fair enough, probably better to do a pass over the
manpage wording as a separate change.

Martin
___
poppler mailing list
poppler@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/poppler


Re: [poppler] [PATCH] Add -jpegopt optimize option support to utils

2018-07-04 Thread Martin (gzlist)
On 24 June 2018 at 23:12, Martin (gzlist)  wrote:
>
> I've changed the print ones in pdftocairo, see amended patch also with
> other wording tweaks attached.

Is there anything else I can do to help you review this patch?

Am happy to do another pass over the documentation as a separate
change if that would help.

Martin
___
poppler mailing list
poppler@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/poppler


Re: [poppler] Utils - Fix UTF-16 file name on Windows environment

2018-06-26 Thread Martin (gzlist)
On 25/06/2018, Thibaut Brard  wrote:
>
> I just have a look to GooFile. As I understand, Martin suggests to use it
> instead of gfile.cc, am I right?

No, using openFile is fine in general, it just happens to be wrong in
the VMS case whereas a GooFile codepath includes the extra param.

For now, just changing the calls that you've been able to test locally
with non-ascii filenames sounds like a good first step for the patch.

Martin
___
poppler mailing list
poppler@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/poppler


Re: [poppler] [PATCH] Add -jpegopt optimize option support to utils

2018-06-24 Thread Martin (gzlist)
On 24/06/2018, Albert Astals Cid  wrote:
>
> Not an English native speaker myself, but are you sure of the
> Selects vs Select change?

Struggling to find a documentation style guide that actually states
it, but think in general imperative is preferred over simple present.
See for instance `man grep` -w and -x (and in fact the whole file).
The pdftoppm man page is not consistent, both tenses are used, compare
-x/-y -opw/-upw or -png -jpeg.

> So if Select is the correct one should we change that one too?

Probably. I don't want to rewrite the whole manpage as part of this
change, it seems unrelated.

I've changed the print ones in pdftocairo, see amended patch also with
other wording tweaks attached.

Martin
Add -jpegopt optimize option support to utils

New option 'optimize=y' for utils that take a -jpegopt param,
pdftocairo and pdftoppm. This corresponds to the cjpeg -optimize
flag, and slightly reduces the size of the output jpeg but uses
additional cpu and memory.

New jpegOptimize boolean in splash/SplashBitmap WriteImgParams.

New setOptimize method on goo/JpegWriter taking a boolean.

Update manpages for new option.

Signed-off-by: Martin Packman 
---
 goo/JpegWriter.cc  | 11 +++
 goo/JpegWriter.h   |  1 +
 splash/SplashBitmap.cc |  1 +
 splash/SplashBitmap.h  |  1 +
 utils/pdftocairo.1 | 17 -
 utils/pdftocairo.cc| 10 ++
 utils/pdftoppm.1   | 13 ++---
 utils/pdftoppm.cc  | 10 ++
 8 files changed, 56 insertions(+), 8 deletions(-)

diff --git a/goo/JpegWriter.cc b/goo/JpegWriter.cc
index 37c15c2a..9ac9b63b 100644
--- a/goo/JpegWriter.cc
+++ b/goo/JpegWriter.cc
@@ -25,6 +25,7 @@ extern "C" {
 
 struct JpegWriterPrivate {
   bool progressive;
+  bool optimize;
   int quality;
   JpegWriter::Format format;
   struct jpeg_compress_struct cinfo;
@@ -46,6 +47,7 @@ JpegWriter::JpegWriter(int q, bool p, Format formatA)
 {
   priv = new JpegWriterPrivate;
   priv->progressive = p;
+  priv->optimize = false;
   priv->quality = q;
   priv->format = formatA;
 }
@@ -54,6 +56,7 @@ JpegWriter::JpegWriter(Format formatA)
 {
   priv = new JpegWriterPrivate;
   priv->progressive = false;
+  priv->optimize = false;
   priv->quality = -1;
   priv->format = formatA;
 }
@@ -75,6 +78,11 @@ void JpegWriter::setProgressive(bool progressive)
   priv->progressive = progressive;
 }
 
+void JpegWriter::setOptimize(bool optimize)
+{
+  priv->optimize = optimize;
+}
+
 bool JpegWriter::init(FILE *f, int width, int height, int hDPI, int vDPI)
 {
   // Setup error handler
@@ -137,6 +145,9 @@ bool JpegWriter::init(FILE *f, int width, int height, int hDPI, int vDPI)
 jpeg_simple_progression(>cinfo);
   }
 
+  // Set whether to compute optimal Huffman coding tables
+  priv->cinfo.optimize_coding = priv->optimize;
+
   // Get ready for data
   jpeg_start_compress(>cinfo, TRUE);
 
diff --git a/goo/JpegWriter.h b/goo/JpegWriter.h
index 9a68bcc9..bb87b949 100644
--- a/goo/JpegWriter.h
+++ b/goo/JpegWriter.h
@@ -41,6 +41,7 @@ public:
 
   void setQuality(int quality);
   void setProgressive(bool progressive);
+  void setOptimize(bool optimize);
   bool init(FILE *f, int width, int height, int hDPI, int vDPI) override;
 
   bool writePointers(unsigned char **rowPointers, int rowCount) override;
diff --git a/splash/SplashBitmap.cc b/splash/SplashBitmap.cc
index 2bec9f93..6f5e4cf8 100644
--- a/splash/SplashBitmap.cc
+++ b/splash/SplashBitmap.cc
@@ -356,6 +356,7 @@ void SplashBitmap::setJpegParams(ImgWriter *writer, WriteImgParams* params)
 #ifdef ENABLE_LIBJPEG
   if (params) {
 static_cast(writer)->setProgressive(params->jpegProgressive);
+static_cast(writer)->setOptimize(params->jpegOptimize);
 if (params->jpegQuality >= 0)
   static_cast(writer)->setQuality(params->jpegQuality);
   }
diff --git a/splash/SplashBitmap.h b/splash/SplashBitmap.h
index 092bd4cf..9756164f 100644
--- a/splash/SplashBitmap.h
+++ b/splash/SplashBitmap.h
@@ -81,6 +81,7 @@ public:
 int jpegQuality = -1;
 GBool jpegProgressive = gFalse;
 GooString tiffCompression;
+GBool jpegOptimize = gFalse;
   };
 
   SplashError writeImgFile(SplashImageFileFormat format, char *fileName, int hDPI, int vDPI, WriteImgParams* params = nullptr);
diff --git a/utils/pdftocairo.1 b/utils/pdftocairo.1
index 3a60b73e..6be3c410 100644
--- a/utils/pdftocairo.1
+++ b/utils/pdftocairo.1
@@ -305,21 +305,28 @@ When JPEG output is specified, the \-jpegopt option can be used to control the J
 It takes a string of the form "=[,=]". Currently the available options are:
 .TP
 .BI quality
-Selects the JPEG quality value. The value must be an integer between 0 and 100.
+Select the JPEG quality. The value must be an integer between 0 and 100.
 .TP
 .BI progressive
-Select progressive JPEG output. The possible values are "y", "n",
-indicating progressive (yes) or non-progressive (no), respectively.
+Select whether to write the JPEG as multiple scans of increasing quality. The
+value must be "y" or 

Re: [poppler] [PATCH] Add -jpegopt optimize option support to utils

2018-06-13 Thread Martin (gzlist)
On 13/06/2018, Albert Astals Cid  wrote:
> You probably need to update the manpages of both utils to include that?

Done, see amended patch attached. Also updated the wording of the
other options a little, with reference to the jpeg-turbo documentation
on the options.

I note the copyright on AUTHOR below seems a little outdated, but left
that alone.

Martin
diff --git a/goo/JpegWriter.cc b/goo/JpegWriter.cc
index 37c15c2a..9ac9b63b 100644
--- a/goo/JpegWriter.cc
+++ b/goo/JpegWriter.cc
@@ -25,6 +25,7 @@ extern "C" {
 
 struct JpegWriterPrivate {
   bool progressive;
+  bool optimize;
   int quality;
   JpegWriter::Format format;
   struct jpeg_compress_struct cinfo;
@@ -46,6 +47,7 @@ JpegWriter::JpegWriter(int q, bool p, Format formatA)
 {
   priv = new JpegWriterPrivate;
   priv->progressive = p;
+  priv->optimize = false;
   priv->quality = q;
   priv->format = formatA;
 }
@@ -54,6 +56,7 @@ JpegWriter::JpegWriter(Format formatA)
 {
   priv = new JpegWriterPrivate;
   priv->progressive = false;
+  priv->optimize = false;
   priv->quality = -1;
   priv->format = formatA;
 }
@@ -75,6 +78,11 @@ void JpegWriter::setProgressive(bool progressive)
   priv->progressive = progressive;
 }
 
+void JpegWriter::setOptimize(bool optimize)
+{
+  priv->optimize = optimize;
+}
+
 bool JpegWriter::init(FILE *f, int width, int height, int hDPI, int vDPI)
 {
   // Setup error handler
@@ -137,6 +145,9 @@ bool JpegWriter::init(FILE *f, int width, int height, int hDPI, int vDPI)
 jpeg_simple_progression(>cinfo);
   }
 
+  // Set whether to compute optimal Huffman coding tables
+  priv->cinfo.optimize_coding = priv->optimize;
+
   // Get ready for data
   jpeg_start_compress(>cinfo, TRUE);
 
diff --git a/goo/JpegWriter.h b/goo/JpegWriter.h
index 9a68bcc9..bb87b949 100644
--- a/goo/JpegWriter.h
+++ b/goo/JpegWriter.h
@@ -41,6 +41,7 @@ public:
 
   void setQuality(int quality);
   void setProgressive(bool progressive);
+  void setOptimize(bool optimize);
   bool init(FILE *f, int width, int height, int hDPI, int vDPI) override;
 
   bool writePointers(unsigned char **rowPointers, int rowCount) override;
diff --git a/splash/SplashBitmap.cc b/splash/SplashBitmap.cc
index 2bec9f93..6f5e4cf8 100644
--- a/splash/SplashBitmap.cc
+++ b/splash/SplashBitmap.cc
@@ -356,6 +356,7 @@ void SplashBitmap::setJpegParams(ImgWriter *writer, WriteImgParams* params)
 #ifdef ENABLE_LIBJPEG
   if (params) {
 static_cast(writer)->setProgressive(params->jpegProgressive);
+static_cast(writer)->setOptimize(params->jpegOptimize);
 if (params->jpegQuality >= 0)
   static_cast(writer)->setQuality(params->jpegQuality);
   }
diff --git a/splash/SplashBitmap.h b/splash/SplashBitmap.h
index 092bd4cf..9756164f 100644
--- a/splash/SplashBitmap.h
+++ b/splash/SplashBitmap.h
@@ -81,6 +81,7 @@ public:
 int jpegQuality = -1;
 GBool jpegProgressive = gFalse;
 GooString tiffCompression;
+GBool jpegOptimize = gFalse;
   };
 
   SplashError writeImgFile(SplashImageFileFormat format, char *fileName, int hDPI, int vDPI, WriteImgParams* params = nullptr);
diff --git a/utils/pdftocairo.1 b/utils/pdftocairo.1
index 3a60b73e..4c32f920 100644
--- a/utils/pdftocairo.1
+++ b/utils/pdftocairo.1
@@ -305,11 +305,18 @@ When JPEG output is specified, the \-jpegopt option can be used to control the J
 It takes a string of the form "=[,=]". Currently the available options are:
 .TP
 .BI quality
-Selects the JPEG quality value. The value must be an integer between 0 and 100.
+Select the JPEG quality. The value must be an integer between 0 and 100.
 .TP
 .BI progressive
-Select progressive JPEG output. The possible values are "y", "n",
-indicating progressive (yes) or non-progressive (no), respectively.
+Select whether to write the JPEG as multiple scans of increasing quality. The
+value must be "y" or "n", with "y" indicating a progressive image, otherwise a
+single-scan image is created.
+.TP
+.BI optimize
+Set whether to compute optimal Huffman coding tables for the JPEG output,
+which creates smaller files but requires and extra pass over the data. The
+value must be "y" or "n", with "y" performing optimization, otherwise the
+default Huffman tables are used.
 .SH WINDOWS PRINTER OPTIONS
 In Windows, you can use the \-print option to print directly to a system printer. Additionally, you can use the \-printopt 
 option to configure the printer. It takes a string of the form "=[,=]". Currently the available options are:
diff --git a/utils/pdftocairo.cc b/utils/pdftocairo.cc
index 05ceaa3b..e5e49a19 100644
--- a/utils/pdftocairo.cc
+++ b/utils/pdftocairo.cc
@@ -129,6 +129,7 @@ static GBool printHelp = gFalse;
 static GooString jpegOpt;
 static int jpegQuality = -1;
 static bool jpegProgressive = false;
+static bool jpegOptimize = false;
 
 static GooString printer;
 static GooString printOpt;
@@ -369,6 +370,14 @@ static GBool parseJpegOptions()
 	fprintf(stderr, "jpeg progressive option must be \"y\" or \"n\"\n");
 	return gFalse;
   }
+} else 

[poppler] [PATCH] Add -jpegopt optimize option support to utils

2018-06-09 Thread Martin (gzlist)
New option 'optimize=y' for utils that take a -jpegopt param,
pdftocairo and pdftoppm. This corresponds to the cjpeg -optimize
flag, and slightly reduces the size of the output jpeg but uses
additional cpu and memory.

New jpegOptimize boolean in splash/SplashBitmap WriteImgParams.

New setOptimize method on goo/JpegWriter taking a boolean.

Signed-off-by: Martin Packman 
---
 goo/JpegWriter.cc  | 11 +++
 goo/JpegWriter.h   |  1 +
 splash/SplashBitmap.cc |  1 +
 splash/SplashBitmap.h  |  1 +
 utils/pdftocairo.cc| 10 ++
 utils/pdftoppm.cc  | 10 ++
 6 files changed, 34 insertions(+)
diff --git a/goo/JpegWriter.cc b/goo/JpegWriter.cc
index 37c15c2a..9ac9b63b 100644
--- a/goo/JpegWriter.cc
+++ b/goo/JpegWriter.cc
@@ -25,6 +25,7 @@ extern "C" {
 
 struct JpegWriterPrivate {
   bool progressive;
+  bool optimize;
   int quality;
   JpegWriter::Format format;
   struct jpeg_compress_struct cinfo;
@@ -46,6 +47,7 @@ JpegWriter::JpegWriter(int q, bool p, Format formatA)
 {
   priv = new JpegWriterPrivate;
   priv->progressive = p;
+  priv->optimize = false;
   priv->quality = q;
   priv->format = formatA;
 }
@@ -54,6 +56,7 @@ JpegWriter::JpegWriter(Format formatA)
 {
   priv = new JpegWriterPrivate;
   priv->progressive = false;
+  priv->optimize = false;
   priv->quality = -1;
   priv->format = formatA;
 }
@@ -75,6 +78,11 @@ void JpegWriter::setProgressive(bool progressive)
   priv->progressive = progressive;
 }
 
+void JpegWriter::setOptimize(bool optimize)
+{
+  priv->optimize = optimize;
+}
+
 bool JpegWriter::init(FILE *f, int width, int height, int hDPI, int vDPI)
 {
   // Setup error handler
@@ -137,6 +145,9 @@ bool JpegWriter::init(FILE *f, int width, int height, int hDPI, int vDPI)
 jpeg_simple_progression(>cinfo);
   }
 
+  // Set whether to compute optimal Huffman coding tables
+  priv->cinfo.optimize_coding = priv->optimize;
+
   // Get ready for data
   jpeg_start_compress(>cinfo, TRUE);
 
diff --git a/goo/JpegWriter.h b/goo/JpegWriter.h
index 9a68bcc9..bb87b949 100644
--- a/goo/JpegWriter.h
+++ b/goo/JpegWriter.h
@@ -41,6 +41,7 @@ public:
 
   void setQuality(int quality);
   void setProgressive(bool progressive);
+  void setOptimize(bool optimize);
   bool init(FILE *f, int width, int height, int hDPI, int vDPI) override;
 
   bool writePointers(unsigned char **rowPointers, int rowCount) override;
diff --git a/splash/SplashBitmap.cc b/splash/SplashBitmap.cc
index 2bec9f93..6f5e4cf8 100644
--- a/splash/SplashBitmap.cc
+++ b/splash/SplashBitmap.cc
@@ -356,6 +356,7 @@ void SplashBitmap::setJpegParams(ImgWriter *writer, WriteImgParams* params)
 #ifdef ENABLE_LIBJPEG
   if (params) {
 static_cast(writer)->setProgressive(params->jpegProgressive);
+static_cast(writer)->setOptimize(params->jpegOptimize);
 if (params->jpegQuality >= 0)
   static_cast(writer)->setQuality(params->jpegQuality);
   }
diff --git a/splash/SplashBitmap.h b/splash/SplashBitmap.h
index 092bd4cf..9756164f 100644
--- a/splash/SplashBitmap.h
+++ b/splash/SplashBitmap.h
@@ -81,6 +81,7 @@ public:
 int jpegQuality = -1;
 GBool jpegProgressive = gFalse;
 GooString tiffCompression;
+GBool jpegOptimize = gFalse;
   };
 
   SplashError writeImgFile(SplashImageFileFormat format, char *fileName, int hDPI, int vDPI, WriteImgParams* params = nullptr);
diff --git a/utils/pdftocairo.cc b/utils/pdftocairo.cc
index 05ceaa3b..e5e49a19 100644
--- a/utils/pdftocairo.cc
+++ b/utils/pdftocairo.cc
@@ -129,6 +129,7 @@ static GBool printHelp = gFalse;
 static GooString jpegOpt;
 static int jpegQuality = -1;
 static bool jpegProgressive = false;
+static bool jpegOptimize = false;
 
 static GooString printer;
 static GooString printOpt;
@@ -369,6 +370,14 @@ static GBool parseJpegOptions()
 	fprintf(stderr, "jpeg progressive option must be \"y\" or \"n\"\n");
 	return gFalse;
   }
+} else if (opt.cmp("optimize") == 0 || opt.cmp("optimise") == 0) {
+  jpegOptimize = gFalse;
+  if (value.cmp("y") == 0) {
+	jpegOptimize = gTrue;
+  } else if (value.cmp("n") != 0) {
+	fprintf(stderr, "jpeg optimize option must be \"y\" or \"n\"\n");
+	return gFalse;
+  }
 } else {
   fprintf(stderr, "Unknown jpeg option \"%s\"\n", opt.getCString());
   return gFalse;
@@ -415,6 +424,7 @@ static void writePageImage(GooString *filename)
 else
   writer = new JpegWriter(JpegWriter::RGB);
 
+static_cast(writer)->setOptimize(jpegOptimize);
 static_cast(writer)->setProgressive(jpegProgressive);
 if (jpegQuality >= 0)
   static_cast(writer)->setQuality(jpegQuality);
diff --git a/utils/pdftoppm.cc b/utils/pdftoppm.cc
index 4d567bd4..8ea81d04 100644
--- a/utils/pdftoppm.cc
+++ b/utils/pdftoppm.cc
@@ -92,6 +92,7 @@ static GBool tiff = gFalse;
 static GooString jpegOpt;
 static int jpegQuality = -1;
 static bool jpegProgressive = false;
+static bool jpegOptimize = false;
 #ifdef SPLASH_CMYK
 static GBool overprint = gFalse;
 #endif
@@ -257,6 +258,14 @@ static