[
https://issues.apache.org/jira/browse/PDFBOX-6192?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18081632#comment-18081632
]
Stefan Ziegler commented on PDFBOX-6192:
----------------------------------------
Thanks for adding this — it covers exactly the construction case the deprecated
setters left unsupported. A few points:
*1. The two null-checks are identical (copy-paste bug).* Both read {{{}if (base
== null && lookupData == null){}}}. As written, neither throws unless _both_
are null — and then only the first message fires. Since the method dereferences
_both_ arguments unconditionally afterwards ({{{}base.getCOSObject(){}}},
{{{}Arrays.copyOf(lookupData, …){}}}, {{{}new COSString(lookupData, …){}}}),
each must be non-null independently:
{code:java}
if (base == null)
throw new IllegalArgumentException("base must not be null");
if (lookupData == null)
throw new IllegalArgumentException("lookupData must not be null");{code}
*2. Yes — a length check is worthwhile, but only a _minimum_ check.* The
expected lookup size per PDF 32000-1 §8.6.6.3 is {{(hival + 1) ×
base.getNumberOfComponents()}} bytes.
* *Too short* is a genuine caller error: {{readColorTable()}} /
{{initRgbColorTable()}} will read past the end and fail deep inside with an
{{ArrayIndexOutOfBoundsException}} and an opaque stack trace. A clear upfront
message is much friendlier:
{code:java}
int expected = (hival + 1) * base.getNumberOfComponents();
if (lookupData.length < expected)
throw new IllegalArgumentException(
"lookupData too short: expected at least " + expected
+ " bytes ((hival+1) * components), got " + lookupData.length);{code}
* *Too long* should _not_ be rejected. The spec tolerates extra trailing
bytes, and real-world PDFs frequently carry slightly oversized tables — a
strict equality check would reject valid input. So check {{{}<{}}}, not
{{{}!={}}}.
*3. Validate {{hival}} too.* The spec requires {{{}0 ≤ hival ≤ 255{}}}; a
negative or >255 value is malformed and should be rejected before it reaches
{{{}readColorTable(){}}}.
*4. Minor — defensive-copy consistency.* {{pdIndexed.lookupData}} is set from
{{{}Arrays.copyOf(lookupData, …){}}}, but the {{COSString}} is built from the
_original_ {{{}lookupData{}}}. If the caller mutates the array afterwards, the
cached field and the COSString diverge. Pass the same (copied) array to both.
*Summary:* the copy-paste null-check is the must-fix; a minimum-length check
(with the spec formula in the message) and {{hival}} range check would turn
confusing downstream crashes into clear contract violations. An exact-equality
length check would be too strict.
> PDIndexed() no-arg constructor, setBaseColorSpace(), and setHighValue() all
> deprecated in 3.x with no usable replacement for incremental construction
> -----------------------------------------------------------------------------------------------------------------------------------------------------
>
> Key: PDFBOX-6192
> URL: https://issues.apache.org/jira/browse/PDFBOX-6192
> Project: PDFBox
> Issue Type: Wish
> Affects Versions: 3.0.7 PDFBox
> Reporter: Stefan Ziegler
> Assignee: Andreas Lehmkühler
> Priority: Major
>
> We are converting PostScript to PDF using PDFBox 3.x. PostScript's Indexed
> color space is defined as:
> {code:java}
> [/Indexed base hival lookup]
> {code}
> where base, hival, and lookup arrive as separate, independently parsed
> PostScript values. We need to construct a PDIndexed object incrementally:
> first set the base color space, then set the high value, and later
> (optionally) supply the lookup table.
> The current 3.x API offers exactly one approach for this:
> {code:java}
> PDIndexed indexed = new PDIndexed(); // @Deprecated
> indexed.setBaseColorSpace(base); // @Deprecated
> indexed.setHighValue(hival); // @Deprecated{code}
> All three of these are now marked {{@Deprecated}} with the note _"This will
> be removed in 4.0. If you need it, please contact us."_ — but {*}no
> replacement is suggested{*}.
> The only non-deprecated constructor is:
> {code:java}
> public PDIndexed(COSArray indexedArray) throws IOException{code}
> This constructor immediately calls {{readColorTable()}} and
> {{initRgbColorTable()}} in its body, which require a fully assembled
> {{COSArray}} including a valid lookup table at slot index 3. This makes it
> {*}impossible to use for incremental construction{*}: we cannot assemble the
> complete {{COSArray}} up front because the base color space, hival, and
> lookup data are parsed from separate PostScript operands at different points
> in the execution.
> Attempting to pass a partially built {{COSArray}} (e.g. with {{COSNull}} as
> the lookup entry, as the deprecated default constructor does internally) to
> the non-deprecated constructor would either throw an {{IOException}} during
> table initialization or produce a broken object.
>
> *Request:*
> Please provide one of the following:
> # A *builder or factory method* such as {{PDIndexed.create(PDColorSpace
> base, int hival, byte[] lookupData)}} that accepts the fully resolved
> components and handles internal construction — allowing callers to avoid
> piecemeal mutation without requiring a pre-built {{{}COSArray{}}}.
> # Or *retain the no-arg constructor and the three setters as non-deprecated*
> (or re-deprecate only after a builder is in place), since there is currently
> no way to construct a {{PDIndexed}} from dynamically parsed components
> without them.
> The deprecation of all mutation entry points simultaneously, without
> providing an incremental-construction alternative, leaves consumers of
> PostScript/PDF generation pipelines with no supported path forward.
> *PDFBox version:* 3.0.7 *Affected class:*
> {{org.apache.pdfbox.pdmodel.graphics.color.PDIndexed}}
--
This message was sent by Atlassian Jira
(v8.20.10#820010)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]