Maruan,

>>> From how I understand the rendering in PDF Form, Text, Image and Pattern 
>>> maintain their own matrix to map to user space which is then transformed by 
>>> the CTM to device space so handling them specifically is fine and inline 
>>> with the spec.
>> 
>> No, that’s not right, what I said was:
>> 
>>>> My problem is that tiling patterns are defined in their parent stream’s 
>>>> initial coordinate space, rather than the
>>>> coordinate space defined by the CTM.
>> 
>> So patterns should *not* be using the CTM, which is what I’m trying to 
>> achieve.
>> 
> 
> I think you misunderstood what I wrote - patterns have their own matrix - so 
> I think we are on the same page here. IMHO according to the spec CTM 
> transforms from user space to device space. So it’s pattern space -> user 
> space -> device space.

Nope, as I said, that’s what PDFBox currently does and it’s wrong. As you say 
the CTM transforms from user space to device space, but it’s not the only way 
to do so, and it is not used by patterns.

> Didn’t mean to only reference to the spec but to use the same terms as 
> described by the spec. Adding references to the spec is an add-on not a 
> replacement.

I don’t see what value this adds, given that the references will just go 
out-of-date when the next spec is released. We already use the same terminology 
as the PDF spec, so Ctrl+F can be used for quick look-ups that won’t go 
out-of-date.

>> This isn’t possible, as I said it "will necessarily be a breaking change”. 
>> This is because in 2.0 PDFStreamEngine needs to know the parent of each 
>> stream, but processStream and processSubStream do not provide this 
>> information. That’s why I’m discussing this on the mailing list.
> 
> I don’t understand why this is shouldn’t be possible. It’s more effort, 
> agreed, but beneficial.


What’s not to understand? PDFStreamEngine *needs* to know the parent of each 
stream, and the old methods don’t provide this, passing a null parent will not 
work because we need that information later in order to correctly process the 
stream. If we allowed a null parent to be passed, the result would be silently 
broken rendering - there’s no value in providing a backwards-compatible API if 
it can only produce broken results.

-- John

On 19 Mar 2014, at 10:31, Maruan Sahyoun <sahy...@fileaffairs.de> wrote:

> John,
> 
> Am 19.03.2014 um 18:15 schrieb John Hewson <j...@jahewson.com>:
> 
>> Maruan
>> 
>>> From how I understand the rendering in PDF Form, Text, Image and Pattern 
>>> maintain their own matrix to map to user space which is then transformed by 
>>> the CTM to device space so handling them specifically is fine and inline 
>>> with the spec.
>> 
>> No, that’s not right, what I said was:
>> 
>>>> My problem is that tiling patterns are defined in their parent stream’s 
>>>> initial coordinate space, rather than the
>>>> coordinate space defined by the CTM.
>> 
>> So patterns should *not* be using the CTM, which is what I’m trying to 
>> achieve.
>> 
> 
> I think you misunderstood what I wrote - patterns have their own matrix - so 
> I think we are on the same page here. IMHO according to the spec CTM 
> transforms from user space to device space. So it’s pattern space -> user 
> space -> device space.
> 
> 
>>> I’d suggest that we make sure that the different ‚spaces‘ are defined 
>>> properly within the code and refer to the PDF spec so that the code is 
>>> easier to read if this is not already the case. With so many changes it’s a 
>>> good opportunity to enhance the documentation within the source code. Some 
>>> of the old code enjoys very little documentation.
>> 
>> 
>> I disagree, in general I don’t think that references to the PDF spec are a 
>> good form of documentation (there are some exceptions). References to the 
>> spec are meaningless to the reader unless they take the time to look them up 
>> in a 700 page PDF document. I would argue that by just linking back to the 
>> spec, we have *failed* to document PDFBox, not succeeded.
>> 
>> References to the PDF spec have another major flaw: they go out-of-date. For 
>> example a Pattern Colour Space will always be called “Pattern Colour Space” 
>> in future versions of the PDF spec but it may not be described in paragraph 
>> 8.6.6.2 or on page 156. The existing code contains many references to the 
>> PDF 1.6 and 1.7 specs as well as the ISO PDF32000 spec, which means that I 
>> need three 700 page PDF files open at all times in order to look up PDFBox 
>> references. With the new version of the PDF spec due this year, this 
>> situation is going to get worse.
>> 
> 
> Didn’t mean to only reference to the spec but to use the same terms as 
> described by the spec. Adding references to the spec is an add-on not a 
> replacement.
> 
>> I agree that some of the existing code needs more documentation, and I often 
>> add documentation to old files which I’m working on. However, my approach is 
>> to just paste in a sentence or two from the PDF spec (fair use). That way 
>> the reader does not ever need to look at the PDF spec. Because we use the 
>> same terminology in PDFBox as in the spec, if someone really wants to look 
>> something up, it’s as simple as Ctrl+F, no reference needed, and it’s 
>> guaranteed not to go out-of-date.
>> 
>>> I wouldn’t remove processStream and processSubStream but deprecate them and 
>>> remove them in the next major release though as to keep the changes to a 
>>> minimum.
>> 
>> This isn’t possible, as I said it "will necessarily be a breaking change”. 
>> This is because in 2.0 PDFStreamEngine needs to know the parent of each 
>> stream, but processStream and processSubStream do not provide this 
>> information. That’s why I’m discussing this on the mailing list.
> 
> I don’t understand why this is shouldn’t be possible. It’s more effort, 
> agreed, but beneficial.
> 
>> 
>>> For the rendering what might have been missed is taking the UserUnit entry 
>>> in the page dictionary into account which might change the default user 
>>> space. This was introduced in PDF 1.6. A good opportunity to read that 
>>> entry and make sure that we handle it appropriately.
>> 
>> Yes, I have this as a “todo” in my working copy, however, if we put the 
>> UserUnit in the matrix then we should also put the page Rotation into the 
>> matrix, but that’a a significant change.
>> 
>> -- John

Reply via email to