Maruan

With the exception of the documentation issue, these are not subjective matters; you can’t disagree with an objective truth. Either falsify my claims or concede that I am correct - we need to reach a technical resolution on this.
-- John

On 19 Mar 2014, at 13:48, Maruan Sahyoun <[email protected]> wrote:

> John
>
> On 19.03.2014 at 19:10, John Hewson <[email protected]> wrote:
>
>> Maruan,
>>
>>>>> From how I understand the rendering in PDF Form, Text, Image and Pattern
>>>>> maintain their own matrix to map to user space which is then transformed
>>>>> by the CTM to device space so handling them specifically is fine and
>>>>> inline with the spec.
>>>>
>>>> No, that’s not right, what I said was:
>>>>
>>>>>> My problem is that tiling patterns are defined in their parent stream’s
>>>>>> initial coordinate space, rather than the
>>>>>> coordinate space defined by the CTM.
>>>>
>>>> So patterns should *not* be using the CTM, which is what I’m trying to
>>>> achieve.
>>>>
>>>
>>> I think you misunderstood what I wrote - patterns have their own matrix -
>>> so I think we are on the same page here. IMHO according to the spec CTM
>>> transforms from user space to device space. So it’s pattern space -> user
>>> space -> device space.
>>
>> Nope, as I said, that’s what PDFBox currently does and it’s wrong. As you
>> say the CTM transforms from user space to device space, but it’s not the
>> only way to do so, and it is not used by patterns.
>
> As the processing is defined in the spec this is a good reference so no need
> to discuss that further. Of course different people might come to different
> conclusions by reading and interpreting the spec.
>
>>
>>> Didn’t mean to only reference to the spec but to use the same terms as
>>> described by the spec. Adding references to the spec is an add-on not a
>>> replacement.
>>
>> I don’t see what value this adds, given that the references will just go
>> out-of-date when the next spec is released. We already use the same
>> terminology as the PDF spec, so Ctrl+F can be used for quick look-ups that
>> won’t go out-of-date.
>
> You are not enforced to add the information.
>
>>
>>>> This isn’t possible, as I said it “will necessarily be a breaking change”.
>>>> This is because in 2.0 PDFStreamEngine needs to know the parent of each
>>>> stream, but processStream and processSubStream do not provide this
>>>> information. That’s why I’m discussing this on the mailing list.
>>>
>>> I don’t understand why this shouldn’t be possible. It’s more effort,
>>> agreed, but beneficial.
>>
>>
>> What’s not to understand? PDFStreamEngine *needs* to know the parent of each
>> stream, and the old methods don’t provide this, passing a null parent will
>> not work because we need that information later in order to correctly
>> process the stream. If we allowed a null parent to be passed, the result
>> would be silently broken rendering - there’s no value in providing a
>> backwards-compatible API if it can only produce broken results.
>
> Won’t get to the same conclusion here (as I think we won’t get on the other
> topics above).
>
>>
>> -- John
>>
>> On 19 Mar 2014, at 10:31, Maruan Sahyoun <[email protected]> wrote:
>>
>>> John,
>>>
>>> On 19.03.2014 at 18:15, John Hewson <[email protected]> wrote:
>>>
>>>> Maruan
>>>>
>>>>> From how I understand the rendering in PDF Form, Text, Image and Pattern
>>>>> maintain their own matrix to map to user space which is then transformed
>>>>> by the CTM to device space so handling them specifically is fine and
>>>>> inline with the spec.
>>>>
>>>> No, that’s not right, what I said was:
>>>>
>>>>>> My problem is that tiling patterns are defined in their parent stream’s
>>>>>> initial coordinate space, rather than the
>>>>>> coordinate space defined by the CTM.
>>>>
>>>> So patterns should *not* be using the CTM, which is what I’m trying to
>>>> achieve.
>>>>
>>>
>>> I think you misunderstood what I wrote - patterns have their own matrix -
>>> so I think we are on the same page here. IMHO according to the spec CTM
>>> transforms from user space to device space. So it’s pattern space -> user
>>> space -> device space.
>>>
>>>
>>>>> I’d suggest that we make sure that the different ‚spaces‘ are defined
>>>>> properly within the code and refer to the PDF spec so that the code is
>>>>> easier to read if this is not already the case. With so many changes it’s
>>>>> a good opportunity to enhance the documentation within the source code.
>>>>> Some of the old code enjoys very little documentation.
>>>>
>>>>
>>>> I disagree, in general I don’t think that references to the PDF spec are a
>>>> good form of documentation (there are some exceptions). References to the
>>>> spec are meaningless to the reader unless they take the time to look them
>>>> up in a 700 page PDF document. I would argue that by just linking back to
>>>> the spec, we have *failed* to document PDFBox, not succeeded.
>>>>
>>>> References to the PDF spec have another major flaw: they go out-of-date.
>>>> For example a Pattern Colour Space will always be called “Pattern Colour
>>>> Space” in future versions of the PDF spec but it may not be described in
>>>> paragraph 8.6.6.2 or on page 156. The existing code contains many
>>>> references to the PDF 1.6 and 1.7 specs as well as the ISO PDF32000 spec,
>>>> which means that I need three 700 page PDF files open at all times in
>>>> order to look up PDFBox references. With the new version of the PDF spec
>>>> due this year, this situation is going to get worse.
>>>>
>>>
>>> Didn’t mean to only reference to the spec but to use the same terms as
>>> described by the spec. Adding references to the spec is an add-on not a
>>> replacement.
>>>
>>>> I agree that some of the existing code needs more documentation, and I
>>>> often add documentation to old files which I’m working on. However, my
>>>> approach is to just paste in a sentence or two from the PDF spec (fair
>>>> use). That way the reader does not ever need to look at the PDF spec.
>>>> Because we use the same terminology in PDFBox as in the spec, if someone
>>>> really wants to look something up, it’s as simple as Ctrl+F, no reference
>>>> needed, and it’s guaranteed not to go out-of-date.
>>>>
>>>>> I wouldn’t remove processStream and processSubStream but deprecate them
>>>>> and remove them in the next major release though as to keep the changes
>>>>> to a minimum.
>>>>
>>>> This isn’t possible, as I said it “will necessarily be a breaking change”.
>>>> This is because in 2.0 PDFStreamEngine needs to know the parent of each
>>>> stream, but processStream and processSubStream do not provide this
>>>> information. That’s why I’m discussing this on the mailing list.
>>>
>>> I don’t understand why this shouldn’t be possible. It’s more effort,
>>> agreed, but beneficial.
>>>
>>>>
>>>>> For the rendering what might have been missed is taking the UserUnit
>>>>> entry in the page dictionary into account which might change the default
>>>>> user space. This was introduced in PDF 1.6. A good opportunity to read
>>>>> that entry and make sure that we handle it appropriately.
>>>>
>>>> Yes, I have this as a “todo” in my working copy, however, if we put the
>>>> UserUnit in the matrix then we should also put the page Rotation into the
>>>> matrix, but that’s a significant change.
>>>>
>>>> -- John
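
For readers following the coordinate-space disagreement in this thread, here is a minimal sketch of the two mappings being contrasted. It is not PDFBox code: the class name PatternSpaceSketch, the helper toDevice, and all matrix values are hypothetical and chosen only for illustration. It assumes the parent stream's initial coordinate space is the identity and that the CTM was later scaled by 2. The first call composes the pattern matrix with the parent stream's initial space (the behaviour John argues for), and the second composes it with the current CTM (what the thread says PDFBox did at the time).

```java
import java.awt.geom.AffineTransform;
import java.awt.geom.Point2D;

public class PatternSpaceSketch {

    /** Maps a pattern-space point to device space: pattern matrix first, then the given base matrix. */
    public static Point2D toDevice(AffineTransform base, AffineTransform patternMatrix, Point2D p) {
        AffineTransform t = new AffineTransform(base); // copy so the caller's matrix is not modified
        t.concatenate(patternMatrix);                  // t(p) = base(patternMatrix(p))
        return t.transform(p, null);
    }

    public static void main(String[] args) {
        // Hypothetical values, for illustration only.
        AffineTransform initial = new AffineTransform();                     // parent stream's initial space (identity here)
        AffineTransform currentCtm = AffineTransform.getScaleInstance(2, 2); // CTM after a later "2 0 0 2 0 0 cm"
        AffineTransform patternMatrix = AffineTransform.getTranslateInstance(10, 0);
        Point2D p = new Point2D.Double(1, 1);

        System.out.println("via initial space: " + toDevice(initial, patternMatrix, p));
        System.out.println("via current CTM:   " + toDevice(currentCtm, patternMatrix, p));
    }
}
```

With these values the same pattern-space point (1, 1) lands at (11, 1) through the initial space and at (22, 2) through the current CTM, so the two interpretations render the pattern at a different position and scale whenever the CTM has been changed since the start of the parent stream.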
