[okular] [Bug 395660] okular cannot preserve annotations in some pdf files.

2018-11-21 Thread Albert Astals Cid
https://bugs.kde.org/show_bug.cgi?id=395660

Albert Astals Cid  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|REPORTED                    |RESOLVED
         Resolution|---                         |FIXED

--- Comment #9 from Albert Astals Cid  ---
I'm going to close this, assuming that
https://gitlab.freedesktop.org/poppler/poppler/issues/139 fixed it.

Tobias, please complain if that isn't correct.

-- 
You are receiving this mail because:
You are watching all bug changes.

[okular] [Bug 395660] okular cannot preserve annotations in some pdf files.

2018-06-27 Thread Tobias Deiminger
https://bugs.kde.org/show_bug.cgi?id=395660

--- Comment #8 from Tobias Deiminger  ---
(In reply to Tobias Deiminger from comment #7)
> guarantees. As this may take a long time, let's better add the visual
> warning as interim solution.
It's probably not that bad, here's a poppler patch:
https://bugs.freedesktop.org/show_bug.cgi?id=107057
It's sufficient to fix the bug, if the approach is valid.

Another related poppler issue would be to support XRef streams and the
discovery of objects inside object streams in XRef::constructXRef. I did some
experiments that partially work, but it's more difficult and I'm not sure
it's worth the effort.


2018-06-27 Thread Nate Graham
https://bugs.kde.org/show_bug.cgi?id=395660

Nate Graham  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |n...@kde.org


2018-06-25 Thread Tobias Deiminger
https://bugs.kde.org/show_bug.cgi?id=395660

--- Comment #7 from Tobias Deiminger  ---
(In reply to Albert Astals Cid from comment #6)
> I think the important question is, does Adobe Reader let you save stuff in
> that broken file?
Yes, Adobe Reader can save annotations in '1_PDFsam_Untitled 1.pdf'. Okular can
view the saved file afterwards. See the details below.

> If so we should try to do the same, and if we can't make
> it happen i guess we'd need some kind of visual warning (we have one in the
> command line when saving fails, but that's hardly enough)
Nothing is impossible :) I'd take it as a learning exercise, open-ended and
with no guarantees. As this may take a long time, we'd better add the visual
warning as an interim solution. Or are there some experienced poppler people
out there who would like to join?

Some details.

On full rewrite ("Save As..."), Adobe Reader created a new XRef stream for
objects 0..13. So there was an object 0 after save.

On incremental update ("Save"), Adobe Reader instead added a new XRef stream
with /Index[2 2 6 1 18 11] to the end of the file.
The original XRef stream with /Index [1 17] was preserved. In that case there
was still no object 0 after save.
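
To make the /Index values above easier to read, here is a minimal Python
sketch that expands an /Index array into the object numbers it covers. The
helper name expand_index is made up for illustration; it only assumes that
/Index holds (first object number, count) pairs.

```python
def expand_index(index):
    """Expand a flat /Index array of (first, count) pairs into object numbers."""
    if len(index) % 2:
        raise ValueError("/Index must hold (first, count) pairs")
    numbers = []
    for first, count in zip(index[0::2], index[1::2]):
        numbers.extend(range(first, first + count))
    return numbers


# /Index of the XRef stream appended by the incremental update ("Save")
incremental = expand_index([2, 2, 6, 1, 18, 11])
# /Index of the original XRef stream that was preserved
original = expand_index([1, 17])

print(incremental)
# Neither subsection list covers object 0
print(0 in incremental or 0 in original)
```

Neither array covers object 0, which matches the observation that there was
still no object 0 after the incremental save.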

The content of the full rewrite XRef looked as follows:
$ dd if='1_PDFsam_Untitled 1.pdf' ibs=1 skip=12306 count=52 | ./unpredict_png.py | hexdump -e '4/1 " %02X" "\n"'
 00 00 00 00 # obj 0 free, next free object = 0, use gen 0 if reused
 01 1D FB 00
 01 20 D8 00
 01 2D 8A 00
 01 2E 59 00
 01 2F 3E 00
 02 00 01 00
 02 00 01 01
 02 00 01 02
 02 00 01 03
 02 00 03 00
 02 00 03 01
 02 00 03 02
 02 00 04 00
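
The rows above can be split into fields with a few lines of Python. This is
only a sketch: the field widths (1, 2, 1) are an assumption inferred from the
4-byte rows and from the note on the first row, since the /W array of that
stream is not quoted here, and decode_rows is a made-up helper name.

```python
def decode_rows(data, widths=(1, 2, 1)):
    """Split decoded xref stream data into (type, field 2, field 3) tuples."""
    row_len = sum(widths)
    if len(data) % row_len:
        raise ValueError("data is not a whole number of rows")
    entries = []
    for start in range(0, len(data), row_len):
        fields, pos = [], start
        for width in widths:
            # All fields are big-endian unsigned integers
            fields.append(int.from_bytes(data[pos:pos + width], "big"))
            pos += width
        entries.append(tuple(fields))
    return entries


KIND = {0: "free", 1: "in use, byte offset", 2: "compressed, object stream nr."}

# Four of the rows from the dump above
sample = bytes.fromhex("00 00 00 00 01 1D FB 00 02 00 01 00 02 00 03 02")
for kind, second, third in decode_rows(sample):
    print(KIND[kind], hex(second), third)
```

For type 1 rows the last field is the generation number; for type 2 rows it
is the index inside the object stream.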

Adobe saves the stream with /DecodeParms<>
/Filter/FlateDecode.
So to analyze it, one has to inflate the data and undo the PNG prediction
first. I used this quick and dirty Python script:

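Listing unpredict_png.py

#!/usr/bin/python3
# Inflate an XRef stream read from stdin and undo PNG "Up" prediction.
# Assumes 4 data bytes per row, each preceded by one filter-type byte.
import sys
import zlib

predicted = zlib.decompress(sys.stdin.buffer.read())
# Drop the leading filter-type byte of every 5-byte row.
rows = [predicted[i + 1:i + 5] for i in range(0, len(predicted), 5)]
prev = bytearray(4)  # the row above the first row counts as all zeros
for row in rows:
    for i, byte in enumerate(row):
        # Up filter: decoded byte = encoded byte + byte above, modulo 256
        prev[i] = (byte + prev[i]) & 0xFF
    sys.stdout.buffer.write(prev)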
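
To try the script without a real PDF at hand, the inverse operation can be
sketched as well. The function names below are made up, and the sketch assumes
what the script assumes: 4 data bytes per row and the PNG "Up" filter (type 2)
on every row.

```python
import zlib


def predict_up(raw, columns=4):
    """Apply the PNG Up filter (type 2) to each row, then deflate."""
    if len(raw) % columns:
        raise ValueError("data length must be a multiple of the row width")
    out = bytearray()
    prev = bytes(columns)  # the row above the first row counts as zeros
    for start in range(0, len(raw), columns):
        row = raw[start:start + columns]
        out.append(2)  # filter-type byte: 2 = Up
        out.extend((cur - up) & 0xFF for cur, up in zip(row, prev))
        prev = row
    return zlib.compress(bytes(out))


def unpredict_up(deflated, columns=4):
    """Same steps as unpredict_png.py, wrapped in a function."""
    predicted = zlib.decompress(deflated)
    out = bytearray()
    prev = bytearray(columns)
    step = columns + 1  # one filter-type byte plus the data bytes
    for start in range(0, len(predicted), step):
        row = predicted[start + 1:start + step]
        for i, byte in enumerate(row):
            prev[i] = (byte + prev[i]) & 0xFF
        out.extend(prev)
    return bytes(out)


# First three rows of the dump shown earlier, used as a round-trip check
sample = bytes.fromhex("00 00 00 00 01 1D FB 00 01 20 D8 00")
print(unpredict_up(predict_up(sample)) == sample)
```

Piping the output of predict_up into unpredict_png.py should give back the
original rows.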


2018-06-24 Thread Albert Astals Cid
https://bugs.kde.org/show_bug.cgi?id=395660

--- Comment #6 from Albert Astals Cid  ---
I think the important question is: does Adobe Reader let you save stuff in that
broken file? If so, we should try to do the same, and if we can't make it happen
I guess we'd need some kind of visual warning (we have one on the command line
when saving fails, but that's hardly enough).


2018-06-24 Thread Tobias Deiminger
https://bugs.kde.org/show_bug.cgi?id=395660

--- Comment #5 from Tobias Deiminger  ---
(In reply to Tobias Deiminger from comment #4)
> - First object number > 0 doesn't indicate a damaged file, but it's valid
> (am unsure about this)
After investigating a bit more, now I think not having an object 0 is invalid.
This would mean '1_PDFsam_Untitled 2.pdf' is invalid, and poppler is NOT to
blame (maybe poppler could provide a workaround, though).

Section 7.5.4 of the standard is explicit that an old-fashioned XRef table
needs a special object 0:
"The first entry in the table (object number 0) shall always be free and shall
have a generation number of 65,535; it is shall be the head of the linked list
of free objects."

Now '1_PDFsam_Untitled 2.pdf' has no XRef table but an XRef stream, and it
seems a bit ambiguous whether the above statement about object 0 applies to
XRef streams too. This needs to be clarified before we can actually blame
either poppler or pdfsam. Maybe ask on the Adobe forum or the poppler list?

The XRef stream in '1_PDFsam_Untitled 2.pdf' looks like this (the
/Filter /FlateDecode encoding has to be undone first):

$ dd if=1_PDFsam_Untitled\ 2.pdf ibs=1 skip=5841 count=64 | python3 -c 'import sys, zlib; sys.stdout.buffer.write(zlib.decompress(sys.stdin.buffer.read()))' | hexdump -e '5/1 " %02X" "\n"'
 01 13 73 00 00 # Object 1. Type 1 (used, not compressed), object offset = 0x1373, generation 0
 01 00 0F 00 00 # Object 2. Type 1 (used, not compressed), object offset = 0xf, generation 0
 02 00 01 00 00 # Object 3. Type 2 (compressed), stored in object nr.1, index in object stream 0
 02 00 01 00 01 # Object 4. Type 2 (compressed), stored in object nr.1, index in object stream 1
 02 00 01 00 02 # Object 5. Type 2 (compressed), stored in object nr.1, index in object stream 2
 02 00 01 00 03 # Object 6. Type 2 (compressed), stored in object nr.1, index in object stream 3
 02 00 01 00 04 # Object 7. Type 2 (compressed), stored in object nr.1, index in object stream 4
 02 00 01 00 05 # Object 8. Type 2 (compressed), stored in object nr.1, index in object stream 5
 02 00 01 00 06 # Object 9. Type 2 (compressed), stored in object nr.1, index in object stream 6
 01 00 6F 00 00 # Object 10. Type 1 (used, not compressed), object offset = 0x6f, generation 0
 02 00 01 00 07 # Object 11. Type 2 (compressed), stored in object nr.1, index in object stream 7
 02 00 01 00 08 # Object 12. Type 2 (compressed), stored in object nr.1, index in object stream 8
 02 00 01 00 09 # Object 13. Type 2 (compressed), stored in object nr.1, index in object stream 9
 01 01 0D 00 00 # Object 14. Type 1 (used, not compressed), object offset = 0x10d, generation 0
 01 02 3E 00 00 # Object 15. Type 1 (used, not compressed), object offset = 0x23e, generation 0
 01 15 ED 00 00 # Object 16. Type 1 (used, not compressed), object offset = 0x15ed, generation 0
 01 16 01 00 00 # Object 17. Type 1 (used, not compressed), object offset = 0x1601, generation 0

You see, there is no special object 0 here. It would look something like this:
 00 00 00 FF FF # Object 0. Type 0 (member of linked list of free objects), generation nr. 65535
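
The 5-byte layout used above can be reproduced with Python's struct module.
This is a sketch only: the field widths 1, 2 and 2 are read off the rows and
their annotations (the /W array itself is not quoted here), and the helper
names are made up.

```python
import struct

# One type byte followed by two big-endian 16-bit fields
ROW = struct.Struct(">BHH")


def pack_entry(kind, second, third):
    """Encode one xref stream entry with field widths 1, 2 and 2."""
    return ROW.pack(kind, second, third)


def as_hex(raw):
    """Format bytes the way the hexdump output above does."""
    return " ".join(f"{byte:02X}" for byte in raw)


# The free-list head that object 0 would need: type 0, next free 0, gen 65535
free_head = pack_entry(0, 0, 65535)
print(as_hex(free_head))

# Decoding the row for object 1 and the row for object 8 from the dump
print(ROW.unpack(bytes.fromhex("01 13 73 00 00")))
print(ROW.unpack(bytes.fromhex("02 00 01 00 05")))
```

The packed free entry comes out as the row shown above for object 0.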


2018-06-21 Thread Tobias Deiminger
https://bugs.kde.org/show_bug.cgi?id=395660

Tobias Deiminger  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |haxti...@posteo.de

--- Comment #4 from Tobias Deiminger  ---
(In reply to Albert Astals Cid from comment #3)
> It does fail on current version, would need someone to investigate why,
> probably a poppler issue
I reproduced the error with a standalone poppler application, to rule out
errors in Okular. Poppler immediately gave first hints about what's wrong:
"Error: Couldn't find trailer dictionary"
"Error: Invalid XRef entry"

Looking a bit deeper, two characteristics of 'Untitled 1.pdf' make poppler
fail:
- The document has an "XRef stream" instead of an "XRef table". XRef streams
are available since PDF 1.5 and legitimately have no "trailer" keyword.
- The first object in the XRef stream is 1 (see "17 0 obj <<... /Index [1 17]
...>>") instead of 0.
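
To check which of the two a given file uses, one can look at what the last
"startxref" offset points to. The following Python sketch is not how poppler
does it; xref_kind and fake_doc are made-up names, and the two test documents
are bare skeletons rather than valid PDFs.

```python
import re


def xref_kind(pdf_bytes):
    """Return 'table' or 'stream' depending on what startxref points at."""
    match = None
    for match in re.finditer(rb"startxref\s+(\d+)", pdf_bytes):
        pass  # keep the last occurrence, it belongs to the newest update
    if match is None:
        raise ValueError("no startxref found")
    offset = int(match.group(1))
    target = pdf_bytes[offset:offset + 32]
    if target.startswith(b"xref"):
        return "table"  # classic table, followed by the "trailer" keyword
    if re.match(rb"\d+\s+\d+\s+obj\b", target):
        return "stream"  # an indirect object, i.e. a PDF 1.5 XRef stream
    raise ValueError("startxref does not point at an xref section")


def fake_doc(header, xref_section):
    """Build a skeleton document whose startxref points right after header."""
    offset = str(len(header)).encode()
    return header + xref_section + b"\nstartxref\n" + offset + b"\n%%EOF\n"


table_doc = fake_doc(
    b"%PDF-1.4\n",
    b"xref\n0 1\n0000000000 65535 f \ntrailer\n<< /Size 1 >>",
)
stream_doc = fake_doc(
    b"%PDF-1.5\n",
    b"17 0 obj\n<< /Type /XRef /Index [1 17] >>\nstream\nendstream\nendobj",
)
print(xref_kind(table_doc), xref_kind(stream_doc))
```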

The start-at-1 thing causes XRef::entries[0].type = xrefEntryNone (see
initialization in XRef::resize).

Then, upon document save, PDFDoc::saveIncrementalUpdate iterates over entries
ranging from 0 to (getNumObjects-1). Accessing entries[0] where type ==
xrefEntryNone makes poppler think this is a damaged file, so it tries to
reconstruct the xref table with XRef::constructXRef. Now XRef::constructXRef
wants a "trailer" keyword, but there is none in the file (that's not an error,
because we've got a PDF 1.5 XRef stream). XRef::constructXRef can't work
without it and bails out with an error.

I believe there are two things to fix in poppler:
- XRef::constructXRef should support PDF 1.5 XRef streams without trailer
dictionary.
- First object number > 0 doesn't indicate a damaged file, but it's valid (am
unsure about this). No need to reconstruct XRef at all. Actually, everything
works fine if I trick poppler to start iteration at 1 in saveIncrementalUpdate.
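
The failure mode and the start-at-1 trick can be illustrated with a toy model
in Python. This is not poppler code: the function names and the string
constants are made up, and the model only mirrors the behaviour described in
this comment.

```python
NONE, IN_USE = "none", "in_use"  # stand-ins for poppler's entry types


def build_entries(first, count):
    """Mimic an entry table whose slots stay NONE unless the xref sets them."""
    entries = [NONE] * (first + count)
    for num in range(first, first + count):
        entries[num] = IN_USE
    return entries


def save_incremental(entries, start=0):
    """Return 'saved', or 'reconstruct' as soon as an unset entry is visited."""
    for num in range(start, len(entries)):
        if entries[num] == NONE:
            # Here the real code falls back to rebuilding the xref, which
            # then fails because the file has no "trailer" keyword
            return "reconstruct"
    return "saved"


entries = build_entries(first=1, count=17)  # /Index [1 17]
print(save_incremental(entries))            # the loop visits entries[0]
print(save_incremental(entries, start=1))   # the start-at-1 trick
```

In this model the first call reports "reconstruct" and the second one
"saved", which mirrors the observation that starting the iteration at 1
avoids the problem.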

There's no problem with the second document 'Untitled 2.pdf', because it uses
XRef table with trailer dictionary and has objects 0..22.

Albert, does this sound reasonable? This was my first time playing with XRef,
so the observations may be somewhat wrong. Anyway, we should open a bug at
poppler.


2018-06-21 Thread Albert Astals Cid
https://bugs.kde.org/show_bug.cgi?id=395660

Albert Astals Cid  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |aa...@kde.org

--- Comment #3 from Albert Astals Cid  ---
It does fail on current version, would need someone to investigate why,
probably a poppler issue


2018-06-21 Thread Oliver Sander
https://bugs.kde.org/show_bug.cgi?id=395660

Oliver Sander  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |oliver.san...@tu-dresden.de

--- Comment #2 from Oliver Sander  ---
Hi Iuri,

okular 0.24.2 is quite old.  There are reasons to believe that new versions
handle these annotations better.  Could you please try that first?

Thanks,
Oliver


2018-06-20 Thread iuri soter viana segtovich
https://bugs.kde.org/show_bug.cgi?id=395660

--- Comment #1 from iuri soter viana segtovich  ---
Created attachment 113468
  --> https://bugs.kde.org/attachment.cgi?id=113468&action=edit
an original PDF created in LibreOffice and two pages split off using PDFsam

Uploading a "good" and a "bad" PDF file.
Okular can annotate the original file (Untitled 2.pdf) and preserves the
annotations upon "save as" or "export to archive".
Okular can also annotate the files processed with PDFsam, but those
annotations are not preserved upon "save as" or "export to archive".
