[jira] [Commented] (PDFBOX-4186) Add quality option for compressed images to pdfbox-app

2018-04-08 Thread Martin Hausner (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-4186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16429831#comment-16429831
 ] 

Martin Hausner commented on PDFBOX-4186:


Thanks!
Thank you for the superfast improvement :)

> Add quality option for compressed images to pdfbox-app
> --
>
> Key: PDFBOX-4186
> URL: https://issues.apache.org/jira/browse/PDFBOX-4186
> Project: PDFBox
>  Issue Type: Improvement
>  Components: Utilities
>Affects Versions: 2.0.9, 3.0.0 PDFBox
>Reporter: Martin Hausner
>Assignee: Tilman Hausherr
>Priority: Major
> Fix For: 2.0.10, 3.0.0 PDFBox
>
> Attachments: pdfbox-tool.patch
>
>   Original Estimate: 1h
>  Remaining Estimate: 1h
>
> Add a command-line *quality* option for compressed images to pdfbox-app,
> e.g. -quality 0.75
>  see [^pdfbox-tool.patch]
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org
For additional commands, e-mail: dev-h...@pdfbox.apache.org



[jira] [Commented] (PDFBOX-4184) [PATCH]: Support simple lossless compression of 16 bit RGB images

2018-04-08 Thread Emmeran Seehuber (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-4184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16429822#comment-16429822
 ] 

Emmeran Seehuber commented on PDFBOX-4184:
--

Oh yes, you are right. And I totally overlooked that the getRGB() used always 
converts into sRGB ...
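
The precision side of this can be illustrated with a minimal sketch using only java.awt classes. The class name GetRgbPrecisionDemo and the helper sixteenBitPixel are made up for this illustration. The image here already uses an sRGB color space, so the sketch only shows the packing into 8 bits per component; the color conversion that happens for other ICC profiles is not demonstrated.

```java
import java.awt.Transparency;
import java.awt.color.ColorSpace;
import java.awt.image.BufferedImage;
import java.awt.image.ComponentColorModel;
import java.awt.image.DataBuffer;
import java.awt.image.WritableRaster;

class GetRgbPrecisionDemo {
    /** Builds a 1x1 image with 16 bits per component (TYPE_CUSTOM, USHORT data). */
    static BufferedImage sixteenBitPixel(int r, int g, int b) {
        ColorSpace cs = ColorSpace.getInstance(ColorSpace.CS_sRGB);
        ComponentColorModel cm = new ComponentColorModel(
                cs, false, false, Transparency.OPAQUE, DataBuffer.TYPE_USHORT);
        WritableRaster raster = cm.createCompatibleWritableRaster(1, 1);
        raster.setPixel(0, 0, new int[] { r, g, b });
        return new BufferedImage(cm, raster, false, null);
    }

    public static void main(String[] args) {
        BufferedImage img = sixteenBitPixel(0x1234, 0x8000, 0xFFFF);
        // The raster still holds the full 16-bit samples.
        int[] raw = img.getRaster().getPixel(0, 0, (int[]) null);
        // getRGB() packs the pixel into 8-bit ARGB in the default sRGB color space.
        int argb = img.getRGB(0, 0);
        System.out.printf("raster: %04x %04x %04x%n", raw[0], raw[1], raw[2]);
        System.out.printf("getRGB: %08x%n", argb);
    }
}
```

Reading the raster directly keeps all 16 bits, which is why an encoder that wants to preserve them cannot go through getRGB().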

I already do colorspace tagging in 
[https://github.com/rototor/pdfbox-graphics2d/blob/master/src/main/java/de/rototor/pdfbox/graphics2d/PdfBoxGraphics2DLosslessImageEncoder.java]
 
{code:java}
/*
 * Do we have a color profile we need to embed?
 */
if (bi.getColorModel().getColorSpace() instanceof ICC_ColorSpace) {
    ICC_Profile profile = ((ICC_ColorSpace) bi.getColorModel().getColorSpace()).getProfile();
    /*
     * Only tag a profile if it is not the default sRGB profile.
     */
    if (((ICC_ColorSpace) bi.getColorModel().getColorSpace()).getProfile() != ICC_Profile
            .getInstance(ColorSpace.CS_sRGB)) {
        SoftReference<PDICCBased> pdProfileRef = profileMap.get(new ProfileSoftReference(profile));

        PDICCBased pdProfile = pdProfileRef == null ? null : pdProfileRef.get();
        if (pdProfile == null) {
            pdProfile = new PDICCBased(document);
            OutputStream outputStream = pdProfile.getPDStream()
                    .createOutputStream(COSName.FLATE_DECODE);
            outputStream.write(profile.getData());
            outputStream.close();
            pdProfile.getPDStream().getCOSObject().setInt(COSName.N, profile.getNumComponents());
            profileMap.put(new ProfileSoftReference(profile), new SoftReference<PDICCBased>(pdProfile));
        }
        imageXObject.setColorSpace(pdProfile);
    }
}
{code}
which is of course stupid if the color always gets converted to sRGB. It's not 
only stupid but also wrong, because it causes color shifts ... argh

So at the moment PDFBox is not usable for any "real" prepress stuff, as the 
sRGB colorspace is way too small. (At the moment I still use iText 2.1 for my 
prepress stuff, but I want to get rid of it in the long term.)

sRGB as used at the moment in the LosslessFactory is fine for web / display-only 
PDFs. But for prepress not so much. Hmm, I should really try to find 
some time to implement an "ImageEncoderFactory" and implement all the different 
encodings correctly (which are mostly 8-bit and 16-bit images; everything with 
less bit depth is likely fine with getRGB() as now - and of course not only 
encode RGB but also CMYK...). (No, I won't use any code from iText; they 
have tons of special hacks to e.g. reuse already encoded PNG data etc., which I 
think is not worth the effort and way too complex / too much code.)

I have a factory with an API like this in mind: (everything with method 
chaining)
{code:java}
ImageEncoder myEncoder = ImageEncoderFactory.newBuilder(pdDocument)

    // Lossy / JPEG quality 0.9
    .jpeg(0.9)

    // or lossless
    .lossless()
    // Lossless compression the fast way, with a not so great compression
    // ratio, like at the moment
    .fastCompression()
    // Lossless compression the slow way, with maximum possible compression
    // ratio (using predictors etc.)
    .slowCompression()
    // Set conversion to sRGB 8-bit. Default would be to always use the
    // color space / ICC profile of the image.
    .toSRGB()

    // and finally
    .build();

PDImage pdImg = myEncoder.encode(img);
PDImage pdImg2 = myEncoder.encode(img2);
// ... reuse myEncoder as much as possible, but not multithreaded
{code}
What do you think?

> [PATCH]: Support simple lossless compression of 16 bit RGB images
> -
>
> Key: PDFBOX-4184
> URL: https://issues.apache.org/jira/browse/PDFBOX-4184
> Project: PDFBox
>  Issue Type: Improvement
>  Components: Writing
>Affects Versions: 2.0.9
>Reporter: Emmeran Seehuber
>Priority: Minor
> Fix For: 2.0.10, 3.0.0 PDFBox
>
> Attachments: pdfbox_support_16bit_image_write.patch,

[jira] [Commented] (PDFBOX-4184) [PATCH]: Support simple lossless compression of 16 bit RGB images

2018-04-08 Thread Tilman Hausherr (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-4184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16429719#comment-16429719
 ] 

Tilman Hausherr commented on PDFBOX-4184:
-

I wonder if the patch code is correct - it takes the raster values directly 
without doing any conversions for ICC colorspaces.

> [PATCH]: Support simple lossless compression of 16 bit RGB images
> -
>
> Key: PDFBOX-4184
> URL: https://issues.apache.org/jira/browse/PDFBOX-4184
> Project: PDFBox
>  Issue Type: Improvement
>  Components: Writing
>Affects Versions: 2.0.9
>Reporter: Emmeran Seehuber
>Priority: Minor
> Fix For: 2.0.10, 3.0.0 PDFBox
>
> Attachments: pdfbox_support_16bit_image_write.patch, 
> png16-arrow-bad-no-smask.pdf, png16-arrow-bad.pdf, 
> png16-arrow-good-no-mask.pdf, png16-arrow-good.pdf
>
>
> The attached patch add support to write 16 bit per component images 
> correctly. I've integrated a test for this here: 
> [https://github.com/rototor/pdfbox-graphics2d/commit/8bf089cb74945bd4f0f15054754f51dd5b361fe9]
> It only supports 16-Bit TYPE_CUSTOM with DataType == USHORT images - but this 
> is what you usually get when you read a 16 bit PNG file.
> This would also fix [https://github.com/danfickle/openhtmltopdf/issues/173].
> The patch is against 2.0.9, but should apply to 3.0.0 too.
> There is still some room for improvements when writing lossless images, as 
> the images are currently not efficiently encoded. I.e. you could use PNG 
> encodings to get a better compression. (By adding a COSName.DECODE_PARMS with 
> a COSName.PREDICTOR == 15 and encoding the images as PNG). But this is 
> something for a later patch. It would also need another API, as there is a 
> tradeoff speed vs compression ratio. 
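
As a rough illustration of what "encoding the images as PNG" involves at the row level, here is a minimal sketch of the PNG "Sub" filter in plain Java. It is an assumption-laden sketch, not PDFBox code: the class and method names are made up, and only one of the PNG filter types is shown. With a PNG predictor in the decode parameters, each row is prefixed with a filter-type byte and the decoder reverses the filter per row; an encoder is free to pick a filter for every row.

```java
class PngSubFilterDemo {
    /** PNG filter type 1: each byte is stored as the difference to the byte one pixel to the left. */
    static final int FILTER_SUB = 1;

    /** Returns the filter-type byte followed by the Sub-filtered row. */
    static byte[] filterRow(byte[] row, int bytesPerPixel) {
        byte[] out = new byte[row.length + 1];
        out[0] = FILTER_SUB;
        for (int i = 0; i < row.length; i++) {
            int left = i >= bytesPerPixel ? row[i - bytesPerPixel] & 0xFF : 0;
            out[i + 1] = (byte) ((row[i] & 0xFF) - left); // difference modulo 256
        }
        return out;
    }

    /** Reverses filterRow(): skips the filter-type byte and adds the left neighbour back. */
    static byte[] unfilterRow(byte[] filtered, int bytesPerPixel) {
        byte[] row = new byte[filtered.length - 1];
        for (int i = 0; i < row.length; i++) {
            int left = i >= bytesPerPixel ? row[i - bytesPerPixel] & 0xFF : 0;
            row[i] = (byte) ((filtered[i + 1] & 0xFF) + left);
        }
        return row;
    }

    public static void main(String[] args) {
        // Three 8-bit RGB pixels forming a smooth gradient (3 bytes per pixel).
        byte[] row = { 10, 20, 30, 11, 21, 31, 12, 22, 32 };
        byte[] filtered = filterRow(row, 3);
        System.out.println(java.util.Arrays.toString(filtered));
        // prints [1, 10, 20, 30, 1, 1, 1, 1, 1, 1]
    }
}
```

For 16-bit RGB the same code would be called with 6 bytes per pixel. Smooth image regions turn into runs of small, repetitive values, which is where a deflate compressor can gain; choosing a filter for every row costs extra time, which matches the speed versus compression ratio tradeoff mentioned in the description.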






[jira] [Commented] (PDFBOX-4184) [PATCH]: Support simple lossless compression of 16 bit RGB images

2018-04-08 Thread Tilman Hausherr (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-4184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16429707#comment-16429707
 ] 

Tilman Hausherr commented on PDFBOX-4184:
-

I found the cause of the bug from the github issue, it is in 
{{createAlphaFromARGBImage}}, the line {{bos.write(pixel)}}. For 16 bit images 
it should be changed to {{bos.write(pixel / 256)}}. So the existing code should 
be changed to
{code}
else
{
    bpc = 8;
    int dataType = alphaRaster.getDataBuffer().getDataType();
    if (dataType == DataBuffer.TYPE_USHORT)
    {
        for (int pixel : pixels)
        {
            bos.write(pixel / 256);
        }
    }
    else
    {
        for (int pixel : pixels)
        {
            bos.write(pixel);
        }
    }
}
{code}
Sadly this doesn't explain why I can't produce a test that fails... I tried 
several alpha values and nothing weird happened.
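
The effect of the division can be seen in isolation with a small sketch that uses only ByteArrayOutputStream. The class and method names are made up for the illustration. OutputStream.write(int) writes only the low-order 8 bits of its argument, so a 16-bit sample has to be scaled down before it is written as one byte.

```java
import java.io.ByteArrayOutputStream;

class AlphaScalingDemo {
    /** Writes the samples as-is; write(int) keeps only the low 8 bits of each value. */
    static byte[] writeUnscaled(int[] samples) {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        for (int sample : samples) {
            bos.write(sample);
        }
        return bos.toByteArray();
    }

    /** Scales 16-bit samples down to 8 bits before writing, as in the proposed fix. */
    static byte[] writeScaled(int[] samples) {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        for (int sample : samples) {
            bos.write(sample / 256);
        }
        return bos.toByteArray();
    }

    public static void main(String[] args) {
        int[] alpha16 = { 0x0000, 0x8000, 0xFFFF };
        for (byte b : writeUnscaled(alpha16)) {
            System.out.printf("%02x ", b & 0xFF); // 00 00 ff
        }
        System.out.println();
        for (byte b : writeScaled(alpha16)) {
            System.out.printf("%02x ", b & 0xFF); // 00 80 ff
        }
        System.out.println();
    }
}
```

In this sketch the samples 0x0000 and 0xFFFF produce the same byte in both variants; only the intermediate value differs.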







[jira] [Commented] (PDFBOX-4184) [PATCH]: Support simple lossless compression of 16 bit RGB images

2018-04-08 Thread Tilman Hausherr (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-4184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16429702#comment-16429702
 ] 

Tilman Hausherr commented on PDFBOX-4184:
-

The two last files (no smask) show that the bug is in the smask creation. The 
RGB images are identical visually (but different in bit size).







[jira] [Updated] (PDFBOX-4184) [PATCH]: Support simple lossless compression of 16 bit RGB images

2018-04-08 Thread Tilman Hausherr (JIRA)

 [ 
https://issues.apache.org/jira/browse/PDFBOX-4184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tilman Hausherr updated PDFBOX-4184:

Attachment: png16-arrow-good-no-mask.pdf
png16-arrow-bad-no-smask.pdf







[jira] [Updated] (PDFBOX-4184) [PATCH]: Support simple lossless compression of 16 bit RGB images

2018-04-08 Thread Tilman Hausherr (JIRA)

 [ 
https://issues.apache.org/jira/browse/PDFBOX-4184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tilman Hausherr updated PDFBOX-4184:

Attachment: png16-arrow-good.pdf
png16-arrow-bad.pdf

> [PATCH]: Support simple lossless compression of 16 bit RGB images
> -
>
> Key: PDFBOX-4184
> URL: https://issues.apache.org/jira/browse/PDFBOX-4184
> Project: PDFBox
>  Issue Type: Improvement
>  Components: Writing
>Affects Versions: 2.0.9
>Reporter: Emmeran Seehuber
>Priority: Minor
> Fix For: 2.0.10, 3.0.0 PDFBox
>
> Attachments: pdfbox_support_16bit_image_write.patch, 
> png16-arrow-bad.pdf, png16-arrow-good.pdf
>
>
> The attached patch add support to write 16 bit per component images 
> correctly. I've integrated a test for this here: 
> [https://github.com/rototor/pdfbox-graphics2d/commit/8bf089cb74945bd4f0f15054754f51dd5b361fe9]
> It only supports 16-Bit TYPE_CUSTOM with DataType == USHORT images - but this 
> is what you usually get when you read a 16 bit PNG file.
> This would also fix [https://github.com/danfickle/openhtmltopdf/issues/173].
> The patch is against 2.0.9, but should apply to 3.0.0 too.
> There is still some room for improvements when writing lossless images, as 
> the images are currently not efficiently encoded. I.e. you could use PNG 
> encodings to get a better compression. (By adding a COSName.DECODE_PARMS with 
> a COSName.PREDICTOR == 15 and encoding the images as PNG). But this is 
> something for a later patch. It would also need another API, as there is a 
> tradeoff speed vs compression ratio. 






[jira] [Commented] (PDFBOX-4182) Improve memory usage of PDFMergerUtility

2018-04-08 Thread Maruan Sahyoun (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-4182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16429669#comment-16429669
 ] 

Maruan Sahyoun commented on PDFBOX-4182:


Thanks - I did a special merge implementation which works without leaving the 
files open, but it is for a very specific set of PDFs (merging over 1 docs in 
one go) - so maybe we can find a way to also deal with the issues which currently 
prevent us from doing it. OTOH if the resulting file is large it will still 
need lots of memory. We could take a look at memory-mapped files for caching.

[~pasfilip] would it be possible to share a small set of your documents to get 
an idea which PDF elements they use? 

> Improve memory usage of PDFMergerUtility
> 
>
> Key: PDFBOX-4182
> URL: https://issues.apache.org/jira/browse/PDFBOX-4182
> Project: PDFBox
>  Issue Type: Improvement
>Affects Versions: 2.0.9
>Reporter: Pas Filip
>Priority: Major
> Attachments: PDFMergerUtilityUsingSupplier.java, Supplier.java, 
> Suppliers.java, 
> failed-merge-utility-4gb-heap-out-of-memory-after-1800-pdfs.png, 
> merge-pdf-stats.xlsx, oom-2gb-heap-after-refactoring-leak-suspect-1.png, 
> oom-2gb-heap-after-refactoring-leak-suspect-2.png, successful - 
> refactored-merge-utility-4gb-heap-2618-files-merged.png, successful 
> -merge-utility-6gb-heap-2618-files-merged.png, 
> successful-merge-utility-6gb-heap-2618-files-merged-setupTempFileOnly.png, 
> successful-merge-utility-8gb-heap-2618-files-merged.png, 
> successful-refactored-merge-utility-4gb-heap-2618-files-merged-setupTempFileOnly.png
>
>
> I have been running some tests trying to merge large amounts (2618) of small 
> pdf documents, between 100kb and 130kb, into a single large pdf (288.433kb)
> Memory consumption seems to be the main limitation.
> ScratchFileBuffer seems to consume the majority of the memory usage.
> (see screenshot from mat in attachment)
> (I would include the hprof in attachment so you can analyze yourselves but 
> it's rather large)
> Note that it seems impossible to generate a large pdf using a small memory 
> footprint.
> I personally thought that using MemorySettings with temporary file only would 
> allow me to generate arbitrarily large pdf files but it doesn't seem to help.
> I've run the mergeDocuments with  memory settings:
>  * MemoryUsageSetting.setupMixed(1024L * 1024L, 1024L * 1024L * 1024L * 1024L 
> * 1024L)
>  * MemoryUsageSetting.setupTempFileOnly()
> Refactored version completes with *4GB* heap:
> with temp file only completes 2618 documents in 1.760 min
> *VS*
> *8GB* heap:
> with temp file only completes 2618 documents in 2.0 min
> Heaps of 6gb or less result in OOM. (Didn't try different sizes between 6GB 
> and 8GB)
>  It looks like the loop in the mergeDocuments accumulates PDDocument objects 
> in a list which are closed after the merge is completed.
> Refactoring the code to close these as they are used, instead of accumulating 
> them and closing all at the end, improves memory usage considerably (although 
> the problem doesn't seem to be eliminated completely, based on the MAT analysis).
> Another change I've implemented is to only create the inputstream when the 
> file needs to be read and to close it alongside the PDDocument.
> (Some inputstreams contain buffers and depending on the size of the buffers 
> and or the stream type accumulating all the streams is a potential 
> memory-hog.)
> These changes seem to be beneficial, in the sense that I can 
> process the same amount of pdfs with about half the memory.
>  I'd appreciate it if you could roll these changes into the main codebase.
> (I've respected java 6 compatibility.)
> I've included in attachment the java files of the new implementation:
>  * Suppliers
>  * Supplier
>  * PDFMergerUtilityUsingSupplier
> PDFMergerUtilityUsingSupplier can replace the previous version. No signature 
> changes, only internal code changes. (Just rename the class to 
> PDFMergerUtility if you decide to implement the changes.)
>  In attachment you can also find some screenshots from visualvm showing the 
> memory usage of the original version and the refactored version as well as 
> some info produced by mat after analysing the heap.
> If you know of any other means, without running into memory issues, to merge 
> large sets of pdf files into a large single pdf I'd love to hear about it!
> I'd also suggest that further improvements be made to memory 
> usage in general, as pdfbox seems to consume a lot of memory.
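
The difference between accumulating the source documents and closing each one as soon as it has been used can be sketched without PDFBox. Everything in the following block is hypothetical: FakeDocument stands in for a loaded source document, and DocumentSupplier only mimics the idea of the attached Supplier interface (create the resource when it is needed). The sketch counts how many documents are open at the same time; it says nothing about whether PDFBox can safely close a source document before the destination has been saved, which the comment above notes is still an open issue.

```java
import java.io.Closeable;
import java.util.ArrayList;
import java.util.List;

class CloseEarlyDemo {
    /** Stand-in for a loaded source document; tracks how many are open at once. */
    static final class FakeDocument implements Closeable {
        static int open;
        static int peakOpen;

        FakeDocument() {
            open++;
            peakOpen = Math.max(peakOpen, open);
        }

        @Override
        public void close() {
            open--;
        }
    }

    /** Hypothetical supplier: the document is only created when get() is called. */
    interface DocumentSupplier {
        FakeDocument get();
    }

    static List<DocumentSupplier> sources(int count) {
        List<DocumentSupplier> list = new ArrayList<DocumentSupplier>();
        for (int i = 0; i < count; i++) {
            list.add(() -> new FakeDocument());
        }
        return list;
    }

    /** Opens every source, keeps them in a list, and closes them all at the end. */
    static int peakWhenAccumulating(List<DocumentSupplier> sources) {
        FakeDocument.open = 0;
        FakeDocument.peakOpen = 0;
        List<FakeDocument> toClose = new ArrayList<FakeDocument>();
        for (DocumentSupplier source : sources) {
            toClose.add(source.get()); // pages would be appended here
        }
        for (FakeDocument doc : toClose) {
            doc.close();
        }
        return FakeDocument.peakOpen;
    }

    /** Opens one source at a time and closes it before moving to the next. */
    static int peakWhenClosingEarly(List<DocumentSupplier> sources) {
        FakeDocument.open = 0;
        FakeDocument.peakOpen = 0;
        for (DocumentSupplier source : sources) {
            FakeDocument doc = source.get();
            try {
                // pages would be appended here
            } finally {
                doc.close();
            }
        }
        return FakeDocument.peakOpen;
    }

    public static void main(String[] args) {
        System.out.println(peakWhenAccumulating(sources(2618))); // 2618
        System.out.println(peakWhenClosingEarly(sources(2618))); // 1
    }
}
```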






[jira] [Commented] (PDFBOX-4004) Elements in the structure tree are not removed or corrected when flattening

2018-04-08 Thread Maruan Sahyoun (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-4004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16429656#comment-16429656
 ] 

Maruan Sahyoun commented on PDFBOX-4004:


[~tilman] I'll take a look after doing PDFBOX-3809

> Elements in the structure tree are not removed or corrected when flattening
> ---
>
> Key: PDFBOX-4004
> URL: https://issues.apache.org/jira/browse/PDFBOX-4004
> Project: PDFBox
>  Issue Type: Bug
>  Components: AcroForm
>Affects Versions: 2.0.8
>Reporter: Tilman Hausherr
>Priority: Major
>  Labels: StructureTree, flatten
> Attachments: GovFormPreFlattened.pdf
>
>
> When flattening, the elements in the structure tree are not removed nor 
> adjusted (to the form xobject). An example can be found at 
> {{Root/StructTreeRoot/ParentTree/Nums/\[31]/K/Obj}} in the file 
> GovFormPreFlattened.pdf . This links to something that does not really exist 
> anymore.






[jira] [Assigned] (PDFBOX-4004) Elements in the structure tree are not removed or corrected when flattening

2018-04-08 Thread Maruan Sahyoun (JIRA)

 [ 
https://issues.apache.org/jira/browse/PDFBOX-4004?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maruan Sahyoun reassigned PDFBOX-4004:
--

Assignee: Maruan Sahyoun

> Elements in the structure tree are not removed or corrected when flattening
> ---
>
> Key: PDFBOX-4004
> URL: https://issues.apache.org/jira/browse/PDFBOX-4004
> Project: PDFBox
>  Issue Type: Bug
>  Components: AcroForm
>Affects Versions: 2.0.8
>Reporter: Tilman Hausherr
>Assignee: Maruan Sahyoun
>Priority: Major
>  Labels: StructureTree, flatten
> Attachments: GovFormPreFlattened.pdf
>
>
> When flattening, the elements in the structure tree are not removed nor 
> adjusted (to the form xobject). An example can be found at 
> {{Root/StructTreeRoot/ParentTree/Nums/\[31]/K/Obj}} in the file 
> GovFormPreFlattened.pdf . This links to something that does not really exist 
> anymore.






[jira] [Updated] (PDFBOX-4004) Elements in the structure tree are not removed or corrected when flattening

2018-04-08 Thread Maruan Sahyoun (JIRA)

 [ 
https://issues.apache.org/jira/browse/PDFBOX-4004?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maruan Sahyoun updated PDFBOX-4004:
---
Component/s: AcroForm



