UTF16 encoded string to PDFDocEncoding

2017-07-10 Thread Andrea Vacondio
Hi, we came across this case where we are basically cloning outline items where the original outline title is a UTF-16BE encoded text string containing the value 00A0 (non-breaking space). We later use the string to assign the title in a new outline item and the A0 is recognised as a € sign. Here is a
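
A minimal sketch of the mismatch described in this message, using only the JDK and no PDFBox calls. It assumes the title bytes are FE FF 00 A0 (the UTF-16BE marker followed by U+00A0) and relies on the PDFDocEncoding table of the PDF specification, where code 0xA0 is the Euro sign. The class and method names are hypothetical and only illustrate why the two readings differ.

```java
import java.nio.charset.StandardCharsets;

class Utf16TitleSketch {
    /** Decodes a PDF text string that starts with the UTF-16BE marker FE FF. */
    static String decodeUtf16BeTextString(byte[] raw) {
        boolean hasBom = raw.length >= 2
                && (raw[0] & 0xFF) == 0xFE
                && (raw[1] & 0xFF) == 0xFF;
        if (!hasBom) {
            throw new IllegalArgumentException("not a UTF-16BE text string (no FE FF marker)");
        }
        // Skip the two marker bytes and decode the rest as UTF-16BE.
        return new String(raw, 2, raw.length - 2, StandardCharsets.UTF_16BE);
    }

    /** Covers only the single PDFDocEncoding entry relevant here: code 0xA0 is the Euro sign. */
    static char pdfDocEncodingCharFor(int code) {
        if (code == 0xA0) {
            return '\u20AC';
        }
        throw new IllegalArgumentException("code not covered by this sketch: " + code);
    }

    public static void main(String[] args) {
        byte[] title = {(byte) 0xFE, (byte) 0xFF, 0x00, (byte) 0xA0};
        String decoded = decodeUtf16BeTextString(title);
        System.out.printf("decoded as UTF-16BE: U+%04X%n", (int) decoded.charAt(0));
        System.out.printf("byte 0xA0 read as PDFDocEncoding: U+%04X%n",
                (int) pdfDocEncodingCharFor(0xA0));
    }
}
```

Decoded as UTF-16BE the title holds U+00A0, while the bare byte A0 read through PDFDocEncoding is U+20AC, which matches the symptom reported in the message.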

Re: UTF16 encoded string to PDFDocEncoding

2017-07-11 Thread Andrea Vacondio
I'm talking about the node dictionary; try adding this: System.out.println(node.getTitle()); On Tue, Jul 11, 2017 at 12:20 PM, Andreas Lehmkühler wrote: > > Andreas Lehmkühler wrote on July 11, 2017 at 12:17: > > > > Andr

Form flattening with or without refresh

2019-05-13 Thread Andrea Vacondio
Hi, we are having an issue when we try to flatten the form in the file f1040_sample.pdf. We get slightly different results depending on the value of refresh appearance; in particular, when we set it to true we get a slightly wrong positioning of the text field value. Shouldn't we expect the same re

Re: Form flattening with or without refresh

2019-05-13 Thread Andrea Vacondio
Understood, thanks for clarifying. So in my algorithm, given that I don't modify the form, I could use refreshAppearance=true only when NeedAppearances=true in the form, correct? Andrea On Mon, May 13, 2019 at 16:26, Maruan Sahyoun <sahy...@fileaffairs.de> wrote: > Hi, > > in theory

Re: Form flattening with or without refresh

2019-05-14 Thread Andrea Vacondio
It looks like they behave the same way. I added the generated files to the Dropbox folder. Andrea On Mon, May 13, 2019 at 22:38, Maruan Sahyoun <sahy...@fileaffairs.de> wrote: > Hi Andrea, > > would you mind testing with 2.0.14 and 2.0.13 just to see if that is a > regression? > >

JIRA performance

2015-02-04 Thread Andrea Vacondio
Hi, I'm trying to get more involved in the project and lately I've often been going through the issues in JIRA, but it's awfully slow. I mean "I want to throw my notebook out the window" slow. Is it just me? Anything that can be done? Thanks

Add jdk 8 to Travis

2015-02-05 Thread Andrea Vacondio
I'd like to suggest adding " - oraclejdk8" to the Travis CI yml config file. This will ensure that the build compiles and the tests run fine against JDK 8.

Outline parent with P key

2015-02-08 Thread Andrea Vacondio
Hi, I'm trying to become more familiar with PDFBox code and I have a simple use case involving document outline. I decided to start looking at the fairly simple outline package, review it, clean the code a bit keeping an eye on Sonar and write unit tests for that. I was writing unit test for the PD

Outline and outline items count

2015-02-09 Thread Andrea Vacondio
Hi, I'm playing with the outline package and I'm trying to figure out if my understanding of the spec tables 152 and 153, Count key for the outline dictionary and outline item dictionary, is correct. Could someone confirm that the following unit tests should pass? Thanks @Test public void outlinesCou
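
A rough sketch of how the Count values in question can be derived, based on one reading of the PDF specification: an open item's Count is the number of descendants currently visible, a closed item's Count is the negative of the number that would become visible if it were opened, and the outline dictionary's Count is the total number of visible items at all levels. The `Node` type and method names below are hypothetical and not part of PDFBox; the tests from the original message are not reproduced here.

```java
import java.util.ArrayList;
import java.util.List;

class OutlineCountSketch {
    /** Hypothetical stand-in for an outline item: an open/closed flag plus children. */
    static final class Node {
        final boolean open;
        final List<Node> children = new ArrayList<>();

        Node(boolean open) {
            this.open = open;
        }

        Node add(Node child) {
            children.add(child);
            return this;
        }
    }

    /** Number of descendants that are visible when the given node itself is open. */
    static int visibleIfOpened(Node node) {
        int visible = 0;
        for (Node child : node.children) {
            visible += 1; // the child itself
            if (child.open) {
                visible += visibleIfOpened(child); // plus whatever it currently shows
            }
        }
        return visible;
    }

    /** Count for an outline item: positive when open, negative when closed. */
    static int itemCount(Node item) {
        int visible = visibleIfOpened(item);
        return item.open ? visible : -visible;
    }

    /** Count for the outline dictionary: total visible items at all levels. */
    static int outlineCount(Node root) {
        return visibleIfOpened(root);
    }

    public static void main(String[] args) {
        Node openItem = new Node(true).add(new Node(false)).add(new Node(false));
        Node closedItem = new Node(false)
                .add(new Node(false)).add(new Node(false)).add(new Node(false));
        Node root = new Node(true).add(openItem).add(closedItem);
        System.out.println("open item Count:   " + itemCount(openItem));
        System.out.println("closed item Count: " + itemCount(closedItem));
        System.out.println("outline Count:     " + outlineCount(root));
    }
}
```

A result of zero corresponds to the case where the spec says the Count entry should be omitted rather than written.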

Re: Outline and outline items count

2015-02-10 Thread Andrea Vacondio
if it isn't a valid pdf/a. Ok, I'll test the current implementation and my implementation and see the result. Thanks On 09.02.2015 at 18:52, Andrea Vacondio wrote: > Hi, > I'm playing with the outline package and I'm trying to figure out if my > understanding of the s

Mutable COSInteger

2015-02-13 Thread Andrea Vacondio
Hi, I was looking at the COSInteger and I have a couple of questions. I noticed that it caches instances between -100 and 265. My first question is why those limits and not cache everything... perhaps with Soft or Weak values? I made a very rough test using a Map to cache all the COSInteger and
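
For readers unfamiliar with the pattern being asked about, here is a minimal sketch of a bounded instance cache, similar in spirit to what the message describes. It is not PDFBox's code: the class name is hypothetical and the bounds are simply the ones quoted in the message, not checked against the source.

```java
class CachedIntSketch {
    // Bounds as quoted in the message; treat them as illustrative.
    static final int LOW = -100;
    static final int HIGH = 265;

    // One slot per value in [LOW, HIGH], filled lazily.
    private static final CachedIntSketch[] CACHE = new CachedIntSketch[HIGH - LOW + 1];

    final long value;

    private CachedIntSketch(long value) {
        this.value = value;
    }

    /** Returns a shared instance for values inside the range, a fresh one otherwise. */
    static CachedIntSketch get(long value) {
        if (value < LOW || value > HIGH) {
            return new CachedIntSketch(value);
        }
        int index = (int) value - LOW;
        CachedIntSketch cached = CACHE[index];
        if (cached == null) {
            // Not synchronized: under contention two threads could briefly
            // create distinct instances for the same value.
            cached = new CachedIntSketch(value);
            CACHE[index] = cached;
        }
        return cached;
    }

    public static void main(String[] args) {
        System.out.println(get(5) == get(5));       // inside the range: same instance
        System.out.println(get(1000) == get(1000)); // outside the range: new instance each time
    }
}
```

A fixed array keeps lookups cheap and memory bounded; caching every value, as the message suggests with soft or weak references, trades that for a map lookup and reference bookkeeping on each call.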

Transition dictionary implementation

2015-02-17 Thread Andrea Vacondio
I was thinking about implementing the transition dictionary (12.4.4.1 of the PDF 32000-1:2008), which seems to be missing. I'll open an issue and attach the patch to it for review, is it ok? Anything I should know about it, except it's probably a very rarely used feature :)?

Xref parsing performance

2015-02-27 Thread Andrea Vacondio
Hi, a few days ago I was profiling PDFBox when loading medium/large size documents and I think I found something. If you try loading the document http://www.adobe.com/devnet/acrobat/pdfs/pdf_reference_1-7.pdf you'll see it takes quite some time and that's mostly spent in the XrefTrailerResolver.getCo

Re: Xref parsing performance

2015-02-28 Thread Andrea Vacondio
ings in > PDFBOX-2576, and then the optimization you mention (or the other way > around). The parser is indeed a tricky part of the code (And SonarQube and > Software Diagnostics have also flagged it as too complex). I did some > refactorings a few weeks ago there (splitting methods), but s

Question about PDDocument.setVersion

2015-03-04 Thread Andrea Vacondio
Hi, regarding 2.0.0-SNAPSHOT: I was setting the version on an existing document and I noticed the version was set on the Catalog but not in the header, so I took a look at the code and I think there's something odd there (or I'm missing something). It first makes sure we are not downgrading the version and t

Re: Question about PDDocument.setVersion

2015-03-04 Thread Andrea Vacondio
crement. > > BR > Maruan > > On 04.03.2015 at 14:15, Andrea Vacondio wrote: > > > Hi, about 2.0.0-SNAPSHOT I was setting version on an existing document > and > > I noticed the version was set on the Catalog but not in the header so I > > took a look at

parseCOSStream defined twice

2015-04-02 Thread Andrea Vacondio
I see parseCOSStream is defined in both BaseParser and COSParser. Is it on purpose? They kinda look like duplicates, so I was wondering if it's a mistake and one can go.

Re: Getting the pagenumber of a PDPage

2015-04-22 Thread Andrea Vacondio
I think this might be related https://issues.apache.org/jira/browse/PDFBOX-2704 On Wed, Apr 22, 2015 at 10:15 AM, Johanneke Lamberink < johanneke.lamber...@onior.com> wrote: > Hi, > > I'm trying to get the pagenumber of a given PDPage. > What I have is: > document.getPages().indexOf(page) > > No

Should PDColor toCOSArray include the pattern name?

2015-08-04 Thread Andrea Vacondio
Hi, this is somewhat related to https://issues.apache.org/jira/browse/PDFBOX-2830. The PDColor::toCOSArray method adds the pattern name (if present) to the end of the array, but as far as I can tell from the spec it shouldn't. The C, BC and IC description says something like "An array of numbers that

Named destinations refactor/enhancement

2015-09-05 Thread Andrea Vacondio
Hi, I'd like to discuss a few issues related to the named destinations and somewhat related to PDFBOX-2793. 1. PDOutlineItem::findNamedDestinationPage I think this method belongs to the PDCatalog, public. In my use case I split a document and I want to process the annotations removing those link anno

Re: Named destinations refactor/enhancement

2015-09-08 Thread Andrea Vacondio
>> 2. PDDocumentNameDictionary::getDests if it doesn't find the Dests name >> tree it tries to find the catalog Dests and wraps it in a >> PDDestinationNameTreeNode. I think this is wrong since the catalog Dests >> is >> a dictionary, not a name tree. >> I also think this is a hidden behavior that

Text extraction and clip area

2016-12-01 Thread Andrea Vacondio
Hi, I had a couple of issues with text extraction and I tried to dig a bit into the code. As far as I can see the "current clipping area" is never used during text extraction, is this correct? My issue is with a form xobject where the bounding box clips out part of the text but that text is returne

Re: Text extraction and clip area

2016-12-01 Thread Andrea Vacondio
> About your problem, I've not understood clearly. > Do you want to process the contents inside a form? > > I can give a sample code used in my project. > It uses PDFStreamEngine to get form objects in PDF. > I hope it can help you. > > > > > > -Original Mes

Cross-Reference stream

2010-11-09 Thread Andrea Vacondio
Hi, I'm new to PDFBox and I'd like to know how I can write (if possible) a Cross-Reference stream (PDF reference chapter 3.4.7). I found that PDFBox can read cross-reference streams, but looking at the examples I could not find anything to start with. Could you please point me in the right direction? To

Any plan for an Android version?

2013-03-26 Thread Andrea Vacondio
Hi, I'd like to know if there's any plan to package an Android version of PDFBox. There are already some libraries to write PDF but I couldn't find anything open source to load/change/save an existing document. Thanks