[jira] [Commented] (PDFBOX-2634) Multiple text operations on multiple pages cause NPE in TTFSubsetter

2015-01-29 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-2634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14297187#comment-14297187
 ] 

ASF subversion and git services commented on PDFBOX-2634:
-

Commit 1655760 from [~jahewson] in branch 'pdfbox/trunk'
[ https://svn.apache.org/r1655760 ]

PDFBOX-2634: Revert accidental commit

 Multiple text operations on multiple pages cause NPE in TTFSubsetter
 

 Key: PDFBOX-2634
 URL: https://issues.apache.org/jira/browse/PDFBOX-2634
 Project: PDFBox
  Issue Type: Bug
  Components: FontBox
Affects Versions: 2.0.0
Reporter: Alex Nevidomsky
Assignee: John Hewson
 Fix For: 2.0.0


 Problem seems to be of the same nature as in PDFBOX-2605, in a slightly 
 different scenario.
 {code:title=NullPTest.java}
 import org.apache.pdfbox.pdmodel.PDDocument;
 import org.apache.pdfbox.pdmodel.PDPage;
 import org.apache.pdfbox.pdmodel.common.PDRectangle;
 import org.apache.pdfbox.pdmodel.edit.PDPageContentStream;
 import org.apache.pdfbox.pdmodel.font.PDFont;
 import org.apache.pdfbox.pdmodel.font.PDType0Font;
 import org.junit.Test;

 public class NullPTest {
     @Test
     public void testMultipageUnicodePDF() throws Exception {
         PDDocument document = new PDDocument();
         PDFont titleFont = PDType0Font.load(document,
                 this.getClass().getResourceAsStream("/Arial Unicode.ttf"));

         // first page
         PDPage page = new PDPage(PDRectangle.A4);
         document.addPage(page);
         PDPageContentStream contentStream = new PDPageContentStream(document, page);
         contentStream.beginText();
         contentStream.setFont(titleFont, 12);
         contentStream.newLineAtOffset(0, 100);
         contentStream.showText("Pěkný žluťoučký kůň úpěl ďábelské ódy");
         contentStream.endText();
         contentStream.close();

         // second page, same font
         page = new PDPage(PDRectangle.A4);
         document.addPage(page);
         contentStream = new PDPageContentStream(document, page);
         contentStream.beginText();
         contentStream.setFont(titleFont, 12);
         contentStream.newLineAtOffset(0, 200);
         contentStream.showText("Pěkný žluťoučký kůň úpěl ďábelské ódy");
         contentStream.endText();
         contentStream.close();
         document.close();
     }
 }
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org
For additional commands, e-mail: dev-h...@pdfbox.apache.org



Re: Font subsetting Unicode

2015-01-29 Thread John Hewson
Sure thing.

-- John

 On 29 Jan 2015, at 03:57, Maruan Sahyoun sahy...@fileaffairs.de wrote:
 
 Hi John,
 
 thx for putting that in - was a long standing issue. This also helps 
 resolving some issues around AcroForms. 
 
 BR
 Maruan



[jira] [Commented] (PDFBOX-2634) Multiple text operations on multiple pages cause NPE in TTFSubsetter

2015-01-29 Thread John Hewson (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-2634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14297188#comment-14297188
 ] 

John Hewson commented on PDFBOX-2634:
-

Well spotted - thanks!



[jira] [Commented] (PDFBOX-2634) Multiple text operations on multiple pages cause NPE in TTFSubsetter

2015-01-29 Thread Tilman Hausherr (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-2634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14296609#comment-14296609
 ] 

Tilman Hausherr commented on PDFBOX-2634:
-

[~jahewson] Methinks you had an accidental commit in PDFPrinter.java.




[jira] [Updated] (PDFBOX-2638) PDF files content lost when multiples pdf files merged in to one file

2015-01-29 Thread MANISHA SHARMA (JIRA)

 [ 
https://issues.apache.org/jira/browse/PDFBOX-2638?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

MANISHA SHARMA updated PDFBOX-2638:
---
Attachment: MergedDoc.pdf
Sample Pdf files.zip

I am attaching a zip archive containing the PDF files that I used as sample 
files for the merge in my code.

I have also attached the merged output file, MergedDoc.pdf, where you can 
see PDF pages whose content has been messed up (text replaced by boxes, 
images lost).

 PDF files content lost when multiples pdf files merged in to one file
 -

 Key: PDFBOX-2638
 URL: https://issues.apache.org/jira/browse/PDFBOX-2638
 Project: PDFBox
  Issue Type: Bug
  Components: Rendering
Affects Versions: 1.8.2
 Environment: Oracle Linux 5 (Intel 64-bit, Developer) 
Reporter: MANISHA SHARMA
 Attachments: MergedDoc.pdf, Sample Pdf files.zip


 I am trying to merge six PDF files. In the merged document, I am seeing
 some boxes in place of text.
 Text got replaced by boxes and images got lost.
 The code used for merging is given below:
 public static void main (String args[])
 {
     String[] docletNamesAsPdf =
         { "RP_OverviewPart1.pdf", "RP_OverviewPart2.pdf",
           "RP_OverviewPart3.pdf", "RP_OverviewPart4.pdf",
           "RP_OverviewPart5.pdf", "RP_OverviewPart6.pdf" };
     PDDocument dest = PDDocument.load(docletNamesAsPdf[0]);
     PDDocument src = PDDocument.load(docletNamesAsPdf[1]);
     dest = mergePdfs(dest, src);
     for (int i = 2; i < docletNamesAsPdf.length; i++)
     {
         src = PDDocument.load(docletNamesAsPdf[i]);
         dest = pptToPdf.mergePdfs(dest, src);
     }
     try {
         dest.save("MergedDoc.pdf");
     } catch (COSVisitorException e) {
         ;
     }
     src.close();
     dest.close();
 }
 public PDDocument mergePdfs(PDDocument dest, PDDocument src) throws
 IOException {
     new PDFMergerUtility().appendDocument(dest, src);
     return dest;
 }
 Please let me know what is wrong with the code and how we can resolve
 this issue. 






[jira] [Closed] (PDFBOX-2638) PDF files content lost when multiples pdf files merged in to one file

2015-01-29 Thread Andreas Lehmkühler (JIRA)

 [ 
https://issues.apache.org/jira/browse/PDFBOX-2638?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andreas Lehmkühler closed PDFBOX-2638.
--
Resolution: Not a Problem
  Assignee: Andreas Lehmkühler

Please use our [mailinglists|http://pdfbox.apache.org/mailinglists.html] to ask 
questions. JIRA is used for tracking bugs and features only.

PS: 
- you should update to 1.8.8
- merging the PDFs using the command line tool works perfectly
- you should use a new document for the merged PDF




[jira] [Commented] (PDFBOX-2631) Single radio-button group has no children

2015-01-29 Thread Maruan Sahyoun (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-2631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14296744#comment-14296744
 ] 

Maruan Sahyoun commented on PDFBOX-2631:


[~giladd] Hi Gilad, can I close the issue or are there further questions?

 Single radio-button group has no children
 -

 Key: PDFBOX-2631
 URL: https://issues.apache.org/jira/browse/PDFBOX-2631
 Project: PDFBox
  Issue Type: Bug
  Components: AcroForm
Affects Versions: 1.8.8
 Environment: Windows 7, Eclipse, JRE 1.8.0_25
Reporter: Gilad Denneboom
Priority: Minor
 Attachments: test2.pdf


 (Continuation of https://issues.apache.org/jira/browse/PDFBOX-2617)
 A group of radio buttons is an object of the PDRadioCollection class, and each 
 child of that group is a PDCheckbox object.
 However, if the group contains only one widget, the getKids method of the 
 PDRadioCollection object returns null.
 There should be at least one child for any such group.






Font subsetting Unicode

2015-01-29 Thread Maruan Sahyoun
Hi John,

thx for putting that in - was a long standing issue. This also helps resolving 
some issues around AcroForms. 

BR
Maruan

[jira] [Assigned] (PDFBOX-2627) Add block composer to handle multiline text

2015-01-29 Thread Maruan Sahyoun (JIRA)

 [ 
https://issues.apache.org/jira/browse/PDFBOX-2627?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maruan Sahyoun reassigned PDFBOX-2627:
--

Assignee: Maruan Sahyoun

 Add block composer to handle multiline text
 ---

 Key: PDFBOX-2627
 URL: https://issues.apache.org/jira/browse/PDFBOX-2627
 Project: PDFBox
  Issue Type: Sub-task
  Components: AcroForm
Affects Versions: 2.0.0
Reporter: Maruan Sahyoun
Assignee: Maruan Sahyoun
 Fix For: 2.0.0


 In order to generate the appearance for multiline text a *basic* plain text 
 block composer needs to be developed. 
 Features
 - box model
 - paragraph separation
 - line breaking
 - horizontal and vertical alignment
 - font setting, line height …
 Conceptually it should also include a writing mode, likely with lr-tb 
 (left-to-right, top-to-bottom) as the only initial implementation.
 There should be no new dependencies on projects such as ICU. This will limit 
 the capabilities but should be acceptable for the needs of form filling.
 To avoid creating the *false* assumption that this is a generic composer, it 
 will be sub-packaged within the (AcroForms) forms package(s).
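
 To make the "paragraph separation" and "line breaking" features above more concrete, here is a minimal sketch of greedy line breaking. It is only an illustration, not the planned PDFBox implementation: the class name PlainTextLineBreaker, the method breakLines and the measure callback (which stands in for a font-based string width calculation) are all hypothetical, and the sketch assumes Java 8.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.ToDoubleFunction;

public class PlainTextLineBreaker {

    /**
     * Greedy line breaking: paragraphs are separated by '\n', words by
     * whitespace, and a word moves to a new line when appending it would
     * exceed maxWidth. The measure callback is a hypothetical stand-in for
     * a font-based width calculation.
     */
    public static List<String> breakLines(String text, double maxWidth,
                                          ToDoubleFunction<String> measure) {
        List<String> lines = new ArrayList<>();
        for (String paragraph : text.split("\n", -1)) {
            StringBuilder line = new StringBuilder();
            for (String word : paragraph.trim().split("\\s+")) {
                if (word.isEmpty()) {
                    continue; // empty paragraph, nothing to place
                }
                String candidate = line.length() == 0 ? word : line + " " + word;
                if (line.length() > 0 && measure.applyAsDouble(candidate) > maxWidth) {
                    // the word does not fit: close the current line, start a new one
                    lines.add(line.toString());
                    line = new StringBuilder(word);
                } else {
                    line = new StringBuilder(candidate);
                }
            }
            // close the paragraph; an empty paragraph yields an empty line
            lines.add(line.toString());
        }
        return lines;
    }
}
```

 A word that is wider than maxWidth on its own is placed on a line by itself rather than split; a real composer would also need alignment and line height handling, which this sketch leaves out.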






[jira] [Assigned] (PDFBOX-2631) Single radio-button group has no children

2015-01-29 Thread Maruan Sahyoun (JIRA)

 [ 
https://issues.apache.org/jira/browse/PDFBOX-2631?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maruan Sahyoun reassigned PDFBOX-2631:
--

Assignee: Maruan Sahyoun




[jira] [Created] (PDFBOX-2639) Enhance the AcroForms related API

2015-01-29 Thread Maruan Sahyoun (JIRA)
Maruan Sahyoun created PDFBOX-2639:
--

 Summary: Enhance the AcroForms related API
 Key: PDFBOX-2639
 URL: https://issues.apache.org/jira/browse/PDFBOX-2639
 Project: PDFBox
  Issue Type: Improvement
  Components: AcroForm
Reporter: Maruan Sahyoun
Priority: Minor


This is a general issue to gather input for potential enhancements to use 
PDFBox for forms creation and filling. Sub tasks to that issue will track 
individual enhancements which might result from that input.






[jira] [Commented] (PDFBOX-2638) PDF files content lost when multiples pdf files merged in to one file

2015-01-29 Thread Maruan Sahyoun (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-2638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14296740#comment-14296740
 ] 

Maruan Sahyoun commented on PDFBOX-2638:


The main issue is that the source documents need to be kept open until the 
final document has been saved, which is not the case with your approach. You 
might want to review the PDFMerger.java source. Following that example, your 
code becomes

{code}
PDFMergerUtility merger = new PDFMergerUtility();
for (int i = 0; i < docletNamesAsPdf.length; i++)
{
    String sourceFileName = docletNamesAsPdf[i];
    merger.addSource(sourceFileName);
}

merger.setDestinationFileName("MergedDoc.pdf");
merger.mergeDocumentsNonSeq(null);
{code}

Used this way, PDFMergerUtility, which you are using anyway, takes care of 
closing the sources after the job is done.

Please note that usage questions should be posted to the users mailing list 
[https://pdfbox.apache.org/mailinglists.html].




[jira] [Updated] (PDFBOX-2639) Enhance the AcroForms related API

2015-01-29 Thread Maruan Sahyoun (JIRA)

 [ 
https://issues.apache.org/jira/browse/PDFBOX-2639?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maruan Sahyoun updated PDFBOX-2639:
---
Description: 
This is a general issue to gather input for potential enhancements to use 
PDFBox for forms creation and filling. Sub tasks to that issue will track 
individual enhancements which might result from that input.

Possible enhancements
- currently getWidget() only returns a single Widget, but a field might have 
multiple
- adding a Widget to already existing ones could be simplified
- working with Widgets within RadioButtons could be enhanced to make it easier 
to check/uncheck a RadioButton option

  was:This is a general issue to gather input for potential enhancements to use 
PDFBox for forms creation and filling. Sub tasks to that issue will track 
individual enhancements which might result from that input.





[jira] [Updated] (PDFBOX-2597) Provide easier access to AcroForm field tree

2015-01-29 Thread Maruan Sahyoun (JIRA)

 [ 
https://issues.apache.org/jira/browse/PDFBOX-2597?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maruan Sahyoun updated PDFBOX-2597:
---
Fix Version/s: 2.1.0

 Provide easier access to AcroForm field tree
 

 Key: PDFBOX-2597
 URL: https://issues.apache.org/jira/browse/PDFBOX-2597
 Project: PDFBox
  Issue Type: Improvement
  Components: AcroForm
Reporter: Maruan Sahyoun
Priority: Minor
 Fix For: 2.1.0


 The current AcroForm field retrieval methods don’t provide easy access to 
 all fields, because 
  - one needs to retrieve the document's root fields
  - check whether these are non-terminal fields
  - retrieve their children
  - move on until all terminal fields have been retrieved
 There should be an easier way to get access to all terminal fields.
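
 The traversal described in the steps above can be sketched as a small depth-first walk. This is only an illustration under simplifying assumptions: FieldTreeWalker and FieldNode are hypothetical stand-ins and not PDFBox classes, and a field is treated as terminal when it has no child fields at all.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class FieldTreeWalker {

    /** Hypothetical stand-in for a form field; not a PDFBox class. */
    public static class FieldNode {
        public final String name;
        public final List<FieldNode> kids;

        public FieldNode(String name, FieldNode... kids) {
            this.name = name;
            this.kids = Arrays.asList(kids);
        }

        /** Simplification: a field without child fields is terminal. */
        public boolean isTerminal() {
            return kids.isEmpty();
        }
    }

    /** Collects all terminal fields below the given root fields, depth-first. */
    public static List<FieldNode> terminalFields(List<FieldNode> roots) {
        List<FieldNode> result = new ArrayList<>();
        for (FieldNode root : roots) {
            collect(root, result);
        }
        return result;
    }

    private static void collect(FieldNode field, List<FieldNode> result) {
        if (field.isTerminal()) {
            result.add(field);
            return;
        }
        // non-terminal field: descend into its children
        for (FieldNode kid : field.kids) {
            collect(kid, result);
        }
    }
}
```

 With such a helper, callers would ask once for all terminal fields instead of repeating the root/non-terminal/children loop themselves.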






[jira] [Commented] (PDFBOX-2631) Single radio-button group has no children

2015-01-29 Thread Gilad Denneboom (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-2631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14296760#comment-14296760
 ] 

Gilad Denneboom commented on PDFBOX-2631:
-

If you consider this to be a correct implementation, then yes, it can be closed.




[jira] [Closed] (PDFBOX-2631) Single radio-button group has no children

2015-01-29 Thread Maruan Sahyoun (JIRA)

 [ 
https://issues.apache.org/jira/browse/PDFBOX-2631?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maruan Sahyoun closed PDFBOX-2631.
--
Resolution: Not a Problem

getKids() works as designed and in line with the PDF specification, as it is 
the PD model representation of the /Kids entry within the field dictionary.
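
As background, the PDF specification allows a field with a single widget annotation to merge the field and widget dictionaries into one, in which case there is no /Kids entry at all. A caller that wants "the widgets of a field" therefore has to handle both shapes. The following is a minimal sketch of that idea only: WidgetLookup and widgetsOf are hypothetical names, dictionaries are modelled as plain Maps rather than PDFBox objects, and the sketch assumes that any /Kids entries are widgets (in real documents they can also be child fields).

```java
import java.util.Collections;
import java.util.List;
import java.util.Map;

public class WidgetLookup {

    /**
     * Returns the widget dictionaries of a field dictionary modelled as a Map.
     * If a "Kids" list is present, it is returned as is; otherwise the field
     * dictionary itself is treated as the single, merged widget.
     */
    @SuppressWarnings("unchecked")
    public static List<Map<String, Object>> widgetsOf(Map<String, Object> field) {
        Object kids = field.get("Kids");
        if (kids instanceof List) {
            return (List<Map<String, Object>>) kids;
        }
        // no /Kids entry: field and widget share one dictionary
        return Collections.singletonList(field);
    }
}
```

A convenience accessor along these lines would always yield at least one widget for a radio-button group, while getKids() itself keeps mirroring the /Kids entry.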




[jira] [Commented] (PDFBOX-2634) Multiple text operations on multiple pages cause NPE in TTFSubsetter

2015-01-29 Thread Alex Nevidomsky (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-2634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14296824#comment-14296824
 ] 

Alex Nevidomsky commented on PDFBOX-2634:
-

Many thanks and my respect. I'll try it out shortly.




[jira] [Commented] (PDFBOX-2631) Single radio-button group has no children

2015-01-29 Thread Maruan Sahyoun (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-2631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14296832#comment-14296832
 ] 

Maruan Sahyoun commented on PDFBOX-2631:


API-wise this is correct, as getKids() is supposed to return the /Kids 
dictionary entry. What could be enhanced, though, is getting/setting the 
Widgets, as 

- currently getWidget() only returns a single Widget, but a field might have 
multiple
- adding a Widget to already existing ones could be simplified
- working with Widgets within RadioButtons could be enhanced to make it easier 
to check/uncheck a RadioButton option

I’ve created PDFBOX-2639 to gather input about enhancing the usage of PDFBox 
for forms creation and filling. Please feel free to add to that.






[jira] [Created] (PDFBOX-2642) NPE in PDCIDFontType0.getFontMatrix

2015-01-29 Thread Tilman Hausherr (JIRA)
Tilman Hausherr created PDFBOX-2642:
---

 Summary: NPE in PDCIDFontType0.getFontMatrix
 Key: PDFBOX-2642
 URL: https://issues.apache.org/jira/browse/PDFBOX-2642
 Project: PDFBox
  Issue Type: Bug
  Components: FontBox
Affects Versions: 2.0.0
Reporter: Tilman Hausherr
 Attachments: PDFBOX-2642-277053-p3.pdf

{code}
java.lang.NullPointerException
    at org.apache.pdfbox.pdmodel.font.PDCIDFontType0.getFontMatrix(PDCIDFontType0.java:169)
    at org.apache.pdfbox.pdmodel.font.PDCIDFontType0.<init>(PDCIDFontType0.java:153)
    at org.apache.pdfbox.pdmodel.font.PDFontFactory.createDescendantFont(PDFontFactory.java:121)
    at org.apache.pdfbox.pdmodel.font.PDType0Font.<init>(PDType0Font.java:95)
    at org.apache.pdfbox.pdmodel.font.PDFontFactory.createFont(PDFontFactory.java:83)
{code}






[jira] [Updated] (PDFBOX-2642) NPE in PDCIDFontType0.getFontMatrix

2015-01-29 Thread Tilman Hausherr (JIRA)

 [ 
https://issues.apache.org/jira/browse/PDFBOX-2642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tilman Hausherr updated PDFBOX-2642:

Attachment: PDFBOX-2642-277053-p3.pdf




[jira] [Commented] (PDFBOX-2576) Improve code quality

2015-01-29 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-2576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14297267#comment-14297267
 ] 

ASF subversion and git services commented on PDFBOX-2576:
-

Commit 1655776 from [~msahyoun] in branch 'pdfbox/trunk'
[ https://svn.apache.org/r1655776 ]

PDFBOX-2576 remove duplicate code

 Improve code quality
 

 Key: PDFBOX-2576
 URL: https://issues.apache.org/jira/browse/PDFBOX-2576
 Project: PDFBox
  Issue Type: Task
Affects Versions: 2.0.0
Reporter: Tilman Hausherr

 This is a long-term issue for the task of improving code quality by using the 
 [SonarQube 
 report|https://analysis.apache.org/dashboard/index/org.apache.pdfbox:pdfbox-reactor],
  hints in different IDEs, the FindBugs tool and other code quality tools.






[jira] [Created] (PDFBOX-2641) ArrayIndexOutOfBoundsException in PDType1Font constructor

2015-01-29 Thread Tilman Hausherr (JIRA)
Tilman Hausherr created PDFBOX-2641:
---

 Summary: ArrayIndexOutOfBoundsException in PDType1Font constructor
 Key: PDFBOX-2641
 URL: https://issues.apache.org/jira/browse/PDFBOX-2641
 Project: PDFBox
  Issue Type: Bug
  Components: FontBox
Affects Versions: 2.0.0
Reporter: Tilman Hausherr


{code}
java.lang.ArrayIndexOutOfBoundsException: 0
    at org.apache.pdfbox.pdmodel.font.PDType1Font.<init>(PDType1Font.java:168)
    at org.apache.pdfbox.pdmodel.font.PDFontFactory.createFont(PDFontFactory.java:62)
{code}






[jira] [Updated] (PDFBOX-2641) ArrayIndexOutOfBoundsException in PDType1Font constructor

2015-01-29 Thread Tilman Hausherr (JIRA)

 [ 
https://issues.apache.org/jira/browse/PDFBOX-2641?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tilman Hausherr updated PDFBOX-2641:

Attachment: PDFBOX-2641-168002-p2.pdf

 ArrayIndexOutOfBoundsException in PDType1Font constructor
 -

 Key: PDFBOX-2641
 URL: https://issues.apache.org/jira/browse/PDFBOX-2641
 Project: PDFBox
  Issue Type: Bug
  Components: FontBox
Affects Versions: 2.0.0
Reporter: Tilman Hausherr
 Attachments: PDFBOX-2641-168002-p2.pdf


 {code}
 java.lang.ArrayIndexOutOfBoundsException: 0
    at org.apache.pdfbox.pdmodel.font.PDType1Font.<init>(PDType1Font.java:168)
    at org.apache.pdfbox.pdmodel.font.PDFontFactory.createFont(PDFontFactory.java:62)
 {code}






[jira] [Updated] (PDFBOX-2643) XMP type violation in stRef:instanceID not reported by preflight

2015-01-29 Thread Tilman Hausherr (JIRA)

 [ 
https://issues.apache.org/jira/browse/PDFBOX-2643?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tilman Hausherr updated PDFBOX-2643:

Description: 
In the Bavaria test suite, PDFLib claims that the attached file is not a valid 
PDF/A-1b file, because "Property stRef:instanceID in document XMP requires 
scheme identifier" or "XMP type violation in stRef:instanceID" (they make both 
claims in Bavaria.xml).
{code}
<rdf:Description rdf:about=""
    xmlns:xmpMM="http://ns.adobe.com/xap/1.0/mm/"
    xmlns:stRef="http://ns.adobe.com/xap/1.0/sType/ResourceRef#">
  <xmpMM:InstanceID>uuid:b429d411-e628-45ca-b932-d2c77fbe6cd3</xmpMM:InstanceID>
  <xmpMM:DocumentID>adobe:docid:indd:db084a4d-dbb2-11dc-ac34-beb3cc4028ec</xmpMM:DocumentID>
  <xmpMM:RenditionClass>proof:pdf</xmpMM:RenditionClass>
  <xmpMM:DerivedFrom rdf:parseType="Resource">
    <stRef:instanceID>6544a661-c065-11dc-854c-dd4f35453e8b</stRef:instanceID>
    <stRef:documentID>adobe:docid:indd:fa7c6589-9f4a-11dc-9641-af983df728d7</stRef:documentID>
  </xmpMM:DerivedFrom>
</rdf:Description>
{code}
PDF-Tools considers the file to be correct. But according to 
http://www.pdflib.com/fileadmin/pdflib/pdf/pdfa/2009-05-04-Bavaria-report-on-PDFA-validation-accuracy.pdf
 they don't raise the correct alarm for XMP violations. The PDFLib xmp checker 
also considers the XMP to be correct.

6544a661-c065-11dc-854c-dd4f35453e8b does not look like a valid URI to me 
although the regex mentioned at http://tools.ietf.org/html/rfc3986#appendix-B 
thinks it is.
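
To make the point about the RFC 3986 regex concrete, here is a minimal sketch that applies the Appendix B expression and `java.net.URI` to the two identifier values from the XMP above. The class and method names (`InstanceIdUriCheck`, `schemeOf`, `isAbsoluteUri`) are made up for illustration and are not part of PDFBox or preflight. Because every component in the Appendix B regex is optional, it matches any string; the useful question is whether the scheme group is populated.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class InstanceIdUriCheck {
    // Regular expression from RFC 3986, Appendix B. All components are
    // optional, so it matches any input; group 2 holds the scheme if present.
    private static final Pattern RFC3986 = Pattern.compile(
            "^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\\?([^?#]*))?(#(.*))?");

    /** Returns the scheme (group 2 of the Appendix B regex), or null if there is none. */
    public static String schemeOf(String value) {
        Matcher m = RFC3986.matcher(value);
        return m.matches() ? m.group(2) : null;
    }

    /** Returns true if java.net.URI parses the value and finds a scheme. */
    public static boolean isAbsoluteUri(String value) {
        try {
            return new URI(value).isAbsolute();
        } catch (URISyntaxException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Values copied from stRef:instanceID and xmpMM:InstanceID in the XMP above.
        String bare = "6544a661-c065-11dc-854c-dd4f35453e8b";
        String prefixed = "uuid:b429d411-e628-45ca-b932-d2c77fbe6cd3";
        System.out.println(bare + " -> scheme=" + schemeOf(bare)
                + ", absolute=" + isAbsoluteUri(bare));
        System.out.println(prefixed + " -> scheme=" + schemeOf(prefixed)
                + ", absolute=" + isAbsoluteUri(prefixed));
    }
}
```

Under this reading the bare value is accepted only as a relative reference without a scheme, which may be what the "requires scheme identifier" message is about. Whether XMP or PDF/A-1b actually requires an absolute URI here is not settled by this sketch.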

[~msahyoun] what do you get for that file? The Bavaria Testsuite is already 5 
years old, so maybe Adobe/Callas have improved their product.

(Another unreported error for that file is "xapGImg:height for xmp:Thumbnails 
in document XMP does not match the actual base64-encoded image data".)

  was:
In the Bavaria test suite, PDFLib claims that the attached file is not a valid 
PDF/A-1b file, because Property stRef:instanceID in document XMP requires 
scheme identifier or XMP type violation in stRef:instanceID (They make both 
claims in Bavaria.xml).
{code}
<rdf:Description rdf:about=""
    xmlns:xmpMM="http://ns.adobe.com/xap/1.0/mm/"
    xmlns:stRef="http://ns.adobe.com/xap/1.0/sType/ResourceRef#">
  <xmpMM:InstanceID>uuid:b429d411-e628-45ca-b932-d2c77fbe6cd3</xmpMM:InstanceID>
  <xmpMM:DocumentID>adobe:docid:indd:db084a4d-dbb2-11dc-ac34-beb3cc4028ec</xmpMM:DocumentID>
  <xmpMM:RenditionClass>proof:pdf</xmpMM:RenditionClass>
  <xmpMM:DerivedFrom rdf:parseType="Resource">
    <stRef:instanceID>6544a661-c065-11dc-854c-dd4f35453e8b</stRef:instanceID>
    <stRef:documentID>adobe:docid:indd:fa7c6589-9f4a-11dc-9641-af983df728d7</stRef:documentID>
  </xmpMM:DerivedFrom>
</rdf:Description>
{code}
PDF-Tools considers the file to be correct. But according to 
http://www.pdflib.com/fileadmin/pdflib/pdf/pdfa/2009-05-04-Bavaria-report-on-PDFA-validation-accuracy.pdf
 they don't raise the correct alarm for XMP violations. The PDFLib xmp checker 
also considers the XMP to be correct.

6544a661-c065-11dc-854c-dd4f35453e8b does not look like a valid URI to me 
although the regex mentioned at http://tools.ietf.org/html/rfc3986#appendix-B 
thinks it is.

[~msahyoun]what do you get for that file? The Bavaria Testsuite is already 5 
years old, so maybe Adobe/Callas have improved their product.

(Another unreported error for that file is xapGImg:height for xmp:Thumbnails 
in document XMP does not match the actual base64-encoded image data)


 XMP type violation in stRef:instanceID not reported by preflight
 --

 Key: PDFBOX-2643
 URL: https://issues.apache.org/jira/browse/PDFBOX-2643
 Project: PDFBox
  Issue Type: Sub-task
  Components: Preflight
Affects Versions: 2.0.0
Reporter: Tilman Hausherr
 Attachments: PDFA_Conference_2009_nc.pdf


 In the Bavaria test suite, PDFLib claims that the attached file is not a 
 valid PDF/A-1b file, because Property stRef:instanceID in document XMP 
 requires scheme identifier or XMP type violation in stRef:instanceID (They 
 make both claims in Bavaria.xml).
 {code}
 <rdf:Description rdf:about=""
     xmlns:xmpMM="http://ns.adobe.com/xap/1.0/mm/"
     xmlns:stRef="http://ns.adobe.com/xap/1.0/sType/ResourceRef#">
   <xmpMM:InstanceID>uuid:b429d411-e628-45ca-b932-d2c77fbe6cd3</xmpMM:InstanceID>
   <xmpMM:DocumentID>adobe:docid:indd:db084a4d-dbb2-11dc-ac34-beb3cc4028ec</xmpMM:DocumentID>
   <xmpMM:RenditionClass>proof:pdf</xmpMM:RenditionClass>
   <xmpMM:DerivedFrom rdf:parseType="Resource">
     <stRef:instanceID>6544a661-c065-11dc-854c-dd4f35453e8b</stRef:instanceID>
     <stRef:documentID>adobe:docid:indd:fa7c6589-9f4a-11dc-9641-af983df728d7</stRef:documentID>
   </xmpMM:DerivedFrom>
 </rdf:Description>
 {code}
 PDF-Tools considers the file to be correct. But according to 
 

[jira] [Created] (PDFBOX-2643) XMP type violation in stRef:instanceID not reported by preflight

2015-01-29 Thread Tilman Hausherr (JIRA)
Tilman Hausherr created PDFBOX-2643:
---

 Summary: XMP type violation in stRef:instanceID not reported by 
preflight
 Key: PDFBOX-2643
 URL: https://issues.apache.org/jira/browse/PDFBOX-2643
 Project: PDFBox
  Issue Type: Sub-task
  Components: Preflight
Affects Versions: 2.0.0
Reporter: Tilman Hausherr


In the Bavaria test suite, PDFLib claims that the attached file is not a valid 
PDF/A-1b file, because "Property stRef:instanceID in document XMP requires 
scheme identifier" or "XMP type violation in stRef:instanceID" (they make both 
claims in Bavaria.xml).
{code}
<rdf:Description rdf:about=""
    xmlns:xmpMM="http://ns.adobe.com/xap/1.0/mm/"
    xmlns:stRef="http://ns.adobe.com/xap/1.0/sType/ResourceRef#">
  <xmpMM:InstanceID>uuid:b429d411-e628-45ca-b932-d2c77fbe6cd3</xmpMM:InstanceID>
  <xmpMM:DocumentID>adobe:docid:indd:db084a4d-dbb2-11dc-ac34-beb3cc4028ec</xmpMM:DocumentID>
  <xmpMM:RenditionClass>proof:pdf</xmpMM:RenditionClass>
  <xmpMM:DerivedFrom rdf:parseType="Resource">
    <stRef:instanceID>6544a661-c065-11dc-854c-dd4f35453e8b</stRef:instanceID>
    <stRef:documentID>adobe:docid:indd:fa7c6589-9f4a-11dc-9641-af983df728d7</stRef:documentID>
  </xmpMM:DerivedFrom>
</rdf:Description>
{code}
PDF-Tools considers the file to be correct. But according to 
http://www.pdflib.com/fileadmin/pdflib/pdf/pdfa/2009-05-04-Bavaria-report-on-PDFA-validation-accuracy.pdf
 they don't raise the correct alarm for XMP violations. The PDFLib xmp checker 
also considers the XMP to be correct.

6544a661-c065-11dc-854c-dd4f35453e8b does not look like a valid URI to me 
although the regex mentioned at http://tools.ietf.org/html/rfc3986#appendix-B 
thinks it is.

[~msahyoun]what do you get for that file? The Bavaria Testsuite is already 5 
years old, so maybe Adobe/Callas have improved their product.

(Another unreported error for that file is "xapGImg:height for xmp:Thumbnails 
in document XMP does not match the actual base64-encoded image data".)






[jira] [Updated] (PDFBOX-2643) XMP type violation in stRef:instanceID not reported by preflight

2015-01-29 Thread Tilman Hausherr (JIRA)

 [ 
https://issues.apache.org/jira/browse/PDFBOX-2643?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tilman Hausherr updated PDFBOX-2643:

Attachment: PDFA_Conference_2009_nc.pdf

 XMP type violation in stRef:instanceID not reported by preflight
 --

 Key: PDFBOX-2643
 URL: https://issues.apache.org/jira/browse/PDFBOX-2643
 Project: PDFBox
  Issue Type: Sub-task
  Components: Preflight
Affects Versions: 2.0.0
Reporter: Tilman Hausherr
 Attachments: PDFA_Conference_2009_nc.pdf


 In the Bavaria test suite, PDFLib claims that the attached file is not a 
 valid PDF/A-1b file, because Property stRef:instanceID in document XMP 
 requires scheme identifier or XMP type violation in stRef:instanceID (They 
 make both claims in Bavaria.xml).
 {code}
 <rdf:Description rdf:about=""
     xmlns:xmpMM="http://ns.adobe.com/xap/1.0/mm/"
     xmlns:stRef="http://ns.adobe.com/xap/1.0/sType/ResourceRef#">
   <xmpMM:InstanceID>uuid:b429d411-e628-45ca-b932-d2c77fbe6cd3</xmpMM:InstanceID>
   <xmpMM:DocumentID>adobe:docid:indd:db084a4d-dbb2-11dc-ac34-beb3cc4028ec</xmpMM:DocumentID>
   <xmpMM:RenditionClass>proof:pdf</xmpMM:RenditionClass>
   <xmpMM:DerivedFrom rdf:parseType="Resource">
     <stRef:instanceID>6544a661-c065-11dc-854c-dd4f35453e8b</stRef:instanceID>
     <stRef:documentID>adobe:docid:indd:fa7c6589-9f4a-11dc-9641-af983df728d7</stRef:documentID>
   </xmpMM:DerivedFrom>
 </rdf:Description>
 {code}
 PDF-Tools considers the file to be correct. But according to 
 http://www.pdflib.com/fileadmin/pdflib/pdf/pdfa/2009-05-04-Bavaria-report-on-PDFA-validation-accuracy.pdf
  they don't raise the correct alarm for XMP violations. The PDFLib xmp 
 checker also considers the XMP to be correct.
 6544a661-c065-11dc-854c-dd4f35453e8b does not look like a valid URI to me 
 although the regex mentioned at http://tools.ietf.org/html/rfc3986#appendix-B 
 thinks it is.
 [~msahyoun]what do you get for that file? The Bavaria Testsuite is already 5 
 years old, so maybe Adobe/Callas have improved their product.
 (Another unreported error for that file is xapGImg:height for xmp:Thumbnails 
 in document XMP does not match the actual base64-encoded image data)






[jira] [Commented] (PDFBOX-2576) Improve code quality

2015-01-29 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-2576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14297245#comment-14297245
 ] 

ASF subversion and git services commented on PDFBOX-2576:
-

Commit 1655772 from [~tilman] in branch 'pdfbox/trunk'
[ https://svn.apache.org/r1655772 ]

PDFBOX-2576: simplify code

 Improve code quality
 

 Key: PDFBOX-2576
 URL: https://issues.apache.org/jira/browse/PDFBOX-2576
 Project: PDFBox
  Issue Type: Task
Affects Versions: 2.0.0
Reporter: Tilman Hausherr

 This is a long-term issue for the task of improving code quality by using the 
 [SonarQube 
 report|https://analysis.apache.org/dashboard/index/org.apache.pdfbox:pdfbox-reactor],
  hints in different IDEs, the FindBugs tool and other code quality tools.






[jira] [Commented] (PDFBOX-2576) Improve code quality

2015-01-29 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-2576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14297261#comment-14297261
 ] 

ASF subversion and git services commented on PDFBOX-2576:
-

Commit 1655775 from [~tilman] in branch 'pdfbox/trunk'
[ https://svn.apache.org/r1655775 ]

PDFBOX-2576: remove double code; insert default to appease SonarQube

 Improve code quality
 

 Key: PDFBOX-2576
 URL: https://issues.apache.org/jira/browse/PDFBOX-2576
 Project: PDFBox
  Issue Type: Task
Affects Versions: 2.0.0
Reporter: Tilman Hausherr

 This is a long-term issue for the task of improving code quality by using the 
 [SonarQube 
 report|https://analysis.apache.org/dashboard/index/org.apache.pdfbox:pdfbox-reactor],
  hints in different IDEs, the FindBugs tool and other code quality tools.






[jira] [Commented] (PDFBOX-2576) Improve code quality

2015-01-29 Thread Maruan Sahyoun (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-2576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14297285#comment-14297285
 ] 

Maruan Sahyoun commented on PDFBOX-2576:


it might be clearer to have the break statement on a new line

so instead of
{code}
case 1: colorSpace = PDDeviceGray.INSTANCE; break;
{code}

maybe
{code}
case 1: colorSpace = PDDeviceGray.INSTANCE;
        break;
{code}
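
As a fuller illustration of the layout being suggested, here is a small self-contained sketch with one statement per line and a `default` branch. The `Space` enum and the `forComponents` method are hypothetical stand-ins rather than PDFBox classes, and the mapping for 3 and 4 components is an assumption, since only `case 1` appears in the comment above.

```java
public class SwitchLayoutSketch {
    /** Hypothetical stand-in for the color space singletons; not a PDFBox type. */
    public enum Space { GRAY, RGB, CMYK }

    /** Picks a color space from a component count, one statement per line. */
    public static Space forComponents(int numComponents) {
        Space colorSpace;
        switch (numComponents) {
            case 1:
                colorSpace = Space.GRAY;
                break;
            case 3: // assumed mapping, not taken from the thread
                colorSpace = Space.RGB;
                break;
            case 4: // assumed mapping, not taken from the thread
                colorSpace = Space.CMYK;
                break;
            default:
                // An explicit default keeps static analysers quiet and
                // makes unexpected values fail loudly.
                throw new IllegalArgumentException(
                        "Unsupported number of components: " + numComponents);
        }
        return colorSpace;
    }

    public static void main(String[] args) {
        System.out.println(forComponents(1));
    }
}
```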

 Improve code quality
 

 Key: PDFBOX-2576
 URL: https://issues.apache.org/jira/browse/PDFBOX-2576
 Project: PDFBox
  Issue Type: Task
Affects Versions: 2.0.0
Reporter: Tilman Hausherr

 This is a long-term issue for the task of improving code quality by using the 
 [SonarQube 
 report|https://analysis.apache.org/dashboard/index/org.apache.pdfbox:pdfbox-reactor],
  hints in different IDEs, the FindBugs tool and other code quality tools.






[jira] [Commented] (PDFBOX-2643) XMP type violation in stRef:instanceID not reported by preflight

2015-01-29 Thread Maruan Sahyoun (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-2643?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14297294#comment-14297294
 ] 

Maruan Sahyoun commented on PDFBOX-2643:


According to Acrobat the file is OK. Would you like me to cross check with the 
specification(s)?

 XMP type violation in stRef:instanceID not reported by preflight
 --

 Key: PDFBOX-2643
 URL: https://issues.apache.org/jira/browse/PDFBOX-2643
 Project: PDFBox
  Issue Type: Sub-task
  Components: Preflight
Affects Versions: 2.0.0
Reporter: Tilman Hausherr
 Attachments: PDFA_Conference_2009_nc.pdf


 In the Bavaria test suite, PDFLib claims that the attached file is not a 
 valid PDF/A-1b file, because Property stRef:instanceID in document XMP 
 requires scheme identifier or XMP type violation in stRef:instanceID (They 
 make both claims in Bavaria.xml).
 {code}
 <rdf:Description rdf:about=""
     xmlns:xmpMM="http://ns.adobe.com/xap/1.0/mm/"
     xmlns:stRef="http://ns.adobe.com/xap/1.0/sType/ResourceRef#">
   <xmpMM:InstanceID>uuid:b429d411-e628-45ca-b932-d2c77fbe6cd3</xmpMM:InstanceID>
   <xmpMM:DocumentID>adobe:docid:indd:db084a4d-dbb2-11dc-ac34-beb3cc4028ec</xmpMM:DocumentID>
   <xmpMM:RenditionClass>proof:pdf</xmpMM:RenditionClass>
   <xmpMM:DerivedFrom rdf:parseType="Resource">
     <stRef:instanceID>6544a661-c065-11dc-854c-dd4f35453e8b</stRef:instanceID>
     <stRef:documentID>adobe:docid:indd:fa7c6589-9f4a-11dc-9641-af983df728d7</stRef:documentID>
   </xmpMM:DerivedFrom>
 </rdf:Description>
 {code}
 PDF-Tools considers the file to be correct. But according to 
 http://www.pdflib.com/fileadmin/pdflib/pdf/pdfa/2009-05-04-Bavaria-report-on-PDFA-validation-accuracy.pdf
  they don't raise the correct alarm for XMP violations. The PDFLib xmp 
 checker also considers the XMP to be correct.
 6544a661-c065-11dc-854c-dd4f35453e8b does not look like a valid URI to me 
 although the regex mentioned at http://tools.ietf.org/html/rfc3986#appendix-B 
 thinks it is.
 [~msahyoun] what do you get for that file? The Bavaria Testsuite is already 5 
 years old, so maybe Adobe/Callas have improved their product.
 (Another unreported error for that file is xapGImg:height for xmp:Thumbnails 
 in document XMP does not match the actual base64-encoded image data)






[jira] [Commented] (PDFBOX-2640) Fields within a fields kids entry are not correctly recognized

2015-01-29 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-2640?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14296888#comment-14296888
 ] 

ASF subversion and git services commented on PDFBOX-2640:
-

Commit 1655672 from [~msahyoun] in branch 'pdfbox/trunk'
[ https://svn.apache.org/r1655672 ]

PDFBOX-2640 correct resolving /Kids entries

 Fields within a fields kids entry are not correctly recognized 
 ---

 Key: PDFBOX-2640
 URL: https://issues.apache.org/jira/browse/PDFBOX-2640
 Project: PDFBox
  Issue Type: Bug
  Components: AcroForm
Affects Versions: 2.0.0
Reporter: Maruan Sahyoun
Assignee: Maruan Sahyoun
 Fix For: 2.0.0


 From the users mailing list:
 I'm using the latest PDFBox 2.0.0 snapshots to fill out a form. Processing was 
 working fine for both AcroForm fields and XFA, but recent changes 
 (probably sometime between Jan 22 and Jan 25) mean that the fields in a form are 
 no longer discovered.
 The file I'm using is available here: http://www.msmt.cz/file/34489_1_1/
 When I run PrintFields from examples - 
 http://svn.apache.org/viewvc/pdfbox/trunk/examples/src/main/java/org/apache/pdfbox/examples/fdf/PrintFields.java?view=markup
  - on the file, the output I get is:
 1 top-level fields were found on the form
 |--topmostSubform[0]
 But when I use an older version of the 2.0.0 branch, all fields are printed:
 1 top-level fields were found on the form
 |--topmostSubform[0]
 |  |--topmostSubform[0].Page1[0]
 |  |  |--topmostSubform[0].Page1[0]._01_Subtitul[0], type=org.apache.pdfbox.pdmodel.interactive.form.PDComboBox
 |  |  |--topmostSubform[0].Page1[0]._02_Registrovany_Nazev[0], type=org.apache.pdfbox.pdmodel.interactive.form.PDTextField
 |  |  |--topmostSubform[0].Page1[0]._03_Ulice[0], type=org.apache.pdfbox.pdmodel.interactive.form.PDTextField
 |  |  |--topmostSubform[0].Page1[0]._04_Mesto[0], type=org.apache.pdfbox.pdmodel.interactive.form.PDTextField
 |  |  |--topmostSubform[0].Page1[0]._05_PSC[0], type=org.apache.pdfbox.pdmodel.interactive.form.PDTextField
 ...
 The problem seems to be at the line:
 List<COSObjectable> kids = field.getKids();
 The list is empty for the top-level field. Does anyone have a clue what could be 
 wrong?
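
The tree output quoted above comes from a recursive walk over each field's kids. The following is a minimal, self-contained sketch of that kind of traversal; `Node` is a hypothetical stand-in for a form field and none of this is the actual PrintFields or PDFBox code. It shows why an empty kids list at the top level yields only the single `|--topmostSubform[0]` line.

```java
import java.util.ArrayList;
import java.util.List;

public class FieldTreeSketch {
    /** Hypothetical stand-in for a form field; not a PDFBox class. */
    public static final class Node {
        final String partialName;
        final List<Node> kids = new ArrayList<>();

        public Node(String partialName) {
            this.partialName = partialName;
        }

        /** Adds a child and returns this node so calls can be chained. */
        public Node add(Node kid) {
            kids.add(kid);
            return this;
        }
    }

    /** Appends one line for this field, then descends into its kids. */
    private static void print(Node node, String prefix, String parentName, StringBuilder out) {
        String fullName = parentName == null
                ? node.partialName
                : parentName + "." + node.partialName;
        out.append(prefix).append("|--").append(fullName).append('\n');
        for (Node kid : node.kids) {
            print(kid, prefix + "|  ", fullName, out);
        }
    }

    /** Renders the whole tree below the given top-level field. */
    public static String render(Node root) {
        StringBuilder out = new StringBuilder();
        print(root, "", null, out);
        return out.toString();
    }

    public static void main(String[] args) {
        Node root = new Node("topmostSubform[0]")
                .add(new Node("Page1[0]")
                        .add(new Node("_03_Ulice[0]")));
        System.out.print(render(root));
    }
}
```

If the kids of the top-level node are not resolved, the loop body never runs and nothing below the first line is printed, which matches the symptom described.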






[jira] [Commented] (PDFBOX-2634) Multiple text operations on multiple pages cause NPE in TTFSubsetter

2015-01-29 Thread Alex Nevidomsky (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-2634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14297332#comment-14297332
 ] 

Alex Nevidomsky commented on PDFBOX-2634:
-

Perfect! Thank you :)
[~jahewson] I guess one small thing that people may bump into later is spaces: 
U+00A0, U+2008 and other spaces result in
{noformat}
java.lang.IllegalArgumentException: No glyph for U+2008 in font LiberationMono
    at org.apache.pdfbox.pdmodel.font.PDCIDFontType2.encode(PDCIDFontType2.java:410)
    at org.apache.pdfbox.pdmodel.font.PDType0Font.encode(PDType0Font.java:284)
    at org.apache.pdfbox.pdmodel.font.PDFont.encode(PDFont.java:267)
{noformat}
I guess spaces are not part of the glyphs.
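
One possible caller-side workaround, sketched under the assumption that a plain U+0020 space is an acceptable substitute, is to normalise unusual space characters before passing text to `showText`. `SpaceNormalizer` and `normalizeSpaces` are hypothetical names, not part of PDFBox, and the sketch uses only the JDK. It loses the special width or non-breaking behaviour of the original characters, so it is not a general fix.

```java
public class SpaceNormalizer {
    /**
     * Replaces every Unicode space separator (general category Zs, which
     * includes U+00A0 and U+2008) with a plain U+0020 space.
     * Hypothetical helper for illustration; not a PDFBox API.
     */
    public static String normalizeSpaces(String text) {
        StringBuilder sb = new StringBuilder(text.length());
        text.codePoints().forEach(cp -> {
            if (Character.getType(cp) == Character.SPACE_SEPARATOR) {
                sb.append(' ');
            } else {
                sb.appendCodePoint(cp);
            }
        });
        return sb.toString();
    }

    public static void main(String[] args) {
        // U+00A0 NO-BREAK SPACE and U+2008 PUNCTUATION SPACE both become U+0020.
        String input = "a\u00A0b\u2008c";
        System.out.println(normalizeSpaces(input));
    }
}
```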

 Multiple text operations on multiple pages cause NPE in TTFSubsetter
 

 Key: PDFBOX-2634
 URL: https://issues.apache.org/jira/browse/PDFBOX-2634
 Project: PDFBox
  Issue Type: Bug
  Components: FontBox
Affects Versions: 2.0.0
Reporter: Alex Nevidomsky
Assignee: John Hewson
 Fix For: 2.0.0


 Problem seems to be of the same nature as in PDFBOX-2605, in a slightly 
 different scenario.
 {code:title=NullPTest.java}
 import org.apache.pdfbox.pdmodel.PDDocument;
 import org.apache.pdfbox.pdmodel.PDPage;
 import org.apache.pdfbox.pdmodel.common.PDRectangle;
 import org.apache.pdfbox.pdmodel.edit.PDPageContentStream;
 import org.apache.pdfbox.pdmodel.font.PDFont;
 import org.apache.pdfbox.pdmodel.font.PDType0Font;
 import org.junit.Test;

 public class NullPTest {
     @Test
     public void testMultipageUnicodePDF() throws Exception {
         PDDocument document = new PDDocument();
         PDFont titleFont = PDType0Font.load(document,
                 this.getClass().getResourceAsStream("/Arial Unicode.ttf"));
         PDPage page = new PDPage(PDRectangle.A4);
         document.addPage(page);
         PDPageContentStream contentStream = new PDPageContentStream(document, page);
         contentStream.beginText();
         contentStream.setFont(titleFont, 12);
         contentStream.newLineAtOffset(0, 100);
         contentStream.showText("Pěkný žluťoučký kůň úpěl ďábelské ódy");
         contentStream.endText();
         contentStream.close();

         page = new PDPage(PDRectangle.A4);
         document.addPage(page);
         contentStream = new PDPageContentStream(document, page);
         contentStream.beginText();
         contentStream.setFont(titleFont, 12);
         contentStream.newLineAtOffset(0, 200);
         contentStream.showText("Pěkný žluťoučký kůň úpěl ďábelské ódy");
         contentStream.endText();
         contentStream.close();
         document.close();
     }
 }
 {code}






[jira] [Commented] (PDFBOX-2576) Improve code quality

2015-01-29 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-2576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14297333#comment-14297333
 ] 

ASF subversion and git services commented on PDFBOX-2576:
-

Commit 1655798 from [~tilman] in branch 'pdfbox/trunk'
[ https://svn.apache.org/r1655798 ]

PDFBOX-2576: reformat

 Improve code quality
 

 Key: PDFBOX-2576
 URL: https://issues.apache.org/jira/browse/PDFBOX-2576
 Project: PDFBox
  Issue Type: Task
Affects Versions: 2.0.0
Reporter: Tilman Hausherr

 This is a long-term issue for the task of improving code quality by using the 
 [SonarQube 
 report|https://analysis.apache.org/dashboard/index/org.apache.pdfbox:pdfbox-reactor],
  hints in different IDEs, the FindBugs tool and other code quality tools.






[jira] [Assigned] (PDFBOX-2643) XMP type violation in stRef:instanceID not reported by preflight

2015-01-29 Thread Maruan Sahyoun (JIRA)

 [ 
https://issues.apache.org/jira/browse/PDFBOX-2643?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maruan Sahyoun reassigned PDFBOX-2643:
--

Assignee: Maruan Sahyoun

 XMP type violation in stRef:instanceID not reported by preflight
 --

 Key: PDFBOX-2643
 URL: https://issues.apache.org/jira/browse/PDFBOX-2643
 Project: PDFBox
  Issue Type: Sub-task
  Components: Preflight
Affects Versions: 2.0.0
Reporter: Tilman Hausherr
Assignee: Maruan Sahyoun
 Attachments: PDFA_Conference_2009_nc.pdf


 In the Bavaria test suite, PDFLib claims that the attached file is not a 
 valid PDF/A-1b file, because Property stRef:instanceID in document XMP 
 requires scheme identifier or XMP type violation in stRef:instanceID (They 
 make both claims in Bavaria.xml).
 {code}
 <rdf:Description rdf:about=""
     xmlns:xmpMM="http://ns.adobe.com/xap/1.0/mm/"
     xmlns:stRef="http://ns.adobe.com/xap/1.0/sType/ResourceRef#">
   <xmpMM:InstanceID>uuid:b429d411-e628-45ca-b932-d2c77fbe6cd3</xmpMM:InstanceID>
   <xmpMM:DocumentID>adobe:docid:indd:db084a4d-dbb2-11dc-ac34-beb3cc4028ec</xmpMM:DocumentID>
   <xmpMM:RenditionClass>proof:pdf</xmpMM:RenditionClass>
   <xmpMM:DerivedFrom rdf:parseType="Resource">
     <stRef:instanceID>6544a661-c065-11dc-854c-dd4f35453e8b</stRef:instanceID>
     <stRef:documentID>adobe:docid:indd:fa7c6589-9f4a-11dc-9641-af983df728d7</stRef:documentID>
   </xmpMM:DerivedFrom>
 </rdf:Description>
 {code}
 PDF-Tools considers the file to be correct. But according to 
 http://www.pdflib.com/fileadmin/pdflib/pdf/pdfa/2009-05-04-Bavaria-report-on-PDFA-validation-accuracy.pdf
  they don't raise the correct alarm for XMP violations. The PDFLib xmp 
 checker also considers the XMP to be correct.
 6544a661-c065-11dc-854c-dd4f35453e8b does not look like a valid URI to me 
 although the regex mentioned at http://tools.ietf.org/html/rfc3986#appendix-B 
 thinks it is.
 [~msahyoun] what do you get for that file? The Bavaria Testsuite is already 5 
 years old, so maybe Adobe/Callas have improved their product.
 (Another unreported error for that file is xapGImg:height for xmp:Thumbnails 
 in document XMP does not match the actual base64-encoded image data)






[jira] [Updated] (PDFBOX-2643) XMP type violation in stRef:instanceID not reported by preflight

2015-01-29 Thread Maruan Sahyoun (JIRA)

 [ 
https://issues.apache.org/jira/browse/PDFBOX-2643?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maruan Sahyoun updated PDFBOX-2643:
---
Assignee: (was: Maruan Sahyoun)

 XMP type violation in stRef:instanceID not reported by preflight
 --

 Key: PDFBOX-2643
 URL: https://issues.apache.org/jira/browse/PDFBOX-2643
 Project: PDFBox
  Issue Type: Sub-task
  Components: Preflight
Affects Versions: 2.0.0
Reporter: Tilman Hausherr
 Attachments: PDFA_Conference_2009_nc.pdf


 In the Bavaria test suite, PDFLib claims that the attached file is not a 
 valid PDF/A-1b file, because Property stRef:instanceID in document XMP 
 requires scheme identifier or XMP type violation in stRef:instanceID (They 
 make both claims in Bavaria.xml).
 {code}
 <rdf:Description rdf:about=""
     xmlns:xmpMM="http://ns.adobe.com/xap/1.0/mm/"
     xmlns:stRef="http://ns.adobe.com/xap/1.0/sType/ResourceRef#">
   <xmpMM:InstanceID>uuid:b429d411-e628-45ca-b932-d2c77fbe6cd3</xmpMM:InstanceID>
   <xmpMM:DocumentID>adobe:docid:indd:db084a4d-dbb2-11dc-ac34-beb3cc4028ec</xmpMM:DocumentID>
   <xmpMM:RenditionClass>proof:pdf</xmpMM:RenditionClass>
   <xmpMM:DerivedFrom rdf:parseType="Resource">
     <stRef:instanceID>6544a661-c065-11dc-854c-dd4f35453e8b</stRef:instanceID>
     <stRef:documentID>adobe:docid:indd:fa7c6589-9f4a-11dc-9641-af983df728d7</stRef:documentID>
   </xmpMM:DerivedFrom>
 </rdf:Description>
 {code}
 PDF-Tools considers the file to be correct. But according to 
 http://www.pdflib.com/fileadmin/pdflib/pdf/pdfa/2009-05-04-Bavaria-report-on-PDFA-validation-accuracy.pdf
  they don't raise the correct alarm for XMP violations. The PDFLib xmp 
 checker also considers the XMP to be correct.
 6544a661-c065-11dc-854c-dd4f35453e8b does not look like a valid URI to me 
 although the regex mentioned at http://tools.ietf.org/html/rfc3986#appendix-B 
 thinks it is.
 [~msahyoun] what do you get for that file? The Bavaria Testsuite is already 5 
 years old, so maybe Adobe/Callas have improved their product.
 (Another unreported error for that file is xapGImg:height for xmp:Thumbnails 
 in document XMP does not match the actual base64-encoded image data)






[jira] [Commented] (PDFBOX-2643) XMP type violation in stRef:instanceID not reported by preflight

2015-01-29 Thread Tilman Hausherr (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-2643?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14297324#comment-14297324
 ] 

Tilman Hausherr commented on PDFBOX-2643:
-

Sure... Maybe I missed something. I did look at the XMP specification for the 
content of InstanceID.

 XMP type violation in stRef:instanceID not reported by preflight
 --

 Key: PDFBOX-2643
 URL: https://issues.apache.org/jira/browse/PDFBOX-2643
 Project: PDFBox
  Issue Type: Sub-task
  Components: Preflight
Affects Versions: 2.0.0
Reporter: Tilman Hausherr
 Attachments: PDFA_Conference_2009_nc.pdf


 In the Bavaria test suite, PDFLib claims that the attached file is not a 
 valid PDF/A-1b file, because Property stRef:instanceID in document XMP 
 requires scheme identifier or XMP type violation in stRef:instanceID (They 
 make both claims in Bavaria.xml).
 {code}
 <rdf:Description rdf:about=""
     xmlns:xmpMM="http://ns.adobe.com/xap/1.0/mm/"
     xmlns:stRef="http://ns.adobe.com/xap/1.0/sType/ResourceRef#">
   <xmpMM:InstanceID>uuid:b429d411-e628-45ca-b932-d2c77fbe6cd3</xmpMM:InstanceID>
   <xmpMM:DocumentID>adobe:docid:indd:db084a4d-dbb2-11dc-ac34-beb3cc4028ec</xmpMM:DocumentID>
   <xmpMM:RenditionClass>proof:pdf</xmpMM:RenditionClass>
   <xmpMM:DerivedFrom rdf:parseType="Resource">
     <stRef:instanceID>6544a661-c065-11dc-854c-dd4f35453e8b</stRef:instanceID>
     <stRef:documentID>adobe:docid:indd:fa7c6589-9f4a-11dc-9641-af983df728d7</stRef:documentID>
   </xmpMM:DerivedFrom>
 </rdf:Description>
 {code}
 PDF-Tools considers the file to be correct. But according to 
 http://www.pdflib.com/fileadmin/pdflib/pdf/pdfa/2009-05-04-Bavaria-report-on-PDFA-validation-accuracy.pdf
  they don't raise the correct alarm for XMP violations. The PDFLib xmp 
 checker also considers the XMP to be correct.
 6544a661-c065-11dc-854c-dd4f35453e8b does not look like a valid URI to me 
 although the regex mentioned at http://tools.ietf.org/html/rfc3986#appendix-B 
 thinks it is.
 [~msahyoun] what do you get for that file? The Bavaria Testsuite is already 5 
 years old, so maybe Adobe/Callas have improved their product.
 (Another unreported error for that file is xapGImg:height for xmp:Thumbnails 
 in document XMP does not match the actual base64-encoded image data)






[jira] [Commented] (PDFBOX-2643) XMP type violation in stRef:instanceID not reported by preflight

2015-01-29 Thread Maruan Sahyoun (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-2643?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14297383#comment-14297383
 ] 

Maruan Sahyoun commented on PDFBOX-2643:


IMHO the XML is valid, as 

- xmpMM refers to the XMP Media Management Schema, which is predefined in XMP as 
of January 2004
- xmpMM:DerivedFrom is a property described in the schema as a ResourceRef, where 
not all entries are required
- stRef:instanceID and stRef:documentID are valid properties of 
xmpMM:DerivedFrom
- stRef:instanceID and stRef:documentID both describe a URI for which the 
content is valid



[jira] [Comment Edited] (PDFBOX-2643) XMP type violation in stRef:instanceID not reported by preflight

2015-01-29 Thread Maruan Sahyoun (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-2643?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14297383#comment-14297383
 ] 

Maruan Sahyoun edited comment on PDFBOX-2643 at 1/29/15 7:16 PM:
-

IMHO the XML is valid, as 

- xmpMM refers to the XMP Media Management Schema, which is predefined in XMP as 
of January 2004
- xmpMM:DerivedFrom is a property described in the schema as a ResourceRef, 
where not all entries are required
- stRef:instanceID and stRef:documentID are valid properties of 
xmpMM:DerivedFrom
- stRef:instanceID and stRef:documentID both describe a URI for which the 
content is valid


was (Author: msahyoun):
IMHO the xml is valid, as 

- xmpMM refers to the XMP Media Management Schema which is predefined in XMP as 
of January 2004
- xmpMM:DerivedFrom is a property describe in the schema as a ResourceRef where 
not all entries are required
- stRef:instanceID and stRef:documentID are valid properties of 
xmlmm:DerivedFrom
- stRef:instanceID and stRef:documentID both describing a URI for which the 
content is valid



[jira] [Commented] (PDFBOX-2643) XMP type violation in stRef:instanceID not reported by preflight

2015-01-29 Thread Maruan Sahyoun (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-2643?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14297415#comment-14297415
 ] 

Maruan Sahyoun commented on PDFBOX-2643:


An addition to my note above.

I also looked at the other properties, and xmpMM:InstanceID is NOT defined in 
the XMP specification of January 2004, which would make the XMP invalid.

But an agreement has been reached:

{quote}
Driven by requests from important user groups, the community of PDF/A creation 
and validation tool vendors came to the conclusion of not flagging InstanceID 
as a PDF/A-1 violation. In some workflows InstanceID is highly useful for the 
following reasons ...
{quote}

which has made its way into Technical Corrigendum 2 of the PDF/A-1 
specification.



[jira] [Commented] (PDFBOX-2643) XMP type violation in stRef:instanceID not reported by preflight

2015-01-29 Thread Maruan Sahyoun (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-2643?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14297472#comment-14297472
 ] 

Maruan Sahyoun commented on PDFBOX-2643:


Finally, I looked up the requirement to have a scheme identifier for a URI, and 
I think that a URI without a scheme is valid. In the current case, 
6544a661-c065-11dc-854c-dd4f35453e8b is a relative path. I couldn’t find 
anything in the PDF/A or XMP spec requiring a scheme for a URI. For 
xmpMM:InstanceID it’s written that it should be based on a UUID, but as this is 
only a "should", it is not a hard requirement, and it is only mentioned for 
xmpMM:InstanceID and not for stRef:instanceID.
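
The scheme-less reading can be checked with the JDK's own parser. The following is only a sketch: UriSchemeCheck and its methods are hypothetical helper names, and java.net.URI follows RFC 2396 with some amendments rather than the XMP or PDF/A specifications, so it shows how a general-purpose URI parser classifies the values, not what preflight does.

```java
import java.net.URI;

class UriSchemeCheck {
    /** True if the value parses as a URI reference that carries a scheme. */
    static boolean hasScheme(String value) {
        // URI.create throws IllegalArgumentException if the value is not parseable at all.
        return URI.create(value).getScheme() != null;
    }

    /** Short human-readable classification of the value. */
    static String describe(String value) {
        URI uri = URI.create(value);
        if (uri.getScheme() == null) {
            return "relative reference, path=" + uri.getPath();
        }
        return "absolute URI, scheme=" + uri.getScheme();
    }

    public static void main(String[] args) {
        System.out.println(describe("6544a661-c065-11dc-854c-dd4f35453e8b"));
        System.out.println(describe("uuid:b429d411-e628-45ca-b932-d2c77fbe6cd3"));
    }
}
```

The stRef:instanceID value parses without error and comes back with no scheme and the whole string as its path, which matches the "relative path" interpretation above.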



[jira] [Commented] (PDFBOX-2643) XMP type violation in stRef:instanceID not reported by preflight

2015-01-29 Thread Maruan Sahyoun (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-2643?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14297737#comment-14297737
 ] 

Maruan Sahyoun commented on PDFBOX-2643:


Thought I'd add a little more analysis in case anyone revisits that resolution :-)



[jira] [Commented] (PDFBOX-2610) Expand Isartor test for Bavaria test suite and other tests

2015-01-29 Thread Maruan Sahyoun (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-2610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14297819#comment-14297819
 ] 

Maruan Sahyoun commented on PDFBOX-2610:


If we do it optionally I’d think we should ensure that the test is enabled on 
our build server. 

 Expand Isartor test for Bavaria test suite and other tests
 --

 Key: PDFBOX-2610
 URL: https://issues.apache.org/jira/browse/PDFBOX-2610
 Project: PDFBox
  Issue Type: Task
  Components: Preflight
Affects Versions: 2.0.0
Reporter: Tilman Hausherr
Assignee: Tilman Hausherr
  Labels: photoshop

 1) Expand the isartor test code so that it can also check conforming 
 documents, i.e. documents that should not bring any errors. Support JBIG2.
 2) Test the files from the Bavaria suite with preflight. I'll create 
 sub-issues on that one. I counted 16 where something doesn't work as intended.
 3) Include the Bavaria tests in the build. Only if we agree on this one. If 
 not, I'll just keep it for myself as an additional regression test.



[jira] [Closed] (PDFBOX-2643) XMP type violation in stRef:instanceID not reported by preflight

2015-01-29 Thread Tilman Hausherr (JIRA)

 [ 
https://issues.apache.org/jira/browse/PDFBOX-2643?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tilman Hausherr closed PDFBOX-2643.
---
Resolution: Not a Problem



[jira] [Commented] (PDFBOX-2643) XMP type violation in stRef:instanceID not reported by preflight

2015-01-29 Thread Tilman Hausherr (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-2643?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14297728#comment-14297728
 ] 

Tilman Hausherr commented on PDFBOX-2643:
-

Wow, thanks, that was very detailed. So I'll close this as not a problem and 
mention it in expected_errors.txt later. I'll work on the thumbnail problem 
later in a separate issue.



[jira] [Created] (PDFBOX-2644) Load FDF document creates Temp file when called with file parameter

2015-01-29 Thread Tilman Hausherr (JIRA)
Tilman Hausherr created PDFBOX-2644:
---

 Summary: Load FDF document creates Temp file when called with file 
parameter
 Key: PDFBOX-2644
 URL: https://issues.apache.org/jira/browse/PDFBOX-2644
 Project: PDFBox
  Issue Type: Bug
  Components: Parsing
Reporter: Tilman Hausherr


Load FDF document creates Temp file when called with file parameter, as shown 
by this stack trace from 
https://stackoverflow.com/questions/28229085/temp-file-creation-error-on-gae-with-pdfbox
{code}
com.sun.jersey.spi.container.ContainerResponse mapMappableContainerException: 
The RuntimeException could not be mapped to a response, re-throwing to the HTTP 
container
java.lang.SecurityException: Unable to create temporary file
    at java.io.File.checkAndCreate(File.java:1873)
    at java.io.File.createTempFile(File.java:1968)
    at java.io.File.createTempFile(File.java:2013)
    at org.apache.pdfbox.pdfparser.NonSequentialPDFParser.createTmpFile(NonSequentialPDFParser.java:298)
    at org.apache.pdfbox.pdfparser.NonSequentialPDFParser.<init>(NonSequentialPDFParser.java:278)
    at org.apache.pdfbox.pdfparser.NonSequentialPDFParser.<init>(NonSequentialPDFParser.java:264)
    at org.apache.pdfbox.pdmodel.fdf.FDFDocument.load(FDFDocument.java:200)
    at org.apache.pdfbox.pdmodel.fdf.FDFDocument.load(FDFDocument.java:172)
{code}
and this source code
{code}
File pdfFile = new File("resources/GenerateFDF.pdf");
File fdfFile = new File("resources/fdftest.fdf");

PDDocument pdfDoc = PDDocument.load(pdfFile);
FDFDocument fdfDoc = FDFDocument.load(fdfFile);
{code}
I had a quick look at the sources of FDFDocument:
{code}
public static FDFDocument load( String filename ) throws IOException
{
    return load( new BufferedInputStream( new FileInputStream( filename ) ) );
}
{code}
Is it needed this way, i.e. can't the NonSequentialPDFParser constructor be 
called with the file directly, as is done when loading a PDF document?
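
A rough illustration of why the stream route needs a temp file while the file route does not. This is a JDK-only sketch, not PDFBox code: FdfLoadSketch and its methods are hypothetical, and it assumes (as the stack trace suggests) that the parser wants random access to its input. A plain InputStream cannot seek, so it has to be buffered somewhere first; a File can be opened for random access directly.

```java
import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.StandardCopyOption;

class FdfLoadSketch {
    /** Reads the last n bytes, the kind of access a trailer lookup needs. */
    static String tail(RandomAccessFile raf, int n) throws IOException {
        long start = Math.max(0, raf.length() - n);
        raf.seek(start);
        byte[] buf = new byte[(int) (raf.length() - start)];
        raf.readFully(buf);
        return new String(buf, StandardCharsets.ISO_8859_1);
    }

    /** Stream route: the stream is not seekable, so it is first copied to a temp file. */
    static String tailFromStream(InputStream in, int n) throws IOException {
        // This is the call that a restrictive sandbox may reject with a SecurityException.
        File tmp = File.createTempFile("sketch", ".tmp");
        try {
            Files.copy(in, tmp.toPath(), StandardCopyOption.REPLACE_EXISTING);
            try (RandomAccessFile raf = new RandomAccessFile(tmp, "r")) {
                return tail(raf, n);
            }
        } finally {
            tmp.delete();
        }
    }

    /** File route: random access on the original file, no temp file involved. */
    static String tailFromFile(File file, int n) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file, "r")) {
            return tail(raf, n);
        }
    }

    /** Self-check on a throwaway sample file; true if both routes agree. */
    static boolean demo() {
        try {
            File sample = File.createTempFile("sample", ".fdf");
            try {
                Files.write(sample.toPath(),
                        "%FDF-1.2\ntrailer\n%%EOF".getBytes(StandardCharsets.ISO_8859_1));
                String direct = tailFromFile(sample, 5);
                String viaStream;
                try (InputStream in = Files.newInputStream(sample.toPath())) {
                    viaStream = tailFromStream(in, 5);
                }
                return direct.equals("%%EOF") && viaStream.equals(direct);
            } finally {
                sample.delete();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Both routes return the same bytes; only the stream route depends on being allowed to create a temp file. The demo itself writes a sample file, so it would not run in the restricted environment from the report either.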



[jira] [Updated] (PDFBOX-2644) Load FDF document creates Temp file when called with file parameter

2015-01-29 Thread Tilman Hausherr (JIRA)

 [ 
https://issues.apache.org/jira/browse/PDFBOX-2644?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tilman Hausherr updated PDFBOX-2644:

Description: 
Load FDF document creates Temp file when called with file parameter, as shown 
by this stack trace from 
https://stackoverflow.com/questions/28229085/temp-file-creation-error-on-gae-with-pdfbox
{code}
com.sun.jersey.spi.container.ContainerResponse mapMappableContainerException: 
The RuntimeException could not be mapped to a response, re-throwing to the HTTP 
container
java.lang.SecurityException: Unable to create temporary file
    at java.io.File.checkAndCreate(File.java:1873)
    at java.io.File.createTempFile(File.java:1968)
    at java.io.File.createTempFile(File.java:2013)
    at org.apache.pdfbox.pdfparser.NonSequentialPDFParser.createTmpFile(NonSequentialPDFParser.java:298)
    at org.apache.pdfbox.pdfparser.NonSequentialPDFParser.<init>(NonSequentialPDFParser.java:278)
    at org.apache.pdfbox.pdfparser.NonSequentialPDFParser.<init>(NonSequentialPDFParser.java:264)
    at org.apache.pdfbox.pdmodel.fdf.FDFDocument.load(FDFDocument.java:200)
    at org.apache.pdfbox.pdmodel.fdf.FDFDocument.load(FDFDocument.java:172)
{code}
and this source code
{code}
File pdfFile = new File("resources/GenerateFDF.pdf");
File fdfFile = new File("resources/fdftest.fdf");

PDDocument pdfDoc = PDDocument.load(pdfFile);
FDFDocument fdfDoc = FDFDocument.load(fdfFile);
{code}
I had a quick look at the sources of FDFDocument:
{code}
public static FDFDocument load( File file ) throws IOException
{
return load( new BufferedInputStream( new FileInputStream( file ) ) );
}
{code}
Is it needed this way, i.e. can't the NonSequentialPDFParser constructor be 
called instead, as it is done when opening a *P*DF Document?

  was:
Load FDF document creates Temp file when called with file parameter, as shown 
by this stack trace from 
https://stackoverflow.com/questions/28229085/temp-file-creation-error-on-gae-with-pdfbox
{code}
com.sun.jersey.spi.container.ContainerResponse mapMappableContainerException: 
The RuntimeException could not be mapped to a response, re-throwing to the HTTP 
container
java.lang.SecurityException: Unable to create temporary file
    at java.io.File.checkAndCreate(File.java:1873)
    at java.io.File.createTempFile(File.java:1968)
    at java.io.File.createTempFile(File.java:2013)
    at org.apache.pdfbox.pdfparser.NonSequentialPDFParser.createTmpFile(NonSequentialPDFParser.java:298)
    at org.apache.pdfbox.pdfparser.NonSequentialPDFParser.<init>(NonSequentialPDFParser.java:278)
    at org.apache.pdfbox.pdfparser.NonSequentialPDFParser.<init>(NonSequentialPDFParser.java:264)
    at org.apache.pdfbox.pdmodel.fdf.FDFDocument.load(FDFDocument.java:200)
    at org.apache.pdfbox.pdmodel.fdf.FDFDocument.load(FDFDocument.java:172)
{code}
and this source code
{code}
File pdfFile = new File("resources/GenerateFDF.pdf");
File fdfFile = new File("resources/fdftest.fdf");

PDDocument pdfDoc = PDDocument.load(pdfFile);
FDFDocument fdfDoc = FDFDocument.load(fdfFile);
{code}
I had a quick look at the sources of FDFDocument:
{code}
public static FDFDocument load( String filename ) throws IOException
{
return load( new BufferedInputStream( new FileInputStream( filename ) ) 
);
}
{code}
Is it needed this way, i.e. can't the NonSequentialPDFParser be called like 
with load PDF Document?



[jira] [Commented] (PDFBOX-2619) XMP dates contain time zone, while document info dates do not, and this isn't detected by preflight

2015-01-29 Thread Tilman Hausherr (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-2619?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14298327#comment-14298327
 ] 

Tilman Hausherr commented on PDFBOX-2619:
-

Don't know. The problem is that the two dates may have different TZ content but 
still be the same (e.g. one with a name, one with a value). And this wouldn't 
settle the case where one has a TZ value and the other doesn't.

 XMP dates contain time zone, while document info dates do not, and this isn't 
 detected by preflight
 ---

 Key: PDFBOX-2619
 URL: https://issues.apache.org/jira/browse/PDFBOX-2619
 Project: PDFBox
  Issue Type: Sub-task
  Components: Preflight
Affects Versions: 1.8.8, 2.0.0
Reporter: Tilman Hausherr
 Attachments: empty_word.pdf


 Another one from the Bavaria test suite:
 {code}
 /CreationDate(D:20090317081112) 
 /ModDate(D:20090317081112)
 <xmp:CreateDate>2009-03-17T08:11:12Z</xmp:CreateDate>
 <xmp:ModifyDate>2009-03-17T08:11:12Z</xmp:ModifyDate>
 {code}
 The info dates do not have a timezone, but the xmp dates do (Z = Zulu). 
 This information (whether there was a timezone information in the string) is 
 lost in our conversion methods :-(
 Amusingly, PDF Tools says the file is valid.
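
One way to keep the "was there a time zone?" information is to look at the raw strings before any conversion. The sketch below does that with two simplified patterns; DateZoneCheck and its methods are hypothetical names, not existing PDFBox or preflight API, and the patterns cover only the common forms (a trailing Z or a numeric offset).

```java
class DateZoneCheck {
    /** PDF date string such as D:20090317081112, D:20090317081112Z or D:20090317081112+01'00'. */
    static boolean pdfDateHasTimeZone(String value) {
        return value.matches("D:\\d{4,14}(Z|[+-]\\d{2}'?(\\d{2}'?)?)");
    }

    /** XMP (ISO 8601 style) date such as 2009-03-17T08:11:12Z or 2009-03-17T08:11:12+01:00. */
    static boolean xmpDateHasTimeZone(String value) {
        return value.matches(".*T.*(Z|[+-]\\d{2}:\\d{2})");
    }

    /** True if both strings agree on whether a time zone is present at all. */
    static boolean zonePresenceMatches(String pdfDate, String xmpDate) {
        return pdfDateHasTimeZone(pdfDate) == xmpDateHasTimeZone(xmpDate);
    }

    public static void main(String[] args) {
        System.out.println(zonePresenceMatches("D:20090317081112", "2009-03-17T08:11:12Z"));
    }
}
```

For the values quoted above, the info dates report no time zone and the XMP dates report one, so zonePresenceMatches returns false. Whether preflight should treat that as an error is a separate question.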


