[jira] [Commented] (PDFBOX-4189) Enable PDF creation with Indian languages, by reading and utilizing the GSUB table

2022-05-10 Thread Tilman Hausherr (Jira)


[ 
https://issues.apache.org/jira/browse/PDFBOX-4189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17534507#comment-17534507
 ] 

Tilman Hausherr commented on PDFBOX-4189:
-

There's only the current code (and the code in the Apache FOP project). If you 
want to contribute code you're welcome, but please start small and try to 
follow the conventions.
https://pdfbox.apache.org/codingconventions.html

> Enable PDF creation with Indian languages, by reading and utilizing the GSUB 
> table
> --
>
> Key: PDFBOX-4189
> URL: https://issues.apache.org/jira/browse/PDFBOX-4189
> Project: PDFBox
>  Issue Type: New Feature
>  Components: FontBox, PDModel
>Reporter: Palash Ray
>Priority: Major
> Attachments: Bengali-text-after.pdf, Bengali-text-before.pdf, 
> BengaliPdfGenerationHelloWorld.java, bengali-example.pdf, 
> bengali-example2.pdf, bengali-example3.pdf, bengali-word-lohit-bad.pdf, 
> bengali-word-lohit-good.pdf, committed.patch, pdf-output.png, screenshot.png
>
>   Original Estimate: 336h
>  Remaining Estimate: 336h
>
> Implemented proper rendering of Indian languages, which need extensive Glyph 
> substitution. The GSUB table has been read and used effectively to replace 
> some compound words with their respective Glyphs. All tests are passing. I 
> have tested this for the Bengali font. Please review these changes and let me 
> know if it makes sense to incorporate these.
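
The quoted description refers to replacing glyph sequences using the GSUB table. As a rough illustration only, a ligature-style substitution can be modeled as a greedy longest-match replacement over glyph IDs. This is not the PDFBox or FontBox implementation: the class name GsubLigatureSketch, the glyph IDs, and the lookup map are all made up for the example, and real GSUB processing (scripts, features, lookup ordering) is omitted.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch, not FontBox code: greedy longest-match ligature substitution.
class GsubLigatureSketch {

    // Replaces each longest matching glyph-ID sequence (length >= 2) with its
    // ligature glyph ID; glyphs that start no match are copied unchanged.
    static List<Integer> substitute(List<Integer> glyphs, Map<List<Integer>, Integer> ligatures) {
        int maxLen = 0;
        for (List<Integer> key : ligatures.keySet()) {
            maxLen = Math.max(maxLen, key.size());
        }
        List<Integer> out = new ArrayList<>();
        int i = 0;
        while (i < glyphs.size()) {
            boolean matched = false;
            // Try the longest candidate first so longer ligatures win.
            for (int len = Math.min(maxLen, glyphs.size() - i); len >= 2; len--) {
                Integer ligature = ligatures.get(glyphs.subList(i, i + len));
                if (ligature != null) {
                    out.add(ligature);
                    i += len;
                    matched = true;
                    break;
                }
            }
            if (!matched) {
                out.add(glyphs.get(i));
                i++;
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Made-up glyph IDs; a real font defines these in its GSUB lookups.
        Map<List<Integer>, Integer> ligatures = new HashMap<>();
        ligatures.put(Arrays.asList(10, 20), 100);
        ligatures.put(Arrays.asList(10, 20, 30), 200);
        System.out.println(substitute(Arrays.asList(10, 20, 30, 40), ligatures));
    }
}
```

Trying the longest sequence first matters for scripts such as Bengali, where a shorter match could otherwise consume part of a longer conjunct.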



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

-
To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org
For additional commands, e-mail: dev-h...@pdfbox.apache.org



[jira] [Commented] (PDFBOX-5431) New NPE in xmpbox parser in trunk

2022-05-10 Thread Tilman Hausherr (Jira)


[ 
https://issues.apache.org/jira/browse/PDFBOX-5431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17534493#comment-17534493
 ] 

Tilman Hausherr commented on PDFBOX-5431:
-

It's easy to avoid the NPE, but I wonder what exactly is wrong with the file? 
(I'd like to know so the exception text can say so.)
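
For illustration, avoiding an NPE of this kind usually comes down to a null check that raises a descriptive exception instead. The sketch below is not the actual DomXmpParser code: the class LiGuardSketch, its methods, and the sample XML are hypothetical, and it relies only on the DOM API shipped with the JDK.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.SAXException;

// Hypothetical helper, not part of xmpbox: turns a would-be NPE into an
// exception that names the element lacking the expected child.
class LiGuardSketch {

    // Returns the first child element of parent, or null if there is none.
    static Element firstChildElement(Element parent) {
        for (Node n = parent.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n instanceof Element) {
                return (Element) n;
            }
        }
        return null;
    }

    // Parses xml and returns the tag name of the root's first child element.
    // Throws IllegalArgumentException if the root has no child element.
    static String nestedName(String xml) {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("Could not parse sample XML", e);
        }
        Element root = doc.getDocumentElement();
        Element child = firstChildElement(root);
        if (child == null) {
            // The guard: report what is missing rather than dereferencing null.
            throw new IllegalArgumentException(
                    "Element '" + root.getTagName() + "' has no child element");
        }
        return child.getTagName();
    }

    public static void main(String[] args) {
        System.out.println(nestedName("<li><Description/></li>"));
        try {
            nestedName("<li>text only</li>");
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

What the message should say still depends on what is actually wrong with the attached file, which is the open question here.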

> New NPE in xmpbox parser in trunk
> -
>
> Key: PDFBOX-5431
> URL: https://issues.apache.org/jira/browse/PDFBOX-5431
> Project: PDFBox
>  Issue Type: Task
>  Components: XmpBox
>Affects Versions: 3.0.0 PDFBox
>Reporter: Tim Allison
>Priority: Major
> Attachments: metadata.xml
>
>
> I noticed a new NPE in one of our test files on Tika when I recently built 
> PDFBox's trunk.  I've attached the file.
> If I don't set strict parsing to false, the parse works.
> {noformat}
> DomXmpParser xmpParser = new DomXmpParser();
> xmpParser.setStrictParsing(false);
> Path p = Paths.get(".../metadata.xml");
> try (InputStream is = Files.newInputStream(p)) {
>     XMPMetadata metadata = xmpParser.parse(is);
>     for (XMPSchema schema : metadata.getAllSchemas()) {
>         for (AbstractField f : schema.getAllProperties()) {
>             System.out.println(f);
>         }
>     }
> }
> {noformat}
> Stack
> {noformat}
> java.lang.NullPointerException
>   at org.apache.xmpbox.xml.DomXmpParser.parseLiDescription(DomXmpParser.java:608)
>   at org.apache.xmpbox.xml.DomXmpParser.parseLiElement(DomXmpParser.java:529)
>   at org.apache.xmpbox.xml.DomXmpParser.manageArray(DomXmpParser.java:487)
>   at org.apache.xmpbox.xml.DomXmpParser.createProperty(DomXmpParser.java:352)
>   at org.apache.xmpbox.xml.DomXmpParser.parseChildrenAsProperties(DomXmpParser.java:319)
>   at org.apache.xmpbox.xml.DomXmpParser.parseDescriptionRoot(DomXmpParser.java:248)
>   at org.apache.xmpbox.xml.DomXmpParser.parse(DomXmpParser.java:201)
>   at org.apache.tika.parser.indesign.IDMLParserTest.testXMP(IDMLParserTest.java:81)
> {noformat}






[jira] [Updated] (PDFBOX-5431) New NPE in xmpbox parser in trunk

2022-05-10 Thread Tim Allison (Jira)


 [ 
https://issues.apache.org/jira/browse/PDFBOX-5431?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Allison updated PDFBOX-5431:

Description: 
I noticed a new NPE in one of our test files on Tika when I recently built 
PDFBox's trunk.  I've attached the file.

If I don't set strict parsing to false, the parse works.


{noformat}
DomXmpParser xmpParser = new DomXmpParser();
xmpParser.setStrictParsing(false);
Path p = Paths.get(".../metadata.xml");
try (InputStream is = Files.newInputStream(p)) {
    XMPMetadata metadata = xmpParser.parse(is);
    for (XMPSchema schema : metadata.getAllSchemas()) {
        for (AbstractField f : schema.getAllProperties()) {
            System.out.println(f);
        }
    }
}
{noformat}

Stack
{noformat}
java.lang.NullPointerException
    at org.apache.xmpbox.xml.DomXmpParser.parseLiDescription(DomXmpParser.java:608)
    at org.apache.xmpbox.xml.DomXmpParser.parseLiElement(DomXmpParser.java:529)
    at org.apache.xmpbox.xml.DomXmpParser.manageArray(DomXmpParser.java:487)
    at org.apache.xmpbox.xml.DomXmpParser.createProperty(DomXmpParser.java:352)
    at org.apache.xmpbox.xml.DomXmpParser.parseChildrenAsProperties(DomXmpParser.java:319)
    at org.apache.xmpbox.xml.DomXmpParser.parseDescriptionRoot(DomXmpParser.java:248)
    at org.apache.xmpbox.xml.DomXmpParser.parse(DomXmpParser.java:201)
    at org.apache.tika.parser.indesign.IDMLParserTest.testXMP(IDMLParserTest.java:81)
{noformat}

  was:
I noticed a new NPE in one of our test files on Tika when I recently built 
PDFBox's trunk.  I've attached the file.

If I don't set strict parsing to false, the parse works.


{noformat}
DomXmpParser xmpParser = new DomXmpParser();
xmpParser.setStrictParsing(false);
Path p = Paths.get("/home/tallison/Desktop/tmp/META-INF/metadata.xml");
try (InputStream is = Files.newInputStream(p)) {
    XMPMetadata metadata = xmpParser.parse(is);
    for (XMPSchema schema : metadata.getAllSchemas()) {
        for (AbstractField f : schema.getAllProperties()) {
            System.out.println(f);
        }
    }
}
{noformat}

Stack
{noformat}
java.lang.NullPointerException
    at org.apache.xmpbox.xml.DomXmpParser.parseLiDescription(DomXmpParser.java:608)
    at org.apache.xmpbox.xml.DomXmpParser.parseLiElement(DomXmpParser.java:529)
    at org.apache.xmpbox.xml.DomXmpParser.manageArray(DomXmpParser.java:487)
    at org.apache.xmpbox.xml.DomXmpParser.createProperty(DomXmpParser.java:352)
    at org.apache.xmpbox.xml.DomXmpParser.parseChildrenAsProperties(DomXmpParser.java:319)
    at org.apache.xmpbox.xml.DomXmpParser.parseDescriptionRoot(DomXmpParser.java:248)
    at org.apache.xmpbox.xml.DomXmpParser.parse(DomXmpParser.java:201)
    at org.apache.tika.parser.indesign.IDMLParserTest.testXMP(IDMLParserTest.java:81)
{noformat}


> New NPE in xmpbox parser in trunk
> -
>
> Key: PDFBOX-5431
> URL: https://issues.apache.org/jira/browse/PDFBOX-5431
> Project: PDFBox
>  Issue Type: Task
>  Components: XmpBox
>Affects Versions: 3.0.0 PDFBox
>Reporter: Tim Allison
>Priority: Major
> Attachments: metadata.xml
>
>
> I noticed a new NPE in one of our test files on Tika when I recently built 
> PDFBox's trunk.  I've attached the file.
> If I don't set strict parsing to false, the parse works.
> {noformat}
> DomXmpParser xmpParser = new DomXmpParser();
> xmpParser.setStrictParsing(false);
> Path p = Paths.get(".../metadata.xml");
> try (InputStream is = Files.newInputStream(p)) {
>     XMPMetadata metadata = xmpParser.parse(is);
>     for (XMPSchema schema : metadata.getAllSchemas()) {
>         for (AbstractField f : schema.getAllProperties()) {
>             System.out.println(f);
>         }
>     }
> }
> {noformat}
> Stack
> {noformat}
> java.lang.NullPointerException
>   at org.apache.xmpbox.xml.DomXmpParser.parseLiDescription(DomXmpParser.java:608)
>   at org.apache.xmpbox.xml.DomXmpParser.parseLiElement(DomXmpParser.java:529)
>   at org.apache.xmpbox.xml.DomXmpParser.manageArray(DomXmpParser.java:487)
>   at org.apache.xmpbox.xml.DomXmpParser.createProperty(DomXmpParser.java:352)
>   at org.apache.xmpbox.xml.DomXmpParser.parseChildrenAsProperties(DomXmpParser.java:319)
>   at org.apache.xmpbox.xml.DomXmpParser.parseDescriptionRoot(DomXmpParser.java:248)
>   at org.apache.xmpbox.xml.DomXmpParser.parse(DomXmpParser.java:201)
>   at org.apache.tika.parser.indesign.IDMLParserTest.testXMP(IDMLParserTest.java:81)
> {noformat}




[jira] [Updated] (PDFBOX-5431) New NPE in xmpbox parser in trunk

2022-05-10 Thread Tim Allison (Jira)


 [ 
https://issues.apache.org/jira/browse/PDFBOX-5431?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Allison updated PDFBOX-5431:

Component/s: XmpBox

> New NPE in xmpbox parser in trunk
> -
>
> Key: PDFBOX-5431
> URL: https://issues.apache.org/jira/browse/PDFBOX-5431
> Project: PDFBox
>  Issue Type: Task
>  Components: XmpBox
>Affects Versions: 3.0.0 PDFBox
>Reporter: Tim Allison
>Priority: Major
> Attachments: metadata.xml
>
>
> I noticed a new NPE in one of our test files on Tika when I recently built 
> PDFBox's trunk.  I've attached the file.
> If I don't set strict parsing to false, the parse works.
> {noformat}
> DomXmpParser xmpParser = new DomXmpParser();
> xmpParser.setStrictParsing(false);
> Path p = Paths.get("/home/tallison/Desktop/tmp/META-INF/metadata.xml");
> try (InputStream is = Files.newInputStream(p)) {
>     XMPMetadata metadata = xmpParser.parse(is);
>     for (XMPSchema schema : metadata.getAllSchemas()) {
>         for (AbstractField f : schema.getAllProperties()) {
>             System.out.println(f);
>         }
>     }
> }
> {noformat}
> Stack
> {noformat}
> java.lang.NullPointerException
>   at org.apache.xmpbox.xml.DomXmpParser.parseLiDescription(DomXmpParser.java:608)
>   at org.apache.xmpbox.xml.DomXmpParser.parseLiElement(DomXmpParser.java:529)
>   at org.apache.xmpbox.xml.DomXmpParser.manageArray(DomXmpParser.java:487)
>   at org.apache.xmpbox.xml.DomXmpParser.createProperty(DomXmpParser.java:352)
>   at org.apache.xmpbox.xml.DomXmpParser.parseChildrenAsProperties(DomXmpParser.java:319)
>   at org.apache.xmpbox.xml.DomXmpParser.parseDescriptionRoot(DomXmpParser.java:248)
>   at org.apache.xmpbox.xml.DomXmpParser.parse(DomXmpParser.java:201)
>   at org.apache.tika.parser.indesign.IDMLParserTest.testXMP(IDMLParserTest.java:81)
> {noformat}






[jira] [Updated] (PDFBOX-5431) New NPE in xmpbox parser in trunk

2022-05-10 Thread Tim Allison (Jira)


 [ 
https://issues.apache.org/jira/browse/PDFBOX-5431?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Allison updated PDFBOX-5431:

Affects Version/s: 3.0.0 PDFBox

> New NPE in xmpbox parser in trunk
> -
>
> Key: PDFBOX-5431
> URL: https://issues.apache.org/jira/browse/PDFBOX-5431
> Project: PDFBox
>  Issue Type: Task
>Affects Versions: 3.0.0 PDFBox
>Reporter: Tim Allison
>Priority: Major
> Attachments: metadata.xml
>
>
> I noticed a new NPE in one of our test files on Tika when I recently built 
> PDFBox's trunk.  I've attached the file.
> If I don't set strict parsing to false, the parse works.
> {noformat}
> DomXmpParser xmpParser = new DomXmpParser();
> xmpParser.setStrictParsing(false);
> Path p = Paths.get("/home/tallison/Desktop/tmp/META-INF/metadata.xml");
> try (InputStream is = Files.newInputStream(p)) {
>     XMPMetadata metadata = xmpParser.parse(is);
>     for (XMPSchema schema : metadata.getAllSchemas()) {
>         for (AbstractField f : schema.getAllProperties()) {
>             System.out.println(f);
>         }
>     }
> }
> {noformat}
> Stack
> {noformat}
> java.lang.NullPointerException
>   at org.apache.xmpbox.xml.DomXmpParser.parseLiDescription(DomXmpParser.java:608)
>   at org.apache.xmpbox.xml.DomXmpParser.parseLiElement(DomXmpParser.java:529)
>   at org.apache.xmpbox.xml.DomXmpParser.manageArray(DomXmpParser.java:487)
>   at org.apache.xmpbox.xml.DomXmpParser.createProperty(DomXmpParser.java:352)
>   at org.apache.xmpbox.xml.DomXmpParser.parseChildrenAsProperties(DomXmpParser.java:319)
>   at org.apache.xmpbox.xml.DomXmpParser.parseDescriptionRoot(DomXmpParser.java:248)
>   at org.apache.xmpbox.xml.DomXmpParser.parse(DomXmpParser.java:201)
>   at org.apache.tika.parser.indesign.IDMLParserTest.testXMP(IDMLParserTest.java:81)
> {noformat}






[jira] [Created] (PDFBOX-5431) New NPE in xmpbox parser in trunk

2022-05-10 Thread Tim Allison (Jira)
Tim Allison created PDFBOX-5431:
---

 Summary: New NPE in xmpbox parser in trunk
 Key: PDFBOX-5431
 URL: https://issues.apache.org/jira/browse/PDFBOX-5431
 Project: PDFBox
  Issue Type: Task
Reporter: Tim Allison
 Attachments: metadata.xml

I noticed a new NPE in one of our test files on Tika when I recently built 
PDFBox's trunk.  I've attached the file.

If I don't set strict parsing to false, the parse works.


{noformat}
DomXmpParser xmpParser = new DomXmpParser();
xmpParser.setStrictParsing(false);
Path p = Paths.get("/home/tallison/Desktop/tmp/META-INF/metadata.xml");
try (InputStream is = Files.newInputStream(p)) {
    XMPMetadata metadata = xmpParser.parse(is);
    for (XMPSchema schema : metadata.getAllSchemas()) {
        for (AbstractField f : schema.getAllProperties()) {
            System.out.println(f);
        }
    }
}
{noformat}

Stack
{noformat}
java.lang.NullPointerException
    at org.apache.xmpbox.xml.DomXmpParser.parseLiDescription(DomXmpParser.java:608)
    at org.apache.xmpbox.xml.DomXmpParser.parseLiElement(DomXmpParser.java:529)
    at org.apache.xmpbox.xml.DomXmpParser.manageArray(DomXmpParser.java:487)
    at org.apache.xmpbox.xml.DomXmpParser.createProperty(DomXmpParser.java:352)
    at org.apache.xmpbox.xml.DomXmpParser.parseChildrenAsProperties(DomXmpParser.java:319)
    at org.apache.xmpbox.xml.DomXmpParser.parseDescriptionRoot(DomXmpParser.java:248)
    at org.apache.xmpbox.xml.DomXmpParser.parse(DomXmpParser.java:201)
    at org.apache.tika.parser.indesign.IDMLParserTest.testXMP(IDMLParserTest.java:81)
{noformat}






[jira] [Commented] (PDFBOX-4189) Enable PDF creation with Indian languages, by reading and utilizing the GSUB table

2022-05-10 Thread Ramanathan Ramamoorthy (Jira)


[ 
https://issues.apache.org/jira/browse/PDFBOX-4189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17534186#comment-17534186
 ] 

Ramanathan Ramamoorthy commented on PDFBOX-4189:


Thanks [~tilman]. Do let me know if an estimate or design is already 
available. I can help if possible.

> Enable PDF creation with Indian languages, by reading and utilizing the GSUB 
> table
> --
>
> Key: PDFBOX-4189
> URL: https://issues.apache.org/jira/browse/PDFBOX-4189
> Project: PDFBox
>  Issue Type: New Feature
>  Components: FontBox, PDModel
>Reporter: Palash Ray
>Priority: Major
> Attachments: Bengali-text-after.pdf, Bengali-text-before.pdf, 
> BengaliPdfGenerationHelloWorld.java, bengali-example.pdf, 
> bengali-example2.pdf, bengali-example3.pdf, bengali-word-lohit-bad.pdf, 
> bengali-word-lohit-good.pdf, committed.patch, pdf-output.png, screenshot.png
>
>   Original Estimate: 336h
>  Remaining Estimate: 336h
>
> Implemented proper rendering of Indian languages, which need extensive Glyph 
> substitution. The GSUB table has been read and used effectively to replace 
> some compound words with their respective Glyphs. All tests are passing. I 
> have tested this for the Bengali font. Please review these changes and let me 
> know if it makes sense to incorporate these.


