[jira] [Commented] (PDFBOX-5431) New NPE in xmpbox parser in trunk

2022-05-11 Thread Tilman Hausherr (Jira)


[ 
https://issues.apache.org/jira/browse/PDFBOX-5431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17535022#comment-17535022
 ] 

Tilman Hausherr commented on PDFBOX-5431:
-

This is related to PDFBOX-5128. The line {{tm.addNewNameSpace(namespace, 
prefix);}} in DomXmpParser calls this code:
{code}
public void addNewNameSpace(String ns, String preferred)
{
    PropertiesDescription mapping = new PropertiesDescription();
    schemaMap.put(ns, new XMPSchemaFactory(ns, XMPSchema.class, mapping));
}
{code}
{{mapping}} is empty, so the factory registered for this namespace has no properties.

Later in {{checkPropertyDefinition()}}, {{tm.getSpecifiedPropertyType(prop)}} 
returns null because of this empty mapping.

What we could do is change {{TypeMapping.getSpecifiedPropertyType()}} to 
check whether the non-null factory is empty and, if so, ignore it:
{code}
// found in schema
PropertyType propertyType = factory.getPropertyType(name.getLocalPart());
if (!factory.getPropertyDefinition().getPropertiesName().isEmpty() ||
        structuredNamespaces.get(name.getNamespaceURI()) == null)
{
    return propertyType;
}
{code}
and then fall through to what is currently the "else" branch (i.e. remove the keyword "else").
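
To make the intended control flow concrete, here is a minimal, self-contained sketch of the idea. It is not the real {{TypeMapping}}: the class name, the plain maps standing in for the schema factories and for {{structuredNamespaces}}, the string type names and the example namespaces are all assumptions made for illustration. It also returns null where the real lookup would presumably go on to further fallbacks or report an error.

```java
import java.util.HashMap;
import java.util.Map;

// Simplified stand-in for the lookup discussed above; NOT the real xmpbox TypeMapping.
class FallThroughLookupSketch
{
    // namespace -> (property name -> type name); stands in for schemaMap and its factories
    private final Map<String, Map<String, String>> schemaMap = new HashMap<>();
    // namespace -> structured type name; stands in for structuredNamespaces
    private final Map<String, String> structuredNamespaces = new HashMap<>();

    // Mirrors addNewNameSpace(): registers a namespace with an empty property mapping.
    void addNewNameSpace(String ns)
    {
        schemaMap.put(ns, new HashMap<>());
    }

    // Hypothetical helper: registers a namespace that does declare a property.
    void addProperty(String ns, String localName, String type)
    {
        schemaMap.computeIfAbsent(ns, k -> new HashMap<>()).put(localName, type);
    }

    // Hypothetical helper: registers a structured type for a namespace.
    void addStructuredNamespace(String ns, String type)
    {
        structuredNamespaces.put(ns, type);
    }

    String getSpecifiedPropertyType(String ns, String localName)
    {
        Map<String, String> factory = schemaMap.get(ns);
        if (factory != null)
        {
            // found in schema
            String propertyType = factory.get(localName);
            // Trust the schema result if the factory has properties,
            // or if there is no structured type to fall back to.
            if (!factory.isEmpty() || structuredNamespaces.get(ns) == null)
            {
                return propertyType;
            }
        }
        // No "else": an empty factory falls through to the structured lookup.
        return structuredNamespaces.get(ns);
    }

    public static void main(String[] args)
    {
        FallThroughLookupSketch tm = new FallThroughLookupSketch();

        // Namespace registered with an empty mapping, but known as a structured type.
        tm.addNewNameSpace("urn:example:dim");
        tm.addStructuredNamespace("urn:example:dim", "Dimensions");
        System.out.println(tm.getSpecifiedPropertyType("urn:example:dim", "w"));

        // Namespace whose schema actually declares the property.
        tm.addProperty("urn:example:plain", "title", "Text");
        System.out.println(tm.getSpecifiedPropertyType("urn:example:plain", "title"));
    }
}
```

With the "else" still in place, the first lookup in {{main}} would stop at the empty factory and yield null, which is the situation that leads to the NPE described in this issue.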

> New NPE in xmpbox parser in trunk
> -
>
> Key: PDFBOX-5431
> URL: https://issues.apache.org/jira/browse/PDFBOX-5431
> Project: PDFBox
> Issue Type: Task
> Components: XmpBox
> Affects Versions: 3.0.0 PDFBox
> Reporter: Tim Allison
> Priority: Major
> Attachments: metadata.xml
>
>
> I noticed a new NPE in one of our test files on Tika when I recently built 
> PDFBox's trunk.  I've attached the file.
> If I don't set strict parsing to false, the parse works.
> {noformat}
> DomXmpParser xmpParser = new DomXmpParser();
> xmpParser.setStrictParsing(false);
> Path p = Paths.get(".../metadata.xml");
> try (InputStream is = Files.newInputStream(p)) {
>     XMPMetadata metadata = xmpParser.parse(is);
>     for (XMPSchema schema : metadata.getAllSchemas()) {
>         for (AbstractField f : schema.getAllProperties()) {
>             System.out.println(f);
>         }
>     }
> }
> {noformat}
> Stack
> {noformat}
> java.lang.NullPointerException
>   at org.apache.xmpbox.xml.DomXmpParser.parseLiDescription(DomXmpParser.java:608)
>   at org.apache.xmpbox.xml.DomXmpParser.parseLiElement(DomXmpParser.java:529)
>   at org.apache.xmpbox.xml.DomXmpParser.manageArray(DomXmpParser.java:487)
>   at org.apache.xmpbox.xml.DomXmpParser.createProperty(DomXmpParser.java:352)
>   at org.apache.xmpbox.xml.DomXmpParser.parseChildrenAsProperties(DomXmpParser.java:319)
>   at org.apache.xmpbox.xml.DomXmpParser.parseDescriptionRoot(DomXmpParser.java:248)
>   at org.apache.xmpbox.xml.DomXmpParser.parse(DomXmpParser.java:201)
>   at org.apache.tika.parser.indesign.IDMLParserTest.testXMP(IDMLParserTest.java:81)
> {noformat}



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

-
To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org
For additional commands, e-mail: dev-h...@pdfbox.apache.org



[jira] [Commented] (PDFBOX-5431) New NPE in xmpbox parser in trunk

2022-05-10 Thread Tilman Hausherr (Jira)


[ 
https://issues.apache.org/jira/browse/PDFBOX-5431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17534493#comment-17534493
 ] 

Tilman Hausherr commented on PDFBOX-5431:
-

It's easy to avoid the NPE, but I wonder what exactly is wrong with the file 
(so that the exception text can say so).
