[ 
https://issues.apache.org/jira/browse/ANY23-353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16526627#comment-16526627
 ] 

ASF GitHub Bot commented on ANY23-353:
--------------------------------------

GitHub user HansBrende opened a pull request:

    https://github.com/apache/any23/pull/89

    ANY23-353 fallback on string datatype for langstring with no lang

    For string literals of datatype "langString" with no language tag, I'm
    simply falling back to type "string" (or using the default language if
    specified).
    
    mvn clean test -> all tests pass
    
    @lewismc any comments?

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/HansBrende/any23 ANY23-353

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/any23/pull/89.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #89
    
----
commit e650a8d1ad22c2f53628285c941f240bb3e6b9ff
Author: Hans <firedrake93@...>
Date:   2018-06-28T18:05:16Z

    ANY23-353 fallback on string datatype for langstring with no lang

----


> RDFParseException: datatype rdf:langString requires a language tag
> ------------------------------------------------------------------
>
>                 Key: ANY23-353
>                 URL: https://issues.apache.org/jira/browse/ANY23-353
>             Project: Apache Any23
>          Issue Type: Bug
>          Components: extractors
>    Affects Versions: 2.3
>            Reporter: Hans Brende
>            Assignee: Hans Brende
>            Priority: Major
>             Fix For: 2.3
>
>
> When extracting from [http://dbpedia.org/data/Trento.n3], I get the following 
> error log:
> {code}
> org.apache.any23.extractor.ExtractionException: Error while parsing RDF document.
>       at org.apache.any23.extractor.rdf.BaseRDFExtractor.run(BaseRDFExtractor.java:179)
>       at org.apache.any23.extractor.rdf.BaseRDFExtractor.run(BaseRDFExtractor.java:60)
>       at org.apache.any23.extractor.SingleDocumentExtraction.runExtractor(SingleDocumentExtraction.java:471)
>       at org.apache.any23.extractor.SingleDocumentExtraction.run(SingleDocumentExtraction.java:259)
>       at org.apache.any23.Any23.extract(Any23.java:302)
>       at org.apache.any23.Any23.extract(Any23.java:437)
>       at org.apache.any23.Any23Test.testDemoCodeSnippet2(Any23Test.java:223)
>       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>       at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>       at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>       at java.lang.reflect.Method.invoke(Method.java:498)
>       at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>       at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>       at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>       at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>       at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>       at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:48)
>       at org.junit.rules.RunRules.evaluate(RunRules.java:20)
>       at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
>       at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
>       at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
>       at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
>       at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
>       at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
>       at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
>       at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
>       at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
>       at org.junit.runner.JUnitCore.run(JUnitCore.java:137)
>       at com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:68)
>       at com.intellij.rt.execution.junit.IdeaTestRunner$Repeater.startRunnerWithArgs(IdeaTestRunner.java:47)
>       at com.intellij.rt.execution.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:242)
>       at com.intellij.rt.execution.junit.JUnitStarter.main(JUnitStarter.java:70)
> Caused by: org.eclipse.rdf4j.rio.RDFParseException: datatype rdf:langString requires a language tag [line 303]
>       at org.eclipse.rdf4j.rio.helpers.RDFParserHelper.reportFatalError(RDFParserHelper.java:442)
>       at org.eclipse.rdf4j.rio.helpers.RDFParserHelper.createLiteral(RDFParserHelper.java:242)
>       at org.eclipse.rdf4j.rio.helpers.AbstractRDFParser.createLiteral(AbstractRDFParser.java:571)
>       at org.eclipse.rdf4j.rio.turtle.TurtleParser.parseQuotedLiteral(TurtleParser.java:672)
>       at org.eclipse.rdf4j.rio.turtle.TurtleParser.parseValue(TurtleParser.java:597)
>       at org.eclipse.rdf4j.rio.turtle.TurtleParser.parseObject(TurtleParser.java:474)
>       at org.eclipse.rdf4j.rio.turtle.TurtleParser.parseObjectList(TurtleParser.java:412)
>       at org.eclipse.rdf4j.rio.turtle.TurtleParser.parsePredicateObjectList(TurtleParser.java:407)
>       at org.eclipse.rdf4j.rio.turtle.TurtleParser.parseTriples(TurtleParser.java:372)
>       at org.eclipse.rdf4j.rio.turtle.TurtleParser.parseStatement(TurtleParser.java:239)
>       at org.eclipse.rdf4j.rio.turtle.TurtleParser.parse(TurtleParser.java:201)
>       at org.apache.any23.extractor.rdf.RDFParserFactory$ExtendedTurtleParser.parse(RDFParserFactory.java:352)
>       at org.eclipse.rdf4j.rio.turtle.TurtleParser.parse(TurtleParser.java:143)
>       at org.apache.any23.extractor.rdf.RDFParserFactory$ExtendedTurtleParser.parse(RDFParserFactory.java:359)
>       at org.apache.any23.extractor.rdf.BaseRDFExtractor.run(BaseRDFExtractor.java:174)
>       ... 31 more
> Caused by: java.lang.IllegalArgumentException: datatype rdf:langString requires a language tag
>       at org.eclipse.rdf4j.model.impl.SimpleLiteral.<init>(SimpleLiteral.java:99)
>       at org.eclipse.rdf4j.model.impl.AbstractValueFactory.createLiteral(AbstractValueFactory.java:118)
>       at org.apache.any23.rdf.Any23ValueFactoryWrapper.createLiteral(Any23ValueFactoryWrapper.java:176)
>       at org.eclipse.rdf4j.rio.helpers.RDFParserHelper.createLiteral(RDFParserHelper.java:235)
>       ... 44 more
> {code}
> This is caused by the following malformed Turtle from DBpedia, in which literals typed rdf:langString carry no language tag:
> {code}
> dbr:Trento    geo:geometry    "POINT(11.116666793823 46.066665649414)"^^virtrdf:Geometry ;
>       dbp:mayor       "Alessandro Andreatta"^^rdf:langString ;
>       dbp:imageFlag   "Flag of Trento.png"^^rdf:langString .
> {code}
> I propose that, for string literals of datatype "langString" with no language 
> tag, we should simply fall back to type "string".
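
The fallback described above, together with the default-language option mentioned in the pull request description, can be sketched roughly as follows. This is only an illustrative, dependency-free model of the decision and not the code from the pull request: the class LangStringFallback, its nested Literal type, and the createLiteral signature are hypothetical stand-ins and do not reproduce the Any23ValueFactoryWrapper or RDF4J APIs.

```java
import java.util.Objects;

/** Hypothetical sketch of the proposed fallback; not the actual Any23 or RDF4J code. */
final class LangStringFallback {

    static final String RDF_LANGSTRING =
            "http://www.w3.org/1999/02/22-rdf-syntax-ns#langString";
    static final String XSD_STRING =
            "http://www.w3.org/2001/XMLSchema#string";

    /** Minimal stand-in for an RDF literal: label, datatype IRI, optional language tag. */
    static final class Literal {
        final String label;
        final String datatype;
        final String language; // null when the literal has no language tag

        Literal(String label, String datatype, String language) {
            this.label = Objects.requireNonNull(label);
            this.datatype = Objects.requireNonNull(datatype);
            this.language = language;
        }
    }

    /**
     * Builds a literal from a label and a datatype IRI only, i.e. the case in
     * which the source document supplied no language tag.
     *
     * @param defaultLanguage assumed configuration value; may be null or empty
     */
    static Literal createLiteral(String label, String datatype, String defaultLanguage) {
        if (!RDF_LANGSTRING.equals(datatype)) {
            // Any other datatype is passed through unchanged.
            return new Literal(label, datatype, null);
        }
        if (defaultLanguage != null && !defaultLanguage.isEmpty()) {
            // A default language is configured: keep rdf:langString and tag the literal.
            return new Literal(label, RDF_LANGSTRING, defaultLanguage);
        }
        // No language available: fall back to a plain xsd:string instead of failing.
        return new Literal(label, XSD_STRING, null);
    }

    public static void main(String[] args) {
        Literal plain = createLiteral("Alessandro Andreatta", RDF_LANGSTRING, null);
        System.out.println(plain.datatype + " lang=" + plain.language);

        Literal tagged = createLiteral("Alessandro Andreatta", RDF_LANGSTRING, "it");
        System.out.println(tagged.datatype + " lang=" + tagged.language);
    }
}
```

With no default language, the dbp:mayor literal from the example would come out as an xsd:string; with a default of "it" (an arbitrary value chosen for illustration) it would stay an rdf:langString tagged with that language.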



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
