[ 
https://issues.apache.org/jira/browse/ANY23-291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16358911#comment-16358911
 ] 

ASF GitHub Bot commented on ANY23-291:
--------------------------------------

Github user ferrerod commented on the issue:

    https://github.com/apache/any23/pull/60
  
    @HansBrende @lewismc: Today I cloned master and ran "mvn clean install",
    and the core build is failing. I also noticed that the code in the run
    method that would have written out the triples is commented out:
    
        for (JSONLDScript jsonldScript : jsonldScripts) {
          //String lang = documentLang;
          //if (jsonldScript.getLang() != null) {
          //        lang = jsonldScript.getLang();
          //}
          //out.writeTriple(documentIRI, jsonldScript.getName(),
          //                SimpleValueFactory.getInstance().createLiteral(jsonldScript.getContent(), lang));
        }
    
    As far as I can tell, nothing in EmbeddedJSONLDExtractor.java on the
    current master branch writes to the ExtractionResult out.
    
    Thoughts?
    
    [ERROR] Failures: 
    [ERROR]   EmbeddedJSONLDExtractorTest.testEmbeddedJSONLDInBody:45->AbstractExtractorTestCase.assertModelNotEmpty:329 The model is expected to not be empty.Assertion failed! Extracted triples:
    
    [ERROR]   EmbeddedJSONLDExtractorTest.testEmbeddedJSONLDInHead:31->AbstractExtractorTestCase.assertModelNotEmpty:329 The model is expected to not be empty.Assertion failed! Extracted triples:
    
    [ERROR]   EmbeddedJSONLDExtractorTest.testEmbeddedJSONLDInHeadAndBody:52->AbstractExtractorTestCase.assertModelNotEmpty:329 The model is expected to not be empty.Assertion failed! Extracted triples:
    
    [ERROR]   EmbeddedJSONLDExtractorTest.testSeveralEmbeddedJSONLDInHead:38->AbstractExtractorTestCase.assertModelNotEmpty:329 The model is expected to not be empty.Assertion failed! Extracted triples:



> JSON-LD should be looked up in entire HTML document, not just in <head>
> -----------------------------------------------------------------------
>
>                 Key: ANY23-291
>                 URL: https://issues.apache.org/jira/browse/ANY23-291
>             Project: Apache Any23
>          Issue Type: Improvement
>          Components: extractors
>    Affects Versions: 1.2
>            Reporter: Thomas Francart
>            Assignee: Hans Brende
>            Priority: Minor
>             Fix For: 2.2
>
>         Attachments: example-embedded-jsonld.html
>
>
> In org.apache.any23.extractor.html.EmbeddedJSONLDExtractor.extractJSONLDScript(),
> I think this line:
> List<Node> scriptNodes = DomUtils.findAll(in, "/HTML/HEAD/SCRIPT");
> is too restrictive. Scripts containing JSON-LD can be placed anywhere in the 
> page, and some CMS/WordPress plugins that insert JSON-LD actually generate 
> their output in the body, not in the head.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
