[jira] [Commented] (ANY23-226) Extract JSON-LD embedded in HTML
[ https://issues.apache.org/jira/browse/ANY23-226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15323532#comment-15323532 ]

Peter Ansell commented on ANY23-226:

I think this can be resolved.

> Extract JSON-LD embedded in HTML
>
> Key: ANY23-226
> URL: https://issues.apache.org/jira/browse/ANY23-226
> Project: Apache Any23
> Issue Type: Wish
> Components: core
> Affects Versions: 1.0
> Reporter: Lewis John McGibbney
> Assignee: Lewis John McGibbney
> Fix For: 1.3
>
> See http://www.w3.org/TR/json-ld/#embedding-json-ld-in-html-documents
> I feel that we need to push this down at the jsonld-java level.
> I am investigating.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (ANY23-226) Extract JSON-LD embedded in HTML
[ https://issues.apache.org/jira/browse/ANY23-226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14372874#comment-14372874 ]

ASF GitHub Bot commented on ANY23-226:

Github user lewismc commented on the pull request:
https://github.com/apache/any23/pull/16#issuecomment-84384643

Ok Peter, thank you for looking. This is great.
I have not seen the test failures. Can you please tell me whether they are in Any23 or in jsonld-java?
We could also upgrade the jsonld-java implementation, to the 0.5.1 release.

On Saturday, March 21, 2015, Peter Ansell notificati...@github.com wrote:

> The main bug was that the entire script node was being sent to JSONLD-Java, and not just its content.
>
> However, I also made a few other changes while doing that testing. It turned out that the jsonld was invalid, but the exception thrown when a parse fails had somehow been changed to be silently swallowed, so the only indication was that the count was 0. I turned the exception propagation on again (there is no reason for it to be swallowed outside of temporary testing).
>
> However, in addition to the 4 tests currently failing in the core tests, there are now other tests failing due to an inability to parse
>
> --
> Reply to this email directly or view it on GitHub
> https://github.com/apache/any23/pull/16#issuecomment-84257478.

--
*Lewis*
[jira] [Commented] (ANY23-226) Extract JSON-LD embedded in HTML
[ https://issues.apache.org/jira/browse/ANY23-226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14372516#comment-14372516 ]

ASF GitHub Bot commented on ANY23-226:

Github user asfgit closed the pull request at:
https://github.com/apache/any23/pull/16
[jira] [Commented] (ANY23-226) Extract JSON-LD embedded in HTML
[ https://issues.apache.org/jira/browse/ANY23-226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14372528#comment-14372528 ]

Hudson commented on ANY23-226:

UNSTABLE: Integrated in Any23-trunk #1309 (See [https://builds.apache.org/job/Any23-trunk/1309/])

ANY23-226 Extract JSON-LD embedded in HTML (lewis.j.mcgibbney: rev 1e3eb9c31af2f93906eee1081179d73c30a0881b)
* core/src/main/java/org/apache/any23/extractor/html/EmbeddedJSONLDExtractorFactory.java
* core/src/main/java/org/apache/any23/extractor/html/DomUtils.java
* core/src/test/java/org/apache/any23/extractor/html/EmbeddedJSONLDExtractorTest.java
* plugins/integration-test/src/test/java/org/apache/any23/plugin/PluginIT.java
* core/src/main/resources/org/apache/any23/prefixes/prefixes.properties
* core/src/main/java/org/apache/any23/extractor/rdf/BaseRDFExtractor.java
* core/src/main/resources/org/apache/any23/extractor/html/example-embedded-jsonld.html
* test-resources/src/test/resources/html/html-embedded-jsonld-extractor.html
* core/src/main/java/org/apache/any23/extractor/html/EmbeddedJSONLDExtractor.java

ANY23-226 : Make JSONLD extraction work (p_ansell: rev fd822849190240b8cf981ecc7abd0b4f592381d5)
* core/src/main/resources/META-INF/services/org.apache.any23.extractor.ExtractorFactory
* src/site/apt/any23-plugins.apt
* core/src/test/java/org/apache/any23/extractor/rdf/JSONLDExtractorTest.java
* core/src/main/java/org/apache/any23/cli/MicrodataParser.java
* plugins/html-scraper/src/main/java/org/apache/any23/plugin/htmlscraper/HTMLScraperExtractor.java
* core/src/main/java/org/apache/any23/extractor/html/HResumeExtractorFactory.java
* core/src/main/java/org/apache/any23/writer/TriXWriterFactory.java
* core/src/test/java/org/apache/any23/extractor/html/RDFMergerTest.java
* core/src/main/resources/META-INF/services/org.apache.any23.cli.Tool
* core/src/main/java/org/apache/any23/extractor/rdfa/RDFa11ExtractorFactory.java
* core/src/test/java/org/apache/any23/extractor/html/HCalendarExtractorTest.java
* core/src/main/java/org/apache/any23/extractor/html/HCardExtractorFactory.java
* plugins/basic-crawler/src/main/java/org/apache/any23/cli/Crawler.java
* core/src/main/java/org/apache/any23/writer/URIListWriterFactory.java
* core/src/main/java/org/apache/any23/extractor/rdf/BaseRDFExtractor.java
* core/src/main/java/org/apache/any23/extractor/xpath/XPathExtractorFactory.java
* core/src/test/java/org/apache/any23/extractor/html/EmbeddedJSONLDExtractorTest.java
* core/src/main/java/org/apache/any23/extractor/html/HeadLinkExtractorFactory.java
* core/src/main/java/org/apache/any23/extractor/rdf/JSONLDExtractorFactory.java
* test-resources/src/test/resources/html/html-embedded-jsonld-extractor.html
* core/src/main/java/org/apache/any23/cli/ExtractorDocumentation.java
* plugins/html-scraper/src/main/resources/META-INF/services/org.apache.any23.extractor.ExtractorFactory
* core/src/main/java/org/apache/any23/writer/TurtleWriterFactory.java
* core/src/main/java/org/apache/any23/extractor/rdf/NTriplesExtractorFactory.java
* core/src/main/resources/META-INF/services/org.apache.any23.writer.WriterFactory
* core/src/main/java/org/apache/any23/writer/RDFXMLWriterFactory.java
* core/src/main/java/org/apache/any23/cli/MimeDetector.java
* core/src/test/java/org/apache/any23/extractor/example/ExampleExtractorFactory.java
* core/src/main/java/org/apache/any23/writer/NTriplesWriterFactory.java
* core/src/test/java/org/apache/any23/extractor/rdfa/AbstractRDFaExtractorTestCase.java
* plugins/office-scraper/src/main/resources/META-INF/services/org.apache.any23.extractor.ExtractorFactory
* core/src/main/java/org/apache/any23/extractor/html/AdrExtractorFactory.java
* core/src/main/java/org/apache/any23/cli/VocabPrinter.java
* core/src/main/java/org/apache/any23/extractor/rdf/TriXExtractorFactory.java
* plugins/html-scraper/src/main/java/org/apache/any23/plugin/htmlscraper/HTMLScraperExtractorFactory.java
* core/src/main/java/org/apache/any23/extractor/html/EmbeddedJSONLDExtractor.java
* core/src/main/java/org/apache/any23/extractor/rdfa/RDFaExtractorFactory.java
* plugins/office-scraper/src/main/java/org/apache/any23/plugin/officescraper/ExcelExtractor.java
* core/src/main/java/org/apache/any23/writer/NQuadsWriterFactory.java
* plugins/office-scraper/src/main/java/org/apache/any23/plugin/officescraper/ExcelExtractorFactory.java
* core/src/test/java/org/apache/any23/extractor/html/SpeciesExtractorTest.java
* core/src/main/java/org/apache/any23/extractor/html/TurtleHTMLExtractorFactory.java
* core/pom.xml
* core/src/main/java/org/apache/any23/extractor/html/SpeciesExtractorFactory.java
* core/src/main/java/org/apache/any23/extractor/html/HTMLMetaExtractorFactory.java
* nquads/src/main/java/org/apache/any23/io/nquads/NQuadsParserFactory.java
* core/src/main/java/org/apache/any23/extractor/html/ICBMExtractorFactory.java
*
[jira] [Commented] (ANY23-226) Extract JSON-LD embedded in HTML
[ https://issues.apache.org/jira/browse/ANY23-226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14372518#comment-14372518 ]

ASF GitHub Bot commented on ANY23-226:

Github user ansell commented on the pull request:
https://github.com/apache/any23/pull/16#issuecomment-84257478

The main bug was that the entire script node was being sent to JSONLD-Java, and not just its content.

However, I also made a few other changes while doing that testing. It turned out that the jsonld was invalid, but the exception thrown when a parse fails had somehow been changed to be silently swallowed, so the only indication was that the count was 0. I turned the exception propagation on again (there is no reason for it to be swallowed outside of temporary testing).

However, in addition to the 4 tests currently failing in the core tests, there are now other tests failing due to an inability to parse div itemscope
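The swallowed-exception problem described in the comment above can be illustrated with a small, self-contained sketch. This is not the Any23 or JSONLD-Java code: `PropagationSketch`, `ParseFailure`, `parse`, `countSwallowing` and `countPropagating` are all hypothetical names, and the "parser" is a toy that only checks whether the input looks like a JSON object. It shows why a swallowed failure is indistinguishable from an empty result, which matches the "count was 0" symptom.

```java
final class PropagationSketch {

    /** Hypothetical stand-in for a parser failure; not an Any23 or JSONLD-Java type. */
    static final class ParseFailure extends RuntimeException {
        ParseFailure(String message) {
            super(message);
        }
    }

    /** Toy "parser": accepts text that looks like a JSON object and reports one statement. */
    static int parse(String payload) {
        String p = payload.trim();
        if (!(p.startsWith("{") && p.endsWith("}"))) {
            throw new ParseFailure("not a JSON object: " + p);
        }
        return 1;
    }

    /** Swallowing variant: a parse failure looks exactly like an empty document. */
    static int countSwallowing(String payload) {
        try {
            return parse(payload);
        } catch (ParseFailure e) {
            return 0; // the failure is lost here; the caller only sees a count of 0
        }
    }

    /** Propagating variant: the failure reaches the caller (and a test harness). */
    static int countPropagating(String payload) {
        return parse(payload);
    }

    public static void main(String[] args) {
        // Passing the whole script node instead of its content makes the toy parser fail.
        String wholeNode = "<script type=\"application/ld+json\">{}</script>";
        System.out.println("swallowing count: " + countSwallowing(wholeNode));
        try {
            countPropagating(wholeNode);
        } catch (ParseFailure e) {
            System.out.println("propagated: " + e.getMessage());
        }
    }
}
```

With the propagating variant a test fails with the underlying cause instead of an unexplained zero count, which is presumably why further tests started failing once propagation was re-enabled.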
[jira] [Commented] (ANY23-226) Extract JSON-LD embedded in HTML
[ https://issues.apache.org/jira/browse/ANY23-226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14359041#comment-14359041 ]

Lewis John McGibbney commented on ANY23-226:

bq. Hence, it may be more appropriate to pick out the <script type="application/ld+json">...</script> elements in Any23 and pass their content to JSONLD-Java for parsing.

I think that you are right, Peter.
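The approach quoted above (select the script elements typed as application/ld+json and hand only their content to the JSON-LD parser) can be sketched as follows. This is a minimal illustration using only the JDK's DOM API, not the code that was committed to Any23: the class `EmbeddedJsonLdSketch` and its methods are hypothetical, the input is assumed to be well-formed XHTML so that the JDK XML parser can read it, and the call into JSONLD-Java is deliberately left out.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class EmbeddedJsonLdSketch {

    /** Returns the text content (not the node itself) of every JSON-LD script element. */
    static List<String> jsonLdPayloads(Document doc) {
        List<String> payloads = new ArrayList<>();
        NodeList scripts = doc.getElementsByTagName("script");
        for (int i = 0; i < scripts.getLength(); i++) {
            Element script = (Element) scripts.item(i);
            // getAttribute returns "" when the attribute is absent, so this is null-safe.
            if ("application/ld+json".equalsIgnoreCase(script.getAttribute("type").trim())) {
                payloads.add(script.getTextContent().trim());
            }
        }
        return payloads;
    }

    /** Assumption: the markup is well-formed XHTML; real-world HTML needs a tolerant parser. */
    static Document parseXhtml(String xhtml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xhtml)));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XHTML", e);
        }
    }

    public static void main(String[] args) {
        String page = "<html><head>"
                + "<script type=\"application/ld+json\">{\"@id\": \"http://example.org/a\"}</script>"
                + "<script type=\"text/javascript\">var x = 1;</script>"
                + "</head><body/></html>";
        // Each payload is what would be passed on to the JSON-LD parser.
        for (String payload : jsonLdPayloads(parseXhtml(page))) {
            System.out.println(payload);
        }
    }
}
```

Keeping the selection in the HTML layer means the JSON-LD parser only ever receives JSON text and never has to know about the surrounding markup.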