[ 
https://issues.apache.org/jira/browse/ANY23-131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15143922#comment-15143922
 ] 

Lewis John McGibbney commented on ANY23-131:
--------------------------------------------

bq. Any news on this?

Well, I logged ANY23-273 because the service at any23.org is failing to extract
the content of the bogus comment element... Once I fixed that one (locally), I
realized I had well and truly opened a can of worms! I've just finished
manually stepping through the webpage source and dealing with the exceptions
thrown by Any23. The markup on this webpage is nothing short of hellish! Anyway,
I've attached a pretty-printed JSON dump of the extracted structure once
everything has been cleaned up. You're right, Any23 is not extracting
relationships from nested <span> elements.
We need to reopen this issue and address it for this new use case.
Sorry it took me so bloody long to get around to this.
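
To make the expected behaviour concrete, here is a rough sketch of what
"extracting relationships from nested <span> elements" could look like. It is
only an illustration under stated assumptions: the Elem class, extractItem and
sample are hypothetical names, not Any23's MicrodataParser API, and the
name/address/city values are made-up sample data rather than anything from the
webpage in question.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class NestedMicrodataSketch {

    /** Hypothetical stand-in for a DOM element; not an Any23 or W3C DOM class. */
    static final class Elem {
        final String itemprop;   // null when the element carries no itemprop
        final boolean itemscope; // true when the element starts a new item
        final String text;       // literal value for plain properties
        final List<Elem> children = new ArrayList<>();

        Elem(String itemprop, boolean itemscope, String text) {
            this.itemprop = itemprop;
            this.itemscope = itemscope;
            this.text = text;
        }

        Elem add(Elem child) {
            children.add(child);
            return this;
        }
    }

    /** Builds the property map of one item, recursing into nested itemscopes. */
    static Map<String, Object> extractItem(Elem scope) {
        Map<String, Object> props = new LinkedHashMap<>();
        for (Elem child : scope.children) {
            collect(child, props);
        }
        return props;
    }

    private static void collect(Elem e, Map<String, Object> props) {
        if (e.itemprop != null) {
            // A nested itemscope becomes a nested map, so the relationship is kept.
            props.put(e.itemprop, e.itemscope ? extractItem(e) : e.text);
        }
        if (!e.itemscope) {
            // Elements that do not start a new item may wrap further properties
            // of the same item, so keep descending through them.
            for (Elem child : e.children) {
                collect(child, props);
            }
        }
    }

    /** Made-up sample: an item whose "address" is a nested item inside a plain span. */
    static Elem sample() {
        Elem address = new Elem("address", true, null)
                .add(new Elem("city", false, "Paris"));
        return new Elem(null, true, null)
                .add(new Elem("name", false, "Alice"))
                .add(new Elem(null, false, null).add(address));
    }

    public static void main(String[] args) {
        // prints {name=Alice, address={city=Paris}}
        System.out.println(extractItem(sample()));
    }
}
```

The point of the sketch is that the nested item stays attached to its parent
through the itemprop that introduces it, instead of being dropped or reported
as an unrelated top-level item.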

> Nested Microdata are not extracted
> ----------------------------------
>
>                 Key: ANY23-131
>                 URL: https://issues.apache.org/jira/browse/ANY23-131
>             Project: Apache Any23
>          Issue Type: Bug
>          Components: microdata
>    Affects Versions: 0.7.0
>            Reporter: Sebastien Richard
>            Assignee: Lewis John McGibbney
>             Fix For: 1.2
>
>
> Proposed patch:
> core/src/main/java/org/apache/any23/extractor/microdata/MicrodataParser.java:
> remove incorrect optimization:
> L166
> - return getUnnestedNodes( topLevelItemScopes ); 
> + return topLevelItemScopes;



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
