[
https://issues.apache.org/jira/browse/SOLR-1437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12758652#action_12758652
]
Fergus McMenemie commented on SOLR-1437:
----------------------------------------
Noble,
Playing with the code... some observations I would like confirmed.
1) Inside parse(), the valuesAddedinThisFrame HashSet and the Stack<Set<String>>
stack variables are only used to aid the clean-up after outputting a record.
2) The code seems unable to collect text for a forEach xpath. So for the
following fragment of code
{code}
String xml="<root>\n"
+ " <status>live</status>\n"
+ " <contenido id=\"10097\" idioma=\"cat\">\n"
+ " Cats can be cute\n"
+ " <antetitulo></antetitulo>\n"
+ " <titulo>\n This is my title\n </titulo>\n"
+ " <resumen>\n This is my summary\n </resumen>\n"
+ " <texto>\n This is the body of my text\n </texto>\n"
+ " </contenido>\n"
+ "</root>";
XPathRecordReader rr = new XPathRecordReader("/root/contenido");
rr.addField("cat" ,"/root/contenido", false); // ***** FAILS *****
rr.addField("id", "/root/contenido/@id", false);
{code}
we can get the string associated with the id attribute of <contenido> but not
its child text! Is this a design goal, or just the way the code ended up
behaving? Do we want it to continue to work this way?
> DIH: Enhance XPathRecordReader to deal with //tagname and other improvments.
> ----------------------------------------------------------------------------
>
> Key: SOLR-1437
> URL: https://issues.apache.org/jira/browse/SOLR-1437
> Project: Solr
> Issue Type: Improvement
> Components: contrib - DataImportHandler
> Affects Versions: 1.4
> Reporter: Fergus McMenemie
> Assignee: Noble Paul
> Priority: Minor
> Fix For: 1.5
>
> Attachments: SOLR-1437.patch, SOLR-1437.patch
>
> Original Estimate: 672h
> Remaining Estimate: 672h
>
> As per
> http://www.nabble.com/Re%3A-Extract-info-from-parent-node-during-data-import-%28redirect%3A%29-td25471162.html
> it would be nice to be able to use expressions such as //tagname when
> parsing XML documents.