[ 
https://issues.apache.org/jira/browse/SOLR-10012?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dawid Weiss updated SOLR-10012:
-------------------------------
    Attachment: SOLR-10012.patch

The current XPathRecordReader code assumes each node in the combined XPath 
parse tree belongs to exactly one field. Additionally, a flattened field higher 
up the node hierarchy prevents anything beneath it from being collected.

This looks like a limitation/bug to me, but changing it to support overlapping 
paths will change the current behavior (which is underspecified anyway: at the 
moment it silently ignores certain field specifications).
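
To make the reported behavior concrete, here is a toy model in Python. It is 
not the XPathRecordReader code and makes no claim about how the patch works: 
the sample document, the extract() function and the greedy_flatten switch are 
all assumptions made up for illustration. It only mimics "a flattened node 
swallows everything beneath it" versus "overlapping paths are each collected".

```python
import xml.etree.ElementTree as ET

# Hypothetical sample input; element names follow the paths in the report.
SAMPLE = """
<records>
  <fullrecord_metadata>
    <addresses>
      <address_name>
        <address_spec>
          <full_address>1 Main St</full_address>
        </address_spec>
      </address_name>
    </addresses>
  </fullrecord_metadata>
</records>
"""

# (column, absolute path, flatten), mirroring the two <field> entries.
FIELDS = [
    ("Address",
     "/records/fullrecord_metadata/addresses/address_name/address_spec/full_address",
     False),
    ("AddressALL", "/records/fullrecord_metadata/addresses", True),
]


def extract(xml_text, fields, greedy_flatten):
    """Collect field values; optionally let flattened nodes swallow subnodes."""
    record = {}

    def walk(elem, parent_path, swallowed):
        path = parent_path + "/" + elem.tag
        flatten_here = False
        if not swallowed:
            for column, field_path, flatten in fields:
                if field_path != path:
                    continue
                if flatten:
                    # Concatenate all non-blank text found in the subtree.
                    parts = (t.strip() for t in elem.itertext())
                    value = " ".join(p for p in parts if p)
                    flatten_here = True
                else:
                    value = (elem.text or "").strip()
                record.setdefault(column, []).append(value)
        # Greedy mode: nothing below a flattened node is collected again.
        below_swallowed = swallowed or (greedy_flatten and flatten_here)
        for child in elem:
            walk(child, path, below_swallowed)

    walk(ET.fromstring(xml_text), "", False)
    return record


print(extract(SAMPLE, FIELDS, greedy_flatten=True))
print(extract(SAMPLE, FIELDS, greedy_flatten=False))
```

With greedy_flatten=True only AddressALL is populated, which matches the 
symptom in the report; with greedy_flatten=False both columns receive a value.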

> DIH's XPath processor works incorrectly for overlapping XPath paths defined 
> as different fields
> -----------------------------------------------------------------------------------------------
>
>                 Key: SOLR-10012
>                 URL: https://issues.apache.org/jira/browse/SOLR-10012
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public(Default Security Level. Issues are Public) 
>            Reporter: Dawid Weiss
>            Priority: Minor
>         Attachments: SOLR-10012.patch
>
>
> Reported by a friend --
> {code}
> <dataConfig>
> ...
>           <field column="Address"
>                  xpath="/records/fullrecord_metadata/addresses/address_name/address_spec/full_address" />
>           <field column="AddressALL"
>                  xpath="/records/fullrecord_metadata/addresses" flatten="true" />
> ...
> </dataConfig>
> {code}
> This definition doesn't seem to import anything into the {{Address}} field -- 
> everything is consumed by {{AddressALL}}.
> I looked briefly at the implementation of {{XPathRecordReader}} and it seems 
> it's greedy with respect to flattened tree nodes, assuming no other field 
> extracts data from subnodes. 
> I think this is a bug (or is it by design)?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

