[ 
https://issues.apache.org/jira/browse/HIVE-1027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12797769#action_12797769
 ] 

Ning Zhang commented on HIVE-1027:
----------------------------------

Thanks for the detailed explanations. It seems we are supporting XPath 1.0 
here. When you say "xpath() returns multiple nodes(list)", do you mean it 
returns a serialized XML string representing the list of nodes, such as 
<a>a1</a><a>a2</a> ...? If so, do you have a test case for composing 
xpath() functions? For example, a subquery returns an XML string from the 
result of xpath(), and the outer query passes that string to another xpath*() 
function.

For (4), I'm not sure whether we should interpret an empty list as an empty 
string, etc. We can certainly define the mapping from the XML model to the 
relational model this way, but it doesn't distinguish the case where the 
xpath_string() result is an empty list from the case where it is a single node 
whose value is empty (e.g., <a/> vs. no <a> element). 
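
To make the ambiguity concrete, here is a minimal sketch using the JDK's 
javax.xml.xpath directly. It is not taken from the attached patch; the class 
and method names are made up for illustration, and it only assumes the UDFs 
follow XPath 1.0 string conversion. The string result is empty in both cases, 
while a node-set evaluation still tells them apart.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class, for illustration only (not part of the patch).
public class EmptyVsMissing {

    // Evaluates the path as a string, i.e. the XPath 1.0 string value
    // of the first matching node, or "" when nothing matches.
    static String asString(String xml, String path) {
        XPath xp = XPathFactory.newInstance().newXPath();
        try {
            return xp.evaluate(path, new InputSource(new StringReader(xml)));
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException(e);
        }
    }

    // Evaluates the path as a node-set and returns how many nodes matched.
    static int nodeCount(String xml, String path) {
        XPath xp = XPathFactory.newInstance().newXPath();
        try {
            NodeList nodes = (NodeList) xp.evaluate(
                path, new InputSource(new StringReader(xml)), XPathConstants.NODESET);
            return nodes.getLength();
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        String emptyElement = "<r><a/></r>"; // <a> exists but has no content
        String noElement = "<r/>";           // no <a> element at all

        // Both string results are empty, so they cannot be told apart.
        System.out.println("[" + asString(emptyElement, "r/a") + "]");
        System.out.println("[" + asString(noElement, "r/a") + "]");

        // The node counts differ: one node vs. none.
        System.out.println(nodeCount(emptyElement, "r/a"));
        System.out.println(nodeCount(noElement, "r/a"));
    }
}
```

If the distinction matters to users, returning NULL for the empty node-set 
case would be one way to preserve it on the relational side.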

Also, all this information should be exposed to the wider community (not 
only developers). Can you add it to the Hive wiki page as well? 

> Create UDFs for XPath expression evaluation
> -------------------------------------------
>
>                 Key: HIVE-1027
>                 URL: https://issues.apache.org/jira/browse/HIVE-1027
>             Project: Hadoop Hive
>          Issue Type: New Feature
>          Components: Query Processor
>            Reporter: Patrick Angeles
>            Assignee: Patrick Angeles
>            Priority: Minor
>         Attachments: hive-1027.patch, udf_xpath.patch
>
>
> Create UDFs for evaluating XPath expressions against XML documents.
> Examples:
> > SELECT xpath_double ('<a><b class="odd">1</b><b class="even">2</b><b 
> > class="odd">4</b><c>8</c></a>', 'sum(a/b[@class="odd"])') FROM src LIMIT 
> > 1 ;
> 5.0
> > SELECT xpath_string ('<a><b>b1</b><b>b2</b></a>', 'a/b[2]') FROM src LIMIT 
> > 1 ;
> b2
> > SELECT xpath ('<a><b>b1</b><b>b2</b><b>b3</b><c>c1</c><c>c2</c></a>', 
> > 'a/c/text()') FROM src LIMIT 1 ;
> ["c1","c2"]
> Included functions are: xpath_short, xpath_int, xpath_long, xpath_float, 
> xpath_double/xpath_number, xpath_string, xpath

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
