KendraKrat opened a new issue #2307:
URL: https://github.com/apache/drill/issues/2307


   When an XML element has multiple sub-elements of the same name, and those 
sub-elements have attributes, the attribute values are concatenated into a 
single string with no delimiter, so the individual values cannot be recovered.
   
   For example, start with the "list of books" example published in the 
documentation. Add three sub-elements named "extra" to one of the books, each 
having two attributes (name and value). The following is excerpted from the 
XML file that I have attached.
   
      <book>
        <author>Mark Twain</author>
        <title>The Adventures of Tom Sawyer</title>
        <category>FICTION</category>
        <year>1876</year>
        <extra name="width" value="6"/>
        <extra name="height" value="10"/>
        <extra name="depth" value="2"/>
      </book>
   
   The output for this turns into:
   
   +--------------------------------------------------------+------------+------------------------------+----------+------+---------+
   | attributes                                             | author     | title                        | category | year | authors |
   +--------------------------------------------------------+------------+------------------------------+----------+------+---------+
   | {"extra_name":"widthheightdepth","extra_value":"6102"} | Mark Twain | The Adventures of Tom Sawyer | FICTION  | 1876 | {}      |
   
   It shows only one value for the "extra_name" attributes, which is the 
concatenation of the names "width", "height", and "depth" into 
"widthheightdepth". Similarly it only shows one value for the "extra_value" 
attributes, which is the concatenation of the values "6" "10" and "2" into 
"6102". Unfortunately it's impossible to know how to separate those 
concatenated strings.
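
   To illustrate why the concatenated string cannot be split back reliably, 
here is a minimal Python sketch (not Drill code; the helper name `splits` is 
made up for this example). It lists every way to cut "6102" into three 
non-empty pieces, which is all a consumer could do without a delimiter.

```python
from itertools import combinations


def splits(s, parts):
    """Return every way to cut s into `parts` non-empty pieces."""
    results = []
    # Choose parts - 1 cut positions strictly inside the string.
    for cuts in combinations(range(1, len(s)), parts - 1):
        bounds = (0, *cuts, len(s))
        results.append([s[a:b] for a, b in zip(bounds, bounds[1:])])
    return results


# "6102" is the concatenated extra_value from the output above.
candidates = splits("6102", 3)
for candidate in candidates:
    print(candidate)
```

   More than one candidate comes back, and the original values "6", "10", "2" 
are only one of them, so nothing in the output identifies the right split.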
   
   I would have expected to see something like one of the following for the 
attributes output instead, so that the different attribute values are separable:
   
   
   [{"extra_name":"width","extra_value":"6"},{"extra_name":"height","extra_value":"10"},{"extra_name":"depth","extra_value":"2"}]
   or
   {"extra_name":["width","height","depth"],"extra_value":["6","10","2"]}
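
   As a rough sketch of the two shapes described above, the following Python 
snippet builds both from the excerpted book using the standard library's 
`xml.etree.ElementTree`. This is only an illustration of the expected result, 
not how Drill's XML reader works; the variable names and the use of JSON lists 
for the repeated values are assumptions of the example.

```python
import json
import xml.etree.ElementTree as ET

# The <book> element excerpted earlier in this report.
BOOK_XML = """
<book>
  <author>Mark Twain</author>
  <title>The Adventures of Tom Sawyer</title>
  <category>FICTION</category>
  <year>1876</year>
  <extra name="width" value="6"/>
  <extra name="height" value="10"/>
  <extra name="depth" value="2"/>
</book>
"""

book = ET.fromstring(BOOK_XML)
extras = book.findall("extra")

# Shape 1: one map per <extra> element, keeping each name/value pair together.
per_element = [
    {f"extra_{key}": value for key, value in extra.attrib.items()}
    for extra in extras
]

# Shape 2: one list per attribute name, with values in document order.
per_attribute = {}
for extra in extras:
    for key, value in extra.attrib.items():
        per_attribute.setdefault(f"extra_{key}", []).append(value)

print(json.dumps(per_element))
print(json.dumps(per_attribute))
```

   Either shape keeps the three values separable. The first also keeps each 
name paired with its value, which the second only preserves by list position.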
   
   **Desktop (please complete the following information):**
    - OS: Windows 10
    - Browser: N/A
    - Version: 1.19.0
   
   
[books-multiple-extras.xml.txt](https://github.com/apache/drill/files/7101221/books-multiple-extras.xml.txt)
   



