Hi,

I have a simple XML file holding a single record about myself, shown below:

<rec><name><fname>Mich</fname><lname>Talebzadeh</lname></name>
<domicile>UK</domicile>
<contact>
<email><personal>mich.talebza...@gmail.com</personal>
<official>m...@peridale.co.uk</official></email>
<phone><mobile>12345</mobile>
<office>12346</office>
<residence>12347</residence></phone></contact></rec>

I try to read it into Hive as follows and get an error:

0: jdbc:hive2://rhes75:10099/default> use test;
OK
No rows affected (0.032 seconds)
0: jdbc:hive2://rhes75:10099/default> drop table if exists xml_temp;
OK
No rows affected (0.229 seconds)
0: jdbc:hive2://rhes75:10099/default> -- create a load table
0: jdbc:hive2://rhes75:10099/default> create table xml_temp (line string);
OK
No rows affected (0.181 seconds)
0: jdbc:hive2://rhes75:10099/default> -- load data from local xml file
0: jdbc:hive2://rhes75:10099/default> load data local inpath "/home/hduser/dba/bin/xml/test.xml" into table xml_temp;
Loading data to table test.xml_temp
OK
No rows affected (0.487 seconds)
0: jdbc:hive2://rhes75:10099/default> select * from xml_temp;
OK
+----------------------------------------------------+
|                   xml_temp.line                    |
+----------------------------------------------------+
| <rec><name><fname>Mich</fname><lname>Talebzadeh</lname></name> |
| <domicile>UK</domicile>                            |
| <contact>                                          |
| <email><personal>mich.talebza...@gmail.com</personal> |
| <official>m...@peridale.co.uk</official></email>   |
| <phone><mobile>12345</mobile>                      |
| <office>12346</office>                             |
| <residence>12347</residence></phone></contact></rec> |
+----------------------------------------------------+
8 rows selected (0.16 seconds)
0: jdbc:hive2://rhes75:10099/default> select xpath_string(line,'rec/name/fname')
. . . . . . . . . . . . . . . . . . >     , xpath(line,'rec/name/lname/text()')
. . . . . . . . . . . . . . . . . . >     , xpath(line,'rec/domicile/text()')
. . . . . . . . . . . . . . . . . . >     , xpath(line,'rec/contact/email/personal/text()')
. . . . . . . . . . . . . . . . . . >     , xpath(line,'rec/contact/email/official/text()')
. . . . . . . . . . . . . . . . . . >     , xpath(line,'rec/contact/phone/mobile/text()')
. . . . . . . . . . . . . . . . . . >     , xpath(line,'rec/contact/phone/office/text()')
. . . . . . . . . . . . . . . . . . >     , xpath(line,'rec/contact/phone/residence/text()')
. . . . . . . . . . . . . . . . . . > from xml_temp;
OK
[Fatal Error] :1:63: XML document structures must start and end within the
same entity.
Error: java.io.IOException:
org.apache.hadoop.hive.ql.metadata.HiveException: Unable to execute method
public org.apache.hadoop.io.Text
org.apache.hadoop.hive.ql.udf.xml.UDFXPathString.evaluate(java.lang.String,java.lang.String)
with arguments
{<rec><name><fname>Mich</fname><lname>Talebzadeh</lname></name>,rec/name/fname}:Error
loading expression 'rec/name/fname' (state=,code=0)
Closing: 0: jdbc:hive2://rhes75:10099/default
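
A likely explanation, offered as a sketch rather than a confirmed diagnosis: xml_temp is a plain text table, so each physical line of test.xml becomes its own row, and xpath_string is handed only the first line, where <rec> is never closed. The reported position 1:63 fits this, since that first line is 62 characters long and the parser fails just past its end. The Python below reproduces the effect outside Hive with the standard-library xml.etree.ElementTree parser. This is not the parser Hive uses, so the error text differs, and the <email> block is left out because the addresses are elided above.

```python
import xml.etree.ElementTree as ET

# The record from test.xml, one string per physical line
# (the <email> block is omitted; its addresses are elided in the post).
LINES = [
    "<rec><name><fname>Mich</fname><lname>Talebzadeh</lname></name>",
    "<domicile>UK</domicile>",
    "<contact>",
    "<phone><mobile>12345</mobile>",
    "<office>12346</office>",
    "<residence>12347</residence></phone></contact></rec>",
]


def is_well_formed(fragment: str) -> bool:
    """Return True if the fragment parses as a complete XML document."""
    try:
        ET.fromstring(fragment)
        return True
    except ET.ParseError:
        return False


# One row per physical line: what the query sees when the file has line breaks.
per_row = [is_well_formed(line) for line in LINES]

# The whole record as one string: what the query sees for a single-line file.
whole_record = is_well_formed("".join(LINES))

if __name__ == "__main__":
    print("per-row well-formed:", per_row)
    print("whole record well-formed:", whole_record)
```

The first line on its own fails to parse, while the joined record parses cleanly. A couple of the middle lines happen to be well-formed by themselves, but they contain no rec element, so the paths in the query would not match them anyway.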


*However, if I go back and remove the line breaks from the source XML file,
so that the whole record sits on a single line, it works!*

<rec><name><fname>Mich</fname><lname>Talebzadeh</lname></name><domicile>UK</domicile><contact><email><personal>mich.talebza...@gmail.com</personal><official>m...@peridale.co.uk</official></email><phone><mobile>12345</mobile><office>12346</office><residence>12347</residence></phone></contact></rec>
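
The flattening can be scripted instead of edited by hand. This is a minimal sketch, assuming the file holds exactly one <rec> record and fits in memory. The names flatten_xml and flatten_file are made up for illustration. Stripping whitespace at line ends is only safe when no element's text spans a line break, which holds for the record above.

```python
from pathlib import Path


def flatten_xml(text: str) -> str:
    """Join all physical lines into one, dropping CR/LF and per-line edge whitespace."""
    return "".join(line.strip() for line in text.splitlines())


def flatten_file(src: str, dst: str) -> None:
    """Write a single-line copy of src to dst (hypothetical helper, one record per file)."""
    flat = flatten_xml(Path(src).read_text())
    # Keep one trailing newline so the file still ends with a line terminator.
    Path(dst).write_text(flat + "\n")
```

For example, flatten_file("/home/hduser/dba/bin/xml/test.xml", "test_flat.xml") would write the one-line version to a new file (the output name is arbitrary), which can then be loaded with the same load data statement. A file with several records would need one line per record instead of one line overall.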

0: jdbc:hive2://rhes75:10099/default> use test;
OK
No rows affected (0.036 seconds)
0: jdbc:hive2://rhes75:10099/default> drop table if exists xml_temp;
OK
No rows affected (0.222 seconds)
0: jdbc:hive2://rhes75:10099/default> -- create a load table
0: jdbc:hive2://rhes75:10099/default> create table xml_temp (line string);
OK
No rows affected (0.18 seconds)
0: jdbc:hive2://rhes75:10099/default> -- load data from local xml file
0: jdbc:hive2://rhes75:10099/default> load data local inpath "/home/hduser/dba/bin/xml/test.xml" into table xml_temp;
Loading data to table test.xml_temp
OK
No rows affected (0.407 seconds)
0: jdbc:hive2://rhes75:10099/default> select * from xml_temp;
OK
+----------------------------------------------------+
|                   xml_temp.line                    |
+----------------------------------------------------+
| <rec><name><fname>Mich</fname><lname>Talebzadeh</lname></name><domicile>UK</domicile><contact><email><personal>mich.talebza...@gmail.com</personal><official>m...@peridale.co.uk</official></email><phone><mobile>12345</mobile><office>12346</office><residence>12347</residence></phone></contact></rec> |
+----------------------------------------------------+
1 row selected (0.192 seconds)
0: jdbc:hive2://rhes75:10099/default> select xpath_string(line,'rec/name/fname')
. . . . . . . . . . . . . . . . . . >     , xpath(line,'rec/name/lname/text()')
. . . . . . . . . . . . . . . . . . >     , xpath(line,'rec/domicile/text()')
. . . . . . . . . . . . . . . . . . >     , xpath(line,'rec/contact/email/personal/text()')
. . . . . . . . . . . . . . . . . . >     , xpath(line,'rec/contact/email/official/text()')
. . . . . . . . . . . . . . . . . . >     , xpath(line,'rec/contact/phone/mobile/text()')
. . . . . . . . . . . . . . . . . . >     , xpath(line,'rec/contact/phone/office/text()')
. . . . . . . . . . . . . . . . . . >     , xpath(line,'rec/contact/phone/residence/text()')
. . . . . . . . . . . . . . . . . . > from xml_temp;
OK
+-------+-----------------+---------+--------------------------------+--------------------------+------------+------------+------------+
|  _c0  |       _c1       |   _c2   |              _c3               |           _c4            |    _c5     |    _c6     |    _c7     |
+-------+-----------------+---------+--------------------------------+--------------------------+------------+------------+------------+
| Mich  | ["Talebzadeh"]  | ["UK"]  | ["mich.talebza...@gmail.com"]  | ["m...@peridale.co.uk"]  | ["12345"]  | ["12346"]  | ["12347"]  |
+-------+-----------------+---------+--------------------------------+--------------------------+------------+------------+------------+
1 row selected (0.185 seconds)
0: jdbc:hive2://rhes75:10099/default> !exit
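
A note on the result: _c0 is a plain string while the other columns are one-element arrays. That is because xpath_string returns the text of the first matching node, whereas xpath returns an array of all matches. The Python below is a rough analogy using the standard-library ElementTree, not Hive's implementation, and it uses a cut-down copy of the record.

```python
import xml.etree.ElementTree as ET

# Cut-down copy of the single-line record (contact block omitted).
RECORD = (
    "<rec><name><fname>Mich</fname><lname>Talebzadeh</lname></name>"
    "<domicile>UK</domicile></rec>"
)
root = ET.fromstring(RECORD)  # root is the <rec> element

# Analogue of xpath_string(line,'rec/name/fname'): text of the first match.
fname = root.findtext("name/fname")

# Analogue of xpath(line,'rec/name/lname/text()'): every match, as a list.
lnames = [node.text for node in root.findall("name/lname")]

if __name__ == "__main__":
    print(fname, lnames)
```

If scalar columns are wanted throughout, using xpath_string for every field in the query should give plain strings instead of arrays.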


Dr Mich Talebzadeh



LinkedIn: https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw



http://talebzadehmich.wordpress.com


*Disclaimer:* Use it at your own risk. Any and all responsibility for any
loss, damage or destruction of data or any other property which may arise
from relying on this email's technical content is explicitly disclaimed.
The author will in no case be liable for any monetary damages arising from
such loss, damage or destruction.
