Hi,

I have a simple XML file containing a single record for myself, shown below:

<rec><name><fname>Mich</fname><lname>Talebzadeh</lname></name>
<domicile>UK</domicile>
<contact>
<email><personal>mich.talebza...@gmail.com</personal>
<official>m...@peridale.co.uk</official></email>
<phone><mobile>12345</mobile>
<office>12346</office>
<residence>12347</residence></phone></contact></rec>

I try to read it in as follows and get an error:

0: jdbc:hive2://rhes75:10099/default> use test;
OK
No rows affected (0.032 seconds)
0: jdbc:hive2://rhes75:10099/default> drop table if exists xml_temp;
OK
No rows affected (0.229 seconds)
0: jdbc:hive2://rhes75:10099/default> -- create a load table
0: jdbc:hive2://rhes75:10099/default> create table xml_temp (line string);
OK
No rows affected (0.181 seconds)
0: jdbc:hive2://rhes75:10099/default> -- load data from local xml file
0: jdbc:hive2://rhes75:10099/default> load data local inpath "/home/hduser/dba/bin/xml/test.xml" into table xml_temp;
Loading data to table test.xml_temp
OK
No rows affected (0.487 seconds)
0: jdbc:hive2://rhes75:10099/default> select * from xml_temp;
OK
+----------------------------------------------------+
|                   xml_temp.line                    |
+----------------------------------------------------+
| <rec><name><fname>Mich</fname><lname>Talebzadeh</lname></name> |
| <domicile>UK</domicile>                            |
| <contact>                                          |
| <email><personal>mich.talebza...@gmail.com</personal> |
| <official>m...@peridale.co.uk</official></email>   |
| <phone><mobile>12345</mobile>                      |
| <office>12346</office>                             |
| <residence>12347</residence></phone></contact></rec> |
+----------------------------------------------------+
8 rows selected (0.16 seconds)
0: jdbc:hive2://rhes75:10099/default> select xpath_string(line,'rec/name/fname')
. . . . . . . . . . . . . . . . . . > , xpath(line,'rec/name/lname/text()')
. . . . . . . . . . . . . . . . . . > , xpath(line,'rec/domicile/text()')
. . . . . . . . . . . . . . . . . . > , xpath(line,'rec/contact/email/personal/text()')
. . . . . . . . . . . . . . . . . . > , xpath(line,'rec/contact/email/official/text()')
. . . . . . . . . . . . . . . . . . > , xpath(line,'rec/contact/phone/mobile/text()')
. . . . . . . . . . . . . . . . . . > , xpath(line,'rec/contact/phone/office/text()')
. . . . . . . . . . . . . . . . . . > , xpath(line,'rec/contact/phone/residence/text()')
. . . . . . . . . . . . . . . . . . > from xml_temp;
OK
[Fatal Error] :1:63: XML document structures must start and end within the same entity.
Error: java.io.IOException: org.apache.hadoop.hive.ql.metadata.HiveException: Unable to execute method public org.apache.hadoop.io.Text org.apache.hadoop.hive.ql.udf.xml.UDFXPathString.evaluate(java.lang.String,java.lang.String) with arguments {<rec><name><fname>Mich</fname><lname>Talebzadeh</lname></name>,rec/name/fname}:Error loading expression 'rec/name/fname' (state=,code=0)
Closing: 0: jdbc:hive2://rhes75:10099/default

*However, if I go back and remove the line breaks from the source XML file, so that the whole record sits on one line, it works!*

<rec><name><fname>Mich</fname><lname>Talebzadeh</lname></name><domicile>UK</domicile><contact><email><personal>mich.talebza...@gmail.com</personal><official>m...@peridale.co.uk</official></email><phone><mobile>12345</mobile><office>12346</office><residence>12347</residence></phone></contact></rec>

0: jdbc:hive2://rhes75:10099/default> use test;
OK
No rows affected (0.036 seconds)
0: jdbc:hive2://rhes75:10099/default> drop table if exists xml_temp;
OK
No rows affected (0.222 seconds)
0: jdbc:hive2://rhes75:10099/default> -- create a load table
0: jdbc:hive2://rhes75:10099/default> create table xml_temp (line string);
OK
No rows affected (0.18 seconds)
0: jdbc:hive2://rhes75:10099/default> -- load data from local xml file
0: jdbc:hive2://rhes75:10099/default> load data local inpath "/home/hduser/dba/bin/xml/test.xml" into table xml_temp;
Loading data to table test.xml_temp
OK
No rows affected (0.407 seconds)
0: jdbc:hive2://rhes75:10099/default> select * from xml_temp;
OK
+----------------------------------------------------+
|                   xml_temp.line                    |
+----------------------------------------------------+
| <rec><name><fname>Mich</fname><lname>Talebzadeh</lname></name><domicile>UK</domicile><contact><email><personal>mich.talebza...@gmail.com</personal><official>m...@peridale.co.uk</official></email><phone><mobile>12345</mobile><office>12346</office><residence>12347</residence></phone></contact></rec> |
+----------------------------------------------------+
1 row selected (0.192 seconds)
0: jdbc:hive2://rhes75:10099/default> select xpath_string(line,'rec/name/fname')
. . . . . . . . . . . . . . . . . . > , xpath(line,'rec/name/lname/text()')
. . . . . . . . . . . . . . . . . . > , xpath(line,'rec/domicile/text()')
. . . . . . . . . . . . . . . . . . > , xpath(line,'rec/contact/email/personal/text()')
. . . . . . . . . . . . . . . . . . > , xpath(line,'rec/contact/email/official/text()')
. . . . . . . . . . . . . . . . . . > , xpath(line,'rec/contact/phone/mobile/text()')
. . . . . . . . . . . . . . . . . . > , xpath(line,'rec/contact/phone/office/text()')
. . . . . . . . . . . . . . . . . . > , xpath(line,'rec/contact/phone/residence/text()')
. . . . . . . . . . . . . . . . . . > from xml_temp;
OK
+-------+-----------------+---------+--------------------------------+--------------------------+------------+------------+------------+
|  _c0  |       _c1       |   _c2   |              _c3               |           _c4            |    _c5     |    _c6     |    _c7     |
+-------+-----------------+---------+--------------------------------+--------------------------+------------+------------+------------+
| Mich  | ["Talebzadeh"]  | ["UK"]  | ["mich.talebza...@gmail.com"]  | [" m...@peridale.co.uk"] | ["12345"]  | ["12346"]  | ["12347"]  |
+-------+-----------------+---------+--------------------------------+--------------------------+------------+------------+------------+
1 row selected (0.185 seconds)
0: jdbc:hive2://rhes75:10099/default> !exit

Dr Mich Talebzadeh

LinkedIn: https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw

http://talebzadehmich.wordpress.com

*Disclaimer:* Use it at your own risk. Any and all responsibility for any loss, damage or destruction of data or any other property which may arise from relying on this email's technical content is explicitly disclaimed. The author will in no case be liable for any monetary damages arising from such loss, damage or destruction.
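
P.S. For reference, the manual workaround above (putting each record on a single line before loading) could be scripted along the following lines. In the failing run the table held 8 rows, one per line of the file, so each xpath call only ever saw a fragment of the record. This is only a rough sketch: the function name flatten_records and its record_tag parameter are hypothetical names of mine, the sample record is abbreviated, and it assumes records are not nested and that no text value itself spans a line break.

```python
import re
from typing import List


def flatten_records(text: str, record_tag: str = "rec") -> List[str]:
    """Return one single-line string per <record_tag>...</record_tag> element.

    Hypothetical helper: assumes records are not nested and that no text
    value is split across lines.
    """
    # Remove the line breaks (and any per-line indentation) inside the records.
    joined = "".join(line.strip() for line in text.splitlines())
    # Non-greedy match so that consecutive records are returned separately.
    tag = re.escape(record_tag)
    return re.findall(rf"<{tag}>.*?</{tag}>", joined)


if __name__ == "__main__":
    # Abbreviated version of the multi-line record from this email.
    multi_line = """<rec><name><fname>Mich</fname><lname>Talebzadeh</lname></name>
<domicile>UK</domicile>
<contact>
<phone><mobile>12345</mobile>
<office>12346</office>
<residence>12347</residence></phone></contact></rec>
"""
    # Each printed line is one complete record, ready to be written to the
    # file that is then loaded into xml_temp.
    for record in flatten_records(multi_line):
        print(record)
```

Writing the returned strings out one per line gives a file in which every row loaded into xml_temp is a complete XML document, which is the condition under which the xpath query above succeeded.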