Well you are always free to create a Serde on top that works with abstractSerde in any version. I don’t think it will be difficult since the input format is already there. I don’t know exactly when the interface SerDe was removed.
> On 9. Jun 2018, at 09:09, Mich Talebzadeh <mich.talebza...@gmail.com> wrote: > > thanks Jorn. The only alternative is to use xpath UDF? Works as shown below > but tedious > > Like the example below > > $cat employees.xml > <employee> > <id>1</id> > <name>Satish Kumar</name> > <designation>Technical Lead</designation> > </employee> > <employee> > <id>2</id> > <name>Ramya</name> > <designation>Testing</designation> > </employee> > > Step:1 Bring each record to one line, by executing below command > > $cat employees.xml | tr -d '&' | tr '\n' ' ' | tr '\r' ' ' | sed > 's|</employee>|</employee>\n|g' | grep -v '^\s*$' > employees_records.xml > > $cat employees_records.xml > <employee> <id>1</id> <name>Satish Kumar</name> <designation>Technical > Lead</designation> </employee> > <employee> <id>2</id> <name>Ramya</name> <designation>Testing</designation> > </employee> > > tep:2 Load the file to HDFS > > $hadoop fs -mkdir /user/hive/sample-xml-inputs > > $hadoop fs -put employees_records.xml /user/hive/sample-xml-inputs > > $hadoop fs -cat /user/hive/sample-xml-inputs/employees_records.xml > <employee> <id>1</id> <name>Satish Kumar</name><designation>Technical > Lead</designation> </employee> > <employee> <id>2</id> <name>Ramya</name> <designation>Testing</designation> > </employee> > > Step:3 Create a Hive table and point to xml file > > hive>create external table xml_table_org( xmldata string) LOCATION > '/user/hive/sample-xml-inputs/'; > > hive> select * from xml_table_org; > OK > <employee> <id>1</id> <name>Satish Kumar</name> <designation>Technical > Lead</designation> </employee> > <employee> <id>2</id> <name>Ramya</name> <designation>Testing</designation> > </employee> > > Step 4: From the stage table we can query the elements and load it to other > table. > > hive> CREATE TABLE xml_table AS SELECT > xpath_int(xmldata,'employee/id'),xpath_string(xmldata,'employee/name'),xpath_string(xmldata,'employee/designation') > FROM xml_table_org; > > Dr Mich Talebzadeh > > LinkedIn > https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw > > http://talebzadehmich.wordpress.com > > Disclaimer: Use it at your own risk. Any and all responsibility for any loss, > damage or destruction of data or any other property which may arise from > relying on this email's technical content is explicitly disclaimed. The > author will in no case be liable for any monetary damages arising from such > loss, damage or destruction. > > >> On 9 June 2018 at 07:42, Jörn Franke <jornfra...@gmail.com> wrote: >> Yes. >> >> Serde must have been removed then in 2.x. >> >> >> >>> On 8. Jun 2018, at 23:52, Mich Talebzadeh <mich.talebza...@gmail.com> wrote: >>> >>> Ok I am looking at this jar file >>> >>> jar tf hive-serde-3.0.0.jar|grep -i abstractserde >>> org/apache/hadoop/hive/serde2/AbstractSerDe.class >>> >>> Is this the correct one? >>> >>> Thanks >>> >>> >>> Dr Mich Talebzadeh >>> >>> LinkedIn >>> https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw >>> >>> http://talebzadehmich.wordpress.com >>> >>> Disclaimer: Use it at your own risk. Any and all responsibility for any >>> loss, damage or destruction of data or any other property which may arise >>> from relying on this email's technical content is explicitly disclaimed. >>> The author will in no case be liable for any monetary damages arising from >>> such loss, damage or destruction. >>> >>> >>>> On 8 June 2018 at 22:34, Mich Talebzadeh <mich.talebza...@gmail.com> wrote: >>>> Thanks Jorn so what is the resolution? do I need another jar file? >>>> >>>> Dr Mich Talebzadeh >>>> >>>> LinkedIn >>>> https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw >>>> >>>> http://talebzadehmich.wordpress.com >>>> >>>> Disclaimer: Use it at your own risk. Any and all responsibility for any >>>> loss, damage or destruction of data or any other property which may arise >>>> from relying on this email's technical content is explicitly disclaimed. >>>> The author will in no case be liable for any monetary damages arising from >>>> such loss, damage or destruction. >>>> >>>> >>>>> On 8 June 2018 at 21:56, Jörn Franke <jornfra...@gmail.com> wrote: >>>>> Oha i see now Serde is a deprecated Interface , if i am not wrong it has >>>>> been replaced by the abstract class abstractserde >>>>> >>>>>> On 8. Jun 2018, at 22:22, Mich Talebzadeh <mich.talebza...@gmail.com> >>>>>> wrote: >>>>>> >>>>>> Thanks Jorn. >>>>>> >>>>>> Spark 2.3.3 (labelled as stable) >>>>>> >>>>>> First I put the jar file hivexmlserde-1.0.5.3.jar under $HIVE_HOME/lib >>>>>> and explicitly loaded with ADD JAR as well in hive session >>>>>> >>>>>> hive> ADD JAR hdfs://rhes75:9000/jars/hivexmlserde-1.0.5.3.jar; >>>>>> Added >>>>>> [/tmp/hive/7feb5165-780b-4ab6-aca8-f516d0388823_resources/hivexmlserde-1.0.5.3.jar] >>>>>> to class path >>>>>> Added resources: [hdfs://rhes75:9000/jars/hivexmlserde-1.0.5.3.jar] >>>>>> >>>>>> Then I ran a simple code given here >>>>>> >>>>>> hive> CREATE TABLE xml_41 (imap map<string,string>) >>>>>> > ROW FORMAT SERDE 'com.ibm.spss.hive.serde2.xml.XmlSerDe' >>>>>> > WITH SERDEPROPERTIES ( >>>>>> > "column.xpath.imap"="/file-format/data-set/element", >>>>>> > "xml.map.specification.element"="@name->#content" >>>>>> > ) >>>>>> > STORED AS >>>>>> > INPUTFORMAT 'com.ibm.spss.hive.serde2.xml.XmlInputFormat' >>>>>> > OUTPUTFORMAT >>>>>> 'org.apache.hadoop.hive.ql.io.IgnoreKeyTextOutputFormat' >>>>>> > TBLPROPERTIES ( >>>>>> > "xmlinput.start"="<file-format>", >>>>>> > "xmlinput.end"="</file-format>" >>>>>> > ); >>>>>> FAILED: Execution Error, return code 1 from >>>>>> org.apache.hadoop.hive.ql.exec.DDLTask. >>>>>> org/apache/hadoop/hive/serde2/SerDe >>>>>> >>>>>> And this is full error >>>>>> >>>>>> 2018-06-08T21:17:20,775 INFO [7feb5165-780b-4ab6-aca8-f516d0388823 >>>>>> main] ql.Driver: Starting task [Stage-0:DDL] in serial mode >>>>>> 2018-06-08T21:17:20,776 ERROR [7feb5165-780b-4ab6-aca8-f516d0388823 >>>>>> main] exec.DDLTask: java.lang.NoClassDefFoundError: >>>>>> org/apache/hadoop/hive/serde2/SerDe >>>>>> at java.lang.ClassLoader.defineClass1(Native Method) >>>>>> at java.lang.ClassLoader.defineClass(ClassLoader.java:763) >>>>>> at >>>>>> java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142) >>>>>> at java.net.URLClassLoader.defineClass(URLClassLoader.java:467) >>>>>> at java.net.URLClassLoader.access$100(URLClassLoader.java:73) >>>>>> at java.net.URLClassLoader$1.run(URLClassLoader.java:368) >>>>>> at java.net.URLClassLoader$1.run(URLClassLoader.java:362) >>>>>> at java.security.AccessController.doPrivileged(Native Method) >>>>>> at java.net.URLClassLoader.findClass(URLClassLoader.java:361) >>>>>> at java.lang.ClassLoader.loadClass(ClassLoader.java:424) >>>>>> at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331) >>>>>> at java.lang.ClassLoader.loadClass(ClassLoader.java:411) >>>>>> at java.lang.ClassLoader.loadClass(ClassLoader.java:357) >>>>>> at java.lang.Class.forName0(Native Method) >>>>>> at java.lang.Class.forName(Class.java:348) >>>>>> at >>>>>> org.apache.hadoop.conf.Configuration.getClassByNameOrNull(Configuration.java:2134) >>>>>> at >>>>>> org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:2099) >>>>>> at >>>>>> org.apache.hadoop.hive.ql.exec.DDLTask.validateSerDe(DDLTask.java:4213) >>>>>> at >>>>>> org.apache.hadoop.hive.ql.plan.CreateTableDesc.toTable(CreateTableDesc.java:723) >>>>>> at >>>>>> org.apache.hadoop.hive.ql.exec.DDLTask.createTable(DDLTask.java:4321) >>>>>> at >>>>>> org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:354) >>>>>> at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:199) >>>>>> at >>>>>> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:100) >>>>>> at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:2183) >>>>>> at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1839) >>>>>> at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1526) >>>>>> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1237) >>>>>> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1227) >>>>>> at >>>>>> org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:233) >>>>>> at >>>>>> org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:184) >>>>>> at >>>>>> org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:403) >>>>>> at >>>>>> org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:821) >>>>>> at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:759) >>>>>> at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:686) >>>>>> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) >>>>>> at >>>>>> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) >>>>>> at >>>>>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) >>>>>> at java.lang.reflect.Method.invoke(Method.java:498) >>>>>> at org.apache.hadoop.util.RunJar.run(RunJar.java:221) >>>>>> at org.apache.hadoop.util.RunJar.main(RunJar.java:136) >>>>>> Caused by: java.lang.ClassNotFoundException: >>>>>> org.apache.hadoop.hive.serde2.SerDe >>>>>> at java.net.URLClassLoader.findClass(URLClassLoader.java:381) >>>>>> at java.lang.ClassLoader.loadClass(ClassLoader.java:424) >>>>>> at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331) >>>>>> at java.lang.ClassLoader.loadClass(ClassLoader.java:357) >>>>>> ... 40 more >>>>>> >>>>>> The jar file has the classes! >>>>>> >>>>>> jar tf hivexmlserde-1.0.5.3.jar >>>>>> META-INF/ >>>>>> META-INF/MANIFEST.MF >>>>>> com/ >>>>>> com/ibm/ >>>>>> com/ibm/spss/ >>>>>> com/ibm/spss/hive/ >>>>>> com/ibm/spss/hive/serde2/ >>>>>> com/ibm/spss/hive/serde2/xml/ >>>>>> com/ibm/spss/hive/serde2/xml/objectinspector/ >>>>>> com/ibm/spss/hive/serde2/xml/processor/ >>>>>> com/ibm/spss/hive/serde2/xml/processor/java/ >>>>>> com/ibm/spss/hive/serde2/xml/HiveXmlRecordReader.class >>>>>> com/ibm/spss/hive/serde2/xml/objectinspector/XmlListObjectInspector.class >>>>>> com/ibm/spss/hive/serde2/xml/objectinspector/XmlMapObjectInspector.class >>>>>> com/ibm/spss/hive/serde2/xml/objectinspector/XmlObjectInspectorFactory$1.class >>>>>> com/ibm/spss/hive/serde2/xml/objectinspector/XmlObjectInspectorFactory.class >>>>>> com/ibm/spss/hive/serde2/xml/objectinspector/XmlStructObjectInspector$1.class >>>>>> com/ibm/spss/hive/serde2/xml/objectinspector/XmlStructObjectInspector.class >>>>>> com/ibm/spss/hive/serde2/xml/processor/AbstractXmlProcessor$1.class >>>>>> com/ibm/spss/hive/serde2/xml/processor/AbstractXmlProcessor$2.class >>>>>> com/ibm/spss/hive/serde2/xml/processor/AbstractXmlProcessor.class >>>>>> com/ibm/spss/hive/serde2/xml/processor/java/JavaXmlProcessor$1.class >>>>>> com/ibm/spss/hive/serde2/xml/processor/java/JavaXmlProcessor$2.class >>>>>> com/ibm/spss/hive/serde2/xml/processor/java/JavaXmlProcessor.class >>>>>> com/ibm/spss/hive/serde2/xml/processor/java/JavaXmlQuery.class >>>>>> com/ibm/spss/hive/serde2/xml/processor/java/NodeArray.class >>>>>> com/ibm/spss/hive/serde2/xml/processor/SerDeArray.class >>>>>> com/ibm/spss/hive/serde2/xml/processor/XmlMapEntry.class >>>>>> com/ibm/spss/hive/serde2/xml/processor/XmlMapFacet$Type.class >>>>>> com/ibm/spss/hive/serde2/xml/processor/XmlMapFacet.class >>>>>> com/ibm/spss/hive/serde2/xml/processor/XmlNode$1.class >>>>>> com/ibm/spss/hive/serde2/xml/processor/XmlNode$2.class >>>>>> com/ibm/spss/hive/serde2/xml/processor/XmlNode.class >>>>>> com/ibm/spss/hive/serde2/xml/processor/XmlNodeArray.class >>>>>> com/ibm/spss/hive/serde2/xml/processor/XmlProcessor.class >>>>>> com/ibm/spss/hive/serde2/xml/processor/XmlProcessorContext.class >>>>>> com/ibm/spss/hive/serde2/xml/processor/XmlQuery.class >>>>>> com/ibm/spss/hive/serde2/xml/processor/XmlTransformer.class >>>>>> com/ibm/spss/hive/serde2/xml/processor/XmlUtils$1.class >>>>>> com/ibm/spss/hive/serde2/xml/processor/XmlUtils.class >>>>>> com/ibm/spss/hive/serde2/xml/SplittableXmlInputFormat.class >>>>>> com/ibm/spss/hive/serde2/xml/XmlInputFormat$XmlRecordReader.class >>>>>> com/ibm/spss/hive/serde2/xml/XmlInputFormat.class >>>>>> com/ibm/spss/hive/serde2/xml/XmlSerDe$1.class >>>>>> com/ibm/spss/hive/serde2/xml/XmlSerDe.class >>>>>> META-INF/maven/ >>>>>> META-INF/maven/com.ibm.spss.hive.serde2.xml/ >>>>>> META-INF/maven/com.ibm.spss.hive.serde2.xml/hivexmlserde/ >>>>>> META-INF/maven/com.ibm.spss.hive.serde2.xml/hivexmlserde/pom.xml >>>>>> META-INF/maven/com.ibm.spss.hive.serde2.xml/hivexmlserde/pom.properties >>>>>> >>>>>> >>>>>> >>>>>> Dr Mich Talebzadeh >>>>>> >>>>>> LinkedIn >>>>>> https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw >>>>>> >>>>>> http://talebzadehmich.wordpress.com >>>>>> >>>>>> Disclaimer: Use it at your own risk. Any and all responsibility for any >>>>>> loss, damage or destruction of data or any other property which may >>>>>> arise from relying on this email's technical content is explicitly >>>>>> disclaimed. The author will in no case be liable for any monetary >>>>>> damages arising from such loss, damage or destruction. >>>>>> >>>>>> >>>>>>> On 8 June 2018 at 17:58, Jörn Franke <jornfra...@gmail.com> wrote: >>>>>>> Can you get the log files and start Hive with more detailled logs? >>>>>>> In could be that not all libraries are loaded (i don’t remember anymore >>>>>>> but I think this one needs more , I can look next week in my docs) or >>>>>>> that it does not support maps (not sure). >>>>>>> You can try first with a more simpler extraction with a String field to >>>>>>> see if it works . >>>>>>> >>>>>>> Hive has always had external libraries for xml support and I used the >>>>>>> one below with Hive 1.x, but it should also work with 2.x (3 not sure, >>>>>>> but it should if it works in 2.x) >>>>>>> >>>>>>> >>>>>>>> On 8. Jun 2018, at 17:53, Mich Talebzadeh <mich.talebza...@gmail.com> >>>>>>>> wrote: >>>>>>>> >>>>>>>> I tried Hive 2.0.1, 2.3.2 and now Hive 3/ >>>>>>>> >>>>>>>> I explicitly added hivexmlserde jar file as ADD JAR shown below >>>>>>>> >>>>>>>> 0: jdbc:hive2://rhes75:10099/default> ADD JAR >>>>>>>> hdfs://rhes75:9000/jars/hivexmlserde-1.0.5.3.jar; >>>>>>>> No rows affected (0.002 seconds) >>>>>>>> >>>>>>>> But still cannot create an xml table >>>>>>>> >>>>>>>> 0: jdbc:hive2://rhes75:10099/default> CREATE TABLE xml_41 (imap >>>>>>>> map<string,string>) ROW FORMAT SERDE >>>>>>>> 'com.ibm.spss.hive.serde2.xml.XmlSerDe' WITH SERDEPROPERTIES >>>>>>>> ("column.xpath.imap"="/file-format/data-set/element","xml.map.specification.element"="@name->#content") >>>>>>>> STORED AS INPUTFORMAT 'com.ibm.spss.hive.serde2.xml.XmlInputFormat' >>>>>>>> OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.IgnoreKeyTextOutputFormat' >>>>>>>> TBLPROPERTIES >>>>>>>> ("xmlinput.start"="<file-format>","xmlinput.end"="</file-format>"); >>>>>>>> >>>>>>>> Error: Error while processing statement: FAILED: Execution Error, >>>>>>>> return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. >>>>>>>> org/apache/hadoop/hive/serde2/SerDe (state=08S01,code=1) >>>>>>>> >>>>>>>> Does anyone know the cause of this or which version of Hive supports >>>>>>>> creating an XML table? >>>>>>>> >>>>>>>> Thanks >>>>>>>> Dr Mich Talebzadeh >>>>>>>> >>>>>>>> LinkedIn >>>>>>>> https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw >>>>>>>> >>>>>>>> http://talebzadehmich.wordpress.com >>>>>>>> >>>>>>>> Disclaimer: Use it at your own risk. Any and all responsibility for >>>>>>>> any loss, damage or destruction of data or any other property which >>>>>>>> may arise from relying on this email's technical content is explicitly >>>>>>>> disclaimed. The author will in no case be liable for any monetary >>>>>>>> damages arising from such loss, damage or destruction. >>>>>>>> >>>>>> >>>> >>> >