Re: Which version of Hive can handle creating XML table?

2018-06-12 Thread kristijan berta
Apologies, here is the link to the product:
https://sonra.io/flexter-for-xml/

and how it can be used with Hive:
https://sonra.io/2018/01/27/converting-xml-hive/


On Mon, Jun 11, 2018 at 5:46 PM, Mich Talebzadeh wrote:

> many thanks. but I cannot see any specific product name there?
>
> Dr Mich Talebzadeh
>
>
>
> LinkedIn * 
> https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
> *
>
>
>
> http://talebzadehmich.wordpress.com
>
>
> *Disclaimer:* Use it at your own risk. Any and all responsibility for any
> loss, damage or destruction of data or any other property which may arise
> from relying on this email's technical content is explicitly disclaimed.
> The author will in no case be liable for any monetary damages arising from
> such loss, damage or destruction.
>
>
>

Re: Which version of Hive can handle creating XML table?

2018-06-11 Thread Mich Talebzadeh
Many thanks, but I cannot see any specific product name there.


On 11 June 2018 at 14:10, kristijan berta  wrote:

> The XPath stuff works reasonably well for simple XML files.
>
> However for complex XML files that change frequently and need to be
> ingested in realtime you might look at a 3rd party solution, e.g. here:
> https://dataworkssummit.com/san-jose-2018/session/add-a-spark-to-your-etl/
>

Re: Which version of Hive can handle creating XML table?

2018-06-11 Thread kristijan berta
The XPath approach works reasonably well for simple XML files.

However, for complex XML files that change frequently and need to be
ingested in real time, you might look at a third-party solution, e.g. here:
https://dataworkssummit.com/san-jose-2018/session/add-a-spark-to-your-etl/


Re: Which version of Hive can handle creating XML table?

2018-06-09 Thread Jörn Franke
Well, you are always free to create a SerDe on top that works with AbstractSerDe 
in any version. I don’t think it will be difficult, since the input format is 
already there.
I don’t know exactly when the SerDe interface was removed.


Re: Which version of Hive can handle creating XML table?

2018-06-09 Thread Mich Talebzadeh
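Thanks Jörn. Is the only alternative to use the xpath UDFs? That works, as
shown below, but it is tedious.

For example:

$ cat employees.xml
<employee>
<id>1</id>
<name>Satish Kumar</name>
<designation>Technical Lead</designation>
</employee>
<employee>
<id>2</id>
<name>Ramya</name>
<designation>Testing</designation>
</employee>

*Step 1: Bring each record onto one line by executing the command below*

$ cat employees.xml | tr -d '&' | tr '\n' ' ' | tr '\r' ' ' | sed 's|</employee>|</employee>\n|g' | grep -v '^\s*$' > employees_records.xml

$ cat employees_records.xml
<employee> <id>1</id> <name>Satish Kumar</name> <designation>Technical Lead</designation> </employee>
 <employee> <id>2</id> <name>Ramya</name> <designation>Testing</designation> </employee>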
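
For anyone who needs the same flattening without GNU sed, the Step 1 pipeline can be approximated in Python. This is only a rough sketch: it assumes every record is a top-level <employee> element (as the 'employee/...' xpath expressions in Step 4 suggest), the function name flatten_records is made up for this illustration, and unlike the shell pipeline it trims surrounding whitespace and drops any text after the last closing tag.

```python
def flatten_records(xml_text, end_tag="</employee>"):
    """Return one single-line string per record (hypothetical helper).

    Roughly mirrors the tr/sed/grep pipeline: drop '&', turn line
    breaks into spaces, cut after every closing record tag and skip
    pieces that contain only whitespace.
    """
    text = xml_text.replace("&", "")
    text = text.replace("\r", " ").replace("\n", " ")
    pieces = text.split(end_tag)
    # Every piece except the last one was followed by end_tag in the input,
    # so the tag is re-attached; trailing text after the last tag is dropped.
    return [p.strip() + end_tag for p in pieces[:-1] if p.strip()]


# Sample input with the assumed <employee> layout.
sample = """<employee>
<id>1</id>
<name>Satish Kumar</name>
<designation>Technical Lead</designation>
</employee>
<employee>
<id>2</id>
<name>Ramya</name>
<designation>Testing</designation>
</employee>
"""

for record in flatten_records(sample):
    print(record)
```

Each printed line is one complete record, which is the shape the single-column staging table in Step 3 expects.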


*Step 2: Load the file to HDFS*

$ hadoop fs -mkdir /user/hive/sample-xml-inputs

$ hadoop fs -put employees_records.xml /user/hive/sample-xml-inputs

$ hadoop fs -cat /user/hive/sample-xml-inputs/employees_records.xml
<employee> <id>1</id> <name>Satish Kumar</name> <designation>Technical Lead</designation> </employee>
 <employee> <id>2</id> <name>Ramya</name> <designation>Testing</designation> </employee>

*Step 3: Create a Hive table and point it to the XML file*

hive> create external table xml_table_org( xmldata string) LOCATION '/user/hive/sample-xml-inputs/';

hive> select * from xml_table_org;
OK
<employee> <id>1</id> <name>Satish Kumar</name> <designation>Technical Lead</designation> </employee>
 <employee> <id>2</id> <name>Ramya</name> <designation>Testing</designation> </employee>

*Step 4: From the staging table we can query the elements and load them into
another table.*

hive> CREATE TABLE xml_table AS SELECT
xpath_int(xmldata,'employee/id'),xpath_string(xmldata,'employee/name'),xpath_string(xmldata,'employee/designation')
FROM xml_table_org;
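
To make clear what the Step 4 query pulls out of each row, here is a small Python sketch of the same per-record extraction. It is not Hive code and does not reproduce the exact semantics of the Hive xpath UDFs (for instance, it makes no attempt to handle missing elements); it assumes each row is a single <employee> element with id, name and designation children, and parse_employee is a hypothetical name used only here.

```python
import xml.etree.ElementTree as ET


def parse_employee(xmldata):
    """Extract (id, name, designation) from one single-line record."""
    root = ET.fromstring(xmldata)               # root is the <employee> element
    emp_id = int(root.findtext("id"))           # role of xpath_int(xmldata,'employee/id')
    name = root.findtext("name")                # role of xpath_string(xmldata,'employee/name')
    designation = root.findtext("designation")  # role of xpath_string(xmldata,'employee/designation')
    return emp_id, name, designation


# Rows shaped like the one-line records loaded into the staging table.
rows = [
    "<employee> <id>1</id> <name>Satish Kumar</name> <designation>Technical Lead</designation> </employee>",
    "<employee> <id>2</id> <name>Ramya</name> <designation>Testing</designation> </employee>",
]

for row in rows:
    print(parse_employee(row))
```

Every additional field means another expression of this kind per column, which is where the tedium mentioned above comes from.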


On 9 June 2018 at 07:42, Jörn Franke  wrote:

> Yes.
>
> Serde must have been removed then in 2.x.
>
>
>

Re: Which version of Hive can handle creating XML table?

2018-06-08 Thread Jörn Franke
Yes.

The SerDe interface must have been removed in 2.x, then.



> On 8. Jun 2018, at 23:52, Mich Talebzadeh  wrote:
> 
> Ok I am looking at this jar file
> 
>  jar tf hive-serde-3.0.0.jar|grep -i abstractserde
> org/apache/hadoop/hive/serde2/AbstractSerDe.class
> 
> Is this the correct one?
> 
> Thanks
> 
> 

Re: Which version of Hive can handle creating XML table?

2018-06-08 Thread Mich Talebzadeh
OK, I am looking at this jar file:

$ jar tf hive-serde-3.0.0.jar | grep -i abstractserde
org/apache/hadoop/hive/serde2/AbstractSerDe.class

Is this the correct one?

Thanks
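
The same check can be scripted, which helps when several jars have to be compared. The sketch below is only an illustration: find_classes and has_class are hypothetical helper names, and the demo builds a tiny stand-in archive rather than reading a real Hive jar, so its result says nothing about what any Hive release actually ships.

```python
import os
import tempfile
import zipfile


def find_classes(jar_path, needle):
    """List .class entries whose path contains needle, ignoring case."""
    with zipfile.ZipFile(jar_path) as jar:
        return [name for name in jar.namelist()
                if name.endswith(".class") and needle.lower() in name.lower()]


def has_class(jar_path, class_name):
    """True if the jar contains the given fully qualified class."""
    entry = class_name.replace(".", "/") + ".class"
    with zipfile.ZipFile(jar_path) as jar:
        return entry in jar.namelist()


# Self-contained demo: a stand-in jar holding a single empty entry.
demo_dir = tempfile.mkdtemp()
demo_jar = os.path.join(demo_dir, "demo-serde.jar")
with zipfile.ZipFile(demo_jar, "w") as jar:
    jar.writestr("org/apache/hadoop/hive/serde2/AbstractSerDe.class", b"")

print(find_classes(demo_jar, "abstractserde"))
print(has_class(demo_jar, "org.apache.hadoop.hive.serde2.SerDe"))
```

Pointing has_class at the real hive-serde jar with the class name from the NoClassDefFoundError would show directly whether that class is on offer in the installed version.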



On 8 June 2018 at 22:34, Mich Talebzadeh  wrote:

> Thanks Jorn so what is the resolution? do I need another jar file?
>
>
> On 8 June 2018 at 21:56, Jörn Franke  wrote:
>
>> Aha, I see now: SerDe is a deprecated interface. If I am not wrong, it has
>> been replaced by the abstract class AbstractSerDe.
>>
>> On 8. Jun 2018, at 22:22, Mich Talebzadeh 
>> wrote:
>>
>> Thanks Jorn.
>>
>> Spark 2.3.3 (labelled as stable)
>>
>> First I put the jar file hivexmlserde-1.0.5.3.jar under $HIVE_HOME/lib
>> and explicitly loaded with ADD JAR as well in hive session
>>
>> hive> ADD JAR hdfs://rhes75:9000/jars/hivexmlserde-1.0.5.3.jar;
>> Added 
>> [/tmp/hive/7feb5165-780b-4ab6-aca8-f516d0388823_resources/hivexmlserde-1.0.5.3.jar]
>> to class path
>> Added resources: [hdfs://rhes75:9000/jars/hivexmlserde-1.0.5.3.jar]
>>
>> Then I ran a simple code given here
>> 
>>
>> hive> CREATE  TABLE xml_41 (imap map<string,string>)
>> > ROW FORMAT SERDE 'com.ibm.spss.hive.serde2.xml.XmlSerDe'
>> > WITH SERDEPROPERTIES (
>> > "column.xpath.imap"="/file-format/data-set/element",
>> > "xml.map.specification.element"="@name->#content"
>> > )
>> > STORED AS
>> > INPUTFORMAT 'com.ibm.spss.hive.serde2.xml.XmlInputFormat'
>> > OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.IgnoreKeyTextOutputFormat'
>> > TBLPROPERTIES (
>> > "xmlinput.start"="<file-format>",
>> > "xmlinput.end"="</file-format>"
>> > );
>> FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. org/apache/hadoop/hive/serde2/SerDe
>>
>> And this is full error
>>
>> 2018-06-08T21:17:20,775  INFO [7feb5165-780b-4ab6-aca8-f516d0388823 main] ql.Driver: Starting task [Stage-0:DDL] in serial mode
>> 2018-06-08T21:17:20,776 ERROR [7feb5165-780b-4ab6-aca8-f516d0388823 main] exec.DDLTask: java.lang.NoClassDefFoundError: org/apache/hadoop/hive/serde2/SerDe
>> at java.lang.ClassLoader.defineClass1(Native Method)
>> at java.lang.ClassLoader.defineClass(ClassLoader.java:763)
>> at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
>> at java.net.URLClassLoader.defineClass(URLClassLoader.java:467)
>> at java.net.URLClassLoader.access$100(URLClassLoader.java:73)
>> at java.net.URLClassLoader$1.run(URLClassLoader.java:368)
>> at java.net.URLClassLoader$1.run(URLClassLoader.java:362)
>> at java.security.AccessController.doPrivileged(Native Method)
>> at java.net.URLClassLoader.findClass(URLClassLoader.java:361)
>> at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>> at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
>> at java.lang.ClassLoader.loadClass(ClassLoader.java:411)
>> at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
>> at java.lang.Class.forName0(Native Method)
>> at java.lang.Class.forName(Class.java:348)
>> at org.apache.hadoop.conf.Configuration.getClassByNameOrNull(Configuration.java:2134)
>> at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:2099)
>> at org.apache.hadoop.hive.ql.exec.DDLTask.validateSerDe(DDLTask.java:4213)
>> at org.apache.hadoop.hive.ql.plan.CreateTableDesc.toTable(CreateTableDesc.java:723)
>> at org.apache.hadoop.hive.ql.exec.DDLTask.createTable(DDLTask.java:4321)
>> at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:354)
>> at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:199)
>> at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(Ta

Re: Which version of Hive can hanle creating XML table?

2018-06-08 Thread Mich Talebzadeh
Thanks Jorn. So what is the resolution? Do I need another jar file?

Dr Mich Talebzadeh



LinkedIn: https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw



http://talebzadehmich.wordpress.com


*Disclaimer:* Use it at your own risk. Any and all responsibility for any
loss, damage or destruction of data or any other property which may arise
from relying on this email's technical content is explicitly disclaimed.
The author will in no case be liable for any monetary damages arising from
such loss, damage or destruction.



On 8 June 2018 at 21:56, Jörn Franke  wrote:

> Oh, I see now: SerDe is a deprecated interface. If I am not wrong, it has
> been replaced by the abstract class AbstractSerDe.
>
> On 8. Jun 2018, at 22:22, Mich Talebzadeh 
> wrote:
>
> Thanks Jorn.
>
> Hive 2.3.3 (labelled as stable)
>
> First I put the jar file hivexmlserde-1.0.5.3.jar under $HIVE_HOME/lib and
> explicitly loaded with ADD JAR as well in hive session
>
> hive> ADD JAR hdfs://rhes75:9000/jars/hivexmlserde-1.0.5.3.jar;
> Added 
> [/tmp/hive/7feb5165-780b-4ab6-aca8-f516d0388823_resources/hivexmlserde-1.0.5.3.jar]
> to class path
> Added resources: [hdfs://rhes75:9000/jars/hivexmlserde-1.0.5.3.jar]
>
> Then I ran a simple code given here
> 
>
> hive> CREATE  TABLE xml_41 (imap map)
> > ROW FORMAT SERDE 'com.ibm.spss.hive.serde2.xml.XmlSerDe'
> > WITH SERDEPROPERTIES (
> > "column.xpath.imap"="/file-format/data-set/element",
> > "xml.map.specification.element"="@name->#content"
> > )
> > STORED AS
> > INPUTFORMAT 'com.ibm.spss.hive.serde2.xml.XmlInputFormat'
> > OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.IgnoreKeyTextOutputFormat'
> > TBLPROPERTIES (
> > "xmlinput.start"="",
> > "xmlinput.end"=""
> > );
> FAILED: Execution Error, return code 1 from 
> org.apache.hadoop.hive.ql.exec.DDLTask.
> org/apache/hadoop/hive/serde2/SerDe

Re: Which version of Hive can hanle creating XML table?

2018-06-08 Thread Jörn Franke
Oh, I see now: SerDe is a deprecated interface. If I am not wrong, it has been
replaced by the abstract class AbstractSerDe.
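
One way to check whether a given jar was built against the old interface is to
look for the interface name inside its compiled classes. The Python sketch
below is an illustration only and is not part of Hive or hivexmlserde; the
function names are made up for this example. It searches each .class entry for
the exact constant-pool string org/apache/hadoop/hive/serde2/SerDe (the class
named in the NoClassDefFoundError). It is a byte-level heuristic, not a real
class-file parser, so treat a hit as a hint rather than proof.

```python
import struct
import zipfile

# Internal (slash-separated) name of the interface named in the stack trace.
OLD_INTERFACE = "org/apache/hadoop/hive/serde2/SerDe"


def utf8_constant(name):
    """Encode `name` as it appears in a class file's constant pool:
    tag byte 1, a 2-byte big-endian length, then the bytes themselves.
    The length prefix keeps 'SerDe' from matching 'SerDeException'."""
    raw = name.encode("utf-8")
    return b"\x01" + struct.pack(">H", len(raw)) + raw


def classes_referencing(jar_path, internal_name=OLD_INTERFACE):
    """Return the sorted .class entries in the jar whose bytes contain
    `internal_name` as an exact constant-pool string (heuristic scan)."""
    needle = utf8_constant(internal_name)
    hits = []
    with zipfile.ZipFile(jar_path) as jar:
        for entry in jar.namelist():
            if entry.endswith(".class") and needle in jar.read(entry):
                hits.append(entry)
    return sorted(hits)
```

Calling classes_referencing("hivexmlserde-1.0.5.3.jar") on a local copy of the
jar lists any classes that still name the old interface. Such a jar can contain
all of its own classes and still fail to load on a Hive version that no longer
ships org.apache.hadoop.hive.serde2.SerDe.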

> On 8. Jun 2018, at 22:22, Mich Talebzadeh  wrote:
> 
> Thanks Jorn.
> 
> Hive 2.3.3 (labelled as stable)
> 
> First I put the jar file hivexmlserde-1.0.5.3.jar under $HIVE_HOME/lib and 
> explicitly loaded with ADD JAR as well in hive session
> 
> hive> ADD JAR hdfs://rhes75:9000/jars/hivexmlserde-1.0.5.3.jar;
> Added 
> [/tmp/hive/7feb5165-780b-4ab6-aca8-f516d0388823_resources/hivexmlserde-1.0.5.3.jar]
>  to class path
> Added resources: [hdfs://rhes75:9000/jars/hivexmlserde-1.0.5.3.jar]
> 
> Then I ran a simple code given here
> 
> hive> CREATE  TABLE xml_41 (imap map)
> > ROW FORMAT SERDE 'com.ibm.spss.hive.serde2.xml.XmlSerDe'
> > WITH SERDEPROPERTIES (
> > "column.xpath.imap"="/file-format/data-set/element",
> > "xml.map.specification.element"="@name->#content"
> > )
> > STORED AS
> > INPUTFORMAT 'com.ibm.spss.hive.serde2.xml.XmlInputFormat'
> > OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.IgnoreKeyTextOutputFormat'
> > TBLPROPERTIES (
> > "xmlinput.start"="",
> > "xmlinput.end"=""
> > );
> FAILED: Execution Error, return code 1 from 
> org.apache.hadoop.hive.ql.exec.DDLTask. org/apache/hadoop/hive/serde2/SerDe
> 

Re: Which version of Hive can hanle creating XML table?

2018-06-08 Thread Mich Talebzadeh
Thanks Jorn.

Hive 2.3.3 (labelled as stable)

First I put the jar file hivexmlserde-1.0.5.3.jar under $HIVE_HOME/lib and
also loaded it explicitly with ADD JAR in the Hive session:

hive> ADD JAR hdfs://rhes75:9000/jars/hivexmlserde-1.0.5.3.jar;
Added
[/tmp/hive/7feb5165-780b-4ab6-aca8-f516d0388823_resources/hivexmlserde-1.0.5.3.jar]
to class path
Added resources: [hdfs://rhes75:9000/jars/hivexmlserde-1.0.5.3.jar]

Then I ran a simple code given here


hive> CREATE  TABLE xml_41 (imap map)
> ROW FORMAT SERDE 'com.ibm.spss.hive.serde2.xml.XmlSerDe'
> WITH SERDEPROPERTIES (
> "column.xpath.imap"="/file-format/data-set/element",
> "xml.map.specification.element"="@name->#content"
> )
> STORED AS
> INPUTFORMAT 'com.ibm.spss.hive.serde2.xml.XmlInputFormat'
> OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.IgnoreKeyTextOutputFormat'
> TBLPROPERTIES (
> "xmlinput.start"="",
> "xmlinput.end"=""
> );
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. org/apache/hadoop/hive/serde2/SerDe

And this is the full error:

2018-06-08T21:17:20,775  INFO [7feb5165-780b-4ab6-aca8-f516d0388823 main] ql.Driver: Starting task [Stage-0:DDL] in serial mode
2018-06-08T21:17:20,776 ERROR [7feb5165-780b-4ab6-aca8-f516d0388823 main] exec.DDLTask: java.lang.NoClassDefFoundError: org/apache/hadoop/hive/serde2/SerDe
at java.lang.ClassLoader.defineClass1(Native Method)
at java.lang.ClassLoader.defineClass(ClassLoader.java:763)
at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
at java.net.URLClassLoader.defineClass(URLClassLoader.java:467)
at java.net.URLClassLoader.access$100(URLClassLoader.java:73)
at java.net.URLClassLoader$1.run(URLClassLoader.java:368)
at java.net.URLClassLoader$1.run(URLClassLoader.java:362)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:361)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
at java.lang.ClassLoader.loadClass(ClassLoader.java:411)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:348)
at org.apache.hadoop.conf.Configuration.getClassByNameOrNull(Configuration.java:2134)
at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:2099)
at org.apache.hadoop.hive.ql.exec.DDLTask.validateSerDe(DDLTask.java:4213)
at org.apache.hadoop.hive.ql.plan.CreateTableDesc.toTable(CreateTableDesc.java:723)
at org.apache.hadoop.hive.ql.exec.DDLTask.createTable(DDLTask.java:4321)
at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:354)
at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:199)
at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:100)
at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:2183)
at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1839)
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1526)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1237)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1227)
at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:233)
at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:184)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:403)
at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:821)
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:759)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:686)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.hive.serde2.SerDe
at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
... 40 more

The jar file has the classes!

jar tf hivexmlserde-1.0.5.3.jar
META-INF/
META-INF/MANIFEST.MF
com/
com/ibm/
com/ibm/spss/
com/ibm/spss/hive/
com/ibm/spss/hive/serde2/
com/ibm/spss/hive/serde2/xml/
com/ibm/spss/hive/serde2/xml/objectinspector/
com/ibm/spss/hive/serde

Re: Which version of Hive can hanle creating XML table?

2018-06-08 Thread Jörn Franke
Can you get the log files and start Hive with more detailed logging?
It could be that not all required libraries are loaded (I don’t remember
exactly, but I think this SerDe needs more; I can check my notes next week) or
that it does not support maps (not sure).
You can first try a simpler extraction with a String field to see if that
works.

Hive has always relied on external libraries for XML support. I used the one
below with Hive 1.x, and it should also work with 2.x (I am not sure about 3,
but it should if it works in 2.x).


> On 8. Jun 2018, at 17:53, Mich Talebzadeh  wrote:
> 
> I tried Hive 2.0.1, 2.3.2 and now Hive 3.
> 
> I explicitly added the hivexmlserde jar file with ADD JAR, as shown below:
> 
> 0: jdbc:hive2://rhes75:10099/default> ADD JAR 
> hdfs://rhes75:9000/jars/hivexmlserde-1.0.5.3.jar;
> No rows affected (0.002 seconds)
> 
> But I still cannot create an XML table:
> 
> 0: jdbc:hive2://rhes75:10099/default> CREATE  TABLE xml_41 (imap 
> map) ROW FORMAT SERDE 'com.ibm.spss.hive.serde2.xml.XmlSerDe' 
> WITH SERDEPROPERTIES 
> ("column.xpath.imap"="/file-format/data-set/element","xml.map.specification.element"="@name->#content")
>   STORED AS INPUTFORMAT 'com.ibm.spss.hive.serde2.xml.XmlInputFormat' 
> OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.IgnoreKeyTextOutputFormat' 
> TBLPROPERTIES 
> ("xmlinput.start"="","xmlinput.end"="");
> 
> Error: Error while processing statement: FAILED: Execution Error, return code 
> 1 from org.apache.hadoop.hive.ql.exec.DDLTask. 
> org/apache/hadoop/hive/serde2/SerDe (state=08S01,code=1)
> 
> Does anyone know the cause of this or which version of Hive supports creating 
> an XML table?
> 
> Thanks
> Dr Mich Talebzadeh