Re: Convert Avro to ORC or JSON processor - retaining the data type
Hi Ravi,

How about storing those columns as strings, and casting them to a numeric type (INT/BIGINT) when you query them?
https://stackoverflow.com/questions/28867438/hive-converting-a-string-to-bigint

Thanks,
Koji

On Sat, Mar 9, 2019 at 6:10 AM Ravi Papisetti (rpapiset) wrote:
> Thanks Koji for the response. Our users want to run HiveQL queries with
> some comparators and cannot work with a string type for numeric data.
>
> Any other options?
>
> Thanks,
> Ravi Papisetti
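Koji's store-as-string-then-cast suggestion can be sketched outside Hive. Below is a minimal, hypothetical Python illustration (the table name `my_table`, the `rows` data, and the threshold are all made up for the example); in HiveQL the equivalent query would be roughly `SELECT * FROM my_table WHERE CAST(cpyKey AS BIGINT) > 100`.

```python
def cast_as_bigint(s):
    """Mimic Hive's CAST(s AS BIGINT): an int, or None (Hive NULL) for non-numeric input."""
    try:
        return int(s)
    except (TypeError, ValueError):
        return None

# Hypothetical rows with the numeric column stored as a string.
rows = [{"cpyKey": "1024"}, {"cpyKey": "42"}, {"cpyKey": "not-a-number"}]

# Numeric comparators work once the string is cast; None is filtered out
# first, mirroring how Hive comparisons exclude NULL.
matched = [r for r in rows
           if cast_as_bigint(r["cpyKey"]) is not None
           and cast_as_bigint(r["cpyKey"]) > 100]
print(matched)  # [{'cpyKey': '1024'}]
```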
Re: Convert Avro to ORC or JSON processor - retaining the data type
Thanks Koji for the response. Our users want to run HiveQL queries with some comparators and cannot work with a string type for numeric data.

Any other options?

Thanks,
Ravi Papisetti

On 07/03/19, 7:14 PM, "Koji Kawamura" wrote:
> Hi Ravi,
>
> I looked at the following links. Hive does support some logical types,
> such as timestamp-millis, but I'm not sure whether decimal is supported.
> https://issues.apache.org/jira/browse/HIVE-8131
> https://cwiki.apache.org/confluence/display/Hive/AvroSerDe#AvroSerDe-AvrotoHivetypeconversion
>
> If treating the number as a string works in your use case, then I'd
> recommend disabling "Use Avro Logical Types" at ExecuteSQL.
>
> Thanks,
> Koji
Re: Convert Avro to ORC or JSON processor - retaining the data type
Hi Ravi,

I looked at the following links. Hive does support some logical types, such as timestamp-millis, but I'm not sure whether decimal is supported.
https://issues.apache.org/jira/browse/HIVE-8131
https://cwiki.apache.org/confluence/display/Hive/AvroSerDe#AvroSerDe-AvrotoHivetypeconversion

If treating the number as a string works in your use case, then I'd recommend disabling "Use Avro Logical Types" at ExecuteSQL.

Thanks,
Koji
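For reference, the difference the "Use Avro Logical Types" property makes can be sketched as the two field schemas below. This is a hedged illustration: the first variant is taken from the schema in the original message, while the second is what NiFi is generally expected to emit for an Oracle NUMBER column when the property is disabled; the exact output may vary by NiFi version and JDBC driver.

```python
# Field schema for the NUMBER column with "Use Avro Logical Types" enabled:
# the value travels as a byte array carrying an unscaled decimal, which is
# what downstream readers see as "binary".
with_logical_types = {
    "name": "cpyKey",
    "type": ["null", {"type": "bytes", "logicalType": "decimal",
                      "precision": 10, "scale": 0}],
}

# With the property disabled, the same column is expected to come through
# as a plain string instead (illustrative, not guaranteed for all versions).
without_logical_types = {"name": "cpyKey", "type": ["null", "string"]}
```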
Convert Avro to ORC or JSON processor - retaining the data type
Hi,

NiFi version 1.7

We have a dataflow that gets data from an Oracle database and loads it into Hive tables.

The data flow is something like:
GenerateTableFetch -> ExecuteSQL -> ConvertAvroToJSON/ConvertAvroToORC (we tried both) -> PutHDFS -> ListHDFS -> ReplaceText (to build the LOAD DATA query from the file) -> PutHiveQL

Data at source (ex: column "cpyKey" NUMBER) in number/INT format is being written as:

{"type":"record","name":"NiFi_ExecuteSQL_Record","namespace":"any.data","fields":[{"name":"cpyKey","type":["null",{"type":"bytes","logicalType":"decimal","precision":10,"scale":0}]}]}

Whether the data is loaded into the Hive table from the ORC (ConvertAvroToORC) file or the JSON (ConvertAvroToJSON) file, querying the data from Hive throws a parsing exception with incompatible data types:

Error: java.io.IOException: java.lang.RuntimeException: ORC split generation failed with exception: java.lang.IllegalArgumentException: ORC does not support type conversion from file type binary (1) to reader type bigint (1) (state=,code=0)

Appreciate any help on this.

Thanks,
Ravi Papisetti
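The "binary vs bigint" error makes sense once you see how Avro encodes the decimal logical type: the underlying "bytes" field holds the unscaled value as a big-endian two's-complement integer, which an ORC or Hive reader that ignores the logical-type annotation sees as opaque binary. A small stdlib-only sketch of the decoding, following the Avro specification (the byte values below are illustrative):

```python
from decimal import Decimal

def decode_avro_decimal(raw: bytes, scale: int) -> Decimal:
    # Per the Avro spec, a decimal logical type over "bytes" stores the
    # unscaled value as a big-endian two's-complement integer; the scale
    # from the schema then shifts the decimal point.
    unscaled = int.from_bytes(raw, byteorder="big", signed=True)
    return Decimal(unscaled).scaleb(-scale)

# With scale=0, as in the cpyKey schema above, the value is a whole number:
print(decode_avro_decimal(b"\x04\xd2", 0))  # 1234
# Negative values round-trip through two's complement:
print(decode_avro_decimal(b"\xff", 0))      # -1
```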