Re: Convert Avro to ORC or JSON processor - retaining the data type

2019-03-10 Thread Koji Kawamura
Hi Ravi,

How about storing those values as strings, and casting them to a numeric
type (INT/BIGINT) when you query them?
https://stackoverflow.com/questions/28867438/hive-converting-a-string-to-bigint
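
For example, something like this at query time (just a sketch; the table
name is hypothetical):

    -- Sketch: the column arrives as STRING, so cast it wherever a
    -- numeric comparison is needed. "cpy_data" is a hypothetical table name.
    SELECT cpyKey
    FROM cpy_data
    WHERE CAST(cpyKey AS BIGINT) > 1000;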

Thanks,
Koji


Re: Convert Avro to ORC or JSON processor - retaining the data type

2019-03-08 Thread Ravi Papisetti (rpapiset)
Thanks, Koji, for the response. Our users want to run HiveQL queries with 
comparators, and a string type doesn't work for numeric data, as the example 
below shows.
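
String comparison is lexical, so a numeric predicate over a string column can 
silently return the wrong rows (a quick illustration with hypothetical literals):

    -- Lexical: '9' sorts after '10', so this is TRUE.
    SELECT '9' > '10';
    -- Numeric, after casting: 9 > 10 is FALSE.
    SELECT CAST('9' AS BIGINT) > CAST('10' AS BIGINT);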

Any other options?

Thanks,
Ravi Papisetti


Re: Convert Avro to ORC or JSON processor - retaining the data type

2019-03-07 Thread Koji Kawamura
Hi Ravi,

I looked at the following links; Hive does support some logical types like
timestamp-millis, but I'm not sure whether decimal is supported:
https://issues.apache.org/jira/browse/HIVE-8131
https://cwiki.apache.org/confluence/display/Hive/AvroSerDe#AvroSerDe-AvrotoHivetypeconversion

If treating the number as a String works in your use case, then I'd
recommend disabling "Use Avro Logical Types" on ExecuteSQL.
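
With that property disabled, ExecuteSQL should write the number as a plain
string, so the Hive table would declare the column as STRING and queries
would cast it back (a sketch; the table name is hypothetical, and I'm
assuming the default string conversion here):

    -- Sketch: match the string output of ExecuteSQL on the Hive side.
    CREATE TABLE cpy_data (  -- hypothetical table name
      cpyKey STRING          -- numeric value carried as text
    )
    STORED AS ORC;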

Thanks,
Koji


Convert Avro to ORC or JSON processor - retaining the data type

2019-03-07 Thread Ravi Papisetti (rpapiset)
Hi,

NiFi version 1.7

We have a dataflow that gets data from an Oracle database and loads it into 
Hive tables.

The data flow looks roughly like this:
GenerateTableFetch -> ExecuteSQL -> ConvertAvroToJSON/ConvertAvroToORC (we 
tried both) -> PutHDFS -> ListHDFS -> ReplaceText (to build the LOAD DATA 
query from the file) -> PutHiveQL

Data at the source (e.g. column "cpyKey", Oracle NUMBER) in numeric/INT format 
is being written as:
{"type":"record","name":"NiFi_ExecuteSQL_Record","namespace":"any.data","fields":[{"name":"cpyKey","type":["null",{"type":"bytes","logicalType":"decimal","precision":10,"scale":0}]}]}

When this is inserted into the Hive table, whether the data is loaded from the 
ORC file (ConvertAvroToORC) or the JSON file (ConvertAvroToJSON), querying the 
data from Hive throws a parsing exception due to incompatible data types.


Error: java.io.IOException: java.lang.RuntimeException: ORC split generation 
failed with exception: java.lang.IllegalArgumentException: ORC does not support 
type conversion from file type binary (1) to reader type bigint (1) 
(state=,code=0)
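
The Hive table declares this column as BIGINT, which is where the "reader type 
bigint" in the error comes from; roughly (the table name is hypothetical):

    -- Roughly our table definition; the name is hypothetical.
    CREATE TABLE cpy_data (
      cpyKey BIGINT  -- "reader type bigint" in the error
    )
    STORED AS ORC;

while the ORC file written by ConvertAvroToORC carries the column as binary 
(the Avro bytes/decimal), hence the failed conversion.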

Appreciate any help on this.

Thanks,
Ravi Papisetti