Hi,

I'm using the hive.ddl attribute generated by ConvertAvroToORC for the DDL statement and passing the flowfile to a custom processor that replaces its content with the hive.ddl statement plus the LOCATION of the external table. (I don't use ReplaceText for this because it routes the file to failure when the content exceeds its Maximum Buffer Size, which defaults to 1 MB; that's not a good option for large data.) I don't think this part is causing the issue.
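For reference, after the custom processor the flowfile content ends up looking roughly like this (the table name, columns, and HDFS path below are placeholders, not my actual ones):

    CREATE EXTERNAL TABLE IF NOT EXISTS my_table
      (id INT, name STRING, amount DOUBLE)   -- column list comes from the hive.ddl attribute
    STORED AS ORC
    LOCATION '/data/external/my_table'       -- LOCATION clause appended by the custom processor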
I tried to create a table on top of the Avro file as well, and it also displays NULL (a sketch of that DDL is in the P.S. at the end of this message). The AvroRecordSetWriter doc says "Writes the contents of a RecordSet in Binary Avro format." Is this format different from what ConvertCSVToAvro writes? I ask because the same flow works with ValidateRecord (writing CSV) + ConvertCSVToAvro.

Thanks,
Mohit

-----Original Message-----
From: Matt Burgess <[email protected]>
Sent: 24 April 2018 18:43
To: [email protected]
Subject: Re: Nifi 1.6.0 ValidateRecord Processor- AvroRecordSetWriter issue

Mohit,

Can you share the config for your ConvertAvroToORC processor? Also, by "CreateHiveTable", do you mean ReplaceText (to set the content to the hive.ddl attribute formed by ConvertAvroToORC) -> PutHiveQL (to execute the DDL)? If not, are you using a custom processor or ExecuteStreamCommand or something else? If you are not using the generated DDL to create the table, can you share your CREATE TABLE statement for the target table? I'm guessing there's a mismatch somewhere between the data and the table definition.

Regards,
Matt

On Tue, Apr 24, 2018 at 9:09 AM, Mohit <[email protected]> wrote:
> Hi all,
>
> I'm using the ValidateRecord processor to validate the CSV and convert it
> into Avro. Later, I convert this Avro to ORC using the ConvertAvroToORC
> processor, write it to HDFS, and create a Hive table on top of it.
>
> When I query the table, it displays NULL, though the record count matches.
>
> Flow - ValidateRecord -> ConvertAvroToORC -> PutHDFS -> CreateHiveTable
>
> To debug, I also tried to write the Avro data to HDFS and create the
> Hive table on top of it. It also displays NULL results.
>
> Flow - ValidateRecord -> ConvertCSVToAvro -> PutHDFS
> (I manually created the Hive table with Avro format.)
>
> When I use ValidateRecord + ConvertCSVToAvro, it works fine.
>
> Flow - ValidateRecord -> ConvertCSVToAvro -> ConvertAvroToORC ->
> PutHDFS -> CreateHiveTable
>
> Is there anything I'm doing wrong?
>
> Thanks,
>
> Mohit
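P.S. A sketch of the kind of DDL I used for the Avro-format table (table name, columns, and path here are placeholders, not the real ones):

    CREATE EXTERNAL TABLE IF NOT EXISTS my_table_avro
      (id INT, name STRING, amount DOUBLE)   -- same columns as in the Avro schema
    STORED AS AVRO
    LOCATION '/data/external/my_table_avro'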
