Hello Team,

Request you to please help me with the issues mentioned in the mail below. Thanking you in anticipation.
Thanks & Regards
Shweta Soni

On Thu, Jun 25, 2020 at 23:36, shweta soni <[email protected]> wrote:

Hello Team,

We are using NiFi in our data ingestion process. The version details are:
NiFi 1.11.4, Cloudera Enterprise 5.16.2, and Hive 1.1. I posted these
issues on the NiFi Slack channel but did not get answers to some of my
questions, so I am posting all my queries here in the hope of getting
solutions or workarounds. We are facing the issues below.

1. SCENARIO: The RDBMS source has Date/Timestamp columns, and the Hive
destination also has Date/Timestamp columns, but when we ingest from source
to destination we get "IntWritable/LongWritable cannot be converted to
Date/Timestamp" errors in Hue. We are using the following processors:
QueryDatabaseTable -> UpdateRecord (column mapping and output schema) ->
PutHDFS -> ReplaceText -> PutHiveQL. Below is the Avro output schema; since
Avro has no native Date or Timestamp datatype, we use logical types:

{"name":"dob","type":["null",{"type":"long","logicalType":"timestamp-millis"}]}
{"name":"doA","type":["null",{"type":"int","logicalType":"date"}]}

Q. How can we load date/timestamp source columns into date/timestamp
destination columns?

2. SCENARIO: Decimal data is not being inserted into an ORC table.

Workaround: I load the data into an Avro table and then do INSERT INTO the
ORC table from it. I found this workaround in the Cloudera community.

Q. Is there any other solution for loading decimal data into an ORC table?

3. SCENARIO: We have a one-time full-load flow in NiFi:
QueryDatabaseTable -> PutHiveQL -> LogAttribute. This acts as a pipeline in
our custom UI and runs only once. In the NiFi UI we can manually start the
processors, and once all the FlowFiles are processed and the success queue
of PutHiveQL becomes empty, we can stop them. But now we want to know
programmatically that this flow ended at a particular time, so that we can
show the pipeline status as "completed" in our custom UI. How can we
achieve this?

Q. Since NiFi is designed for continuous data transfer, how can we detect
that a particular flow has ended?

4. SCENARIO: I have a Hive table with complex datatypes, i.e. Array and
Map. When I fetch this data via the SelectHiveQL processor, it outputs all
such columns as strings. The next UpdateRecord processor then fails with an
error that the string datatype cannot be converted to Array or Map.

Avro output schema:

{"type": "array", "items": "double"}
{"type": "map", "values": "int"}

Q. How do we handle complex datatypes in Hive via NiFi, with a Hive source
table and another Hive table as the destination?

5. SCENARIO: The QueryDatabaseTable processor has a Maximum-value Columns
property that enables incremental loads, but there is no such functionality
for Hive tables (i.e. SelectHiveQL). I tried the GenerateTableFetch and
QueryDatabaseTable processors with the Hive 1.1 connection service, but it
did not work. On the NiFi Slack channel I was told to raise a JIRA for a
new GenerateHiveTableFetch/QueryHiveDatabase processor.

Q. Is there any alternative for handling Hive table incremental loads, or
should I go ahead and raise the JIRA?

Request you to please help us. Thanking you in anticipation.

Thanks & Regards,
Shweta Soni
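For scenario 1, a possible explanation (worth verifying against your Hive version) is that the Avro SerDe in Hive 1.1 may not interpret Avro logical types, so the column arrives as a raw long/int rather than a timestamp/date. A minimal sketch of the two encodings involved, plus the common fallback of shipping the value as a string and doing CAST(... AS TIMESTAMP) on the Hive side:

```python
from datetime import date, datetime, timezone

EPOCH = date(1970, 1, 1)

def timestamp_to_millis(ts: datetime) -> int:
    """Encode a datetime as Avro timestamp-millis: a long counting
    milliseconds since the Unix epoch (UTC assumed here)."""
    return int(ts.replace(tzinfo=timezone.utc).timestamp() * 1000)

def date_to_days(d: date) -> int:
    """Encode a date as Avro date: an int counting days since the epoch."""
    return (d - EPOCH).days

def timestamp_to_string(ts: datetime) -> str:
    """Fallback encoding a Hive 1.x AvroSerDe can always read: a plain
    string column, cast to TIMESTAMP on the Hive side."""
    return ts.strftime("%Y-%m-%d %H:%M:%S")

# timestamp_to_millis(datetime(1970, 1, 2))  -> 86400000
# date_to_days(date(1970, 1, 11))            -> 10
```

If the string fallback is used, the Hive column would be declared STRING in the Avro staging table and converted during the INSERT into the final table.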
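For scenario 2, the Avro-staging workaround described above can be templated, which is roughly what the ReplaceText -> PutHiveQL step in the flow already does. A sketch of building that statement (table and column names here are placeholders, not from the original mail):

```python
def build_orc_insert(staging_table, target_table, columns):
    """Build the HiveQL that copies rows from an Avro staging table into
    the final ORC table; decimal columns pass through the SELECT and are
    written natively by the ORC writer."""
    cols = ", ".join(columns)
    return f"INSERT INTO TABLE {target_table} SELECT {cols} FROM {staging_table}"

# build_orc_insert("staging_avro", "target_orc", ["id", "amount"])
# -> "INSERT INTO TABLE target_orc SELECT id, amount FROM staging_avro"
```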
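For scenario 3, one approach is to poll NiFi's REST API from the custom UI: each connection reports its queued-FlowFile count, and the one-shot flow can be treated as complete once every queue is empty. A sketch, assuming the GET /nifi-api/connections/{id} endpoint and its status.aggregateSnapshot.flowFilesQueued field (check the field path against your NiFi 1.11 API docs):

```python
import json
import time
import urllib.request

def queued_count(connection_entity):
    """Extract the queued-FlowFile count from a ConnectionEntity dict."""
    return int(connection_entity["status"]["aggregateSnapshot"]["flowFilesQueued"])

def flow_is_complete(entities):
    """The one-shot flow is done when every connection queue is empty."""
    return all(queued_count(e) == 0 for e in entities)

def poll_until_complete(base_url, connection_ids, interval=10):
    """Poll the NiFi REST API until all queues of the flow are empty,
    then return the completion time for the custom UI to record."""
    while True:
        entities = []
        for cid in connection_ids:
            with urllib.request.urlopen(f"{base_url}/nifi-api/connections/{cid}") as r:
                entities.append(json.load(r))
        if flow_is_complete(entities):
            return time.time()  # report as "completed" at this timestamp
        time.sleep(interval)
```

The same API can also stop the source processor once completion is detected, so the flow does not restart on the next run schedule.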
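For scenario 4, since SelectHiveQL hands the Array/Map columns downstream as strings, one workaround is to parse them back into structured values in a scripted step, assuming the string rendering is JSON-like (e.g. "[1.0,2.0]" and {"a":1}, as Hive's JDBC driver typically emits; verify against your actual output). A sketch:

```python
import json

def parse_complex(value):
    """Parse a Hive array/map column that SelectHiveQL returned as a
    string back into a Python list/dict. Assumes a JSON-compatible
    rendering of the complex value."""
    return json.loads(value)

# parse_complex('[1.0, 2.0]')        -> [1.0, 2.0]
# parse_complex('{"a": 1, "b": 2}')  -> {'a': 1, 'b': 2}
```

Inside NiFi itself, the equivalent parsing could live in a scripted processor between SelectHiveQL and UpdateRecord, emitting records that match the array/map Avro schema shown above.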
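For scenario 5, until a dedicated processor exists, the high-water-mark logic of QueryDatabaseTable can be imitated by persisting the last maximum value (in NiFi state, a file, or a control table) and templating the query fed to SelectHiveQL. A minimal sketch of that logic, with placeholder table/column names:

```python
def incremental_query(table, max_value_column, last_max):
    """Build the SelectHiveQL statement for one incremental run.
    last_max is the high-water mark saved from the previous run;
    None means this is the first (full) load."""
    query = f"SELECT * FROM {table}"
    if last_max is not None:
        query += f" WHERE {max_value_column} > '{last_max}'"
    return query

# incremental_query("orders", "updated_at", None)
# -> "SELECT * FROM orders"
# incremental_query("orders", "updated_at", "2020-06-25 23:36:00")
# -> "SELECT * FROM orders WHERE updated_at > '2020-06-25 23:36:00'"
```

After each run, the new maximum of max_value_column would be read from the results and stored for the next invocation; raising the JIRA for a first-class processor still seems worthwhile in parallel.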
