Hi,
I want to configure a MapReduce job in Apache NiFi as a processor. The
scenario this job was developed for is as follows:
There are two files:
1. User_data, containing tab-separated fields in the order
userid, username, movieid, rating
2. Movie_data, containing |-separated fields in the form
movieid|movie_name
The requirement is:
To get each movie name and its aggregated rating in a single
result file.
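
To make the requirement concrete, here is a minimal sketch in plain
Python of the join and aggregation I am after. It is only an
illustration, not the Hive or NiFi implementation: the function name
aggregate_ratings is hypothetical, and it assumes "aggregated rating"
means the average rating per movie and that rating is numeric.

```python
from collections import defaultdict


def aggregate_ratings(user_lines, movie_lines):
    """Return {movie_name: average_rating}.

    Assumption: "aggregated" means the average rating per movie.
    """
    # Movie_data lines look like: movieid|movie_name
    names = {}
    for line in movie_lines:
        movie_id, _, name = line.rstrip("\n").partition("|")
        if name:
            names[movie_id] = name

    # User_data lines look like: userid<TAB>username<TAB>movieid<TAB>rating
    totals = defaultdict(lambda: [0.0, 0])  # movieid -> [sum, count]
    for line in user_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) != 4:
            continue  # skip malformed rows
        _userid, _username, movie_id, rating = fields
        totals[movie_id][0] += float(rating)
        totals[movie_id][1] += 1

    # Join on movieid; movies missing from Movie_data are dropped.
    return {
        names[movie_id]: total / count
        for movie_id, (total, count) in totals.items()
        if movie_id in names
    }
```

The function takes any iterable of lines, so it can be called with two
open file handles, for example
aggregate_ratings(open("User_data"), open("Movie_data")).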
Approach used so far, step by step:
1. Used an ExecuteCommandScript processor with a shell script to load
and fetch data from Hive.
2. In the shell script I wrote SQL queries for loading and fetching
the data; the output was then written to disk using the PutFile
processor.
Please advise:
Have I chosen the right approach? (I think the ExecuteSQL processor
should be used to run SQL queries on Hive, but I do not know what the
DB connection string for it is.)
What is the best approach for this?
Thanks and regards,
Shashank Tiwari