Hi,

 

I want to configure a MapReduce job in Apache NiFi as a processor. The
scenario this job was developed for is as follows:

 

There are two files:

1. User_data, containing tab-separated data in the form:
   userid    username    movieid    rating

2. Movie_data, containing |-separated data in the form:
   movieid|movie_name

 

The requirement is:

To get each movie name and its aggregated rating in one resulting file.
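
To illustrate the expected result, here is a minimal Python sketch of the join and aggregation. It assumes that "aggregated rating" means the average rating per movie, that ratings are numeric, and that neither file has a header row. The function name and the sample rows are made up for illustration; this is not the actual job.

```python
import csv
from collections import defaultdict
from io import StringIO


def average_rating_by_movie(user_data, movie_data):
    """Join both inputs on movieid and return {movie_name: average rating}.

    Assumption: "aggregated" is taken to mean the arithmetic mean.
    """
    # Build a movieid -> movie_name lookup from the |-separated input.
    names = {}
    for row in csv.reader(movie_data, delimiter="|"):
        if len(row) >= 2:
            names[row[0]] = row[1]

    # Accumulate [sum of ratings, count] per movie name from the
    # tab-separated input (userid, username, movieid, rating).
    totals = defaultdict(lambda: [0.0, 0])
    for row in csv.reader(user_data, delimiter="\t"):
        if len(row) < 4:
            continue  # skip blank or malformed lines
        movieid, rating = row[2], row[3]
        if movieid in names:
            entry = totals[names[movieid]]
            entry[0] += float(rating)
            entry[1] += 1

    return {name: total / count for name, (total, count) in totals.items()}


# Made-up sample rows, only to show the shape of the input and output.
sample_users = StringIO("1\talice\t10\t4\n2\tbob\t10\t2\n1\talice\t20\t5\n")
sample_movies = StringIO("10|Movie A\n20|Movie B\n")
sample_result = average_rating_by_movie(sample_users, sample_movies)
```

With these sample rows, Movie A gets the mean of 4 and 2, and Movie B keeps its single rating of 5.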

 

The approach I have used so far, step by step:

1. Used the ExecuteCommandScript processor with a shell script to load
   data into Hive and fetch it back.

2. In the shell script I wrote SQL queries for loading and fetching the
   data; the output was then written to disk using the PutFile processor.

 

Please advise:

Have I chosen the right approach? I think the ExecuteSQL processor should be
used to run SQL queries on Hive, but I do not know what the DB connection
string for it is.

What is the best approach for this?

 

Thanks and regards,

Shashank Tiwari
