Guys, what's the best-known way to use Flume with Hive? Just curious to see what everyone is using. My requirements are standard:
* Currently writing logs to HDFS from different production servers.
* Need to pre-process the logs before writing them to Hive.
* Need a way to merge the files generated by Flume.

I see that there is a Flume+Hive sink plugin, but I did not find much usage data on it. I could write a custom sink or a custom decorator to do the pre-processing, and then run hourly cron jobs to load the data from HDFS into Hive.

Any suggestions?

Sushruth
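
P.S. To make the hourly cron idea concrete, here is a rough sketch of the load step I have in mind. The HDFS root, the year/month/day/hour directory layout, and the table name and partition columns are all made up for illustration; it only builds the HiveQL statement for the previous hour.

```python
from datetime import datetime, timedelta

# Hypothetical names: adjust to the real layout and table.
HDFS_ROOT = "/flume/processed"   # assumed output dir of the pre-processing step
HIVE_TABLE = "weblogs"           # assumed table, partitioned by (dt STRING, hr STRING)


def load_statement(now):
    """Build the HiveQL that moves the previous hour's files into a partition."""
    # Truncate to the top of the hour, then step back one hour.
    prev = now.replace(minute=0, second=0, microsecond=0) - timedelta(hours=1)
    path = "{}/{}".format(HDFS_ROOT, prev.strftime("%Y/%m/%d/%H"))
    return ("LOAD DATA INPATH '{}' INTO TABLE {} "
            "PARTITION (dt='{}', hr='{}')").format(
                path, HIVE_TABLE, prev.strftime("%Y-%m-%d"), prev.strftime("%H"))


if __name__ == "__main__":
    # From cron, the printed statement could be passed to: hive -e "<statement>"
    print(load_statement(datetime.now()))
```

This does not address merging the small files Flume produces; it only covers the HDFS-to-Hive step.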