I am investigating automated methods of periodically moving our data from the web tier into HDFS for processing.
I am looking for feedback from anyone who has successfully run Flume in a production setup (with redundancy and failover). I understand it is being largely rearchitected during its incubation as Apache Flume-NG, so I don't have full confidence in the old, stable releases. The other option would be to write our own tools. What methods are you using for these kinds of tasks? Did you write your own, or does Flume (or something else) work for you? I'm also on the Flume mailing list, but I wanted to ask here because I'm interested in Flume _and_ alternatives. Thank you!