I am working on storing and processing a large dataset. I collect the data from multiple sensor devices through a .NET application, and it is stored on the C: drive of the collecting machine. I want to process it on a Hadoop cluster, which I have already set up. However, I don't know how to automatically upload the data to the Hadoop cluster from a remote computer: the machine collecting the data is not part of the cluster.
Could Flume help in my case? That is, can I use Flume to automatically upload the files into Hadoop? The sensor devices continuously generate files of 100 MB each. I have read that Flume is used only for log data. How can I use it for other types of data? Many thanks.
