Hi Burak,

Do the machines with the logs on them have syslog available (e.g., rsyslog on RedHat/CentOS)? Can the remote servers do any kind of push, or do you have to pull data from them? If you have a syslog daemon available on the remote servers, then I would try configuring those to send the logs to Flume's multiport syslog TCP source.
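As a rough sketch of that setup: on each remote server you'd add an rsyslog forwarding rule, and on the main server you'd run a Flume agent with a multiport syslog TCP source feeding an HDFS sink. The host names, ports, and HDFS path below are placeholders you'd need to adapt:

```properties
# --- rsyslog on each remote server (/etc/rsyslog.d/flume.conf) ---
# Forward everything over TCP (@@) to the Flume host; host/port are examples.
#   *.* @@flume-host.example.com:5140

# --- Flume agent on the main server (flume.conf) ---
a1.sources  = r1
a1.channels = c1
a1.sinks    = k1

# Multiport syslog TCP source; can listen on several ports at once.
a1.sources.r1.type     = multiport_syslogtcp
a1.sources.r1.host     = 0.0.0.0
a1.sources.r1.ports    = 5140 5141
a1.sources.r1.channels = c1

a1.channels.c1.type = memory

# HDFS sink; path is an example.
a1.sinks.k1.type      = hdfs
a1.sinks.k1.channel   = c1
a1.sinks.k1.hdfs.path = hdfs://namenode/flume/syslog/%Y-%m-%d
a1.sinks.k1.hdfs.useLocalTimeStamp = true
```

This avoids installing anything new on the remote servers as long as rsyslog (or another syslog daemon) is already there.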
In regards to pulling data from the remote servers, what part of rsync is causing issues (assuming you're using rsync to pull data)? Is the problem with rsync itself in getting the files from the remote servers, or is it an issue with getting the files into HDFS once you've pulled them to the main server? If the problem is getting the files into HDFS, you could try using the Spooling Directory Source and point it at the directory on your main server where you are aggregating the logs via rsync.

Best,
Ed

On Wed, Jan 29, 2014 at 11:24 PM, burakkk <[email protected]> wrote:
> Hi folks,
> I have a question about flume-ng. There are several machines generating
> log files. These log files are small (around 4-5 MB per file). I want to
> read these files from the remote servers into a specific directory on my
> main server, and then I want to put them into HDFS. I can't install any
> kind of application on these remote servers, so I can't use the Avro or
> Thrift sources.
>
> For now I use rsync to sync files between the machines and put them into
> HDFS using file commands such as hdfs fs -put. But there are some issues
> with rsync.
>
> In order to solve this problem, what kind of source should I use and how
> can I do that?
>
> Thanks
> Best Regards...
>
> --
>
> *BURAK ISIKLI* | *http://burakisikli.wordpress.com
> <http://burakisikli.wordpress.com>*
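For reference, the spooling-directory approach I mentioned above would look roughly like this; the spool directory and HDFS path are example values:

```properties
a1.sources  = r1
a1.channels = c1
a1.sinks    = k1

# Spooling Directory Source watches a local directory for new files.
# Point spoolDir at the directory where rsync deposits the pulled logs.
a1.sources.r1.type     = spooldir
a1.sources.r1.spoolDir = /var/log/rsync-staging
a1.sources.r1.channels = c1

a1.channels.c1.type = file

a1.sinks.k1.type      = hdfs
a1.sinks.k1.channel   = c1
a1.sinks.k1.hdfs.path = hdfs://namenode/flume/logs/%Y-%m-%d
a1.sinks.k1.hdfs.useLocalTimeStamp = true
```

One caveat: the Spooling Directory Source requires that files be immutable once they appear in the spool directory, so you'd want rsync to write into a temporary directory (e.g., with --temp-dir) and only move completed files into the spool directory.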
