Hari: you mean multiple disks, not multiple folders? Running off a single disk the performance is unfortunately not "reasonably good".

The reality for most companies hoping to aggregate logs is that many of the machines generating the logs have a single set of RAIDed disks, and using multiple disks is not an option. Please keep this in mind when running tests, rather than testing only the "best case scenario". After all, Flume will be co-habiting on servers that were built for their primary purpose, not for Flume.

In our case, this is what we had hoped to do on our log sources, and are currently doing with scribed (which has its own issues, hence wanting to move):

- Run agents on all our log generating servers, using a channel that can retain data in case of network issues communicating with the collector layer.
- Current setup is a scribed buffer store with network store as primary, file as secondary.
- Intended setup with flume was a file channel connected to an avro sink.

With only a single disk available, it is extremely slow. JDBC channel is also extremely slow, and MemoryChannel will fill up and start refusing puts as soon as a network issue comes up.
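
For illustration, the intended setup (a file channel feeding an avro sink) might be configured roughly as follows. This is only a sketch: the agent name, the exec source and its command, the directories, and the collector hostname and port are all placeholder assumptions, not values from any real deployment.

```properties
# Hypothetical agent named "agent"; every name and path here is a placeholder.
agent.sources = tail
agent.channels = fc
agent.sinks = collector

# Example source: tail an application log (assumed path).
agent.sources.tail.type = exec
agent.sources.tail.command = tail -F /var/log/app/app.log
agent.sources.tail.channels = fc

# File channel: events are persisted to disk, so they survive restarts and
# collector outages. On a single-disk host, the checkpoint and data
# directories end up on the same device.
agent.channels.fc.type = file
agent.channels.fc.checkpointDir = /var/lib/flume/checkpoint
agent.channels.fc.dataDirs = /var/lib/flume/data

# Avro sink forwarding to the collector layer (assumed host and port).
agent.sinks.collector.type = avro
agent.sinks.collector.hostname = collector.example.com
agent.sinks.collector.port = 4545
agent.sinks.collector.channel = fc
```

dataDirs accepts a comma-separated list of directories, but listing several directories that all sit on the same physical disk does not add any I/O capacity.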

I think this is a very common use case and one that is likely holding up adoption until we solve it (at least it is for us).

On 07/09/2012 04:07 PM, Hari Shreedharan wrote:
Senthil,

Have you tried using it recently, with multiple data folders etc.? In recent tests, we have seen reasonably good performance. Of course, the performance of MemoryChannel would be much better, since it is in-memory :-). You should try to use the FileChannel as much as you can, else there is a risk of losing data.

Thanks
Hari

--
Hari Shreedharan

On Monday, July 9, 2012 at 12:01 AM, Senthilvel Rangaswamy wrote:

We do use persistent channel when there is overflow. Using FileChannel for regular operations
is too slow for us.

On Sun, Jul 8, 2012 at 11:58 PM, Brock Noland wrote:
I am guessing you are aware, but you could use a persistent channel such as file channel.

--
Brock Noland

On Monday, July 9, 2012 at 7:18 AM, Senthilvel Rangaswamy wrote:

We are using Flume 1.2.0 with memory channel. When we roll out new configs/decorators we may need to restart Flume, at which point any events in the memory channel are gone. Is there any
way to avoid this?

Thanks,
--
..Senthil

"If there's anything more important than my ego around, I want it
 caught and shot now."
           - Douglas Adams.








