The channel capacity is how many events the channel can hold at once. Since you have it set to just 1 event, whenever the channel is already holding an event, every other attempt to put an event into the channel will fail (possibly silently, without logging an error), so those events are never stored (essentially, they are deleted). If you send events faster than your sink can read and process them, which is likely, you will be dropping events.

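
As a rough illustration, a memory channel with more headroom might be configured like this. It reuses the agent1 and c1 names from your conf file; the numbers are made up for the example and are not tuned recommendations.

```properties
# Illustrative values only; size these to your own event rate and available memory.
agent1.channels.c1.type = memory
# Maximum number of events the channel can hold at once.
agent1.channels.c1.capacity = 1000
# Maximum number of events handed over per transaction; kept at or below capacity.
agent1.channels.c1.transactionCapacity = 100
```
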
Transaction Capacity is how many events your channel will give a sink in a single transaction (generally a single run through the sink's .process() method). If this number is higher than the channel Capacity, nothing different will happen; the channel will give the sink all of the events it has.

Also, you should only have a channel go to a single sink (unless you are using a sink group); that is possibly the issue here. If you want to replicate an event and send it to multiple sinks, you should replicate it at the source end by using multiple channels, and thus one channel per sink. If you want to read more about this, check out this email thread <http://mail-archives.apache.org/mod_mbox/flume-dev/201303.mbox/%3CCAH0o75ORSFuk%3DczCHOG%2BNC1zea%2BTZqVg%2BTryW7%2BUZDHphM4vnw%40mail.gmail.com%3E>.

In summation, create a separate channel for each sink. If you are still dropping events, try increasing your channel's capacity. If you want to speed up the sink, I think that is when you increase the transaction capacity.

- Connor

On Wed, Mar 13, 2013 at 11:36 AM, Vikram Kulkarni <[email protected]> wrote:

> Hari, I still have the same problem even after setting the transaction
> capacity.
>
> However, does it have anything to do with Batching as I am not using
> ‘batchSize’ currently. I just wanted to keep it simple like the Logger sink.
>
> *From:* Vikram Kulkarni [mailto:[email protected]]
> *Sent:* Tuesday, March 12, 2013 1:33 PM
> *To:* [email protected]
> *Subject:* RE: Dropped events
>
> I had it at 10 originally and the capacity at 100.
>
> Yes, I am committing the transaction and closing it (and backoff) even if
> the Event is null.
>
> Thanks,
> Vikram
>
> *From:* Hari Shreedharan [mailto:[email protected]]
> *Sent:* Tuesday, March 12, 2013 11:06 AM
> *To:* [email protected]
> *Subject:* Re: Dropped events
>
> Your channel capacity is set to 1. Are you sure you really want that? If a
> null event is returned by the channel, then you should commit and close the
> transaction, optionally backoff (usually, this is when a batch was empty),
> and then try again. You should probably also use a higher transaction
> capacity.
>
> Hari
>
> --
> Hari Shreedharan
>
> On Tuesday, March 12, 2013 at 11:03 AM, Vikram Kulkarni wrote:
>
> I have my custom Source and Sink that I have hooked with a memory channel
> but I am noticing that it is not very consistent. Even after sending many
> events to the Source the Sink’s event is still null. It works for about 1
> out 4 events. I do see the events going to the Logger Sink so I know the
> Source is doing its job. However, for the custom Sink I simply get Event is
> null messages. I tried adjusting the channel capacity from 10 to 1 but no
> difference.
>
> Thanks.
>
> Here’s my conf file
>
> # flume-httpxmlhttp.conf: A single-node Flume with Http Source and Http sink configuration
>
> # Name the components on this agent
> agent1.sources = r1
> agent1.channels = c1
>
> # Describe/configure the source
> agent1.sources.r1.type = org.apache.flume.source.http.HTTPSource
> agent1.sources.r1.port = 5140
> agent1.sources.r1.handler = main.java.org.apache.flume.source.http.XMLHandler
> agent1.sources.r1.handler.nickname = random props
>
> # Setup the sinks
> agent1.sinks = httpsink logsink
>
> # Describe the sink
> agent1.sinks.logsink.type = logger
>
> # Describe the sink
> agent1.sinks.httpsink.type = main.java.org.apache.flume.sink.http.HttpPostSink
> agent1.sinks.httpsink.serverAddress = <my http server URL>
>
> # Use a channel which buffers events in memory
> agent1.channels.c1.type = memory
> agent1.channels.c1.capacity = 1
> agent1.channels.c1.transactionCapacity = 1
>
> # Bind the source and sink to the channel
> agent1.sources.r1.channels = c1
> agent1.sinks.logsink.channel = c1
> agent1.sinks.httpsink.channel = c1
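
To make the one-channel-per-sink suggestion concrete, here is a sketch of how the quoted configuration might be rewired. The second channel name c2 and the capacity values are assumptions for illustration; the source and sink settings are copied from the conf file quoted above. A source that lists several channels replicates each event to all of them by default, so the selector line only spells that default out.

```properties
# Name the components on this agent (c2 is a hypothetical second channel)
agent1.sources = r1
agent1.channels = c1 c2
agent1.sinks = httpsink logsink

# Source settings copied from the original conf
agent1.sources.r1.type = org.apache.flume.source.http.HTTPSource
agent1.sources.r1.port = 5140
agent1.sources.r1.handler = main.java.org.apache.flume.source.http.XMLHandler
agent1.sources.r1.handler.nickname = random props
# Replicate every event into both channels (replicating is the default selector)
agent1.sources.r1.selector.type = replicating
agent1.sources.r1.channels = c1 c2

# One memory channel per sink; capacities are illustrative, not recommendations
agent1.channels.c1.type = memory
agent1.channels.c1.capacity = 1000
agent1.channels.c1.transactionCapacity = 100
agent1.channels.c2.type = memory
agent1.channels.c2.capacity = 1000
agent1.channels.c2.transactionCapacity = 100

# Each sink drains its own channel, so the two sinks no longer compete for events
agent1.sinks.logsink.type = logger
agent1.sinks.logsink.channel = c1
agent1.sinks.httpsink.type = main.java.org.apache.flume.sink.http.HttpPostSink
agent1.sinks.httpsink.serverAddress = <my http server URL>
agent1.sinks.httpsink.channel = c2
```

With this layout, the logger sink and the custom HTTP sink each see every event, instead of splitting the events from a single shared channel between them.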
