[ https://issues.apache.org/jira/browse/FLUME-1246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13286310#comment-13286310 ]
Hari Shreedharan edited comment on FLUME-1246 at 5/31/12 4:40 AM:
------------------------------------------------------------------
Juhani,
File channel failing to start: this looks like a side effect of the fix for
FLUME-1236. The channel is not actually failing to start; it is replaying events
from the log. The agent now finishes starting up only once the channel has
finished starting (including replaying events from the log). If a large number
of events are being replayed, you are likely to see replay handler messages in
the flume log, and the channel will start up once the replay is done. The only
reason I am not fully convinced the FLUME-1236 fix is the cause is that I do not
see any replay handler messages in the log you posted. Could you please wait for
a while and see whether the agent starts up completely?
I have also noticed the channel not stopping (I had to resort to kill -9),
probably during the replay phase. We should investigate this. My guess is that
the channel swallows interrupts during this phase.
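To illustrate the kind of interrupt swallowing I suspect, here is a minimal
sketch. It is not Flume's actual replay code; the class and method names are
hypothetical, and Thread.sleep only stands in for reading one event from the
log. The first loop drops the InterruptedException and keeps going, so a stop
request is lost. The second checks the interrupt flag, restores it when it
catches the exception, and returns early so the caller can shut down.

```java
/** Hypothetical stand-in for a long-running replay loop; not Flume's code. */
class ReplayInterruptSketch {

    /** Swallows interrupts: returns only after all events are "replayed". */
    static int replaySwallowing(int events) {
        int processed = 0;
        for (int i = 0; i < events; i++) {
            try {
                Thread.sleep(1); // stands in for reading one event from the log
            } catch (InterruptedException e) {
                // Problem: the interrupt is dropped here and the flag is
                // cleared, so nothing upstream ever sees the stop request.
            }
            processed++;
        }
        return processed;
    }

    /** Cooperates with interrupts: stops early and keeps the flag set. */
    static int replayCooperative(int events) {
        int processed = 0;
        for (int i = 0; i < events; i++) {
            if (Thread.currentThread().isInterrupted()) {
                return processed; // stop requested before this event
            }
            try {
                Thread.sleep(1); // stands in for reading one event from the log
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // restore the flag
                return processed;
            }
            processed++;
        }
        return processed;
    }

    public static void main(String[] args) {
        Thread.currentThread().interrupt(); // simulate a stop request
        System.out.println("swallowing replayed: " + replaySwallowing(5));

        Thread.interrupted(); // clear any leftover flag before the second run
        Thread.currentThread().interrupt();
        System.out.println("cooperative replayed: " + replayCooperative(5));
    }
}
```

If the replay phase behaves like the first loop, a stop request issued during
replay would have no effect until the replay finishes, which would match what I
saw.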
> FileChannel failing to start, also shutdown impossible without kill
> --------------------------------------------------------------------
>
> Key: FLUME-1246
> URL: https://issues.apache.org/jira/browse/FLUME-1246
> Project: Flume
> Issue Type: Bug
> Components: Channel
> Affects Versions: v1.2.0
> Environment: CentOS 5.4
> Reporter: Juhani Connolly
> Attachments: flume.log
>
>
> Reduced to a minimal configuration for simplicity. I can recreate this on
> some machines and not others. I wouldn't be surprised if it is a
> machine-specific issue (the test machines run CentOS 5.4; it worked on some
> and not on others). However, whatever exception was thrown during creation
> is consumed and never passed onwards.
> Config:
> test.channels.ch1.type = file
> test.channels.ch1.checkpointDir = /home/share/juhani_connolly/flume-1.2.0-incubating-SNAPSHOT/check
> test.channels.ch1.dataDirs = /home/share/juhani_connolly/flume-1.2.0-incubating-SNAPSHOT/filechdata
> test.sources.top.type = exec
> test.sources.top.command = /usr/bin/top -b -d 1
> test.sources.top.restart = true
> test.sources.top.restartThrottle = 1000
> test.sources.top.interceptors = ts
> test.sources.top.interceptors.ts.type = org.apache.flume.interceptor.TimestampInterceptor$Builder
> test.sources.top.channels = ch1
> test.sinks.log.type = logger
> test.sinks.log.channel = ch1
> test.channels = ch1
> test.sources = top
> test.sinks = log
> Attaching logs with the general/lifecycle log level turned down to debug.
> A solution to this is probably going to be just improving error reporting.
> Another, possibly more important, element is that flume enters a state from
> which it cannot shut down without kill -9. It looks like the interrupts are
> getting swallowed silently somewhere.