[ 
https://issues.apache.org/jira/browse/FLUME-1246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13286310#comment-13286310
 ] 

Hari Shreedharan edited comment on FLUME-1246 at 5/31/12 4:40 AM:
------------------------------------------------------------------

Juhani,

File channel failing to start: this looks like it is caused by the fix for 
FLUME-1236. The channel is not really failing to start; it is replaying events 
from the log. The agent now fully starts up only once the channel has finished 
starting (which includes replaying events from the log). So if a large number 
of events are being replayed, you are likely to see a replay handler message 
in the Flume log, and the channel will start up once the replay is done. The 
only reason I am not fully convinced that the FLUME-1236 fix is the cause is 
that I do not see any replay handler messages in the log you posted. So it 
would be great if you could wait for a while and see whether the agent starts 
up completely.

I have also noticed the channel not stopping (and had to resort to kill -9), 
probably during the replay phase. We should probably investigate this. My 
guess is that the channel eats interrupts during this phase.
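
To make the "eats interrupts" guess concrete, here is a minimal Java sketch of 
the general pattern. It is not FileChannel's actual code: the class and method 
names (ReplayInterruptSketch, replayOne, replaySwallowing, replayCooperative) 
are hypothetical, and the sleep is only a stand-in for blocking work done 
while replaying an event. It assumes the replay loop catches 
InterruptedException itself.

```java
class ReplayInterruptSketch {

    /** Hypothetical stand-in for replaying one event; blocks briefly. */
    static void replayOne() throws InterruptedException {
        Thread.sleep(1);
    }

    /**
     * Problem pattern: the interrupt is caught and dropped, so the loop keeps
     * running and the caller never learns that a shutdown was requested.
     */
    public static int replaySwallowing(int events) {
        int processed = 0;
        for (int i = 0; i < events; i++) {
            try {
                replayOne();
            } catch (InterruptedException e) {
                // Swallowed: the thread's interrupt status is now cleared.
            }
            processed++;
        }
        return processed;
    }

    /**
     * Cooperative pattern: restore the interrupt status and stop replaying,
     * so that code further up the stack can shut down cleanly.
     */
    public static int replayCooperative(int events) {
        int processed = 0;
        for (int i = 0; i < events; i++) {
            try {
                replayOne();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // re-assert the request
                break;
            }
            processed++;
        }
        return processed;
    }

    public static void main(String[] args) {
        // Simulate a shutdown request arriving before replay begins.
        Thread.currentThread().interrupt();
        int swallowed = replaySwallowing(5);
        boolean flagAfterSwallow = Thread.interrupted(); // reads and clears
        System.out.println("swallowing:  processed=" + swallowed
                + " interruptedAfter=" + flagAfterSwallow);

        Thread.currentThread().interrupt();
        int cooperative = replayCooperative(5);
        boolean flagAfterCoop = Thread.interrupted();
        System.out.println("cooperative: processed=" + cooperative
                + " interruptedAfter=" + flagAfterCoop);
    }
}
```

In the first variant the loop runs to completion and the interrupt status is 
gone afterwards, which would match an agent that only stops with kill -9. In 
the second, the loop exits at the first interrupt and the status stays set 
for the caller. Whether the replay code actually follows the first pattern 
still needs to be checked against the source.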
                
                  
> FileChannel failing to start, also shutdown impossible without kill
> -------------------------------------------------------------------
>
>                 Key: FLUME-1246
>                 URL: https://issues.apache.org/jira/browse/FLUME-1246
>             Project: Flume
>          Issue Type: Bug
>          Components: Channel
>    Affects Versions: v1.2.0
>         Environment: CentOS 5.4
>            Reporter: Juhani Connolly
>         Attachments: flume.log
>
>
> Reduced to a minimal configuration for simplicity. I can recreate this on 
> some machines and not others. I wouldn't be surprised if it is a 
> machine-specific issue (the test machines run CentOS 5.4; on some it worked, 
> on others not). However, whatever exception was thrown during creation is 
> consumed and never passed onwards.
> Config:
> test.channels.ch1.type = file
> test.channels.ch1.checkpointDir = /home/share/juhani_connolly/flume-1.2.0-incubating-SNAPSHOT/check
> test.channels.ch1.dataDirs = /home/share/juhani_connolly/flume-1.2.0-incubating-SNAPSHOT/filechdata
> test.sources.top.type = exec
> test.sources.top.command = /usr/bin/top -b -d 1
> test.sources.top.restart = true
> test.sources.top.restartThrottle = 1000
> test.sources.top.interceptors = ts
> test.sources.top.interceptors.ts.type = org.apache.flume.interceptor.TimestampInterceptor$Builder
> test.sources.top.channels = ch1
> test.sinks.log.type = logger
> test.sinks.log.channel = ch1
> test.channels = ch1
> test.sources = top
> test.sinks = log
> Attaching logs with the general/lifecycle log level turned down to debug.
> A solution to this is probably going to be just improving error reporting.
> Another, possibly more important, element is that Flume enters a state from 
> which it cannot shut down without kill -9. It looks like the interrupts are 
> getting swallowed silently somewhere.


        
