[ https://issues.apache.org/jira/browse/FLUME-1246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Juhani Connolly updated FLUME-1246:
-----------------------------------

    Description: 
FileChannel's Event implementation uses Hadoop IO's Writable for serialization. 
Without Hadoop IO on the classpath, a class loader error is thrown but never 
caught or logged, killing the thread invisibly. This repeats and puts Flume into 
a state where interrupts are not responded to, making clean shutdown impossible.
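
For context, here is a minimal sketch of what a Writable-backed event looks like 
(hypothetical class and field names, not the actual FileChannel code). Merely 
loading such a class needs org.apache.hadoop.io.Writable on the classpath; when 
it is missing, the JVM throws a NoClassDefFoundError, which is an Error rather 
than an Exception:

// Hypothetical sketch, not Flume's actual event class. It only illustrates the
// dependency: implementing org.apache.hadoop.io.Writable means the Hadoop IO
// jar must be present just for this class to load at all.
import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;

import org.apache.hadoop.io.Writable;

public class WritableEventSketch implements Writable {
  private byte[] body = new byte[0];

  @Override
  public void write(DataOutput out) throws IOException {
    out.writeInt(body.length);
    out.write(body);
  }

  @Override
  public void readFields(DataInput in) throws IOException {
    body = new byte[in.readInt()];
    in.readFully(body);
  }
}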

old desc for posterity:
Reduced to a minimal configuration for simplicity. I can recreate this on some 
machines but not others, so I wouldn't be surprised if it is a machine-specific 
issue (test machines run CentOS 5.4; on some it worked, on others not). 
However, whatever exception is thrown when the channel is created gets consumed 
and never passed onwards.

Config:


test.channels.ch1.type = file
test.channels.ch1.checkpointDir = /home/share/juhani_connolly/flume-1.2.0-incubating-SNAPSHOT/check
test.channels.ch1.dataDirs = /home/share/juhani_connolly/flume-1.2.0-incubating-SNAPSHOT/filechdata

test.sources.top.type = exec
test.sources.top.command = /usr/bin top -b -d 1
test.sources.top.restart = true
test.sources.top.restartThrottle = 1000
test.sources.top.interceptors = ts
test.sources.top.interceptors.ts.type = org.apache.flume.interceptor.TimestampInterceptor$Builder
test.sources.top.channels = ch1

test.sinks.log.type = logger
test.sinks.log.channel = ch1

test.channels = ch1
test.sources = top
test.sinks = log

Attaching logs with the general/lifecycle log level set to debug.

A solution to this is probably going to be just improving error reporting.
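
To be concrete about what "improving error reporting" could mean (illustrative 
sketch only, not the actual FileChannel code; class and method names are 
hypothetical): NoClassDefFoundError is an Error, so a plain catch (Exception e) 
never sees it. Wrapping the work in a catch of Throwable that logs before 
rethrowing would at least make the failure visible:

// Illustrative sketch only; not the actual Flume code.
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class LoudFailureSketch {
  private static final Logger LOG = LoggerFactory.getLogger(LoudFailureSketch.class);

  public static void runAndReport(Runnable work) {
    try {
      work.run();
    } catch (Throwable t) {
      // Log first so the failure shows up in the log, then rethrow so the
      // caller still sees why the thread died.
      LOG.error("Channel worker failed", t);
      if (t instanceof RuntimeException) {
        throw (RuntimeException) t;
      }
      if (t instanceof Error) {
        throw (Error) t;
      }
      throw new RuntimeException(t);
    }
  }
}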

Another, possibly more important, element is that Flume enters a state from 
which it cannot shut down without kill -9. It looks like the interrupts are 
getting swallowed silently somewhere.
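
Again purely as an illustration (I haven't pinpointed where in Flume this 
happens, and the names below are hypothetical), the usual way interrupts get 
lost is a catch block that ignores InterruptedException; restoring the 
interrupt flag lets the loop notice the shutdown request:

// Illustrative sketch of the interrupt-swallowing anti-pattern and its fix.
public class InterruptSketch {

  // Anti-pattern: the InterruptedException clears the interrupt flag and is
  // then ignored, so the loop never exits and only kill -9 stops the process.
  static void swallowing() {
    while (true) {
      try {
        Thread.sleep(1000);
      } catch (InterruptedException ignored) {
        // nothing logged, flag not restored, loop keeps going
      }
    }
  }

  // Fix: restore the flag so the loop condition can observe it and exit.
  static void cooperative() {
    while (!Thread.currentThread().isInterrupted()) {
      try {
        Thread.sleep(1000);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
    }
  }
}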

  was:
Reduced to a minimal configuration for simplicity. I can recreate this on some 
machines but not others, so I wouldn't be surprised if it is a machine-specific 
issue (test machines run CentOS 5.4; on some it worked, on others not). 
However, whatever exception is thrown when the channel is created gets consumed 
and never passed onwards.

Config:


test.channels.ch1.type = file
test.channels.ch1.checkpointDir = /home/share/juhani_connolly/flume-1.2.0-incubating-SNAPSHOT/check
test.channels.ch1.dataDirs = /home/share/juhani_connolly/flume-1.2.0-incubating-SNAPSHOT/filechdata

test.sources.top.type = exec
test.sources.top.command = /usr/bin top -b -d 1
test.sources.top.restart = true
test.sources.top.restartThrottle = 1000
test.sources.top.interceptors = ts
test.sources.top.interceptors.ts.type = org.apache.flume.interceptor.TimestampInterceptor$Builder
test.sources.top.channels = ch1

test.sinks.log.type = logger
test.sinks.log.channel = ch1

test.channels = ch1
test.sources = top
test.sinks = log

Attaching logs with the general/lifecycle log level set to debug.

A solution to this is probably going to be just improving error reporting.

Another, possibly more important, element is that Flume enters a state from 
which it cannot shut down without kill -9. It looks like the interrupts are 
getting swallowed silently somewhere.

        Summary: FileChannel has a dependence on Hadoop IO and will fail 
without logs, making clean shutdown impossible  (was: FileChannel failing to 
start, also shutdown impossible without kill)
    
> FileChannel has a dependence on Hadoop IO and will fail without logs, making 
> clean shutdown impossible
> -------------------------------------------------------------------------------------------------
>
>                 Key: FLUME-1246
>                 URL: https://issues.apache.org/jira/browse/FLUME-1246
>             Project: Flume
>          Issue Type: Bug
>          Components: Channel
>    Affects Versions: v1.2.0
>         Environment: CentOS 5.4
>            Reporter: Juhani Connolly
>         Attachments: flume-hari-2012May312217PST.log, flume.log, flume.log, 
> flume.log, flume.log.20120601, test_conf.conf
>
>
> FileChannel's Event implementation uses Hadoop IO's Writable for 
> serialization. Without Hadoop IO on the classpath, a class loader error is 
> thrown but never caught or logged, killing the thread invisibly. This repeats 
> and puts Flume into a state where interrupts are not responded to, making 
> clean shutdown impossible.
> old desc for posterity:
> Reduced to a minimal configuration for simplicity. I can recreate this on 
> some machines but not others, so I wouldn't be surprised if it is a 
> machine-specific issue (test machines run CentOS 5.4; on some it worked, on 
> others not). However, whatever exception is thrown when the channel is 
> created gets consumed and never passed onwards.
> Config:
> test.channels.ch1.type = file
> test.channels.ch1.checkpointDir = /home/share/juhani_connolly/flume-1.2.0-incubating-SNAPSHOT/check
> test.channels.ch1.dataDirs = /home/share/juhani_connolly/flume-1.2.0-incubating-SNAPSHOT/filechdata
> test.sources.top.type = exec
> test.sources.top.command = /usr/bin top -b -d 1
> test.sources.top.restart = true
> test.sources.top.restartThrottle = 1000
> test.sources.top.interceptors = ts
> test.sources.top.interceptors.ts.type = org.apache.flume.interceptor.TimestampInterceptor$Builder
> test.sources.top.channels = ch1
> test.sinks.log.type = logger
> test.sinks.log.channel = ch1
> test.channels = ch1
> test.sources = top
> test.sinks = log
> Attaching logs with the general/lifecycle log level set to debug.
> A solution to this is probably going to be just improving error reporting.
> Another, possibly more important, element is that Flume enters a state from 
> which it cannot shut down without kill -9. It looks like the interrupts are 
> getting swallowed silently somewhere.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
