[
https://issues.apache.org/jira/browse/FLUME-1246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13287273#comment-13287273
]
Juhani Connolly commented on FLUME-1246:
----------------------------------------
Finally found the cause after wrapping start() in a generic try/catch
(Throwable) block and logging what it caught:
2012-06-01 17:38:29,787 (lifecycleSupervisor-1-0) [INFO - org.apache.flume.channel.file.FileChannel.start(FileChannel.java:177)] Updated Starting FileChannel with dataDir [/home/share/juhani_connolly/flume-1.2.0-incubating-SNAPSHOT/filechdata]
2012-06-01 17:38:29,790 (lifecycleSupervisor-1-0) [INFO - org.apache.flume.channel.file.FileChannel.start(FileChannel.java:220)] failed
java.lang.NoClassDefFoundError: org/apache/hadoop/io/Writable
at java.lang.ClassLoader.defineClass1(Native Method)
at java.lang.ClassLoader.defineClassCond(ClassLoader.java:631)
at java.lang.ClassLoader.defineClass(ClassLoader.java:615)
at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:141)
at java.net.URLClassLoader.defineClass(URLClassLoader.java:283)
at java.net.URLClassLoader.access$000(URLClassLoader.java:58)
at java.net.URLClassLoader$1.run(URLClassLoader.java:197)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
at org.apache.flume.channel.file.FileChannel.start(FileChannel.java:179)
at org.apache.flume.lifecycle.LifecycleSupervisor$MonitorRunnable.run(LifecycleSupervisor.java:228)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
at java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:317)
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:150)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$101(ScheduledThreadPoolExecutor.java:98)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.runPeriodic(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:204)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.io.Writable
at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
... 23 more
With the extra code for the catch, FileChannel.java:179 corresponds to:
log = new Log(checkpointInterval, maxFileSize, capacity,
    checkpointDir, dataDirs);
The cause was building on a system with Hadoop installed and then deploying on
a system without it. Since NoClassDefFoundError is an Error rather than an
Exception, it passes invisibly through the lifecycle supervisor and, the way
things are now, the start fails without warning. This repeats constantly, and
the supervisor doesn't receive interrupted exceptions.
I'm not sure why it's trying to load Writable at this point.
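To illustrate why an Error can slip past a supervisor silently, here is a minimal sketch. It assumes the guard around start() is a catch (Exception) block, which the behaviour above suggests but which I have not quoted from the Flume source. The class and method names (ErrorVsException, startCatchingException, startCatchingThrowable) are hypothetical and are not Flume APIs.

```java
public class ErrorVsException {

    /** Hypothetical guard that only handles Exception; an Error propagates to the caller. */
    public static String startCatchingException(Runnable start) {
        try {
            start.run();
            return "started";
        } catch (Exception e) {
            return "caught " + e.getClass().getSimpleName();
        }
    }

    /** Hypothetical guard that handles Throwable, so an Error is seen and can be logged. */
    public static String startCatchingThrowable(Runnable start) {
        try {
            start.run();
            return "started";
        } catch (Throwable t) {
            return "caught " + t.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) {
        // Stand-in for a start() that fails because a class is missing at runtime.
        Runnable failingStart = () -> {
            throw new NoClassDefFoundError("org/apache/hadoop/io/Writable");
        };

        // The Throwable guard sees the failure.
        System.out.println(startCatchingThrowable(failingStart));

        // The Exception guard does not: the Error escapes it entirely.
        try {
            System.out.println(startCatchingException(failingStart));
        } catch (Error e) {
            System.out.println("escaped the Exception guard: " + e);
        }
    }
}
```

NoClassDefFoundError extends LinkageError, which extends Error, so it is outside the Exception hierarchy. Whether catching Throwable in the supervisor is the right fix is a separate design question; the sketch only shows why the failure was invisible until a Throwable catch was added.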
> FileChannel failing to start, also shutdown impossible without kill
> --------------------------------------------------------------------
>
> Key: FLUME-1246
> URL: https://issues.apache.org/jira/browse/FLUME-1246
> Project: Flume
> Issue Type: Bug
> Components: Channel
> Affects Versions: v1.2.0
> Environment: CentOS 5.4
> Reporter: Juhani Connolly
> Attachments: flume-hari-2012May312217PST.log, flume.log, flume.log,
> flume.log, flume.log.20120601, test_conf.conf
>
>
> Reduced to a minimal configuration for simplicity. I can recreate this on
> some machines and not others. I wouldn't be surprised if it is a
> machine-specific issue (the test machines run CentOS 5.4; on some it worked,
> on others not). However, whatever exception was thrown while the channel was
> being created is consumed and never passed onwards.
> Config:
> test.channels.ch1.type = file
> test.channels.ch1.checkpointDir = /home/share/juhani_connolly/flume-1.2.0-incubating-SNAPSHOT/check
> test.channels.ch1.dataDirs = /home/share/juhani_connolly/flume-1.2.0-incubating-SNAPSHOT/filechdata
> test.sources.top.type = exec
> test.sources.top.command = /usr/bin/top -b -d 1
> test.sources.top.restart = true
> test.sources.top.restartThrottle = 1000
> test.sources.top.interceptors = ts
> test.sources.top.interceptors.ts.type = org.apache.flume.interceptor.TimestampInterceptor$Builder
> test.sources.top.channels = ch1
> test.sinks.log.type = logger
> test.sinks.log.channel = ch1
> test.channels = ch1
> test.sources = top
> test.sinks = log
> Attaching logs with the general/lifecycle log level turned down to debug.
> A solution to this is probably going to be just improving error reporting.
> Another, possibly more important, element is that Flume enters a state from
> which it cannot shut down without kill -9. It looks like the interrupts are
> getting swallowed silently somewhere.
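As a generic illustration of how an interrupt can be swallowed (not a diagnosis of where Flume does it), the sketch below contrasts a loop that catches InterruptedException and carries on with one that restores the interrupt status and exits. The class and method names are hypothetical, and the stop flag exists only so the example can terminate the misbehaving thread.

```java
import java.util.concurrent.atomic.AtomicBoolean;

public class InterruptHandling {

    /** Worker that swallows interrupts: only the external stop flag can end it. */
    public static Thread swallowingWorker(AtomicBoolean stop) {
        Thread t = new Thread(() -> {
            while (!stop.get()) {
                try {
                    Thread.sleep(10);
                } catch (InterruptedException e) {
                    // Swallowed: sleep() cleared the interrupt status, so the loop continues.
                }
            }
        });
        t.setDaemon(true);
        return t;
    }

    /** Worker that restores the interrupt status so its loop condition can observe it. */
    public static Thread cooperativeWorker() {
        Thread t = new Thread(() -> {
            while (!Thread.currentThread().isInterrupted()) {
                try {
                    Thread.sleep(10);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt(); // re-set the flag, then fall out of the loop
                }
            }
        });
        t.setDaemon(true);
        return t;
    }

    public static void main(String[] args) throws InterruptedException {
        AtomicBoolean stop = new AtomicBoolean(false);
        Thread bad = swallowingWorker(stop);
        Thread good = cooperativeWorker();
        bad.start();
        good.start();

        bad.interrupt();
        good.interrupt();
        good.join(2000);
        bad.join(200);

        System.out.println("cooperative worker alive: " + good.isAlive());
        System.out.println("swallowing worker alive: " + bad.isAlive());

        stop.set(true); // only the flag ends the swallowing worker
    }
}
```

A thread that behaves like the swallowing worker, and has no other stop condition, would keep a JVM from shutting down cleanly unless it is a daemon thread, which is consistent with needing kill -9.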