[ 
https://issues.apache.org/jira/browse/FLUME-1246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13287273#comment-13287273
 ] 

Juhani Connolly commented on FLUME-1246:
----------------------------------------

I finally found the cause after wrapping start() in a generic try/catch 
(Throwable) block and logging whatever it caught:

2012-06-01 17:38:29,787 (lifecycleSupervisor-1-0) [INFO - org.apache.flume.channel.file.FileChannel.start(FileChannel.java:177)] Updated Starting FileChannel with dataDir [/home/share/juhani_connolly/flume-1.2.0-incubating-SNAPSHOT/filechdata]
2012-06-01 17:38:29,790 (lifecycleSupervisor-1-0) [INFO - org.apache.flume.channel.file.FileChannel.start(FileChannel.java:220)] failed
java.lang.NoClassDefFoundError: org/apache/hadoop/io/Writable
        at java.lang.ClassLoader.defineClass1(Native Method)
        at java.lang.ClassLoader.defineClassCond(ClassLoader.java:631)
        at java.lang.ClassLoader.defineClass(ClassLoader.java:615)
        at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:141)
        at java.net.URLClassLoader.defineClass(URLClassLoader.java:283)
        at java.net.URLClassLoader.access$000(URLClassLoader.java:58)
        at java.net.URLClassLoader$1.run(URLClassLoader.java:197)
        at java.security.AccessController.doPrivileged(Native Method)
        at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
        at org.apache.flume.channel.file.FileChannel.start(FileChannel.java:179)
        at org.apache.flume.lifecycle.LifecycleSupervisor$MonitorRunnable.run(LifecycleSupervisor.java:228)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
        at java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:317)
        at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:150)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$101(ScheduledThreadPoolExecutor.java:98)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.runPeriodic(ScheduledThreadPoolExecutor.java:180)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:204)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
        at java.lang.Thread.run(Thread.java:662)
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.io.Writable
        at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
        at java.security.AccessController.doPrivileged(Native Method)
        at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
        ... 23 more

With the extra code for the catch in place, FileChannel.java:179 corresponds to:
log = new Log(checkpointInterval, maxFileSize, capacity,
          checkpointDir, dataDirs);

The trigger is building on a system with Hadoop installed and then deploying 
on a system without it. Since NoClassDefFoundError is an Error rather than an 
Exception, it passes invisibly through the lifecycle supervisor and, the way 
things are now, the start fails without warning. This repeats constantly, and 
the supervisor doesn't receive interrupted exceptions.
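
To illustrate why an Error can slip past this kind of guard, here is a minimal, 
self-contained sketch. It is not the Flume supervisor code: the class 
ErrorSwallowDemo and its methods are hypothetical names, and the 
NoClassDefFoundError is thrown by hand to stand in for the failing class load. 
It only shows that a catch (Exception) block does not intercept an Error, while 
catch (Throwable) does.

```java
public class ErrorSwallowDemo {

    // Hypothetical stand-in for a start() method whose class loading fails.
    public static void failingStart() {
        throw new NoClassDefFoundError("org/apache/hadoop/io/Writable");
    }

    // A guard that only handles Exception: an Error propagates to the caller.
    public static String runCatchingException(Runnable task) {
        try {
            task.run();
            return "started";
        } catch (Exception e) {
            return "caught Exception: " + e;
        }
    }

    // A guard that handles Throwable: the Error is caught and can be logged.
    public static String runCatchingThrowable(Runnable task) {
        try {
            task.run();
            return "started";
        } catch (Throwable t) {
            return "caught Throwable: " + t;
        }
    }

    public static void main(String[] args) {
        try {
            runCatchingException(ErrorSwallowDemo::failingStart);
        } catch (Error e) {
            // Reached because catch (Exception) above did not match the Error.
            System.out.println("escaped catch (Exception): " + e);
        }
        System.out.println(runCatchingThrowable(ErrorSwallowDemo::failingStart));
    }
}
```

Whether catching Throwable in the supervisor is the right fix is a separate 
question; the sketch only shows why the failure was invisible until the 
try/catch was widened.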

I'm not sure why it's trying to load Writable at this point.

                
> FileChannel failing to  start, also shutdown impossible without kill
> --------------------------------------------------------------------
>
>                 Key: FLUME-1246
>                 URL: https://issues.apache.org/jira/browse/FLUME-1246
>             Project: Flume
>          Issue Type: Bug
>          Components: Channel
>    Affects Versions: v1.2.0
>         Environment: CentOS 5.4
>            Reporter: Juhani Connolly
>         Attachments: flume-hari-2012May312217PST.log, flume.log, flume.log, 
> flume.log, flume.log.20120601, test_conf.conf
>
>
> Reduced to a minimal configuration for simplicity. I can recreate this on 
> some machines and not others. I wouldn't be surprised if it is a 
> machine-specific issue (the test machines run CentOS 5.4; it worked on some 
> and not on others). However, whatever exception is thrown when the channel 
> is being created is consumed and never passed onwards.
> Config:
> test.channels.ch1.type = file
> test.channels.ch1.checkpointDir = 
> /home/share/juhani_connolly/flume-1.2.0-incubating-SNAPSHOT/check
> test.channels.ch1.dataDirs = 
> /home/share/juhani_connolly/flume-1.2.0-incubating-SNAPSHOT/filechdata
> test.sources.top.type = exec
> test.sources.top.command = /usr/bin top -b -d 1
> test.sources.top.restart = true
> test.sources.top.restartThrottle = 1000
> test.sources.top.interceptors = ts
> test.sources.top.interceptors.ts.type = 
> org.apache.flume.interceptor.TimestampInterceptor$Builder
> test.sources.top.channels = ch1
> test.sinks.log.type = logger
> test.sinks.log.channel = ch1
> test.channels = ch1
> test.sources = top
> test.sinks = log
> Attaching logs with the general/lifecycle log level turned down to debug.
> A solution to this is probably just going to be improving error reporting.
> Another, possibly more important, element is that Flume enters a state from 
> which it cannot shut down without kill -9. It looks like the interrupts are 
> getting swallowed silently somewhere.
