Edward, thanks a lot. Putting the jars in flume/lib resolves the problem. I simply copied the ElasticSearch and Lucene jars from ElasticSearch/lib to flume/lib.

A couple of things are worth noting:

1) I am using elasticsearch-0.20.5, which works with apache-flume-1.3.1 but not with apache-flume-1.3.0.
2) After the Flume agent is started, it takes about 10 minutes before it is ready to accept data from a client and pass it on to ElasticSearch (data sent too early results in ignored client calls).

I do have another issue with the input data for ElasticSearch and will start a separate mail thread for that question.

Shushuai
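For anyone reproducing this fix, here is a minimal Python sketch for confirming that the class named in the NoClassDefFoundError is actually present in some jar under flume/lib after copying. The function name `jars_containing` is hypothetical and not part of Flume or ElasticSearch; the sketch only inspects jar contents and says nothing about version compatibility.

```python
import zipfile
from pathlib import Path


def jars_containing(lib_dir, class_name):
    """Return the names of jars directly under lib_dir that contain class_name."""
    # A class a.b.C is stored inside a jar as the entry a/b/C.class.
    entry = class_name.replace(".", "/") + ".class"
    hits = []
    for jar in sorted(Path(lib_dir).glob("*.jar")):
        try:
            with zipfile.ZipFile(jar) as zf:
                if entry in zf.namelist():
                    hits.append(jar.name)
        except zipfile.BadZipFile:
            # Skip files that are not valid archives (e.g. a broken download).
            continue
    return hits
```

Calling `jars_containing("flume/lib", "org.elasticsearch.common.transport.TransportAddress")` (the path is an assumption about where Flume is unpacked) returns an empty list when no jar in that directory provides the class, which is the situation the stack trace in this thread describes.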
________________________________
From: Edward Sargisson <[email protected]>
To: user <[email protected]>
Sent: Wednesday, June 12, 2013 12:10 PM
Subject: ElasticSearchSink does not work

Hi Shushuai,

You need to put the jars for elasticsearch and Lucene into your flume/lib directory. The major version of the elasticsearch jar needs to match that of your elasticsearch cluster. The Lucene jars need to be the version that elasticsearch depends on. Also note that the JVM version (down to the minor release) needs to match between the Flume agent running the ElasticSearchSink and the elasticsearch cluster itself.

This is a known problem, and the docs for the upcoming 1.4.0 release specify what to do.

" Allan, thanks for the reply. In my case, I only used one channel and one sink at the same time. About 10 minutes after the data were sent to the Flume agent, some messages were logged in flume.log (see below). They say that the class org/elasticsearch/common/transport/TransportAddress was not found. This seems to indicate that the Cloudera version of Flume does not support ElasticSearchSink. Is there any way to add the missing class or a jar file?

I also tried to download Flume from the Flume site:

http://flume.apache.org/download.html
http://www.apache.org/dyn/closer.cgi/flume/1.3.1/apache-flume-1.3.1-bin.tar.gz

But on my Linux box (Red Hat 5), the downloaded apache-flume-1.3.1-bin.tar.gz is reported as neither a gzip file nor a tar file. Can anyone tell me the exact download process? If possible, please provide step-by-step instructions for downloading and installation. Thanks.

Shushuai

-------------------------------------------------------------------------------------------------------------
11 Jun 2013 19:40:37,082 INFO [lifecycleSupervisor-1-0] (org.apache.flume.node.PollingPropertiesFileConfigurationProvider.start:61) - Configuration provider starting
11 Jun 2013 19:40:37,114 INFO [conf-file-poller-0] (org.apache.flume.node.PollingPropertiesFileConfigurationProvider$FileWatcherRunnable.run:133) - Reloading configuration file:conf/flume.conf
11 Jun 2013 19:40:37,121 INFO [conf-file-poller-0] (org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.addProperty:1016) - Processing:k1
11 Jun 2013 19:40:37,122 INFO [conf-file-poller-0] (org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.addProperty:1016) - Processing:k1
11 Jun 2013 19:40:37,122 INFO [conf-file-poller-0] (org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.addProperty:1016) - Processing:k1
11 Jun 2013 19:40:37,122 INFO [conf-file-poller-0] (org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.addProperty:1016) - Processing:k1
11 Jun 2013 19:40:37,122 INFO [conf-file-poller-0] (org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.addProperty:1016) - Processing:k1
11 Jun 2013 19:40:37,122 INFO [conf-file-poller-0] (org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.addProperty:930) - Added sinks: k1 Agent: agent1
11 Jun 2013 19:40:37,122 INFO [conf-file-poller-0] (org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.addProperty:1016) - Processing:k1
11 Jun 2013 19:40:37,123 INFO [conf-file-poller-0] (org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.addProperty:1016) - Processing:k1
11 Jun 2013 19:40:37,123 INFO [conf-file-poller-0] (org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.addProperty:1016) - Processing:k1
11 Jun 2013 19:40:37,457 INFO [conf-file-poller-0] (org.apache.flume.conf.FlumeConfiguration.validateConfiguration:140) - Post-validation flume configuration contains configuration for agents: [agent1]
11 Jun 2013 19:40:37,457 INFO [conf-file-poller-0] (org.apache.flume.node.AbstractConfigurationProvider.loadChannels:150) - Creating channels
11 Jun 2013 19:40:37,464 INFO [conf-file-poller-0] (org.apache.flume.channel.DefaultChannelFactory.create:40) - Creating instance of channel ch1 type memory
11 Jun 2013 19:40:37,468 INFO [conf-file-poller-0] (org.apache.flume.node.AbstractConfigurationProvider.loadChannels:205) - Created channel ch1
11 Jun 2013 19:40:37,469 INFO [conf-file-poller-0] (org.apache.flume.source.DefaultSourceFactory.create:39) - Creating instance of source avro-source1, type avro
11 Jun 2013 19:40:37,484 INFO [conf-file-poller-0] (org.apache.flume.sink.DefaultSinkFactory.create:40) - Creating instance of sink: k1, type: org.apache.flume.sink.elasticsearch.ElasticSearchSink
11 Jun 2013 19:40:37,489 ERROR [conf-file-poller-0] (org.apache.flume.node.PollingPropertiesFileConfigurationProvider$FileWatcherRunnable.run:145) - Failed to start agent because dependencies were not found in classpath. Error follows.
java.lang.NoClassDefFoundError: org/elasticsearch/common/transport/TransportAddress
    at java.lang.Class.forName0(Native Method)
    at java.lang.Class.forName(Class.java:186)
    at org.apache.flume.sink.DefaultSinkFactory.getClass(DefaultSinkFactory.java:67)
    at org.apache.flume.sink.DefaultSinkFactory.create(DefaultSinkFactory.java:41)
    at org.apache.flume.node.AbstractConfigurationProvider.loadSinks(AbstractConfigurationProvider.java:415)
    at org.apache.flume.node.AbstractConfigurationProvider.getConfiguration(AbstractConfigurationProvider.java:103)
    at org.apache.flume.node.PollingPropertiesFileConfigurationProvider$FileWatcherRunnable.run(PollingPropertiesFileConfigurationProvider.java:140)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    at java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:351)
    at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:178)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:165)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:267)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
    at java.lang.Thread.run(Thread.java:679)
Caused by: java.lang.ClassNotFoundException: org.elasticsearch.common.transport.TransportAddress
    at java.net.URLClassLoader$1.run(URLClassLoader.java:217)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:205)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:321)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:294)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:266)
    ... 15 more
----------------------------------------------------------------------------------------------------------"
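
On the download problem quoted above (the tarball being reported as neither a gzip nor a tar file): one possible explanation, which is an assumption and is not confirmed anywhere in this thread, is that the closer.cgi URL served an HTML mirror-selection page and that page was saved under the .tar.gz name. A minimal Python sketch for checking what was actually saved follows; the function names are hypothetical.

```python
GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream


def looks_like_gzip(path):
    """Return True if the file starts with the gzip magic bytes."""
    with open(path, "rb") as f:
        return f.read(2) == GZIP_MAGIC


def looks_like_html(path):
    """Return True if the file appears to begin with an HTML document."""
    with open(path, "rb") as f:
        head = f.read(512).lstrip().lower()
    return head.startswith(b"<!doctype html") or head.startswith(b"<html")
```

If `looks_like_html("apache-flume-1.3.1-bin.tar.gz")` is true, the saved file is a web page rather than the archive, and the archive would need to be fetched from one of the mirror links listed on that page instead.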
