Looks like one datanode is inaccessible or down, so the HDFS client has blacklisted it and the writes are failing because blocks are being allocated to that one.
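If that's the case, the NameNode's datanode report should confirm it. A quick check, assuming the hdfs command-line tools are available somewhere that can reach 10.0.0.16 (the -fs flag is just the generic Hadoop option for pointing the tool at a specific namenode):

# List live/dead datanodes as seen by the NameNode at 10.0.0.16:9000
hdfs dfsadmin -fs hdfs://10.0.0.16:9000 -report

# Basic filesystem health check, run on the 10.0.0.16 host itself
hdfs fsck /

With a single datanode running and that one excluded, there is nowhere left to place blocks, which is exactly what the "could only be replicated to 0 nodes instead of minReplication (=1)" message is saying.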
Thanks, Hari On Tue, Sep 30, 2014 at 7:33 PM, Ed Judge <ejud...@gmail.com> wrote: > I’ve pulled over all of the Hadoop jar files for my flume instance to use. I > am seeing some slightly different errors now. Basically I have 2 identically > configured hadoop instances on the same subnet. Running flume on those same > instances and pointing flume at the local hadoop/hdfs instance works fine and > the files get written. However, when I point it to the adjacent hadoop/hdfs > instance I get many exceptions/errors (shown below) and the files never get > written. Here is my HDFS sink configuration on 10.0.0.14: > # Describe the sink > a1.sinks.k1.type = hdfs > a1.sinks.k1.hdfs.path = hdfs://10.0.0.16:9000/tmp/ > a1.sinks.k1.hdfs.filePrefix = twitter > a1.sinks.k1.hdfs.fileSuffix = .ds > a1.sinks.k1.hdfs.rollInterval = 0 > a1.sinks.k1.hdfs.rollSize = 10 > a1.sinks.k1.hdfs.rollCount = 0 > a1.sinks.k1.hdfs.fileType = DataStream > #a1.sinks.k1.serializer = TEXT > a1.sinks.k1.channel = c1 > Any idea why this is not working? > Thanks. > 01 Oct 2014 01:59:45,098 INFO > [SinkRunner-PollingRunner-DefaultSinkProcessor] > (org.apache.flume.sink.hdfs.HDFSDataStream.configure:58) - Serializer = > TEXT, UseRawLocalFileSystem = false > 01 Oct 2014 01:59:45,385 INFO > [SinkRunner-PollingRunner-DefaultSinkProcessor] > (org.apache.flume.sink.hdfs.BucketWriter.open:261) - Creating > hdfs://10.0.0.16:9000/tmp//twitter.1412128785099.ds.tmp > 01 Oct 2014 01:59:45,997 INFO [Twitter4J Async Dispatcher[0]] > (org.apache.flume.source.twitter.TwitterSource.onStatus:178) - Processed 100 > docs > 01 Oct 2014 01:59:47,754 INFO [Twitter4J Async Dispatcher[0]] > (org.apache.flume.source.twitter.TwitterSource.onStatus:178) - Processed 200 > docs > 01 Oct 2014 01:59:49,379 INFO [Thread-7] > (org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.createBlockOutputStream:1378) > - Exception in createBlockOutputStream > java.io.EOFException: Premature EOF: no length prefix available > at > org.apache.hadoop.hdfs.protocolPB.PBHelper.vintPrefixed(PBHelper.java:1987) > at > org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.createBlockOutputStream(DFSOutputStream.java:1346) > at > org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1272) > at > org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:525) > 01 Oct 2014 01:59:49,390 INFO [Thread-7] > (org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream:1275) > - Abandoning BP-1768727495-127.0.0.1-1412117897373:blk_1073743575_2751 > 01 Oct 2014 01:59:49,398 INFO [Thread-7] > (org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream:1278) > - Excluding datanode 127.0.0.1:50010 > 01 Oct 2014 01:59:49,431 WARN [Thread-7] > (org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run:627) - DataStreamer > Exception > org.apache.hadoop.ipc.RemoteException(java.io.IOException): File > /tmp/twitter.1412128785099.ds.tmp could only be replicated to 0 nodes instead > of minReplication (=1). There are 1 datanode(s) running and 1 node(s) are > excluded in this operation.
> at > org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget(BlockManager.java:1430) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2684) > at > org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:584) > at > org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:440) > at > org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585) > at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928) > at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013) > at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:415) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548) > at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2007) > at org.apache.hadoop.ipc.Client.call(Client.java:1410) > at org.apache.hadoop.ipc.Client.call(Client.java:1363) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206) > at com.sun.proxy.$Proxy18.addBlock(Unknown Source) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:606) > at > org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:190) > at > org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:103) > at com.sun.proxy.$Proxy18.addBlock(Unknown Source) > at > org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:361) > at > org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFollowingBlock(DFSOutputStream.java:1439) > at > org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1261) > at > org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:525) > 01 Oct 2014 01:59:49,437 WARN [hdfs-k1-call-runner-2] > (org.apache.hadoop.hdfs.DFSOutputStream.flushOrSync:1950) - Error while > syncing > org.apache.hadoop.ipc.RemoteException(java.io.IOException): File > /tmp/twitter.1412128785099.ds.tmp could only be replicated to 0 nodes instead > of minReplication (=1). There are 1 datanode(s) running and 1 node(s) are > excluded in this operation. 
> at > org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget(BlockManager.java:1430) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2684) > at > org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:584) > at > org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:440) > at > org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585) > at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928) > at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013) > at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:415) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548) > at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2007) > at org.apache.hadoop.ipc.Client.call(Client.java:1410) > at org.apache.hadoop.ipc.Client.call(Client.java:1363) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206) > at com.sun.proxy.$Proxy18.addBlock(Unknown Source) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:606) > at > org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:190) > at > org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:103) > at com.sun.proxy.$Proxy18.addBlock(Unknown Source) > at > org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:361) > at > org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFollowingBlock(DFSOutputStream.java:1439) > at > org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1261) > at > org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:525) > 01 Oct 2014 01:59:49,439 WARN > [SinkRunner-PollingRunner-DefaultSinkProcessor] > (org.apache.flume.sink.hdfs.HDFSEventSink.process:463) - HDFS IO error > org.apache.hadoop.ipc.RemoteException(java.io.IOException): File > /tmp/twitter.1412128785099.ds.tmp could only be replicated to 0 nodes instead > of minReplication (=1). There are 1 datanode(s) running and 1 node(s) are > excluded in this operation. > On Sep 30, 2014, at 3:18 PM, Hari Shreedharan <hshreedha...@cloudera.com> > wrote: >> You'd need to add the jars that hadoop itself depends on. Flume pulls them in >> if Hadoop is installed on that machine, else you'd need to manually download >> and install them. If you are using Hadoop 2.x, install the RPM provided by >> Bigtop. >> >> On Tue, Sep 30, 2014 at 12:12 PM, Ed Judge <ejud...@gmail.com> wrote: >> I added commons-configuration and there is now another missing dependency. >> What do you mean by “all of Hadoop’s dependencies”? >> >> >> On Sep 30, 2014, at 2:51 PM, Hari Shreedharan <hshreedha...@cloudera.com> >> wrote: >> >>> You actually need to add all of Hadoop’s dependencies to the Flume classpath.
>>> Looks like Apache Commons Configuration is missing in classpath. >>> >>> Thanks, >>> Hari >>> >>> >>> On Tue, Sep 30, 2014 at 11:48 AM, Ed Judge <ejud...@gmail.com> wrote: >>> >>> Thank you. I am using hadoop 2.5 which I think uses >>> protobuf-java-2.5.0.jar. >>> >>> I am getting the following error even after adding those 2 jar files to my >>> flume-ng classpath: >>> >>> 30 Sep 2014 18:27:03,269 INFO [lifecycleSupervisor-1-0] >>> (org.apache.flume.node.PollingPropertiesFileConfigurationProvider.start:61) >>> - Configuration provider starting >>> 30 Sep 2014 18:27:03,278 INFO [conf-file-poller-0] >>> (org.apache.flume.node.PollingPropertiesFileConfigurationProvider$FileWatcherRunnable.run:133) >>> - Reloading configuration file:./src.conf >>> 30 Sep 2014 18:27:03,288 INFO [conf-file-poller-0] >>> (org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.addProperty:1016) >>> - Processing:k1 >>> 30 Sep 2014 18:27:03,289 INFO [conf-file-poller-0] >>> (org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.addProperty:930) >>> - Added sinks: k1 Agent: a1 >>> 30 Sep 2014 18:27:03,289 INFO [conf-file-poller-0] >>> (org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.addProperty:1016) >>> - Processing:k1 >>> 30 Sep 2014 18:27:03,292 WARN [conf-file-poller-0] >>> (org.apache.flume.conf.FlumeConfiguration.<init>:101) - Configuration >>> property ignored: i# = Describe the sink >>> 30 Sep 2014 18:27:03,292 INFO [conf-file-poller-0] >>> (org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.addProperty:1016) >>> - Processing:k1 >>> 30 Sep 2014 18:27:03,292 INFO [conf-file-poller-0] >>> (org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.addProperty:1016) >>> - Processing:k1 >>> 30 Sep 2014 18:27:03,293 INFO [conf-file-poller-0] >>> (org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.addProperty:1016) >>> - Processing:k1 >>> 30 Sep 2014 18:27:03,293 INFO [conf-file-poller-0] >>> (org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.addProperty:1016) >>> - Processing:k1 >>> 30 Sep 2014 18:27:03,293 INFO [conf-file-poller-0] >>> (org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.addProperty:1016) >>> - Processing:k1 >>> 30 Sep 2014 18:27:03,293 INFO [conf-file-poller-0] >>> (org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.addProperty:1016) >>> - Processing:k1 >>> 30 Sep 2014 18:27:03,293 INFO [conf-file-poller-0] >>> (org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.addProperty:1016) >>> - Processing:k1 >>> 30 Sep 2014 18:27:03,312 INFO [conf-file-poller-0] >>> (org.apache.flume.conf.FlumeConfiguration.validateConfiguration:140) - >>> Post-validation flume configuration contains configuration for agents: [a1] >>> 30 Sep 2014 18:27:03,312 INFO [conf-file-poller-0] >>> (org.apache.flume.node.AbstractConfigurationProvider.loadChannels:150) - >>> Creating channels >>> 30 Sep 2014 18:27:03,329 INFO [conf-file-poller-0] >>> (org.apache.flume.channel.DefaultChannelFactory.create:40) - Creating >>> instance of channel c1 type memory >>> 30 Sep 2014 18:27:03,351 INFO [conf-file-poller-0] >>> (org.apache.flume.node.AbstractConfigurationProvider.loadChannels:205) - >>> Created channel c1 >>> 30 Sep 2014 18:27:03,352 INFO [conf-file-poller-0] >>> (org.apache.flume.source.DefaultSourceFactory.create:39) - Creating >>> instance of source r1, type org.apache.flume.source.twitter.TwitterSource >>> 30 Sep 2014 18:27:03,363 INFO [conf-file-poller-0] >>> (org.apache.flume.source.twitter.TwitterSource.configure:110) - Consumer >>> Key: 
'tobhMtidckJoe1tByXDmI4pW3' >>> 30 Sep 2014 18:27:03,363 INFO [conf-file-poller-0] >>> (org.apache.flume.source.twitter.TwitterSource.configure:111) - Consumer >>> Secret: '6eZKRpd6JvGT3Dg9jtd9fG9UMEhBzGxoLhLUGP1dqzkKznrXuQ' >>> 30 Sep 2014 18:27:03,363 INFO [conf-file-poller-0] >>> (org.apache.flume.source.twitter.TwitterSource.configure:112) - Access >>> Token: '1588514408-o36mOSbXYCVacQ3p6Knsf6Kho17iCwNYLZyA9V5' >>> 30 Sep 2014 18:27:03,364 INFO [conf-file-poller-0] >>> (org.apache.flume.source.twitter.TwitterSource.configure:113) - Access >>> Token Secret: 'vBtp7wKsi2BOQqZSBpSBQSgZcc93oHea38T9OdckDCLKn' >>> 30 Sep 2014 18:27:03,825 INFO [conf-file-poller-0] >>> (org.apache.flume.sink.DefaultSinkFactory.create:40) - Creating instance >>> of sink: k1, type: hdfs >>> 30 Sep 2014 18:27:03,874 ERROR [conf-file-poller-0] >>> (org.apache.flume.node.PollingPropertiesFileConfigurationProvider$FileWatcherRunnable.run:145) >>> - Failed to start agent because dependencies were not found in classpath. >>> Error follows. >>> java.lang.NoClassDefFoundError: >>> org/apache/commons/configuration/Configuration >>> at >>> org.apache.hadoop.metrics2.lib.DefaultMetricsSystem.<init>(DefaultMetricsSystem.java:38) >>> at >>> org.apache.hadoop.metrics2.lib.DefaultMetricsSystem.<clinit>(DefaultMetricsSystem.java:36) >>> at >>> org.apache.hadoop.security.UserGroupInformation$UgiMetrics.create(UserGroupInformation.java:106) >>> at >>> org.apache.hadoop.security.UserGroupInformation.<clinit>(UserGroupInformation.java:208) >>> at >>> org.apache.flume.sink.hdfs.HDFSEventSink.authenticate(HDFSEventSink.java:553) >>> at >>> org.apache.flume.sink.hdfs.HDFSEventSink.configure(HDFSEventSink.java:272) >>> at org.apache.flume.conf.Configurables.configure(Configurables.java:41) >>> at >>> org.apache.flume.node.AbstractConfigurationProvider.loadSinks(AbstractConfigurationProvider.java:418) >>> at >>> org.apache.flume.node.AbstractConfigurationProvider.getConfiguration(AbstractConfigurationProvider.java:103) >>> at >>> org.apache.flume.node.PollingPropertiesFileConfigurationProvider$FileWatcherRunnable.run(PollingPropertiesFileConfigurationProvider.java:140) >>> at >>> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) >>> at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304) >>> at >>> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178) >>> at >>> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) >>> at >>> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) >>> at >>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) >>> at java.lang.Thread.run(Thread.java:745) >>> Caused by: java.lang.ClassNotFoundException: >>> org.apache.commons.configuration.Configuration >>> at java.net.URLClassLoader$1.run(URLClassLoader.java:366) >>> at java.net.URLClassLoader$1.run(URLClassLoader.java:355) >>> at java.security.AccessController.doPrivileged(Native Method) >>> at java.net.URLClassLoader.findClass(URLClassLoader.java:354) >>> at java.lang.ClassLoader.loadClass(ClassLoader.java:425) >>> at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308) >>> at java.lang.ClassLoader.loadClass(ClassLoader.java:358) >>> ... 
17 more >>> 30 Sep 2014 18:27:33,491 INFO [agent-shutdown-hook] >>> (org.apache.flume.lifecycle.LifecycleSupervisor.stop:79) - Stopping >>> lifecycle supervisor 10 >>> 30 Sep 2014 18:27:33,493 INFO [agent-shutdown-hook] >>> (org.apache.flume.node.PollingPropertiesFileConfigurationProvider.stop:83) >>> - Configuration provider stopping >>> [vagrant@localhost 6]$ >>> >>> Is there another jar file I need? >>> >>> Thanks. >>> >>> On Sep 29, 2014, at 9:04 PM, shengyi.pan <shengyi....@gmail.com> wrote: >>> >>>> you need hadoop-common-x.x.x.jar and hadoop-hdfs-x.x.x.jar on your >>>> flume-ng classpath, and the versions of those hadoop jars must match your >>>> hadoop system. >>>> >>>> if you sink to hadoop-2.0.0, you should use "protobuf-java-2.4.1.jar" >>>> (by default, flume-1.5.0 uses "protobuf-java-2.5.0.jar"; the jar file is >>>> under the flume lib directory), because the protobuf interface of hdfs-2.0 is >>>> compiled with protobuf-2.4, and with protobuf-2.5 the flume-ng agent will >>>> fail to start.... >>>> >>>> >>>> >>>> >>>> 2014-09-30 >>>> shengyi.pan >>>> From: Ed Judge <ejud...@gmail.com> >>>> Sent: 2014-09-29 22:38 >>>> Subject: HDFS sink to a remote HDFS node >>>> To: "user@flume.apache.org"<user@flume.apache.org> >>>> Cc: >>>> >>>> I am trying to run the flume-ng agent on one node with an HDFS sink >>>> pointing to an HDFS filesystem on another node. >>>> Is this possible? What packages/jar files are needed on the flume agent >>>> node for this to work? A secondary goal is to install only what is needed >>>> on the flume-ng node. >>>> >>>> # Describe the sink >>>> a1.sinks.k1.type = hdfs >>>> a1.sinks.k1.hdfs.path = hdfs://<remote IP address>/tmp/ >>>> >>>> >>>> Thanks, >>>> Ed >>> >>> >> >>
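For a Flume-only host like the one described above, the pieces of the thread add up to: copy the Hadoop client jars that match the remote cluster (hadoop-common, hadoop-hdfs, commons-configuration, and a protobuf-java that matches the cluster's HDFS build, plus whatever later NoClassDefFoundErrors point at) into a directory on the Flume box, then put that directory on the agent's classpath. A minimal sketch of conf/flume-env.sh; the /opt/hadoop-client-libs path is only an illustration, not something from the thread:

# conf/flume-env.sh (sketch only; the directory name is an assumption)
# The flume-ng launcher appends FLUME_CLASSPATH to the agent's classpath at startup.
FLUME_CLASSPATH="/opt/hadoop-client-libs/*"

The agent is then started as usual, e.g. bin/flume-ng agent -n a1 -c conf -f src.conf, with src.conf being the configuration file named in the logs above.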