Looks like one datanode is inaccessible or down, so the HDFS client has blacklisted it and the writes are failing because blocks are being allocated to that one.
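If that's the case, the NameNode's datanode report should confirm it. A quick check, assuming the hdfs command-line tools are available somewhere that can reach 10.0.0.16 (the -fs flag is just the generic Hadoop option for pointing the tool at a specific namenode):

# List live/dead datanodes as seen by the NameNode at 10.0.0.16:9000
hdfs dfsadmin -fs hdfs://10.0.0.16:9000 -report

# Basic filesystem health check, run on the 10.0.0.16 host itself
hdfs fsck /

With a single datanode running and that one excluded, there is nowhere left to place blocks, which is exactly what the "could only be replicated to 0 nodes instead of minReplication (=1)" message is saying.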
Thanks, Hari On Tue, Sep 30, 2014 at 7:33 PM, Ed Judge <ejud...@gmail.com> wrote: > I’ve pulled over all of the Hadoop jar files for my flume instance to use. I > am seeing some slightly different errors now. Basically I have 2 identically > configured hadoop instances on the same subnet. Running flume on those same > instances and pointing flume at the local hadoop/hdfs instance works fine and > the files get written. However, when I point it to the adjacent hadoop/hdfs > instance I get many exceptions/errors (shown below) and the files never get > written. Here is my HDFS sink configuration on 10.0.0.14: > # Describe the sink > a1.sinks.k1.type = hdfs > a1.sinks.k1.hdfs.path = hdfs://10.0.0.16:9000/tmp/ > a1.sinks.k1.hdfs.filePrefix = twitter > a1.sinks.k1.hdfs.fileSuffix = .ds > a1.sinks.k1.hdfs.rollInterval = 0 > a1.sinks.k1.hdfs.rollSize = 10 > a1.sinks.k1.hdfs.rollCount = 0 > a1.sinks.k1.hdfs.fileType = DataStream > #a1.sinks.k1.serializer = TEXT > a1.sinks.k1.channel = c1 > Any idea why this is not working? > Thanks. > 01 Oct 2014 01:59:45,098 INFO > [SinkRunner-PollingRunner-DefaultSinkProcessor] > (org.apache.flume.sink.hdfs.HDFSDataStream.configure:58) - Serializer = > TEXT, UseRawLocalFileSystem = false > 01 Oct 2014 01:59:45,385 INFO > [SinkRunner-PollingRunner-DefaultSinkProcessor] > (org.apache.flume.sink.hdfs.BucketWriter.open:261) - Creating > hdfs://10.0.0.16:9000/tmp//twitter.1412128785099.ds.tmp > 01 Oct 2014 01:59:45,997 INFO [Twitter4J Async Dispatcher[0]] > (org.apache.flume.source.twitter.TwitterSource.onStatus:178) - Processed 100 > docs > 01 Oct 2014 01:59:47,754 INFO [Twitter4J Async Dispatcher[0]] > (org.apache.flume.source.twitter.TwitterSource.onStatus:178) - Processed 200 > docs > 01 Oct 2014 01:59:49,379 INFO [Thread-7] > (org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.createBlockOutputStream:1378) > - Exception in createBlockOutputStream > java.io.EOFException: Premature EOF: no length prefix available > at > org.apache.hadoop.hdfs.protocolPB.PBHelper.vintPrefixed(PBHelper.java:1987) > at > org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.createBlockOutputStream(DFSOutputStream.java:1346) > at > org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1272) > at > org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:525) > 01 Oct 2014 01:59:49,390 INFO [Thread-7] > (org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream:1275) > - Abandoning BP-1768727495-127.0.0.1-1412117897373:blk_1073743575_2751 > 01 Oct 2014 01:59:49,398 INFO [Thread-7] > (org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream:1278) > - Excluding datanode 127.0.0.1:50010 > 01 Oct 2014 01:59:49,431 WARN [Thread-7] > (org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run:627) - DataStreamer > Exception > org.apache.hadoop.ipc.RemoteException(java.io.IOException): File > /tmp/twitter.1412128785099.ds.tmp could only be replicated to 0 nodes instead > of minReplication (=1). There are 1 datanode(s) running and 1 node(s) are > excluded in this operation.
> at > org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget(BlockManager.java:1430) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2684) > at > org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:584) > at > org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:440) > at > org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585) > at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928) > at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013) > at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:415) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548) > at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2007) > at org.apache.hadoop.ipc.Client.call(Client.java:1410) > at org.apache.hadoop.ipc.Client.call(Client.java:1363) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206) > at com.sun.proxy.$Proxy18.addBlock(Unknown Source) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:606) > at > org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:190) > at > org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:103) > at com.sun.proxy.$Proxy18.addBlock(Unknown Source) > at > org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:361) > at > org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFollowingBlock(DFSOutputStream.java:1439) > at > org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1261) > at > org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:525) > 01 Oct 2014 01:59:49,437 WARN [hdfs-k1-call-runner-2] > (org.apache.hadoop.hdfs.DFSOutputStream.flushOrSync:1950) - Error while > syncing > org.apache.hadoop.ipc.RemoteException(java.io.IOException): File > /tmp/twitter.1412128785099.ds.tmp could only be replicated to 0 nodes instead > of minReplication (=1). There are 1 datanode(s) running and 1 node(s) are > excluded in this operation. 
> at > org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget(BlockManager.java:1430) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2684) > at > org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:584) > at > org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:440) > at > org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585) > at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928) > at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013) > at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:415) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548) > at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2007) > at org.apache.hadoop.ipc.Client.call(Client.java:1410) > at org.apache.hadoop.ipc.Client.call(Client.java:1363) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206) > at com.sun.proxy.$Proxy18.addBlock(Unknown Source) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:606) > at > org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:190) > at > org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:103) > at com.sun.proxy.$Proxy18.addBlock(Unknown Source) > at > org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:361) > at > org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFollowingBlock(DFSOutputStream.java:1439) > at > org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1261) > at > org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:525) > 01 Oct 2014 01:59:49,439 WARN > [SinkRunner-PollingRunner-DefaultSinkProcessor] > (org.apache.flume.sink.hdfs.HDFSEventSink.process:463) - HDFS IO error > org.apache.hadoop.ipc.RemoteException(java.io.IOException): File > /tmp/twitter.1412128785099.ds.tmp could only be replicated to 0 nodes instead > of minReplication (=1). There are 1 datanode(s) running and 1 node(s) are > excluded in this operation. > On Sep 30, 2014, at 3:18 PM, Hari Shreedharan <hshreedha...@cloudera.com> > wrote: >> You'd need to add the jars that hadoop itself depends on. Flume pulls them in >> if Hadoop is installed on that machine, else you'd need to manually download >> and install them. If you are using Hadoop 2.x, install the RPM provided by >> Bigtop. >> >> On Tue, Sep 30, 2014 at 12:12 PM, Ed Judge <ejud...@gmail.com> wrote: >> I added commons-configuration and there is now another missing dependency. >> What do you mean by “all of Hadoop’s dependencies”? >> >> >> On Sep 30, 2014, at 2:51 PM, Hari Shreedharan <hshreedha...@cloudera.com> >> wrote: >> >>> You actually need to add all of Hadoop’s dependencies to the Flume classpath.
>>> Looks like Apache Commons Configuration is missing in classpath. >>> >>> Thanks, >>> Hari >>> >>> >>> On Tue, Sep 30, 2014 at 11:48 AM, Ed Judge <ejud...@gmail.com> wrote: >>> >>> Thank you. I am using hadoop 2.5 which I think uses >>> protobuf-java-2.5.0.jar. >>> >>> I am getting the following error even after adding those 2 jar files to my >>> flume-ng classpath: >>> >>> 30 Sep 2014 18:27:03,269 INFO [lifecycleSupervisor-1-0] >>> (org.apache.flume.node.PollingPropertiesFileConfigurationProvider.start:61) >>> - Configuration provider starting >>> 30 Sep 2014 18:27:03,278 INFO [conf-file-poller-0] >>> (org.apache.flume.node.PollingPropertiesFileConfigurationProvider$FileWatcherRunnable.run:133) >>> - Reloading configuration file:./src.conf >>> 30 Sep 2014 18:27:03,288 INFO [conf-file-poller-0] >>> (org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.addProperty:1016) >>> - Processing:k1 >>> 30 Sep 2014 18:27:03,289 INFO [conf-file-poller-0] >>> (org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.addProperty:930) >>> - Added sinks: k1 Agent: a1 >>> 30 Sep 2014 18:27:03,289 INFO [conf-file-poller-0] >>> (org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.addProperty:1016) >>> - Processing:k1 >>> 30 Sep 2014 18:27:03,292 WARN [conf-file-poller-0] >>> (org.apache.flume.conf.FlumeConfiguration.<init>:101) - Configuration >>> property ignored: i# = Describe the sink >>> 30 Sep 2014 18:27:03,292 INFO [conf-file-poller-0] >>> (org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.addProperty:1016) >>> - Processing:k1 >>> 30 Sep 2014 18:27:03,292 INFO [conf-file-poller-0] >>> (org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.addProperty:1016) >>> - Processing:k1 >>> 30 Sep 2014 18:27:03,293 INFO [conf-file-poller-0] >>> (org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.addProperty:1016) >>> - Processing:k1 >>> 30 Sep 2014 18:27:03,293 INFO [conf-file-poller-0] >>> (org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.addProperty:1016) >>> - Processing:k1 >>> 30 Sep 2014 18:27:03,293 INFO [conf-file-poller-0] >>> (org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.addProperty:1016) >>> - Processing:k1 >>> 30 Sep 2014 18:27:03,293 INFO [conf-file-poller-0] >>> (org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.addProperty:1016) >>> - Processing:k1 >>> 30 Sep 2014 18:27:03,293 INFO [conf-file-poller-0] >>> (org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.addProperty:1016) >>> - Processing:k1 >>> 30 Sep 2014 18:27:03,312 INFO [conf-file-poller-0] >>> (org.apache.flume.conf.FlumeConfiguration.validateConfiguration:140) - >>> Post-validation flume configuration contains configuration for agents: [a1] >>> 30 Sep 2014 18:27:03,312 INFO [conf-file-poller-0] >>> (org.apache.flume.node.AbstractConfigurationProvider.loadChannels:150) - >>> Creating channels >>> 30 Sep 2014 18:27:03,329 INFO [conf-file-poller-0] >>> (org.apache.flume.channel.DefaultChannelFactory.create:40) - Creating >>> instance of channel c1 type memory >>> 30 Sep 2014 18:27:03,351 INFO [conf-file-poller-0] >>> (org.apache.flume.node.AbstractConfigurationProvider.loadChannels:205) - >>> Created channel c1 >>> 30 Sep 2014 18:27:03,352 INFO [conf-file-poller-0] >>> (org.apache.flume.source.DefaultSourceFactory.create:39) - Creating >>> instance of source r1, type org.apache.flume.source.twitter.TwitterSource >>> 30 Sep 2014 18:27:03,363 INFO [conf-file-poller-0] >>> (org.apache.flume.source.twitter.TwitterSource.configure:110) - Consumer >>> Key: 
'tobhMtidckJoe1tByXDmI4pW3' >>> 30 Sep 2014 18:27:03,363 INFO [conf-file-poller-0] >>> (org.apache.flume.source.twitter.TwitterSource.configure:111) - Consumer >>> Secret: '6eZKRpd6JvGT3Dg9jtd9fG9UMEhBzGxoLhLUGP1dqzkKznrXuQ' >>> 30 Sep 2014 18:27:03,363 INFO [conf-file-poller-0] >>> (org.apache.flume.source.twitter.TwitterSource.configure:112) - Access >>> Token: '1588514408-o36mOSbXYCVacQ3p6Knsf6Kho17iCwNYLZyA9V5' >>> 30 Sep 2014 18:27:03,364 INFO [conf-file-poller-0] >>> (org.apache.flume.source.twitter.TwitterSource.configure:113) - Access >>> Token Secret: 'vBtp7wKsi2BOQqZSBpSBQSgZcc93oHea38T9OdckDCLKn' >>> 30 Sep 2014 18:27:03,825 INFO [conf-file-poller-0] >>> (org.apache.flume.sink.DefaultSinkFactory.create:40) - Creating instance >>> of sink: k1, type: hdfs >>> 30 Sep 2014 18:27:03,874 ERROR [conf-file-poller-0] >>> (org.apache.flume.node.PollingPropertiesFileConfigurationProvider$FileWatcherRunnable.run:145) >>> - Failed to start agent because dependencies were not found in classpath. >>> Error follows. >>> java.lang.NoClassDefFoundError: >>> org/apache/commons/configuration/Configuration >>> at >>> org.apache.hadoop.metrics2.lib.DefaultMetricsSystem.<init>(DefaultMetricsSystem.java:38) >>> at >>> org.apache.hadoop.metrics2.lib.DefaultMetricsSystem.<clinit>(DefaultMetricsSystem.java:36) >>> at >>> org.apache.hadoop.security.UserGroupInformation$UgiMetrics.create(UserGroupInformation.java:106) >>> at >>> org.apache.hadoop.security.UserGroupInformation.<clinit>(UserGroupInformation.java:208) >>> at >>> org.apache.flume.sink.hdfs.HDFSEventSink.authenticate(HDFSEventSink.java:553) >>> at >>> org.apache.flume.sink.hdfs.HDFSEventSink.configure(HDFSEventSink.java:272) >>> at org.apache.flume.conf.Configurables.configure(Configurables.java:41) >>> at >>> org.apache.flume.node.AbstractConfigurationProvider.loadSinks(AbstractConfigurationProvider.java:418) >>> at >>> org.apache.flume.node.AbstractConfigurationProvider.getConfiguration(AbstractConfigurationProvider.java:103) >>> at >>> org.apache.flume.node.PollingPropertiesFileConfigurationProvider$FileWatcherRunnable.run(PollingPropertiesFileConfigurationProvider.java:140) >>> at >>> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) >>> at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304) >>> at >>> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178) >>> at >>> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) >>> at >>> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) >>> at >>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) >>> at java.lang.Thread.run(Thread.java:745) >>> Caused by: java.lang.ClassNotFoundException: >>> org.apache.commons.configuration.Configuration >>> at java.net.URLClassLoader$1.run(URLClassLoader.java:366) >>> at java.net.URLClassLoader$1.run(URLClassLoader.java:355) >>> at java.security.AccessController.doPrivileged(Native Method) >>> at java.net.URLClassLoader.findClass(URLClassLoader.java:354) >>> at java.lang.ClassLoader.loadClass(ClassLoader.java:425) >>> at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308) >>> at java.lang.ClassLoader.loadClass(ClassLoader.java:358) >>> ... 
17 more >>> 30 Sep 2014 18:27:33,491 INFO [agent-shutdown-hook] >>> (org.apache.flume.lifecycle.LifecycleSupervisor.stop:79) - Stopping >>> lifecycle supervisor 10 >>> 30 Sep 2014 18:27:33,493 INFO [agent-shutdown-hook] >>> (org.apache.flume.node.PollingPropertiesFileConfigurationProvider.stop:83) >>> - Configuration provider stopping >>> [vagrant@localhost 6]$ >>> >>> Is there another jar file I need? >>> >>> Thanks. >>> >>> On Sep 29, 2014, at 9:04 PM, shengyi.pan <shengyi....@gmail.com> wrote: >>> >>>> you need hadoop-common-x.x.x.jar and hadoop-hdfs-x.x.x.jar on your >>>> flume-ng classpath, and the versions of those hadoop jars must match your >>>> hadoop system. >>>> >>>> if you sink to hadoop-2.0.0, you should use "protobuf-java-2.4.1.jar" >>>> (by default, flume-1.5.0 uses "protobuf-java-2.5.0.jar"; the jar file is >>>> under the flume lib directory), because the protobuf interface of hdfs-2.0 is >>>> compiled with protobuf-2.4, and with protobuf-2.5 the flume-ng agent will >>>> fail to start.... >>>> >>>> >>>> >>>> >>>> 2014-09-30 >>>> shengyi.pan >>>> From: Ed Judge <ejud...@gmail.com> >>>> Sent: 2014-09-29 22:38 >>>> Subject: HDFS sink to a remote HDFS node >>>> To: "user@flume.apache.org"<user@flume.apache.org> >>>> Cc: >>>> >>>> I am trying to run the flume-ng agent on one node with an HDFS sink >>>> pointing to an HDFS filesystem on another node. >>>> Is this possible? What packages/jar files are needed on the flume agent >>>> node for this to work? A secondary goal is to install only what is needed >>>> on the flume-ng node. >>>> >>>> # Describe the sink >>>> a1.sinks.k1.type = hdfs >>>> a1.sinks.k1.hdfs.path = hdfs://<remote IP address>/tmp/ >>>> >>>> >>>> Thanks, >>>> Ed >>> >>> >> >>
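For a Flume-only host like the one described above, the pieces of the thread add up to: copy the Hadoop client jars that match the remote cluster (hadoop-common, hadoop-hdfs, commons-configuration, and a protobuf-java that matches the cluster's HDFS build, plus whatever later NoClassDefFoundErrors point at) into a directory on the Flume box, then put that directory on the agent's classpath. A minimal sketch of conf/flume-env.sh; the /opt/hadoop-client-libs path is only an illustration, not something from the thread:

# conf/flume-env.sh (sketch only; the directory name is an assumption)
# The flume-ng launcher appends FLUME_CLASSPATH to the agent's classpath at startup.
FLUME_CLASSPATH="/opt/hadoop-client-libs/*"

The agent is then started as usual, e.g. bin/flume-ng agent -n a1 -c conf -f src.conf, with src.conf being the configuration file named in the logs above.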