Denes Arvay created FLUME-3217:
----------------------------------

             Summary: Flume creates empty files when HDFS quota has been reached
                 Key: FLUME-3217
                 URL: https://issues.apache.org/jira/browse/FLUME-3217
             Project: Flume
          Issue Type: Bug
          Components: Sinks+Sources
    Affects Versions: 1.8.0
            Reporter: Denes Arvay


When the HDFS space quota has been reached, Flume creates empty files and
leaves them on HDFS.
File creation itself succeeds, but because the quota does not allow any data
to be written, a new file is created on every write attempt.
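
The behaviour described above can be modelled with a small local simulation. This is only a sketch and not Flume's BucketWriter code: the names attempt_write, simulate and QuotaExceeded are hypothetical, a temporary local directory stands in for HDFS, and the quota failure is simply raised by hand.

```python
import os
import tempfile


class QuotaExceeded(IOError):
    """Hypothetical stand-in for the quota IOException seen in the log."""


def attempt_write(directory, attempt, payload, quota_allows_write):
    # Creating the file succeeds even though no data can be written.
    path = os.path.join(directory, f"FlumeData.{attempt}.log.tmp")
    with open(path, "wb") as f:
        if not quota_allows_write:
            # The write is rejected; the file is closed and stays behind empty.
            raise QuotaExceeded("DiskSpace quota exceeded")
        f.write(payload)
    return path


def simulate(attempts):
    """Return the names of the empty files left behind after failed attempts."""
    with tempfile.TemporaryDirectory() as d:
        for i in range(attempts):
            try:
                attempt_write(d, i, b"event\n", quota_allows_write=False)
            except QuotaExceeded:
                pass  # the event is retried later, which opens yet another file
        return sorted(
            name for name in os.listdir(d)
            if os.path.getsize(os.path.join(d, name)) == 0
        )
```

In this model every failed attempt leaves one more zero-length file behind, which matches the symptom reported here.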

Relevant error message from the Flume log:
{noformat}
2018-02-07 14:59:30,563 WARN org.apache.flume.sink.hdfs.BucketWriter: Caught IOException writing to HDFSWriter (The DiskSpace quota of /data/catssolprn is exceeded: quota = 2199023255552 B = 2 TB but diskspace consumed = 2199217840800 B = 2.00 TB
        at org.apache.hadoop.hdfs.server.namenode.DirectoryWithQuotaFeature.verifyDiskspaceQuota(DirectoryWithQuotaFeature.java:149)
        at org.apache.hadoop.hdfs.server.namenode.DirectoryWithQuotaFeature.verifyQuota(DirectoryWithQuotaFeature.java:159)
        at org.apache.hadoop.hdfs.server.namenode.FSDirectory.verifyQuota(FSDirectory.java:2037)
        at org.apache.hadoop.hdfs.server.namenode.FSDirectory.updateCount(FSDirectory.java:1868)
        at org.apache.hadoop.hdfs.server.namenode.FSDirectory.updateCount(FSDirectory.java:1843)
        at org.apache.hadoop.hdfs.server.namenode.FSDirectory.addBlock(FSDirectory.java:441)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.saveAllocatedBlock(FSNamesystem.java:3806)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:3394)
        at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:683)
        at org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.addBlock(AuthorizationProviderProxyClientProtocol.java:214)
        at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:495)
        at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1073)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2141)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2137)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1840)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2135)
). Closing file (/user/flume/events/FlumeData.1518033562880.log.tmp) and rethrowing exception.
{noformat}

Config for reproduction:
{noformat}
tier1.sources = source1
tier1.channels = channel1
tier1.sinks    = sink1

tier1.sources.source1.type     = netcat
tier1.sources.source1.bind     = 127.0.0.1
tier1.sources.source1.port     = 9999
tier1.sources.source1.channels = channel1

tier1.channels.channel1.type                = memory

tier1.sinks.sink1.type = hdfs
tier1.sinks.sink1.fileType = DataStream
tier1.sinks.sink1.channel = channel1
tier1.sinks.sink1.hdfs.path = hdfs://nameservice1/user/flume/events
{noformat}

HDFS commands used to set up the quota:
{noformat}
sudo -u flume hdfs dfs -mkdir -p /user/flume/events
sudo -u hdfs hdfs dfsadmin -setSpaceQuota 3000 /user/flume/events
{noformat}
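
A 3000-byte space quota is enough to trigger the problem because, as far as I understand HDFS quotas, creating a file does not consume space quota, while allocating a block is refused unless a full block for every replica would still fit. The sketch below is a simplified model of that check, not the NameNode's code. The 128 MB block size and replication factor of 3 are assumed defaults (dfs.blocksize, dfs.replication), and can_allocate_block is a hypothetical helper.

```python
# Assumed cluster defaults; the actual values depend on dfs.blocksize
# and dfs.replication in the cluster configuration.
ASSUMED_BLOCK_SIZE = 128 * 1024 * 1024
ASSUMED_REPLICATION = 3


def can_allocate_block(space_quota, consumed,
                       block_size=ASSUMED_BLOCK_SIZE,
                       replication=ASSUMED_REPLICATION):
    """Simplified model: a new block must fit in full, once per replica."""
    required = block_size * replication
    return consumed + required <= space_quota


# With the quota from the reproduction steps, even the first block is refused.
first_block_allowed = can_allocate_block(space_quota=3000, consumed=0)
```

Under these assumptions the first block alone would need 402653184 bytes, so every write fails while the create call that precedes it still succeeds.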





--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
