[ https://issues.apache.org/jira/browse/FLUME-3217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16583778#comment-16583778 ]

ASF GitHub Bot commented on FLUME-3217:
---------------------------------------

GitHub user ffernandez92 opened a pull request:

    https://github.com/apache/flume/pull/225

    FLUME-3217 . Creates empty files when quota

    As can be seen in FLUME-3217, Flume creates empty files when the HDFS quota has been reached.
    
    This is a first approach to the solution. It works, although a better approach could be implemented.
    The idea is to capture the quota exception in order to delete the empty files that Flume leaves behind.

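    A rough sketch of that idea (not the code in this PR; the class name, the WriteAction hook and how it would be wired into BucketWriter are assumptions for illustration only):
    
        import java.io.IOException;
        
        import org.apache.hadoop.fs.FileSystem;
        import org.apache.hadoop.fs.Path;
        import org.apache.hadoop.hdfs.protocol.DSQuotaExceededException;
        
        // Illustrative sketch only: wrap an HDFS write attempt and, if the namenode
        // rejects it with a disk-space quota exception, delete the zero-byte .tmp
        // file that was left behind before rethrowing.
        final class QuotaCleanupSketch {
        
          /** A write action that may fail, e.g. the HDFSWriter append/flush. */
          interface WriteAction {
            void run() throws IOException;
          }
        
          static void writeOrCleanUp(FileSystem fs, Path tmpFile, WriteAction write)
              throws IOException {
            try {
              write.run();
            } catch (DSQuotaExceededException quotaError) {
              // Only remove the file if nothing was ever written to it.
              if (fs.exists(tmpFile) && fs.getFileStatus(tmpFile).getLen() == 0) {
                fs.delete(tmpFile, false);
              }
              // Rethrow so the sink still sees the failure and can back off/retry.
              throw quotaError;
            }
          }
        
          private QuotaCleanupSketch() { }
        }
    
    In the real BucketWriter the cleanup would have to fit into the sink's existing close/rename handling; the snippet above only shows the catch-and-delete step.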
You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/ffernandez92/flume patch-3

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/flume/pull/225.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #225
    
----
commit da08bed61fd91c2786a5e22d42eb265a8f3e76d3
Author: Ferran Fernández Garrido <ffernandez.upc@...>
Date:   2018-08-17T10:58:45Z

    FLUME-3217 . Creates empty files when quota
    
    As can be seen in FLUME-3217, Flume creates empty files when the HDFS quota has been reached.
    
    This is a first approach to the solution; it works, although a better approach could be implemented. The idea is to capture the quota exception in order to delete the empty files that Flume leaves behind.

----


> Flume creates empty files when HDFS quota has been reached
> ----------------------------------------------------------
>
>                 Key: FLUME-3217
>                 URL: https://issues.apache.org/jira/browse/FLUME-3217
>             Project: Flume
>          Issue Type: Bug
>          Components: Sinks+Sources
>    Affects Versions: 1.8.0
>            Reporter: Denes Arvay
>            Priority: Critical
>
> Flume creates empty files when the HDFS quota has been reached and leaves them 
> on HDFS.
> The file creation succeeds, but as long as the quota does not allow any 
> write, a new file is created on every write attempt.
> Relevant error message from flume log:
>  {noformat}
> 2018-02-07 14:59:30,563 WARN org.apache.flume.sink.hdfs.BucketWriter: Caught 
> IOException writing to HDFSWriter (The DiskSpace quota of /data/catssolprn is 
> exceeded: quota = 2199023255552 B = 2 TB but diskspace consumed = 
> 2199217840800 B = 2.00 TB
>       at 
> org.apache.hadoop.hdfs.server.namenode.DirectoryWithQuotaFeature.verifyDiskspaceQuota(DirectoryWithQuotaFeature.java:149)
>       at 
> org.apache.hadoop.hdfs.server.namenode.DirectoryWithQuotaFeature.verifyQuota(DirectoryWithQuotaFeature.java:159)
>       at 
> org.apache.hadoop.hdfs.server.namenode.FSDirectory.verifyQuota(FSDirectory.java:2037)
>       at 
> org.apache.hadoop.hdfs.server.namenode.FSDirectory.updateCount(FSDirectory.java:1868)
>       at 
> org.apache.hadoop.hdfs.server.namenode.FSDirectory.updateCount(FSDirectory.java:1843)
>       at 
> org.apache.hadoop.hdfs.server.namenode.FSDirectory.addBlock(FSDirectory.java:441)
>       at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.saveAllocatedBlock(FSNamesystem.java:3806)
>       at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:3394)
>       at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:683)
>       at 
> org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.addBlock(AuthorizationProviderProxyClientProtocol.java:214)
>       at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:495)
>       at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>       at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617)
>       at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1073)
>       at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2141)
>       at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2137)
>       at java.security.AccessController.doPrivileged(Native Method)
>       at javax.security.auth.Subject.doAs(Subject.java:422)
>       at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1840)
>       at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2135)
> ). Closing file (/user/flume/events/FlumeData.1518033562880.log.tmp) and 
> rethrowing exception. 
> {noformat}
> Config for reproduction:
> {noformat}
> tier1.sources = source1
> tier1.channels = channel1
> tier1.sinks    = sink1
> tier1.sources.source1.type     = netcat
> tier1.sources.source1.bind     = 127.0.0.1
> tier1.sources.source1.port     = 9999
> tier1.sources.source1.channels = channel1
> tier1.channels.channel1.type                = memory
> tier1.sinks.sink1.type= hdfs
> tier1.sinks.sink1.fileType=DataStream
> tier1.sinks.sink1.channel = channel1
> tier1.sinks.sink1.hdfs.path = hdfs://nameservice1/user/flume/events
> {noformat}
> hdfs dfs commands:
> {noformat}
> sudo -u flume hdfs dfs -mkdir -p /user/flume/events
> sudo -u hdfs hdfs dfsadmin -setSpaceQuota 3000 /user/flume/events
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
