[ 
https://issues.apache.org/jira/browse/FLUME-3190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16237406#comment-16237406
 ] 

ASF GitHub Bot commented on FLUME-3190:
---------------------------------------

GitHub user mcsanady opened a pull request:

    https://github.com/apache/flume/pull/180

    FLUME-3190: flume shutdown hook issue when both hbase and hdfs sink a…

    When both the hdfs and hbase sinks are in use, during shutdown (KILL 
SIGTERM) the hdfs sink cannot rename/close the .tmp hdfs file because the 
underlying filesystem may already have been closed while the other 
component was shutting down.
    
    This change registers a new shutdown hook in Hadoop's 
ShutdownHookManager, which prevents other hooks from running until Flume 
has stopped itself.
    
    Tested on a cluster where the error was reproducible before the change 
and no longer occurred after it.
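
The ordering idea behind the change can be sketched without Hadoop on the classpath. The class below is a toy model, not Hadoop's ShutdownHookManager and not the code in this pull request: it only assumes that hooks registered with a manager run one after another, highest priority first. The class name, the priority values 10 and 11, and the event strings are all made up for illustration. Without the Flume hook, the relative order of Flume's stop and the filesystem close is really a race; the model simply picks the bad interleaving seen in the report.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Toy model of a priority-ordered shutdown hook manager (hypothetical, not the Hadoop class). */
public class ShutdownOrderSketch {

    /** A hook together with the priority it was registered with. */
    private static final class Hook {
        final Runnable task;
        final int priority;

        Hook(Runnable task, int priority) {
            this.task = task;
            this.priority = priority;
        }
    }

    private final List<Hook> hooks = new ArrayList<>();

    public void addShutdownHook(Runnable task, int priority) {
        hooks.add(new Hook(task, priority));
    }

    /** Runs the registered hooks sequentially, highest priority first. */
    public void runAll() {
        List<Hook> ordered = new ArrayList<>(hooks);
        ordered.sort(Comparator.comparingInt((Hook h) -> h.priority).reversed());
        for (Hook h : ordered) {
            h.task.run();
        }
    }

    /** Simulates a shutdown and returns the events in the order they happened. */
    public static List<String> simulate(boolean registerFlumeHook) {
        List<String> events = new ArrayList<>();
        boolean[] fsOpen = {true};
        ShutdownOrderSketch manager = new ShutdownOrderSketch();

        // Stand-in for the hook that closes cached filesystems (priority is illustrative).
        manager.addShutdownHook(() -> {
            fsOpen[0] = false;
            events.add("filesystem closed");
        }, 10);

        // Stand-in for Flume stopping its sinks, which needs an open filesystem.
        Runnable stopFlume = () ->
            events.add(fsOpen[0] ? "tmp file renamed" : "IOException: Filesystem closed");

        if (registerFlumeHook) {
            // Higher priority than the filesystem hook, so it runs first.
            manager.addShutdownHook(stopFlume, 11);
            manager.runAll();
        } else {
            // Bad interleaving: the filesystem is closed before Flume stops.
            manager.runAll();
            stopFlume.run();
        }
        return events;
    }
}
```

Calling `ShutdownOrderSketch.simulate(true)` yields the rename before the close, while `simulate(false)` reproduces the "Filesystem closed" failure in this model.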

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/mcsanady/flume FLUME-3190

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/flume/pull/180.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #180
    
----
commit f3cff0bcc6112a4067ac606049da7f4bb58c20da
Author: Miklos Csanady <miklos.csan...@cloudera.com>
Date:   2017-11-03T10:21:40Z

    FLUME-3190: flume shutdown hook issue when both hbase and hdfs sink are in 
use

----


> flume shutdown hook issue when both hbase and hdfs sink are in use
> ------------------------------------------------------------------
>
>                 Key: FLUME-3190
>                 URL: https://issues.apache.org/jira/browse/FLUME-3190
>             Project: Flume
>          Issue Type: Bug
>    Affects Versions: 1.6.0
>            Reporter: Yuexin Zhang
>            Assignee: Miklos Csanady
>            Priority: Major
>
> When both the hdfs and hbase sinks are in use, during shutdown (KILL 
> SIGTERM) the hdfs sink cannot rename/close the .tmp hdfs file because the 
> underlying filesystem may already have been closed while the other 
> component was shutting down:
> {code:java}
> 2017/10/23 15:34:50,858 ERROR (AbstractHDFSWriter.hflushOrSync:268) - Error while trying to hflushOrSync!
> 2017/10/23 15:34:50,859 WARN (BucketWriter.close:400) - failed to close() HDFSWriter for file (/tmp/bothSource/FlumeData.1508744083526.tmp). Exception follows.
> java.io.IOException: Filesystem closed
>         at org.apache.hadoop.hdfs.DFSClient.checkOpen(DFSClient.java:860)
>         at org.apache.hadoop.hdfs.DFSOutputStream.flushOrSync(DFSOutputStream.java:2388)
>         at org.apache.hadoop.hdfs.DFSOutputStream.hflush(DFSOutputStream.java:2334)
>         at org.apache.hadoop.fs.FSDataOutputStream.hflush(FSDataOutputStream.java:130)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:498)
>         at org.apache.flume.sink.hdfs.AbstractHDFSWriter.hflushOrSync(AbstractHDFSWriter.java:265)
>         at org.apache.flume.sink.hdfs.HDFSDataStream.close(HDFSDataStream.java:134)
>         at org.apache.flume.sink.hdfs.BucketWriter$3.call(BucketWriter.java:327)
>         at org.apache.flume.sink.hdfs.BucketWriter$3.call(BucketWriter.java:323)
>         at org.apache.flume.sink.hdfs.BucketWriter$9$1.run(BucketWriter.java:701)
>         at org.apache.flume.auth.SimpleAuthenticator.execute(SimpleAuthenticator.java:50)
>         at org.apache.flume.sink.hdfs.BucketWriter$9.call(BucketWriter.java:698)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>         at java.lang.Thread.run(Thread.java:745)
> {code}
> The root cause is the HBase client's DynamicClassLoader injection (see 
> DynamicClassLoader.java in HBase). HBase added a feature at some point to 
> load JARs from HDFS dynamically into its class loader, and to do this it 
> obtains a DistributedFileSystem object via the standard FileSystem.get(…) or 
> an equivalent call.
> Flume, on the other hand, also uses FileSystem.get(…) in its HDFS 
> BucketWriter (every call returns a single instance from the cache), but 
> supplies a setting that disables automatic close at shutdown (look for 
> fs.automatic.close in BucketWriter.java).
> When the HBase sink is active, HBase indirectly shares that FileSystem 
> object through its internal/implicit DynamicClassLoader object, but it 
> obtains the object from the cache without specifying "do not auto-close at 
> shutdown", because HBase is not affected by the auto-close. However, since 
> the same FileSystem instance is now shared by one component that wants it 
> auto-closed and one that does not, shutdown causes a problem in Flume.
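
The sharing problem described above can be illustrated with a small stand-alone model. This is a sketch only: the class is not Hadoop's FileSystem cache, the URI is a placeholder, and it assumes that the auto-close decision is recorded by whichever caller causes the cached instance to be created, so a later caller asking for "no auto-close" still receives the already-flagged instance.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Toy model of a filesystem cache shared by two clients (hypothetical, not Hadoop code). */
public class SharedFsCacheSketch {

    /** Minimal stand-in for a filesystem handle. */
    public static final class ToyFileSystem {
        private boolean open = true;

        public boolean isOpen() {
            return open;
        }

        void close() {
            open = false;
        }
    }

    private final Map<String, ToyFileSystem> cache = new HashMap<>();
    private final Set<String> toAutoClose = new HashSet<>();

    /**
     * Returns the cached instance for the URI, creating it on first use.
     * Assumption of this model: only the creating call's flag is recorded.
     */
    public ToyFileSystem get(String uri, boolean automaticClose) {
        ToyFileSystem fs = cache.get(uri);
        if (fs == null) {
            fs = new ToyFileSystem();
            cache.put(uri, fs);
            if (automaticClose) {
                toAutoClose.add(uri);
            }
        }
        return fs;
    }

    /** Closes every instance that was flagged for automatic close. */
    public void closeAllAtShutdown() {
        for (String uri : toAutoClose) {
            cache.get(uri).close();
        }
    }

    /** Reports whether the handle held by the "Flume" caller survives shutdown. */
    public static boolean flumeFsOpenAfterShutdown(boolean hbaseSinkActive) {
        String uri = "hdfs://namenode.example:8020"; // placeholder URI
        SharedFsCacheSketch cache = new SharedFsCacheSketch();
        if (hbaseSinkActive) {
            cache.get(uri, true); // "HBase" asks first, with the default auto-close
        }
        ToyFileSystem flumeFs = cache.get(uri, false); // "Flume" disables auto-close
        cache.closeAllAtShutdown();
        return flumeFs.isOpen();
    }
}
```

In this model `flumeFsOpenAfterShutdown(false)` returns true and `flumeFsOpenAfterShutdown(true)` returns false, which mirrors the "Filesystem closed" error appearing only when the HBase sink is also active.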



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
