[ 
https://issues.apache.org/jira/browse/FLUME-3190?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Miklos Csanady reassigned FLUME-3190:
-------------------------------------

    Assignee: Miklos Csanady

> flume shutdown hook issue when both hbase and hdfs sink are in use
> ------------------------------------------------------------------
>
>                 Key: FLUME-3190
>                 URL: https://issues.apache.org/jira/browse/FLUME-3190
>             Project: Flume
>          Issue Type: Bug
>    Affects Versions: 1.6.0
>            Reporter: Yuexin Zhang
>            Assignee: Miklos Csanady
>            Priority: Major
>
> When both the HDFS and HBase sinks are in use, the HDFS sink cannot 
> rename/close its .tmp HDFS file during shutdown (kill with SIGTERM), because 
> the underlying filesystem may already have been closed while another 
> component was shutting down:
> {code:java}
> 2017/10/23 15:34:50,858 ERROR (AbstractHDFSWriter.hflushOrSync:268) - Error while trying to hflushOrSync!
> 2017/10/23 15:34:50,859 WARN (BucketWriter.close:400) - failed to close() HDFSWriter for file (/tmp/bothSource/FlumeData.1508744083526.tmp). Exception follows.
> java.io.IOException: Filesystem closed
>         at org.apache.hadoop.hdfs.DFSClient.checkOpen(DFSClient.java:860)
>         at org.apache.hadoop.hdfs.DFSOutputStream.flushOrSync(DFSOutputStream.java:2388)
>         at org.apache.hadoop.hdfs.DFSOutputStream.hflush(DFSOutputStream.java:2334)
>         at org.apache.hadoop.fs.FSDataOutputStream.hflush(FSDataOutputStream.java:130)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:498)
>         at org.apache.flume.sink.hdfs.AbstractHDFSWriter.hflushOrSync(AbstractHDFSWriter.java:265)
>         at org.apache.flume.sink.hdfs.HDFSDataStream.close(HDFSDataStream.java:134)
>         at org.apache.flume.sink.hdfs.BucketWriter$3.call(BucketWriter.java:327)
>         at org.apache.flume.sink.hdfs.BucketWriter$3.call(BucketWriter.java:323)
>         at org.apache.flume.sink.hdfs.BucketWriter$9$1.run(BucketWriter.java:701)
>         at org.apache.flume.auth.SimpleAuthenticator.execute(SimpleAuthenticator.java:50)
>         at org.apache.flume.sink.hdfs.BucketWriter$9.call(BucketWriter.java:698)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>         at java.lang.Thread.run(Thread.java:745)
> {code}
> The root cause is the HBase client's DynamicClassLoader injection (see 
> DynamicClassLoader.java in HBase). HBase added a feature at some point to 
> load JARs from HDFS dynamically into its class loader, and to do this it 
> obtains a DistributedFileSystem object via the standard FileSystem.get(…) / 
> equivalent call.
> Flume, on the other hand, also uses FileSystem.get(…) in its HDFS 
> BucketWriter (all calls return a single instance from the cache), but 
> supplies a setting that disables automatic close at shutdown (look for 
> fs.automatic.close in BucketWriter.java).
> When the HBase sink is active, HBase indirectly shares the same FileSystem 
> object for its internal/implicit DynamicClassLoader, but it takes the object 
> from the cache without specifying "do not auto-close at shutdown", because 
> HBase is not affected by that behavior. However, since the same FileSystem 
> instance is now shared by one user that wants it to auto-close and one that 
> does not, the shutdown causes a problem in Flume.
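
The sharing problem described above can be illustrated with a small toy model. This is only a sketch under stated assumptions: ToyFileSystem, ToyCache, the key "hdfs://namenode", and the rule that the caller creating the cached instance decides whether it is auto-closed are all hypothetical stand-ins invented for illustration. None of it is Hadoop, HBase, or Flume code.

```java
import java.io.IOException;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

public class SharedFsCacheSketch {

    /** Toy stand-in for a file system client (hypothetical, not Hadoop's FileSystem). */
    static final class ToyFileSystem {
        private boolean open = true;

        void close() {
            open = false;
        }

        /** Mimics the failure mode in the log: any call after close() throws. */
        void flush() throws IOException {
            if (!open) {
                throw new IOException("Filesystem closed");
            }
        }
    }

    /** Toy cache handing out one shared instance per key (hypothetical model). */
    static final class ToyCache {
        private final Map<String, ToyFileSystem> instances = new HashMap<>();
        private final Set<String> autoCloseKeys = new HashSet<>();

        ToyFileSystem get(String key, boolean autoClose) {
            ToyFileSystem fs = instances.get(key);
            if (fs == null) {
                fs = new ToyFileSystem();
                instances.put(key, fs);
                // Assumption of this sketch: only the creating caller's
                // auto-close preference is recorded for the shared instance.
                if (autoClose) {
                    autoCloseKeys.add(key);
                }
            }
            return fs;
        }

        /** Stands in for the cleanup that runs at JVM shutdown. */
        void closeAutoCloseInstances() {
            for (String key : autoCloseKeys) {
                instances.get(key).close();
            }
        }
    }

    /** Returns the error the sink-like caller sees after shutdown cleanup, or null if none. */
    static String simulate() {
        ToyCache cache = new ToyCache();
        // Class-loader-like caller: takes the instance with auto-close enabled.
        cache.get("hdfs://namenode", true);
        // Sink-like caller: asks for no auto-close, but receives the same cached instance.
        ToyFileSystem sinkFs = cache.get("hdfs://namenode", false);
        cache.closeAutoCloseInstances();
        try {
            sinkFs.flush();
            return null;
        } catch (IOException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(simulate()); // prints "Filesystem closed"
    }
}
```

In this model the sink-like caller's request to skip auto-close has no effect, because it receives an instance that another caller already registered for closing. That is the same shape of conflict as the one reported in this issue.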



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
