[ 
https://issues.apache.org/jira/browse/FLUME-2517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14185874#comment-14185874
 ] 

Hudson commented on FLUME-2517:
-------------------------------

UNSTABLE: Integrated in Flume-trunk-hbase-98 #40 (See 
[https://builds.apache.org/job/Flume-trunk-hbase-98/40/])
FLUME-2517. Cache SimpleDateFormat objects in bucketwriter for better 
performance. (hshreedharan: 
http://git-wip-us.apache.org/repos/asf/flume/repo?p=flume.git&a=commit&h=77d56e95ead7a04499aa83d1a78fcfbd957b20c7)
* flume-ng-core/src/main/java/org/apache/flume/formatter/output/BucketPath.java


> Performance issue: SimpleDateFormat constructor takes 30% of 
> HDFSEventSink.process()
> ------------------------------------------------------------------------------------
>
>                 Key: FLUME-2517
>                 URL: https://issues.apache.org/jira/browse/FLUME-2517
>             Project: Flume
>          Issue Type: Bug
>          Components: Sinks+Sources
>    Affects Versions: v1.5.0.1
>         Environment: linux i686
> java version "1.7.0_45"
>            Reporter: Pal Konyves
>            Assignee: Pal Konyves
>              Labels: performance
>         Attachments: flume_2517.patch, flume_2517.png
>
>
> I started investigating why the HDFS sink has such poor throughput in 
> v1.5.0.0. It seems to be better in 1.6.0.0 (current trunk).
> The PseudoTx channel was filling up because the HDFS sink could not write as 
> fast as data arrived from the source.
> Profiling with jconsole revealed that 30% of the time spent in the 
> HDFSEventSink.process() method goes to constructing SimpleDateFormat 
> objects. SimpleDateFormat is notoriously heavy and slow to construct. It is 
> also not thread-safe.
> The HDFS sink uses it to compute paths that contain date-time 
> wildcards. I will provide a patch that caches SimpleDateFormat objects per 
> thread. With this patch, the PseudoTx channel I used for testing no longer 
> filled up constantly, and throughput was much better.
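
For illustration only, the per-thread caching idea described above could be
sketched roughly as follows. This is not the attached flume_2517.patch or the
committed change to BucketPath.java. The class name DateFormatCache and its
methods are hypothetical, and fixing the time zone to UTC and the locale to
Locale.ROOT is an assumption made so that the output is deterministic.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.TimeZone;

// Hypothetical helper, not part of Flume: caches one SimpleDateFormat per
// (thread, pattern) pair so the constructor cost is paid only once per thread.
final class DateFormatCache {

    // Each thread gets its own map, because SimpleDateFormat is not thread-safe
    // and instances must not be shared between threads.
    private static final ThreadLocal<Map<String, SimpleDateFormat>> CACHE =
        new ThreadLocal<Map<String, SimpleDateFormat>>() {
            @Override
            protected Map<String, SimpleDateFormat> initialValue() {
                return new HashMap<String, SimpleDateFormat>();
            }
        };

    private DateFormatCache() {
    }

    // Returns the calling thread's formatter for the pattern, creating it on first use.
    static SimpleDateFormat get(String pattern) {
        Map<String, SimpleDateFormat> formats = CACHE.get();
        SimpleDateFormat format = formats.get(pattern);
        if (format == null) {
            // Assumption: UTC and Locale.ROOT, chosen only to keep the sketch deterministic.
            format = new SimpleDateFormat(pattern, Locale.ROOT);
            format.setTimeZone(TimeZone.getTimeZone("UTC"));
            formats.put(pattern, format);
        }
        return format;
    }

    // Formats a timestamp (milliseconds since the epoch) with the cached formatter.
    static String format(String pattern, long timestampMillis) {
        return get(pattern).format(new Date(timestampMillis));
    }
}
```

A caller would use DateFormatCache.format("yyyy/MM/dd", timestamp) in place of
constructing a new SimpleDateFormat for every event. The trade-off is one
small map per sink thread in exchange for skipping the constructor on the hot
path.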



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
