Hi,

Yes, it'd be great to get 1763 and 1794 into 1.3.1. I don't have time
right now, but if another committer does, I'd love to vote on an
RC! :)

Brock

On Wed, Dec 19, 2012 at 4:34 AM, Rakos, Rudolf
<rudolf.ra...@morganstanley.com> wrote:
> Brock, Hari,
>
> I can confirm that the patch in FLUME-1794 fixes the performance issue.
>
> I was wondering whether it is possible to ask for a new release (1.3.1) 
> including the recent File Channel bug fixes?
>
>   Trunk: 
> https://git-wip-us.apache.org/repos/asf?p=flume.git;a=history;f=flume-ng-channels/flume-file-channel;h=cc779e886b4d6290723a43b4f874239150d93475;hb=trunk
>   1.3.0: 
> https://git-wip-us.apache.org/repos/asf?p=flume.git;a=history;f=flume-ng-channels/flume-file-channel;h=cc93d99eac6d631e9200d122928d5e307621b4fe;hb=refs/heads/flume-1.3.0
>
> Unfortunately we cannot use trunk, and waiting for Flume 1.4.0 could take a 
> few months.
> It's not a big problem if we need to stick with Flume 1.2.0, but according to 
> Juhani Connolly this was causing high CPU usage with non-NFS File Channels 
> too, so I think maybe it would be better for the community.
>
> Regards,
> Rudolf
>
> -----Original Message-----
> From: Rakos, Rudolf (ISGT)
> Sent: Wednesday, December 19, 2012 9:10 AM
> To: user@flume.apache.org
> Subject: RE: Flume 1.3.0 - NFS + File Channel Performance
>
> Brock, Hari,
>
> Thank you very much for looking so quickly into this.
>
> We're aware that the general performance will not be that great using NFS, 
> but having some "last minute" data on failover scenarios could be worth the 
> performance cost.
>
> You were right.
> I've taken some thread dumps and I can confirm that FLUME-1609 
> (the File.getUsableSpace calls) is causing the issue. (I just don't 
> understand how I could have missed this hot spot during profiling.)
>
> I'll check whether the patch in FLUME-1794 fixes this.
>
> Thanks,
> Rudolf
>
> -----Original Message-----
> From: Brock Noland [mailto:br...@cloudera.com]
> Sent: Tuesday, December 18, 2012 10:09 PM
> To: user@flume.apache.org
> Subject: Re: Flume 1.3.0 - NFS + File Channel Performance
>
> Hi,
>
> If you do have a chance, it would be great to hear if the patch attached to 
> this JIRA (https://issues.apache.org/jira/browse/FLUME-1794) fixes the 
> performance problem.
>
> Brock
>
> On Tue, Dec 18, 2012 at 11:25 AM, Brock Noland <br...@cloudera.com> wrote:
>> Yeah I think we should do that check in the background and then update
>> a flag. This is how HDFS and mapred do it.
>>
>> On Tue, Dec 18, 2012 at 11:04 AM, Hari Shreedharan
>> <hshreedha...@cloudera.com> wrote:
>>> Yep. The disk space calls require an NFS call for each write, and
>>> that slows things down a lot.
>>>
>>> --
>>> Hari Shreedharan
>>>
>>> On Tuesday, December 18, 2012 at 8:43 AM, Brock Noland wrote:
>>>
>>> We'd need those thread dumps to help confirm but I bet that
>>> FLUME-1609 results in an NFS call on each operation on the channel.
>>>
>>> If that is true, that would explain why it works well on local disk.
>>>
>>> Brock
>>>
>>> On Tue, Dec 18, 2012 at 10:17 AM, Brock Noland <br...@cloudera.com> wrote:
>>>
>>> Hi,
>>>
>>> Hmm, yes in general performance is not going to be great over NFS,
>>> but there haven't been any FC changes that stick out here.
>>>
>>> Could you take 10 thread dumps of the agent running the file channel
>>> and 10 thread dumps of the agent sending data to the agent with the
>>> file channel? (You can send them to me directly since the list
>>> won't take attachments.)
>>>
>>> Are there any patterns, like it works for 40 seconds then times out
>>> and then works for 39 seconds, etc?
>>>
>>> Brock
>>>
>>> On Tue, Dec 18, 2012 at 10:07 AM, Rakos, Rudolf
>>> <rudolf.ra...@morganstanley.com> wrote:
>>>
>>> Hi,
>>>
>>>
>>>
>>> We’ve run into a strange problem regarding NFS and File Channel
>>> performance while evaluating the new version of Flume.
>>>
>>> We had no issues with the previous version (1.2.0).
>>>
>>>
>>>
>>> Our configuration looks like this:
>>>
>>> · Node1:
>>> (Avro RPC Clients ->) Avro Source and Custom Sources -> File Channel
>>> -> Avro Sink (-> Node 2)
>>>
>>> · Node2:
>>> (Node1s ->) Avro Source -> File Channel -> Custom Sink
>>>
>>>
>>>
>>> Both the checkpoint and the data directories of the File Channels are
>>> on NFS shares. We use the same share for checkpoint and data
>>> directories, but different shares for each Node. Unfortunately it is
>>> not an option for us to use local directories.
>>>
>>> The events are about 1 KB each, and the batch sizes are the following:
>>>
>>> · Avro RPC Clients: 1000
>>>
>>> · Custom Sources: 2000
>>>
>>> · Avro Sink: 5000
>>>
>>> · Custom Sink: 10000
>>>
>>>
>>>
>>> We are experiencing very slow File Channel performance compared to
>>> the previous version, and a high number of timeouts (almost always) in
>>> the Avro RPC Clients and the Avro Sink.
>>>
>>> Something like this:
>>>
>>> · 2012-12-18 15:43:31,828 [SinkRunner-PollingRunner-ExceptionCatchingSinkProcessor] WARN org.apache.flume.sink.AvroSink - Failed to send event batch
>>> org.apache.flume.EventDeliveryException: NettyAvroRpcClient { host: ***, port: *** }: Failed to send batch
>>>     at org.apache.flume.api.NettyAvroRpcClient.appendBatch(NettyAvroRpcClient.java:236) ~[flume-ng-sdk-1.3.0.jar:1.3.0]
>>>     ***
>>>     at org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:147) [flume-ng-core-1.3.0.jar:1.3.0]
>>>     at java.lang.Thread.run(Thread.java:662) [na:1.6.0_31]
>>> Caused by: org.apache.flume.EventDeliveryException: NettyAvroRpcClient { host: ***, port: *** }: Handshake timed out after 20000ms
>>>     at org.apache.flume.api.NettyAvroRpcClient.appendBatch(NettyAvroRpcClient.java:280) ~[flume-ng-sdk-1.3.0.jar:1.3.0]
>>>     at org.apache.flume.api.NettyAvroRpcClient.appendBatch(NettyAvroRpcClient.java:224) ~[flume-ng-sdk-1.3.0.jar:1.3.0]
>>>     ... 5 common frames omitted
>>> Caused by: java.util.concurrent.TimeoutException: null
>>>     at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:228) ~[na:1.6.0_31]
>>>     at java.util.concurrent.FutureTask.get(FutureTask.java:91) ~[na:1.6.0_31]
>>>     at org.apache.flume.api.NettyAvroRpcClient.appendBatch(NettyAvroRpcClient.java:278) ~[flume-ng-sdk-1.3.0.jar:1.3.0]
>>>     ... 6 common frames omitted
>>>
>>> (I had to remove some details, sorry for that.)
>>>
>>>
>>>
>>> We managed to narrow down the root cause of the issue to the File
>>> Channel, because:
>>>
>>> · Everything works fine if we switch to the Memory Channel or to the
>>> Old File Channel (1.2.0).
>>>
>>> · Everything works fine if we use local directories.
>>>
>>> We’ve tested this on multiple different PCs (both Windows and Linux).
>>>
>>>
>>>
>>> I spent the day debugging and profiling, but I could not find
>>> anything worth mentioning (nothing with excessive CPU usage, no
>>> threads are waiting too much, etc…). The only problem is that File
>>> Channel takes and puts take way more time than with the previous version.
>>>
>>>
>>>
>>>
>>>
>>> Could someone please try the File Channel on an NFS share?
>>>
>>> Does anyone have similar issues?
>>>
>>>
>>>
>>> Thank you for your help.
>>>
>>>
>>>
>>> Regards,
>>>
>>> Rudolf
>>>
>>>
>>>
>>> Rudolf Rakos
>>> Morgan Stanley | ISG Technology
>>> Lechner Odon fasor 8 | Floor 06
>>> Budapest, 1095
>>> Phone: +36 1 881-4011
>>> rudolf.ra...@morganstanley.com
>>>
>>>
>>> Be carbon conscious. Please consider our environment before printing
>>> this email.
>>>
>>>
>>>
>>>
>>> ________________________________
>>>
>>> NOTICE: Morgan Stanley is not acting as a municipal advisor and the
>>> opinions or views contained herein are not intended to be, and do not
>>> constitute, advice within the meaning of Section 975 of the
>>> Dodd-Frank Wall Street Reform and Consumer Protection Act. If you
>>> have received this communication in error, please destroy all
>>> electronic and paper copies and notify the sender immediately.
>>> Mistransmission is not intended to waive confidentiality or
>>> privilege. Morgan Stanley reserves the right, to the extent permitted
>>> under applicable law, to monitor electronic communications. This message is 
>>> subject to terms available at the following link:
>>> http://www.morganstanley.com/disclaimers If you cannot access these
>>> links, please notify us by reply message and we will send the
>>> contents to you. By messaging with Morgan Stanley you consent to the 
>>> foregoing.
>>>
>>>
>>>
>>>
>>> --
>>> Apache MRUnit - Unit testing MapReduce -
>>> http://incubator.apache.org/mrunit/



-- 
Apache MRUnit - Unit testing MapReduce - http://incubator.apache.org/mrunit/
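
Editor's sketch of the approach suggested in the thread: run the disk space check in the background and update a flag, so the write path does not call File.getUsableSpace (and therefore does not trigger an NFS call) on every operation. This is not the FLUME-1794 patch. The class name CachedDiskSpaceCheck and the parameters minBytes and intervalMillis are hypothetical, chosen only for illustration. The lambda syntax also assumes Java 8 or later, whereas the thread's agents ran on Java 1.6.

```java
import java.io.File;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical helper (not a Flume class): caches the result of a disk space check. */
public class CachedDiskSpaceCheck implements AutoCloseable {
    private final File dir;
    private final long minBytes;
    // Flag read by the hot path; only the background task writes to it.
    private final AtomicBoolean enoughSpace = new AtomicBoolean(true);
    private final ScheduledExecutorService scheduler;

    public CachedDiskSpaceCheck(File dir, long minBytes, long intervalMillis) {
        this.dir = dir;
        this.minBytes = minBytes;
        refresh(); // one synchronous check so the flag is valid immediately
        scheduler = Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "disk-space-check");
            t.setDaemon(true); // do not keep the JVM alive for this task
            return t;
        });
        scheduler.scheduleWithFixedDelay(
                this::refresh, intervalMillis, intervalMillis, TimeUnit.MILLISECONDS);
    }

    /** The only place that touches the file system (potentially a slow NFS call). */
    private void refresh() {
        enoughSpace.set(dir.getUsableSpace() >= minBytes);
    }

    /** Cheap in-memory read, safe to call on every put or take. */
    public boolean hasEnoughSpace() {
        return enoughSpace.get();
    }

    @Override
    public void close() {
        scheduler.shutdownNow();
    }
}
```

A caller would construct one instance per data directory and consult hasEnoughSpace() before each write. The trade-off is that the flag can be stale by up to intervalMillis, so the threshold needs some headroom.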
