Hi,

Yes, it'd be great to get 1763 and 1794 into 1.3.1. I don't have time right now. If another committer does, I'd love to vote on an RC! :)
Brock

On Wed, Dec 19, 2012 at 4:34 AM, Rakos, Rudolf <rudolf.ra...@morganstanley.com> wrote:
> Brock, Hari,
>
> I can confirm that the patch in FLUME-1794 fixes the performance issue.
>
> I was wondering whether it is possible to ask for a new release (1.3.1)
> including the recent File Channel bug fixes?
>
> Trunk:
> https://git-wip-us.apache.org/repos/asf?p=flume.git;a=history;f=flume-ng-channels/flume-file-channel;h=cc779e886b4d6290723a43b4f874239150d93475;hb=trunk
> 1.3.0:
> https://git-wip-us.apache.org/repos/asf?p=flume.git;a=history;f=flume-ng-channels/flume-file-channel;h=cc93d99eac6d631e9200d122928d5e307621b4fe;hb=refs/heads/flume-1.3.0
>
> Unfortunately we cannot use trunk, and waiting for Flume 1.4.0 could take a
> few months.
> It's not a big problem if we need to stick with Flume 1.2.0, but according to
> Juhani Connolly this was causing high CPU usage with non-NFS File Channels
> too, so I think a new release would be better for the community.
>
> Regards,
> Rudolf
>
> -----Original Message-----
> From: Rakos, Rudolf (ISGT)
> Sent: Wednesday, December 19, 2012 9:10 AM
> To: user@flume.apache.org
> Subject: RE: Flume 1.3.0 - NFS + File Channel Performance
>
> Brock, Hari,
>
> Thank you very much for looking so quickly into this.
>
> We're aware that the general performance will not be that great using NFS,
> but having some "last minute" data on failover scenarios could be worth the
> performance cost.
>
> You were right.
> I've taken some thread dumps and I can confirm that FLUME-1609
> (File.getUsableSpace calls) is causing the issue. (I just don't understand
> how I could miss this hot spot during profiling.)
>
> I'll check whether the patch in FLUME-1794 fixes this.
>
> Thanks,
> Rudolf
>
> -----Original Message-----
> From: Brock Noland [mailto:br...@cloudera.com]
> Sent: Tuesday, December 18, 2012 10:09 PM
> To: user@flume.apache.org
> Subject: Re: Flume 1.3.0 - NFS + File Channel Performance
>
> Hi,
>
> If you do have a chance, it would be great to hear if the patch attached to this
> JIRA (https://issues.apache.org/jira/browse/FLUME-1794) fixes the performance
> problem.
>
> Brock
>
> On Tue, Dec 18, 2012 at 11:25 AM, Brock Noland <br...@cloudera.com> wrote:
>> Yeah I think we should do that check in the background and then update
>> a flag. This is how hdfs and mapred do it.
>>
>> On Tue, Dec 18, 2012 at 11:04 AM, Hari Shreedharan
>> <hshreedha...@cloudera.com> wrote:
>>> Yep. The disk space calls require an NFS call for each write, and
>>> that slows things down a lot.
>>>
>>> --
>>> Hari Shreedharan
>>>
>>> On Tuesday, December 18, 2012 at 8:43 AM, Brock Noland wrote:
>>>
>>> We'd need those thread dumps to help confirm but I bet that
>>> FLUME-1609 results in an NFS call on each operation on the channel.
>>>
>>> If that is true, that would explain why it works well on local disk.
>>>
>>> Brock
>>>
>>> On Tue, Dec 18, 2012 at 10:17 AM, Brock Noland <br...@cloudera.com> wrote:
>>>
>>> Hi,
>>>
>>> Hmm, yes in general performance is not going to be great over NFS,
>>> but there haven't been any FC changes that stick out here.
>>>
>>> Could you take 10 thread dumps of the agent running the file channel
>>> and 10 thread dumps of the agent sending data to the agent with the
>>> file channel? (You can address them to me directly since the list
>>> won't take attachments.)
>>>
>>> Are there any patterns, like it works for 40 seconds then times out
>>> and then works for 39 seconds, etc?
>>>
>>> Brock
>>>
>>> On Tue, Dec 18, 2012 at 10:07 AM, Rakos, Rudolf
>>> <rudolf.ra...@morganstanley.com> wrote:
>>>
>>> Hi,
>>>
>>> We’ve run into a strange problem regarding NFS and File Channel
>>> performance while evaluating the new version of Flume.
>>>
>>> We had no issues with the previous version (1.2.0).
>>>
>>> Our configuration looks like this:
>>>
>>> · Node1:
>>> (Avro RPC Clients ->) Avro Source and Custom Sources -> File Channel
>>> -> Avro Sink (-> Node 2)
>>>
>>> · Node2:
>>> (Node1s ->) Avro Source -> File Channel -> Custom Sink
>>>
>>> Both the checkpoint and the data directories of the File Channels are
>>> on NFS shares. We use the same share for checkpoint and data
>>> directories, but different shares for each Node. Unfortunately it is
>>> not an option for us to use local directories.
>>>
>>> The events are about 1 KB each, and the batch sizes are the following:
>>>
>>> · Avro RPC Clients: 1000
>>>
>>> · Custom Sources: 2000
>>>
>>> · Avro Sink: 5000
>>>
>>> · Custom Sink: 10000
>>>
>>> We are experiencing very slow File Channel performance compared to
>>> the previous version, and a high number of timeouts (almost always) in
>>> the Avro RPC Clients and the Avro Sink.
>>>
>>> Something like this:
>>>
>>> · 2012-12-18 15:43:31,828 [SinkRunner-PollingRunner-ExceptionCatchingSinkProcessor] WARN org.apache.flume.sink.AvroSink - Failed to send event batch
>>> org.apache.flume.EventDeliveryException: NettyAvroRpcClient { host: ***, port: *** }: Failed to send batch
>>>     at org.apache.flume.api.NettyAvroRpcClient.appendBatch(NettyAvroRpcClient.java:236) ~[flume-ng-sdk-1.3.0.jar:1.3.0]
>>>     ***
>>>     at org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:147) [flume-ng-core-1.3.0.jar:1.3.0]
>>>     at java.lang.Thread.run(Thread.java:662) [na:1.6.0_31]
>>> Caused by: org.apache.flume.EventDeliveryException: NettyAvroRpcClient { host: ***, port: *** }: Handshake timed out after 20000ms
>>>     at org.apache.flume.api.NettyAvroRpcClient.appendBatch(NettyAvroRpcClient.java:280) ~[flume-ng-sdk-1.3.0.jar:1.3.0]
>>>     at org.apache.flume.api.NettyAvroRpcClient.appendBatch(NettyAvroRpcClient.java:224) ~[flume-ng-sdk-1.3.0.jar:1.3.0]
>>>     ... 5 common frames omitted
>>> Caused by: java.util.concurrent.TimeoutException: null
>>>     at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:228) ~[na:1.6.0_31]
>>>     at java.util.concurrent.FutureTask.get(FutureTask.java:91) ~[na:1.6.0_31]
>>>     at org.apache.flume.api.NettyAvroRpcClient.appendBatch(NettyAvroRpcClient.java:278) ~[flume-ng-sdk-1.3.0.jar:1.3.0]
>>>     ... 6 common frames omitted
>>>
>>> (I had to remove some details, sorry for that.)
>>>
>>> We managed to narrow down the root cause of the issue to the File
>>> Channel, because:
>>>
>>> · Everything works fine if we switch to the Memory Channel or to the
>>> Old File Channel (1.2.0).
>>>
>>> · Everything works fine if we use local directories.
>>>
>>> We’ve tested this on multiple different PCs (both Windows and Linux).
>>>
>>> I spent the day debugging and profiling, but I could not find
>>> anything worth mentioning (nothing with excessive CPU usage, no
>>> threads are waiting too much, etc…). The only problem is that File
>>> Channel takes and puts take way more time than with the previous version.
>>>
>>> Could someone please try the File Channel on an NFS share?
>>>
>>> Does anyone have similar issues?
>>>
>>> Thank you for your help.
>>>
>>> Regards,
>>>
>>> Rudolf
>>>
>>> Rudolf Rakos
>>> Morgan Stanley | ISG Technology
>>> Lechner Odon fasor 8 | Floor 06
>>> Budapest, 1095
>>> Phone: +36 1 881-4011
>>> rudolf.ra...@morganstanley.com
>>>
>>> Be carbon conscious. Please consider our environment before printing
>>> this email.
>>>
>>> ________________________________
>>>
>>> NOTICE: Morgan Stanley is not acting as a municipal advisor and the
>>> opinions or views contained herein are not intended to be, and do not
>>> constitute, advice within the meaning of Section 975 of the
>>> Dodd-Frank Wall Street Reform and Consumer Protection Act. If you
>>> have received this communication in error, please destroy all
>>> electronic and paper copies and notify the sender immediately.
>>> Mistransmission is not intended to waive confidentiality or
>>> privilege. Morgan Stanley reserves the right, to the extent permitted
>>> under applicable law, to monitor electronic communications. This message is
>>> subject to terms available at the following link:
>>> http://www.morganstanley.com/disclaimers If you cannot access these
>>> links, please notify us by reply message and we will send the
>>> contents to you. By messaging with Morgan Stanley you consent to the
>>> foregoing.
>>>
>>> --
>>> Apache MRUnit - Unit testing MapReduce -
>>> http://incubator.apache.org/mrunit/
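
For readers following the thread: the idea Brock describes (do the disk space check in the background and update a flag, so puts and takes never call File.getUsableSpace themselves) might look roughly like the sketch below. This is only an illustration and is not the FLUME-1794 patch; the class name BackgroundSpaceMonitor, its methods, and the interval and threshold parameters are all made up for the example. It uses Java 8 syntax, which is newer than the 1.6 JVM shown in the stack trace.

```java
import java.io.File;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

/**
 * Hypothetical helper (not a Flume class): polls usable disk space on a
 * background thread and caches the result in a flag, so the write path
 * only reads a volatile boolean instead of hitting the file system
 * (or NFS server) on every operation.
 */
class BackgroundSpaceMonitor implements AutoCloseable {
    private final File dir;
    private final long minimumBytes;
    // Written by the background thread, read by channel threads.
    private volatile boolean enoughSpace = true;
    private ScheduledExecutorService scheduler;

    BackgroundSpaceMonitor(File dir, long minimumBytes) {
        this.dir = dir;
        this.minimumBytes = minimumBytes;
    }

    /** The only place that performs the potentially slow call. */
    void checkNow() {
        enoughSpace = dir.getUsableSpace() >= minimumBytes;
    }

    /** Cheap check for the put/take path: no I/O, just a flag read. */
    boolean hasEnoughSpace() {
        return enoughSpace;
    }

    /** Starts periodic checks on a daemon thread; repeated calls are no-ops. */
    synchronized void start(long intervalMillis) {
        if (scheduler != null) {
            return;
        }
        scheduler = Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "space-monitor");
            t.setDaemon(true);
            return t;
        });
        scheduler.scheduleWithFixedDelay(
            this::checkNow, 0, intervalMillis, TimeUnit.MILLISECONDS);
    }

    /** Stops the background checks; the last cached value stays readable. */
    @Override
    public synchronized void close() {
        if (scheduler != null) {
            scheduler.shutdownNow();
            scheduler = null;
        }
    }
}
```

The trade-off is that the flag can be stale by up to one interval, so a write may still fail on a disk that filled up since the last check; the write path has to handle that error anyway.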