Good idea, let me try it.

J-D
On Wed, Jan 13, 2010 at 11:01 AM, Joydeep Sarma <jsensa...@gmail.com> wrote:
> i posted on the jira as well - but we should be able to simulate the
> effect of the patch.
>
> if the sync was simulated merely as a sleep (for 2-3ms - whatever is the
> average RTT for dfs write pipeline) instead of an actual call into dfs
> client - it should simulate the effect of the patch. (the appends
> would proceed in parallel, each sync would block for some time).
>
> so we should be able to test whether this gets a performance win for
> the queue threshold=1 case.
>
> On Wed, Jan 13, 2010 at 10:43 AM, Dhruba Borthakur <dhr...@gmail.com> wrote:
>> Awesome, I will try to post a patch soon and will let you know as soon as I
>> have the first version ready.
>>
>> thanks,
>> dhruba
>>
>> On Wed, Jan 13, 2010 at 10:40 AM, Jean-Daniel Cryans <jdcry...@apache.org> wrote:
>>
>>> I'll be happy to benchmark, we already have code to test the
>>> multi-client hitting 1 region server case.
>>> know
>>> J-D
>>>
>>> On Wed, Jan 13, 2010 at 10:38 AM, Dhruba Borthakur <dhr...@gmail.com> wrote:
>>> > I will try to make a patch for it first. depending on the complexity of the
>>> > patch code, we can decide which release it can go in.
>>> >
>>> > thanks,
>>> > dhruba
>>> >
>>> > On Wed, Jan 13, 2010 at 9:56 AM, Jean-Daniel Cryans <jdcry...@apache.org> wrote:
>>> >
>>> >> That's great dhruba, I guess the sooner it could go in is 0.21.1?
>>> >>
>>> >> J-D
>>> >>
>>> >> On Wed, Jan 13, 2010 at 8:51 AM, Dhruba Borthakur <dhr...@gmail.com> wrote:
>>> >> > I opened http://issues.apache.org/jira/browse/HDFS-895 for this one.
>>> >> >
>>> >> > thanks,
>>> >> > dhruba
>>> >> >
>>> >> > On Tue, Jan 12, 2010 at 9:41 PM, Joydeep Sarma <jsensa...@gmail.com> wrote:
>>> >> >
>>> >> >> this is internal to the dfsclient. this would explain why performance
>>> >> >> would suck with queue threshold of 1.
>>> >> >>
>>> >> >> leave it up to Dhruba to explain the details.
>>> >> >>
>>> >> >> On Tue, Jan 12, 2010 at 9:16 PM, stack <st...@duboce.net> wrote:
>>> >> >> > On Tue, Jan 12, 2010 at 9:12 PM, stack <st...@duboce.net> wrote:
>>> >> >> >
>>> >> >> >> > any IO to a HDFS-file (appends, writes, etc) are actually blocked on a
>>> >> >> >> > pending sync. "sync" in HDFS is a pretty heavyweight operation as it
>>> >> >> >> > stands.
>>> >> >> >>
>>> >> >> >> i think this is likely to explain limited throughput with the default
>>> >> >> >> write queue threshold of 1. if the appends cannot make progress while
>>> >> >> >> one is waiting for the sync - then the write pipeline is going to be
>>> >> >> >> idle most of the time (with queue threshold of 1).
>>> >> >> >>
>>> >> >> >> i think it would be good to have the sync not block other writers on
>>> >> >> >> the file/pipeline. logically - it's not clear why it needs to (since
>>> >> >> >> the sync is just a wait for the completion as of some write
>>> >> >> >> transaction id - allowing new ones to be queued up subsequently).
>>> >> >> >
>>> >> >> > Are you talking about internal to DFSClient Joydeep? Or some
>>> >> >> > synchronization block up in hlog?
>>> >> >> >
>>> >> >> > St.Ack
>>> >> >
>>> >> > --
>>> >> > Connect to me at http://www.facebook.com/dhruba
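
For anyone who wants to try the idea from the quoted thread outside HBase first, here is a rough sketch of the experiment Joydeep describes: replace the real sync with a sleep and compare a sync that blocks other appends against one that does not. This is not DFSClient or HLog code. The names `run_writers` and `sync_holds_lock`, the single lock standing in for the write pipeline, and the choice of 50 writers are all assumptions made for illustration; the 3 ms sleep stands in for the 2-3 ms RTT mentioned in the thread.

```python
import threading
import time


def run_writers(n_writers, sync_rtt, sync_holds_lock):
    """Run n_writers threads that each append one edit and then "sync".

    Returns (elapsed_seconds, number_of_edits_appended).
    All names here are hypothetical; nothing below calls into HDFS.
    """
    log = []                      # stand-in for the shared log file
    log_lock = threading.Lock()   # serializes appends to that one file

    def writer(edit):
        if sync_holds_lock:
            # Behaviour described in the thread: other appends cannot make
            # progress while a sync is pending, so we sleep under the lock.
            with log_lock:
                log.append(edit)
                time.sleep(sync_rtt)
        else:
            # Proposed behaviour: only the append is serialized; the
            # simulated sync waits without blocking other writers.
            with log_lock:
                log.append(edit)
            time.sleep(sync_rtt)

    threads = [threading.Thread(target=writer, args=(i,))
               for i in range(n_writers)]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start, len(log)


if __name__ == "__main__":
    for holds_lock in (True, False):
        elapsed, edits = run_writers(50, 0.003, holds_lock)
        print(f"sync_holds_lock={holds_lock}: "
              f"{edits} edits in {elapsed * 1000:.1f} ms")
```

With the sleep under the lock the syncs run one after another, so elapsed time grows with the number of writers; with the sleep outside the lock the waits overlap. The absolute numbers depend on the machine and say nothing about real HDFS latencies, only about the shape of the effect the patch is expected to have.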