It may not be *the* bug, but if you hit it, it will produce *the* problem. :)
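To make the race concrete, here is a minimal sketch of the pattern I described (this is not the actual ThreadSafeQueue source; the names BuggyRing, SIZE, next_i, etc. are made up, and memory ordering and how the element itself gets published to pullers are ignored).  The point is that the full test uses a possibly stale head, while the push is committed by a compare-and-swap on the tail alone:

#include <atomic>
#include <cstdint>

// Illustrative only; not Click's ThreadSafeQueue.
struct BuggyRing {
    static constexpr uint32_t SIZE = 128;   // ring capacity; one slot kept empty
    void *_q[SIZE] = {};
    std::atomic<uint32_t> _head{0};         // next slot to pull
    std::atomic<uint32_t> _tail{0};         // next slot to push

    static uint32_t next_i(uint32_t i) { return (i + 1) % SIZE; }

    bool push(void *p) {
        while (true) {
            uint32_t t = _tail.load();      // may be stale by the time we CAS
            uint32_t h = _head.load();      // read independently of the tail
            uint32_t nt = next_i(t);
            if (nt == h)                    // "full" test uses the stale pair
                return false;
            // If this thread stalls here, other pushers and pullers can run
            // until the tail wraps all the way back to t while the head has
            // moved (e.g. _head==2, _tail==1 in the example quoted below).
            // The CAS then succeeds even though the queue is now full: we
            // overwrite an occupied slot, and tail==head makes it look empty.
            if (_tail.compare_exchange_weak(t, nt)) {
                _q[t] = p;
                return true;
            }
        }
    }
};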
I think the fix will work.

--- On Thu, 2/24/11, Eddie Kohler <[email protected]> wrote:

> From: Eddie Kohler <[email protected]>
> Subject: Re: [Click] Conversion of threads
> To: "Bobby Longpocket" <[email protected]>
> Cc: "Cliff Frey" <[email protected]>, [email protected]
> Date: Thursday, February 24, 2011, 4:23 PM
>
> Thank you very, very much for this very clear explanation.
>
> That sure is *a* bug (not sure it is *the*).  What do you think of the
> following fix?
>
> http://www.read.cs.ucla.edu/gitweb?p=click;a=commitdiff;h=0cbe4e8bdfbee6efe127cdd714c145cd80e89af9;hp=968964c3e344c2898681e9644d51b20669fef6cd
>
> Eddie
>
> On 02/24/2011 02:48 PM, Bobby Longpocket wrote:
> > The bug is in ThreadSafeQueue.
> >
> > You can't treat head and tail independently in the push/pull
> > operations.  The result of the current code is that you can fill the
> > queue beyond capacity (turning it into an 'empty' queue as a result).
> >
> > This happens because a thread can be using a stale head and tail, and
> > this can go unnoticed if the current value of tail (in push) or head
> > (in pull) has wrapped around to the original value.
> >
> > For instance, suppose a thread gets into push with _head==3 and
> > _tail==1 (one slot left), then gets delayed for a while...
> >
> > ...meanwhile, other pushers and pullers work on the queue...
> > head=3;tail=2 ... head=50;tail=49 ... head=2;tail=1...
> >
> > ...now our thread gets back to work.  _tail/_xtail is 1 again, so we
> > pass the compare-and-swap and we'll do our push with next-tail==2
> > even though h==2.
> >
> > --- On Thu, 2/24/11, Eddie Kohler <[email protected]> wrote:
> >
> >> From: Eddie Kohler <[email protected]>
> >> Subject: Re: [Click] Conversion of threads
> >> To: "Cliff Frey" <[email protected]>
> >> Cc: [email protected]
> >> Date: Thursday, February 24, 2011, 10:30 AM
> >>
> >> Hey
> >>
> >> I thought I found a bug in ThreadSafeQueue, but convinced myself it
> >> is OK.  There could be problems in the threading core (which has
> >> been changed for core performance work).  However, I do not observe
> >> Click getting stuck.  Rather, I observe it performing radically
> >> slowly.  The attached config prints counter information, etc. every
> >> second.  The counters keep going up.  Although the InfiniteSources
> >> often look unscheduled, this is because they *are* unscheduled much
> >> of the time; the fast_reschedule() happens at the end of
> >> InfiniteSource::run_task().  I don't know what's going on.
> >> Commenting out all notification does not help.
> >>
> >> Eddie
> >>
> >> On 2/24/11 8:12 AM, Cliff Frey wrote:
> >>> I wanted to benchmark the difference between ThreadSafeQueue and
> >>> one-queue-per-thread + RoundRobinScheduler + Unqueue +
> >>> another-Queue.  However, I ended up finding a bug.
> >>>
> >>> This config:
> >>>
> >>> elementclass Foo {
> >>>   is :: InfiniteSource -> [0] output;
> >>> };
> >>> f0::Foo, f1::Foo, f2::Foo, f3::Foo, f4::Foo, f5::Foo
> >>>   -> master_q :: ThreadSafeQueue
> >>>   -> uq2 :: Unqueue
> >>>   -> c :: Counter(COUNT_CALL 4000000 stop)
> >>>   -> Discard;
> >>> StaticThreadSched(
> >>>   f0/is 0,
> >>>   f1/is 1,
> >>>   f2/is 2,
> >>>   f3/is 3,
> >>>   f4/is 4,
> >>>   f5/is 5,
> >>>   uq2 7);
> >>>
> >>> When run with 'click --threads=8 foo.click', it ends up getting
> >>> stuck, with an empty master_q and none of the InfiniteSource
> >>> elements scheduled.  I'm running on an i7 CPU with 4 cores +
> >>> hyperthreading, giving 8 CPUs from Linux's perspective.
> >>>
> >>> Also interesting is how many drops are seen at the queue in this
> >>> case (not too surprising really):
> >>>
> >>> read f0/is.scheduled
> >>> false
> >>> read master_q.length
> >>> 0
> >>> read master_q.drops
> >>> 163540
> >>> read c.count
> >>> 231797
> >>>
> >>> Cliff
> >>>
> >>> On Thu, Feb 24, 2011 at 7:27 AM, Eddie Kohler <[email protected]> wrote:
> >>>
> >>>     Try ThreadSafeQueue at the converge point, which as Bobby points
> >>>     out should be correct even when multithreaded, but doesn't
> >>>     require locks.  However, it will still be slowish, as would be
> >>>     anything that tried to enforce strict order, since the queue
> >>>     pointers will bounce among cores.
> >>>
> >>>     Eddie
> >>>
> >>>     On 2/23/11 9:36 PM, Philip Prindeville wrote:
> >>>     > I was wondering... if I'm running multiple duplicate paths
> >>>     > each on its own core and I eventually need to collect them up
> >>>     > so I can resequence packets and pass them up into
> >>>     > user-space... how feasible is it to go from threaded and
> >>>     > lockless to converging into a single locked queue (fan-in)?
> >>>     >
> >>>     > I figure it would be the best of both worlds because most of
> >>>     > the operations could happen threaded and without locking, and
> >>>     > the locking only needs to be enforced in the final stage...
> >>>     >
> >>>     > Thanks.
> >>>     >
> >>>     > -Philip
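For reference, the general shape of a fix for this kind of index race is to make the full/empty test and the index update atomic with respect to each other, for example by packing head and tail into one word so a single compare-and-swap covers both.  The sketch below is only an illustration of that idea; I have not diffed it against the commit Eddie linked, the names PackedRing, _hd_tl, etc. are made up, and it still ignores how elements are published to pullers:

#include <atomic>
#include <cstdint>

// Illustrative only: one way to close the stale-index race.
struct PackedRing {
    static constexpr uint32_t SIZE = 128;
    void *_q[SIZE] = {};
    std::atomic<uint64_t> _hd_tl{0};        // high 32 bits: head, low 32 bits: tail

    static uint32_t head_of(uint64_t w) { return uint32_t(w >> 32); }
    static uint32_t tail_of(uint64_t w) { return uint32_t(w); }
    static uint64_t pack(uint32_t h, uint32_t t) { return (uint64_t(h) << 32) | t; }
    static uint32_t next_i(uint32_t i) { return (i + 1) % SIZE; }

    bool push(void *p) {
        uint64_t w = _hd_tl.load();
        while (true) {
            uint32_t h = head_of(w), t = tail_of(w), nt = next_i(t);
            if (nt == h)                    // full, judged from one consistent snapshot
                return false;
            // The CAS fails if either index changed since the load, so a
            // wrapped-around tail with a moved head can no longer slip through.
            if (_hd_tl.compare_exchange_weak(w, pack(h, nt))) {
                _q[t] = p;
                return true;
            }
            // On failure, w now holds the fresh value; retry with it.
        }
    }
};

Whatever the linked commit actually does, the underlying requirement is the same: the full/empty test and the index update have to see a consistent view of both head and tail.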
