Hi Ryan,

In case you missed it, I want to make that 'test.rb' into a Java/JUnit test and add it to the test suite. Still waiting to see if a -1 comes in on that idea, but nothing so far.
Also, I agree that all active devs have clusters of various sizes that can be used for testing, at least for now. But the project does not have a dedicated shared resource; it depends on the availability of resources at pset, or su, or tm, etc., which are currently used in an ad hoc manner. I know it's not exactly top of the list right now, but I do think an automated suite for running repeatable performance and reliability/fault tolerance tests at some reasonable scale, deployable onto EC2 via some script, is something to at least consider. I'm happy to contribute the first "application".

- Andy

________________________________
From: Ryan Rawson <[email protected]>
To: [email protected]
Sent: Sunday, June 14, 2009 6:54:24 PM
Subject: Re: scanner is returning everything in parent region plus one of the daughters?

Hey,

Yes, 1304 has revealed weaknesses in the automated tests. It would be nice if they were fully covering all edge cases and concurrent scenarios, but such as it goes.

I'm not sure we need to be renting EC2 time... I have clusters, and so do pset folks, and we do run tests and verification on them. It's just that 1304 hit, just in time to have to prep a hundred slides and 3 talks. It was hoped there were few bugs, but 1304 really caused some neato bugs.

I appreciate test.rb - but moving forward I think all tests should remain in Java. Dynamic scripting languages on the JVM are very difficult to debug top to bottom. JUnit is best really :-)

-ryan

On Sun, Jun 14, 2009 at 10:54 AM, Andrew Purtell <[email protected]> wrote:

> Hi J-D,
>
> I agree on all your points. Regarding test hosting, I wonder if anyone
> has resources available to dedicate on a long term basis. I have a 4 node
> testbed which could conceivably run some suite once per day and generate
> some automated report, but I can't guarantee the availability of it. We
> might also consider EC2, as long as the tests are all self contained, all
> I/O between instances only, no data in/out or S3 charges. Using the usage
> calculator (http://calculator.s3.amazonaws.com/calc5.html), it seems that
> 5 extra large instances running for 5 hours once per day will cost
> $140/month. 10 of them would cost $280, etc. That is not a large figure.
>
> Further, this 'test.rb' thing is a distillation of some of the HBase usage
> of my crawler application, the write path. I may also simulate some of the
> scan/read path, the document processing bits. It would be great if we can
> get other contributions of test cases that simulate real world
> applications. Maybe there are examples to draw on from stuff running at
> Powerset, Streamy, Openspaces, etc.
>
> - Andy
>
> ________________________________
> From: Jean-Daniel Cryans <[email protected]>
> To: [email protected]
> Sent: Sunday, June 14, 2009 9:59:26 AM
> Subject: Re: scanner is returning everything in parent region plus one of
> the daughters?
>
> Andrew,
>
> +1 I think it's a great idea.
>
> Building on that, I think we should have system-level tests to make
> sure we don't break performance and reliability. For example, an
> intensive and simultaneous read/write test of a couple of millions of
> rows. We could even think of killing a region server or two during
> that test (and a master of course). Currently, I don't think it's
> easily doable on Hudson so someone would have to host it on a small
> cluster.
>
> J-D
>
> On Sun, Jun 14, 2009 at 12:52 PM, Andrew Purtell<[email protected]> wrote:
> > This possibly belongs in one of the new existing/open issues put up over
> > the past few days:
> >
> > Insert 1000 rows with random row keys, and induce a split (see test.rb
> > attached to HBASE-1500). I would expect that no more than 1000 rows
> > should be returned from a row count. However, the following is a series
> > of row counts obtained after running the test, with total
> > reinitialization in between, 5 times:
> >
> > 1516
> > 1492
> > 1497
> > 1509
> > 1501
> >
> > Also the shell provides an additional clue:
> >
> > Current count: 1000, row: ffdcee2a75742697b375edef62fa4b75
> >
> > 1516 row(s) in 2.9530 seconds
> >
> > Looks like the parent region is fully iterated first, then in addition
> > one of the daughters?
> >
> > Also, as these issues come up, kindly consider adding test cases to the
> > test suite to catch these regressions. It seems the current coverage for
> > scanners is letting big issues pass unnoticed.
> >
> > One thing we could do right away is commit my 'test.rb' reimplemented
> > as Java/JUnit into the suite, with some additional logic to test that
> > the scanners return the count of unique row keys inserted. If no -1 I
> > will go ahead and do that.
> >
> > - Andy
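
For reference, here is a minimal sketch of the check proposed in the original message, that a scan returns exactly the unique row keys inserted. It is plain Java with no HBase client calls: scannedKeys stands in for whatever row keys a scanner over the table would return after the split. ScanCountCheck, findDuplicates, assertScanMatchesInserts and randomRowKeys are hypothetical names, not part of HBase or of test.rb, and the 32-hex-character key shape is an assumption based on the row key shown in the shell output.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.UUID;

// Hypothetical helper, not part of HBase or test.rb: compares the row keys
// a scan returned against the row keys that were written.
final class ScanCountCheck {

    /** Returns every key the scan produced more than once, in scan order. */
    static List<String> findDuplicates(List<String> scannedKeys) {
        Set<String> seen = new HashSet<>();
        List<String> duplicates = new ArrayList<>();
        for (String key : scannedKeys) {
            if (!seen.add(key)) {          // add() is false if already seen
                duplicates.add(key);
            }
        }
        return duplicates;
    }

    /** Throws AssertionError unless each inserted key was scanned exactly once. */
    static void assertScanMatchesInserts(Set<String> insertedKeys,
                                         List<String> scannedKeys) {
        List<String> duplicates = findDuplicates(scannedKeys);
        if (!duplicates.isEmpty()) {
            throw new AssertionError(duplicates.size()
                + " duplicate row(s) returned, first: " + duplicates.get(0));
        }
        Set<String> scanned = new HashSet<>(scannedKeys);
        if (!scanned.equals(insertedKeys)) {
            throw new AssertionError("scanned " + scanned.size()
                + " unique rows, inserted " + insertedKeys.size());
        }
    }

    /** Generates n distinct random row keys of 32 hex characters (assumed shape). */
    static Set<String> randomRowKeys(int n) {
        Set<String> keys = new HashSet<>();
        while (keys.size() < n) {
            keys.add(UUID.randomUUID().toString().replace("-", ""));
        }
        return keys;
    }
}
```

In a JUnit test the inserted set would be built while writing the rows, and the scanned list filled from the scanner once the split has happened. A parent region iterated in addition to one of its daughters would then show up as duplicate keys, not only as an inflated count.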
