Yep; I mentioned your GDS-backend work to the HDF folks. But your email is much more detailed than what I mentioned -- thanks!
On Nov 7, 2010, at 1:02 AM, Mike Dubman wrote:

> Hi,
>
> Also, there is an MTT option to select Google Datastore as a storage backend
> for MTT results.
>
> Pros:
> - Your data is stored in Google's cloud
> - You can access your data from scripts
> - You can create a custom UI for your data visualization
> - You can use Google's default datastore querying tools
> - Seamless integration with MTT
> - No need for DBA services
> - There are some simple report scripts to query data and generate Excel files
> - You can define custom dynamic DB fields and associate them with your data
> - You can define security policies/permissions for your data
>
> Cons:
> - No UI (the default MTT UI works with the SQL backend only)
>
> Regards,
> Mike
>
> On Thu, Nov 4, 2010 at 11:08 PM, Quincey Koziol <koz...@hdfgroup.org> wrote:
> Hi Josh!
>
> On Nov 4, 2010, at 8:30 AM, Joshua Hursey wrote:
>
> > On Nov 3, 2010, at 9:10 PM, Jeff Squyres wrote:
> >
> >> Ethan / Josh --
> >>
> >> The HDF guys are interested in potentially using MTT.
> >
> > I just forwarded a message to the mtt-devel list about some work at IU to
> > use MTT to test the CIFTS FTB project, so maybe development between these
> > two efforts can be mutually beneficial.
> >
> >> They have some questions about the database. Can you guys take a whack at
> >> answering them? (Be sure to keep the CC, as Elena/Quincey aren't on the
> >> list.)
> >>
> >> On Nov 3, 2010, at 1:29 PM, Quincey Koziol wrote:
> >>
> >>> Lots of interest here about MTT -- thanks again for taking time to demo
> >>> it and talk to us!
> >>
> >> Glad to help.
> >>
> >>> One lasting concern was the slowness of the report queries -- what's
> >>> the controlling parameter there? Is it the number of tests, the size of
> >>> the output, the number of configurations of each test, etc.?
> >>
> >> All of the above. On a good night, Cisco dumps 250k test runs into the
> >> database. That's just a boatload of data. End result: the database is
> >> *HUGE*.
> >> Running queries just takes time.
> >>
> >> If the database wasn't so huge, the queries wouldn't take nearly as long.
> >> The size of the database is basically how much data you put into it, so
> >> it's really a function of everything you mentioned: increasing any one of
> >> those items increases the size of the database. Our database is *huge* --
> >> the DB guys tell me that it's lots and lots of little data (with blobs of
> >> stdout/stderr here and there) that makes it "huge" in SQL terms.
> >>
> >> Josh did some great work a few summers back that basically "fixed" the
> >> speed of the queries to a set speed by dividing all the data into
> >> month-long chunks in the database. The back-end of the web reporter only
> >> queries the relevant month chunks (I think this is a postgres-specific
> >> SQL feature).
> >>
> >> Additionally, we have the DB server on a fairly underpowered machine that
> >> is shared with a whole pile of other server duties (www.open-mpi.org,
> >> mailman, etc.). This also contributes to the slowness.
> >
> > Yeah, this pretty much sums it up. The current Open MPI MTT database is
> > 141 GB and contains data as far back as Nov. 2006. The MTT Reporter spends
> > some of its time just converting the raw database output into pretty HTML
> > (it is currently written in PHP). At the bottom of the MTT Reporter you
> > will see some stats on where the Reporter spent most of its time:
> >
> > How long the Reporter took in total to return the result:
> > Total script execution time: 24 second(s)
> > How long just the database query took:
> > Total SQL execution time: 19 second(s)
> >
> > We also generate an overall contribution graph, also linked at the bottom,
> > to give you a feeling for the amount of data coming in every
> > day/week/month.
> >
> > Jeff mentioned the partition tables work that I did a couple of summers
> > ago.
> > The partition tables help quite a lot by partitioning the data into
> > week-long chunks, so shorter date ranges are faster than longer ones:
> > they scan a smaller table, relative to all of the data, to perform a
> > query. The database interface that the MTT Reporter uses is abstracted
> > away from the partition tables; it is really just the DBA (I guess that
> > is me these days) who has to worry about their setup (which is usually
> > just a 5-minute task once a year). Most of the queries to MTT ask for
> > date ranges like 'past 24 hours' or 'past 3 days', so breaking up the
> > results by week saves some time.
> >
> > One thing to notice is that usually the first query through the MTT
> > Reporter is the slowest. After that first query, the MTT database
> > (postgresql in this case) is able to cache some of the query information,
> > which should make subsequent queries a little faster.
> >
> > But the performance is certainly not where I would like it, and there are
> > still a few ways to make it better. I think if we moved to a newer server
> > that is not as heavily shared, we would see a performance boost.
> > Certainly adding more RAM to the system, and potentially a faster disk
> > array, would improve performance. I think there are still a few things I
> > can do to the database schema to improve common queries; better
> > normalization of incoming data would certainly help. There are also
> > likely some places in the current MTT Reporter where performance might be
> > improved on the sorting/rendering side of things.
> >
> > The text blobs (variable-length string fields) for stderr/stdout should
> > not be contributing to the problem. Most recent databases (postgresql in
> > particular) can optimize these fields so that, with regard to the SQL
> > query, they perform the same as small fixed-length strings.
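[The week-long partitioning described above amounts to routing each date-range query to only the weekly chunks it overlaps. A minimal Python sketch of that routing idea -- the `test_run_YYYY_MM_DD` table-naming scheme is hypothetical, not MTT's actual schema:]

```python
from datetime import date, timedelta

def week_partitions(start, end, prefix="test_run"):
    """Return the (hypothetical) names of the week-long partition
    tables that a [start, end] date-range query would have to scan.
    Each partition is named after the Monday that begins its week,
    e.g. test_run_2010_10_25."""
    # Snap the start date back to the Monday of its week.
    monday = start - timedelta(days=start.weekday())
    names = []
    while monday <= end:
        names.append(f"{prefix}_{monday:%Y_%m_%d}")
        monday += timedelta(days=7)
    return names

# A 'past 3 days' query touches at most two weekly partitions,
# instead of the whole multi-year table:
print(week_partitions(date(2010, 10, 23), date(2010, 10, 25)))
# → ['test_run_2010_10_18', 'test_run_2010_10_25']
```

[In postgres of that era this routing happened inside the database via inheritance-based partitions with CHECK constraints and constraint exclusion, which is presumably the "postgres-specific SQL feature" Jeff mentions.]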
> >
> > So, in short, most of the slowness is due to: (1) the shared server
> > environment hosting a number of active projects, and (2) the volume of
> > existing data. There are some places to improve things, but we haven't
> > had the cycles yet to investigate them much.
>
> OK, that's all good to know. And it probably shouldn't affect us as much,
> since we'll be starting with a newer, less loaded machine and a lot less
> data.
>
> >>> For example, each HDF5 build includes on the order of 100 test
> >>> executables, and we run 50 or so configurations each night. How would
> >>> that compare with the OpenMPI test results database?
> >>
> >> Good question. I'm CC'ing the mtt-devel list to see if Josh or Ethan
> >> could comment on this more intelligently than me -- they did almost all
> >> of the database work, not me.
> >>
> >> I'm *guessing* that it won't come anywhere close to the size of the Open
> >> MPI database (we haven't trimmed the data in the OMPI database since we
> >> started gathering it several years ago).
> >
> > An interesting site that might give you a feeling for the volume and type
> > of data being submitted is the 'stats' page:
> > www.open-mpi.org/mtt/stats
> >
> > We don't publicly link to this page since it is not really useful for
> > anyone except MTT maintainers.
> >
> > I have a script that maintains stats on the database that we can use as a
> > metric. It is a special table in the database that is updated about every
> > night. It is a nice way to get insight into the distribution of testing
> > (for instance, about 90% of Open MPI testing is on Linux, 8% on Solaris,
> > and 1% on each of OS X and Cygwin).
> >
> > For example, on Oct.
> > 25, 2010 (put '2010-10-25 - 2010-10-25' in the Date Range) there were:
> >
> > 691 MPI Install variations
> > 658 Test Builds
> > 78,539 Test Run results
> > 437 Performance results
> >
> > Since MTT can tell whether there is a 'new' tarball to test, some
> > organizations (like Cisco) only run MTT when there is a new tarball,
> > while others (like IU) run every night even against an old tarball.
> >
> > So the current database holds about 186 million test records today. The
> > weekly contribution normally ranges from 0.5 to 1.25 million tests
> > submitted (the range depends on how many 'new' tarballs are created in
> > the week).
> >
> > Hopefully my comments help more than confuse. If it would be useful to
> > chat on the phone sometime, I'm sure we could set something up.
>
> That is very helpful, thanks. I guess Elena and I will have to discuss it
> a bit and then find a place for MTT testing on our priority list.
>
> Thanks!
> Quincey
>
> _______________________________________________
> mtt-devel mailing list
> mtt-de...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/mtt-devel

--
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/