Re: [MTT devel] GSOC application

2009-04-15 Thread Josh Hursey
I have been listening in on the thread, but have not had time to  
really look at much (which is why I have not been replying). I'm  
interested in listening in on the teleconf as well, though if I become  
a blocker for finding a time feel free to cut me out.


Best,
Josh

On Apr 14, 2009, at 8:51 PM, Jeff Squyres wrote:

BTW -- if it's useful to have a teleconference about this kind of  
stuff, I can host a WebEx meeting.  WebEx has local dialins around  
the world, including Israel...



sure, what about next week?


I have a Doodle account -- let's try that to do the scheduling:

   http://doodle.com/gzpgaun2ef4szt29

Ethan, Josh, and I are all in US Eastern timezone (I don't know if  
Josh will participate), so that might make scheduling *slightly*  
easier.  I started timeslots at 8am US Eastern and stopped at 2pm US
Eastern -- that's already pretty late in Israel.  I also didn't list  
Friday, since that's the weekend in Israel.




Re: [MTT devel] GSOC application

2009-04-15 Thread Mike Dubman
On Wed, Apr 15, 2009 at 8:50 PM, Jeff Squyres  wrote:

> On Apr 15, 2009, at 1:45 PM, Mike Dubman wrote:
>
>  yep. correct. We can define only the static attributes (which we know for sure
>> should be present in every object of a given type) and leave the phase-specific
>> attributes dynamic.
>>
>> Hmm.  I would think that even in each phase, we have a bunch of fields
>> that we *know* we want to have, right?
>>
>> correct, in gds terms these are called static attributes.
>>
>
> I was more nit-picking your statement that we would only have a few
> fields that would be available for every phase, and then use dynamic fields
> for all phase-specific data.  While GDS *can* handle that, wouldn't it be
> better to have a model for each phase (similar to your mockup) that expects
> a specific set of data for each phase?  Extra data on top of that would be a
> bonus, but wouldn't be necessary.  More specifically: we *know* what data
> should be available in each phase, so why not tell GDS about it in the model
> (rather than using dynamic fields that we know will always be there)?
>
> Perhaps we're just getting confused by language and I should wait for your
> next mock-up to see what you guys do... :-)
>

completely agree, the model for every phase object should contain mostly
static fields, based on the current mtt phase info.
Also, we will have the flexibility to expand phase objects without changing the
model.
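Something like this, for example (just a minimal sketch assuming the App Engine
Python datastore API; the class and field names here are made up for
illustration, not the actual models.py):

  # Hypothetical per-phase model: the fields we *know* every Test Build result
  # has are declared statically; anything phase-specific can still be attached
  # at runtime because the class derives from db.Expando.
  from google.appengine.ext import db

  class MttTestBuildResult(db.Expando):
      mpi_name      = db.StringProperty(required=True)
      mpi_version   = db.StringProperty(required=True)
      suite_name    = db.StringProperty()
      compiler_name = db.StringProperty()
      exit_status   = db.IntegerProperty()
      start_time    = db.DateTimeProperty()
      result_stdout = db.TextProperty()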


Re: [MTT devel] GSOC application

2009-04-15 Thread Jeff Squyres

On Apr 15, 2009, at 1:45 PM, Mike Dubman wrote:

yep. correct. We can define only the static attributes (which we know
for sure should be present in every object of a given type) and leave
phase-specific attributes dynamic.


Hmm.  I would think that even in each phase, we have a bunch of  
fields that we *know* we want to have, right?


correct, in gds terms these are called static attributes.


I was more nit-picking your statement that we would only have a few
fields that would be available for every phase, and then use dynamic
fields for all phase-specific data.  While GDS *can* handle that,  
wouldn't it be better to have a model for each phase (similar to your  
mockup) that expects a specific set of data for each phase?  Extra  
data on top of that would be a bonus, but wouldn't be necessary.  More  
specifically: we *know* what data should be available in each phase,  
so why not tell GDS about it in the model (rather than using dynamic  
fields that we know will always be there)?


Perhaps we're just getting confused by language and I should wait for  
your next mock-up to see what you guys do... :-)



I have a Doodle account -- let's try that to do the scheduling:

  http://doodle.com/gzpgaun2ef4szt29

aha, I tried it and here is what I got:



Ahh -- looks like they're offline for the next ~2 hours (3pm US  
Pacific / 21:00 CET).  Well, we can't complain -- it's a free  
service.  :-)


--
Jeff Squyres
Cisco Systems



Re: [MTT devel] GSOC application

2009-04-15 Thread Mike Dubman
On Wed, Apr 15, 2009 at 5:23 PM, Jeff Squyres  wrote:

> On Apr 15, 2009, at 9:14 AM, Mike Dubman wrote:
>
>  Hmm.  Ok, so you're saying that we define a "phase object" (for each
>> phase) with all the fields that we expect to have, but if we need to, we can
>> create fields on the fly, and google will just "do the right thing" and
>> associate *all* the data (the "expected" fields and the "dynamic" fields)
>> together?
>>
>> yep. correct. We can define only the static attributes (which we know for sure
>> should be present in every object of a given type) and leave the phase-specific
>> attributes dynamic.
>>
>
> Hmm.  I would think that even in each phase, we have a bunch of fields that
> we *know* we want to have, right?


correct, in gds terms these are called static attributes.


>
>
>  I have a Doodle account -- let's try that to do the scheduling:
>>
>>   http://doodle.com/gzpgaun2ef4szt29
>>
>> Ethan, Josh, and I are all in US Eastern timezone (I don't know if Josh
>> will participate), so that might make scheduling *slightly* easier.  I
>> started timeslots at 8am US Eastern and stopped at 2pm US Eastern -- that's
>> already pretty late in Israel.  I also didn't list Friday, since that's the
>> weekend in Israel.
>>
>> can we do it in your morning? (our afternoon) :)
>>
>
>
> Visit the Doodle URL (above) and you'll see.  :-)


aha, I tried it and here is what I got:

Wir sind bald zurück ("We'll be back soon")
I'm tempted to agree with it :)





>
>
> --
> Jeff Squyres
> Cisco Systems
>
>


Re: [MTT devel] GSOC application

2009-04-15 Thread Jeff Squyres

On Apr 15, 2009, at 9:14 AM, Mike Dubman wrote:

Hmm.  Ok, so you're saying that we define a "phase object" (for each  
phase) with all the fields that we expect to have, but if we need  
to, we can create fields on the fly, and google will just "do the  
right thing" and associate *all* the data (the "expected" fields and  
the "dynamic" fields) together?


yep. correct. We can define only the static attributes (which we know
for sure should be present in every object of a given type) and leave
phase-specific attributes dynamic.


Hmm.  I would think that even in each phase, we have a bunch of fields  
that we *know* we want to have, right?



I have a Doodle account -- let's try that to do the scheduling:

   http://doodle.com/gzpgaun2ef4szt29

Ethan, Josh, and I are all in US Eastern timezone (I don't know if  
Josh will participate), so that might make scheduling *slightly*  
easier.  I started timeslots at 8am US Eastern and stopped at 2pm US
Eastern -- that's already pretty late in Israel.  I also didn't list  
Friday, since that's the weekend in Israel.


can we do it in your morning? (our afternoon) :)



Visit the Doodle URL (above) and you'll see.  :-)

--
Jeff Squyres
Cisco Systems



Re: [MTT devel] GSOC application

2009-04-15 Thread Mike Dubman
On Wed, Apr 15, 2009 at 3:51 AM, Jeff Squyres  wrote:

> On Apr 14, 2009, at 2:27 PM, Mike Dubman wrote:
>
> Ah, good point (python/java not perl).  But I think that
>> lib/MTT/Reporter/GoogleDataStore.pm could still be a good thing -- we have
>> invested a lot of time/effort into getting our particular mtt clients set up
>> just the way we want them, setting up INI files, submitting to batch
>> schedulers, etc.
>>
>> A GoogleDataStore.pm reporter could well fork/exec a python/java
>> executable to do the actual communication/storing of the data, right...?
>>  More below.
>>
>> completely agree, once we have external python/java/cobol scripts to
>> manipulate GDS objects, we should wrap them in perl and call them from MTT
>> in the same way it works today for submitting to Postgres.
>>
>
> So say we all!  :-)
>
> (did they show Battlestar Galactica in Israel?  :-) )
>
> sounds good, we should introduce a guid (like a pid) for each mtt session,
>> where all mtt results generated by that session will refer to this
>> guid.  Later we use this guid to submit partial results as they become ready
>> and connect them to the appropriate mtt session object (see models.py)
>>
>
> I *believe* we have 2 values like this in the MTT client already:
>
> - an ID that represents a single MTT client run
> - an ID that represents a single MTT mpi install->test build->test run tree
>
>
> I think what Ethan was asking was: can't MTT run Fluent and then use the
>> normal Reporter mechanism to report the results into whatever back-end data
>> store we have?  (postgres or GDS)
>>
>> ahhh, okay, I see.
>>
>> Correct me if I'm wrong, the current mtt implementation allows the following
>> way of executing an mpi test:
>> /path/to/mpirun  
>>
>
> Yes and no; it's controlled by the mpi details section, right?  You can put
> whatever you want in there.
>
>> Many MPI-based applications have an embedded MPI library and a non-standard
>> way to start it; one has to set an env variable to point to the desired mpi
>> installation or pass it as a cmd line argument, for example:
>>
>> for fluent:
>>
>> export OPENMPI_ROOT=/path/to/openmpi
>> fluent 
>>
>>
>> for pamcrash:
>> pamworld -np 2 -mpidir=/path/to/openmpi/dir 
>>
>> I'm not sure if it is possible to express that execution semantic in an mtt
>> ini file. Please suggest.
>> So far, it seems that such executions can be handled externally from mtt
>> but using the same object model.
>>
>
> Understood.  I think you *could* get MTT to run these with specialized mpi
> details sections.  But it may or may not be worth it.
>
> For the attachment...
>>
>> I can "sorta read" python, but I'm not familiar with its intricacies and
>> its internal APIs.
>>
>> - models.py: looks good.  I don't know if *all* the fields we have are
>> listed here; it looks fairly short to me.  Did you attempt to include all of
>> the fields we submit through the various phases in Reporter, or
>> did you intentionally leave some out?  (I honestly haven't checked; it just
>> "feels short" to me compared to our SQL schema).
>>
>> I listed only some of the fields in every object representing a specific
>> test result source (called a phase in mtt language).
>>
>
> Ok.  So that's only a sample -- just showing an example, not necessarily
> trying to be complete.  Per Ethan's comments, there are a bunch of other
> fields that we have and/or we might just be able to "tie them together" in
> GDS.  I.e., our data is hierarchical -- it worked well enough in SQL because
> you could just have one record about a test build refer to another record
> about the corresponding mpi install.  And so on.  Can we do something
> similar in GDS?
>


yep, actually in GDS it should be much easier to have a hierarchy, because it
is OO storage. We just need to map all the object relations and put them in
models.py - gds will do the rest :)
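For example (extending the hypothetical model classes sketched above; only the
relation-bearing fields are shown and the names are illustrative, not the
actual models.py):

  from google.appengine.ext import db

  class MttSession(db.Expando):
      # one object per MTT client run -- the "guid" that partial submits refer to
      hostname   = db.StringProperty()
      start_time = db.DateTimeProperty(auto_now_add=True)

  class MttMpiInstallResult(db.Expando):
      session     = db.ReferenceProperty(MttSession, collection_name='mpi_installs')
      mpi_name    = db.StringProperty()
      mpi_version = db.StringProperty()

  class MttTestBuildResult(db.Expando):
      # same parent/child relation the SQL schema expresses with a foreign key
      mpi_install = db.ReferenceProperty(MttMpiInstallResult, collection_name='test_builds')
      suite_name  = db.StringProperty()

  class MttTestRunResult(db.Expando):
      test_build  = db.ReferenceProperty(MttTestBuildResult, collection_name='test_runs')
      test_name   = db.StringProperty()
      exit_status = db.IntegerProperty()

With that, run.test_build.mpi_install.session walks up the tree, and the
collection_name back-references (e.g. session.mpi_installs) walk back down.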




>
>
> This is because every test result source object is derived from the
>> db.Expando class provided by the Python API. This gives us great flexibility,
>> like adding dynamic attributes to every object, for example:
>>
>> obj = MttBuildPhaseResult()
>> obj.my_favorite_dynamic_key = "hello"
>> obj.my_another_dynamic_key = 7
>>
>> So, we can have all phase attributes in the phase object without defining
>> them in the *sql schema way*. Also we can query the object model by these
>> dynamic keys.
>>
>
> Hmm.  Ok, so you're saying that we define a "phase object" (for each phase)
> with all the fields that we expect to have, but if we need to, we can create
> fields on the fly, and google will just "do the right thing" and associate
> *all* the data (the "expected" fields and the "dynamic" fields) together?


yep. correct. We can define only the static attributes (which we know for sure
should be present in every object of a given type) and leave phase-specific
attributes dynamic.
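Purely as an illustration of that split (reusing the hypothetical
MttTestRunResult class sketched above; resource_manager is an invented example
of a dynamic, phase-specific key):

  # Static, declared fields are set like any model property...
  run = MttTestRunResult(test_name='ring_c', exit_status=0)
  # ...while phase-specific extras become dynamic Expando attributes:
  run.resource_manager = 'slurm'
  run.put()

  # Dynamic attributes are indexed too, so they can be filtered on later:
  slurm_runs = MttTestRunResult.all().filter('resource_manager =', 'slurm').fetch(100)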


>
>
> --> meta question: is it in the zen of GDS to not have too many index
>> fields like you would in SQL?  I.e., if you want to do an operation on GDS
>> that you
>>
>

Re: [MTT devel] GSOC application

2009-04-15 Thread Mike Dubman
On Tue, Apr 14, 2009 at 11:50 PM, Ethan Mallove wrote:

>  On Tue, Apr/14/2009 09:27:14PM, Mike Dubman wrote:
> >On Tue, Apr 14, 2009 at 5:04 PM, Jeff Squyres wrote:
> >
> >  On Apr 13, 2009, at 2:08 PM, Mike Dubman wrote:
> >
> >Hello Ethan,
> >
> >  Sorry for joining the discussion late... I was on travel last week and
> >  that always makes me waaay behind on my INBOX.  :-(
> >
> >On Mon, Apr 13, 2009 at 5:44 PM, Ethan Mallove <ethan.mall...@sun.com> wrote:
> >
> >Will this translate to something like lib/MTT/Reporter/GoogleDatabase.pm?
> >If we are to move away from the current MTT Postgres database, we want to
> >be able to submit results to both the current MTT database and the new
> >Google database during the transition period. Having a GoogleDatabase.pm
> >would make this easier.
> >
> >I think we should keep both storage options: the current Postgres and the
> >datastore. The mtt changes will be minor to support the datastore.
> >Due to the fact that the google appengine API (as well as the datastore
> >API) can be python or java only, we will create external scripts to
> >manipulate datastore objects:
> >
> >  Ah, good point (python/java not perl).  But I think that
> >  lib/MTT/Reporter/GoogleDataStore.pm could still be a good thing -- we
> >  have invested a lot of time/effort into getting our particular mtt
> >  clients set up just the way we want them, setting up INI files,
> >  submitting to batch schedulers, etc.
> >
> >  A GoogleDataStore.pm reporter could well fork/exec a python/java
> >  executable to do the actual communication/storing of the data, right...?
> >  More below.
> >
> >completely agree, once we have external python/java/cobol scripts to
> >manipulate GDS objects, we should wrap them in perl and call them from MTT
> >in the same way it works today for submitting to Postgres.
> >
> >The mtt will dump test results in xml format. Then, we provide two
> >python (or java?) scripts:
> >
> >mtt-results-submit-to-datastore.py - script will be called at the end
> >of the mtt run and will read the xml files, create objects and save them
> >to the datastore
> >
> >  Could be pretty easy to have a Reporter/GDS.pm (I keep making that
> >  filename shorter, don't I? :-) ) that simply invokes the
> >  mtt-result-submit-to-datastore.py script on the xml that it dumped for
> >  that particular test.
> >
> >  Specifically: I do like having partial results submitted while my MTT
> >  tests are running.  Cisco's testing cycle is about 24 hours, but groups
> >  of tests are finishing all the time, so it's good to see those results
> >  without having to wait the full 24 hours before anything shows up.  I
> >  guess that's my only comment on the idea of having a script that
> >  traverses the MTT scratch to find / submit everything -- I'd prefer if
> >  we kept the same Reporter idea and used an underlying .py script to
> >  submit results as they become ready.
> >
> >  Is this do-able?
> >
> >sounds good, we should introduce a guid (like a pid) for each mtt session,
> >where all mtt results generated by that session will refer to this guid.
> >Later we use this guid to submit partial results as they become ready and
> >connect them to the appropriate mtt session object (see models.py)
> >
> >mtt-results-query.py - sample script to query the datastore and generate
> >some simple visual/tabular reports. It will serve as a tutorial for how to
> >access mtt data from scripts for reporting.
> >
> >Later, we add another script to replace the php web frontend. It will be
> >hosted on google appengine machines and will provide a web viewer for
> >mtt results. (same way like index.php does today)
> >
> >  Sounds good.
> >
> >>   b. mtt_save_to_db.py - script which will go over the mtt scratch dir,
> >>   find all xml files generated for every mtt phase, parse them and save
> >>   them to the datastore, preserving test result relations, i.e. all test
> >>   results will be grouped by mtt general info: mpi version, name, date,
> >>
> >>   c. same script can scan, parse and save from xml files generated by
> >>   wrapper scripts for non-mtt based executions (fluent, ..)
> >
> >I'm confused here.  Can't MTT be outfitted to report results of a
> >Fluent run?
> >
> >I think we can enhance mtt to be not only an mpi testing platform, but
> >also to serve as an mpi benchmarking platform. We can use the datastore to
> >keep mpi-based benchmarking results in the same manner as mtt does
> >for testing results. (no changes to