Re: [HACKYSTAT-DEV-L] SDT representation of build data, evolutionary SDTs

Hongbing Kou Sun, 06 Feb 2005 10:42:46 -0800

If the only problem is the failure type, we can use a compound failure
type in this situation. For your example it could be "checkstyle+unittest".
The error message can be compound too.
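
To make the idea concrete, here is a rough sketch of how a sensor might
collapse several failures into one compound type and one compound message.
The class and method names are made up for illustration and are not part of
Hackystat; the "+" separator follows the example above, and the "; "
separator for messages is only an assumption.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of Hackystat: flattens several failures
// into one compound failure type and one compound failure message.
final class CompoundFailure {

  /** One failure, as in Cedric's example: module, type, message. */
  static final class Failure {
    final String module;
    final String type;
    final String message;

    Failure(String module, String type, String message) {
      this.module = module;
      this.type = type;
      this.message = message;
    }
  }

  /** Joins the distinct failure types with "+", in first-seen order. */
  static String compoundType(List<Failure> failures) {
    Set<String> types = new LinkedHashSet<>();
    for (Failure f : failures) {
      types.add(f.type);
    }
    return String.join("+", types);
  }

  /** Joins "module: message" entries with "; " (separator is an assumption). */
  static String compoundMessage(List<Failure> failures) {
    List<String> parts = new ArrayList<>();
    for (Failure f : failures) {
      parts.add(f.module + ": " + f.message);
    }
    return String.join("; ", parts);
  }
}
```

One caveat: the compound message can no longer be split back reliably into
individual failures if a message itself contains the separator.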


Hope it helps.
Hongbing

----- Original Message -----
From: "(Cedric) Qin ZHANG" <[EMAIL PROTECTED]>
Date: Saturday, February 5, 2005 10:51 pm
Subject: Re: [HACKYSTAT-DEV-L] SDT representation of build data,
evolutionary SDTs

> It's not that I want to invent something newer or cooler; rather, I have
> a problem persisting all the required information in a flat scheme.
>
> The major problem I encountered is that each build can have more than
> one failure, and each failure might have a different failure type, a
> different failure message, and be associated with a different module.
> An example would be:
>  * hackyKernel/file1.java checkstyle failure: line longer than 100.
>  * hackyKernel/file2.java checkstyle failure: missing javadoc.
>  * hackyKernel/file3.java junit failure: assertion failed.
>  * hackyStdExt/file4.java junit failure: security violation.
>
> The information I need to record in the above example is:
>
> <BuildReport>
>   <BuildContext StartTimeMillis="1107607869107"
>                 EndTimeMillis="1107609708935"
>                 Project="hacky2004-all"
>                 Configuration="Hackystat-JPL"
>                 StartType="CruiseControl-Auto" />
>   <BuildResult BuildFailed="true" CheckstyleRunned="true"
>                CompilationRunned="true" UnittestRunned="true">
>      <Failure ModuleName="hackyKernel" FailureType="Checkstyle"
>               Message="file 1 line longer than 100"/>
>      <Failure ModuleName="hackyKernel" FailureType="Checkstyle"
>               Message="file 2 missing javadoc"/>
>      <Failure ModuleName="hackyKernel" FailureType="JUnit"
>               Message="file 3 assertion failed"/>
>      <Failure ModuleName="hackyStdExt" FailureType="JUnit"
>               Message="file 4 security violation"/>
>   </BuildResult>
> </BuildReport>
>
> It's really hard to figure out what should go into the "failure type" and
> "failure message" fields, and which key-value pairs should stay in the
> "additional information" field. I thought the persistence details were
> hidden by the SDT, so we are safe as long as it knows how to handle them.
>
> I want to use the existing flat persistence scheme (much simpler), but I
> just don't know how. Can somebody help me?
>
> Thanks.
>
> Cedric
>
> Philip Johnson wrote:
> > Hi Cedric,
> >
> > When I logged in to the server as hackystat-l to see if build sensor data
> > on last night's failure was sent to the server, I noticed that you've gone
> > in a direction with respect to your sensor data representation that is an
> > important topic for discussion.
> >
> > In a nutshell, what appears to have happened is that you've bailed on the
> > SDT representation of Build.  In other words, there are no values being
> > provided for the required fields "result", "failureType", or
> > "failureMessage".  Instead, all of the information is being provided as an
> > XML string in the "data" field.  This is a violation of the specification,
> > which states that the Build SDT "data" field is supposed to consist of
> > key-value pairs, not XML:
> >
> > <http://hackystat.ics.hawaii.edu/hackystat/docbook/apbs04.html>
> >
> > There are several problems with the direction you're heading.  The reason
> > for having SDTs in the first place is to define a consistent structure for
> > the data so that anyone analyzing the data (either inside of hackystat or
> > as an external tool) knows what to expect and where to get it.  Your
> > approach defeats this.  Instead of a well-defined structure, there is
> > an arbitrary XML string without any constraints or semantics
> > attached to it.  (There are semantics attached to the Build SDT--it's the
> > link above. There are also constraints, but you've worked around them :-).
> >
> > You could add some syntax constraints to your approach by providing the
> > DTD, but that's essentially implementing a brand new, parallel structure
> > for sensor data to the current one.  That seems redundant, complicated, and
> > confusing.
> >
> > It's also important to note that XML is nothing more than a hierarchical
> > set of key-value strings with a commonly accepted syntactic sugar.  The SDT
> > is nothing more than a two-level hierarchical set of key-value strings.
> > In other words, they are representationally extremely close to each other.
> >
> > In looking at your current representation, you don't appear to have
> > any deep hierarchies, so the question is, what motivated you to bail on the
> > current SDT specification?
> >
> > I am going to conjecture that there are two shortcomings of the current
> > SDT implementation that you're trying to work around:
> >
> > (1) It's quite hard to evolve an SDT.  If you change an SDT's structure,
> > all of the current data becomes unusable, and you have to start from
> > scratch.
> >
> > (2) Putting/retrieving data from a "data" field is not trivial.  You need to
> > basically hand-write the code to parse the key-value pair string, generate
> > the HashMap, put the HashMap back into its serialized form, etc.
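
For readers unfamiliar with point (2), the hand-written code in question
looks roughly like the sketch below. It is only an illustration: the class
name is made up, and the encoding ("key=value" pairs separated by commas,
with no escaping) is an assumption, since the thread does not spell out the
actual "data" field syntax.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: the real Hackystat "data" field encoding may differ.
// Assumes "key=value" pairs separated by ',' with no escaping of ',' or '='.
final class KeyValueData {

  /** Parses "k1=v1,k2=v2" into a map that preserves insertion order. */
  static Map<String, String> parse(String data) {
    Map<String, String> map = new LinkedHashMap<>();
    if (data == null || data.trim().isEmpty()) {
      return map;
    }
    for (String pair : data.split(",")) {
      int eq = pair.indexOf('=');
      if (eq < 0) {
        continue; // skip malformed entries that have no '='
      }
      map.put(pair.substring(0, eq).trim(), pair.substring(eq + 1).trim());
    }
    return map;
  }

  /** Puts the map back into its serialized "k1=v1,k2=v2" form. */
  static String serialize(Map<String, String> map) {
    StringBuilder sb = new StringBuilder();
    for (Map.Entry<String, String> e : map.entrySet()) {
      if (sb.length() > 0) {
        sb.append(',');
      }
      sb.append(e.getKey()).append('=').append(e.getValue());
    }
    return sb.toString();
  }
}
```

Every sensor or analysis that touches the "data" field needs some variant of
this parse, modify, serialize round trip.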
> >
> > I've known about these inadequacies for a couple of years now, but have
> > left them on the back burner.
> >
> > What I want to suggest is that your cure, while perhaps convenient for you
> > in the short term, is ultimately worse than the disease in the long term,
> > because the result is a representational mishmash that will be confusing
> > and irrational to new people trying to understand the data in the system.
> >
> > In the best of all possible worlds, the way SDTs are supposed to work is as
> > follows:
> >
> > - The defined fields are supposed to specify the "required" data that every
> > sensor for every tool should provide and generate in a comparable fashion.
> >
> > - The "data" field is supposed to support "optional" data that is tool or
> > context specific.
> >
> > The problem is that when a new SDT is under development, as with the Build
> > data, it is hard to always be exactly right at the beginning about what
> > should be required and what should be optional.  As Burt has been finding
> > out with the Issue SDT, it's painful to make a change: you essentially
> > have to delete all of the old data.
> >
> > So, what I'd like to propose is as follows:
> >
> > (1) Cedric backs out of this direction, reverting to the standard form for
> > representing sensor data.
> >
> > (2) I will start work on enhancements to the SDT implementation to better
> > support evolution.  The basic idea will be to allow people to add and
> > delete fields in the SDT definition without invalidating data already on
> > the server. It will also include a new, implicitly defined field (like
> > tstamp and tool) called something like "keyvaluepairs" that will provide an
> > API for easy setting/getting of optional values.  This will eliminate the
> > need for every SDT to have a "data" field.
> >
> > Let me know what you think about this.  Cedric, please let me know if there
> > are other issues that I should be considering in this discussion.
> >
> > Cheers,
> > Philip
>
