> > - First off, I don't believe that the Jupiter Sensor is working properly.  I don't
> > seem to be sending any data and I don't see anything in the logs.  I believe I have
> > the latest version of the sensor and have all the settings configured correctly.
> > Has anyone successfully sent Review data to the server?
> 
> I struggled with this myself today and can't get any data sent to the server, even
> after installing last night's build on the public server and installing the
> corresponding sensors in Eclipse.  The problem appears to be on the client-side---there
> are no error messages on the public server console, for example.  Takuya, can you check
> this out?

Confirmed. I verified that the sensor downloaded from the public site does not work. It
seems that the Jupiter sensor is not instantiated in Eclipse, even though the Hackystat
sensor and Jupiter itself are instantiated successfully. In my virtual environment (i.e.,
launching the hacky-jupiter sensor from my Eclipse), however, they work. Let me
investigate what's wrong.

> > - Review Issues are product metrics much like FileMetrics. So, I'm wondering if it
> > would be easier to create an ant based sensor for Review Issues, instead of trying
> > to capture the Issues using Jupiter.  An ant based sensor makes more sense if it is
> > easier.
> 
> This actually seems harder to me than the current situation, in that we now have to
> run Ant to get the issue data sent to the server. I don't see any real cost to the
> current approach.

I agree with Philip. Even if an Ant-based sensor were easier to build than the Jupiter
one, we would still need the review issues themselves to be generated by a review tool
such as Jupiter. I think the current hacky-jupiter sensor mechanism is a convenient and
reasonable approach.

> > Actually, I'm also thinking that we could generate a report based on the Review
> > Issues.  I think this would be useful to be able to see all the outstanding issues
> > that have not been fixed.  In addition, this report could help the education of
> > developers, as they can refer to issues that are associated with code similar to what
> > they are writing.
> 
> I agree that this would be a very interesting analysis report to provide. For example,
> you could list the number of non-resolved issues by priority. There could also be a
> Reduction function so that you can get Telemetry regarding the total number of review
> issues that are open and how that changes over time.

I am not sure whether this report should be generated locally or on the server; it would
be nicer to implement it on the Hackystat server so that it can be shared with the
project group. However, it could also be implemented as an Ant task if the report is
needed locally.
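
Just to make the idea concrete, below is a minimal sketch in plain Java of the kind of
reduction such a report (or Telemetry stream) would need: count the unresolved issues
grouped by priority. This is an illustration only; the Issue class and its fields are
hypothetical stand-ins for the data in the ReviewIssue SDT, not the real Hackystat API.

  import java.util.*;

  /**
   * Sketch only -- not the real Hackystat API. Counts unresolved review issues
   * grouped by priority. The Issue class and its fields are hypothetical
   * stand-ins for the data carried by the ReviewIssue SDT.
   */
  public class OpenIssueReport {

    /** Hypothetical, simplified view of one ReviewIssue entry. */
    static class Issue {
      final String priority;    // e.g. "critical", "major", "normal"
      final String resolution;  // e.g. "unresolved", "resolved"
      Issue(String priority, String resolution) {
        this.priority = priority;
        this.resolution = resolution;
      }
      boolean isOpen() {
        // Assumption: anything not explicitly marked resolved counts as open.
        return !"resolved".equalsIgnoreCase(resolution);
      }
    }

    /** Returns a map from priority to the number of open issues. */
    static Map<String, Integer> openIssuesByPriority(List<Issue> issues) {
      Map<String, Integer> counts = new TreeMap<String, Integer>();
      for (Issue issue : issues) {
        if (issue.isOpen()) {
          Integer old = counts.get(issue.priority);
          counts.put(issue.priority, (old == null) ? 1 : old + 1);
        }
      }
      return counts;
    }

    public static void main(String[] args) {
      List<Issue> issues = Arrays.asList(
          new Issue("major", "unresolved"),
          new Issue("normal", "resolved"),
          new Issue("critical", "unresolved"));
      System.out.println(openIssuesByPriority(issues));  // prints {critical=1, major=1}
    }
  }

A Telemetry reduction would presumably do the same kind of counting per day or per chunk,
so that the totals can be plotted over time.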

> > - ReviewId in the ReviewIssue SDT - It seems odd that the Review Id is so random.
> > Wouldn't it be better to use <ReviewId>-<ReviewerId>-<created timestamp> or something
> > like that?  If the Review Id is truly random at some point you will have a duplicate
> > Id depending how good your random generator is.  But, I guess in the end it really
> > doesn't matter.
> 
> A more nicely structured reviewID would seem to be better.

I am not sure how best to deal with this. Should Jupiter check for a duplicate name when
a new review ID is created? Even if it does, it would be hard for Jupiter to check for
duplicate names across projects (open, closed, or not yet imported).
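
For illustration, the naming scheme that Aaron suggests is easy to produce; something
like the sketch below (hypothetical code and names, not what Jupiter actually does):

  import java.text.SimpleDateFormat;
  import java.util.Date;

  /**
   * Sketch of Aaron's <ReviewId>-<ReviewerId>-<created timestamp> suggestion.
   * Illustration only; this is not Jupiter's actual ID generator, and the
   * class, method, and example names are made up.
   */
  public class StructuredReviewId {

    /** Builds an ID such as "CodeReview12-takuyay-20040922153047". */
    static String createReviewId(String reviewName, String reviewerId, Date created) {
      String timestamp = new SimpleDateFormat("yyyyMMddHHmmss").format(created);
      return reviewName + "-" + reviewerId + "-" + timestamp;
    }

    public static void main(String[] args) {
      System.out.println(createReviewId("CodeReview12", "takuyay", new Date()));
    }
  }

With the reviewer and creation time embedded, a collision would require the same reviewer
to create two reviews with the same name within the same second, so Jupiter would not
have to scan other (closed or unimported) projects to keep the IDs unique.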

> > - reviewId in the ReviewAcitivty SDT
> > The information provided in the SDT help page says "reviewId - The unique reviewID
> > for the review entry, eg takuyay".  I believe you meant "reviewId - The unique
> > reviewID for the review entry, eg SelectionInterval1"
> 
> Yes.

Thank you. It's fixed.

> > - The long term goal of the Review Metrics are to be able to evaluate our review
> > process and the effectiveness of our review.  I think that we should be able to bring
> > up Hackystat before a review meeting and discuss the effectiveness of the review
> > based on the number and severity of review issues generated.  There are many
> > statistical information that can be derived.  For example, is one hour enough
> > preparation time? Or is it the most effective?
> 
> This is a GREAT idea.  I had never thought of using Hackystat as a way of checking to
> see, for example, whether or not it was even appropriate to do the review meeting. For
> example, you might decide to not do the review meeting if there are no critical or
> major issues uncovered during preparation. You might decide to delay the group meeting
> if preparation time was not sufficient.

This seems like an interesting idea. However, it also seems that we would need to answer
research questions like these first.

> is one hour enough preparation time? 

How do we determine whether one hour is enough or not? We might collect questionnaire
responses, taking the amount of review material (i.e., the number of class files) into
account, to judge whether a one-hour review feels like enough...

> you might decide to not do the review meeting if there are no critical or major 
> issues uncovered during preparation. You might decide to delay the group meeting if 
> preparation time was not sufficient.

This may or may not work well because, as you have seen, almost all reviewers use the
default severity (in our case, unset or normal), so we cannot tell whether any really
critical or major issues exist. Even if we force reviewers to choose a severity, it would
still be hard to decide whether the review meeting should be held, because the magnitude
of severity that one reviewer assigns may differ from what the other reviewers would
assign in the team phase.
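
That said, just to make Philip's rule concrete, it might look something like the sketch
below (hypothetical names, nothing that exists in Hackystat); with mostly default
severities, the critical/major counts it relies on would of course be unreliable.

  import java.util.Arrays;
  import java.util.List;

  /**
   * Sketch only: Philip's go/no-go rule for the review meeting, expressed as
   * code. All names are hypothetical; the rule is only as good as the severity
   * and preparation-time data that reviewers actually record.
   */
  public class MeetingDecision {

    /** Hypothetical summary of one reviewer's preparation phase. */
    static class Preparation {
      final int criticalIssues;
      final int majorIssues;
      final double prepHours;
      Preparation(int criticalIssues, int majorIssues, double prepHours) {
        this.criticalIssues = criticalIssues;
        this.majorIssues = majorIssues;
        this.prepHours = prepHours;
      }
    }

    /** Decides on the meeting; requiredPrepHours is an assumed team parameter, e.g. 1.0. */
    static String decide(List<Preparation> preps, double requiredPrepHours) {
      int severeIssues = 0;
      for (Preparation p : preps) {
        if (p.prepHours < requiredPrepHours) {
          return "DELAY_MEETING";   // preparation time was not sufficient
        }
        severeIssues += p.criticalIssues + p.majorIssues;
      }
      if (severeIssues == 0) {
        return "SKIP_MEETING";      // no critical or major issues found in preparation
      }
      return "HOLD_MEETING";
    }

    public static void main(String[] args) {
      List<Preparation> preps = Arrays.asList(
          new Preparation(0, 2, 1.5), new Preparation(1, 0, 1.0));
      System.out.println(decide(preps, 1.0));  // prints HOLD_MEETING
    }
  }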

> > - It would also be interesting to test the age old belief that reviews decrease the
> > number of defects.  We can do this easily if we can associate a defect to a
> > particular class and checking if that class has been reviewed.
> 
> I don't think this is easy at all to do.  There are all sorts of conflicting
> independent variables, including the complexity of the code and the skill of the
> author. You'll need a very large sample size to factor this stuff out.

I agree with Philip. Besides the complexity of the code and the skill of the author, a
reduction in defects might also be caused by other factors, such as the test cases and so
forth.

> > - Another major goal is to catch defects early in the development process.  However,
> > I would claim that for our situation we don't really adopt that trend.  Projects that
> > follow that trend tend to be larger systems where testing is expensive.  I believe
> > that most of the defects are caught in our Unit Test cases (of course that is if we
> > have good test cases).  Our review process seems more like confirmation and I would
> > claim that we would have less critical defects (defects that cause the program to
> > function incorrectly) than typical software projects.  I would also claim that when
> > we review code in CSDL, we need to pay more attention to the Unit Tests; are we
> > testing the program correctly, effectively, and thoroughly?
> 
> More good ideas here.  For example, if we are paying attention to Unit Tests, then it
> would be reasonable to expect that coverage would go up after the rework following a
> review.  Coverage certainly shouldn't go _down_ after the rework following a review!
> We might also expect that test case failures related to the module under review should
> go down after the rework.  All of these are hypotheses that could be empirically
> tested.

To examine the relationship between the number of unit tests and reviews, we might need
to redefine our review process and defect categories more precisely. If we conduct
reviews loosely, checking tests and coverage during a review may increase somewhat for a
while, but it will not become a long-term trend. If we define the review process more
precisely (e.g., we are expected to check test cases and coverage, and to run both at
every review), we might see the relationship that reviews increase coverage, the number
of test cases, and so forth. Since our review objective is spreading knowledge rather
than finding defects, it might be difficult to make that claim.
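
If we do go down that path, the hypotheses themselves are at least simple to state; a
trivial sketch (hypothetical, not a Hackystat analysis) for one module's before/after
numbers:

  /**
   * Sketch only: Philip's two empirically testable hypotheses for a single
   * module, given coverage and test-failure numbers measured before the review
   * and after the rework that follows it.
   */
  public class ReviewReworkHypotheses {

    /** Coverage should not go down after the rework following a review. */
    static boolean coverageDidNotDrop(double coverageBefore, double coverageAfter) {
      return coverageAfter >= coverageBefore;
    }

    /** Test case failures related to the module should go down after the rework. */
    static boolean failuresWentDown(int failuresBefore, int failuresAfter) {
      return failuresAfter < failuresBefore;
    }

    public static void main(String[] args) {
      System.out.println(coverageDidNotDrop(72.5, 78.0));  // prints true
      System.out.println(failuresWentDown(3, 1));          // prints true
    }
  }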

Cheers,

Takuya

On Wed, 22 Sep 2004 15:58:47 -1000
Philip Johnson <[EMAIL PROTECTED]> wrote:

> Greetings, all,
> 
> Aaron sent Takuya and I some interesting thoughts based upon the latest Hackystat
> review that I would like to hear other people's thoughts on.  I also provide my
> feedback below:
> 
> --On Wednesday, September 22, 2004 2:29 AM -1000 Aaron Kagawa <[EMAIL PROTECTED]> 
> wrote:
> 
> > Hi All,
> >
> > A couple comments about Review using Jupiter:
> >
> > - First off, I don't believe that the Jupiter Sensor is working properly.  I don't
> > seem to be sending any data and I don't see anything in the logs.  I believe I have
> > the latest version of the sensor and have all the settings configured correctly.
> > Has anyone successfully sent Review data to the server?
> 
> I struggled with this myself today and can't get any data sent to the server, even
> after installing last night's build on the public server and installing the
> corresponding sensors in Eclipse.  The problem appears to be on the client-side---there
> are no error messages on the public server console, for example.  Takuya, can you check
> this out?
> 
> > - Review Issues are product metrics much like FileMetrics. So, I'm wondering if it
> > would be easier to create an ant based sensor for Review Issues, instead of trying
> > to capture the Issues using Jupiter.  An ant based sensor makes more sense if it is
> > easier.
> 
> This actually seems harder to me than the current situation, in that we now have to
> run Ant to get the issue data sent to the server. I don't see any real cost to the
> current approach.
> 
> > Actually, I'm also thinking that we could generate a report based on the Review
> > Issues.  I think this would be useful to be able to see all the outstanding issues
> > that have not been fixed.  In addition, this report could help the education of
> > developers, as they can refer to issues that are associated with code similar to what
> > they are writing.
> 
> I agree that this would be a very interesting analysis report to provide. For example,
> you could list the number of non-resolved issues by priority. There could also be a
> Reduction function so that you can get Telemetry regarding the total number of review
> issues that are open and how that changes over time.
> 
> > - ReviewId in the ReviewIssue SDT - It seems odd that the Review Id is so random.
> > Wouldn't it be better to use <ReviewId>-<ReviewerId>-<created timestamp> or something
> > like that?  If the Review Id is truly random at some point you will have a duplicate
> > Id depending how good your random generator is.  But, I guess in the end it really
> > doesn't matter.
> 
> A more nicely structured reviewID would seem to be better.
> 
> > - reviewId in the ReviewAcitivty SDT
> > The information provided in the SDT help page says "reviewId - The unique reviewID
> > for the review entry, eg takuyay".  I believe you meant "reviewId - The unique
> > reviewID for the review entry, eg SelectionInterval1"
> 
> Yes.
> 
> > - Consider doing a usability study within your thesis.  Some readers would probably
> > doubt your findings if the Jupiter Interface hasn't been proven to conduct reviews
> > efficiently.
> 
> I agree.
> 
> > Goals of collecting review data (this is just what I'm thinking)
> >
> > - The long term goal of the Review Metrics are to be able to evaluate our review
> > process and the effectiveness of our review.  I think that we should be able to bring
> > up Hackystat before a review meeting and discuss the effectiveness of the review
> > based on the number and severity of review issues generated.  There are many
> > statistical information that can be derived.  For example, is one hour enough
> > preparation time? Or is it the most effective?
> 
> This is a GREAT idea.  I had never thought of using Hackystat as a way of checking to
> see, for example, whether or not it was even appropriate to do the review meeting. For
> example, you might decide to not do the review meeting if there are no critical or
> major issues uncovered during preparation. You might decide to delay the group meeting
> if preparation time was not sufficient.
> 
> > - It would also be interesting to test the age old belief that reviews decrease the
> > number of defects.  We can do this easily if we can associate a defect to a
> > particular class and checking if that class has been reviewed.
> 
> I don't think this is easy at all to do.  There are all sorts of conflicting
> independent variables, including the complexity of the code and the skill of the
> author. You'll need a very large sample size to factor this stuff out.
> 
> > - One of the major goals of conducting reviews is to spread knowledge.  I would claim
> > that is rather important of our review process.  How can we evaluate that? And, how
> > do we build an infrastructure or process that helps promote spreading knowledge?
> 
> Excellent thoughts!  Anyone have any ideas?
> 
> > - Another major goal is to catch defects early in the development process.  However,
> > I would claim that for our situation we don't really adopt that trend.  Projects that
> > follow that trend tend to be larger systems where testing is expensive.  I believe
> > that most of the defects are caught in our Unit Test cases (of course that is if we
> > have good test cases).  Our review process seems more like confirmation and I would
> > claim that we would have less critical defects (defects that cause the program to
> > function incorrectly) than typical software projects.  I would also claim that when
> > we review code in CSDL, we need to pay more attention to the Unit Tests; are we
> > testing the program correctly, effectively, and thoroughly?
> 
> More good ideas here.  For example, if we are paying attention to Unit Tests, then it
> would be reasonable to expect that coverage would go up after the rework following a
> review.  Coverage certainly shouldn't go _down_ after the rework following a review!
> We might also expect that test case failures related to the module under review should
> go down after the rework.  All of these are hypotheses that could be empirically
> tested.
> 
> > Jupiter is working great.. Now the focus shifts on how to use the data that we get
> > from it to understand process, productivity and quality of reviews and how it affects
> > the overall process, productivity, and quality of the software product.
> 
> I couldn't have put it better.
> 
> > thanks, aaron
> 
> Thank you!
> 
> Cheers,
> Philip
> 
> 



================================
Takuya Yamashita
E-mail: [EMAIL PROTECTED]
================================
