Firstly, sorry for taking so long with this...

I just uploaded installment 1, showing the 'normalize attribute values when set' approach [1]. The patch doesn't address configuring the behaviour, but it does demonstrate what I was talking about. I forgot to mention in the attachment comments on the ticket, but the patch is also only for the HTML taglib. I'll widen the scope to include the other taglibs once we agree which approach to use.

Putting the patch together, I tried to consider on a case-by-case (attribute-by-attribute) basis whether trim()'ing whitespace was appropriate (i.e. whether to treat whitespace-only and empty values equivalently). Although Nial mentioned having had reports of problems due to whitespace trimming, I only found a few cases where it seemed like it would matter.

Specifically, alt, title and longdesc (where some browsers or other user agents will behave differently if the attribute appears with an empty value vs. if it is missing) and value (where the user may very well want to distinguish emtpy-string and whitespace-only values) were the only attributes I ended up *not* trim()'ing.

These attributes do present a problem for this patch: since empty/whitespace values need to be preserved, the normalize-on-set approach falls down. Specifically, cases like <tag alt="${x}" altKey=${y} ...> will still fail even if 'y' is non-empty even if 'x' *is* emtpy. That may be an argument for the alternate approach discussed.

I'll put together a patch showing the other approach next, but comments on this one are welcome. In particular, I left some annotations (marked '[lh]' in the patch) that may be worth additional review.

L.

[1] http://issues.apache.org/bugzilla/show_bug.cgi?id=33064

Laurie Harper wrote:
Ted Husted wrote:

Basically, I'm suggesting making more changes now (though still very
minor ones, just pervasively) to (a) ensure the code adheres to the DRY
principle, (b) reduce the chance of the changes being incomplete
(through missing checks that would need to be updated with approach 1)
and (c) reduces the chance of future mistakes creeping in.


I don't understand how the second approach furthers (a) and (c).
(a) It sounds like we would be repeating the same checks in every
accessor rather "saying it once" in a shared method in a base class.

(c) With the second approach, a very common mistake might be to
neglect applying the conversion or to apply conversions
inconsistently.

As to (b), I'd think that the first approach does imply the same level
of code review you'd make with the second approach, even if we would
be changing fewer lines of code.

IMHO, the preferred approach would be the one that combined ease of
maintenance for the development team and ease of modifcation for
individual users. That sounds like the first approach to me, since it
would seem that we would centralize our expectations for the
attribute's characteristics: null or empty, trimmed or verbatim. We
would have one point of access to maintain, and individual users would
have one point of access to modify.


Yep, that's what I was going for, I perhaps didn't explain my thinking clearly though. What I was thinking was to make each setter read something like this:

  public void setFoo(String value) {
    this.value = TagUtils.getInstance().normalize(value);
  }

where normalize() would be something like

  public static String normalize(String value) {
    if (value == null || "".equals(value)) {
      return null;
    }
    return value;
  }

In cases where an attribute shouldn't be 'normalized' the setter can include a clear comment to that effect and not call the utility routine. Existing checks for whether attributes have a null value would remain unchanged.

The alternative I listed as approach 1 would be to leave the attribute setters alone and make the call to normalize() or equivalent everywhere each attribute value is checked, which is the bit I see as error-prone.

The assertion that this approach would lead to less code review was based on the fact that any existing checks for 'null' would remain unchanged and it therefore wouldn't be necessary to review the entire taglib codebase looking for checks that should have been updated in a patch but weren't.

Of course we still have to ensure that normalization makes sense on an attribute-by-attribute basis, so maybe it's a moot point. Also, calling normalize() from the setters may not work well if the switch to control this behaviour was specified as a tag attribute; I hadn't thought of that before...

Whether to return empty or null strings *is* an important issue. To
streamline our business logic, my own team has to convert the empty
strings to null sometimes. Though, in other cases, we do want an empty
string to pass through, since the property might correspond to a field
on the database that doesn't accept nulls, but does accept empty
strings. We could adjust our interface between the presentation and
application layers easily enough, but other teams might not be so
agile.


That's the crux of the issue with respect to whether this change will be acceptable, really; if not passing empty strings through anymore breaks things for people depending on that behaviour (and clearly it could) then there needs to be a means of transitioning, as you mentioned before.

If we centralize the logic, as Niall suggested, then it seems like it
would be easier to managage a common switch attribute, like
"nullOnEmpty". If we decentralize the logic, then it would seem like
we'd have to make that test over and over again, creating new
opportunities for oversight and inconsistency.

Having a switch attribute, as Laurie has mentioned several times,
seems important, since people will want to test how well the changed
behavior will play with other taglibs that extend or utilize the
Struts Taglib. If there is a problem, we will need to be able to
quickly isolate the empty-vs-null behavior change, to discover whether
it is the culprit.


Yep, absolutely on both counts. Whether the nullOnEmpty behaviour occurs at the point of injection or the point of inspection, it needs to be trivial to switch it off. That's why I put the 'normalize' method on TagUtils, so there's only one place to change the logic that everything calls.

Of course, it would be nice to see both patches, so that we have a
factual basis for comparison.


Heh, I should have seen that coming... I'll see what I can do ;-).
L.

-Ted.

On 9/17/05, Laurie Harper <[EMAIL PROTECTED]> wrote:

Ted Husted wrote:

In general, we work by lazy consensus. If no one squawks, we sally forth.


OK, before I sally ;-) I've been thinking a little more about this
change. There are two approaches that seem reasonable, one of which
minimized changes in the code and one of which minimizes the amount of
code that will have to be reviewed in addition to testing as well as the
potential for mistakes to creep in later. Guess which I'm leaning toward ;-)

The first approach would be what Niall already proposed: changing the
checks for 'null' in BaseHandlerTag and a few other places. As I said,
that has the advantage of minimizing the amount of code change.

The second approach would be to modify the attribute setters to convert
empty-String arguments to null right away. This has a few advantages:

 - it makes it very clear and explicit that empty strings are being
thrown away, and the decision of whether or not to ignore empty strings
can be made once per attribute, rather than everywhere that attribute
value is referenced

 - it ensures that no existing checks for null that should be changed
into checks for empty-or-null get missed; whereever there's an 'if (x ==
null) now, the code would remain unchanged

 - more importantly, it ensures no checks for null get added later that
should be checks for empty-or-null, particularly since the checking
semantics could so easily get out-of-sync if the same check is made in
more than one place

Basically, I'm suggesting making more changes now (though still very
minor ones, just pervasively) to (a) ensure the code adheres to the DRY
principle, (b) reduce the chance of the changes being incomplete
(through missing checks that would need to be updated with approach 1)
and (c) reduces the chance of future mistakes creeping in.

Am I over-selling? ;-) This is the approach I'll take unless I hear an
objection.

L.


---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

Reply via email to