Re: Incoming PRs related to 'tree' enhancements

Bear Giles Sun, 30 Mar 2025 11:17:45 -0700

P.S., another possibility is extending AutoCloseable - the implementation
can have endVisit(node) cache the last node seen so it's available to the
close() method.


There are two benefits to this. The first is that it eliminates the need
for a new method. AutoClosable requires 'close()' but that's familiar to
everyone. (I would still prefer an explicit method since it gives me more
flexibility.)

The second is that there are benefits to making AutoCloseable transitive,
in a sense. E.g., if the constructor takes a Writer then the class should
also implement AutoCloseable to it can be added to the list of resources in
a try-with-resources block. This makes it much clearer that the object
only makes sense within that try-with-resources block - it's a small thing
but those small things add up when you look at a lot of unfamiliar code.

As before close() would have an empty default implementation.

Bear

On Sun, Mar 30, 2025 at 11:25 AM Bear Giles <bgi...@coyotesong.com> wrote:

> Here's one proposal I mentioned. It would allow the *caller* to eliminate
> all calls to visit() and endVisit() while maintaining existing behavior.
> This may allow the callers to drop some now-unnecessary code as well.
>
> /**
>  * UpdatedependencyNodeVisitor
>  */
> public interface NewDependencyNodeVisitor extends Function<DependencyNode, 
> Boolean> {
>     default boolean visit(DependencyNode node) {
>         return true;
>     };
>
>     default boolean endVisit(DependencyNode node) {
>         return true;
>     };
>
>     /**
>      * {@inheritDoc}
>      */
>     @Override
>     default Boolean apply(DependencyNode node) {
>         if (!visit(node)) {
>             return Boolean.FALSE;
>         }
>         for (DependencyNode child : node.getChildren()) {
>             if (!apply(child)) {
>                 return Boolean.FALSE;
>             }
>         }
>         return endVisit(node);
>     };
> }
>
>
> Bear
>
> On Sun, Mar 30, 2025 at 11:04 AM Bear Giles <bgi...@coyotesong.com> wrote:
>
>> I've taken the time to look around the other 'tree' classes, or more
>> precisely the '*DependencyNodeVisitor" classes, and noticed that most if
>> not all of them follow the anti-pattern of doing more than one thing in
>> each method. More precisely they mix logic ('what to include') and
>> presentation ('what to send to the writer). This makes code difficult to
>> understand, to test, to extend by a third party, etc.
>>
>> This is why template engines were developed! :-)
>>
>> Fortunately there's already a dependency on 'velocity', although it
>> appears to only be used during the extensive testing. However it's still
>> there in the 'compile' scope so using it won't require adding new
>> dependencies.
>>
>> That's a very clean solution to the problem identified above. For most
>> formats the velocity template won't need anything beyond what's already
>> included in the WrappedNode (DependencyNode). If a format can benefit from
>> adding a tiny bit of additional information it can be added to the
>> WrappedNode since that class is exclusive to the 'tree' package. In more
>> complex cases the individual class can maintain it's own data structures
>> and add them to the Context provided to velocity.
>>
>> This should work with all of the existing classes although I'm only
>> planning to modify one at this time. It should make the classes much easier
>> to maintain since they can focus on the logic of how to convert the data
>> provided by the DependencyNode to the data usable by the Velocity engine. I
>> know other template engines, e.g., jinjava, have a way to provide
>> user-defined functions but I don't think velocity does - that's unfortunate
>> since it will require a modest amount of pre-computation. Fortunately that
>> can be handled separately, immediately before calling the Velocity 'merge'
>> method.
>>
>> This will only require a few classes - basically just a wrapper to set up
>> the velocity engine. It will also require a standard place to put the
>> templates and I recommend 'templates/{format}/{name} and
>> 'templates/{format}/macros/{name}. The 'format' would be 'dot', 'json',
>> etc., and the 'macros' allows velocity macros to be kept separately from
>> the templates since they're much more likely to be reused. (Hmm... perhaps
>> also 'templates/macros/' for macros that are format-agnostic.
>>
>> -----------------------------
>>
>> The second PR will probably be a bit more controversial since there's
>> multiple ways to implement it and comments in some of the 'upstream'
>> methods suggest there's already WIP that may affect how the data is
>> provided to the *DependencyNodeVisitor classes.
>>
>> At the moment these classes only include two methods - visit(node) and
>> endVisit(node). It is the responsibility of the endVisit(node) method to
>> determine that all data has been received. I think that's an obvious
>> omission but the DependencyNodeVisitor interface is defined in a different
>> project. It would be trivial to add to the existing implementations, and
>> java8 supports optional methods so there would be no impact on others if
>> this method is added to the interface with an empty optional method.
>>
>> That's not a big problem if your visitor is able to run in a 'streaming'
>> mode, but some formats can benefit from using a 'document' mode instead.
>> In most cases this doesn't matter since the results are not intended for
>> human consumption - they need to be a standard format accepted by other
>> applications and these formats are usually designed to work in a streaming
>> fashion.
>>
>> However the dot format, and potentially other, are intended for human
>> consumption. The file itself isn't human-friendly but there are widely
>> available tools to convert it into human-friendly images and even
>> interactive formats like svg. In these cases it makes much more sense  to
>> use a document mode when creating the final document since it is often
>> appropriate to provide the same information, or at least related
>> information, in multiple places in the document.
>>
>> So there's clearly a benefit to a new method that is called immediately
>> after the final endVisit(), one that provides the complete dependency tree.
>> The implementations should be able to ignore visit() and endVisit(). (And
>> in fact they can be added to the abstract class if not the
>> DependencyNodeVisitor interface)
>>
>> The question is how to implement this, esp. when looking at other
>> potential enhancements to the class.
>>
>> For instance it's possible to have DependencyNodeVisitor implement
>> Consumer<DependencyNode> and the 'accept()' method calls visit(). It may
>> not be practical since many implementations will need the nesting that
>> endVisit() provides.
>>
>> Another possibility is to have DependencyNodeVisitor implement
>> Function<DependencyNode, Boolean> that's called after the final endVisit()
>> call. Again the idea is to simplify integration into streams but there's no
>> reason you couldn't just use the (object::function) notation.
>>
>> I think this ultimately comes down to what's best with future
>> improvements to the upstream code that creates the data sent to these
>> classes. There are notes that indicate the desire to shift to a streaming
>> pattern (while maintaining legacy interfaces, of course) and it would make
>> sense to use a consistent approach here - even if it means reformatting the
>> data immediately before calling the new method.
>>
>> For now I'll just add a new method to the abstract class that all of the
>> implementations extend but I consider it a stopgap measure since this
>> functionality would be useful to third party developers, etc.
>>
>> (Sidenote - it could even quietly replace some of the existing logic
>> since the final method could easily perform a recursive descent of rootNode
>> and call visit() and endVisit() itself. I think this could be put into the
>> DependencyNodeVisitor interface as a default method so there would be no
>> risk to removing the existing code. The primary drawback would be that the
>> current approach allows the implementation to bail out early if it hits a
>> problem while this approach would require the entire rootNode structure to
>> be generated first.)
>>
>> Bear
>>
>>
>>
>>

Re: Incoming PRs related to 'tree' enhancements

Reply via email to