Hi,

I've been researching issue #4711 [1] and, as requested by danielsh in JIRA, I'm turning to dev@ to discuss a possible solution.

As already found by danielsh, the problem lies in the filtering code in svn_cl__log_entry_receiver_xml. If the log entry doesn't match the search pattern, the code will return without emitting any opening tag. However when the library calls back to close a tag, it is done without checking if the opening tag was emitted or not.

Without filtering, the tree is recursive. I have made a test case where I have three branches, a, b and c. I've made commits to a, which I have subsequently merged to b and then from b to c. The XML is as follows:

[[[

    <logentry revision="12">
        <author>dsahlberg</author>
        <date>2021-08-08T19:29:38.411358Z</date>
        <msg>merge b to c</msg>
        <logentry reverse-merge="false" revision="11">
            <author>dsahlberg</author>
            <date>2021-08-08T19:29:26.563564Z</date>
            <msg>merge a to b</msg>
            <logentry revision="10" reverse-merge="false">
                <author>dsahlberg</author>
<date>2021-08-08T19:29:10.860725Z</date>
                <msg>add 1 file</msg>
            </logentry>
        </logentry>
    </logentry>
]]]

I'm proposing to keep the XML tree intact, but filter out the information that doesn't match the search pattern. If searching for "add 1 file", the result would be as follows:

[[[

    <logentry revision="12">
        <logentry revision="11">
            <logentry revision="10" reverse-merge="false">
                <author>dsahlberg</author>
<date>2021-08-08T19:29:10.860725Z</date>
                <msg>add 1 file</msg>
            </logentry>
        </logentry>
    </logentry>
]]]

The most simple solution is to add the opening tag (<logentry revision="n">) whenever a log entry has children:
[[[

Index: subversion/svn/log-cmd.c
===================================================================
--- subversion/svn/log-cmd.c    (revision 1892053)
+++ subversion/svn/log-cmd.c    (working copy)
@@ -552,6 +552,7 @@
       return SVN_NO_ERROR;
     }

+  revstr = apr_psprintf(pool, "%ld", log_entry->revision);
   /* Match search pattern before XML-escaping. */
   if (lb->search_patterns &&
       ! match_search_patterns(lb->search_patterns, author, date, message,
@@ -559,6 +560,9 @@
     {
       if (log_entry->has_children)
         {
+          svn_xml_make_open_tag(&sb, pool, svn_xml_normal, "logentry",
+                              "revision", revstr, SVN_VA_NULL);
+          SVN_ERR(svn_cl__error_checked_fputs(sb->data, stdout));
           if (! lb->merge_stack)
             lb->merge_stack = apr_array_make(lb->pool, 1, sizeof(svn_revnum_t));

@@ -575,7 +579,6 @@
   if (message)
     message = svn_xml_fuzzy_escape(message, pool);

-  revstr = apr_psprintf(pool, "%ld", log_entry->revision);
   /* <logentry revision="xxx"> */
   if (lb->merge_stack && lb->merge_stack->nelts > 0)
     {

]]]

The problem with this solution is that it might emit empty XML elements, like this:

[[[

    <logentry revision="6">
    </logentry>
]]]

Depending on how the XML parser is built and how the information is used/displayed this may or may not be a problem. As far as I understand, most (if not all) tags are optional so it should not be an issue that there are no children (for example <msg></msg>). But it may be that an empty revision 6 is displayed in this case.

A more complex solution would require keeping track of the XML tags that should be emitted if a child matches the search pattern. There is already an array (lb->merge_stack) that could probably be used for this purpose. Instead of just storing the revision number it would have to store if the opening tag has been emitted or not. When a child matches the search pattern, all parent nodes that has not previously been emitted will be emitted. When the library signals to close a tag, the code checks if the corresponding open tag has been emitted or not.

The following is a concept for such a patch (I know HACKING has some things to say about how to name typedefs etc, but I don't have time to study it now, there are also whitespace issues). If this is a more desirable solution (ie, I get some +1 (concept)) then I will cleanup the patch and repost.

[[[

Index: subversion/svn/log-cmd.c
===================================================================
--- subversion/svn/log-cmd.c    (revision 1892053)
+++ subversion/svn/log-cmd.c    (working copy)
@@ -45,6 +45,11 @@

 #include "svn_private_config.h"

+typedef struct merge_stack {
+  svn_revnum_t rev;
+  svn_boolean_t emitted;
+} merge_stack_t;
+
 
 /*** Code. ***/

@@ -355,10 +360,14 @@
     {
       if (log_entry->has_children)
         {
+      merge_stack_t ms;
+
           if (! lb->merge_stack)
-            lb->merge_stack = apr_array_make(lb->pool, 1, sizeof(svn_revnum_t)); +            lb->merge_stack = apr_array_make(lb->pool, 1, sizeof(merge_stack_t));

-          APR_ARRAY_PUSH(lb->merge_stack, svn_revnum_t) = log_entry->revision;
+      ms.rev = log_entry->revision;
+      ms.emitted = FALSE;
+          APR_ARRAY_PUSH(lb->merge_stack, merge_stack_t) = ms;
         }

       return SVN_NO_ERROR;
@@ -435,10 +444,10 @@
       iterpool = svn_pool_create(pool);
       for (i = 0; i < lb->merge_stack->nelts; i++)
         {
-          svn_revnum_t rev = APR_ARRAY_IDX(lb->merge_stack, i, svn_revnum_t); +          merge_stack_t ms = APR_ARRAY_IDX(lb->merge_stack, i, merge_stack_t);

           svn_pool_clear(iterpool);
-          SVN_ERR(svn_cmdline_printf(iterpool, " r%ld%c", rev,
+          SVN_ERR(svn_cmdline_printf(iterpool, " r%ld%c", ms.rev,
                                      i == lb->merge_stack->nelts - 1 ?
'\n' : ','));
         }
@@ -475,10 +484,14 @@

   if (log_entry->has_children)
     {
+      merge_stack_t ms;
+
       if (! lb->merge_stack)
-        lb->merge_stack = apr_array_make(lb->pool, 1, sizeof(svn_revnum_t)); +        lb->merge_stack = apr_array_make(lb->pool, 1, sizeof(merge_stack_t));

-      APR_ARRAY_PUSH(lb->merge_stack, svn_revnum_t) = log_entry->revision;
+      ms.rev = log_entry->revision;
+      ms.emitted = FALSE;
+      APR_ARRAY_PUSH(lb->merge_stack, merge_stack_t) = ms;
     }

   return SVN_NO_ERROR;
@@ -533,6 +546,7 @@
   const char *author;
   const char *date;
   const char *message;
+  int i;

   if (lb->ctx->cancel_func)
SVN_ERR(lb->ctx->cancel_func(lb->ctx->cancel_baton));
@@ -544,11 +558,14 @@

   if (! SVN_IS_VALID_REVNUM(log_entry->revision))
     {
-      svn_xml_make_close_tag(&sb, pool, "logentry");
-      SVN_ERR(svn_cl__error_checked_fputs(sb->data, stdout));
-      if (lb->merge_stack)
-        apr_array_pop(lb->merge_stack);
-
+      if (lb->merge_stack) {
+    if (lb->merge_stack->nelts > 0 && (APR_ARRAY_IDX(lb->merge_stack, lb->merge_stack->nelts-1, merge_stack_t).emitted))
+      {
+        svn_xml_make_close_tag(&sb, pool, "logentry");
+        SVN_ERR(svn_cl__error_checked_fputs(sb->data, stdout));
+      }
+    apr_array_pop(lb->merge_stack);
+      }
       return SVN_NO_ERROR;
     }

@@ -559,15 +576,32 @@
     {
       if (log_entry->has_children)
         {
+      merge_stack_t ms;
           if (! lb->merge_stack)
-            lb->merge_stack = apr_array_make(lb->pool, 1, sizeof(svn_revnum_t)); +            lb->merge_stack = apr_array_make(lb->pool, 1, sizeof(merge_stack_t));

-          APR_ARRAY_PUSH(lb->merge_stack, svn_revnum_t) = log_entry->revision;
+      ms.rev = log_entry->revision;
+      ms.emitted = FALSE;
+          APR_ARRAY_PUSH(lb->merge_stack, merge_stack_t) = ms;
         }

       return SVN_NO_ERROR;
     }
-
+  else if (lb->search_patterns && lb->merge_stack) {
+    /* match_search_patterns returned true */
+    for (i = 0; i < lb->merge_stack->nelts; i++)
+      {
+    if (!APR_ARRAY_IDX(lb->merge_stack, i, merge_stack_t).emitted)
+      {
+        revstr = apr_psprintf(pool, "%ld",
+                  APR_ARRAY_IDX(lb->merge_stack, i, merge_stack_t).rev);
+        svn_xml_make_open_tag(&sb, pool, svn_xml_normal, "logentry",
+                  "revision", revstr, SVN_VA_NULL);
+        APR_ARRAY_IDX(lb->merge_stack, i, merge_stack_t).emitted = TRUE;
+      }
+      }
+  }
+
   if (author)
     author = svn_xml_fuzzy_escape(author, pool);
   if (date)
@@ -606,7 +640,6 @@
   if (log_entry->changed_paths2)
     {
       apr_array_header_t *sorted_paths;
-      int i;

       /* <paths> */
       svn_xml_make_open_tag(&sb, pool, svn_xml_normal, "paths",
@@ -684,10 +717,14 @@

   if (log_entry->has_children)
     {
+      merge_stack_t ms;
+
       if (! lb->merge_stack)
-        lb->merge_stack = apr_array_make(lb->pool, 1, sizeof(svn_revnum_t)); +        lb->merge_stack = apr_array_make(lb->pool, 1, sizeof(merge_stack_t));

-      APR_ARRAY_PUSH(lb->merge_stack, svn_revnum_t) = log_entry->revision;
+      ms.rev = log_entry->revision;
+      ms.emitted = TRUE;
+      APR_ARRAY_PUSH(lb->merge_stack, merge_stack_t) = ms;
     }
   else
     svn_xml_make_close_tag(&sb, pool, "logentry");

]]]

I have run the testsuite with both patch versions without any problems. (I have obviously removed XFail() from the test for #4711 in log_tests.py).

Kind regards,

Daniel Sahlberg


[1] https://issues.apache.org/jira/browse/SVN-4711

Reply via email to