Steve Rowe created LUCENE-7308:
----------------------------------

             Summary: checkJavaDocs.py mis-chunks javadocs HTML and then 
wrongly reports imbalanced tags
                 Key: LUCENE-7308
                 URL: https://issues.apache.org/jira/browse/LUCENE-7308
             Project: Lucene - Core
          Issue Type: Bug
            Reporter: Steve Rowe


Spin-off from SOLR-9107, where [~hossman] wrote:

{quote}
but as things stand with this patch, precommit currently complains about 
malformed javadocs...
{noformat}
     [echo] Checking for malformed docs...
     [exec] 
     [exec] 
/home/hossman/lucene/dev/solr/build/docs/solr-test-framework/org/apache/solr/util/RandomizeSSL.html
     [exec]   broken details HTML: Field Detail: reason: saw closing "</ul>" 
without opening <ul...>
     [exec]   broken details HTML: Field Detail: ssl: saw closing "</ul>" 
without opening <ul...>
     [exec]   broken details HTML: Field Detail: clientAuth: saw closing 
"</ul>" without opening <ul...>
{noformat}
...but i can't really understand why. The <ul> tags look balanced to me, and 
tidy -output /dev/null .../RandomizeSSL.html concurs that "No warnings or 
errors were found." I thought maybe the problem was related to some of the @see 
tags in the docs for these attributes, but even if i completely remove the 
javadocs the same validation errors occur.
{quote}

When I modify {{checkJavaDocs.py}} to print out the offending chunk of HTML, 
here's what I see for the first of the above:

{noformat}
solr/build/docs/solr-test-framework/org/apache/solr/util/RandomizeSSL.html
  broken details HTML: Field Detail: reason: saw closing "</ul>" without 
opening <ul...> in:
-----
<ul><pre>public abstract&nbsp;<a 
href="https://docs.oracle.com/javase/8/docs/api/java/lang/String.html?is-external=true"
 title="class or interface in java.lang">String</a>&nbsp;reason</pre>
<div class="block">Comment to inlcude when logging details of SSL 
randomization</div>
<dl>
<dt>Default:</dt>
<dd>""</dd>
</dl>
</li>
</ul>
</li>
</ul>
<ul class="blockList">
<li class="blockList"><a name="ssl--">
<!--   -->
</a>
<ul class="blockList">
<li class="blockList">
</ul>
{noformat}

So the chunk being validated doesn't line up with the detail HTML for an 
individual method, field, etc.: it starts too late (after the item's own 
opening tags) and ends too late (it runs into the opening tags of the next 
item, here {{ssl}}).
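
To illustrate why a shifted chunk trips the check even when the whole file is 
balanced, here is a minimal sketch of a tag-balance check. It is not the code 
in {{checkJavaDocs.py}}; the name {{find_imbalance}} and the set of tracked 
tags are assumptions made for this example only.

```python
import re

# Hypothetical helper for illustration; NOT the code in checkJavaDocs.py.
# Matches opening and closing tags; comments such as <!-- --> are not matched.
_TAG = re.compile(r'<(/?)([a-zA-Z][a-zA-Z0-9]*)[^>]*>')

# Assumption: only list containers are tracked in this sketch.
_TRACKED = {'ul', 'ol', 'dl'}


def find_imbalance(fragment):
    """Return a list of messages describing unbalanced tracked tags."""
    stack = []
    problems = []
    for m in _TAG.finditer(fragment):
        closing = m.group(1) == '/'
        name = m.group(2).lower()
        if name not in _TRACKED:
            continue
        if not closing:
            stack.append(name)
        elif stack and stack[-1] == name:
            stack.pop()
        else:
            problems.append(
                'saw closing "</%s>" without opening <%s...>' % (name, name))
    # Anything still on the stack was opened but never closed in this chunk.
    for name in stack:
        problems.append(
            'saw opening <%s...> without closing "</%s>"' % (name, name))
    return problems
```

Fed a chunk that holds one whole item, this returns an empty list. Fed a chunk 
shaped like the one above (one synthetic {{<ul>}} at the front, the two closing 
{{</ul>}} tags of the real item, then the opening {{<ul>}} of the next item), 
it reports a closing {{</ul>}} without an opening one, although the file 
itself is balanced.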

Furthermore, I can see that the chunking procedure ignores the final item in an 
HTML file (the stuff after the last {{<h4>}}): if I insert trash after the 
final {{<h4>}} (but within the javadocs for the corresponding final detail item 
in the HTML file), the current implementation doesn't report the problem.
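
The dropped final item is the kind of thing that happens when chunks are only 
emitted on seeing the next {{<h4>}}. Below is a minimal sketch of a chunker 
that also flushes the trailing item. It assumes each detail item starts at a 
line containing {{<h4>name</h4>}}; the name {{chunk_details}} is hypothetical, 
and this is not the actual {{checkJavaDocs.py}} logic.

```python
import re

# Assumption: each detail item begins at a line containing <h4>name</h4>.
_H4 = re.compile(r'<h4>(.*?)</h4>', re.IGNORECASE)


def chunk_details(lines):
    """Yield (item_name, html) for every <h4> item, including the last one."""
    name = None
    buf = []
    for line in lines:
        m = _H4.search(line)
        if m is not None:
            # A new item starts: emit the previous one, if any.
            if name is not None:
                yield name, ''.join(buf)
            name = m.group(1)
            buf = []
        elif name is not None:
            buf.append(line)
    # Flush the trailing item. Without this step, anything after the
    # final <h4> would never be handed to the validator.
    if name is not None:
        yield name, ''.join(buf)
```

This only shows the end-of-file flush. It does not fix the boundary problem 
described earlier: a chunk cut purely at {{<h4>}} lines still carries the 
closing tags of its own wrapper and the opening tags of the next item.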



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
