Hello Martin, Stuart,
Thank you for the notes,
Yes, the initial utility is quite ugly, I just tried to prepare it as
quickly as possible hoping that it covers the majority of "trivial"
replace cases. Yes, it does not process multi-line <code> inclusions.
> s = s.replace( "<CODE>", tag1);
> s = s.replace( "<Code>", tag1);
> s = s.replace("</CODE>", tag2);
> s = s.replace("</Code>", tag2);
- replaced with "s = ln.replaceAll("(?i)<code>",
"<code>").replaceAll("(?i)</code>", "</code>");"
Unfortunately my Perl/lisp knowledge are zero :)
> Should you publish your specdiff? I guess not - it would be empty!
For now it contains a single fixed misprint diff, but there are a few
another misprints at the moment which I'd like to include in the patch
as well.
So if you don't have objections, I'll delay for a several days and then
publish a final RFR (probably containing changes in some other repos
like jaxws, corba or jaxp) which would be more formal (containing bug #
and the final specdiff report).
Thanks again,
Alexander
On 10/1/2015 9:54 AM, Martin Buchholz wrote:
Hi s'marks,
You probably don't need to absolutify paths.
And you can easily handle multiple args.
(just for fun!)
Checks for javadoc comment; handles popular html entities; handles
multiple lines; handles both tt and code:
#!/bin/bash
find "$@" -name '*.java' | \
xargs -r perl -p0777i -e \
'do {} while s~^
*\*.*\K<(tt|code)>((?:[^<>{}\&\@]|&(?:lt|gt|amp);)*)</\1>~$_=$2;
s/</</g; s/>/>/g; s/&/&/g; "{\@code $_}"~mgie'
On Wed, Sep 30, 2015 at 6:16 PM, Stuart Marks <stuart.ma...@oracle.com
<mailto:stuart.ma...@oracle.com>> wrote:
Hi Alexander, Martin,
The challenge of Perl file slurping and Emacs one-liners was too
much to bear.
This is Java, so one-liners are hardly possible. Still, there are
a bunch of improvements that can be made to the Java version. (OK,
and I'm showing off a bit.)
Take a look at this:
http://cr.openjdk.java.net/~smarks/misc/SimpleTagEditorSmarks1.java
<http://cr.openjdk.java.net/%7Esmarks/misc/SimpleTagEditorSmarks1.java>
I haven't studied the output exhaustively, but it seems to do a
reasonably good job for the common cases. I ran it over java.lang
and I noticed a few cases where there is markup embedded within
<code></code> text, which should be looked at more closely.
I don't particularly care if you use my version, but there are
some techniques that I'd strongly recommend that you consider
using in any such tool. In particular:
- Pattern.DOTALL to do multi-line matches
- Pattern.CASE_INSENSITIVE
- try-with-resources to ensure that files are closed properly
- NIO instead of old java.io <http://java.io> APIs, particularly
Files.walk() and streams
- use Scanner to deal with input file buffering
- Scanner's stream support (I recently added this to JDK 9)
Enjoy,
s'marks
On 9/29/15 2:23 PM, Martin Buchholz wrote:
Hi Alexander,
your change looks good. It's OK to have manual corrections
for automated
mega-changes like this, as long as they all revert changes.
Random comments:
Should you publish your specdiff? I guess not - it would be
empty!
while((s = br.readLine()) != null) {
by matching only one line at a time, you lose the ability to make
replacements that span lines. Perlers like to "slurp" in the
entire file
as a single string.
s = s.replace( "<CODE>", tag1);
s = s.replace( "<Code>", tag1);
s = s.replace("</CODE>", tag2);
s = s.replace("</Code>", tag2);
Why not use case-insensitive regex?
Here's an emacs-lisp one-liner I've been known to use:
(defun tt-code ()
(interactive)
(query-replace-regexp
"<\\(tt\\|code\\)>\\([^&<>\\\\]+\\)</\\1>" "{@code
\\2}"))
With more work, one can automate transformation of embedded
things like <
But of course, it's not even possible to transform ALL uses of
<code> to
{@code, if there was imaginative use of nested html tags.
On Tue, Sep 29, 2015 at 3:21 AM, Alexander Stepanov <
alexander.v.stepa...@oracle.com
<mailto:alexander.v.stepa...@oracle.com>> wrote:
Updated: a few manual corrections were made (as @linkplain
tags displays
nested {@code } literally):
http://cr.openjdk.java.net/~avstepan/tmp/codeTags/jdk.patch
<http://cr.openjdk.java.net/%7Eavstepan/tmp/codeTags/jdk.patch>
-checked with specdiff (which of course does not cover
documentation for
internal packages), no unexpected diffs detected.
Regards,
Alexander
On 9/27/2015 4:52 PM, Alexander Stepanov wrote:
Hello Martin,
Here is some simple app. to replace <code></code> tags
with a new-style
{@code } one (which is definitely not so elegant as
the Perl one-liners):
http://cr.openjdk.java.net/~avstepan/tmp/codeTags/SimpleTagEditor.java
<http://cr.openjdk.java.net/%7Eavstepan/tmp/codeTags/SimpleTagEditor.java>
Corresponding patch for jdk and replacement log (~62k
of the tag changes):
http://cr.openjdk.java.net/~avstepan/tmp/codeTags/jdk.patch
<http://cr.openjdk.java.net/%7Eavstepan/tmp/codeTags/jdk.patch>
http://cr.openjdk.java.net/~avstepan/tmp/codeTags/replace.log
<http://cr.openjdk.java.net/%7Eavstepan/tmp/codeTags/replace.log>
(sorry, I have to check the correctness of the patch
with specdiff yet,
so this is rather demo at the moment).
Don't know if these changes (cosmetic by nature) are
desired for now or
not. Moreover, probably some part of them should go to
another repos (e.g.,
awt, swing -> "client" instead of "dev").
Regards,
Alexander
----- Исходное сообщение -----
От: alexander.v.stepa...@oracle.com
<mailto:alexander.v.stepa...@oracle.com>
Кому: marti...@google.com <mailto:marti...@google.com>
Копия: core-libs-dev@openjdk.java.net
<mailto:core-libs-dev@openjdk.java.net>
Отправленные: Четверг, 24 Сентябрь 2015 г 16:06:56 GMT
+03:00 Москва,
Санкт-Петербург, Волгоград
Тема: Re: RFR [9] 8133651: replace some <tt> tags
(obsolete in html5) in
core-libs docs
Hello Martin,
Thank you for review and for the notes!
> I'm biased of course, but I like the approach I
took with
blessed-modifier-order:
> - make the change completely automated
> - leave "human editing" for a separate change
> - publish the code used to make the automated
change (in my case,
typically a perl one-liner)
Automated replacement has an obvious advantage: it is
fast and massive.
But there are some disadvantages at the same time
(just IMHO).
Using script it is quite easy to miss some not very
trivial cases, e.g.:
- remove unnecessary linebreaks, like
* <tt>someCode
* </tt>
(which would be better to replace with single-line
{@code someCode};
- joining of successive terms, like "<tt>ONE</tt>,
<tt>TWO</tt>,
<tt>THREE</tt>" -> "{@code ONE, TWO, THREE}";
- errors like extra or missing "<" or ">": *
<tt>Collection
<T></tt>", - there were a lot of them;
- some cases when <tt></tt> should be replaced with
<code></code>, not
{@code } (e.g. because of unicode characters inside of
code etc.);
- extra tags inside of <tt> or <code> which should be
moved outside of
{@code }, like <tt><i>someCode</i></tt> or
<tt><b>someCode</b></tt>;
- simple removing of needless tags, like "<tt>{@link
...}</tt>" ->
"{@link ...}";
- replace HTML codes with symbols ('<', '>', '@', ...)
- etc.
- plus some other formatting changes and fixes for
misprints which would
be omitted during the automated replacement (and
wouldn't be done in
future manually because there is no motivation for
repeated processing).
So sometimes it may be difficult to say where is the
border between
"trivial" and "human-editing" cases (and the portion
of "non-trivial
cases" is definitely not minor); moreover, even the
automated
replacement requires the subsequent careful review
before publishing of
webrev (as well as by reviewers who probably wouldn't
be happy to review
hundreds of files at the same time) and iterative
checks/corrections.
specdiff is very useful for this task but also cannot
fully cover the
diffs (as some changes are situated in the internal
com/... sun/...
packages).
Moreover, I'm sure that some reviewers would be
annoyed with the fact
that some (quite simple) changes were postponed
because they are "not
too trivial to be fixed just now" (because they will
suspect they would
be postponed forever). So the patch creator would
(probably) receive
some advices during the review like "please fix also
fix this and that"
(which is normal, of course).
So my preference was to make the changes package by
package (in some
reasonable amount of files) not postponing part of the
changes for the
future (sorry for these boring repeating review
requests). Please note
that all the above mentioned is *rather explanation of
my motivation
than objection* :) (and of course I used some text
editor replace
automation which is surely not so advanced as Perl).
> It's probably correct, but I would have left it
out of this change
Yes, I see. Reverted (please update the web page):
http://cr.openjdk.java.net/~avstepan/8133651/jdk.00/index.html
<http://cr.openjdk.java.net/%7Eavstepan/8133651/jdk.00/index.html>
Thanks,
Alexander
P.S. The <tt> replacement job is mostly (I guess,
~80%) complete. But
probably this approach should be used if some similar
replacement task
for, e.g., <code></code> tags would be planned in
future (there are
thousands of them).
On 9/24/2015 6:10 AM, Martin Buchholz wrote:
On Sat, Sep 19, 2015 at 6:58 AM, Alexander Stepanov
<alexander.v.stepa...@oracle.com
<mailto:alexander.v.stepa...@oracle.com>
<mailto:alexander.v.stepa...@oracle.com
<mailto:alexander.v.stepa...@oracle.com>>> wrote:
Hello,
Could you please review the following fix
http://cr.openjdk.java.net/~avstepan/8133651/jdk.00/
<http://cr.openjdk.java.net/%7Eavstepan/8133651/jdk.00/>
<http://cr.openjdk.java.net/%7Eavstepan/8133651/jdk.00/>
http://cr.openjdk.java.net/~avstepan/8133651/jaxws.00/index.html
<http://cr.openjdk.java.net/%7Eavstepan/8133651/jaxws.00/index.html>
<http://cr.openjdk.java.net/%7Eavstepan/8133651/jaxws.00/index.html
for
https://bugs.openjdk.java.net/browse/JDK-8133651
Just another portion of deprecated <tt> (and
<xmp>) tags replaced
with {@code }. Some misprints were also fixed.
I'm biased of course, but I like the approach I
took with
blessed-modifier-order:
- make the change completely automated
- leave "human editing" for a separate change
- publish the code used to make the automated
change (in my case,
typically a perl one-liner)
The following (expected) changes were
detected by specdiff:
- removed needless dashes in java.util.Locale,
- removed needless curly brace in
xml.bind.annotation.XmlElementRef
I would do a separate automated "removed needless
dashes" changeset.
Please let me know if the following changes
are desirable or not:
http://cr.openjdk.java.net/~avstepan/8133651/jdk.00/src/jdk.jconsole/share/classes/sun/tools/jconsole/Formatter.java.udiff.html
<http://cr.openjdk.java.net/%7Eavstepan/8133651/jdk.00/src/jdk.jconsole/share/classes/sun/tools/jconsole/Formatter.java.udiff.html>
<
http://cr.openjdk.java.net/%7Eavstepan/8133651/jdk.00/src/jdk.jconsole/share/classes/sun/tools/jconsole/Formatter.java.udiff.html
This is an actual change to the behavior of this
code - the
maintainers of jconsole need to approve it. It's
probably correct,
but I would have left it out of this change. If
you remove it, then I
approve this change.