Kelly,

Is there a reason you don't use "expand"?

For a while now I've been using the following simple script to fix the whitespace in my files before I commit a changeset.

-- Jon



#!/bin/sh

# Find and fix whitespace errors in files that would otherwise be caught and rejected by the
# OpenJDK Mercurial "jcheck" facility.
# 1. Tabs in files are expanded to spaces
# 2. Trailing whitespace is removed
# 3. Add final newline if it is missing

# With no args, script uses "hg status" to determine modified and new files.
# Otherwise, script scans files and directories given on command line and fixes
# *.java, *.g, *.properties, *.xml files.

if [ $# = 0 ]; then
  files=$(hg status --modified --added --no-status)
else
  files="$*"
fi

find $files -name SCCS -prune -o \( -name \*.java -o -name \*.g -o -name \*.properties -o -name \*.xml \) -print |
    while IFS= read -r f ; do
        updated=0

        # check for tabs or trailing whitespace, fix if found
        tab=$(printf '\t')    # a literal TAB character
        if grep -E "$tab| \$" "$f" > /dev/null ; then
            expand "$f" | sed -e "s/[ $tab]*\$//" > "$f~" && mv "$f~" "$f"
            updated=1
        fi

        # check for final newline, fix if not found
        if perl -ne 'END { exit 1 if $nl; } $nl = /\n$/' "$f" ; then
            echo >> "$f"
            updated=1
        fi

        # log update
        if [ $updated = 1 ]; then
            echo "$f"
        fi
    done




On 03/02/2012 03:41 PM, Kelly O'Hair wrote:
A TAB takes you to a specific TAB stop; the default is one TAB stop every 8
characters in a line. So the conversion is not just 'replace TAB with N
characters'. This is one of the issues with TABs: they aren't as predictable
as people might think, especially when mixed with spaces or placed anywhere
other than at the beginning of the line all by themselves.
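
To illustrate the point about TAB stops, here is a minimal sketch using the
POSIX "expand" utility. It is only an illustration, not part of the webrev.
The same TAB becomes a different number of spaces depending on the column
where it sits, which a fixed-width substitution cannot reproduce.

```shell
#!/bin/sh
# A TAB advances to the next tab stop, so the number of spaces it turns
# into depends on the column where it appears.
tab=$(printf '\t')    # a literal TAB character

# expand uses tab stops every 8 columns by default.
printf 'a\tb\n'       | expand    # "a" + 7 spaces + "b"
printf 'abcdefg\tb\n' | expand    # "abcdefg" + 1 space + "b"

# A naive "replace TAB with 8 spaces" gives a different layout.
printf 'abcdefg\tb\n' | sed "s/$tab/        /g"    # "abcdefg" + 8 spaces + "b"
```

In both "expand" cases the "b" lands in column 9; with the naive
substitution it lands in column 16.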

It should not change the indenting, but that's with the assumption that the
TABs follow the 8 character spacing. And that of course depends on how you
are viewing the source. :^(

-kto

On Mar 2, 2012, at 3:04 PM, David Holmes wrote:

On 3/03/2012 3:12 AM, Kelly O'Hair wrote:
I don't understand the question. It only changes TAB characters, and removes
trailing whitespace on lines and duplicate blank lines at the end of the file.

I think the issue is: what does it replace a TAB with? 4 spaces for JDK or 2
spaces for HotSpot?
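
For reference, a small sketch of how the tab-stop interval changes the result
of a conversion, using the "-t" option of the POSIX "expand" utility. Whether
the normalizer script itself offers such a setting is not stated in this
thread; the example only shows what the choice of interval does to a leading
TAB.

```shell
#!/bin/sh
# Same input, three tab-stop settings; only the width of the indent changes.
printf '\tfoo();\n' | expand         # default, stops every 8 columns: 8 spaces
printf '\tfoo();\n' | expand -t 4    # stops every 4 columns: 4 spaces
printf '\tfoo();\n' | expand -t 2    # stops every 2 columns: 2 spaces
```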

David

-kto

On Mar 2, 2012, at 1:00 AM, Staffan Larsen wrote:

Does this handle the difference between indents in HotSpot (indent 2) vs the
JDK (indent 4)?

/Staffan

On 1 mar 2012, at 22:32, Kelly O'Hair wrote:

Need reviewer. Adding the whitespace normalizer script as a convenience to
the jdk developers.

6625113: Add the normalize and rmkw perl script to the openjdk repository or 
openjdk site?
http://cr.openjdk.java.net/~ohair/openjdk8/normalizer-script/webrev/

Probably a little history is warranted here. This script was originally
written to normalize the whitespace in the jdk7 sources as they entered the
Mercurial repositories in "changeset 0". It's been modified since then very
slightly. I can't recall who wrote it (please speak up if you know), but it
has been a valuable tool and I've had this CR to add it to the make/scripts
directory for some time.

The SCCS keyword remover (rmkw) was less useful, and I decided that it did
not deserve being added.

Why whitespace normalization? This was decided a long time ago when we had a
raft of complaints from people viewing the sources with different tools and
getting different views based on the TABs and trailing blanks or trailing
newlines. So we decided to normalize on no TABs, no trailing blanks on lines,
and no more than one blank line at the end of the file. This script was used
to do that normalization.
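
The three rules above can be sketched as a read-only check. This is a
hypothetical helper, not the normalizer script from the webrev: the function
name check_whitespace is made up for illustration, and "no more than one blank
line at the end of the file" is interpreted here as "the last two lines are
not both empty", which is an assumption.

```shell
#!/bin/sh
# Hypothetical helper: report which normalization rules a file breaks.
# Returns 0 if the file is clean, 1 otherwise. It never modifies the file.
check_whitespace() {
    f=$1
    tab=$(printf '\t')    # a literal TAB character
    status=0

    # Rule 1: no TABs anywhere in the file.
    if grep -q "$tab" "$f"; then
        echo "$f: contains TABs"
        status=1
    fi

    # Rule 2: no trailing blanks (spaces or TABs) on any line.
    if grep -q "[ $tab]\$" "$f"; then
        echo "$f: trailing blanks"
        status=1
    fi

    # Rule 3 (assumed reading): the last two lines are not both empty.
    if [ "$(tail -n 2 "$f" | grep -c '^$')" -eq 2 ]; then
        echo "$f: more than one blank line at end of file"
        status=1
    fi

    return $status
}
```

It could be called as check_whitespace Foo.java, where Foo.java stands for any
source file.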

-kto

