In case it helps anyone else, here's a little bash script I wrote to find HTML files with XML errors and to print summary totals. (It calls tidy, which is included in Leopard.) Since we have a ton of files that need to be validated for proper XML, I'm using this to get a high-level view of how many files are OK and how many still have errors.

You use it like this:

        xmlcheck [directory]

It will find all .html files under the directory and identify any with XML formatting problems.
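
For anyone who wants to see the core idea without reading the whole script: the classification rests on tidy's exit status (0 for a clean file, 1 for warnings only, 2 for errors). Here is a minimal sketch of that check for a single file. The function names classify_tidy_status and check_file are made up for this illustration and are not part of xmlcheck.

```shell
#!/bin/bash

# Hypothetical helper (not in xmlcheck): translate tidy's exit status
# into a label. tidy exits 0 for a clean file, 1 for warnings, 2 for errors.
classify_tidy_status() {
    case "$1" in
        0) echo "OK" ;;
        1) echo "WARNINGS" ;;
        2) echo "ERRORS" ;;
        *) echo "UNKNOWN" ;;
    esac
}

# Hypothetical helper (not in xmlcheck): check one file and print its label.
# -xml treats the input as XML, -e reports only errors and warnings,
# -q suppresses nonessential output.
check_file() {
    local status=0
    tidy -xml -e -q "$1" >/dev/null 2>&1 || status=$?
    classify_tidy_status "$status"
}
```

With tidy installed, something like check_file "some page.html" should print one of the labels above; UNKNOWN covers any other status, for example when tidy is not on the PATH.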

To see other options (show errors/warnings only, show verbose messages), use

        xmlcheck -h

for help.

-Nathan


(Source below. Also attached.)


-----------------------< copy and paste >-----------------------
#!/bin/bash

# Uses 'tidy' to find XML errors in HTML files

export IFS=$'\n' # separate tokens by newline only (needed so 'find' command works with filenames having spaces)
TOTAL_COUNT=0
ERR_COUNT=0
WARN_COUNT=0
SHOW_WARNINGS=1
SHOW_OK=1
VERBOSE=0
FILES=

while getopts 'ewvh' OPTION; do
        
        case $OPTION in
                e)      SHOW_WARNINGS=0
                        SHOW_OK=0
                        ;;
                        
                w)      SHOW_WARNINGS=1
                        SHOW_OK=0
                        ;;
                        
                v)      VERBOSE=1
                        ;;
        
                h|?)    printf "Usage: %s [-e|w] [-v] [-h] [root directory]\n\n" $(basename $0) >&2
                        printf "  Uses 'tidy' command to find XML formatting errors in .html files.\n\n" >&2
                        printf "  -e  Show only files with errors\n" >&2
                        printf "  -w  Show only files with errors or warnings\n" >&2
                        printf "  -v  Verbose; show error and warning messages\n" >&2
                        printf "  -h  Help\n\n" >&2
                        exit 2
                        ;;
        esac
done
shift $(($OPTIND - 1))
FILES=$*

if [[ "$FILES" == "" ]]; then
        FILES="."
fi

for i in $(find $FILES -path "*.html"); do
        (( TOTAL_COUNT += 1 ))
        TIDY_OUT=`tidy -xml -e -q $i 2>&1`
        ERROR_CODE=$?
        if [[ "$ERROR_CODE" == "2" ]]; then
                (( ERR_COUNT += 1))
                echo "ERRORS!   $i" >&2
                if [[ $VERBOSE == 1 ]]; then
                        echo "$TIDY_OUT" >&2 # quoted so tidy's multi-line output keeps its line breaks
                        echo >&2
                fi

        elif [[ "$ERROR_CODE" == "1" ]]; then
                (( WARN_COUNT += 1 ))
                if [[ $SHOW_WARNINGS == 1 ]]; then
                        echo "WARNINGS! $i" >&2
                        if [[ $VERBOSE == 1 ]]; then
                                echo "$TIDY_OUT" >&2 # quoted so tidy's multi-line output keeps its line breaks
                                echo    >&2
                        fi
                fi
                
        else
                if [[ $SHOW_OK == 1 ]]; then
                        echo "OK        $i"
                fi
        fi
done

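# Avoid dividing by zero below when no .html files were found
if (( TOTAL_COUNT == 0 )); then
        echo "No .html files found." >&2
        exit 0
fi

OK_COUNT=$(( TOTAL_COUNT - ERR_COUNT - WARN_COUNT ))

# Integer division truncates, so the three percentages may not add up to exactly 100
PERCENT_ERR=$(( ERR_COUNT * 100 / TOTAL_COUNT ))
PERCENT_WARN=$(( WARN_COUNT * 100 / TOTAL_COUNT ))
PERCENT_OK=$(( OK_COUNT * 100 / TOTAL_COUNT ))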

echo
printf "               Total files: %4d\n" $TOTAL_COUNT
echo
printf "                  OK files: %4d %3d%%\n" $OK_COUNT $PERCENT_OK
printf "         Files with errors: %4d %3d%%\n" $ERR_COUNT $PERCENT_ERR
printf " Files with warnings only: %4d %3d%%\n" $WARN_COUNT $PERCENT_WARN
echo
-----------------------< copy and paste >-----------------------
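
A note on the IFS trick at the top of the script: setting IFS to newline is what lets the for loop cope with spaces in filenames. A possible alternative, sketched below, is to have find emit NUL-delimited names and read them one at a time, which also survives newlines in names. This is only an illustration of the looping technique, not a drop-in replacement; the function name count_html_files is made up, and it uses -name where the script uses -path.

```shell
#!/bin/bash

# Hypothetical helper (not in xmlcheck): count .html files under a
# directory. find -print0 separates names with NUL bytes, and
# read -d '' consumes them one by one, so spaces and newlines in
# filenames are handled without touching the global IFS.
count_html_files() {
    local dir=${1:-.}
    local count=0
    local file
    while IFS= read -r -d '' file; do
        # A real script would run tidy on "$file" here.
        (( count += 1 ))
    done < <(find "$dir" -type f -name '*.html' -print0)
    echo "$count"
}
```

The loop reads from process substitution rather than a pipe so that count is updated in the current shell and is still set when the loop ends.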

--
Nathan Hadfield