Hi all!

As part of my personal website's test suite, I wrote some code in Perl and
Bash, using some CPAN modules, to check the resultant static HTML pages for
spelling errors. Since I want to add this functionality to other static sites
I maintain, I thought of extracting it into its own CPAN distribution, but
then remembered that the problem may have already been solved elsewhere. So
I'm looking for such a system in any language I can easily manage. What are
you using?

My requirements and current features are:

1. Spell check in en_GB, en_US, and possibly he_IL using hunspell or perhaps
aspell.

2. Ability to traverse a directory tree looking for files matching *.html
and *.xhtml, with the ability to prune directories or ignore certain path
patterns.

3. Ability to provide whitelists - both a global one and ones for sets of
documents (including some partially overlapping sets). The whitelists should be
kept in a well-formed text format that can be sorted into a canonical order,
to minimise version control changes.

4. Spell check the text between tags, with a bonus for text in alt="..."
and similar attributes.

5. No requirement to handle malformed or invalid HTML/XHTML. It can simply
report an error in this case.
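
To make requirements 2 to 5 more concrete, here is a rough sketch in Python
using only the standard library. It is not my existing Perl/Bash code, and
everything in it is an assumption made for illustration: the function names,
the whitelist format (one word per line, "#" for comments), and the is_known
callback, which stands in for whatever hunspell or aspell binding would do the
actual dictionary lookup. It also assumes the pages parse as well-formed XML,
so plain HTML that is not XML-clean would need a different parser.

```python
import fnmatch
import os
import re
import xml.etree.ElementTree as ET

# Runs of letters (any script), optionally joined by apostrophes: "don't".
WORD_RE = re.compile(r"[^\W\d_]+(?:'[^\W\d_]+)*")
SKIPPED_TAGS = ("script", "style")


def find_pages(root, skip_dirs=(), skip_patterns=()):
    """Yield *.html and *.xhtml paths under root (requirement 2)."""
    for dirpath, dirnames, filenames in os.walk(root):
        # Editing dirnames in place stops os.walk from descending into them.
        dirnames[:] = sorted(d for d in dirnames if d not in skip_dirs)
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            if not name.endswith((".html", ".xhtml")):
                continue
            if any(fnmatch.fnmatch(path, pat) for pat in skip_patterns):
                continue
            yield path


def extract_text(markup):
    """Return text between tags plus alt values (requirement 4).

    Raises xml.etree.ElementTree.ParseError on malformed input, which is
    all that requirement 5 asks for.
    """
    root = ET.fromstring(markup)
    chunks = []
    for elem in root.iter():
        local_name = elem.tag.rsplit("}", 1)[-1]  # drop any XML namespace
        if local_name not in SKIPPED_TAGS and elem.text:
            chunks.append(elem.text)
        if elem.tail:
            chunks.append(elem.tail)
        if elem.get("alt"):
            chunks.append(elem.get("alt"))
    return " ".join(chunks)


def parse_whitelist(text):
    """Parse the hypothetical format: one word per line, '#' comments."""
    lines = (line.strip() for line in text.splitlines())
    return {line for line in lines if line and not line.startswith("#")}


def canonical_whitelist(words):
    """Serialise sorted and de-duplicated, to keep VCS diffs small."""
    return "".join(word + "\n" for word in sorted(set(words)))


def words_for(path, global_words, per_set):
    """Global whitelist plus every per-set whitelist whose glob matches path.

    per_set maps a glob pattern to a set of words; overlapping patterns
    simply contribute all of their words (requirement 3).
    """
    allowed = set(global_words)
    for pattern, words in per_set.items():
        if fnmatch.fnmatch(path, pattern):
            allowed |= set(words)
    return allowed


def misspellings(text, is_known, whitelist=frozenset()):
    """Sorted unique words that are neither whitelisted nor known."""
    found = WORD_RE.findall(text)
    return sorted({w for w in found if w not in whitelist and not is_known(w)})
```

A driver would loop over find_pages(), call extract_text() on each file's
contents, and pass the result to misspellings() together with the set returned
by words_for(). Supporting en_GB, en_US and he_IL would then only be a matter
of which dictionaries is_known consults.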

===============

Brief searches on Google and DuckDuckGo did not turn up anything, and the
only things of relevance on CPAN appear to be
https://metacpan.org/pod/Test::HTML::Spelling (which can handle one document
at a time, though I can still make use of it) and
https://metacpan.org/pod/Apache::AxKit::Language::SpellCheck (which is
Apache-specific).

Any pointers will be appreciated.

Regards,

        Shlomi Fish

-- 
-----------------------------------------------------------------
Shlomi Fish       http://www.shlomifish.org/
http://www.shlomifish.org/humour/ways_to_do_it.html

Chuck Norris doesn't celebrate holidays -- holidays celebrate Chuck Norris.
(By sevvie: http://sevvie.github.io/ .)
    — http://www.shlomifish.org/humour/bits/facts/Chuck-Norris/

Please reply to list if it's a mailing list post - http://shlom.in/reply .
_______________________________________________
Perl mailing list
Perl@perl.org.il
http://mail.perl.org.il/mailman/listinfo/perl
