Re: [PROPOSAL] improvements to site-scan.rb/site.cgi

Shane Curcuru Fri, 11 May 2018 12:20:02 -0700

New architecture:

- lib/whimsy/sitestandards.rb defines hashes for all types of checks as
regexes, as well as various utility functions to ease finding TLPs vs.
podlings.  In most (but not all) cases, site-scan.rb simply uses the
CHECK_CAPTURE regex to determine what a_href|a_text to capture (i.e. put
into the site-scan.json).


The original design thought is:
* CHECK_CAPTURE is a lax/broad regex that would be used to define the
text|link we want to store.  This will capture some items that might not
strictly meet a requirement as spec'd, but are close.

* CHECK_TEXT|CHECK_VALIDATE (for links) would be used in UI display, to
validate more strictly whether a captured value is SITE_PASS or
SITE_WARN.  This is useful because we have since broadened the capture
(as for security, described below) when it was clear that some related
text is actually good enough to pass, IMO.
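
A minimal sketch of the capture-then-validate split described above. The
constant names CHECK_CAPTURE, CHECK_VALIDATE and SITE_PASS/SITE_WARN/SITE_FAIL
come from this mail; the CHECKS hash layout, the 'license' regexes, and the
capture/grade helpers are hypothetical and only illustrate the idea, not
the actual code in lib/whimsy/sitestandards.rb.

```ruby
# Hypothetical sketch: constant names are from this mail, but the hash
# layout, regexes, and helper methods are assumptions for illustration.
CHECK_CAPTURE  = :capture
CHECK_VALIDATE = :validate
SITE_PASS = 'pass'
SITE_WARN = 'warn'
SITE_FAIL = 'fail'

CHECKS = {
  'license' => {
    # Lax: decides which link gets stored in the scan output.
    CHECK_CAPTURE  => /licen[sc]e/i,
    # Strict: decides whether the stored link fully meets the requirement.
    CHECK_VALIDATE => %r{\Ahttps?://www\.apache\.org/licenses/?\z}i
  }
}

# Scan step: keep the first link whose text matches the lax capture regex.
def capture(check, links)
  links.find { |l| l[:a_text] =~ CHECKS[check][CHECK_CAPTURE] }
end

# Display step: grade a captured link with the stricter validate regex.
def grade(check, link)
  return SITE_FAIL if link.nil?
  link[:a_href] =~ CHECKS[check][CHECK_VALIDATE] ? SITE_PASS : SITE_WARN
end
```

With this split, a link that is captured but does not strictly validate
shows up as a warning rather than disappearing from the data, so the
strict rule can be relaxed later without rescanning.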

- lib/whimsy/sitewebsite.rb defines 90% of the UI code to display data.

- www/site.cgi|pods.cgi are now just text output for descriptions and
calls to SiteWebsite to display the data, and should act exactly the
same.  The podling version also has a link to the podling status page.

- tools/site-scan.rb is commented and reorganized, with changes:

* most checks now use SiteStandards regexes (but not all)

* see the USAGE: minor update for improved functionality

Overall, this changes 18 websites to SITE_PASS the 'security' check
instead of SITE_FAIL, primarily because it allows a_text of "Security
Reports" instead of just "Security", which I think is justified since
the purpose of the link is certainly clear to users.
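
To illustrate the effect on the 'security' check, here is a rough sketch.
Both regexes and the security_status helper are assumptions made up for
this example; the real patterns live in lib/whimsy/sitestandards.rb and
may well differ.

```ruby
# Assumed patterns for illustration only; not the actual Whimsy regexes.
OLD_SECURITY_TEXT = /\Asecurity\z/i
NEW_SECURITY_TEXT = /\Asecurity( reports)?\z/i

# Hypothetical helper: grade the link text against a given pattern.
def security_status(a_text, pattern)
  a_text.to_s.strip =~ pattern ? 'SITE_PASS' : 'SITE_FAIL'
end

puts security_status('Security Reports', OLD_SECURITY_TEXT)
puts security_status('Security Reports', NEW_SECURITY_TEXT)
```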

-- 

- Shane
  Director & Member
  The Apache Software Foundation
