>> I believe you can use git for this.  Try
>>   $ git clean -n -x
> I didn't know this git command, neat!
> The problem we were discussing is different: it's about deleting
> HTML pages that have been published and that have no corresponding
> .org file anymore -- the way I do this for other projects of mine
> is to delete all HTML files and republish my project, but we don't
> want to take that route here...

Couldn't we just compare input and output?  Or is that not safe
enough?  E.g., in an oversimplified form, the dead pages could be
obtained with something like this:

#+BEGIN_SRC emacs-lisp
(let* ((html '("dir1/my-page1.html" 
       (org '("dir1/" 
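       (html-sans-extensions (mapcar #'file-name-sans-extension html))
       (org-sans-extensions (mapcar #'file-name-sans-extension org)))
  ;; Drop every HTML base name that still has an Org source; whatever
  ;; is left is a dead page, so put the ".html" extension back on.
  (mapcar (lambda (x) (concat x ".html"))
          (dolist (x org-sans-extensions html-sans-extensions)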
            (setq html-sans-extensions (remove x html-sans-extensions)))))
#+END_SRC

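The same comparison could also be run against the real directories
instead of hand-written lists.  A rough sketch, assuming the
publishing directory mirrors the source tree and every .org file is
published to an .html file of the same base name;
`my/dead-html-pages' is a made-up name, not something Org provides:

#+BEGIN_SRC emacs-lisp
;; Hypothetical helper, not part of Org.  Needs Emacs 25.1 or later
;; for `directory-files-recursively'.
(defun my/dead-html-pages (org-dir html-dir)
  "Return HTML files under HTML-DIR with no matching .org file in ORG-DIR.
File names are returned relative to HTML-DIR.  Assumes both trees
have the same layout."
  (let ((org-stems
         ;; Base names of all Org sources, relative to ORG-DIR.
         (mapcar (lambda (f)
                   (file-name-sans-extension
                    (file-relative-name f org-dir)))
                 (directory-files-recursively org-dir "\\.org\\'")))
        (dead '()))
    (dolist (f (directory-files-recursively html-dir "\\.html\\'"))
      (let ((rel (file-relative-name f html-dir)))
        ;; Keep the page only if no Org source shares its base name.
        (unless (member (file-name-sans-extension rel) org-stems)
          (push rel dead))))
    (nreverse dead)))
#+END_SRC

It only returns the list and deletes nothing, so the result can be
checked by hand first.  HTML files that never came from an .org file
(hand-written pages, generated indexes) would show up as dead too, so
this is probably still not "safe enough" on its own.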
