Re: [dev] sites linkrot

Christoph Lohmann Sun, 13 Jan 2013 01:23:00 -0800

Greetings.

On Sun, 13 Jan 2013 10:12:40 +0100 Kai Hendry <[email protected]> wrote:
> Hi guys,
> 
> Please rip this to shreds https://github.com/kaihendry/linkrot and
> perhaps guide me to a better script. Something that can do the http
> requests in parallel and hence much faster?
> 
> I ran it over sites/
> for i in *; do test -d "$i" || continue; linkrot $i > $i.linkrot; done
> 
> and the output is over here:
> http://s.natalian.org/2013-01-13/
> 
> 000 means the domain didn't resolve. Definitely have some false
> negatives, for e.g. on cat-v. I guess sites sometimes aren't working
> and the failures need to be counted/recorded and when it hits a
> threshold (e.g. 10 consecutive failures in 10 day daily check), only
> then an admin needs to manually intervene?
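
On the failure threshold: a minimal sketch of counting consecutive
failures per URL, so that only persistent breakage gets reported. The
names record_result, STATE_DIR and THRESHOLD are made up for this
example and are not part of linkrot. It assumes one small state file per
URL and treats any status other than 2xx/3xx as a failure.

```shell
# Hypothetical state layout: one file per URL under $STATE_DIR, holding
# the current number of consecutive failures for that URL.
STATE_DIR=${STATE_DIR:-./linkrot-state}
THRESHOLD=${THRESHOLD:-10}

# record_result URL STATUS
# Resets the counter on success (2xx/3xx), increments it otherwise.
# Returns 0 while the count is below $THRESHOLD, 1 once it is reached.
record_result() {
	url=$1 status=$2
	mkdir -p "$STATE_DIR" || return 2
	# Derive a filesystem-safe file name from the URL. Distinct URLs
	# can collide after this mapping; good enough for a sketch.
	file=$STATE_DIR/$(printf '%s' "$url" | tr -c 'A-Za-z0-9._-' '_')
	case "$status" in
	2??|3??) rm -f "$file"; return 0 ;;
	esac
	count=0
	[ -f "$file" ] && count=$(cat "$file")
	count=$((count + 1))
	printf '%s\n' "$count" > "$file"
	[ "$count" -lt "$THRESHOLD" ]
}
```

A daily run would call record_result once per checked link and print a
message only when it returns 1, so a site that is down for a day does
not bother anyone.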


Could you please make the output of your script more readable?

Something like

        $domain$path: Link to %s is not found.
        $domain$path: Link to %s does redirect to %s.
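
A minimal sketch of how such lines could be produced, assuming the
script already knows the page, the link, the HTTP status and, for
redirects, the target. The function name report_link and the two extra
messages (status 000 and the catch-all) are my own invention, not
something linkrot has today.

```shell
# Hypothetical helper: print one finding in the format proposed above.
# Arguments: page (domain + path), link URL, HTTP status,
#            redirect target (optional, only used for 3xx).
report_link() {
	page=$1 link=$2 status=$3 target=${4:-}
	case "$status" in
	2??) ;;	# link works, print nothing
	3??) printf '%s: Link to %s does redirect to %s.\n' \
		"$page" "$link" "$target" ;;
	404) printf '%s: Link to %s is not found.\n' "$page" "$link" ;;
	000) printf '%s: Link to %s got no response.\n' "$page" "$link" ;;
	*)   printf '%s: Link to %s failed with status %s.\n' \
		"$page" "$link" "$status" ;;
	esac
}

report_link suckless.org/index.md http://example.org/gone 404
```

If the status comes from curl, its -w option with %{http_code} and
%{redirect_url} might supply the last two arguments.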

The repeated sed strings force the human reader to parse the same text
over and over again, which makes the output tiresome to follow. Combined
with the unnatural error codes, this is like outputting long XML
subtrees, where many lines have to be read just to grasp simple
metadata.

Could you please adapt your script to be more readable? The sed commands
are useless, so don’t output them.


Thanks for your efforts.


Sincerely,

Christoph Lohmann
