Here are some sitescooper definition files I wish to contribute:

# Bastard Operator from Hell
URL: http://www.theregister.co.uk/content/30/index.html
  Name: BOFH
  Levels: 2
  StoryURL: /content/\d+/\d+\.html
  StoryCacheable: 1
  StoryUseTableSmarts: 0
  ContentsUseTableSmarts: 0
  ContentsStart: <IFRAME SRC=.http://ad.uk.doubleclick.net/
  ContentsEnd: <TD WIDTH="150" ALIGN="right" VALIGN="top">

  StoryHTMLPreProcess: {
    s/<DIV CLASS=.story_head.>(.*?)<\/DIV>/<H2 CLASS='story_head'>$1<\/H2>/is;
    s/<br>.<br><B>Related (?:[sS]tory|[sS]tories|[lL]ink|[lL]inks)<\/B>.*\Z//s;
    s/(?:<br>)+/<br>/gi;
    s/<br><p>(?:<br>)*/<p>/gi;
  }
  MinPages: 2
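
For anyone who wants to try the BOFH cleanup rules outside sitescooper, here is a rough Python translation of the StoryHTMLPreProcess block. It is only a sketch: the function name and the sample HTML are made up for illustration, the <br> rule is written as (?:<br>)+ on the assumption that it is meant to collapse runs of <br> tags, and the last two rules are applied to every occurrence, which is also an assumption.

```python
import re

RELATED = r"<br>.<br><B>Related (?:[sS]tory|[sS]tories|[lL]ink|[lL]inks)</B>.*\Z"


def preprocess_story(html: str) -> str:
    """Hypothetical stand-in for the BOFH StoryHTMLPreProcess block."""
    # Promote the story headline DIV to an H2 (first occurrence only).
    html = re.sub(r"<DIV CLASS=.story_head.>(.*?)</DIV>",
                  r"<H2 CLASS='story_head'>\1</H2>",
                  html, count=1, flags=re.I | re.S)
    # Drop the trailing "Related stories/links" box and everything after it.
    html = re.sub(RELATED, "", html, count=1, flags=re.S)
    # Assumption: collapse every run of <br> tags into a single <br>.
    html = re.sub(r"(?:<br>)+", "<br>", html, flags=re.I)
    # Remove <br> tags sitting directly around a paragraph break.
    html = re.sub(r"<br><p>(?:<br>)*", "<p>", html, flags=re.I)
    return html


# Invented sample page fragment, not taken from the real site.
sample = (
    '<DIV CLASS="story_head">Server room blues</DIV>'
    "<br><br><br><p><br>The PFY looks up."
    "<br>\n<br><B>Related stories</B><br>junk"
)

print(preprocess_story(sample))
```

On that sample the result is the H2 headline followed by a single <p> and the story text, with the "Related stories" tail removed.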


# Ditherati
URL: http://ditherati.com/
  Name: Ditherati
  StoryStart: <!--content-->
  StoryEnd: <!--/content-->


# Pigdog Journal
URL:            http://www.pigdog.org/pigdog.rdf
Name:           Pigdog Journal
Description:    The Online Handbook of Bad People of the Future
ContentsFormat: rss

StoryURL:       /.*\.s?html?
StoryStart:     <a href="mailto:[EMAIL PROTECTED]">Feedback</a><br>
StoryEnd:       <td background="images/rightborder.gif">
ContentsStart:  <item>
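
Since StoryURL is a regular expression, a bare dot matches any character, not just a literal ".". The sketch below compares a pattern with a bare dot before the extension against one with the dot escaped. It assumes the pattern is matched against the whole URL path, which is my reading and not checked against sitescooper's own matcher; the helper name and the paths are invented for the example.

```python
import re


def story_url_matches(pattern: str, path: str) -> bool:
    """Hypothetical helper: full-match a StoryURL-style pattern against a path."""
    return re.fullmatch(pattern, path) is not None


loose = r"/.*.s?html?"     # bare dot: any character may precede the extension
strict = r"/.*\.s?html?"   # escaped dot: a literal "." is required

# Invented paths, not real Pigdog URLs.
paths = ["/feature/2001/story.html", "/index.shtml", "/archive/nothtml"]

for path in paths:
    print(path, story_url_matches(loose, path), story_url_matches(strict, path))
```

Both patterns accept the .html and .shtml paths; only the bare-dot pattern also accepts /archive/nothtml.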


# TopFive
URL: http://www.topfive.com/html/left_defaultmasterborder.htm
Name: TopFive
Levels: 2

ContentsStart: <B>Today's Stuff:
ContentsEnd: <B>Previous Stuff:


-- 
Robert Edmonds
[EMAIL PROTECTED]

_______________________________________________
Sitescooper-talk mailing list
[EMAIL PROTECTED]
http://lists.sourceforge.net/lists/listinfo/sitescooper-talk
