>   We are interested in data mining the
> texts of these newspapers for changes that would be of interest to a
> social psychologist.  What I am particularly wanting advice on is the
> next step.  We want to filter out all the invariant and useless junk
> that you see in a newspaper's website.  We want the text of the
> articles and nothing else.

I think your best bet is to ask the sales department of the newspapers
for the texts.

Some newspapers offer print versions of their articles, which may
or may not be linked using URLs that follow a predictable pattern.
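
If a site does follow such a pattern, turning an article URL into its
print URL is mostly string work. A rough sketch in Python, assuming
purely for illustration that the print version is reached by adding a
print=1 query parameter. That parameter, the example host and the
function name to_print_url are all made up; the real pattern, if one
exists, has to be worked out per newspaper.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode


def to_print_url(article_url, param="print", value="1"):
    """Return article_url with a (hypothetical) print-view parameter set.

    The parameter name and value are assumptions; inspect the site's own
    "print this article" links to find the pattern it actually uses.
    """
    parts = urlsplit(article_url)
    # Keep existing query parameters, dropping any earlier print flag.
    query = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True)
             if k != param]
    query.append((param, value))
    return urlunsplit(parts._replace(query=urlencode(query)))


if __name__ == "__main__":
    print(to_print_url("http://news.example.com/story/123?ref=home"))
```

Sites that put the print version under a different path instead of a
query parameter would need a path rewrite rather than this.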


HTH, georg
