For those interested in tracking Indian news websites (I've seen so many
news article links break within just a year or two), you might want to
explore adding some URLs to this global tracker.

Cross-posting from
https://lists.okfn.org/pipermail/data-driven-journalism/2018-August/004855.html

From Kalev Leetaru:

For those interested in measuring the fluidity of the online sphere and in
particular how much online news coverage changes over time (from 404s to
redirects to title and text editing), we've released this morning the new
GDELT GDG, which recrawls every article monitored by GDELT again after 24
hours and after one week and catalogs all of the changes it observes. Text
changes record only changes to the article text itself, not the surrounding
page. Changes are reported at the "word" level for space-delimited
languages and at the character level for others (currently Burmese, Chinese,
Dzongkha, Japanese, Khmer, Laothian, Thai, Tibetan and Vietnamese, with
more being added shortly).
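
To make the word-level versus character-level distinction concrete, here is
a minimal Python sketch using the standard-library difflib module. It is
only an illustration and not GDELT's actual implementation; the helper names
tokenize and diff_tokens are hypothetical, and real word segmentation is
more involved than splitting on whitespace.

```python
import difflib


def tokenize(text, space_delimited=True):
    """Split text into words (space-delimited languages) or characters."""
    if space_delimited:
        return text.split()
    # Assumption: for other languages, compare character by character,
    # ignoring whitespace.
    return [ch for ch in text if not ch.isspace()]


def diff_tokens(old, new, space_delimited=True):
    """Return (operation, old_tokens, new_tokens) for each changed span."""
    a = tokenize(old, space_delimited)
    b = tokenize(new, space_delimited)
    matcher = difflib.SequenceMatcher(a=a, b=b, autojunk=False)
    return [
        (tag, a[i1:i2], b[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"  # keep only insertions, deletions, replacements
    ]


if __name__ == "__main__":
    # Word-level comparison of two versions of an English sentence.
    print(diff_tokens("Police arrested two suspects",
                      "Police arrested three suspects"))
    # Character-level comparison of two versions of a Japanese phrase.
    print(diff_tokens("東京で地震", "大阪で地震", space_delimited=False))
```

Running the same comparison on a 24-hour and a one-week recrawl of an
article would give two change lists that can be compared side by side.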

We're particularly excited about the ability to assess change globally
across countries and languages and at scale, across everything GDELT
monitors each day.

The resulting global change log is all open data and available in one
minute updates as JSON files, a BigQuery table, and an RSS feed for web
archives (allowing them to recrawl changed pages).

This is an alpha grade release, so you will undoubtedly find some rough
edges, but we're incredibly excited to see what people are able to do with
it!

https://blog.gdeltproject.org/announcing-the-gdelt-global-difference-graph-gdg-planetary-scale-change-detection-for-the-global-news-media/


You can also couple this with our global frontpage outlink monitoring (35
billion outlinks to 240 million unique URLs to date) to assess what percent
of homepage links are edited over time:

https://blog.gdeltproject.org/announcing-gdelt-global-frontpage-graph-gfg/
Kalev

--
Cheers,
Nikhil VJ
+91-966-583-1250
Pune, India
Website <http://nikhilvj.co.in>
DataMeet Pune chapter <https://datameet-pune.github.io/>
Self-designed learner at Swaraj University <http://www.swarajuniversity.org>
Payment / Contribute <https://nikhilvj.benow.in/pay>
