A nice article titled "Data journalism: how to find stories in
numbers" by Sandra Crucianelli.

I like this quote:

“Evidence suggests that data journalism is the journalism of the future”

"Data journalism: how to find stories in numbers" by Sandra Crucianelli.

Colleagues often ask me what data journalism is. They're confused by
why it needs its own name — don't all journalists use data?

The term is shorthand for 'database journalism' or 'data-driven
journalism', where journalists find stories, or angles for stories,
within large volumes of data.

It overlaps with investigative journalism in requiring lots of
research, sometimes against people's wishes. It can also overlap with
data visualisation, as it requires close collaboration between
journalists and digital specialists to find the best ways of
presenting data.

So why get involved with spreadsheets and visualisation tools? At its
most basic, adding data can give a story a new, factual dimension. But
delving into datasets can also reveal new stories, or new aspects to
them, that may not have otherwise surfaced.

Data journalism can also sometimes tell complicated stories more
easily or clearly than relying on words alone — so it's particularly
useful for science journalists.

It can seem daunting if you're trained in print or broadcast media.
But I'll introduce you to some new skills, and show you some excellent
digital tools, so you too can soon find your feet as a data
journalist.

Where to begin

As in all journalism, ideas for stories can come from many sources. A
statistic might not sound quite right, tempting you to look at the
data behind it. Or you might have a question to answer — how has
science funding changed in the UK, for example?

One way data journalism differs from other forms is that you may have
no inkling of the story until well after you start investigating. That
doesn't mean getting hold of any old data and expecting to find a
story — rather that the story is what the data tells you. This
presentation on The Guardian's Datablog gives an idea of the workflow
in data journalism.

So how do you choose what to delve into? It's good to familiarise
yourself with data types and sources in your 'beats' and when that
data might be released, just as you would know conference or journal
publication dates.

It's best to start small with your first data journalism projects,
particularly while you get used to the data processing and using all
the available tools. Your main challenge will probably be the time
needed to process data. Peter Aldhous, the New Scientist San Francisco
bureau chief, has produced a tutorial on how to approach science data
journalism projects, and The Data Journalism Handbook also has tips on
where to start.

Finding and accessing data

Data journalism experts say that journalists' roles are changing from
hunting and gathering scarce information to processing information in
'an age of abundance'.

Data might be abundant, but some types of data are easier to get hold
of than others. Governments are beginning to recognise the importance
of releasing data — including research findings — but this varies from
country to country, and even a government that believes in openness
may lack adequate systems for making data accessible.

Some nations, such as Kenya, proactively make data available, while in
others you'll have to ask — sometimes through systems such as India's
Right to Information Act.

International bodies such as the World Bank release data, and projects
such as Gapminder and Google Public Data Explorer collate data from
various organisations. For science/health journalists,
clinicaltrials.gov is a registry of clinical trial data. And
environment or earth science reporters can access information from the
US Geological Survey, for example.

You might even find some ready packaged data at your disposal. Data
Dredger, a collaboration between Internews and Kenya's open government
data initiative, provides links to Kenyan health reports and has
infographics on health topics you can download and use in stories.

And the web is full of data — finding it just requires honing your
search engine skills. Sometimes you can just search for a term plus
'data', or use a specialised academic search engine such as Google
Scholar or Scirus. 'Semantic' web resources, such as Wolfram|Alpha,
which search by extra data, not just the keywords within the page, are
also useful.

Google's advanced search allows you to narrow your results by domain
extension, helping you to search for academic or government data, and
file format — such as the Excel files in which you're most likely to
find tables of figures or statistics. Tables and graphics are often
uploaded as an image, so your data hunt should include Flickr and
Google Images.
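
The same narrowing can be typed straight into the search box with the
site: and filetype: operators. A minimal Python sketch of putting such
a query together follows; the build_query helper and the example terms
are hypothetical, chosen only for illustration.

```python
def build_query(terms, site=None, filetype=None):
    """Combine search terms with optional site: and filetype: operators."""
    parts = [terms]
    if site:
        # Restrict results to one domain or domain extension, e.g. gov.uk
        parts.append(f"site:{site}")
    if filetype:
        # Restrict results to one file format, e.g. xls for Excel files
        parts.append(f"filetype:{filetype}")
    return " ".join(parts)


# Hypothetical example: spreadsheets about science funding on UK government sites
query = build_query("science funding data", site="gov.uk", filetype="xls")
print(query)
```

Paste the printed string into the search engine; dropping either
operator widens the search again.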

You can even retrieve data that have been deleted from the web but
were 'cached' or saved as screenshots. Try the Internet Archive and
its Wayback Machine to recover old files or broken URLs.
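
If you need to check many broken links, the lookup can be scripted. The
sketch below assumes the Internet Archive's public availability
endpoint at archive.org/wayback/available and the JSON layout it is
documented to return; check the current documentation before relying
on either. The function names are hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed endpoint of the Wayback Machine availability service
API = "https://archive.org/wayback/available"


def availability_url(page_url, timestamp=None):
    """Build the lookup URL; timestamp is an optional YYYYMMDD string."""
    params = {"url": page_url}
    if timestamp:
        params["timestamp"] = timestamp
    return f"{API}?{urlencode(params)}"


def closest_snapshot(response):
    """Return the archived copy's URL from a decoded reply, or None."""
    closest = response.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest["url"]
    return None


def lookup(page_url, timestamp=None):
    """Ask the service for the snapshot closest to the given date."""
    with urlopen(availability_url(page_url, timestamp), timeout=30) as reply:
        return closest_snapshot(json.load(reply))
```

Calling lookup("example.com", "20130701") would request the snapshot
nearest to 1 July 2013 and return None if nothing was archived.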

Social media can also be a data source. Tools such as
SocialMention, 48ers, Twitterfall, Addictomatic, Boardreader and
Whostalkin allow you to make searches by name, subject, time and
geo-reference. An interesting example of social networks revealing
news is the Eye on the Bailout project of ProPublica, an investigative
journalism organisation, which has used social media mentions to alert
journalists to new data on what has happened to the US 2008 bank
bailout money.

Remember — it's good practice to link to, or state the sources of, your data.

Data handling

You've found the data, but can you use it? You'll need to import it
into a spreadsheet such as those in Excel or Google Drive, so download
data in a 'comma separated value', or CSV, format if possible.
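
A CSV file can also be read by a short script rather than a
spreadsheet. The sketch below uses Python's standard csv module; the
column names and figures are made up for illustration, and an
in-memory string stands in for the downloaded file.

```python
import csv
import io

# Illustrative data only. For a real download, replace this with
# open("yourfile.csv", newline="", encoding="utf-8").
raw = io.StringIO(
    "year,budget_millions\n"
    "2010,120.5\n"
    "2011,118.0\n"
    "2012,\n"
)

rows = []
for record in csv.DictReader(raw):
    value = record["budget_millions"].strip()
    rows.append({
        "year": int(record["year"]),
        # Keep missing figures as None rather than guessing a value
        "budget_millions": float(value) if value else None,
    })

print(rows)
```

Converting the text to numbers at this stage makes later sorting and
averaging behave as expected, and the None marks the gap in 2012 so it
is not mistaken for zero.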

You might have a table in a PDF file, or as a JPEG image file. Try a
file converter like Zamzar to get these into spreadsheets. Optical
character recognition software can also be a big help: a simple, free
one is Free OCR. As a last resort you may have to manually input data,
which is time-consuming and error-prone.

Wherever your data comes from, it probably needs 'cleaning' to make it
useful. This can mean anything from reorganising and deleting data you
don't need, to using tools such as OpenRefine (formerly Google Refine)
to make the data more consistent (watch the video tutorials for
guidance on what this cleaning can mean). Science journalists at least
should have access to well-kept scientific data that needs less
cleaning.
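
One common cleaning step is making spellings of the same name
consistent. The Python sketch below is a much simpler stand-in for what
a tool such as OpenRefine does, not a description of that tool; the
place names are illustrative and the normalise helper is hypothetical.

```python
from collections import Counter


def normalise(name):
    """Collapse whitespace and letter case so variants compare equal."""
    return " ".join(name.split()).casefold()


# Illustrative entries as they might appear in a hand-typed column
raw_names = ["Nairobi", " nairobi", "NAIROBI ", "Mombasa", "mombasa"]

counts = Counter(normalise(name) for name in raw_names)
print(counts)
```

Before cleaning, these five entries would be counted as five different
places; afterwards they are counted as two.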

You'll also need to start doing some basic processing. You might sort
data from smallest to largest or by location. You might be looking for
averages, or to join or compare two datasets.
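
Sorting, averaging and joining can each be done in a line or two of
Python. The sketch below uses made-up regional figures; both datasets
and their names are hypothetical.

```python
from statistics import mean

# Two illustrative datasets keyed by region
spending = {"North": 40.0, "South": 25.0, "East": 55.0}   # millions
population = {"North": 2.0, "South": 1.0, "West": 3.0}    # millions

# Sort regions from smallest to largest spending
ranked = sorted(spending.items(), key=lambda item: item[1])

# Average spending across regions
average = mean(spending.values())

# Join the two datasets on the regions they share
shared = spending.keys() & population.keys()
per_head = {region: spending[region] / population[region] for region in shared}

# Regions present in only one dataset deserve a second look
unmatched = sorted(spending.keys() ^ population.keys())

print(ranked, average, per_head, unmatched)
```

The unmatched list matters as much as the join itself: a region
missing from one dataset may be a spelling difference, or it may be a
real gap in the data.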

Treat data as a 'source': ask it questions as your audience might. And
ask it lots of questions — the answer might not be what you first
think. For example, a spreadsheet of journal retractions might suggest
rising fraud detection, but you also need to ask whether there are
other interpretations.

Think carefully about your results — do they sound plausible? It's
best to check and recheck calculations. Don't ruin your reputation for
a basic error.

You can strengthen your conclusions or pinpoint new questions with
simple statistical analyses. For example, you might spot more
catastrophic storms in your country each year for 20 years. But is
this a significant result or might it be chance natural variation?
Tools such as the R-Project and RStudio can help you judge that. You
might also want to check your conclusions with experts or other
experienced data journalists, particularly when you're starting out.

Presenting the data

Your presentation will depend on the story. There may be very little
to present; you could have slaved to get a single but important figure
to report in a conventional news piece — that your government has
spent half what it promised on science, for example.

Or you might use data visualisation as an integral part of the story.
This investigation from The Seattle Times in the United States combines
a written feature with supporting graphs, maps and source documents.
One is an interactive map; elements like this can be used within
larger stories and projects, or can be self-contained, like this
visualisation of the causes of death hosted by the UK newspaper The
Guardian.

Online tools such as Tableau Public and Many Eyes can visualise data
in various ways, while Google Fusion Tables, Geocommons and
Indiemapper produce good maps using longitude/latitude data or more
complex GIS data. Many of these tools also let you add an animation
layer to show timescales, for example.

Sometimes it's not just about presenting data, but letting your
audience see what it means to them. This ProPublica project shows
users whether their doctor receives drug company money, while this
Texas Tribune effort shows you how US public money is spent.

Going further, this Guardian project asks readers to help analyse data
on UK public spending. This kind of project, called a 'news app',
requires collaboration between journalists and programmers to design
and build applications that handle and analyse many variables within
big databases or across many datasets.

I've been involved in a news app at Argentina's La Nación newspaper as
part of my Knight International Journalism Fellowship. It uses
national census information from 2001 and 2010, letting people explore
how demographics have changed in their areas.

The website Information is Beautiful has examples of creative data
visualisation, and shows how working with your publication's digital
or graphics team can be productive.

You may need to persuade your editors to make time for data
journalism. This gets easier when you see results, and this report
(which I co-authored) on integrating data journalism into newsrooms
might also help.

It might seem like a big ask, but evidence suggests that data
journalism is the journalism of the future. If you can invest the
time, you'll not only get better stories but you'll better serve your
audience and the public interest.



http://www.scidev.net/global/journalism/practical-guide/data-journalism-how-to-find-stories-in-numbers.html?utm_medium=email&utm_source=SciDev.Net&utm_campaign=2762015_130701+Newsletter&dm_i=1SCG,1N76N,AVIF2X,5QW34,1


www.injube.blogspot.com
