Hi Chris,

However, making sense of this data is very, very time consuming, not to mention that writing and maintaining bots (we now have hundreds and hundreds of them) to scrape jurisdictions that aren't open data (the vast majority) takes significant resources, and we don't see any way of sustaining this on a CC-BY licence.
a) is the code for these bots somewhere?
b) we hope to find a way to maintain it this time. DBpedia has received funding via http://smartdataweb.de/ and also http://aligned-project.eu/. We are also currently working on a charter for a non-profit association that is committed to keeping all data open under CC-BY (we are accepting donations and membership fees, among other things).

I could also write a book about corporate identifiers, and the issues with those on the list (but don't have time).
We are writing such a book* in parallel, do you want to help?
Sebastian

*= well it's just a paper


On 05.11.2015 19:18, Chris Taggart wrote:
Rolf etc

Thanks for cc'ing me. We'd had contact from Sebastian and given him an API key. The main issues here are sustainability and domain knowledge. We'd love more people to be downloading the open datasets from the UK and others, and using them in all sorts of innovative ways, and the main reason we do the Open Company Data Index <http://registries.opencorporates.com/> is to motivate company registers to open up their data (I was speaking at the Open Govt Partnership Summit in Mexico City last week on the same subject). However, making sense of this data is very, very time consuming, not to mention that writing and maintaining bots (we now have hundreds and hundreds of them) to scrape jurisdictions that aren't open data (the vast majority) takes significant resources, and we don't see any way of sustaining this on a CC-BY licence.

Finally, there are very few registers that are under CC-BY licences or less restrictive terms (for example, Denmark places restrictions on use for marketing), even ignoring DPA issues (we are now spending a considerable amount on legal fees on this issue). I could also write a book about corporate identifiers, and the issues with those on the list (but don't have time).

So, we'd love to see more activity in the area, particularly in Germany – where the Handelsregister and Bundesanzeiger are very definitely not open data ;-)

Chris

On 5 November 2015 at 12:49, Rolf Kleef <r...@openforchange.info> wrote:

    Hi Sebastian, Kay,

    If you haven't done it yet, I suggest getting in touch with Chris
    Taggart of Open Corporates (cc'd). He has years of experience doing
    this, and is also involved in cross-standards work on "organisational
    identifiers", crucial in the development of for instance the Open
    Contracting Data Standard and the International Aid Transparency
    Initiative:

    http://www.open-contracting.org/
    http://iatistandard.org/201/organisation-identifiers/

    ~~Rolf.

    On 03/11/15 16:17, Sebastian Hellmann wrote:
    > [Apologies for cross-posting]
    >
    > Dear all,
    > this message is part announcement of an open data initiative and part
    > call for feedback and support.
    >
    > We are considering creating a free, open and interoperable dataset on
    > companies and organisations, which we are planning to integrate into
    > DBpedia+ and offer as a dump download. As we are in a very early phase
    > of the endeavour, we would like to know whether there is existing work
    > in this area.
    >
    > We are looking for any available datasets which have information about
    > companies and other organizations in any language and any country.
    > Ideally, the datasets are:
    > 1. downloadable as dump
    > 2. openly licensed, e.g. CC-BY following the http://opendefinition.org/
    > 3. in an easily parseable format, e.g. RDF or CSV and not PDF
    >
    > But hey! Send around anything you know, and we will look at it and see
    > whether we can make use of it. You can reach us either by replying to
    > this email or by sending feedback directly to me and Kay Müller
    > <kay.muel...@informatik.uni-leipzig.de>.
    > If you have any private/closed data, please contact us as well. We might
    > use it to cross-reference and validate public/open data, or just
    > learn from it to build a good scheme.
    >
    > We started a link collection here (and attached the current status at
    > the end of this email):
    > https://docs.google.com/document/d/1IaWSSt4_SZVhypvB1QzBlCtBuMQHv-q5Ti0n8xoZFIQ/edit
    > We also started to collect potential identifiers for linking here:
    > https://docs.google.com/spreadsheets/d/1EMqemA1BlqvyOXGLzYbvY0IcBCAhaRd5XgYLMWIxGsA/edit#gid=0
    >
    > Regards and thank you for any support on this,
    > Sebastian and Kay
    >
    > ##############################
    >
    > https://docs.google.com/document/d/1IaWSSt4_SZVhypvB1QzBlCtBuMQHv-q5Ti0n8xoZFIQ/edit
    >
    > Open Company Data
    >
    > Open Company Data
    > <https://docs.google.com/document/d/1IaWSSt4_SZVhypvB1QzBlCtBuMQHv-q5Ti0n8xoZFIQ/edit#heading=h.buuo7dypfd9a>
    > Identifiers for companies/organisation
    > <https://docs.google.com/document/d/1IaWSSt4_SZVhypvB1QzBlCtBuMQHv-q5Ti0n8xoZFIQ/edit#heading=h.qs150ivpio94>
    > URIs (Linked Data/Semantic Web)
    > <https://docs.google.com/document/d/1IaWSSt4_SZVhypvB1QzBlCtBuMQHv-q5Ti0n8xoZFIQ/edit#heading=h.b9yeovqjeglz>
    > Downloadable Datasets with Company info (confirmed)
    > <https://docs.google.com/document/d/1IaWSSt4_SZVhypvB1QzBlCtBuMQHv-q5Ti0n8xoZFIQ/edit#heading=h.7ihxrlrypp14>
    > Portals with no bulk downloads
    > <https://docs.google.com/document/d/1IaWSSt4_SZVhypvB1QzBlCtBuMQHv-q5Ti0n8xoZFIQ/edit#heading=h.a95o85lqil72>
    > Portals, we will still need to investigate
    > <https://docs.google.com/document/d/1IaWSSt4_SZVhypvB1QzBlCtBuMQHv-q5Ti0n8xoZFIQ/edit#heading=h.p50bjh96q3ok>
    >
    > Identifiers for companies/organisation
    >
    > Table with identifiers:
    > https://docs.google.com/spreadsheets/d/1EMqemA1BlqvyOXGLzYbvY0IcBCAhaRd5XgYLMWIxGsA/edit#gid=0
    >
    > URIs (Linked Data/Semantic Web)
    >
    >   * DBpedia/Wikipedia/Wikidata URIs - http://dbpedia.org
    >   * LinkedGeoData - http://linkedgeodata.org/
    >
    > Downloadable Datasets with Company info (confirmed)
    >
    >   * VIAF - http://viaf.org/viaf/data/
    >   * DBpedia - http://downloads.dbpedia.org/current/core/
    >   * Wikidata - http://downloads.dbpedia.org/current/ext/wikidata/
    >   * LinkedGeoData - http://downloads.linkedgeodata.org/releases/
    >   * Company Data Index: http://index.okfn.org/dataset/companies/
    >       o e.g. UK company data: http://download.companieshouse.gov.uk/en_output.html
    >
    > Portals with no bulk downloads
    >
    >   * https://opencorporates.com/
    >   * http://registries.opencorporates.com/
    >
    > Portals, we will still need to investigate
    >
    >   * https://www.wlw.de/
    >   * https://www.crunchbase.com
    >   * http://data.crunchbase.com/v3/page/crunchbase-open-data-map-odm
    >   * http://www.industrystock.de
    >   * http://www.ebr.org/
    >   * https://simfin.com/data/browse/companies
    >   * http://c-lei.org/
    >   * http://data.imf.org/
    >   * http://worldbank.270a.info/.html
    >   * http://datacatalog.worldbank.org/
    >   * http://www.europages.com/
    >   * http://www.sec.gov/data
    >   * http://faculty.philau.edu/russowl/industry.html
    >   * USA: http://www.corporationwiki.com/
    >   * India: http://www.companywiki.in/
    >   * Handelsregister: http://www.Handelsregister.de
    >   * Creditreform: http://www.creditsafetrial.com/de/?country=DE
    >   * Bürgel: https://www.buergel.de/en
    >   * Factiva: https://global.factiva.com/factivalogin/login.asp?productname=global
    >
    > Interesting Links:
    >
    >   * German: http://get.torial.com/blog/2014/02/die-besten-quellen-fuer-wirtschaftsjournalisten-teil-1/
    >   * http://get.torial.com/blog/2014/02/die-besten-quellen-fuer-wirtschaftsjournalisten-teil-2/
    >
    > --
    > Sebastian Hellmann
    > AKSW/KILT research group
    > Institute for Applied Informatics (InfAI) at Leipzig University
    > DBpedia Association
    > Events:
    > * *Nov 20th, 2015* Extended Deadline for Quality Management of Semantic
    > Web Assets (Data, Services and Systems)
    > <http://www.semantic-web-journal.net/blog/call-papers-special-issue-quality-management-semantic-web-assets-data-services-and-systems>
    > Come to Germany as a PhD: http://bis.informatik.uni-leipzig.de/csf
    > Projects: http://dbpedia.org, http://nlp2rdf.org,
    > http://linguistics.okfn.org, https://www.w3.org/community/ld4lt
    > Homepage: http://aksw.org/SebastianHellmann
    > Research Group: http://aksw.org
    > Thesis:
    > http://tinyurl.com/sh-thesis-summary
    > http://tinyurl.com/sh-thesis

    --
    Rolf Kleef                Open for Change, network for open development
    r...@openforchange.info
    +31617232772 @rolfkleef
    www.openforchange.info

    Internet trailblazer. Weaving the web to help humanity. Implementing
    open data, open organisations and online collaboration in civil
    society.




--
-------------------------------------------------------
OpenCorporates :: The Open Database of the Corporate World http://opencorporates.com
OpenlyLocal :: Making Local Government More Transparent http://openlylocal.com
Blog: http://countculture.wordpress.com
Twitter: http://twitter.com/CountCulture


