Re: [Dbpedia-discussion] Fwd: Hello DBPedia!

2014-04-13 Thread Dimitris Kontokostas
Hi Roberto,

You can define constant mappings in the mappings wiki [1]. For example in
the actor mapping you can define

{{ConstantMapping | ontologyProperty = occupation | value = Actor }}

and everyone will get an additional occupation Actor. We have a
deduplication step, so don't worry if it gets extracted twice ;)

Cheers,
Dimitris


[1]
http://mappings.dbpedia.org/index.php/How_to_edit_DBpedia_Mappings#Constant_Mappings





-- 
Kontokostas Dimitris
--
Put Bad Developers to Shame
Dominate Development with Jenkins Continuous Integration
Continuously Automate Build, Test & Deployment
Start a new project now. Try Jenkins in the cloud.
http://p.sf.net/sfu/13600_Cloudbees
___
Dbpedia-discussion mailing list
Dbpedia-discussion@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dbpedia-discussion


Re: [Dbpedia-discussion] Fwd: Hello DBPedia!

2014-04-12 Thread Kingsley Idehen

On 4/11/14 6:50 PM, Paul Houle wrote:

One thing to watch out for is that many people have types that are
correct but strange.  Looking at this report

http://basekb.com/subjectiveEye/typeReport/linkBased/Athlete.html

Is there a particular reason why you don't expose DBpedia URIs in that
page? It would make a world of difference :-)


--

Regards,

Kingsley Idehen 
Founder & CEO
OpenLink Software
Company Web: http://www.openlinksw.com
Personal Weblog: http://www.openlinksw.com/blog/~kidehen
Twitter Profile: https://twitter.com/kidehen
Google+ Profile: https://plus.google.com/+KingsleyIdehen/about
LinkedIn Profile: http://www.linkedin.com/in/kidehen









Re: [Dbpedia-discussion] Fwd: Hello DBPedia!

2014-04-11 Thread Paul Houle
One thing to watch out for is that many people have types that are
correct but strange.  Looking at this report

http://basekb.com/subjectiveEye/typeReport/linkBased/Athlete.html

we see Bob Hope is the #2 Athlete of all time.  Well,  perhaps Bob
Hope is in a class by himself,  but you don't expect him to be a
member of that class -- except he is,  because he was a boxer early in
his career.  Other types have this problem (particularly cities),  but
because people often do different things,  and because types like
actor and musician are "duck types" (i.e. you are an actor because
you acted in something),  you get unexpected results.  For instance,  I
think "Bodybuilder",  "Actor" and "Politician" are all acceptable
one-word descriptions for Arnold Schwarzenegger,  but you wouldn't
expect "Author" (even though he co-wrote some influential books) or
"Musician" (because his voice is on some fitness recordings).

Anyway,  I went and did my thing with the Amazon cloud and produced
the type assignment file that I promised.  If you go to

s3://basekb-misc/0.9/freebaseTypesForDbpedia

you will find that directory has 12 files in it,  and those files
together hold the Freebase type assignments.  You can do queries like

prefix : <http://rdf.basekb.com/ns/>

select (COUNT(?s) AS ?count) {
   ?s a :people.person
}

and get much better accuracy than with the DBpedia ontology types.  That
bucket is public access and you should be able to reach it with s3cmd,
S3 Browser or any other S3 client.





-- 
Paul Houle
Expert on Freebase, DBpedia, Hadoop and RDF
(607) 539 6254    paul.houle on Skype    ontolo...@gmail.com



Re: [Dbpedia-discussion] Fwd: Hello DBPedia!

2014-04-10 Thread Kingsley Idehen

On 4/9/14 6:53 PM, Pablo N. Mendes wrote:



I very much like the "contributing back" aspect of this. Thanks for
offering! One problem is that some pages have no template, making it 
impossible to use the template-type mappings defined on the mappings wiki.


Other people have implemented type inferencing from categories, lists 
and even from the text.


Others, by cross-referencing with Freebase, Cyc, etc.

I am wondering if the type statements obtained through all these 
approaches should not be imported back to DBpedia through some 
semiautomatic curation method (read mappings wiki beyond templates).


I guess we could also use the wiki, and allow people to also add 
mappings for Lists, Categories, Tables, and other features generated 
by these approaches?


Cheers
Pablo



Or folks could just publish their mapping documents from a Web 
accessible URL :-)


--

Regards,

Kingsley Idehen 
Founder & CEO
OpenLink Software
Company Web: http://www.openlinksw.com
Personal Weblog: http://www.openlinksw.com/blog/~kidehen
Twitter Profile: https://twitter.com/kidehen
Google+ Profile: https://plus.google.com/+KingsleyIdehen/about
LinkedIn Profile: http://www.linkedin.com/in/kidehen









Re: [Dbpedia-discussion] Fwd: Hello DBPedia!

2014-04-10 Thread Marco Fossati
Hi Roberto,

Do you need multilingual support for your app?
If so, mapping infobox properties in different languages would be the 
way to go.
Otherwise, raw infobox properties may be enough. You can find them under
the http://dbpedia.org/property namespace.
See my replies below for your examples.

On 4/9/14, 9:09 PM, Roberto Alsina wrote:
> For example: most actors don't have occupation::Actor.
http://dbpedia.org/property/occupation
> Or, publicly
> traded companies (example: Microsoft) have a "Traded as" field in their
> infoboxes but no matching data in DBPedia.
http://dbpedia.org/property/tradedAs

> For the latter, adding mappings in
> http://mappings.dbpedia.org/index.php/Main_Page should be enough, right?
Yep, if you want more homogeneous data in general and support for
multiple languages.
Hope this helps!
-- 
Marco Fossati
http://about.me/marco.fossati
Twitter: @hjfocs
Skype: hell_j



Re: [Dbpedia-discussion] Fwd: Hello DBPedia!

2014-04-10 Thread Roberto Alsina
Thanks everyone for all the awesome answers. You surely have given me a lot
of links to follow and a lot of things I need to learn about!
I'll take a few days to digest all the information and finish some pending
tasks, and then I'll get back to this.

One thing I did in our copy of the data was deduce some extra properties
from existing data.

For example, if there are 3 or more "starring" properties pointing at the
same person, I added an occupation::actor to that person. Maybe there could
be some way to automate that process (although this does mark the Dalai
Lama as an actor ;-)
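
A minimal sketch of that rule, assuming the data is available as plain (subject, predicate, object) tuples. The sample triples, the shortened identifiers and the function name `infer_actor_occupations` are made up for illustration and are not taken from DBpedia or from the copy of the data described above.

```python
from collections import Counter

# Hypothetical sample data; identifiers are placeholders, not real DBpedia URIs.
TRIPLES = [
    ("film:A", "starring", "person:X"),
    ("film:B", "starring", "person:X"),
    ("film:C", "starring", "person:X"),
    ("film:D", "starring", "person:Y"),
]


def infer_actor_occupations(triples, threshold=3):
    """Return new (person, "occupation", "Actor") triples for every person
    who is the object of at least `threshold` distinct "starring" statements."""
    distinct = set(triples)  # ignore exact duplicate statements
    counts = Counter(obj for _, pred, obj in distinct if pred == "starring")
    inferred = []
    for person, n in sorted(counts.items()):
        candidate = (person, "occupation", "Actor")
        # Skip people who already carry the occupation.
        if n >= threshold and candidate not in distinct:
            inferred.append(candidate)
    return inferred


print(infer_actor_occupations(TRIPLES))
```

The threshold is a blunt instrument: raising it trades missed actors for fewer surprises of the Dalai Lama kind, so the inferred triples are probably best kept in a separate file that can be reviewed before merging.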




Re: [Dbpedia-discussion] Fwd: Hello DBPedia!

2014-04-10 Thread Paul Houle
It's more accurate to say that the Infovore software is a bridge
between Freebase,  DBpedia,  and other RDF data sets.  It merges data
sets in a batch job and creates data sets that are normalized.  I think
also it

:BaseKB is a family of products produced from Freebase and DBpedia
data using the Infovore software.  The main product,  :BaseKB Now,  is
a cleaned up version of the Freebase RDF dump that is much easier to
work with than the raw dump.  :BaseKB is similar to DBpedia and could
be used as a substitute in many applications,  but DBpedia has some
unique and valuable information that is not in Freebase.

As for vocabulary conversion,  I get asked about that a lot.  One reason
I haven't done it is that every transformation you do to data risks
messing things up,  and the data quality issues are up in the air enough
that a half-baked effort at conversion will cause more problems than
it solves.  If you keep the vocabularies separate,  you can query
Freebase's opinion and query DBpedia's opinion and know things haven't
been made worse.

The mapping process would be done one predicate at a time and would
probably be guided by how prevalent the predicates are.  Some of the
predicates are going to be easy to process (just rewrite them) but
other ones might need more work if compound value types are involved
or if the types are literal types that have a system- and
domain-dependent meaning that needs to be preserved (is it feet or meters?)
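
As a rough sketch of that process, assuming statements are plain (subject, predicate, object) tuples: rank predicates by how often they occur, then apply hand-written rewrite rules one predicate at a time. Every predicate name below, the `ex:` target vocabulary and the feet-to-meters rule are hypothetical placeholders, not real Freebase or DBpedia terms.

```python
from collections import Counter

# Hypothetical source statements; predicate names are invented.
STATEMENTS = [
    ("m.01", "fb:person.date_of_birth", "1947-07-30"),
    ("m.02", "fb:person.date_of_birth", "1903-05-29"),
    ("m.03", "fb:ship.length_ft", "269"),
    ("m.01", "fb:person.nickname", "Arnie"),
]

# One rule per predicate: (target predicate, value converter).
RULES = {
    # Easy case: just rewrite the predicate.
    "fb:person.date_of_birth": ("ex:birthDate", str),
    # Unit-dependent literal: convert feet to meters explicitly.
    "fb:ship.length_ft": ("ex:lengthInMeters", lambda v: round(float(v) * 0.3048, 2)),
}


def predicates_by_prevalence(statements):
    """List predicates, most frequent first, to decide what to map next."""
    return [p for p, _ in Counter(p for _, p, _ in statements).most_common()]


def rewrite(statements, rules):
    """Apply the rules; statements without a rule are returned untouched."""
    mapped, unmapped = [], []
    for s, p, o in statements:
        if p in rules:
            target, convert = rules[p]
            mapped.append((s, target, convert(o)))
        else:
            unmapped.append((s, p, o))
    return mapped, unmapped


print(predicates_by_prevalence(STATEMENTS))
mapped, unmapped = rewrite(STATEMENTS, RULES)
print(mapped)
print(unmapped)
```

Keeping the unmapped statements in their original vocabulary, rather than guessing, matches the concern above that a half-finished conversion can make things worse.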

It might also be useful to map to some third vocabulary.  I know
people would like to see DBpedia and Freebase through schema.org eyes
and I think that would be a good idea.

Common types and properties will get handled quickly but if somebody
is interested in some vertical,  say boats (20,000 known in Freebase)
they probably personally will need to do the work to figure things
out.  For instance,  Freebase is missing a lot of facts about boats
that are in DBpedia.  A union database will benefit from that,  and
there ought to be some community process where those fixes can be
expressed as rules and added to the system.







Re: [Dbpedia-discussion] Fwd: Hello DBPedia!

2014-04-10 Thread Kingsley Idehen

On 4/10/14 2:37 PM, Paul Houle wrote:

It's more accurate to say that the Infovore software is a bridge
between Freebase,  DBpedia,  and other RDF data sets.  It merges data
sets in a batch job and creates data sets that are normalized.


Yep!


I think
also it

:BaseKB is a family of products produced from Freebase and DBpedia
data using the Infovore software.  The main product,  :BaseKB Now,  is
a cleaned up version of the Freebase RDF dump that is much easier to
work with than the raw dump.  :BaseKB is similar to DBpedia and could
be used as a substitute in many applications,  but DBpedia has some
unique and valuable information that is not in Freebase.


No need for the replacement pitch. You have a curation-branded dataset 
culled from DBpedia, Freebase etc... Once loaded, this dataset can serve 
many useful purposes in conjunction with existing DBpedia and Freebase data.




As for vocabulary conversion I get asked about that a lot.  One reason
I haven't done it is that every transformation you do to data risks
messing things up and the data quality issues are up in the air enough
that a half-baked effort at conversion will cause more problems than
it solves.


Mapping at the definitions level (data dictionary, schema, vocabulary, 
ontology) has more power and longevity than at the instance data level.


TBox (entity type definitions) & RBox (entity relation type
definitions) driven tours are eternally superior to ABox driven tours
across the Linked Open Data cloud :-)



If you keep the vocabularies separate,  you can query
Freebase's opinion and query DBpedia's opinion and know things haven't
been made worse.


The TBox, RBox, and ABox relations should always be loosely coupled. 
Conflation is our worst enemy in the Data Economy.




The mapping process would be done one predicate at a time and would
probably be guided by how prevalent the predicates are.  Some of the
predicates are going to be easy to process (just rewrite them) but
other ones might need more work if compound value types are involved
or if the types are literal types that have a system- and
domain-dependent meaning that needs to be preserved (is it feet or meters?)

It might also be useful to map to some third vocabulary.  I know
people would like to see DBpedia and Freebase through schema.org eyes
and I think that would be a good idea.


Yes, there should be many of these, all loosely coupled.



Common types and properties will get handled quickly but if somebody
is interested in some vertical,  say boats (20,000 known in Freebase)
they probably personally will need to do the work to figure things
out.  For instance,  Freebase is missing a lot of facts about boats
that are in DBpedia.  A union database will benefit from that,  and
there ought to be some community process where those fixes can be
expressed as rules and added to the system.


Yes, and there is value here for those who want functional business 
models in the Data Economy.



Kingsley








Re: [Dbpedia-discussion] Fwd: Hello DBPedia!

2014-04-10 Thread Andy Mabbett
On 9 April 2014 20:09, Roberto Alsina roberto.als...@canonical.com wrote:

 For example: most actors don't have occupation::Actor. Or, publicly traded
 companies (example: Microsoft) have a "Traded as" field in their infoboxes
 but no matching data in DBPedia.

[Resending to list; apologies to Roberto]

I've done some work to get things like people's occupation (or reason
for notability) and gender added to their infobox on the English
Wikipedia, but have met a lot of resistance (details on request), so
it's slow going. An example of success is the role "alpine skier" and
the gender symbol in the infobox on:

   https://en.wikipedia.org/wiki/Tina_Maze

I've also done a considerable amount of work to deploy sub-templates
in infoboxes, to improve machine-readability and data granularity, for
things like dates, and multiple values, about which I've posted here
from time to time. Again, there is resistance in some quarters, but
we've had more successes there.

I'm always interested to hear about how these are or are not useful
and what else Wikipedians could do to improve the reusability of our
content.

-- 
Andy Mabbett
@pigsonthewing
http://pigsonthewing.org.uk



Re: [Dbpedia-discussion] Fwd: Hello DBPedia!

2014-04-09 Thread Paul Houle
The type assignments in DBpedia are very precise (few false
statements) but not accurate in the sense that recall is poor;  many
things fall through the cracks.  The real problem is that the
mappings are the map,  not the territory.  Wikipedia is an
encyclopedia for humans,  not for machines,  so DBpedia has to parse
whatever unsane markup they give us.
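
To make the precision/recall distinction concrete, here is a small worked example. The numbers are invented for illustration (10 entities that really are actors, of which the extraction types only 4, all correctly) and are not measurements of DBpedia.

```python
def precision_recall(predicted, actual):
    """Precision: share of predicted items that are correct.
    Recall: share of actual items that were found."""
    predicted, actual = set(predicted), set(actual)
    true_positives = len(predicted & actual)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(actual) if actual else 0.0
    return precision, recall


# Invented example data: person:0 .. person:9 are all actors in reality,
# but only person:0 .. person:3 received the type.
actual_actors = {f"person:{i}" for i in range(10)}
typed_as_actor = {f"person:{i}" for i in range(4)}

precision, recall = precision_recall(typed_as_actor, actual_actors)
print(precision, recall)  # 1.0 0.4
```

Every assigned type is right (precision 1.0), yet six of the ten actors are missing (recall 0.4), which is the "few false statements, many things fall through the cracks" situation described above.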

 Systems like Wikidata and Freebase can be edited by machines and
human ontologists and get better recall for types.

http://basekb.com/

is a conversion from Freebase to industry standard RDF.  You could
use :BaseKB as a substitute for DBpedia,  but DBpedia has advantages
too because in addition to the 4 million things important enough to be
in DBpedia,  there are another 37 million unimportant things in :BaseKB
that matter only to librarians,  video store clerks and professional
discographers.

  These unimportant things will drive you crazy unless you master
them,  and the easiest way to turn down the noise is to restrict
search to the 4 million things.

  I could make you an RDF file that has statements such as

?dbpediaTopic a ?freebaseType .

  you could load that together with the rest of DBpedia.  That would
get you a long way towards good lists.  The trouble at this point is
that you don't have the freebase types connected to the DBpedia types
so you can't join them against the schema to find properties and such.
 Mapping the types to the DBpedia types would not be that hard either,
 since the two systems are well aligned.  Then you get something that
looks like DBpedia but has more accurate types.

   Freebase has more accurate and better populated data for things
like ticker symbols,  geo-coordinates,  genders,  birth dates and the
like.  It would not be hard to rewrite Freebase statements to

?dbpediaTopic ?freebasePredicate ?anotherDbpediaTopic .

  and that would produce something that would be remarkably user friendly.
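
A minimal sketch of that rewrite, assuming a lookup table from Freebase identifiers to DBpedia resources is already available. The identifiers, predicate names and the `SAME_AS` table are hypothetical placeholders; statements are kept only when the subject (and the object, if it is a topic rather than a literal) has a DBpedia counterpart, which also restricts the result to the topics DBpedia knows about.

```python
# Hypothetical Freebase-id -> DBpedia-resource links.
SAME_AS = {
    "fb:m.0abc": "dbr:Microsoft",
    "fb:m.0def": "dbr:Bill_Gates",
}

# Hypothetical Freebase statements; literals are written with quotes.
FREEBASE = [
    ("fb:m.0abc", "fb:organization.founders", "fb:m.0def"),
    ("fb:m.0abc", "fb:business.ticker_symbol", '"MSFT"'),
    ("fb:m.0abc", "fb:organization.board_member", "fb:m.0zzz"),  # object not linked
    ("fb:m.0zzz", "fb:type.name", '"not in DBpedia"'),           # subject not linked
]


def is_topic(term):
    """In this sketch, anything with the fb: prefix is a topic, not a literal."""
    return term.startswith("fb:")


def to_dbpedia_topics(statements, same_as):
    """Rewrite subjects and topic objects to DBpedia resources and keep the
    Freebase predicate unchanged; drop statements that cannot be fully linked."""
    rewritten = []
    for s, p, o in statements:
        if s not in same_as:
            continue
        if is_topic(o):
            if o not in same_as:
                continue
            o = same_as[o]
        rewritten.append((same_as[s], p, o))
    return rewritten


for triple in to_dbpedia_topics(FREEBASE, SAME_AS):
    print(triple)
```

The predicates stay in the Freebase vocabulary on purpose, so the output follows the `?dbpediaTopic ?freebasePredicate ?anotherDbpediaTopic` shape without attempting any vocabulary conversion.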



On Wed, Apr 9, 2014 at 3:09 PM, Roberto Alsina
roberto.als...@canonical.com wrote:
 Hi!

 First, a brief introduction. My name is Roberto Alsina, and my team at
 Canonical is using DBpedia in the upcoming Ubuntu Touch phone operating
 system to improve a suggestions engine.

 What we are doing is, when a user searches for something, we look it up in
 Wikipedia, and then use the entity name in DBpedia to get properties, which
 we then associate with different results.

 For example:

 User types Metallica => Wikipedia matches => DBpedia says Type::band =>
 We suggest searching in Grooveshark and YouTube

 All in all, the approach works remarkably well, but we are finding some
 missing mappings, and we'd like to help improve DBpedia :-)

 For example: most actors don't have occupation::Actor. Or, publicly traded
 companies (example: Microsoft) have a "Traded as" field in their infoboxes
 but no matching data in DBpedia.

 For the latter, adding mappings in
 http://mappings.dbpedia.org/index.php/Main_Page should be enough, right?

 Looking forward to working on this :-)







-- 
Paul Houle
Expert on Freebase, DBpedia, Hadoop and RDF
(607) 539 6254    paul.houle on Skype    ontolo...@gmail.com



Re: [Dbpedia-discussion] Fwd: Hello DBPedia!

2014-04-09 Thread Kingsley Idehen

On 4/9/14 4:53 PM, Paul Houle wrote:

  The type assignments in DBpedia are very precise (few false
statements) but not accurate in the sense that recall is poor;  many
things fall through the cracks.  The real problem is that the
mappings are the map,  not the territory.  Wikipedia is an
encyclopedia for humans,  not for machines,  so DBpedia has to parse
whatever unsane markup they give us.

  Systems like Wikidata and Freebase can be edited by machines and
human ontologists and get better recall for types.

http://basekb.com/

   is a conversion from Freebase to industry standard RDF.  You could
use :BaseKB as a substitute for DBpedia,  but DBpedia has advantages
too because in addition to the 4 million things important enough to be
in DBpedia,  there are another 37 million unimportant things in :BaseKB
that matter only to librarians,  video store clerks and professional
discographers.

   These unimportant things will drive you crazy unless you master
them,  and the easiest way to turn down the noise is to restrict
search to the 4 million things.

   I could make you an RDF file that has statements such as

?dbpediaTopic a ?freebaseType .

   you could load that together with the rest of DBpedia.  That would
get you a long way towards good lists.  The trouble at this point is
that you don't have the freebase types connected to the DBpedia types
so you can't join them against the schema to find properties and such.
  Mapping the types to the DBpedia types would not be that hard either,
  since the two systems are well aligned.  Then you get something that
looks like DBpedia but has more accurate types.

Freebase has more accurate and better populated data for things
like ticker symbols,  geo-coordinates,  genders,  birth dates and the
like.  It would not be hard to rewrite Freebase statements to

?dbpediaTopic ?freebasePredicate ?anotherDbpediaTopic .

   and that would produce something that would be remarkably user friendly.


:BaseKB could (and maybe should) be pitched as a human-and-machine curated
bridge between Freebase, DBpedia, and Wikidata (I think).


Have you considered mapping the classes and properties across DBpedia, 
Freebase, and Wikidata?



--

Regards,

Kingsley Idehen 
Founder & CEO
OpenLink Software
Company Web: http://www.openlinksw.com
Personal Weblog: http://www.openlinksw.com/blog/~kidehen
Twitter Profile: https://twitter.com/kidehen
Google+ Profile: https://plus.google.com/+KingsleyIdehen/about
LinkedIn Profile: http://www.linkedin.com/in/kidehen







Dbpedia-discussion mailing list
Dbpedia-discussion@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dbpedia-discussion


Re: [Dbpedia-discussion] Fwd: Hello DBPedia!

2014-04-09 Thread Pablo N. Mendes
I very much like the contributing-back aspect of this. Thanks for
offering! One problem is that some pages have no template, making it
impossible to use the template-type mappings defined on the mappings wiki.

Other people have implemented type inference from categories, lists and
even from the text.

Others have done it by cross-referencing with Freebase, Cyc, etc.

I am wondering if the type statements obtained through all these approaches
should not be imported back to DBpedia through some semiautomatic curation
method (read mappings wiki beyond templates).

I guess we could also use the wiki, and allow people to also add mappings
for Lists, Categories, Tables, and other features generated by these
approaches?

Cheers
Pablo


Re: [Dbpedia-discussion] Fwd: Hello DBPedia!

2014-04-09 Thread Alexandru Todor
Hi Pablo,

We're working on the topic you are discussing. We're semi-automatically
committing things back to the Mappings Wiki with bots [1]. Currently we are
limiting ourselves to label translations, but will move on to the inferred
mappings generated by Airpedia [2]. There is also a Google Summer of Code
proposal that includes checking DBpedia info against other language
chapters and external data sources such as Freebase, Wikidata, GeoNames,
MusicBrainz etc., and creating a feedback loop to Wikipedia [3] (you can
read the full proposal in Google Melange if you have access; the allocation
of a slot for this proposal is still being debated, though).


Roberto: If you want to help improve DBpedia, please register in the
mappings wiki and request editor status; we will gladly give you access.
DBpedia is a community effort and we don't have the financial backing of
Google or the donations Wikipedia gets, so any small edits you
can make to improve the mappings are greatly appreciated.

Cheers,
Alexandru

[1] https://github.com/ag-csw/missingBot
[2] http://www.airpedia.org/
[3] http://wiki.dbpedia.org/gsoc2014/ideas#h359-20



Re: [Dbpedia-discussion] Fwd: Hello DBPedia!

2014-04-09 Thread Alexandru Todor
I'm replying to my own email:

Hi Pablo: I'm sorry, it's late and I noticed I didn't read your mail
correctly. I see you are talking about extending the mappings wiki for
tables, lists and the main article text. This approach is also under
discussion in GSoC, under idea 4.18 [1].

[1] http://wiki.dbpedia.org/gsoc2014/ideas#h359-23

