Hi,
On 11/08/2011 12:16 PM, Pablo Mendes wrote:
Hi Mariano,
I don't have answers for everything, but here goes my 2c.
Is there any policy for creating DBpedia classes or properties? For
example, we missed the class BullFighter; we checked that there was no
other similar class, and we created it.
The only guidelines I know are specified here:
http://mappings.dbpedia.org/index.php/Mapping_Guide
How do we delete an erroneous mapping? All wiki pages have a
delete tab, but we do not know if it is an immediate delete or if it
will be checked by an admin.
AFAIK, it's immediate.
In DBpedia-Live, we reprocess all the changed pages we get from
Wikipedia update stream, and we also reprocess the pages that are
affected by a mapping change.
The pages we get from Wikipedia update stream have higher priority, so
they are reprocessed first.
So, the pages affected by a mapping change may take a few minutes to
get reprocessed, depending on how many live pages are waiting for
reprocessing, but the results will not take long to appear.
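To make the ordering above concrete, here is a minimal sketch of a
two-level priority queue. It is only an illustration of the behaviour
described, not the DBpedia-Live code; the class name, the priority
constants and the page titles are all made up for the example.

```python
import heapq
import itertools

# Hypothetical priority levels: a lower number is served earlier.
PRIORITY_LIVE_UPDATE = 0      # page changed in the Wikipedia update stream
PRIORITY_MAPPING_CHANGE = 1   # page affected by a mapping edit


class ReprocessingQueue:
    """Toy queue: live updates are always served before mapping-affected pages."""

    def __init__(self):
        self._heap = []
        # Monotonic counter keeps first-in, first-out order within one priority.
        self._counter = itertools.count()

    def add(self, page_title, priority):
        heapq.heappush(self._heap, (priority, next(self._counter), page_title))

    def pop(self):
        _priority, _seq, page_title = heapq.heappop(self._heap)
        return page_title

    def __len__(self):
        return len(self._heap)


def drain(queue):
    """Return page titles in the order they would be reprocessed."""
    order = []
    while len(queue) > 0:
        order.append(queue.pop())
    return order
```

With this layout a page touched by a mapping edit waits only as long as
there are live updates ahead of it, which matches the "few minutes"
delay mentioned above.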
When we create a DBpedia class or property, when does it become
effective? What is the life cycle of the modifications?
AFAIK, it's immediate. What do you mean by "life cycle"? Changes show
up on live.dbpedia.org nearly immediately and on dbpedia.org in the
next release (usually twice a year for the entire data, and as
frequently as you want for your localized version).
Same as the previous issue, it may take a few minutes to appear in
DBpedia-Live.
If we consider a given property (e.g., debutDate) necessary in the
DBpedia ontology, but that property was deleted (we can see this in
the page history), what do we have to do?
See if there is a duplicate. That might be the reason for the
deletion. You should then use the one that remained. Otherwise,
discuss in the list and the discussion page of that property.
Is there any way of knowing which username created the most mappings?
Yes. We do that for DBpedia Portuguese. See http://pt.dbpedia.org.
I'm glad to share the code.
It seems that the extraction process reads the properties found in
the infobox instances, without checking if those properties are in
the infobox definition. Is that so?
I think so.
E.g.: In the statistics of (es) Ficha_de_futbolista we can find the
property "altura" as one of the most used, but that property is
not in the infobox definition. In the infobox definition we can
see "estatura" (a concept similar to "altura"), but it is much less
used than "altura". Do we have a mechanism to map both infobox
properties to the same DBpedia property?
Yes.
We tried creating two mappings, one for "altura" and another for
"estatura",
Exactly.
but we always get two triples for each infobox instance (although
the instance has only one of these properties). Any solution?
What happens if you map only one? Maybe the infobox itself is doing
some resolution there? When you say you get two triples, do you mean
you get one for the http://dbpedia.org/ontology namespace and one for
the http://dbpedia.org/property namespace? That is expected. One is
for the mapped and one for the non-mapped property.
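As a rough illustration of why two triples can be expected, here is a
small sketch. It is not the extraction framework itself: the mapping
table, the function name and the choice of "height" as the target
property are assumptions made for the example.

```python
PROPERTY_NS = "http://dbpedia.org/property/"   # raw (non-mapped) infobox properties
ONTOLOGY_NS = "http://dbpedia.org/ontology/"   # mapped DBpedia ontology properties

# Hypothetical mapping table: both Spanish infobox keys point at the
# same ontology property.
MAPPINGS = {"altura": "height", "estatura": "height"}


def extract(subject, infobox):
    """Return (subject, predicate, value) triples for one infobox instance."""
    triples = []
    for key, value in infobox.items():
        # The raw infobox extraction emits every key as it appears.
        triples.append((subject, PROPERTY_NS + key, value))
        # The mapping-based extraction emits a triple only for mapped keys.
        if key in MAPPINGS:
            triples.append((subject, ONTOLOGY_NS + MAPPINGS[key], value))
    return triples
```

Under these assumptions an instance that only uses "altura" yields one
triple in each namespace, so "two triples" would be the expected
result. If both triples are in the ontology namespace, something else
is going on.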
The parsing of Spanish dates (dd/mm/yyyy) does not work (property
mapped to xsd:date). Do we have the same problem for decimal
numbers? (In Spanish, decimal numbers are usually written like 2,5
instead of 2.5.)
You can patch the Date and Decimal extractors to take some i18n config
params.
I guess that is not too much effort in the parser.
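To give an idea of what such i18n parameters could look like, here is
a minimal sketch. The configuration table and the function names are
hypothetical and only show the idea; the actual Date and Decimal
extractors in the framework are organised differently.

```python
from datetime import datetime
from decimal import Decimal

# Hypothetical per-language settings, invented for this example.
I18N_CONFIG = {
    "en": {"date_format": "%m/%d/%Y", "decimal_sep": ".", "thousands_sep": ","},
    "es": {"date_format": "%d/%m/%Y", "decimal_sep": ",", "thousands_sep": "."},
}


def parse_date(text, lang):
    """Parse a numeric date using the day/month order of the given language."""
    date_format = I18N_CONFIG[lang]["date_format"]
    return datetime.strptime(text.strip(), date_format).date()


def parse_decimal(text, lang):
    """Parse a decimal number using the separators of the given language."""
    config = I18N_CONFIG[lang]
    # Drop thousands separators first, then normalise the decimal separator.
    normalized = text.strip().replace(config["thousands_sep"], "")
    normalized = normalized.replace(config["decimal_sep"], ".")
    return Decimal(normalized)
```

The point of the design is that the parsing logic stays the same for
every language and only the table changes.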
Some Wikipedia pages have infobox instances with properties that
are not in the infobox definition. Maybe those properties have
been deleted from the definition, producing an inconsistency (e.g.
(es) Partidos and Ficha_de_montaña). Any recommendation?
What do you mean by inconsistency? Why is it a problem?
What is the meaning of the grey rows in the statistics page? It
says "template is on the ignorelist". What is this, a "deprecated"
property/class?
The answer is here:
http://mappings.dbpedia.org/index.php/Mapping_Statistics
"...the statistics contain non relevant templates like Unreferenced or
Rail line. These templates aren't classical infoboxes and shouldn't
affect the statistics. On that account they can be ignored. If a
template is on the ignore list, it does not count for the number of
potential infoboxes."
Can we map an infobox to 2 DBpedia classes if both classes are
equivalent? E.g.: Organization and Organisation classes exist in
DBpedia.
We should not have both classes. That is a bug in the ontology and
should be fixed.
I had a look at the ontology wiki and found only the "Organisation"
class; I did not find the other one you mentioned.
In the statistics page (e.g. Spanish at
http://mappings.dbpedia.org/server/statistics/es) we get information
about the Spanish infoboxes sorted by instance number. In the case of
Spanish, it says there are 1311 different infoboxes, but the table
shows only ~300. Where can we find the rest? The number of properties
shown in the statistics has a similar issue. For example, in the
definition of infobox (es) Ficha_de_futbolista there are 20
properties, but in the infobox statistics
(http://mappings.dbpedia.org/server/templatestatistics/es/Ficha_de_futbolista)
there are 22. Do these 2 additional properties come from the infobox
instances?
I don't know the answer. Paul Kreis is possibly the only one that
would know.
Some properties seem to exist in DBpedia, but when we use them in
the mappings they are considered nonexistent (they are rendered in
red). E.g.: in (es) Ficha_de_Tenista
(http://mappings.dbpedia.org/index.php/Mapping_es:Ficha_de_tenista)
we tried to use the DBpedia property "turnedpro" (in theory existing,
as can be seen at http://dbpedia.org/property/turnedpro). When we try
to use that property in our mapping we get "Couldn't load property
mapping on page en:Mapping es:Ficha de tenista. Details: Ontology
property turnedpro not found". We also tried with dbpprop:turnedpro,
getting the same result.
This is probably a confusion between infobox property and DBpedia
property. The dbprop (http://dbpedia.org/property) namespace should be
read as "infobox property", while the http://dbpedia.org/ontology
namespace is the one that contains the DBpedia properties. You can
only map infobox properties to DBpedia Ontology properties.
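The rule can be pictured as a lookup against the ontology only. The
sketch below is illustrative and not the mappings wiki code: the
property set is a made-up excerpt and the function and exception names
are hypothetical.

```python
ONTOLOGY_NS = "http://dbpedia.org/ontology/"
PROPERTY_NS = "http://dbpedia.org/property/"

# Hypothetical excerpt of the ontology; the real list lives on the mappings wiki.
ONTOLOGY_PROPERTIES = {"height", "birthDate"}


class UnknownOntologyProperty(LookupError):
    """Raised when a mapping target is not defined in the ontology."""


def raw_property_uri(infobox_key):
    """Any infobox key gets a URI here, whether or not it is mapped."""
    return PROPERTY_NS + infobox_key


def resolve_mapping_target(name):
    """Return the ontology URI for a mapping target, or raise if it is undefined."""
    if name not in ONTOLOGY_PROPERTIES:
        raise UnknownOntologyProperty(f"Ontology property {name} not found")
    return ONTOLOGY_NS + name
```

In this picture, the existence of http://dbpedia.org/property/turnedpro
says nothing about whether "turnedpro" is a valid mapping target; only
membership in the ontology does.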
Is there a schedule for the next dump? We are eager to know how
many Spanish triples we are going to get.
Generalized dumps for the entire (Internationalized) DBpedia usually
happen twice a year. The international chapters are free to release
their data in any release cycle they see fit. So you may just run the
extraction framework on your side and tell us how many triples you
get. We are also curious! :)
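Once the extraction has produced its output, counting the triples is
straightforward. Here is a minimal sketch, assuming the output is in
N-Triples format (one statement per line), optionally compressed. The
helper names and the file name in the usage note are hypothetical.

```python
import bz2
import gzip


def open_dump(path):
    """Open a plain, .gz or .bz2 dump file as UTF-8 text."""
    if path.endswith(".bz2"):
        return bz2.open(path, "rt", encoding="utf-8")
    if path.endswith(".gz"):
        return gzip.open(path, "rt", encoding="utf-8")
    return open(path, "r", encoding="utf-8")


def count_triples(lines):
    """Count N-Triples statements, skipping blank lines and '#' comments."""
    count = 0
    for line in lines:
        stripped = line.strip()
        if stripped and not stripped.startswith("#"):
            count += 1
    return count
```

For a file named, say, mappingbased_es.nt.bz2 (a made-up name), you
would call count_triples on the handle returned by open_dump inside a
"with" block, so the file is read as a stream and never loaded fully
into memory.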
I have a February version of a document entitled "DBpedia mapping
language". Do you have an updated version? I found some typos
and it does not cover conditional mappings.
I also don't know the answer to that question. You can check directly
in the repository.
http://dbpedia.hg.sourceforge.net/hgweb/dbpedia/extraction_framework/file/cefae9797133/core/doc/mapping_language
I have a "big machine" for hosting the Spanish DBpedia, and I hope
to set up the extraction process on that machine very soon. Once
we get a good Spanish extraction process, what do we have to do in
order to get the es.dbpedia.org redirect?
Whenever the machine is set up, please e-mail dbpedia-developers with
the IP and the responsible party will set up the domain forwarding.
Concerning internationalized resource URIs, we see that the
Spanish triples generated now in DBpedia have the URI form
http://dbpedia.org/Resource/Whatever. Therefore, if we query about
the resource http://dbpedia.org/Resource/Berlin, we will get a
unique resource with all the properties specified by 15
internationalized versions of Wikipedia. Right? However, the
"hosted" versions of DBpedia (ge, el, ru...) have a URI like
http://ge.dbpedia.org/Resource/Berlin. Right?
There is a current debate about this in the i18n committee. The
current solution is to generate the triples under
http://es.dbpedia.org/resource/Berlin, and set sameAs links to
http://dbpedia.org/resource/Berlin. My preferred solution would be to
bypass this step at least in cases where we're more confident that the
link is true (for example with bidirectional language links). Feel
free to join the discussion:
http://sourceforge.net/mailarchive/forum.php?thread_name=BANLkTin1a9tHUvQb%2B1sMsfuzr8fgUgyQ_Q%40mail.gmail.com&forum_name=dbpedia-developers
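For illustration, the sameAs step described above could look roughly
like the sketch below. It emits owl:sameAs links from chapter URIs to
the main namespace and uses bidirectional language links as one
possible confidence filter; it does not implement the "bypass" variant.
The function name and the input dictionaries are assumptions for the
example, not the framework's API.

```python
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"


def same_as_triples(lang, links_from_lang, links_from_en, require_bidirectional=True):
    """Build owl:sameAs triples from a language chapter to the main namespace.

    links_from_lang: {local_title: english_title}, language links found
                     in the chapter's Wikipedia
    links_from_en:   {english_title: local_title}, language links found
                     in the English Wikipedia
    """
    triples = []
    for local_title, en_title in sorted(links_from_lang.items()):
        # Keep a link only if the English page points back to the same local page.
        if require_bidirectional and links_from_en.get(en_title) != local_title:
            continue
        triples.append((
            f"http://{lang}.dbpedia.org/resource/{local_title}",
            OWL_SAME_AS,
            f"http://dbpedia.org/resource/{en_title}",
        ))
    return triples
```

Setting require_bidirectional to False keeps every language link,
which gives more links at the cost of more wrong ones.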
Folks, can anybody else chip in?
Cheers,
Pablo
--
Pablo N. Mendes
Research Associate
Web Based Systems Group
Freie Universität Berlin
http://wbsg.de
------------------------------------------------------------------------------
RSA(R) Conference 2012
Save $700 by Nov 18
Register now
http://p.sf.net/sfu/rsa-sfdev2dev1
_______________________________________________
Dbpedia-developers mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/dbpedia-developers
--
Kind Regards
Mohamed Morsey
Department of Computer Science
University of Leipzig
_______________________________________________
Dbpedia-discussion mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/dbpedia-discussion