Clarifying:

The function of an ID (identifier) is to identify a concept, not to 
describe it. IDs must therefore be unique by definition: a concept cannot 
have multiple IDs. Multiple concepts may be equivalent to one another, but 
that doesn't mean they are the same. If you have two different IDs (i.e. 
the Unicode comparison of the strings results in a mismatch) you have, by 
definition, two different concepts. That is not what you want. You want 
multiple labels for the same concept/ID. 
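
A minimal sketch of "one ID, many labels" in Python, using plain tuples 
as stand-ins for triples. The example.org IRI and the label values are 
made up for illustration; skos:prefLabel and skos:altLabel are the usual 
SKOS label properties, but any label property would do.

```python
from collections import defaultdict

PREF_LABEL = "skos:prefLabel"
ALT_LABEL = "skos:altLabel"

# Hypothetical opaque ID; nothing in it describes the concept.
concept = "http://example.org/concept/0001"

# One subject (the ID), several labels attached to it.
triples = [
    (concept, PREF_LABEL, "Solar System"),
    (concept, ALT_LABEL, "Sol system"),
    (concept, ALT_LABEL, "Sistema Solar"),
]


def labels_by_concept(triples):
    """Group every label under the ID it describes."""
    grouped = defaultdict(list)
    for subject, predicate, value in triples:
        if predicate in (PREF_LABEL, ALT_LABEL):
            grouped[subject].append(value)
    return dict(grouped)


def same_concept(id_a, id_b):
    """Identity is a plain comparison of the ID strings, nothing more."""
    return id_a == id_b
```

Renaming or adding a label only touches the label triples; the ID, and 
therefore the identity of the concept, stays put.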

The description of a concept (or definition, philosophically speaking) is 
the whole group of other properties you ascribe to the concept by 
referencing its ID, including as many human-readable labels as you need and 
relations to other concepts.  You can choose to use an automatically 
generated ID in order to guarantee uniqueness, or use manually assigned 
human-readable strings, but with the latter you must guarantee uniqueness 
by process (as is the case with DBpedia IDs, which are derived from 
Wikipedia IDs). 

It is not an RDF requirement for the IRIs to be simple (although that saves 
parsing time) or human readable, but they MUST be unique and stable; 
otherwise the identity of the concept is compromised.  Human-readable IDs 
are only helpful when editing files by hand, which happens only in 
examples and for didactic purposes. Real-world IDs are mostly UUIDs 
(universally unique IDs) generated by the system (for Java, see 
java.util.UUID). Some systems use prefixed URLs in order to embed 
provenance into the ID, but that is a very, very bad practice: provenance 
is metadata like any other, and you should use specific properties for it. 
IDs were not designed to provide any semantics besides identity.
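
As a rough sketch of the same idea in Python (whose uuid module plays the 
role of java.util.UUID): the ID is minted from a random UUID, and the 
provenance goes into a separate property. The example.org namespace, the 
describe() helper and the choice of dcterms:source are assumptions made 
for this illustration only.

```python
import uuid

# Hypothetical namespace for the vocabulary.
BASE = "http://example.org/id/"


def mint_iri(base=BASE):
    """Return a new opaque IRI built from a random (version 4) UUID."""
    return base + str(uuid.uuid4())


def describe(label, source):
    """Mint an ID and attach label and provenance as ordinary properties."""
    iri = mint_iri()
    return [
        (iri, "skos:prefLabel", label),
        # Provenance is stated about the ID, not encoded into it.
        (iri, "dcterms:source", source),
    ]
```

Because nothing about the source is baked into the IRI, the provenance can 
later be corrected or extended without changing the identifier.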

Cheers.
=============================================
Marcelo Jaccoud Amaral
PETROBRAS
Tecnologia da Informação e Comunicações - Arquitetura  Tecnológica 
(TIC/ARQTIC/AT)
=============================================
dum loquimur, fugerit invida aetas: carpe diem, quam minimum credula 
postero.
-- Horatius





From:    Katie Frey <kf...@cfa.harvard.edu>
To:      Markus Kroetzsch <markus.kroetz...@tu-dresden.de>
Cc:      "DBpedia Discussion (ML)" <dbpedia-discussion@lists.sourceforge.net>
Date:    2016-05-31 11:50
Subject: Re: [Dbpedia-discussion] Concept Identifiers



Dear Markus,

Thank you for the insight.  We might also try to assign both numeric and 
descriptive IDs to a concept.  It seems as though best practices don't 
really exist in this area, other than the general imperative to keep the 
URIs simple and as stable as possible.

Best,
Katie


--
Katie E. Frey
John G. Wolbach Library, Harvard-Smithsonian Center for Astrophysics
60 Garden Street, MS-56, Cambridge, MA 02138
email: kf...@cfa.harvard.edu   |   phone: 617-496-7579
http://astrothesaurus.org           |   http://library.cfa.harvard.edu/

"Surprising what you can dig out of books if you read long enough, isn’t 
it?"
- Rand al'Thor (in Robert Jordan's The Shadow Rising, Book Four of the 
Wheel of Time)

"This is insanity!"   "No, this is scholarship!"
- Yalb and Shallan (in Brandon Sanderson's Words of Radiance, Book Two of 
the Stormlight Archive)

On Thu, May 26, 2016 at 4:06 PM, Markus Kroetzsch <markus.kroetz...@tu-dresden.de> wrote:
Dear Katie,

DBpedia mostly uses descriptive URIs that are based on the titles of 
Wikipedia articles in a specific language. These URIs change if pages are 
renamed, but for many concepts, this does not occur so often. You would 
probably only notice it if you are using the URIs for several years.

If you instead want to use numeric IDs based on Wikipedia pages (or 
DBpedia URIs), you can take them from Wikidata. These IDs are stable, but 
not descriptive. They are kept unique in that they can only be deleted but 
not reused. For example, http://dbpedia.org/page/Solar_System is currently 
the same as
https://www.wikidata.org/entity/Q544.

The Wikidata URI uses content negotiation to redirect you to the HTML page 
if you open it in a browser, and to RDF if you open it with an RDF 
crawler. See https://www.wikidata.org/wiki/Wikidata:Data_access for direct 
links to the RDF content.
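
A small Python sketch of what an RDF client does differently from a 
browser: it asks for an RDF media type in the Accept header. The helper 
name rdf_request is made up for this example, and text/turtle is assumed 
to be one of the formats the server is willing to return. The request is 
only built here, not sent.

```python
from urllib.request import Request


def rdf_request(entity_uri, media_type="text/turtle"):
    """Build a GET request that asks for RDF instead of HTML."""
    return Request(entity_uri, headers={"Accept": media_type})


req = rdf_request("https://www.wikidata.org/entity/Q544")
# Passing req to urllib.request.urlopen() would send it and follow
# whatever redirect the server answers with.
```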

To manually find out what the Wikidata ID is for a Wikipedia page, you can 
go to the Wikipedia page and use the link to "Wikidata item" on the left. 
To do this in an automated fashion, you can use the SPARQL endpoint, e.g., 
with the query

SELECT *
WHERE {
  <https://en.wikipedia.org/wiki/Solar%20System> schema:about ?item .
}

(try it in the Wikidata SPARQL UI: http://tinyurl.com/jlk4fz2)
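
For scripted use, the query above can be wrapped roughly as follows. This 
is only a sketch: it assumes the public endpoint at 
https://query.wikidata.org/sparql with the usual "query" and "format" 
parameters, and the helper names are invented here. It builds the request 
URL and reads a result in the standard SPARQL JSON results layout, without 
doing any network I/O itself.

```python
from urllib.parse import urlencode

ENDPOINT = "https://query.wikidata.org/sparql"


def sitelink_query(article_url):
    """SPARQL text asking which item a Wikipedia article is about."""
    return (
        "PREFIX schema: <http://schema.org/>\n"
        "SELECT ?item WHERE {\n"
        f"  <{article_url}> schema:about ?item .\n"
        "}"
    )


def query_url(article_url):
    """URL that runs the query and asks for JSON results."""
    params = {"query": sitelink_query(article_url), "format": "json"}
    return ENDPOINT + "?" + urlencode(params)


def items_from_results(results):
    """Pull the ?item values out of a SPARQL JSON results document."""
    return [row["item"]["value"] for row in results["results"]["bindings"]]
```

Fetching query_url(...) with any HTTP client and passing the decoded JSON 
to items_from_results() gives the list of matching entity URIs.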

The Wikidata Web API can also map page titles to IDs for you, if you 
prefer JSON over SPARQL:

https://www.wikidata.org/w/api.php?action=wbgetentities&format=json&sites=enwiki&titles=Solar+System&props=
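
The same call can be assembled in Python along these lines. It is a 
sketch, not a full client: the helper names are invented, several titles 
are joined with "|" as the MediaWiki API expects for multi-value 
parameters, and the response handling assumes that found entities are 
keyed by their Q-ID under "entities" while unknown titles get a 
non-Q placeholder key.

```python
from urllib.parse import urlencode

API = "https://www.wikidata.org/w/api.php"


def wbgetentities_url(titles, site="enwiki"):
    """Build a wbgetentities URL for one or more page titles."""
    params = {
        "action": "wbgetentities",
        "format": "json",
        "sites": site,
        "titles": "|".join(titles),
        "props": "",  # empty: only the IDs are wanted
    }
    return API + "?" + urlencode(params)


def entity_ids(response):
    """Return the Q-IDs found in a decoded wbgetentities response."""
    return [key for key in response.get("entities", {}) if key.startswith("Q")]
```

Called with ["Solar System"], wbgetentities_url() reproduces the URL shown 
above.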


Each of these methods can also be used to fetch many IDs at once. So 
basically it is fairly straightforward to translate from DBpedia URIs to 
Wikidata URIs. The mapping between the two changes over time only when 
DBpedia URIs change their meaning (e.g., if "Solar System" is renamed to 
"Solar System (astronomy)" or something).

Best regards,

Markus



On 26.05.2016 20:43, Katie Frey wrote:
Hello,

How are concept IDs handled for DBpedia?  It looks like the concept URIs
are descriptive (i.e. for the concept
http://dbpedia.org/page/Solar_System, the concept ID is
"Solar_System").  Are the descriptive IDs used throughout all of dbpedia
(back and front end) or are terms ultimately kept unique by using
numeric identifiers?

I've been developing a controlled vocabulary and I would also like to
use URIs so that my terms can be used with other linked data schemes.
My group and I have had a lot of discussions regarding the concept IDs.
Some want them to be descriptive, based on the preferred term for each
concept, so that they are human readable, but this could cause problems if
the terms used to describe each concept change over time. Others want
them to be randomly generated, so that if the description of a term
drifts over time the URI for the concept will always remain static.

We are trying to figure out if there are any standards or best practices
we should be looking towards when it comes to concept IDs.  Any
thoughts/comments/justifications would be appreciated.

Best,
Katie



_______________________________________________
Dbpedia-discussion mailing list
Dbpedia-discussion@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dbpedia-discussion


-- 
Markus Kroetzsch
Faculty of Computer Science
Technische Universität Dresden
+49 351 463 38486
http://korrekt.org/

"The sender of this message is responsible for its content and addressing. The 
receiver shall take proper care of it. Without due authorization, the 
publication, reproduction, distribution or the performance of any other action 
not conforming to Petrobras System internal policies and procedures is 
forbidden and liable to disciplinary, civil or criminal sanctions."