Hi Krzysztof,
What I am trying to get at is a coherent ontology of attributes that can
be used for mapping ABox instance data for integration and
interoperability purposes. The general idea is to have a reference
grounding upon which external semantic datasets can co-reference as a
bridging mechanism.
As one example, let's say that dataset A describes the location of an
entity with the attribute country, and only provides literal values,
whereas another dataset B describes location with object properties that
refer to ISO country codes. The reference grounding could be an ontology with the
complete listing of ISO country codes. This is not the simplest example,
since the literal in dataset A would need to be evaluated and lifted to
the reference object property. Probably some user interface would need
to be involved to reconcile uncertainties, making the process
semi-automatic.
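
To make the lifting step concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: the ISO_COUNTRIES table is a tiny hand-picked excerpt standing in for the full reference grounding, and lift_country and its cutoff parameter are hypothetical names, not part of any existing tool. Exact matches are lifted automatically; fuzzy or missing matches are flagged for a human to review, which is the semi-automatic part.

```python
import difflib

# Hypothetical excerpt of the reference grounding (ISO 3166-1 alpha-2 codes).
# A real grounding would hold the complete listing.
ISO_COUNTRIES = {
    "DE": "Germany",
    "FR": "France",
    "PL": "Poland",
    "US": "United States of America",
}


def lift_country(literal, cutoff=0.8):
    """Lift a country literal to an ISO code.

    Returns (code, needs_review). code is None when no candidate is found;
    needs_review is True whenever a human should confirm the result.
    """
    names = {name.lower(): code for code, name in ISO_COUNTRIES.items()}
    key = literal.strip().lower()

    # Exact (case-insensitive) match: safe to lift automatically.
    if key in names:
        return names[key], False

    # Fuzzy match: propose the closest name, but flag it for review.
    close = difflib.get_close_matches(key, list(names), n=1, cutoff=cutoff)
    if close:
        return names[close[0]], True

    # Nothing plausible: leave unresolved for the user interface.
    return None, True


if __name__ == "__main__":
    for value in ["Germany", "Polland", "Atlantis"]:
        print(value, lift_country(value))
```

The review flag is the hook where a user interface would step in to reconcile uncertain matches.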
Other examples may not involve lifting, but may involve unit conversions
or other manipulations. Those, too, would likely need to be semi-automatic.
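
A unit conversion case might look like the following sketch. The TO_METRES table and the to_reference_distance function are hypothetical, and the choice of metres as the reference unit is an assumption made only for illustration. Unknown units are not guessed at; they are flagged for review.

```python
# Hypothetical conversion factors from a dataset's unit to the assumed
# reference unit (metres).
TO_METRES = {
    "m": 1.0,
    "km": 1000.0,
    "mi": 1609.344,
    "ft": 0.3048,
}


def to_reference_distance(value, unit):
    """Convert a distance to metres.

    Returns (metres, needs_review). metres is None when the unit is not
    in the table, in which case a human needs to look at the value.
    """
    factor = TO_METRES.get(unit.strip().lower())
    if factor is None:
        return None, True
    return value * factor, False


if __name__ == "__main__":
    print(to_reference_distance(2, "km"))
    print(to_reference_distance(1, "furlong"))
```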
On the face of it, the scope may sound daunting. But my observation is
that most attributes (explicitly used to describe entities) follow a
Pareto distribution and the number of commonly used attributes (say,
between schema.org, Wikidata, and other leading KBs) is tractable. Once
a suitable design and starting framework were in place, grounding values
could continue to be expanded, as well as possible lifting and
conversion utilities.
The advantage of this approach to dataset/KB authors is that only one
mapping needs to be made to the reference grounding. Thereafter, other
datasets mapping to the same attribute(s) could be inspected for
possible interoperability.
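
The "map once, then inspect" idea can be sketched as follows. Everything named here is hypothetical: the dataset names, the local attribute names, and the ref: identifiers are made up for illustration. Each dataset declares a single mapping from its own attributes to the reference attributes, and candidates for interoperability fall out by grouping on the reference side.

```python
from collections import defaultdict

# Hypothetical mappings: each dataset maps its local attribute names to
# reference attributes exactly once.
MAPPINGS = {
    "datasetA": {"country": "ref:Country", "height": "ref:Height"},
    "datasetB": {"countryCode": "ref:Country"},
    "datasetC": {"elevation": "ref:Height", "nation": "ref:Country"},
}


def shared_attributes(mappings):
    """Group local attributes by reference attribute.

    Returns only the reference attributes that two or more datasets map
    to, since those are the candidates for interoperability.
    """
    by_ref = defaultdict(dict)
    for dataset, attrs in mappings.items():
        for local_name, ref in attrs.items():
            by_ref[ref][dataset] = local_name
    return {ref: ds for ref, ds in by_ref.items() if len(ds) > 1}


if __name__ == "__main__":
    for ref, datasets in shared_attributes(MAPPINGS).items():
        print(ref, datasets)
```

No dataset needs to know about any other; the pairwise comparisons are derived from the shared grounding.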
UMBEL, as a subset of Cyc, already has more than 2000 of its concepts
assigned to the attribute SuperType [1]. I was able to rather
quickly pull together one initial high-level view for 90 or so of them
to show what such an attribute concept structure may look like:
Attributes
ObjectValueCharacteristics
StringObject
StringDatatype_Unlimited
List_Information
FrequentlyAskedQuestionsList
MailingList
AlphabeticalList
Index_List_Information
BullettedFormat
UnitOfMeasure
UnitOfDistance
InternationalUnitOfMeasure
UnitOfMeasure_Common
NaturalLanguage
Encrypted
AuthenticationSource
PersistenceDistribution
Uniform_PersistenceDistribution
UnitOfMeasureConcept
Ratio
CollectionType
Phase
EmptyCollection
Preference
Quantity
AttachmentAttribute
WrittenInfo
StructuredInfo
VisualInfo
AudioInfo
LogicalFieldAttribute
TruthValue
EntityCharacteristics
DescriptiveAttributes
Definition_PCW
VisualPattern
SpatialThingTypeByShape
ShapeAttributes
Color
Name
Title
EnumeratedAttributes
EconomicalQuantity
DispositionalQuantity
MentalQuantity
PhysicalQuantity
Quality
SocialQuantity
MeasurableQuantity
TotallyOrderedQuantityType
QuantityType
NonAspectualQuantity
EnvironmentalQuantity
ActionAttributeLevelQuantity
EmotionalQuantityType
LocationAttributes
OrientationAttributes
GeographicalPlace
MappableAttributes
ContactLocation
PopulatedPlace
TimeAttributes
HistoricTemporalThing
Time_Quantity
EventAttributes
TimeInterval
TemporalThing
IdentificationAttributes
ContactLocation
ReferenceWork
IDString
UniqueID
SituationAttributes
Situation
Qualifier
Statement
Collection
'Ordered Collection'
Individual
'Concept Scheme'
Class
Concept
Statement
Class
RefConcept
This is *very* preliminary, and some of the names don't yet feel right.
Also, there are some new concepts added (which need to be checked in
Cyc) for better organization. But it does try to capture one
more-or-less high-level view of the outlines for this structure. SIO has
a different, but similar, approach.
I am purposefully excluding "relations" between entity types in this
thinking. Rather, I am focusing strictly on the instance descriptions
and characterizations (attributes). For the attributes as defined,
however, both bundles and hierarchies are of interest.
Does this help?
If so and there is a relationship with your own geographic interests,
perhaps we can talk offline. Since I envision this reference grounding
having common use, geographic attributes would definitely be included,
as shown above.
Best, Mike
[1] See Annex G at http://umbel.org/annexes/
On 7/11/2014 1:08 PM, Krzysztof Janowicz wrote:
SIO looks really interesting! Thanks for sharing. Just to make sure we
all talk about the same thing: Mike, are you looking for bundles of relations
and attributes that characterize types or hierarchies of relations and
attributes? We are doing the first for geographic feature types (e.g.,
state) if this would be of any interest to you.
Best,
Krzysztof
On 07/11/2014 10:49 AM, Michel Dumontier wrote:
Hi Mike,
We have done some work in SIO [1] to guide the development of
descriptive and quantitative attributes. We have a recently published
paper [2] that articulates some of our design decisions, and how we
use them in our work. Happy to work with you on your use cases in the
context of our public mailing list [3].
Best,
m.
[1] http://sio.semanticscience.org
[2] http://www.jbiomedsem.com/content/5/1/14
[3] http://groups.google.com/group/sio-ontology
Michel Dumontier
Associate Professor of Medicine (Biomedical Informatics), Stanford
University
Chair, W3C Semantic Web for Health Care and the Life Sciences Interest
Group
http://dumontierlab.com