Milton Ponson
GSM: +297 747 8280
Rainbow Warriors Core Foundation
PO Box 1154, Oranjestad
Aruba, Dutch Caribbean
www.rainbowwarriors.net
Project Paradigm: A structured approach to bringing the tools for
sustainable development to all stakeholders worldwide
www.projectparadigm.info
NGO-Opensource: Creating ICT tools for NGOs worldwide for Project Paradigm
www.ngo-opensource.org
MetaPortal: providing online access to web sites and repositories of
data and information for sustainable development
www.metaportal.info
SemanticWebSoftware, part of NGO-Opensource to enable SW technologies
in the Metaportal project
www.semanticwebsoftware.info
--- On Wed, 4/29/09, Wolfgang Orthuber
<orthu...@kfo-zmk.uni-kiel.de> wrote:
From: Wolfgang Orthuber <orthu...@kfo-zmk.uni-kiel.de>
Subject: numeric data on the web, numeric web search
To: public-lod@w3.org
Date: Wednesday, April 29, 2009, 3:25 PM
Hello!
Quantifiable objects play a central role in daily life.
Nevertheless, they still generally have no well-defined, precise,
globally machine-readable representation on the web. The following
concept proposes a simple data structure called "pattern" for
representing quantifiable objects in general, which also allows
similarity search:
--------
* Numeric web search *
Web search has so far been word based. In addition, a
language-independent similarity search for quantifiable objects is
desirable. For a well-defined numeric representation of
quantifiable objects, a simple data structure called "pattern" is
proposed. It contains a feature vector (a sequence of numbers)
that represents the object, and a "pattern name", a URI that
uniquely identifies the kind of object represented by the feature
vector.
Pattern: Pattern name + feature vector (+ auxiliary data)
Patterns with the same pattern name represent the same kind of
object. Because the number of possible pattern names is not
limited*, infinitely many different kinds of quantifiable objects
can be represented by patterns. (*limited only physically, by
finite time and energy)
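As a minimal sketch of the structure described above (not part of the original proposal), a pattern could be modelled in Python as follows. The class name, the field names and the example URI abc.xyz/pat/example are assumptions made for illustration only.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Tuple


@dataclass
class Pattern:
    """Pattern = pattern name + feature vector (+ auxiliary data)."""
    name: str                      # URI identifying the kind of object
    features: Tuple[float, ...]    # feature vector: a sequence of numbers
    aux: Dict[str, Any] = field(default_factory=dict)  # optional auxiliary data


def same_kind(a: Pattern, b: Pattern) -> bool:
    """Patterns with the same pattern name represent the same kind of object."""
    return a.name == b.name


# Hypothetical example: two patterns of the same kind, one with auxiliary data.
p1 = Pattern("abc.xyz/pat/example", (1.0, 2.0))
p2 = Pattern("abc.xyz/pat/example", (3.0, 4.0), {"comment": "illustrative only"})
```

Only patterns for which same_kind returns True would be candidates for direct comparison of their feature vectors.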
So the search terms are not words but feature vectors in patterns,
which allow similarity to be quantified. Feature vectors of
patterns with the same pattern name are directly comparable using
a given metric. In this way, similarities between the original
quantifiable objects are mapped to spatial similarities between
their feature vectors. Similarity search is therefore possible by
calculating distances: the smaller the distance between the
feature vectors of the representing patterns, the more similar the
objects are.
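The distance-based search described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the Euclidean metric is one possible choice (the proposal only says "a given metric"), the index is a plain in-memory list, and the function names are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Entry = Tuple[str, Sequence[float]]  # (pattern name, feature vector)


def euclidean(u: Sequence[float], v: Sequence[float]) -> float:
    """Euclidean distance; assumed here as the metric for this pattern name."""
    if len(u) != len(v):
        raise ValueError("feature vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))


def similarity_search(query: Entry, index: List[Entry], k: int = 3):
    """Return the k nearest feature vectors that share the query's pattern name."""
    name, vector = query
    # Only patterns with the same pattern name are directly comparable.
    candidates = [(euclidean(vector, v), v) for n, v in index if n == name]
    candidates.sort(key=lambda pair: pair[0])  # smallest distance = most similar
    return candidates[:k]
```

A real search engine would replace the linear scan with a spatial index, but the ranking principle stays the same: a smaller distance means a more similar object.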
Because there are so many different kinds of quantifiable objects,
the work of developing efficient pattern and feature vector
definitions to represent them is open ended. Global task
sharing has the greatest potential: under this proposal,
every owner of an internet domain name abc.xyz gets the right to
define the feature vectors of all patterns with names abc.xyz/* (in
the well-defined location abc.xyz/pat/*).
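One way to read this naming rule is that the domain part of a pattern name identifies who may define it, and that definitions live under the /pat/ path of that domain. The sketch below follows that reading; the function names are hypothetical, and treating pattern names as URIs with an optional scheme is an assumption.

```python
from urllib.parse import urlsplit


def split_pattern_name(pattern_name: str):
    """Return (domain, path) of a pattern name, with or without a scheme."""
    if "://" not in pattern_name:
        pattern_name = "//" + pattern_name  # lets urlsplit recognise the host
    parts = urlsplit(pattern_name)
    return (parts.hostname or "").lower(), parts.path


def owner_of(pattern_name: str) -> str:
    """The domain whose owner may define this pattern's feature vector."""
    return split_pattern_name(pattern_name)[0]


def in_definition_location(pattern_name: str) -> bool:
    """True if the name lies in the assumed definition location <domain>/pat/*."""
    _, path = split_pattern_name(pattern_name)
    return path.startswith("/pat/") and len(path) > len("/pat/")
```

For example, owner_of("abc.xyz/pat/example") yields "abc.xyz", so only the owner of that domain would define the feature vector layout for this pattern name.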
Patterns are machine readable, uniformly comparable and
searchable. They make it possible to search with the same search
engine not only for text, but also for an increasing number of
well-defined quantifiable objects on the web. Bundling the search
activity into one crawler and one web database for all
quantifiable objects is much more efficient than building and
managing a separate database and crawler for every kind of object.
Numeric similarity search could be efficiently combined with
conventional word-based search. Details are described at
http://www.orthuber.com/wpa.htm . Don't hesitate to ask me further
questions.
--------------------
It seems clear that introducing the above conventions would
have considerable advantages. Can this proposal get support, so
that we can realize it step by step?
Regards
Wolfgang Orthuber (Mathematician and Orthodontist at University
of Kiel / Germany)