Ian Hickson wrote:
> > I can't speak for Elliot, but the Web repository connector inside SAP
> > Netweaver's Knowledge Management has supported RFC2731-style encoded
> > metadata (as shown above) for many years now.
> Could you elaborate on how this tool consumes this data? Any information
> you may have would be very useful. Could you walk us through an example of
> how this information gets used? How do the various schemes affect the
> handling of the metadata? Have you found particular processing is needed
> to process invalid values? Is the tool's input limited to files generated
> by one organisation, or does it process input from arbitrary Web sites?
I think I did that already once many months ago.
Anyway.
In the SAP KM system, everything is designed around the concept of
resources, which essentially consist of a binary content stream, generic
metadata (MIME type, encoding, whatnot), Access Control information
(ACLs), versioning information (checked-in/out, version history...), and
custom metadata.
Most metadata lives in name/value pairs, where the name is an XML type
name (nsuri + localname), and the value can be numbers, strings, XML,
... (and lists of them).
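
To make the name/value model concrete, here is a minimal sketch in Python.
The class and variable names are made up for illustration and are not the
actual SAP KM API; the only thing taken from the description above is that
a property name is a (namespace URI, local name) pair and that a value can
be a number, a string, or a list of those.

```python
from dataclasses import dataclass
from typing import Dict, List, Union


# Hypothetical type, for illustration only (not an SAP KM class).
@dataclass(frozen=True)
class PropertyName:
    namespace_uri: str  # the "nsuri" part of the name
    local_name: str     # the "localname" part of the name


# A value is a scalar or a list of scalars (XML values are left out here).
Scalar = Union[int, float, str]
PropertyValue = Union[Scalar, List[Scalar]]

DC = "http://purl.org/dc/elements/1.1/"

# Custom metadata of one resource, keyed by qualified name.
properties: Dict[PropertyName, PropertyValue] = {
    PropertyName(DC, "title"): "Quarterly report",
    PropertyName(DC, "subject"): ["finance", "reporting"],
}

# Names compare by value, so a consumer can look up by (nsuri, localname).
title = properties[PropertyName(DC, "title")]
```

Keying on the full (nsuri, localname) pair rather than on a prefixed string
means two sources that pick different prefixes for the same namespace still
end up with the same property.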
SAP KM resources expose a generic API, which is used by the UI, protocol
handlers (HTTP/WebDAV, ICE, web services...), and internal services
(search, collaboration, ...).
The implementation of resources varies: they can be based on file
shares, database tables, remote content management systems, remote
WebDAV servers, LDAP, ... and also generic HTTP servers.
The latter are usually used to pull in read-only information that should
be exposed to the internal search system (SAP TREX). The code that
implements these resources extracts metadata from well-known HTML
elements (title, keywords, ...), using configurable filters, and through
the use of RFC 2731 formatted meta elements.
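
As a rough illustration of the RFC 2731 part, the sketch below uses
Python's standard html.parser. It is not the SAP implementation; the class
and function names are invented, and the choice to silently skip meta
elements whose prefix has no matching link element is an assumption on my
part. It reads <link rel="schema.PREFIX" href="..."> declarations and maps
each <meta name="PREFIX.localname" content="..."> to a
(namespace URI, local name) pair.

```python
from html.parser import HTMLParser


class Rfc2731Extractor(HTMLParser):
    """Hypothetical extractor for RFC 2731 style link/meta elements."""

    def __init__(self):
        super().__init__()
        self.schemas = {}  # prefix -> namespace URI, from <link rel="schema.X">
        self.raw = []      # (prefix, local name, content), from <meta>

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if tag == "link":
            rel = a.get("rel") or ""
            if rel.lower().startswith("schema.") and a.get("href"):
                self.schemas[rel[len("schema."):]] = a["href"]
        elif tag == "meta":
            name = a.get("name") or ""
            if "." in name and a.get("content") is not None:
                prefix, local = name.split(".", 1)
                self.raw.append((prefix, local, a["content"]))

    def properties(self):
        # Resolve prefixes only after parsing, so that the order of link
        # and meta elements in the document does not matter.
        result = {}
        for prefix, local, content in self.raw:
            if prefix in self.schemas:  # assumption: undeclared prefixes are skipped
                key = (self.schemas[prefix], local)
                result.setdefault(key, []).append(content)
        return result


def extract_properties(html):
    parser = Rfc2731Extractor()
    parser.feed(html)
    parser.close()
    return parser.properties()


sample = """
<html><head>
<title>Example</title>
<link rel="schema.DC" href="http://purl.org/dc/elements/1.1/">
<meta name="DC.date" content="2007-05-01">
<meta name="keywords" content="not, rfc2731">
</head><body></body></html>
"""
extracted = extract_properties(sample)
```

Values are collected into lists because the same name may legitimately
appear more than once in a document.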
How this information is used in detail depends on the consumers using
the KM API, which is hard to predict. Some use cases are decorations in
the UI based on additional properties, or support in custom searches.
One of the specific reasons RFC 2731 support was added was that
several companies wanted to expose additional properties in their HTML
documents (such as additional document-related dates) and have them be
accessible through the services mentioned above.
Back to your question:
> how this information gets used? How do the various schemes affect the
> handling of the metadata? Have you found particular processing is needed
Schemes aren't used (as I said in a later mail), but link/meta and the
RFC 2731 style encoding of prefixes are.
> to process invalid values? Is the tool's input limited to files generated
> by one organisation, or does it process input from arbitrary Web sites?
The tool works for generic web resources.
BR, Julian