[tbc-users] Re: j.1:, j.2: .... j.n: namespaces spontaneously created.

Jeremy Carroll Thu, 16 Oct 2008 10:23:38 -0700

I wrote the code that introduces these prefixes - I never really decided 
whether the j stands for Jena or Jeremy!

The code writes RDF/XML documents, given an RDF graph.

Here is the logic. Understanding it may help you choose prefixes better and 
avoid the problems: points 5 and 6 may be under your control (also 7, but 
that is probably not the issue here). Point 9 may be the cause of your 
problems, but is somewhat different from the others.

1) The namespace prefixes are not a crucial part of the data, and can be thrown 
away (and replaced with these generated symbols)

2) Many users are unhappy if this is done, so try not to do it.

3) The generated files have the following format

<rdf:RDF
    ALL NAMESPACES DECLARED HERE
    ...
    ...
    >
   TRIPLE-DATA
</rdf:RDF>

4) The input files from which the namespace prefixes came may use different 
legal XML layouts, or be in N3, and there may be many of them.

5) Hence namespace prefix clashes can occur:
    The same prefix is used to abbreviate different URIs in different places:
     - either in different parts of the same XML document (legal: supported for 
input but not for output)
     - or in different XML or N3 documents

  I cannot remember whether, when a clash occurs, both bindings are discarded 
or one is chosen (at random); the implementation of clash resolution has gone 
through several iterations.

6) If the prefix was chosen somewhere other than in an XML document, it might 
not be a legal XML namespace prefix. Legal XML namespace prefixes match the 
NCName construct in the Namespaces in XML Recommendation, and do not start 
(case-insensitively) with "xml".

7) A different sort of namespace clash occurs when the same namespace URI is 
given different prefixes in different documents (or document sections). Again 
this triggers clash resolution code, whose behavior has gone through many 
iterations, since there is no obviously right solution, and there are many 
solutions that are not obviously wrong.

8) The clash resolution code maintains a "prefix mapping", which is essentially 
a one-to-one map between prefixes and URIs.

9) Typically the namespace URI is determined from a full URI using the 
following algorithm:
   Start at the end of the URI.
   Work to the left, and find the leftmost alphabetic character that is 
followed only by alphanumeric characters up to the end of the URI.
   Everything before that character is taken as the namespace URI; the rest 
is the local name.
  (The actual character sets are NCNameStartChar and NCNameChar, defined in the 
Namespaces in XML Rec. These extend to much of Unicode, and also allow certain, 
but not all, punctuation characters.)
  A poor choice of namespace URI will lead to problems.
  Bad namespace URIs:
    http://example.org/myNamespace
    http://example.org/myProject#myNamespace
  Good namespace URIs end in "#" or "/".

Example of how a bad choice of namespace URI goes wrong:
   xmlns:eg="http://example.org/myNamespace"

then
  eg:foo abbreviates http://example.org/myNamespacefoo
  eg:bar abbreviates http://example.org/myNamespacebar

these are split as
  http://example.org/ myNamespacefoo
and
  http://example.org/ myNamespacebar

So we introduce j.0 as http://example.org/ (since no prefix was bound to 
that URI in the input).
And these come out as
   j.0:myNamespacefoo
and
   j.0:myNamespacebar
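
As a rough illustration of the split in point 9 and the j.N fallback, here is a
minimal Python sketch. It is not Jena's code: the function names split_uri and
assign_prefixes are hypothetical, the character classes are an ASCII-only
approximation of NCNameStartChar and NCNameChar, and numbering the generated
prefixes from j.0 in order of first use is an assumption.

```python
import re

# ASCII-only approximation (assumption); the real NCNameStartChar and
# NCNameChar sets cover much of Unicode.
NAME_START = re.compile(r"[A-Za-z_]")
NAME_CHAR = re.compile(r"[A-Za-z0-9_.\-]")


def split_uri(uri):
    """Split a URI into (namespace, local_name), scanning from the right."""
    i = len(uri)
    # Walk left over the trailing run of name characters.
    while i > 0 and NAME_CHAR.fullmatch(uri[i - 1]):
        i -= 1
    # The local name has to begin with a name-start character.
    while i < len(uri) and not NAME_START.fullmatch(uri[i]):
        i += 1
    return uri[:i], uri[i:]


def assign_prefixes(uris, declared):
    """Abbreviate URIs, inventing j.N for namespaces with no declared prefix.

    declared maps prefix -> namespace URI (hypothetical input shape).
    """
    by_namespace = {ns: prefix for prefix, ns in declared.items()}
    generated = {}
    qnames = []
    for uri in uris:
        ns, local = split_uri(uri)
        prefix = by_namespace.get(ns)
        if prefix is None:
            # No declared prefix matches the split namespace: generate one.
            prefix = generated.setdefault(ns, "j.%d" % len(generated))
        qnames.append(prefix + ":" + local)
    return qnames, generated


# The "bad" namespace from the example: it does not end in "#" or "/".
declared = {"eg": "http://example.org/myNamespace"}
uris = [
    "http://example.org/myNamespacefoo",
    "http://example.org/myNamespacebar",
]
qnames, generated = assign_prefixes(uris, declared)
# qnames    -> ['j.0:myNamespacefoo', 'j.0:myNamespacebar']
# generated -> {'http://example.org/': 'j.0'}
```

With a namespace that ends in "#", for instance
assign_prefixes(["http://example.org/ns#foo"], {"eg": "http://example.org/ns#"}),
the split lands exactly on the declared namespace and the sketch keeps eg:foo.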


====

The suggestion then is to review whether your data includes either illegal 
prefixes (point 6) or prefix clashes (point 5), and to fix them at the source.
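
One way to do that review is sketched below in Python. It assumes you have
already collected the (prefix, namespace URI) bindings from your input files
into a list; how you collect them is up to you. The names is_legal_prefix and
find_problems are hypothetical, and the NCName pattern is an ASCII-only
approximation of the real production.

```python
import re
from collections import defaultdict

# ASCII-only approximation of NCName (assumption); the real production in
# Namespaces in XML allows much of Unicode.
NCNAME = re.compile(r"[A-Za-z_][A-Za-z0-9_.\-]*")


def is_legal_prefix(prefix):
    """Point 6: looks like an NCName and does not start with 'xml' in any case.

    The empty string fails this check too; default namespaces are outside
    the scope of this sketch.
    """
    return bool(NCNAME.fullmatch(prefix)) and not prefix.lower().startswith("xml")


def find_problems(bindings):
    """bindings: iterable of (prefix, namespace_uri) pairs from all inputs."""
    uris_by_prefix = defaultdict(set)
    illegal = set()
    for prefix, uri in bindings:
        uris_by_prefix[prefix].add(uri)
        if not is_legal_prefix(prefix):
            illegal.add(prefix)
    # Point 5: the same prefix bound to more than one namespace URI.
    clashes = {p: sorted(u) for p, u in uris_by_prefix.items() if len(u) > 1}
    return sorted(illegal), clashes


bindings = [
    ("eg", "http://example.org/a#"),
    ("eg", "http://example.org/b#"),
    ("2nd", "http://example.org/c#"),
    ("XMLthing", "http://example.org/d#"),
]
illegal, clashes = find_problems(bindings)
# illegal -> ['2nd', 'XMLthing']
# clashes -> {'eg': ['http://example.org/a#', 'http://example.org/b#']}
```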

====

One further point: when writing the TRIPLE-DATA above, the code very 
occasionally finds it doesn't have a namespace that it needs. At this point it 
introduces a j.cook.up prefix, which is purely local in scope (i.e. it is only 
used for the element and attributes on which the namespace is declared). This 
could perhaps end up in Composer's rendition in some unusual circumstance.

Jeremy

PS I also point out that this sort of processing is entirely the point of 
namespace URIs.
What really matters is the namespace URI - and this doesn't get lost.
When merging data from many sources it is not surprising if more than one 
source uses the same short form for different things; when this happens tools 
need to resolve the problem somehow, and Jena (underlying TBC here) resolves 
things as described above.
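
To make the point about nothing being lost concrete, here is a tiny Python
sketch using the example.org names from point 9. The expand helper is
hypothetical; it only concatenates the namespace bound to a prefix with the
local name, which is what a qualified name abbreviates.

```python
def expand(qname, prefixes):
    """Expand prefix:local into a full URI using a prefix -> namespace map."""
    prefix, local = qname.split(":", 1)
    return prefixes[prefix] + local


original = {"eg": "http://example.org/myNamespace"}
rewritten = {"j.0": "http://example.org/"}

before = expand("eg:foo", original)
after = expand("j.0:myNamespacefoo", rewritten)
# Both are 'http://example.org/myNamespacefoo': only the abbreviation changed.
same_uri = before == after
```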

PPS I think in the first versions all namespaces were assigned j.N prefixes!

> -----Original Message-----
> From: [email protected] [mailto:topbraid-composer-
> [EMAIL PROTECTED] On Behalf Of Scott Henninger
> Sent: Thursday, October 16, 2008 8:34 AM
> To: TopBraid Composer Users
> Subject: [tbc-users] Re: j.1:, j.2: .... j.n: namespaces spontaneously
> created.
> 
> 
> Don; The spontaneous namespace prefixes has to do with element naming
> when creating legal XML.  In particular, element names have to be
> qnames (<prefix>:<name>).  The underlying XML serialization writer
> will create legal XML names by creating a qname (j.x:<name>) and
> adding the prefix statement xmlns:j.x="..."  To make the prefixes
> meaningful, go to the ontology home and re-name the prefix.
> 
> -- Scott
> 
> On Oct 16, 8:37 am, donundeen <[EMAIL PROTECTED]> wrote:
> > Hi,
> > What's up with these j.n (where n is an integer) namespace prefixes
> > being spontaneously created in my ontologies?
> >
> > I've been importing xml, running constructs to create new classes, and
> > I'm noticing these namespaces are being created by the system as
> > abbreviations for some of my ad-hoc URIs.
> >
> > why does that happen?
> 

