mbeckerle commented on a change in pull request #17: Design notes on schema compiler space/speed issue
URL: https://github.com/apache/incubator-daffodil-site/pull/17#discussion_r376057942
 
 

 ##########
 File path: site/dev/design-notes/namespace-binding-minimization.adoc
 ##########
 @@ -0,0 +1,142 @@
+:page-layout: page
+:keywords: schema-compiler performance alignment optimization
+// ///////////////////////////////////////////////////////////////////////////
+//
+// This file is written in AsciiDoc.
+//
+// If you can read this comment, your browser is not rendering asciidoc automatically.
+//
+// You need to install the asciidoc plugin to Chrome or Firefox
+// so that this page will be properly rendered for your viewing pleasure.
+//
+// You can get the plugins by searching the web for 'asciidoc plugin'
+//
+// You will want to change plugin settings to enable diagrams (they're off by default.)
+//
+// You need to view this page with Chrome or Firefox.
+//
+// ///////////////////////////////////////////////////////////////////////////
+//
+// When editing, please start each sentence on a new line.
+// See https://asciidoctor.org/docs/asciidoc-recommended-practices/#one-sentence-per-line[one sentence-per-line writing technique.]
+// This makes textual diffs of this file useful in a similar way to the way they work for code.
+//
+// //////////////////////////////////////////////////////////////////////////
+
+== Namespace Binding Minimization
+
+=== Introduction
+
+DFDL schemas are XML schemas and so DFDL inherits the namespace system of XML and XML Schema for composing large schemas from smaller ones, for reusing schema files, and for managing name conflicts.
+
+A DFDL Infoset isn't necessarily represented as XML however.
+Some representations won't have any ability to deal with namespaces (JSON for example), and so Daffodil will sometimes issue warnings when compiling a schema if the namespace usage will not allow unambiguous representation without namespaces.
+
+Most representations of DFDL Infosets will, like XML, use some representation of the namespaces of elements, and in textual forms this will most commonly be by way of namespace prefixes.
+XML is not the only representation that uses namespaces, however, so this should not be taken as an entirely XML-specific discussion.
+
+There are two goals for namespace-binding minimization.
+
+. Clarity: Infosets that have redundant namespace bindings are very hard to read and understand, and require namespace-binding-aware tooling to compare them, or clumsy post-processing to remove the excess bindings.
+
+. Performance: Attaching an element to the infoset at runtime should take constant time.
+
+The most straightforward way to achieve both these objectives is to express namespace definitions only on the root element of the infoset, and to change the namespace prefixes (or use of default namespace) for some qualified-name elements if that is necessary to enable this.
 
 Review comment:
   From team discussion: messing with prefix definitions like this creates many 
problems. Even if all the algorithms are deterministic, moving around include 
files or the order of namespace bindings could produce different namespace 
prefixes for the same exact logical data. 
   
   Consider just pre-computing a data structure that will answer the question 
of what local namespace bindings to introduce, and what prefix to use 
(hopefully NOT tns). This usually would be constant time, but it can't be in 
the worst case: the right namespace bindings can depend on the whole nest from 
the root on down, so when appending D under root->A->B->C one may need to know 
that full path.
   
   Usually it won't be that bad. 
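   
   A rough sketch of what such a pre-computed structure could look like, in 
Python for brevity rather than Daffodil's Scala. Everything here is 
hypothetical: the `schema` dict shape, `precompute_plans`, and the assumption 
that the schema is non-recursive and that each element has exactly one 
namespace and one preferred prefix. Prefixes are never renamed; a local binding 
is introduced only when the in-scope binding for that prefix is missing or 
points at a different namespace.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical schema model (an assumption, not Daffodil's representation):
# element name -> (namespace URI, preferred prefix, child element names)
Schema = Dict[str, Tuple[str, str, List[str]]]
Path = Tuple[str, ...]
# (prefix to use, local binding (prefix, uri) to introduce, or None)
Plan = Tuple[str, Optional[Tuple[str, str]]]


def precompute_plans(schema: Schema, root: str) -> Dict[Path, Plan]:
    """Walk the (non-recursive) schema once and record, for every path from
    the root, which prefix to use and whether a local binding is needed."""
    plans: Dict[Path, Plan] = {}

    def walk(name: str, parent_path: Path, in_scope: Dict[str, str]) -> None:
        ns, prefix, children = schema[name]
        path = parent_path + (name,)
        if in_scope.get(prefix) == ns:
            # The enclosing elements already bind this prefix correctly.
            plans[path] = (prefix, None)
            scope = in_scope
        else:
            # Prefix is unbound, or bound to another namespace: shadow it here.
            scope = dict(in_scope)
            scope[prefix] = ns
            plans[path] = (prefix, (prefix, ns))
        for child in children:
            walk(child, path, scope)

    walk(root, (), {})
    return plans


# D is reachable under B and under C; prefix "x" means urn:b in one schema
# file and urn:c in another, so the answer for D depends on the path.
schema: Schema = {
    "root": ("urn:a", "a", ["A"]),
    "A": ("urn:a", "a", ["B"]),
    "B": ("urn:b", "x", ["C", "D"]),
    "C": ("urn:c", "x", ["D"]),
    "D": ("urn:b", "x", []),
}
plans = precompute_plans(schema, "root")
```

   In this sketch the table is keyed by the full path from the root, so a 
lookup costs time proportional to the nesting depth. That is the worst case 
described above; when an element's plan is the same on every path, the key 
could collapse to just the element, which is the usual constant-time case.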

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services
