Goal: Remove concept of slots from InfosetImpl

Reasons:
    1. So that element/node lookup can be name based.
    2. Ease re-enabling of Unordered Sequences.
3. Ease enabling of 'query-style' expressions that return more than one node.

An Infoset is a document (DIDocument) and the document contains exactly one element (DIElement) as its root. These elements can be simple (DISimple) or complex (DIComplex). Simple elements do not contain children, complex elements do. DIArray represents an array (maxOccurs >1). A DIArray shall always be a child of DIComplex. All of these inherit from DINode.

Currently DIComplex models its children as an Array of DINodes. The problem with the current implementation is that there's no way for us to just ask for an element by name without walking the structure.

The solution involves modifying DIComplex to represent its children both as a mutable.ArrayBuffer[DINode] and a HashMap[String, mutable.Seq[DINode]]. The ArrayBuffer shall give us the necessary insertion ordering information we need regarding the children as well as constant time indexing access to the child in question. The ArrayBuffer also has the added benefit of growing and shrinking as objects are added and removed. While the HashMap will allow us to associate a name (String) with DINodes that have that name thus providing a fast named lookup.

This named lookup can be optimized further by only inserting into the HashMap those elements that we know are used in expressions. These are the elements that would most likely benefit from a fast named lookup and will allow us to avoid additional overhead of hashing strings and allocating/appending to a sequence.

The reason for having the HashMap contain a mutable.Seq[DINode] is that it's possible for the following:

    <element name="a" .../>
    <element name="a" maxOccurs="unbounded"..../>
    <element name="b".../>
    <element name="a" ..../>

Each element named 'a' would have an associated DINode. Because there are multiple DINodes named 'a', we need to represent them in the HashMap using the same key 'a'. To do this, we add all instances of 'a' to a Seq[DINode]. In this particular case, the HashMap would look as follows:

    "a" -> mutable.Seq(DIElement, DIArray, DIElement)
    "b" -> mutable.Seq(DIElement)

The mutable.ArrayBuffer maintains the insertion order and would look as follows:

    mutable.ArrayBuffer(DIElement, DIArray, DIElement, DIElement)

Back tracking shall be accomplished by going to the last item in the DIComplex ArrayBuffer and removing it. 1. If the last item is a DIElement, we simply delete it from the ArrayBuffer and then remove it also from the HashMap. 2. If the last item is a DIArray, we then have to look at the contents of the DIArray. a. If the DIArray has more than one entry, remove the last entry. Then update the associated entry in the ArrayBuffer and HashMap with the modified DIArray. b. If the DIArray has a single entry then we simply remove DIArray from ArrayBuffer and the HashMap.

    <element name="root>
         <sequence ../>
         <element name="a" .../>
         <element name="a" maxOccurs="unbounded">
           ...
           <element name="b" .../>
           <element name="c" .../>
           ...
         </element>
       </sequence>
    </element>

If the HashMap of DIComplex-root looks like the following:
     "a" -> mutable.Seq(DIElement, DIArray)

ArrayBuffer of DIComplex-root looks like the following:
     0 -> DIElement("a")
     1 -> DIArray("a")

Where DIArray children ArrayBuffer looks like:
     0 -> DIElement("b")
     1 -> DIElement("c")

Given the initial example if we backtrack(1):  // purge one item
     DIArray:
         0 -> DIElement("b")  // removed one item from DIArray

     HashMap DIComplex-root:
         "a" -> mutable.Seq(DIElement, DIArray)

     ArrayBuffer DIComplex-root:
         0 -> DIElement("a")
         1 -> DIArray("a")

Given the initial example if we backtrack(2): // purge two items
     DIArray:
         Empty

     HashMap DIComplex-root:
         "a" -> mutable.Seq(DIElement)  // Because DIArray had two items
                                        // removed and is now empty

     ArrayBuffer DIComplex-root:
         0 -> DIElement("a")

===

class DIComplex {
  def nSlots = erd.nChildSlots

  val children: mutable.ArrayBuffer[DINode](nSlots) // insertion order
  lazy val fastLookup: HashMap[String, mutable.Seq[DINode]]

  def addChild(node: DINode) {
    children += node
    fastLookup(node.name).append(node)
  }

  def getChild(name: String): immutable.Seq[DINode] {
    fastLookup.getOrElse(name, mutable.Seq.empty) // fastLookup(name)
  }

  def backtrack(int n) {
    for (i = 0; i < n; i++) {
      val endIdx = children.size -1
      val child = children.remove(endIdx) // children.pop()

      fastLookup.get(child.name) match {  // fastLookup(child.name).pop()
        case None => // do nothing
    case Some(nodesSeq) if nodesSeq.size == 0 => // Shouldn't happen?
        case Some(nodesSeq) => {
      val lastEntry = nodesSeq.last
          lastEntry match {
        case arr: DIArray => {
// If the array has >= 2 items, remove last one and update nodesSeq
              // with this array:
              //  val lastIdx = nodesSeq.length - 1
          //  arr.remove(arr.size - 1)
// fastLookup.update(child.name, nodesSeq.update(lastIdx, arr))
              //
              // If the array has 1 item, remove the array from nodesSeq
              //  fastLookup.update(child.name, nodesSeq.dropRight(1))
        }
            case _ => fastLookup.update(child.name, nodesSeq.dropRight(1))
          }

        }
      }
    }
  }

}

class DIArray {
private val initialSize = parent.tunable.initialElementOccurrencesHint.toInt // set to 10
    protected final val _contents = new ArrayBuffer[DIElement](initialSize)

    def append(ie: InfosetElement): Unit = {
        _contents.append(ie.asInstanceOf[DIElement])
        ie.setArray(this)
    }

    //
    // For backtracking purposes
    //
    def remove(n: Int): DIElement = {
        _contents.remove(n)
    }
}

===

To facillitate a breadth-first ordering scheme we shall have to construct a 'global' that lives on the Element Runtime Data (ERD). It shall be called childElementDeclPosition and pertains to the static order of the element children of a complex type element. When sorting DINodes, childElementDeclPosition can be obtained by way of reaching from the DINode to the coresponding ERD. Any array shall be a single entry in the sequence that needs to be sorted. All elements in this array will have the same childElementDeclPosition.

The difficulty in implementing this is that an Element's dsom.ComplexType can have groups within the groups before finding actual element children. Therefore in the Document Schema Object Model (DSOM) the model group object shall have a startingChildElementDeclPosition and an endingChildElementDeclPosition.

1. The startingChildElementPosition shall be passed to the constructor for the Term. 2. If the term is an element, the endingChildElementDeclPosition is equal to the startingChildElementDeclPosition. 3. If the term is a model group, then its first child term's startingChildElementDeclPosition is equal to its own startingChildElementDeclPosition, the 2nd to Nth child term's startingChildElementDeclPosition is the prior term's endingChildElementDeclPosition + 1. 4. The endingChildElementDeclPosition for a model-group shall be the endingChildElmentDeclPosition of its last child term.

Reply via email to