Andy Seaborne created JENA-2049:
-----------------------------------

             Summary: Binding Improvements
                 Key: JENA-2049
                 URL: https://issues.apache.org/jira/browse/JENA-2049
             Project: Apache Jena
          Issue Type: Improvement
          Components: ARQ
            Reporter: Andy Seaborne
            Assignee: Andy Seaborne
             Fix For: Jena 4.0.0


{{Binding}} is a map from variables to nodes used in query processing. It is
 central to the execution of SPARQL queries.

h3. Context

One feature is that binding objects form a tree. To add variables to an 
existing binding, whether for a join or an assignment (BIND), a new binding 
object is created with the original as its parent. This avoids copy costs, and 
the parent can be shared across solutions.
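
As an illustration of the parent-chained design described above, here is a minimal sketch. It is not Jena's code: {{ChainedBinding}}, {{extend}} and the use of plain strings for variables and nodes are assumptions made to keep the example self-contained.

```java
/** Hypothetical parent-chained binding; a simplification, not Jena's actual class. */
final class ChainedBinding {
    private final ChainedBinding parent; // null at the root
    private final String var;            // null only for the empty root
    private final String node;

    private ChainedBinding(ChainedBinding parent, String var, String node) {
        this.parent = parent;
        this.var = var;
        this.node = node;
    }

    /** The empty binding that every chain starts from. */
    static ChainedBinding root() {
        return new ChainedBinding(null, null, null);
    }

    /** Add one pair by creating a child; the existing binding is not copied. */
    ChainedBinding extend(String newVar, String newNode) {
        if (get(newVar) != null)
            throw new IllegalArgumentException("Already bound: " + newVar);
        return new ChainedBinding(this, newVar, newNode);
    }

    /** Look up a variable by walking from this binding up to the root. */
    String get(String v) {
        for (ChainedBinding b = this; b != null; b = b.parent) {
            if (v.equals(b.var))
                return b.node;
        }
        return null;
    }

    /** Number of pairs visible from this binding. */
    int size() {
        int n = 0;
        for (ChainedBinding b = this; b != null; b = b.parent) {
            if (b.var != null)
                n++;
        }
        return n;
    }
}

class BindingTreeSketch {
    public static void main(String[] args) {
        // One parent binding, shared by two solutions.
        ChainedBinding parent = ChainedBinding.root().extend("s", "<http://example/a>");
        ChainedBinding solution1 = parent.extend("o", "1");
        ChainedBinding solution2 = parent.extend("o", "2");
        System.out.println(solution1.get("s") + " " + solution1.get("o"));
        System.out.println(solution2.get("s") + " " + solution2.get("o"));
    }
}
```

Each {{extend}} call allocates one small object, which is the source of the many small bindings noted in the next paragraph.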

This design leads to large numbers of {{Binding}} objects, the majority of 
which are quite small (1 or 2 variables).

h3. Current Situation

For general building, the binding implementation is the general {{BindingMap}}. 
This is used a lot because the number of variables is not easy to determine at 
the time the {{Binding}} object is allocated.

{{Binding1}} is the special case of one var/node pair. This "one slot" binding 
requires the application to know in advance that there will be exactly one 
pair, so it has limited use.

{{BindingMap}} is not immutable, although when it is held as a {{Binding}} the
 mutating operations can only be reached with a cast.

{{BindingMap}} is implemented using a Map ({{HashMap}}) with a special case of a
 binding of one slot.

A map consists of several Java objects, including an array. It is quite large, 
and Java objects require initialization. For streaming queries, the total 
footprint is quite low; the GC reclaims space as result rows are finished 
with. For queries that accumulate results (e.g. a general sort), bindings are 
held for a period of time.

h3. Proposal

 * A builder pattern to accumulate "(var,node)" pairs during a construction 
phase, followed by a build step that can then choose the most appropriate 
immutable {{Binding}} implementation.
 This takes the work of selecting the right binding implementation off the
 rest of the code.
 * Expand the number of fixed-length bindings: special cases for 0, 1, 2, 3, 
and 4 pairs.
 * The builder can be reused (same thread) so the cost of object creation for 
the builder can be amortized.
 * Truly immutable bindings for long term stability.
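
The proposal above could be sketched roughly as follows. This is an illustration under stated assumptions, not the proposed implementation: every type name here ({{SketchBinding}}, {{SketchBindingBuilder}}, {{OneSlot}} and so on) is hypothetical, variables and nodes are plain strings, and only the 0, 1 and 2 pair cases are written out; 3 and 4 would follow the same pattern.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

/** Hypothetical read-only binding; variables and nodes are plain strings here. */
interface SketchBinding {
    int size();
    String get(String var); // null if the variable is not bound
}

/** Zero pairs: a shared singleton. */
final class EmptySlots implements SketchBinding {
    static final EmptySlots INSTANCE = new EmptySlots();
    private EmptySlots() {}
    public int size() { return 0; }
    public String get(String var) { return null; }
}

/** Exactly one pair, held in fields with no map or array. */
final class OneSlot implements SketchBinding {
    private final String var1, node1;
    OneSlot(String var1, String node1) { this.var1 = var1; this.node1 = node1; }
    public int size() { return 1; }
    public String get(String var) { return var1.equals(var) ? node1 : null; }
}

/** Exactly two pairs, held in fields. */
final class TwoSlots implements SketchBinding {
    private final String var1, node1, var2, node2;
    TwoSlots(String var1, String node1, String var2, String node2) {
        this.var1 = var1; this.node1 = node1;
        this.var2 = var2; this.node2 = node2;
    }
    public int size() { return 2; }
    public String get(String var) {
        if (var1.equals(var)) return node1;
        if (var2.equals(var)) return node2;
        return null;
    }
}

/** General case: an unmodifiable copy of the accumulated pairs. */
final class MapSlots implements SketchBinding {
    private final Map<String, String> map;
    MapSlots(List<String> vars, List<String> nodes) {
        Map<String, String> m = new LinkedHashMap<>();
        for (int i = 0; i < vars.size(); i++) m.put(vars.get(i), nodes.get(i));
        this.map = Collections.unmodifiableMap(m);
    }
    public int size() { return map.size(); }
    public String get(String var) { return map.get(var); }
}

/** Accumulates pairs, then picks the smallest suitable implementation. Single-thread use. */
final class SketchBindingBuilder {
    private final List<String> vars = new ArrayList<>();
    private final List<String> nodes = new ArrayList<>();

    SketchBindingBuilder add(String var, String node) {
        Objects.requireNonNull(var, "var");
        Objects.requireNonNull(node, "node");
        if (vars.contains(var))
            throw new IllegalArgumentException("Already bound: " + var);
        vars.add(var);
        nodes.add(node);
        return this;
    }

    /** Build an immutable binding and reset, so the builder can be reused. */
    SketchBinding build() {
        SketchBinding b;
        switch (vars.size()) {
            case 0: b = EmptySlots.INSTANCE; break;
            case 1: b = new OneSlot(vars.get(0), nodes.get(0)); break;
            case 2: b = new TwoSlots(vars.get(0), nodes.get(0), vars.get(1), nodes.get(1)); break;
            default: b = new MapSlots(vars, nodes); // copies, so reset() is safe
        }
        vars.clear();
        nodes.clear();
        return b;
    }
}

class BindingBuilderSketch {
    public static void main(String[] args) {
        SketchBindingBuilder builder = new SketchBindingBuilder();
        SketchBinding one = builder.add("x", "1").build();
        SketchBinding three = builder.add("a", "1").add("b", "2").add("c", "3").build();
        System.out.println(one.getClass().getSimpleName() + " size=" + one.size());
        System.out.println(three.getClass().getSimpleName() + " size=" + three.size());
    }
}
```

In this sketch the calling code never names a concrete binding class; the choice is made in one place, inside {{build()}}, and the returned objects expose no mutating operations.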

h3. Compatibility

Retain, marked deprecated, the {{BindingFactory}} methods to create a 
{{BindingMap}} so existing code continues to work as before; the deprecation 
indicates that it would be a good idea to switch to the builder.

h3. Non-goal

This is not a speed up (nor a slow down).



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
