[ 
https://issues.apache.org/jira/browse/JENA-44?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13038699#comment-13038699
 ] 

Stephen Allen commented on JENA-44:
-----------------------------------

Some answers with respect to JENA-45:

1) In JENA-45, bnodes are encoded with NodeFmtLib.safeBNodeLabel(String) and 
decoded with a new method, NodeFmtLib.decodeSafeBNodeLabel(String).  The 
encoder outputs 'B' followed by the internal blank node label encoded in 
hexadecimal, which lets us deserialize back to the proper Jena bnode.

2) I think we need to add a way to make ThresholdPolicyCount configurable 
externally.  Alternatively, we could base the threshold on actual memory 
usage, estimated by looking at the contents of each Binding object as it is 
added.

3) Ultimately, having a unified memory manager (perhaps with a preset limit 
for each query) would be the ideal way to manage the operators that need 
memory.  Looking at [1] (from the Hadoop project) gives an indication of how 
this might be approached.
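
To make point 1 concrete, here is a minimal sketch of a 'B' + hexadecimal 
round trip.  It is not the NodeFmtLib code from the JENA-45 patch: the class 
name BNodeLabelCodec and its two methods are hypothetical, and hex-encoding the 
UTF-8 bytes of the label is an assumption about how the label is turned into 
hexadecimal.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only; NOT the actual NodeFmtLib code.
final class BNodeLabelCodec {
    private static final char[] HEX = "0123456789ABCDEF".toCharArray();

    /** Encodes a label as 'B' followed by the hex digits of its UTF-8 bytes (assumed scheme). */
    static String encode(String label) {
        byte[] bytes = label.getBytes(StandardCharsets.UTF_8);
        StringBuilder sb = new StringBuilder(1 + 2 * bytes.length);
        sb.append('B');
        for (byte b : bytes) {
            sb.append(HEX[(b >> 4) & 0xF]).append(HEX[b & 0xF]);
        }
        return sb.toString();
    }

    /** Reverses encode(): checks the 'B' prefix, then turns each hex pair back into a byte. */
    static String decode(String safe) {
        if (safe.isEmpty() || safe.charAt(0) != 'B' || safe.length() % 2 != 1) {
            throw new IllegalArgumentException("Not an encoded bnode label: " + safe);
        }
        byte[] bytes = new byte[(safe.length() - 1) / 2];
        for (int i = 0; i < bytes.length; i++) {
            int hi = Character.digit(safe.charAt(1 + 2 * i), 16);
            int lo = Character.digit(safe.charAt(2 + 2 * i), 16);
            if (hi < 0 || lo < 0) {
                throw new IllegalArgumentException("Bad hex digit in: " + safe);
            }
            bytes[i] = (byte) ((hi << 4) | lo);
        }
        return new String(bytes, StandardCharsets.UTF_8);
    }
}
```

With this scheme, encode("a:1") gives "B613A31" and decode("B613A31") gives 
back "a:1".  The output contains only 'B' and hex digits, so characters such 
as ':' in an internal label cannot clash with the serialization syntax.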


[1] http://pig.apache.org/docs/r0.8.1/api/org/apache/pig/data/DataBag.html


> Support external sorting of bindings in ARQ
> -------------------------------------------
>
>                 Key: JENA-44
>                 URL: https://issues.apache.org/jira/browse/JENA-44
>             Project: Jena
>          Issue Type: New Feature
>          Components: ARQ
>            Reporter: Sam Tunnicliffe
>            Assignee: Paolo Castagna
>            Priority: Minor
>         Attachments: JENA-44-0.patch, JENA-44_ARQ_r8531.patch, 
> JENA-44_ARQ_r8724.patch
>
>
> In QueryIterSort, the sorting of the contents of an Iterator<Binding> is done 
> in memory, using Arrays.sort. This can be problematic where the set to be 
> sorted is large. A possible solution could be to use an external, disk-backed 
> algorithm. A hybrid approach may be better, whereby we attempt the in-memory 
> sort, but when the number of bindings encountered goes over a certain number, 
> resort to the disk-backed variant.
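
The hybrid approach described above can be sketched roughly as follows.  This 
is an illustration under stated assumptions, not the code from the attached 
patches: HybridSorter and its methods are hypothetical names, plain strings 
(one per line, no embedded newlines) stand in for serialized Binding objects, 
and a fixed count threshold stands in for ThresholdPolicyCount.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.PriorityQueue;
import java.util.function.Consumer;

// Hypothetical sketch: sorts in memory until a count threshold is hit, then spills sorted runs.
final class HybridSorter {
    /** Current head line of one spilled run, ordered by the line itself. */
    private static final class Head implements Comparable<Head> {
        final String line;
        final BufferedReader source;
        Head(String line, BufferedReader source) { this.line = line; this.source = source; }
        @Override public int compareTo(Head other) { return line.compareTo(other.line); }
    }

    private final int threshold;
    private final List<String> buffer = new ArrayList<>();
    private final List<Path> runs = new ArrayList<>();

    HybridSorter(int threshold) {
        if (threshold < 1) throw new IllegalArgumentException("threshold must be >= 1");
        this.threshold = threshold;
    }

    /** Buffers an item (must not contain a newline); spills once the threshold is reached. */
    void add(String item) {
        buffer.add(item);
        if (buffer.size() >= threshold) spill();
    }

    /** Number of sorted runs currently on disk. */
    int runCount() { return runs.size(); }

    /** Sorts the buffer and writes it to a temporary file as one sorted run. */
    private void spill() {
        try {
            Collections.sort(buffer);
            Path run = Files.createTempFile("sort-run-", ".txt");
            Files.write(run, buffer, StandardCharsets.UTF_8);
            runs.add(run);
            buffer.clear();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Emits all items in sorted order: in memory if nothing spilled, else by a k-way merge. */
    void drainSorted(Consumer<String> sink) {
        if (runs.isEmpty()) {               // never crossed the threshold: plain in-memory sort
            Collections.sort(buffer);
            buffer.forEach(sink);
            buffer.clear();
            return;
        }
        if (!buffer.isEmpty()) spill();     // make the leftover items a run as well
        List<BufferedReader> readers = new ArrayList<>();
        try {
            PriorityQueue<Head> heap = new PriorityQueue<>();
            for (Path run : runs) {
                BufferedReader reader = Files.newBufferedReader(run, StandardCharsets.UTF_8);
                readers.add(reader);
                String first = reader.readLine();
                if (first != null) heap.add(new Head(first, reader));
            }
            while (!heap.isEmpty()) {       // repeatedly take the smallest head and refill from its run
                Head head = heap.poll();
                sink.accept(head.line);
                String next = head.source.readLine();
                if (next != null) heap.add(new Head(next, head.source));
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } finally {
            for (BufferedReader reader : readers) {
                try { reader.close(); } catch (IOException ignored) { }
            }
            for (Path run : runs) run.toFile().delete();
            runs.clear();
        }
    }
}
```

Small inputs never touch the disk, and large ones keep at most `threshold` 
items plus one line per run in memory during the merge.  A version working on 
real bindings would need a serializer for Binding and a comparator in place of 
natural string order.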

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
