[ 
https://issues.apache.org/jira/browse/LUCENE-5735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14306666#comment-14306666
 ] 

David Smiley commented on LUCENE-5735:
--------------------------------------

The PrefixTreeFacetCounter utility is good; if it doesn't get committed to 5x 
as part of this issue first, it will be as part of the heatmap one.

There's a bug in NumberRangePrefixTreeStrategy.calcFacets in which all cells 
above the parent are counted as topLeaves, when really that can only be done if 
the leaf cell _contains_ the facet range.  I have a fix in progress in which I 
detect this, and if the cell doesn't contain the facet range then I walk the 
sub-cells and increment the counters on the parent facet cells.  _There's a 
rare-ish bug I still need to debug._  But thus far there are a few changes 
pending in my local check-out:
* Make TreeCellIterator public (lucene.internal, still) and allow the 'cell' to 
be a cell other than the top world cell.  Probably add a reset() 
constructor-like method to re-use an instance.
* NRCell has an optimization when getting subCells that has worked fine in 
the normal code paths thus far, but the in-progress faceting code has shown 
the optimization to be faulty.  I just removed it, as I don't think it is 
worth trying to make it work.
* NRCell sometimes can't get subCells if it was initialized from a short length 
shape/bytes; it should instead always initialize its array to maxLevels.  
Again, this apparently never happens in normal code paths, but I triggered it 
in some toy test code.
* Refactor the two main date range tests to share a random calendar utility 
(RandomCalHelper).
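
The counting rule behind the calcFacets fix described above can be sketched in 
a toy model.  This is not the Lucene code: the class TopLeafSketch, the method 
countHighLeaf, and the idea of representing cells and the facet range as 
inclusive spans of parent-level indexes (think months) are all hypothetical 
stand-ins, assumed here only to show when the topLeaves shortcut is safe.

```java
import java.util.Map;
import java.util.TreeMap;

/** Toy model (hypothetical): cells and the facet range are inclusive spans of parent-level indexes. */
class TopLeafSketch {
    // Leaves known to cover the whole facet range, so they apply to every bucket.
    int topLeaves = 0;
    // Leaf counts attributed to individual parent facet cells, keyed by parent index.
    final Map<Integer, Integer> parentLeaves = new TreeMap<>();

    /** Counts one leaf cell [cellFirst, cellLast] that sits above the parent level. */
    void countHighLeaf(int cellFirst, int cellLast, int facetFirst, int facetLast) {
        boolean containsFacetRange = cellFirst <= facetFirst && facetLast <= cellLast;
        if (containsFacetRange) {
            topLeaves++; // shortcut is only valid in this case
            return;
        }
        // Otherwise walk the parent-level sub-cells that fall inside the facet range
        // and increment the counter of each one.
        int from = Math.max(cellFirst, facetFirst);
        int to = Math.min(cellLast, facetLast);
        for (int parent = from; parent <= to; parent++) {
            parentLeaves.merge(parent, 1, Integer::sum);
        }
    }

    public static void main(String[] args) {
        TopLeafSketch counts = new TopLeafSketch();
        // Facet range covers parents 10..15.
        counts.countHighLeaf(0, 23, 10, 15);  // contains the whole facet range
        counts.countHighLeaf(12, 23, 10, 15); // overlaps only parents 12..15
        System.out.println("topLeaves=" + counts.topLeaves
            + " parentLeaves=" + counts.parentLeaves);
    }
}
```

In this model the first cell is counted once in topLeaves, while the second 
is spread over the parent cells it overlaps; counting the second as a top leaf 
would wrongly credit parents 10 and 11.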

> Faceting for DateRangePrefixTree
> --------------------------------
>
>                 Key: LUCENE-5735
>                 URL: https://issues.apache.org/jira/browse/LUCENE-5735
>             Project: Lucene - Core
>          Issue Type: New Feature
>          Components: modules/spatial
>            Reporter: David Smiley
>            Assignee: David Smiley
>             Fix For: 5.x
>
>         Attachments: LUCENE-5735.patch, LUCENE-5735.patch, 
> LUCENE-5735__PrefixTreeFacetCounter.patch
>
>
> The newly added DateRangePrefixTree (DRPT) encodes terms in a fashion 
> amenable to faceting by meaningful time buckets. The motivation for this 
> feature is to efficiently populate a calendar bar chart or 
> [heat-map|http://bl.ocks.org/mbostock/4063318]. It's not hard if you have 
> date instances like many do but it's challenging for date ranges.
> Internally this is going to iterate over the terms using seek/next with 
> TermsEnum as appropriate.  It should be quite efficient; it won't need any 
> special caches. I should be able to re-use SPT traversal code in 
> AbstractVisitingPrefixTreeFilter.  If this goes especially well, the 
> underlying implementation will be re-usable for geospatial heat-map faceting.
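
The contrast the description draws between date instances and date ranges can 
be illustrated with a brute-force sketch.  It is not the term-iteration 
approach planned for this issue (it uses no TermsEnum and no prefix tree); the 
class RangeBucketSketch and the method countRange are hypothetical names, and 
month buckets are an assumed example of a "meaningful time bucket".

```java
import java.time.LocalDate;
import java.time.YearMonth;
import java.util.Map;
import java.util.TreeMap;

class RangeBucketSketch {
    /** Adds one to every month bucket that the inclusive date range [start, end] overlaps. */
    static void countRange(Map<YearMonth, Integer> buckets, LocalDate start, LocalDate end) {
        YearMonth last = YearMonth.from(end);
        for (YearMonth m = YearMonth.from(start); !m.isAfter(last); m = m.plusMonths(1)) {
            buckets.merge(m, 1, Integer::sum);
        }
    }

    public static void main(String[] args) {
        Map<YearMonth, Integer> buckets = new TreeMap<>();
        // A single date is a degenerate range and lands in exactly one bucket.
        countRange(buckets, LocalDate.of(2015, 1, 20), LocalDate.of(2015, 1, 20));
        // A real range lands in every month it touches (January through March here).
        countRange(buckets, LocalDate.of(2015, 1, 25), LocalDate.of(2015, 3, 2));
        System.out.println(buckets);
    }
}
```

Doing this per document costs time proportional to the number of buckets each 
range spans, which is the kind of work a term-based traversal is meant to avoid.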



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
