Hello,
Have you considered the range field?
https://lucene.apache.org/core/9_1_0/core/org/apache/lucene/document/IntRange.html
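A rough sketch of the idea, in plain Java so it runs without Lucene on the classpath. It assumes each sector string is mapped to a small int ordinal (the sectorId below is hypothetical, as are the class names and the "intervals" field name), so that every interval becomes a 2-D box: dimension 0 is [sectorId, sectorId] and dimension 1 is [entry, exit]. IntRange accepts up to four dimensions, so all boxes could go into one multi-valued field, and each value should keep its entry and exit paired. The comments show where IntRange and IntRange.newIntersectsQuery would be called; the intersects method only models the closed-box test such a query is expected to apply.

```java
import java.util.List;

class SectorRangeSketch {
    /** One interval of a Root, with the sector replaced by a hypothetical int ordinal. */
    static final class Interval {
        final int sectorId, entry, exit;

        Interval(int sectorId, int entry, int exit) {
            this.sectorId = sectorId;
            this.entry = entry;
            this.exit = exit;
        }

        // Lower corner of the box; at index time: doc.add(new IntRange("intervals", min(), max()))
        int[] min() { return new int[] { sectorId, entry }; }

        // Upper corner of the box; at query time: IntRange.newIntersectsQuery("intervals", min(), max())
        int[] max() { return new int[] { sectorId, exit }; }
    }

    /** Closed boxes intersect only if they overlap in every dimension. */
    static boolean intersects(int[] minA, int[] maxA, int[] minB, int[] maxB) {
        for (int d = 0; d < minA.length; d++) {
            if (minA[d] > maxB[d] || maxA[d] < minB[d]) {
                return false;
            }
        }
        return true;
    }

    /** True if any indexed interval shares a sector and overlaps in time with the incoming one. */
    static boolean anyOverlap(List<Interval> indexed, Interval incoming) {
        for (Interval i : indexed) {
            if (intersects(i.min(), i.max(), incoming.min(), incoming.max())) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<Interval> indexed = List.of(new Interval(7, 100, 200), new Interval(8, 300, 400));
        System.out.println(anyOverlap(indexed, new Interval(7, 150, 250))); // same sector, overlapping times
        System.out.println(anyOverlap(indexed, new Interval(9, 150, 250))); // sector not present in the index
        System.out.println(anyOverlap(indexed, new Interval(7, 200, 300))); // shares only the endpoint 200
    }
}
```

With this layout the field count no longer grows with the number of sectors. The id exclusion would stay a separate MUST_NOT clause, as in the original query.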

On Mon, Jan 20, 2025 at 11:34 PM Cleber Muramoto <cleber.muram...@gmail.com>
wrote:

> Hello.
>
> My model has the following Root structure, which consists of N
> "TimeSpaceIntervals":
>
> {
>   id: <int>,
>   intervals: [
>     {
>       sector: <string>,
>       entry: <int>,
>       exit: <int>
>     }, ...
>   ]
> }
>
> (exit>=entry is guaranteed)
>
> Given a new Root record, I must check for intersections with the incoming
> data, which means, finding any document in the index having a sector whose
> time interval [entry, exit] overlaps with a corresponding sector of the
> incoming data.
>
> Currently, given a Root r, I am converting the records to documents as
> follows:
>
>     public Document doc(Root r) {
>         var doc = new Document();
>         doc.add(new IntPoint("id", r.id));
>
>         r.intervals.forEach(i -> {
>             doc.add(new IntPoint(i.sector + ".entry", i.entry));
>             doc.add(new IntPoint(i.sector + ".exit", i.exit));
>         });
>         return doc;
>     }
>
> And for a given Root n, the intersection query becomes:
>
>     Query intersection(Root n) {
>         var q = new BooleanQuery.Builder();
>         // exclude same id
>         q.add(new BooleanClause(IntPoint.newExactQuery("id", n.id),
>                 Occur.MUST_NOT));
>         // find overlapping sectors
>         n.intervals.forEach(i -> {
>             var sub = new BooleanQuery.Builder();
>             // other docs must start before this exits
>             sub.add(new BooleanClause(
>                     IntPoint.newRangeQuery(i.sector + ".entry", 0, i.exit),
>                     Occur.FILTER));
>             // other docs must end after this starts
>             sub.add(new BooleanClause(
>                     IntPoint.newRangeQuery(i.sector + ".exit", i.entry, Integer.MAX_VALUE),
>                     Occur.FILTER));
>
>             q.add(new BooleanClause(sub.build(), Occur.SHOULD));
>         });
>
>         return q.build();
>     }
>
> The problem with this approach is that the index will have twice as many
> fields as there are distinct sectors. Currently, the number of distinct
> sectors is small (< 500), so I think this strategy is OK, but I don't like
> the idea of having "dynamic fields".
>
> Given the intersection query requirement, is there a better way to model
> the index, aside from creating multiple documents per Root entry?
>
> Regards
>


-- 
Sincerely yours
Mikhail Khludnev
