That makes sense to me too in the abstract. At Amazon we also have
interesting BDV fields we have to decode on the fly, so this looks
attractive for that reason (not just faceting).

That said, it would be easier to evaluate fitness for purpose
(faceting) if we had some examples of BinaryDocValues used for
faceting (or otherwise decoded on the fly) in the Lucene code base.
Do we have any? I'd be concerned if we couldn't fully test the new
functionality to see what impact any changes might have.
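
To make the "decode on the fly" case concrete, here is a rough sketch of
the kind of thing I have in mind. It is not the proposed API and it does
not use any Lucene classes: PositionalReader, OffHeapRef, sumVInts and
sampleRef are all hypothetical names, and a direct ByteBuffer stands in
for whatever off-heap storage backs the real input. The point is only
that a reader plus an offset and a length is enough to decode a value
without first copying it into a byte[].

```java
import java.nio.ByteBuffer;

final class OffHeapRefSketch {

    /** Hypothetical stand-in for a positional reader; this is not Lucene's RandomAccessInput. */
    interface PositionalReader {
        byte readByte(long pos);
    }

    /** Hypothetical off-heap analogue of BytesRef: reader + offset + length, no byte[]. */
    static final class OffHeapRef {
        final PositionalReader in;
        final long offset;
        final int length;

        OffHeapRef(PositionalReader in, long offset, int length) {
            this.in = in;
            this.offset = offset;
            this.length = length;
        }

        /** Reads one byte relative to the start of this ref, with bounds checking. */
        byte byteAt(int index) {
            if (index < 0 || index >= length) {
                throw new IndexOutOfBoundsException("index " + index + ", length " + length);
            }
            return in.readByte(offset + index);
        }
    }

    /** Decodes 7-bit variable-length ints straight from the ref and sums them. */
    static long sumVInts(OffHeapRef ref) {
        long sum = 0;
        int i = 0;
        while (i < ref.length) {
            int value = 0;
            int shift = 0;
            byte b;
            do {
                b = ref.byteAt(i++);
                value |= (b & 0x7F) << shift;
                shift += 7;
            } while ((b & 0x80) != 0); // high bit set means another byte follows
            sum += value;
        }
        return sum;
    }

    /** Builds a direct (off-heap) buffer holding one padding byte, then the vInts 1, 300, 5. */
    static OffHeapRef sampleRef() {
        byte[] encoded = {0x7F, 0x01, (byte) 0xAC, 0x02, 0x05};
        ByteBuffer direct = ByteBuffer.allocateDirect(encoded.length);
        direct.put(encoded);
        // Absolute get(int) ignores the buffer's position, so no flip is needed.
        PositionalReader reader = pos -> direct.get((int) pos);
        // Skip the padding byte to show that the offset is honoured.
        return new OffHeapRef(reader, 1, encoded.length - 1);
    }

    public static void main(String[] args) {
        System.out.println(sumVInts(sampleRef()));
    }
}
```

A test or benchmark for the real change would presumably compare a
decoder like this against one that copies the bytes on-heap first, which
is why having an in-tree example to measure against matters.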

On Thu, Dec 5, 2024 at 6:45 AM Chris Hegarty
<christopher.hega...@elastic.co.invalid> wrote:
>
> Hi Ignacio,
>
> I completely agree with the idea of having a BytesRef-like thing that can be 
> off-heap. For a while now I’ve been thinking about how we could evolve 
> BytesRef so as to not expose its on-heap representation. Having a separate 
> primitive is probably a better way to go.
>
> -Chris.
>
> > On 5 Dec 2024, at 10:42, Ignacio Vera <iver...@gmail.com> wrote:
> >
> > Hello,
> >
> > I have been working with the idea of reading binary doc values
> > off-heap for a while. The idea behind it is that binary doc values are
> > often used for faceting where structured data is encoded at write time
> > and decoded at read time. It feels wasteful to have to read the data
> > on-heap before decoding it when we can read the data directly from the
> > off-heap buffer.
> >
> > The current proposal is to evolve the current API from an on-heap data
> > structure (BytesRef) to an off-heap data structure (currently named
> > RandomAccessInputRef). Because we are currently reading the data into
> > the buffer using a RandomAccessInput with an offset and a length, it
> > feels very natural to create an off-heap equivalent to BytesRef that
> > is backed by a RandomAccessInput.
> >
> > I am hoping to move this idea forward. Since this is a change to a
> > public API, I would love to hear feedback and other opinions.
> >
> > Thank you!
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
> > For additional commands, e-mail: dev-h...@lucene.apache.org
> >
>
