What worries me about JEP 370 is that, if historical Java user migration patterns hold up, we will need to support Java 11 for a while (probably another 2–3 years), so we would have to wait that long before we could use it. That seems like a long time, and until then we would be stuck with a pretty inferior API.
I also would prefer not having to rewrite code a bunch of times, but that's why I suggested starting by using Memory for the VectorAggregator interface and stuff that interacts with it. There isn't that much code there yet (only a few aggregators implement VectorAggregator). So we will need to write most of it for the first time, and since it is fresh code, I think it'd be nice to use the best API currently available in Java 8 / 11. From what I see that is Memory.

On Wed, Feb 5, 2020 at 9:21 AM Slim Bouguerra <bs...@apache.org> wrote:

> Hi Gian,
> Thanks for bringing this up.
> IMO for the long run and looking at how much code will have to change, it
> makes more sense to rely on JDK based API JEP 370 and have this work done
> ONCE as oppose to multiple iteration. FYI i do not think it is far away,
> seems like there is a good momentum around it.
> This does not exclude or means we should not use Memory API for other stuff
> like sketches et al, in fact i think even for project like Sketches it
> makes more sense to move to newer API offered by the JDK rather that do it
> your self.
>
>
> On Tue, Feb 4, 2020 at 10:12 PM Gian Merlino <g...@apache.org> wrote:
>
> > Hey Druids,
> >
> > There has generally been a lot of talk about moving away from ByteBuffer
> > and towards the DataSketches Memory package
> > (https://datasketches.apache.org/docs/Memory/MemoryPackage.html) or even
> > using Unsafe directly. Much of that discussion happened on
> > https://github.com/apache/druid/issues/3892.
> >
> > Recently a patch was merged that added datasketches-memory as a dependency
> > of druid-processing: https://github.com/apache/druid/pull/9308. The reason
> > was partially due to better performance and partially due to nicer API
> > (both reasons mentioned in #3892 as well).
> >
> > JEP 370 is a potential long term solution but it seems a while away from
> > being ready: https://openjdk.java.net/jeps/370
> >
> > I wanted to bring the larger discussion back up and see what people think
> > is a good path forward.
> >
> > My suggestion is that we migrate the VectorAggregator interface to use
> > Memory, but keep BufferAggregator the way it is. That way, as we build out
> > support for vectorization (right now, only timeseries/groupby support it,
> > and only a few aggregators, but we should be building this out) we'll be
> > doing it with a nicer and potentially faster API. But we won't need to go
> > back and redo a bunch of old code, since we'll keep the non-vectorized code
> > paths the way they are. (And hopefully, one day, delete them all outright.)
> >
> > Gian