We have a functional gRPC extension for brokers internally. Let me see if I
can get approval for releasing it.

For the explicit answers:

1) Guava 16

Yep, Druid is stuck on it due to Hadoop.
https://github.com/apache/incubator-druid/pull/5413 is the only outstanding
issue I know of that would allow a much wider range of Guava versions to be
used. Once a solution for the same-thread executor service is in place, you
should be able to point your local deployment at whatever Guava version fits
your indexing config.
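In the meantime, if the extension really needs a newer Guava, one workaround worth sketching (untested against your setup; the plugin version and the relocation prefix below are illustrative) is to shade and relocate Guava inside the extension jar so it never collides with the Guava 16.0.1 already on Druid's classpath:

```xml
<!-- Sketch: maven-shade-plugin relocation so the extension can bundle its
     own newer Guava. The version and relocation prefix are illustrative. -->
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <version>3.2.1</version>
  <executions>
    <execution>
      <phase>package</phase>
      <goals><goal>shade</goal></goals>
      <configuration>
        <relocations>
          <relocation>
            <pattern>com.google.common</pattern>
            <shadedPattern>myextension.shaded.com.google.common</shadedPattern>
          </relocation>
        </relocations>
      </configuration>
    </execution>
  </executions>
</plugin>
```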

2) Group By thread processing

You picked the hardest one here :) There are all kinds of multi-threaded
fun that can show up when dealing with group-by queries. If you want a good
dive into this, I suggest checking out
https://github.com/apache/incubator-druid/pull/6629, which will put you
straight into the weeds of it all.
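On the last part of your question: yes, keep the blocking query call off the transport's IO thread. A minimal sketch of that pattern follows — `runQueryBlocking` is a hypothetical stand-in for a wrapper around QueryLifecycleFactory::runSimple, and the pool size is illustrative, not a Druid recommendation:

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Sketch: hand the blocking query work to a dedicated pool so the
// IO thread that received the request (e.g. a gRPC event-loop thread)
// returns immediately.
public class OffloadedQueryRunner {
    private final ExecutorService queryPool =
        Executors.newFixedThreadPool(4, r -> {
            Thread t = new Thread(r, "grpc-query-worker");
            t.setDaemon(true);
            return t;
        });

    // Hypothetical placeholder for the real Druid call; blocks until
    // results are fully materialized.
    String runQueryBlocking(String query) {
        return "results-for:" + query;
    }

    // Returns at once; the caller's thread is never blocked on the query.
    public CompletableFuture<String> submit(String query) {
        return CompletableFuture.supplyAsync(() -> runQueryBlocking(query), queryPool);
    }

    public static void main(String[] args) throws Exception {
        OffloadedQueryRunner runner = new OffloadedQueryRunner();
        String out = runner.submit("groupBy-q1").get(5, TimeUnit.SECONDS);
        System.out.println(out);  // results-for:groupBy-q1
    }
}
```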

3) Yielder / Sequence type safety

Yeah... I don't have any good info there other than "things aren't
currently broken". There are some really nasty, hacky type casts related
to by-segment sequences if you start digging around the code.
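That said, once you pin down the element type, draining a Yielder is a small loop. The interface below is a simplified stand-in I wrote to mirror the shape of Druid's Yielder (it is not the real org.apache.druid class), just to show the get/next/isDone/close pattern an extension would run:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Simplified stand-in mirroring the shape of Druid's Yielder contract:
// a non-done yielder holds a current value; next() advances; the final
// yielder reports isDone(); close() must always be called.
interface SimpleYielder<T> extends AutoCloseable {
    T get();
    SimpleYielder<T> next();
    boolean isDone();
    @Override void close();
}

public class YielderDemo {
    // Wrap an iterator in a yielder, roughly the way a Sequence gets
    // turned into a Yielder.
    public static <T> SimpleYielder<T> fromIterator(Iterator<T> it) {
        final boolean done = !it.hasNext();
        final T value = done ? null : it.next();
        return new SimpleYielder<T>() {
            public T get() { return value; }
            public SimpleYielder<T> next() { return fromIterator(it); }
            public boolean isDone() { return done; }
            public void close() { /* release resources here */ }
        };
    }

    // The drain-and-close loop a caller would run over the results.
    public static <T> List<T> drain(SimpleYielder<T> yielder) {
        List<T> out = new ArrayList<>();
        try {
            while (!yielder.isDone()) {
                out.add(yielder.get());
                yielder = yielder.next();
            }
        } finally {
            yielder.close();  // always close, even if iteration fails
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> rows =
            drain(fromIterator(Arrays.asList("r1", "r2", "r3").iterator()));
        System.out.println(rows);  // [r1, r2, r3]
    }
}
```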

4) Calcite Proto

This is a great question. I imagine getting a Calcite protobuf SQL endpoint
set up in an extension wouldn't be too hard, but I have not tried such a
thing. This one would probably be worth its own discussion thread
(maybe an issue?) on how to handle.

You are on the right track!
Charles Allen

On Sat, Dec 29, 2018 at 11:59 PM Nikita Dolgov <java.saas.had...@gmail.com>
wrote:

> I was experimenting with a Druid extension prototype and encountered some
> difficulties. The experiment is to build something like
> https://github.com/apache/incubator-druid/issues/3891 with gRPC.
>
> (1) Guava version
>
> Druid relies on 16.0.1, which is a very old version (~4 years). My only
> guess is that another transitive dependency (Hadoop?) requires it. The
> earliest version used by gRPC, from three years ago, was 19.0. So my first
> question is whether there are any plans to upgrade Guava any time soon.
>
> (2) Druid thread model for query execution
>
> I played a little with calling
> org.apache.druid.server.QueryLifecycleFactory::runSimple under a debugger.
> The stack trace was rather deep to reverse engineer easily, so I'd like to
> ask directly instead. Would it be possible to briefly explain how many
> threads (and from which thread pools) it takes on a broker node to process,
> say, a GroupBy query?
>
> At the very least I'd like to know if calling
> QueryLifecycleFactory::runSimple on a thread from some "query processing
> pool" is better than doing it on the IO thread that received the query.
>
> (3) Yielder
>
> Is it safe to assume that QueryLifecycleFactory::runSimple always returns
> a Yielder<org.apache.druid.data.input.Row> ? QueryLifecycle omits generic
> types rather liberally when dealing with Sequence instances.
>
> (4) Calcite integration
>
> Presumably Avatica has an option of using protobuf encoding for the
> returned results. Is it true that Druid cannot use it?
> On a related note, any chance there was something written down about
> org.apache.druid.sql.calcite?
>
> Thank you
>
>
