Re: Druid + Presto?

2020-07-10 Thread Parth Brahmbhatt
ators, but also requires
>> a shuffle join, and then translate and execute it as an equivalent Presto
>> SQL query. The idea being you can express your query in either dialect and
>> get routed to the right place in the end.
>>
>> On Thu, Jul 9, 2020 at 4:36 PM Samarth Jain  wrote:
>>
>>> Gian,
>>>
>>> For the presto-sql version of Druid connector, for V1, we decided to
>>> pursue
>>> the JDBC route. You can follow along on the progress here -
>>> https://github.com/prestosql/presto/issues/1855
>>> My colleague, Parth (cc'ed as well) is working on implementing Druid
>>> aggregation push down including support for top-n style queries. Our
>>> immediate use cases, and what we think Druid
>>> generally is more suitable for, is for solving for aggregate group by
>>> style
>>> queries. Having a presto-druid connector also enables us to join data in
>>> Druid with the rest of our warehouse.
>>> In general though, for queries that don't do any aggregations i.e. which
>>> get translated to Druid SCAN queries, it makes sense to by-pass the Druid
>>> datanodes altogether and directly go
>>> to the deep storage. I think Druid provides enough metadata about the
>>> active segment files to be able to do that relatively easily.
>>>
>>> You bring up an interesting idea on the reverse connector. What do you
>>> think the value of such a connector will be? I am assuming Druid SQL for
>>> the most part is ANSI SQL.
>>>
>>> On Thu, Jul 9, 2020 at 12:56 PM Zhenxiao Luo 
>>> wrote:
>>>
>>> > Thank you, Mainak.
>>> >
>>> > Hi Gian,
>>> >
>>> > Glad to see you are interested in Presto Druid connector.
>>> >
>>> > My colleague, @Hao Luo  @Beinan Wang
>>> >  and
>>> > me, together, implemented the Presto Druid connector in PrestoDB:
>>> > https://prestodb.io/docs/current/connector/druid.html
>>> >
>>> > Our implementation includes:
>>> > 1. Presto could scan Druid segments to compute SQL results
>>> > 2. aggregation pushdown, where Presto leverages Druid fast aggregation
>>> > capabilities, and stream aggregated result from Druid
>>> > actually, we implemented 2 execution paths, users could use
>>> configurations
>>> > to control whether they'd like to scan segments or pushdown all
>>> sub-queries
>>> > to Druid
>>> >
>>> > We had run benchmarkings comparing Presto Druid connector with other
>>> SQL
>>> > engines. And are ready to run production workloads.
>>> >
>>> > Thanks,
>>> > Zhenxiao
>>> >
>>> > On Thu, Jul 9, 2020 at 12:40 PM Mainak Ghosh 
>>> wrote:
>>> >
>>> > > Hello Gian,
>>> > >
>>> > > We are currently testing the (other) Presto Druid connector at our
>>> end.
>>> > It
>>> > > has aggregation push down support. Adding Zhenxiao to this thread
>>> since
>>> > he
>>> > > is the primary developer of the connector. He can provide the kind of
>>> > > details you are looking for.
>>> > >
>>> > > Thanks,
>>> > > Mainak
>>> > >
>>> > > > On Jul 9, 2020, at 12:25 PM, Gian Merlino  wrote:
>>> > > >
>>> > > > By the way, I see that the other Presto has a Druid connector too:
>>> > > > https://prestodb.io/docs/current/connector/druid.html. From the
>>> docs
>>> > it
>>> > > > looks like it has different lineage and might even work
>>> differently.
>>> > > >
>>> > > > On Thu, Jul 9, 2020 at 12:22 PM Gian Merlino 
>>> wrote:
>>> > > >
>>> > > >> I was thinking of exploring ideas like pushing down aggregations,
>>> > > enabling
>>> > > >> Presto to query directly from deep storage (in cases where there
>>> > aren't
>>> > > any
>>> > > >> interesting things to push down, this may be more efficient than
>>> > > querying
>>> > > >> Druid servers), enabling translation from Druid's SQL dialect to
>>> > > Presto's
>>> > > >> SQL dialect (a "reverse connector"), etc. Do you (or anyone else
>>> on
>>> > this
>>> > > >> list) have a

Re: Druid + Presto?

2020-07-10 Thread Mainak Ghosh
> has aggregation push down support. Adding Zhenxiao to this thread since
> > he
> > > is the primary developer of the connector. He can provide the kind of
> > > details you are looking for.
> > >
> > > Thanks,
> > > Mainak
> > >
> > > > > On Jul 9, 2020, at 12:25 PM, Gian Merlino <g...@apache.org> wrote:
> > > >
> > > > By the way, I see that the other Presto has a Druid connector too:
> > > > > https://prestodb.io/docs/current/connector/druid.html. From the docs it
> > > > > looks like it has different lineage and might even work differently.
> > > >
> > > > > On Thu, Jul 9, 2020 at 12:22 PM Gian Merlino <g...@apache.org> wrote:
> > > >
> > > >> I was thinking of exploring ideas like pushing down aggregations,
> > > enabling
> > > >> Presto to query directly from deep storage (in cases where there
> > aren't
> > > any
> > > >> interesting things to push down, this may be more efficient than
> > > querying
> > > >> Druid servers), enabling translation from Druid's SQL dialect to
> > > Presto's
> > > >> SQL dialect (a "reverse connector"), etc. Do you (or anyone else on
> > this
> > > >> list) have any thoughts on any of those?
> > > >>
> > > >> I'm also curious what kinds of improvements you're planning to the
> > > >> connector you built.
> > > >>
> > > > >> On Thu, Jul 9, 2020 at 10:18 AM Samarth Jain <samarth.j...@gmail.com>
> > > >> wrote:
> > > >>
> > > >>> Hi Gian,
> > > >>>
> > > >>> I contributed the jdbc based presto-druid connector in prestosql
> > which
> > > >>> went
> > > >>> out in release 337
> > > > >>> https://prestosql.io/docs/current/release/release-337.html. The v1
> > > >>> version
> > > >>> of the connector doesn’t support aggregate push down yet. It is being
> > > >>> actively worked on and we expect it to be improved over the next few
> > > >>> releases. We are currently evaluating using the presto-druid
> > connector
> > > in
> > > >>> our Tableau setup. It would be interesting to see what changes in
> > Druid
> > > >>> would be needed to support that integration.
> > > >>>
> > > >>> Thanks,
> > > >>> Samarth
> > > >>>
> > > > >>> On Thu, Jul 9, 2020 at 10:07 AM Gian Merlino <g...@apache.org>
> > > > >>> wrote:
> > > >>>
> > > >>>> Hey Druids,
> > > >>>>
> > > >>>> I was wondering, is anyone on this list using Druid + Presto
> > together?
> > > >>> If
> > > >>>> so, what does your architecture look like and which edition / flavor
> > > of
> > > >>>> Presto and Druid connector are you using? What's your experience
> > been
> > > >>> like?
> > > >>>> I'm asking since I'm starting to think about whether it makes sense
> > to
> > > >>> look
> > > >>>> at ways to improve the integration between the two projects.
> > > >>>>
> > > >>>> Gian
> > > >>>>
> > > >>>
> > > >>
> > >
> > >
> >



Re: Druid + Presto?

2020-07-10 Thread Mainak Ghosh
+ Zhenxiao

> On Jul 9, 2020, at 11:03 PM, Gian Merlino  wrote:
> 
> Hey Zhenxiao, Hao, Beinan, Mainak,
> 
> Thanks for sharing information about your work.
> 
> You mention benchmarks — I'm curious, did you have a chance to benchmark each 
> execution path? How do they look?
> 
> When you were developing the connector, did you feel like any changes in 
> Druid would make it easier to integrate things between the two projects?
> 
> On Thu, Jul 9, 2020 at 12:56 PM Zhenxiao Luo <z...@twitter.com.invalid> wrote:
> Thank you, Mainak.
> 
> Hi Gian,
> 
> Glad to see you are interested in Presto Druid connector.
> 
> My colleague, @Hao Luo <h...@twitter.com> @Beinan Wang
> <bein...@twitter.com> and
> me, together, implemented the Presto Druid connector in PrestoDB:
> https://prestodb.io/docs/current/connector/druid.html
> 
> Our implementation includes:
> 1. Presto could scan Druid segments to compute SQL results
> 2. aggregation pushdown, where Presto leverages Druid fast aggregation
> capabilities, and stream aggregated result from Druid
> actually, we implemented 2 execution paths, users could use configurations
> to control whether they'd like to scan segments or pushdown all sub-queries
> to Druid
> 
> We had run benchmarkings comparing Presto Druid connector with other SQL
> engines. And are ready to run production workloads.
> 
> Thanks,
> Zhenxiao
> 
> On Thu, Jul 9, 2020 at 12:40 PM Mainak Ghosh <mgh...@twitter.com> wrote:
> 
> > Hello Gian,
> >
> > We are currently testing the (other) Presto Druid connector at our end. It
> > has aggregation push down support. Adding Zhenxiao to this thread since he
> > is the primary developer of the connector. He can provide the kind of
> > details you are looking for.
> >
> > Thanks,
> > Mainak
> >
> > > On Jul 9, 2020, at 12:25 PM, Gian Merlino <g...@apache.org> wrote:
> > >
> > > By the way, I see that the other Presto has a Druid connector too:
> > > > https://prestodb.io/docs/current/connector/druid.html. From the docs it
> > > looks like it has different lineage and might even work differently.
> > >
> > > On Thu, Jul 9, 2020 at 12:22 PM Gian Merlino <g...@apache.org> wrote:
> > >
> > >> I was thinking of exploring ideas like pushing down aggregations,
> > enabling
> > >> Presto to query directly from deep storage (in cases where there aren't
> > any
> > >> interesting things to push down, this may be more efficient than
> > querying
> > >> Druid servers), enabling translation from Druid's SQL dialect to
> > Presto's
> > >> SQL dialect (a "reverse connector"), etc. Do you (or anyone else on this
> > >> list) have any thoughts on any of those?
> > >>
> > >> I'm also curious what kinds of improvements you're planning to the
> > >> connector you built.
> > >>
> > >> On Thu, Jul 9, 2020 at 10:18 AM Samarth Jain <samarth.j...@gmail.com>
> > >> wrote:
> > >>
> > >>> Hi Gian,
> > >>>
> > >>> I contributed the jdbc based presto-druid connector in prestosql which
> > >>> went
> > >>> out in release 337
> > >>> https://prestosql.io/docs/current/release/release-337.html. The v1
> > >>> version
> > >>> of the connector doesn’t support aggregate push down yet. It is being
> > >>> actively worked on and we expect it to be improved over the next few
> > >>> releases. We are currently evaluating using the presto-druid connector
> > in
> > >>> our Tableau setup. It would be interesting to see what changes in Druid
> > >>> would be needed to support that integration.
> > >>>
> > >>> Thanks,
> > >>> Samarth
> > >>>
> > >>> On Thu, Jul 9, 2020 at 10:07 AM Gian Merlino <g...@apache.org> wrote:
> > >>>
> > >>>> Hey Druids,
> > >>>>
> > >>>> I was wondering, is anyone on this list using Druid + Presto together?
> > >>> If
> > >>>> so, what does your architecture look like and which edition / flavor
> > of
> > >>>> Presto and Druid connector are you using? What's your experience been
> > >>> like?
> > >>>> I'm asking since I'm starting to think about whether it makes sense to
> > >>> look
> > >>>> at ways to improve the integration between the two projects.
> > >>>>
> > >>>> Gian
> > >>>>
> > >>>
> > >>
> >
> >



Re: Druid + Presto?

2020-07-10 Thread Gian Merlino
> details you are looking for.
>> > >
>> > > Thanks,
>> > > Mainak
>> > >
>> > > > On Jul 9, 2020, at 12:25 PM, Gian Merlino  wrote:
>> > > >
>> > > > By the way, I see that the other Presto has a Druid connector too:
>> > > > https://prestodb.io/docs/current/connector/druid.html. From the
>> docs
>> > it
>> > > > looks like it has different lineage and might even work differently.
>> > > >
>> > > > On Thu, Jul 9, 2020 at 12:22 PM Gian Merlino 
>> wrote:
>> > > >
>> > > >> I was thinking of exploring ideas like pushing down aggregations,
>> > > enabling
>> > > >> Presto to query directly from deep storage (in cases where there
>> > aren't
>> > > any
>> > > >> interesting things to push down, this may be more efficient than
>> > > querying
>> > > >> Druid servers), enabling translation from Druid's SQL dialect to
>> > > Presto's
>> > > >> SQL dialect (a "reverse connector"), etc. Do you (or anyone else on
>> > this
>> > > >> list) have any thoughts on any of those?
>> > > >>
>> > > >> I'm also curious what kinds of improvements you're planning to the
>> > > >> connector you built.
>> > > >>
>> > > >> On Thu, Jul 9, 2020 at 10:18 AM Samarth Jain <
>> samarth.j...@gmail.com>
>> > > >> wrote:
>> > > >>
>> > > >>> Hi Gian,
>> > > >>>
>> > > >>> I contributed the jdbc based presto-druid connector in prestosql
>> > which
>> > > >>> went
>> > > >>> out in release 337
>> > > >>> https://prestosql.io/docs/current/release/release-337.html. The
>> v1
>> > > >>> version
>> > > >>> of the connector doesn’t support aggregate push down yet. It is
>> being
>> > > >>> actively worked on and we expect it to be improved over the next
>> few
>> > > >>> releases. We are currently evaluating using the presto-druid
>> > connector
>> > > in
>> > > >>> our Tableau setup. It would be interesting to see what changes in
>> > Druid
>> > > >>> would be needed to support that integration.
>> > > >>>
>> > > >>> Thanks,
>> > > >>> Samarth
>> > > >>>
>> > > >>> On Thu, Jul 9, 2020 at 10:07 AM Gian Merlino 
>> > wrote:
>> > > >>>
>> > > >>>> Hey Druids,
>> > > >>>>
>> > > >>>> I was wondering, is anyone on this list using Druid + Presto
>> > together?
>> > > >>> If
>> > > >>>> so, what does your architecture look like and which edition /
>> flavor
>> > > of
>> > > >>>> Presto and Druid connector are you using? What's your experience
>> > been
>> > > >>> like?
>> > > >>>> I'm asking since I'm starting to think about whether it makes
>> sense
>> > to
>> > > >>> look
>> > > >>>> at ways to improve the integration between the two projects.
>> > > >>>>
>> > > >>>> Gian
>> > > >>>>
>> > > >>>
>> > > >>
>> > >
>> > >
>> >
>>
>


Re: Druid + Presto?

2020-07-10 Thread Gian Merlino
ervers), enabling translation from Druid's SQL dialect to
> > > Presto's
> > > >> SQL dialect (a "reverse connector"), etc. Do you (or anyone else on
> > this
> > > >> list) have any thoughts on any of those?
> > > >>
> > > >> I'm also curious what kinds of improvements you're planning to the
> > > >> connector you built.
> > > >>
> > > >> On Thu, Jul 9, 2020 at 10:18 AM Samarth Jain <
> samarth.j...@gmail.com>
> > > >> wrote:
> > > >>
> > > >>> Hi Gian,
> > > >>>
> > > >>> I contributed the jdbc based presto-druid connector in prestosql
> > which
> > > >>> went
> > > >>> out in release 337
> > > >>> https://prestosql.io/docs/current/release/release-337.html. The v1
> > > >>> version
> > > >>> of the connector doesn’t support aggregate push down yet. It is
> being
> > > >>> actively worked on and we expect it to be improved over the next
> few
> > > >>> releases. We are currently evaluating using the presto-druid
> > connector
> > > in
> > > >>> our Tableau setup. It would be interesting to see what changes in
> > Druid
> > > >>> would be needed to support that integration.
> > > >>>
> > > >>> Thanks,
> > > >>> Samarth
> > > >>>
> > > >>> On Thu, Jul 9, 2020 at 10:07 AM Gian Merlino 
> > wrote:
> > > >>>
> > > >>>> Hey Druids,
> > > >>>>
> > > >>>> I was wondering, is anyone on this list using Druid + Presto
> > together?
> > > >>> If
> > > >>>> so, what does your architecture look like and which edition /
> flavor
> > > of
> > > >>>> Presto and Druid connector are you using? What's your experience
> > been
> > > >>> like?
> > > >>>> I'm asking since I'm starting to think about whether it makes
> sense
> > to
> > > >>> look
> > > >>>> at ways to improve the integration between the two projects.
> > > >>>>
> > > >>>> Gian
> > > >>>>
> > > >>>
> > > >>
> > >
> > >
> >
>


Re: Druid + Presto?

2020-07-10 Thread Gian Merlino
Hey Zhenxiao, Hao, Beinan, Mainak,

Thanks for sharing information about your work.

You mention benchmarks — I'm curious, did you have a chance to benchmark
each execution path? How do they look?

When you were developing the connector, did you feel like any changes in
Druid would make it easier to integrate things between the two projects?

On Thu, Jul 9, 2020 at 12:56 PM Zhenxiao Luo 
wrote:

> Thank you, Mainak.
>
> Hi Gian,
>
> Glad to see you are interested in Presto Druid connector.
>
> My colleague, @Hao Luo  @Beinan Wang
>  and
> me, together, implemented the Presto Druid connector in PrestoDB:
> https://prestodb.io/docs/current/connector/druid.html
>
> Our implementation includes:
> 1. Presto could scan Druid segments to compute SQL results
> 2. aggregation pushdown, where Presto leverages Druid fast aggregation
> capabilities, and stream aggregated result from Druid
> actually, we implemented 2 execution paths, users could use configurations
> to control whether they'd like to scan segments or pushdown all sub-queries
> to Druid
>
> We had run benchmarkings comparing Presto Druid connector with other SQL
> engines. And are ready to run production workloads.
>
> Thanks,
> Zhenxiao
>
> On Thu, Jul 9, 2020 at 12:40 PM Mainak Ghosh  wrote:
>
> > Hello Gian,
> >
> > We are currently testing the (other) Presto Druid connector at our end.
> It
> > has aggregation push down support. Adding Zhenxiao to this thread since
> he
> > is the primary developer of the connector. He can provide the kind of
> > details you are looking for.
> >
> > Thanks,
> > Mainak
> >
> > > On Jul 9, 2020, at 12:25 PM, Gian Merlino  wrote:
> > >
> > > By the way, I see that the other Presto has a Druid connector too:
> > > https://prestodb.io/docs/current/connector/druid.html. From the docs
> it
> > > looks like it has different lineage and might even work differently.
> > >
> > > On Thu, Jul 9, 2020 at 12:22 PM Gian Merlino  wrote:
> > >
> > >> I was thinking of exploring ideas like pushing down aggregations,
> > enabling
> > >> Presto to query directly from deep storage (in cases where there
> aren't
> > any
> > >> interesting things to push down, this may be more efficient than
> > querying
> > >> Druid servers), enabling translation from Druid's SQL dialect to
> > Presto's
> > >> SQL dialect (a "reverse connector"), etc. Do you (or anyone else on
> this
> > >> list) have any thoughts on any of those?
> > >>
> > >> I'm also curious what kinds of improvements you're planning to the
> > >> connector you built.
> > >>
> > >> On Thu, Jul 9, 2020 at 10:18 AM Samarth Jain 
> > >> wrote:
> > >>
> > >>> Hi Gian,
> > >>>
> > >>> I contributed the jdbc based presto-druid connector in prestosql
> which
> > >>> went
> > >>> out in release 337
> > >>> https://prestosql.io/docs/current/release/release-337.html. The v1
> > >>> version
> > >>> of the connector doesn’t support aggregate push down yet. It is being
> > >>> actively worked on and we expect it to be improved over the next few
> > >>> releases. We are currently evaluating using the presto-druid
> connector
> > in
> > >>> our Tableau setup. It would be interesting to see what changes in
> Druid
> > >>> would be needed to support that integration.
> > >>>
> > >>> Thanks,
> > >>> Samarth
> > >>>
> > >>> On Thu, Jul 9, 2020 at 10:07 AM Gian Merlino 
> wrote:
> > >>>
> > >>>> Hey Druids,
> > >>>>
> > >>>> I was wondering, is anyone on this list using Druid + Presto
> together?
> > >>> If
> > >>>> so, what does your architecture look like and which edition / flavor
> > of
> > >>>> Presto and Druid connector are you using? What's your experience
> been
> > >>> like?
> > >>>> I'm asking since I'm starting to think about whether it makes sense
> to
> > >>> look
> > >>>> at ways to improve the integration between the two projects.
> > >>>>
> > >>>> Gian
> > >>>>
> > >>>
> > >>
> >
> >
>


Re: Druid + Presto?

2020-07-09 Thread Samarth Jain
Gian,

For the presto-sql version of the Druid connector, we decided to pursue the
JDBC route for V1. You can follow the progress here:
https://github.com/prestosql/presto/issues/1855
My colleague Parth (cc'ed as well) is working on implementing Druid
aggregation push down, including support for top-n style queries. Our
immediate use cases, and what we think Druid is generally more suitable
for, are aggregate group-by style queries. Having a presto-druid connector
also enables us to join data in Druid with the rest of our warehouse.
In general, though, for queries that don't do any aggregations, i.e. those
that get translated to Druid SCAN queries, it makes sense to bypass the
Druid data nodes altogether and go directly to deep storage. I think Druid
provides enough metadata about the active segment files to be able to do
that relatively easily.

You bring up an interesting idea on the reverse connector. What do you
think the value of such a connector will be? I am assuming Druid SQL for
the most part is ANSI SQL.
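
As a rough illustration of the routing idea above (aggregate queries pushed
down to Druid, plain SCAN-style queries served from deep storage), here is a
minimal sketch. It is not the connector's code: the `choose_path` helper, the
path names, and the keyword-based aggregation check are all assumptions made
for the example. A real planner would inspect the query plan rather than the
SQL text.

```python
import re

# Hypothetical execution paths; the names are illustrative only.
PUSHDOWN = "druid-aggregation-pushdown"   # let Druid compute the aggregate
DEEP_STORAGE = "deep-storage-scan"        # read segment files directly

# Naive aggregation detector (assumption: a text match is enough for a demo).
_AGG_PATTERN = re.compile(
    r"\b(count|sum|min|max|avg)\s*\(|\bgroup\s+by\b", re.IGNORECASE
)

def choose_path(sql: str) -> str:
    """Route aggregate/group-by queries to Druid, everything else to deep storage."""
    return PUSHDOWN if _AGG_PATTERN.search(sql) else DEEP_STORAGE
```

With this sketch, a `GROUP BY` query is routed to the pushdown path and a
simple filtered projection is routed to the deep storage scan.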

On Thu, Jul 9, 2020 at 12:56 PM Zhenxiao Luo 
wrote:

> Thank you, Mainak.
>
> Hi Gian,
>
> Glad to see you are interested in Presto Druid connector.
>
> My colleague, @Hao Luo  @Beinan Wang
>  and
> me, together, implemented the Presto Druid connector in PrestoDB:
> https://prestodb.io/docs/current/connector/druid.html
>
> Our implementation includes:
> 1. Presto could scan Druid segments to compute SQL results
> 2. aggregation pushdown, where Presto leverages Druid fast aggregation
> capabilities, and stream aggregated result from Druid
> actually, we implemented 2 execution paths, users could use configurations
> to control whether they'd like to scan segments or pushdown all sub-queries
> to Druid
>
> We had run benchmarkings comparing Presto Druid connector with other SQL
> engines. And are ready to run production workloads.
>
> Thanks,
> Zhenxiao
>
> On Thu, Jul 9, 2020 at 12:40 PM Mainak Ghosh  wrote:
>
> > Hello Gian,
> >
> > We are currently testing the (other) Presto Druid connector at our end.
> It
> > has aggregation push down support. Adding Zhenxiao to this thread since
> he
> > is the primary developer of the connector. He can provide the kind of
> > details you are looking for.
> >
> > Thanks,
> > Mainak
> >
> > > On Jul 9, 2020, at 12:25 PM, Gian Merlino  wrote:
> > >
> > > By the way, I see that the other Presto has a Druid connector too:
> > > https://prestodb.io/docs/current/connector/druid.html. From the docs
> it
> > > looks like it has different lineage and might even work differently.
> > >
> > > On Thu, Jul 9, 2020 at 12:22 PM Gian Merlino  wrote:
> > >
> > >> I was thinking of exploring ideas like pushing down aggregations,
> > enabling
> > >> Presto to query directly from deep storage (in cases where there
> aren't
> > any
> > >> interesting things to push down, this may be more efficient than
> > querying
> > >> Druid servers), enabling translation from Druid's SQL dialect to
> > Presto's
> > >> SQL dialect (a "reverse connector"), etc. Do you (or anyone else on
> this
> > >> list) have any thoughts on any of those?
> > >>
> > >> I'm also curious what kinds of improvements you're planning to the
> > >> connector you built.
> > >>
> > >> On Thu, Jul 9, 2020 at 10:18 AM Samarth Jain 
> > >> wrote:
> > >>
> > >>> Hi Gian,
> > >>>
> > >>> I contributed the jdbc based presto-druid connector in prestosql
> which
> > >>> went
> > >>> out in release 337
> > >>> https://prestosql.io/docs/current/release/release-337.html. The v1
> > >>> version
> > >>> of the connector doesn’t support aggregate push down yet. It is being
> > >>> actively worked on and we expect it to be improved over the next few
> > >>> releases. We are currently evaluating using the presto-druid
> connector
> > in
> > >>> our Tableau setup. It would be interesting to see what changes in
> Druid
> > >>> would be needed to support that integration.
> > >>>
> > >>> Thanks,
> > >>> Samarth
> > >>>
> > >>> On Thu, Jul 9, 2020 at 10:07 AM Gian Merlino 
> wrote:
> > >>>
> > >>>> Hey Druids,
> > >>>>
> > >>>> I was wondering, is anyone on this list using Druid + Presto
> together?
> > >>> If
> > >>>> so, what does your architecture look like and which edition / flavor
> > of
> > >>>> Presto and Druid connector are you using? What's your experience
> been
> > >>> like?
> > >>>> I'm asking since I'm starting to think about whether it makes sense
> to
> > >>> look
> > >>>> at ways to improve the integration between the two projects.
> > >>>>
> > >>>> Gian
> > >>>>
> > >>>
> > >>
> >
> >
>


Re: Druid + Presto?

2020-07-09 Thread Zhenxiao Luo
Thank you, Mainak.

Hi Gian,

Glad to see you are interested in the Presto Druid connector.

My colleagues @Hao Luo and @Beinan Wang and I implemented the Presto Druid
connector in PrestoDB:
https://prestodb.io/docs/current/connector/druid.html

Our implementation includes:
1. Presto can scan Druid segments to compute SQL results.
2. Aggregation pushdown, where Presto leverages Druid's fast aggregation
capabilities and streams aggregated results from Druid.
We actually implemented two execution paths; users can use configuration
to control whether they'd like to scan segments or push down all sub-queries
to Druid.
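
To show what the pushdown path amounts to in practice, here is a small sketch
that builds (but does not send) a request to Druid's SQL HTTP endpoint,
`/druid/v2/sql`, carrying an aggregate query. It is only an illustration of
handing the GROUP BY to Druid, not a description of how the connector is
implemented. The `build_pushdown_request` helper, the broker address, and the
`wikipedia` datasource are assumptions for the example.

```python
import json
import urllib.request

def build_pushdown_request(broker_url: str, sql: str) -> urllib.request.Request:
    """Build (but do not send) a POST to Druid's SQL endpoint with the aggregate query."""
    body = json.dumps({"query": sql}).encode("utf-8")
    return urllib.request.Request(
        url=broker_url.rstrip("/") + "/druid/v2/sql",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

# Hypothetical broker address and datasource, for illustration only.
request = build_pushdown_request(
    "http://localhost:8082",
    "SELECT channel, COUNT(*) AS edits FROM wikipedia GROUP BY channel",
)
```

Passing `request` to `urllib.request.urlopen` would send it; Druid would then
return only the aggregated rows instead of the raw rows that the scan path
has to read.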

We have run benchmarks comparing the Presto Druid connector with other SQL
engines, and we are ready to run production workloads.

Thanks,
Zhenxiao

On Thu, Jul 9, 2020 at 12:40 PM Mainak Ghosh  wrote:

> Hello Gian,
>
> We are currently testing the (other) Presto Druid connector at our end. It
> has aggregation push down support. Adding Zhenxiao to this thread since he
> is the primary developer of the connector. He can provide the kind of
> details you are looking for.
>
> Thanks,
> Mainak
>
> > On Jul 9, 2020, at 12:25 PM, Gian Merlino  wrote:
> >
> > By the way, I see that the other Presto has a Druid connector too:
> > https://prestodb.io/docs/current/connector/druid.html. From the docs it
> > looks like it has different lineage and might even work differently.
> >
> > On Thu, Jul 9, 2020 at 12:22 PM Gian Merlino  wrote:
> >
> >> I was thinking of exploring ideas like pushing down aggregations,
> enabling
> >> Presto to query directly from deep storage (in cases where there aren't
> any
> >> interesting things to push down, this may be more efficient than
> querying
> >> Druid servers), enabling translation from Druid's SQL dialect to
> Presto's
> >> SQL dialect (a "reverse connector"), etc. Do you (or anyone else on this
> >> list) have any thoughts on any of those?
> >>
> >> I'm also curious what kinds of improvements you're planning to the
> >> connector you built.
> >>
> >> On Thu, Jul 9, 2020 at 10:18 AM Samarth Jain 
> >> wrote:
> >>
> >>> Hi Gian,
> >>>
> >>> I contributed the jdbc based presto-druid connector in prestosql which
> >>> went
> >>> out in release 337
> >>> https://prestosql.io/docs/current/release/release-337.html. The v1
> >>> version
> >>> of the connector doesn’t support aggregate push down yet. It is being
> >>> actively worked on and we expect it to be improved over the next few
> >>> releases. We are currently evaluating using the presto-druid connector
> in
> >>> our Tableau setup. It would be interesting to see what changes in Druid
> >>> would be needed to support that integration.
> >>>
> >>> Thanks,
> >>> Samarth
> >>>
> >>> On Thu, Jul 9, 2020 at 10:07 AM Gian Merlino  wrote:
> >>>
> >>>> Hey Druids,
> >>>>
> >>>> I was wondering, is anyone on this list using Druid + Presto together?
> >>> If
> >>>> so, what does your architecture look like and which edition / flavor
> of
> >>>> Presto and Druid connector are you using? What's your experience been
> >>> like?
> >>>> I'm asking since I'm starting to think about whether it makes sense to
> >>> look
> >>>> at ways to improve the integration between the two projects.
> >>>>
> >>>> Gian
> >>>>
> >>>
> >>
>
>


Re: Druid + Presto?

2020-07-09 Thread Mainak Ghosh
Hello Gian,

We are currently testing the (other) Presto Druid connector at our end. It has 
aggregation push down support. Adding Zhenxiao to this thread since he is the 
primary developer of the connector. He can provide the kind of details you are 
looking for.

Thanks,
Mainak 

> On Jul 9, 2020, at 12:25 PM, Gian Merlino  wrote:
> 
> By the way, I see that the other Presto has a Druid connector too:
> https://prestodb.io/docs/current/connector/druid.html. From the docs it
> looks like it has different lineage and might even work differently.
> 
> On Thu, Jul 9, 2020 at 12:22 PM Gian Merlino  wrote:
> 
>> I was thinking of exploring ideas like pushing down aggregations, enabling
>> Presto to query directly from deep storage (in cases where there aren't any
>> interesting things to push down, this may be more efficient than querying
>> Druid servers), enabling translation from Druid's SQL dialect to Presto's
>> SQL dialect (a "reverse connector"), etc. Do you (or anyone else on this
>> list) have any thoughts on any of those?
>> 
>> I'm also curious what kinds of improvements you're planning to the
>> connector you built.
>> 
>> On Thu, Jul 9, 2020 at 10:18 AM Samarth Jain 
>> wrote:
>> 
>>> Hi Gian,
>>> 
>>> I contributed the jdbc based presto-druid connector in prestosql which
>>> went
>>> out in release 337
>>> https://prestosql.io/docs/current/release/release-337.html. The v1
>>> version
>>> of the connector doesn’t support aggregate push down yet. It is being
>>> actively worked on and we expect it to be improved over the next few
>>> releases. We are currently evaluating using the presto-druid connector in
>>> our Tableau setup. It would be interesting to see what changes in Druid
>>> would be needed to support that integration.
>>> 
>>> Thanks,
>>> Samarth
>>> 
>>> On Thu, Jul 9, 2020 at 10:07 AM Gian Merlino  wrote:
>>> 
>>>> Hey Druids,
>>>> 
>>>> I was wondering, is anyone on this list using Druid + Presto together?
>>> If
>>>> so, what does your architecture look like and which edition / flavor of
>>>> Presto and Druid connector are you using? What's your experience been
>>> like?
>>>> I'm asking since I'm starting to think about whether it makes sense to
>>> look
>>>> at ways to improve the integration between the two projects.
>>>> 
>>>> Gian
>>>> 
>>> 
>> 





Re: Druid + Presto?

2020-07-09 Thread Hao Luo
I wrote the Druid connector that reads directly from deep storage (the one
you found in prestodb). It understands the segment file format and reads
only the data needed for the query. I think it's possible to have
aggregation pushdown (and even more possibilities with other complex
operation pushdowns), but more work needs to be done on the Presto side to
take advantage of the indexes in the segments.

On Thu, Jul 9, 2020 at 12:25 PM Gian Merlino  wrote:

> By the way, I see that the other Presto has a Druid connector too:
> https://prestodb.io/docs/current/connector/druid.html. From the docs it
> looks like it has different lineage and might even work differently.
>
> On Thu, Jul 9, 2020 at 12:22 PM Gian Merlino  wrote:
>
> > I was thinking of exploring ideas like pushing down aggregations,
> enabling
> > Presto to query directly from deep storage (in cases where there aren't
> any
> > interesting things to push down, this may be more efficient than querying
> > Druid servers), enabling translation from Druid's SQL dialect to Presto's
> > SQL dialect (a "reverse connector"), etc. Do you (or anyone else on this
> > list) have any thoughts on any of those?
> >
> > I'm also curious what kinds of improvements you're planning to the
> > connector you built.
> >
> > On Thu, Jul 9, 2020 at 10:18 AM Samarth Jain 
> > wrote:
> >
> >> Hi Gian,
> >>
> >> I contributed the jdbc based presto-druid connector in prestosql which went
> >> out in release 337
> >> https://prestosql.io/docs/current/release/release-337.html. The v1 version
> >> of the connector doesn’t support aggregate push down yet. It is being
> >> actively worked on and we expect it to be improved over the next few
> >> releases. We are currently evaluating using the presto-druid connector in
> >> our Tableau setup. It would be interesting to see what changes in Druid
> >> would be needed to support that integration.
> >>
> >> Thanks,
> >> Samarth
> >>
> >> On Thu, Jul 9, 2020 at 10:07 AM Gian Merlino  wrote:
> >>
> >> > Hey Druids,
> >> >
> >> > I was wondering, is anyone on this list using Druid + Presto together? If
> >> > so, what does your architecture look like and which edition / flavor of
> >> > Presto and Druid connector are you using? What's your experience been like?
> >> > I'm asking since I'm starting to think about whether it makes sense to look
> >> > at ways to improve the integration between the two projects.
> >> >
> >> > Gian
> >> >
> >>
> >
>


Re: Druid + Presto?

2020-07-09 Thread Gian Merlino
By the way, I see that the other Presto has a Druid connector too:
https://prestodb.io/docs/current/connector/druid.html. From the docs it
looks like it has different lineage and might even work differently.

On Thu, Jul 9, 2020 at 12:22 PM Gian Merlino  wrote:

> I was thinking of exploring ideas like pushing down aggregations, enabling
> Presto to query directly from deep storage (in cases where there aren't any
> interesting things to push down, this may be more efficient than querying
> Druid servers), enabling translation from Druid's SQL dialect to Presto's
> SQL dialect (a "reverse connector"), etc. Do you (or anyone else on this
> list) have any thoughts on any of those?
>
> I'm also curious what kinds of improvements you're planning to the
> connector you built.
>
> On Thu, Jul 9, 2020 at 10:18 AM Samarth Jain wrote:
>
>> Hi Gian,
>>
>> I contributed the jdbc based presto-druid connector in prestosql which went
>> out in release 337
>> https://prestosql.io/docs/current/release/release-337.html. The v1 version
>> of the connector doesn’t support aggregate push down yet. It is being
>> actively worked on and we expect it to be improved over the next few
>> releases. We are currently evaluating using the presto-druid connector in
>> our Tableau setup. It would be interesting to see what changes in Druid
>> would be needed to support that integration.
>>
>> Thanks,
>> Samarth
>>
>> On Thu, Jul 9, 2020 at 10:07 AM Gian Merlino  wrote:
>>
>> > Hey Druids,
>> >
>> > I was wondering, is anyone on this list using Druid + Presto together? If
>> > so, what does your architecture look like and which edition / flavor of
>> > Presto and Druid connector are you using? What's your experience been like?
>> > I'm asking since I'm starting to think about whether it makes sense to look
>> > at ways to improve the integration between the two projects.
>> >
>> > Gian
>> >
>>
>


Re: Druid + Presto?

2020-07-09 Thread Gian Merlino
I was thinking of exploring ideas like pushing down aggregations, enabling
Presto to query directly from deep storage (in cases where there aren't any
interesting things to push down, this may be more efficient than querying
Druid servers), enabling translation from Druid's SQL dialect to Presto's
SQL dialect (a "reverse connector"), etc. Do you (or anyone else on this
list) have any thoughts on any of those?

I'm also curious what kinds of improvements you're planning to the
connector you built.

On Thu, Jul 9, 2020 at 10:18 AM Samarth Jain  wrote:

> Hi Gian,
>
> I contributed the jdbc based presto-druid connector in prestosql which went
> out in release 337
> https://prestosql.io/docs/current/release/release-337.html. The v1 version
> of the connector doesn’t support aggregate push down yet. It is being
> actively worked on and we expect it to be improved over the next few
> releases. We are currently evaluating using the presto-druid connector in
> our Tableau setup. It would be interesting to see what changes in Druid
> would be needed to support that integration.
>
> Thanks,
> Samarth
>
> On Thu, Jul 9, 2020 at 10:07 AM Gian Merlino  wrote:
>
> > Hey Druids,
> >
> > I was wondering, is anyone on this list using Druid + Presto together? If
> > so, what does your architecture look like and which edition / flavor of
> > Presto and Druid connector are you using? What's your experience been like?
> > I'm asking since I'm starting to think about whether it makes sense to look
> > at ways to improve the integration between the two projects.
> >
> > Gian
> >
>


Re: Druid + Presto?

2020-07-09 Thread Samarth Jain
Hi Gian,

I contributed the jdbc based presto-druid connector in prestosql which went
out in release 337
https://prestosql.io/docs/current/release/release-337.html. The v1 version
of the connector doesn’t support aggregate push down yet. It is being
actively worked on and we expect it to be improved over the next few
releases. We are currently evaluating using the presto-druid connector in
our Tableau setup. It would be interesting to see what changes in Druid
would be needed to support that integration.

Thanks,
Samarth

On Thu, Jul 9, 2020 at 10:07 AM Gian Merlino  wrote:

> Hey Druids,
>
> I was wondering, is anyone on this list using Druid + Presto together? If
> so, what does your architecture look like and which edition / flavor of
> Presto and Druid connector are you using? What's your experience been like?
> I'm asking since I'm starting to think about whether it makes sense to look
> at ways to improve the integration between the two projects.
>
> Gian
>


Druid + Presto?

2020-07-09 Thread Gian Merlino
Hey Druids,

I was wondering, is anyone on this list using Druid + Presto together? If
so, what does your architecture look like and which edition / flavor of
Presto and Druid connector are you using? What's your experience been like?
I'm asking since I'm starting to think about whether it makes sense to look
at ways to improve the integration between the two projects.

Gian