Re: new committer: Fokko Driesprong

2019-09-20 Thread Gian Merlino
Congrats Fokko!! On Tue, Sep 17, 2019 at 1:53 PM Jonathan Wei wrote: > The Project Management Committee (PMC) for Apache Druid > has invited Fokko Driesprong to become a committer and we are pleased > to announce that he has accepted. > > Being a committer enables easier contribution to the > pr

Re: Change to website's Powered By page

2019-09-20 Thread Gian Merlino
Hey Pierre, You submitted it to the right place. [3] is updated manually, and periodically, based on changes to [2]. I just merged your patch, so it will go out with the next website update. Thanks for your contribution! On Fri, Sep 20, 2019 at 12:28 AM Pierre Smits wrote: > Hi all, > > Recentl

Re: new committer: Furkan Kamaci

2019-09-20 Thread Gian Merlino
Congrats Furkan!! On Tue, Sep 17, 2019 at 1:53 PM Jonathan Wei wrote: > The Project Management Committee (PMC) for Apache Druid > has invited Furkan Kamaci to become a committer and we are pleased > to announce that he has accepted. > > Being a committer enables easier contribution to the > proj

Re: Looking out of the Druid's development bubble at modern Java testing practices

2019-10-08 Thread Gian Merlino
I like the "Complete Vertical Slide" recommendation. It goes against the wisdom of having focused unit tests, but I think in my experience, the tests that shake out the most bugs (and are most robust to refactoring) have been ones that wrap together a lot of layers. One thing I didn't see in the a

Re: A list of issues for new committers

2019-10-08 Thread Gian Merlino
That is definitely a good property for starter issues (not needing a production cluster to validate). On Fri, Oct 4, 2019 at 12:01 AM Roman Leventov wrote: > I don't use "Contributions Welcome" exclusively for starter issues. I do > _not_ put this label though on issues that are impossible or ha

Re: A list of issues for new committers

2019-10-08 Thread Gian Merlino
Vadim, the idea of removing the Difficulty labels and repurposing "Easy" for intro issues sounds good to me. It sounds like "Starter" as you envision it is a subset of "Contributions Welcome" as Roman envisions it. I wonder if there is some way we can align these better. It looks like _most_ of th

Re: Nulls vs Optional

2019-10-10 Thread Gian Merlino
For reference, a (brief) earlier conversation about this: https://github.com/apache/incubator-druid/issues/4275, which links to https://github.com/apache/incubator-druid/pull/4254#discussion_r116628607, which links to https://stackoverflow.com/questions/26327957/should-java-8-getters-return-optiona

Moving pydruid to Apache?

2019-10-10 Thread Gian Merlino
pydruid (https://github.com/druid-io/pydruid) is a project that is not part of Druid, but is related: 1) Its committers are all also Druid committers. 2) Its community is a subset of the Druid community (assuming pydruid users/devs are all also Druid users or devs, which seems likely). 3) It's hos

Re: A list of issues for new committers

2019-10-14 Thread Gian Merlino
19 at 5:46 PM Vadim Ogievetsky wrote: > Ok I will remove the Difficulty labels and repurpose "Easy" to "Starter" > unless anyone objects within 24h. > > On 2019/10/09 06:30:06, Gian Merlino wrote: > > That is definitely a good property for starter issues (

Re: Interested in Apache Druid Project

2019-10-14 Thread Gian Merlino
Hi Gautam, We have a list of suggested starter issues: https://github.com/apache/incubator-druid/issues?utf8=%E2%9C%93&q=is%3Aopen+is%3Aissue++label%3AStarter+ Additionally, contributions to clarify or correct documentation are always appreciated. On Mon, Oct 14, 2019 at 6:55 PM gautam gupta wr

Re: CDA - contributor agreement now that Druid entered the apache incubator

2019-10-15 Thread Gian Merlino
Hey Sascha, We are now operating under Apache policy, meaning we require CLAs for committers but not for every single contributor. I've written a patch to update our site, thanks for pointing it out: https://github.com/apache/incubator-druid-website-src/pull/65 On Tue, Oct 15, 2019 at 8:56 AM Sas

Custom Parser

2019-10-22 Thread Gian Merlino
Hey Tony, I accidentally rejected your message to the Druid dev list about writing a custom parser, by fat-fingering the item in the moderation queue. Sorry about that! You had asked about being pointed in a useful direction in terms of writing a custom parser for a proprietary data format. You m

Re: Custom Parser

2019-10-22 Thread Gian Merlino
estamp", > "format": "auto" > }, > "dimensionsSpec": { > "dimensions": [] > } > } > } > } > } > > > my module looks like: > > public class MyTypeEventDruidParserMo

Re: apply to join mail list

2019-10-25 Thread Gian Merlino
You can join by sending a message to dev-subscr...@druid.apache.org. On Fri, Oct 25, 2019 at 10:03 AM 张天生 wrote: > apply to join mail list >

Re: Release 0.16.0-incubating not running properly on OpenJDK 8 or 11 - segments become unavailable

2019-11-02 Thread Gian Merlino
Hi Tony, Druid doesn't fully support Java 11 today — we're working towards it but there are issues to work through related to 'unofficial' APIs we need for ByteBuffers and Cleaners, as well as issues with dependencies like DataSketches and Hadoop. We've updated our docs recently to reflect it: htt

Re: Failing CIs has been fixed

2019-11-06 Thread Gian Merlino
Thanks Jihoon! On Wed, Nov 6, 2019 at 11:35 AM Jihoon Son wrote: > Hi all, > > our CI, especially TeamCity and LGTM had been broken for a while because of > a missing library in the maven repository. If you see the following error > in the log, your PR is suffering the same problem. > > [2019-11

Re: Allow static imports in tests

2019-11-15 Thread Gian Merlino
I quite dislike EasyMock (I think it leads to brittle tests that over-couple test code with production code). But that comment aside, I think it is reasonable to use static imports for DSL-type stuff in tests, which it sounds like what you are suggesting. So that sounds good to me. I would still su

Re: Don't auto-close Bug-labelled PRs after a period of inactivity with stale bot.

2019-11-17 Thread Gian Merlino
Is that configuration possible? On Sat, Nov 16, 2019 at 1:00 AM Roman Leventov wrote: > I think it would be better to configure the stale-bot to leave a message in > a PR every 60 days, but not actually close PRs with Bug label. Feels a > little like sweeping problems under the carpet. >

LGTM issues

2019-11-18 Thread Gian Merlino
Hey Druids, The LGTM tool is broken right now and it is holding up PRs from being validated. Vadim and I looked into it a bit and it seems likely that something is wrong with an internal TLS proxy on the LGTM side (certain fetches to npm through that proxy don't work, even though they work outside

Re: LGTM issues

2019-11-18 Thread Gian Merlino
The patch in https://github.com/apache/incubator-druid/pull/8900 was effective. If you have an open PR, please merge master into your branch in order to fix LGTM. On Mon, Nov 18, 2019 at 2:30 PM Gian Merlino wrote: > Hey Druids, > > The LGTM tool is broken right now and it is holdi

Re: Dec 2019 podling report draft

2019-12-04 Thread Gian Merlino
I certainly think that would be appropriate. I would like to submit to the board the same resolution we previously had approved by both the dev list & the IPMC, with just the addition of some new initial PMC members (new committers, based on previous discussion that the initial PMC should include

Re: Podling Druid Report Reminder - December 2019

2019-12-04 Thread Gian Merlino
Hi Justin, I think it's fair to say that there aren't any unfinished issues, so it makes sense for the section to be blank. You might remember that a few months ago, Druid got close enough to graduating that the project had a resolution approved by the dev community and the IPMC, and had been sent

Re: Podling Druid Report Reminder - December 2019

2019-12-04 Thread Gian Merlino
That is fair, I suggest we change the first section (issues remaining) to this then: > The project is not aware of any issues blocking graduation. Druid previously shelved a resolution to graduate due to a potential brand issue, which the project has since been working on with VP Brand. It is now

[VOTE] Apache Druid graduation to top level project

2019-12-04 Thread Gian Merlino
* Fangjin Yang (f...@apache.org) * Fokko Driesprong (fo...@apache.org) * Furkan Kamaci (kam...@apache.org) * Gian Merlino (g...@apache.org) * Himanshu Gupta (himans...@apache.org) * Jihoon Son (jihoon...@apache.org) * Jonathan Wei (jon...@ap

Re: Podling Druid Report Reminder - December 2019

2019-12-04 Thread Gian Merlino
By the way, I've just started a graduation VOTE on the dev list. On Wed, Dec 4, 2019 at 12:12 PM Gian Merlino wrote: > That is fair, I suggest we change the first section (issues remaining) to > this then: > > > The project is not aware of any issues blocking graduation

Re: 0.17.0 branch?

2019-12-04 Thread Gian Merlino
This sounds great to me. Thanks Jon. On Wed, Dec 4, 2019 at 1:42 PM Jonathan Wei wrote: > Hi all, > > Since it's been ~3 months since our last major release (0.16.0 released on > Sep. 24, branch created Aug. 27), I propose creating the 0.17.0 branch from > master next Monday (Dec. 9). > > I can

Re: [VOTE] Apache Druid graduation to top level project

2019-12-07 Thread Gian Merlino
+1! > > > > > > > > > > > > > > > > > > Kind Regards, > > > > > > > > > Furkan KAMACI > > > > > > > > > > > > > > > > > > On Wed, 4 Dec 2019 at 23:58, David Lim < > > > > > david...@apache.org> > > >

[RESULT][VOTE] Apache Druid graduation to top level project

2019-12-07 Thread Gian Merlino
This vote passed with 8 PPMC +1s: David Lim, Fangjin Yang, Gian Merlino, Jihoon Son, Jonathan Wei, Julian Hyde, Kurt Young, Slim Bouguerra, Xavier Léauté. In addition, the following nonbinding +1s were received: Benedict Jin, Clint Wylie, Fokko Driesprong, Furkan Kamaci. I will refer this to the Incubator

Re: Druid Scan with result streaming

2019-12-11 Thread Gian Merlino
Actually I'd suggest the opposite approach: use Scan if you want a stream. It can handle almost any number of results (you might want to set druid.broker.http.maxQueuedBytes to 1000 or so if you are doing large resultsets; this will likely be the default in the future). With Scan queries, Drui
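A minimal sketch of the Scan approach suggested in this message, not taken from the thread itself. It builds a native Scan query body and reads the Broker's response in chunks rather than loading it all at once. The datasource name, interval, column names, and Broker address are hypothetical placeholders, and the batchSize value is only an illustrative assumption.

```python
import json
import urllib.request


def build_scan_query(datasource, interval, columns, batch_size=20480):
    """Build a native Druid Scan query body as a plain dict."""
    return {
        "queryType": "scan",
        "dataSource": datasource,          # hypothetical datasource name
        "intervals": [interval],           # ISO-8601 interval string
        "columns": list(columns),
        "resultFormat": "compactedList",
        "batchSize": batch_size,           # illustrative value, an assumption
    }


def post_scan_query(broker_url, query, chunk_size=65536):
    """POST the query to the Broker and yield raw response chunks as they arrive."""
    request = urllib.request.Request(
        broker_url.rstrip("/") + "/druid/v2",
        data=json.dumps(query).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        while True:
            chunk = response.read(chunk_size)
            if not chunk:
                break
            yield chunk


# Build the query only; no network call is made until post_scan_query is iterated.
query = build_scan_query("wikipedia", "2019-12-01/2019-12-02", ["__time", "page"])
print(json.dumps(query, indent=2))
```

To consume results, iterate over post_scan_query("http://localhost:8082", query), where the address is a placeholder for your own Broker, and hand each chunk to an incremental JSON parser so the full result set never has to sit in memory.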

Re: [RESULT][VOTE] Apache Druid graduation to top level project

2019-12-11 Thread Gian Merlino
By the way, a link to the vote thread: https://lists.apache.org/thread.html/2385d5545fa5bd13baf45bd7258fafa7694334d9d14267489d21d99b%40%3Cdev.druid.apache.org%3E On Sat, Dec 7, 2019 at 1:48 PM Gian Merlino wrote: > This vote passed with 8 PPMC +1s: > > David Lim > Fangjin Yang >

Graduation 🎓

2019-12-20 Thread Gian Merlino
Hey Druids, It is official: Druid has graduated to a top level project! Now, we need to conduct various post-graduation tasks. The first is to raise an infra ticket to migrate the appropriate resources. I started it off here: https://issues.apache.org/jira/browse/INFRA-19609. Please, take a look

Re: Empty Data Source in Druid

2019-12-23 Thread Gian Merlino
In Druid it's not possible to have a datasource without any segments. But it is possible, in theory, to have an empty datasource: you would need a single segment that has no rows (the important part is that the segment exists, not that it actually has rows in it). But there are two problems with this:

Re: Test naming in Druid

2019-12-23 Thread Gian Merlino
Suneet, Sometimes it's hard to understand how things would improve without an example. Could you point to a test file that you think would be improved by this change? Also, there are some test files that I would struggle to fit into this framework. It seems best suited to simple single-method unit

Druid Summit 2020: Call for speakers!

2019-12-24 Thread Gian Merlino
Hey Druids, I am excited to announce Druid Summit, an event being held in the San Francisco Bay Area next April 13–15, 2020. The entire Apache Druid community is welcome. It would be great to see a bunch of people from the community giving talks about Druid. The call fo

Re: Test naming in Druid

2019-12-30 Thread Gian Merlino
laining what the test > > does/expects is a good idea (that would be enough info in my view). > > > > Since I don't think the proposed format applies universally, I would > prefer > > starting it off as a suggestion/best practice instead of as a hard > > re

Re: Nulls vs Optional

2019-12-30 Thread Gian Merlino
much more pleasant to work > with > > than the JDK's Optional. > > > > On Thu, Oct 10, 2019 at 5:46 PM Gian Merlino wrote: > > > >> For reference, a (brief) earlier conversation about this: > >> https://github.com/apache/incubator-druid/issues/4275,

Re: Graduation 🎓

2019-12-30 Thread Gian Merlino
most of the incubation references (download > links and the ASF release process guide aren't updated): > https://github.com/apache/druid/pull/9108 > > On Sat, Dec 21, 2019 at 4:10 PM Vadim Ogievetsky > wrote: > > > Huzzah! > > > > Thank you for all the hard work Gian

New committer: Samarth Jain

2020-01-02 Thread Gian Merlino
Hey Druids, The Druid PMC has invited Samarth Jain (@samarthjain on GitHub) to become a committer and we are pleased to announce that he has accepted. Samarth has contributed a variety of improvements to Druid over the past year and has also given back to the commu

Re: 8399 Migrating Guava to Caffeine

2020-01-05 Thread Gian Merlino
Hey JJ, I think your idea of adding a new option and deprecating "guava" is a good way forward. Gian On Fri, Dec 27, 2019 at 7:50 AM JJ Meyer wrote: > Hello all, > > I'm planning on contributing for the first time. I'm working on > https://github.com/apache/druid/issues/8399. No issues seem to

Re: 8399 Migrating Guava to Caffeine

2020-01-06 Thread Gian Merlino
e is no harm in this, Caffeine's concurrency is practically > "elastic" and doesn't demand concurrencyLevel. > > On Mon, 6 Jan 2020 at 01:13, Gian Merlino wrote: > > > Hey JJ, > > > > I think your idea of adding a new option and deprecating "gu

New committer: Alexander Saydakov

2020-01-07 Thread Gian Merlino
Hey Druids, The Druid PMC has invited Alexander Saydakov (@AlexanderSaydakov on GitHub) to become a committer and we are pleased to announce that he has accepted. Alexander has contributed extensively to Druid's DataSketches extension, and is also a committer and PPMC member on the Apache DataSket

January 2020 Druid report

2020-01-09 Thread Gian Merlino
Hey Druids, Now that we're a top level project we're required to report periodically to the board. I just sent the following report (our first one). I'm posting it here in case anyone has any feedback, and so anyone interested can read it. Next time we'll post a draft here before submitting the re

Re: Druid Summit 2020: Call for speakers!

2020-01-14 Thread Gian Merlino
12 PM Gian Merlino wrote: > Hey Druids, > > I am excited to announce Druid Summit <https://druidsummit.org/>, an > event being held in the San Francisco Bay Area next April 13–15, 2020. The > entire Apache Druid community is welcome. > > It would be great to see a bunch of

Re: Publish staging docs for release earlier?

2020-01-15 Thread Gian Merlino
I love the idea. Even better if we can publish the docs from master somewhere (in addition to the current release branch). Both are useful to see. On Tue, Jan 14, 2020 at 7:14 PM Jonathan Wei wrote: > Hi all, > > We currently publish a staging version of the Druid docs for an upcoming > release

Re: Pull Requests need a review

2020-01-15 Thread Gian Merlino
Hey Serge, Thanks for the patches! I took a look at https://github.com/apache/druid/pull/8881 and posted a review. If anyone else could help review the other two, I'd be grateful. Gian On Mon, Jan 13, 2020 at 9:06 AM Serge Bespalov wrote: > Hello Druid developers. > I have following opened PR

New committer: Chi Cao Minh

2020-01-21 Thread Gian Merlino
Hey Druids, The Druid PMC has invited Chi Cao Minh (@ccaominh on GitHub) to become a committer and we are pleased to announce that he has accepted. Chi has done work in a variety of areas, including adding range partitioning to native batch ingestion, quality-of-life work on CI and dependency gard

Re: [VOTE] Release Apache Druid 0.17.0 [RC1]

2020-01-26 Thread Gian Merlino
What version of javac are you using? I wonder if there's something going on there. On Sat, Jan 25, 2020 at 10:48 AM Hagen Rother wrote: > I fail to compile 0.17-rc1 on both linux and mac. I even removed ~/.m2 just > make sure it's not that. > > It fails on: > java.lang.NoClassDefFoundError: Coul

Re: mocking frameworks for tests in druid

2020-02-04 Thread Gian Merlino
I've never really liked EasyMock. I agree that its design tends to make test code too tightly coupled with the specific implementation being tested. > What would it take for us to try another framework like Mockito? IMO, all it'd take is a PR changing one of our EasyMock tests to a Mockito

ByteBuffer / Memory / Unsafe et al

2020-02-04 Thread Gian Merlino
Hey Druids, There has generally been a lot of talk about moving away from ByteBuffer and towards the DataSketches Memory package ( https://datasketches.apache.org/docs/Memory/MemoryPackage.html) or even using Unsafe directly. Much of that discussion happened on https://github.com/apache/druid/issu

Re: ByteBuffer / Memory / Unsafe et al

2020-02-05 Thread Gian Merlino
nd it. > This does not exclude or means we should not use Memory API for other stuff > like sketches et al, in fact i think even for project like Sketches it > makes more sense to move to newer API offered by the JDK rather that do it > your self. > > > On Tue, Feb 4, 2020 at 10:12 P

Re: ByteBuffer / Memory / Unsafe et al

2020-02-05 Thread Gian Merlino
n. > > On Wed, Feb 5, 2020 at 9:43 AM Gian Merlino wrote: > > > The thing that worries me about JEP 370 is that if historical Java user > > migration patterns hold up, we will need to support Java 11 for a while > > (probably another 2–3 years), and we would therefore nee

Re: ByteBuffer / Memory / Unsafe et al

2020-02-06 Thread Gian Merlino
later, once we drop support for Java pre-14. Separately, I think if we do build an abstraction layer here, we need to make sure the performance overhead is zero — it's important that the jvm be able to inline the underlying calls. > @Gian Merlino I think i am not 100% sure about the scope

Re: ByteBuffer / Memory / Unsafe et al

2020-02-06 Thread Gian Merlino
in those cases, relatively more time is spent doing aggregations). On Thu, Feb 6, 2020 at 11:32 AM Gian Merlino wrote: > We could make an interface that is a subset of Memory and use that. But I > think once the new JEP 370 stuff becomes mainstream it would be best to use > it directly,

Re: ByteBuffer / Memory / Unsafe et al

2020-02-06 Thread Gian Merlino
> If JEP370 does actually appear, and is workable, we will be strongly > motivated to replace our current use of Unsafe inside Memory with the newer > API, and all of that could be behind the current Memory API. > > > > > On 2020/02/06 20:33:18, Gian Merlino wrote: > >

Re: draft ASF Board Report Feb 2020

2020-02-14 Thread Gian Merlino
Thanks Clint! I could not have written it better myself. I just added this to the Board agenda for next week. On Fri, Feb 14, 2020 at 5:28 PM Clint Wylie wrote: > Hey all, > > I've put together our ASF board report for Feb 2020, and while I haven't > yet determined how to actually submit it, I

Re: draft ASF Board Report Feb 2020

2020-02-16 Thread Gian Merlino
Thanks for taking a look, Julian. I added this to the agenda via Whimsy Friday and fixed the spelling error. On Mon, Feb 17, 2020 at 11:04 AM Julian Hyde wrote: > Looks good. > > Maybe mention that we are working with Sally on a press release to > announce graduation? > > Spelling: New Dehli sho

Travis emails being send to dev list

2020-03-04 Thread Gian Merlino
Recently, Travis CI emails started being sent from bui...@travis-ci.org to dev@druid.apache.org. Did someone change something recently to make this happen? Also, do people enjoy that they show up here? I'm asking because currently they end up in a spam moderation queue and need to be manually appr

Re: Travis emails being send to dev list

2020-03-05 Thread Gian Merlino
ailure to success (but keep the > notification for success to failure). > > Thanks, > Chi > > > On Mar 4, 2020, at 11:47 AM, Gian Merlino wrote: > > > > Recently, Travis CI emails started being sent from bui...@travis-ci.org > to > > dev@druid.apache.o

Re: Draft April ASF Board Report

2020-04-08 Thread Gian Merlino
Looks good to me. Thank you for drafting the report this month. On Tue, Apr 7, 2020 at 6:05 PM Clint Wylie wrote: > Hey all, > > I put together a draft for the quarterly ASF board report due tomorrow, > sorry for the short notice. Let me know if I missed anything or should make > any changes. Th

Re: Draft April ASF Board Report

2020-04-08 Thread Gian Merlino
It does matter! But, we mentioned those in a previous report (our last one was just a month ago — so this one covers the last month). After this report they'll start being quarterly and covering 3 months. On Wed, Apr 8, 2020 at 1:18 PM itai yaffe wrote: > Hey, > Not sure it matters, but we actua

Re: Draft April ASF Board Report

2020-04-09 Thread Gian Merlino
ous monthly reports and wasn't > sure > > if it is necessary to go over it again. I think it is probably fine > without > > it since the information was included in previous reports? > > > > On Wed, Apr 8, 2020 at 1:44 PM Gian Merlino wrote: > > > > > It do

Re: Moving Average error

2020-04-29 Thread Gian Merlino
Hey Damiano, This is a contrib extension so you might get limited support for it here. That being said, at first, I suggest looking through the logs mentioned by the supervise errors (like /home/damiano/druid/0.17.1/var/sv/coordinator-overlord.log). IIRC you might need to disable the extension on

Re: Cross-platform discrepancies

2020-04-29 Thread Gian Merlino
>> >> >>> >> This will still require some changes to our library to support memory >>> >> allocation like this, but it seems to be less challenging than the >>> current >>> >> Direct memory mode we have. >>> >> >>> >>

Streaming updates and deletes

2020-04-30 Thread Gian Merlino
Hey Druids, Now that a join operator exists and is well on its way to being useful, I started thinking about some other pie in the sky ideas. In particular one that seems very useful is supporting updates and deletes. Of course, we support updates and deletes today, but only on a whole-time-chunk

Re: PRs awaiting review

2020-05-27 Thread Gian Merlino
Hey Samarth, It looks like the last PR has been merged already — great! I just wrote up a review for your first PR, about round robin data types. I haven't had a chance to check out the unknown-complex-types PR yet; apologies. I'm now subscribed to them all, though. On Fri, May 15, 2020 at 5:0

Re: Feature lifecycle for Druid features

2020-06-15 Thread Gian Merlino
IMO the alpha / beta / GA terminology makes sense, and makes things clearer to users, which is good. Some thoughts on the specifics of your proposal: - You're suggesting we commit to a specific number of releases that a GA feature will be forward / backward compatible for. IMO, our current commit

Re: Druid 0.19.0

2020-06-15 Thread Gian Merlino
I commented on https://github.com/apache/druid/issues/10011; it looks like a SQL planner problem to me. I also logged a review of https://github.com/apache/druid/pull/10027. I don't think either of these needs to be a release blocker, though. 10027 in particular I am sure has been around for a wh

New committer: Maytas Monsereenusorn

2020-07-07 Thread Gian Merlino
Hey Druids, The Druid PMC has invited Maytas Monsereenusorn (@maytasm on GitHub) to become a committer and we are pleased to announce that he has accepted. Maytas has contributed to various areas including automated testing improvements and bug fixes. He has also been

New committer: David Glasser

2020-07-07 Thread Gian Merlino
Hey Druids, The Druid PMC has invited David Glasser (@glasser on GitHub) to become a committer and we are pleased to announce that he has accepted. David has contributed an improved, parallelized self-ingestion firehose as well as various other patches. He has also par

New committer: Lucas Capistrant

2020-07-07 Thread Gian Merlino
Hey Druids, The Druid PMC has invited Lucas Capistrant (@capistrant on GitHub) to become a committer and we are pleased to announce that he has accepted. Lucas has been active throughout the past year, contributing various enhancements and fixes. Congratulations Lu

New committer: Suneet Saldanha

2020-07-07 Thread Gian Merlino
Hey Druids, The Druid PMC has invited Suneet Saldanha (@suneet-s on GitHub) to become a committer and we are pleased to announce that he has accepted. Suneet has contributed to areas including the new join functionality, documentation, and general code quality. He has

New committer: Maggie Brewster

2020-07-07 Thread Gian Merlino
Hey Druids, The Druid PMC has invited Maggie Brewster (@mcbrewster on GitHub) to become a committer and we are pleased to announce that she has accepted. Maggie has made dozens of contributions to Druid, especially to the (relatively) new web console. Congratulatio

Druid + Presto?

2020-07-09 Thread Gian Merlino
Hey Druids, I was wondering, is anyone on this list using Druid + Presto together? If so, what does your architecture look like and which edition / flavor of Presto and Druid connector are you using? What's your experience been like? I'm asking since I'm starting to think about whether it makes se

Re: Druid + Presto?

2020-07-09 Thread Gian Merlino
ked on and we expect it to be improved over the next few > releases. We are currently evaluating using the presto-druid connector in > our Tableau setup. It would be interesting to see what changes in Druid > would be needed to support that integration. > > Thanks, > Samarth > > On T

Re: Druid + Presto?

2020-07-09 Thread Gian Merlino
By the way, I see that the other Presto has a Druid connector too: https://prestodb.io/docs/current/connector/druid.html. From the docs it looks like it has different lineage and might even work differently. On Thu, Jul 9, 2020 at 12:22 PM Gian Merlino wrote: > I was thinking of exploring id

Re: Druid + Presto?

2020-07-09 Thread Gian Merlino
aggregation push down support. Adding Zhenxiao to this thread since > he > > is the primary developer of the connector. He can provide the kind of > > details you are looking for. > > > > Thanks, > > Mainak > > > > > On Jul 9, 2020, at 12:25 PM, Gian Mer

Re: Druid + Presto?

2020-07-09 Thread Gian Merlino
> > > > Thanks, > > Zhenxiao > > > > On Thu, Jul 9, 2020 at 12:40 PM Mainak Ghosh wrote: > > > > > Hello Gian, > > > > > > We are currently testing the (other) Presto Druid connector at our end. > > It > > > has aggregation p

Re: Any benchmarks for druid iingesting, querying (min, max, topn avg etc)

2020-07-09 Thread Gian Merlino
Hey Rajiv, I'm not aware of one for ingestion. For querying, two recent results using the Star Schema Benchmark are this paper comparing Druid, Hive, and Presto: https://www.researchgate.net/publication/333831332_Challenging_SQL-on-Hadoop_Performance_with_Apache_Druid, and this blog post comparing

Re: Druid + Presto?

2020-07-09 Thread Gian Merlino
n the last year or so — does it work the same way in each one? I'm wondering how much work can be shared between these different efforts and perhaps between these efforts and the Druid project itself. On Thu, Jul 9, 2020 at 11:24 PM Gian Merlino wrote: > Hey Samarth, > > Thank

Re: Druid not listed in Apache project list by category?

2020-07-31 Thread Gian Merlino
That's a good point. We must be missing some metadata. I'm not sure how this page works — does anyone else know? On Fri, Jul 31, 2020 at 11:49 AM Will Lauer wrote: > I was browsing the list of Apache projects today looking for something, and > while I was there, I noticed that Druid was missing.

Re: Study On Rejected Refactorings

2020-08-12 Thread Gian Merlino
Hey Jevgenija, I recently filled out the survey — hope the response is helpful! On Tue, Aug 11, 2020 at 1:05 PM Jevgenija Pantiuchina < jevgenija.pantiuch...@usi.ch> wrote: > Dear contributors, > > As part of a research team from Università della Svizzera italiana > (Switzerland) and University

Re: SQL Support for Tuple Sketches

2020-08-12 Thread Gian Merlino
Hey Mithal, I'm not aware of anyone currently working on it, so you certainly are welcome to! On Mon, Aug 10, 2020 at 11:56 AM Mithal Kothari wrote: > Hi Druid Dev team, > > I just wanted to follow up with you'll and find out if there is a > plan/possibility to introduce sql support for tuple s

Re: [CRON] Broken: apache/druid#28120 (master - c72f96a)

2020-08-19 Thread Gian Merlino
There's a lot of these with messages like: > [ERROR] Failed to execute goal org.owasp:dependency-check-maven:5.3.2:check (default-cli) on project druid: Fatal exception(s) analyzing Druid: One or more exceptions occurred during analysis: > [ERROR] Unable to connect to the dependency-check database

New committer: Atul Mohan

2020-09-02 Thread Gian Merlino
Hey Druids, The Druid PMC has invited Atul Mohan (@a2l007 on GitHub) to become a committer and we are pleased to announce that he has accepted. Atul has been actively working on various parts of Druid, including indexing from SQL sources and result-level caching. Congr

Re: Help in Configuring data retention

2020-09-21 Thread Gian Merlino
Hey Satish, Are you asking if Druid can write a log of load/drop rule changes to a Kafka topic? If so, no, it cannot. But it does write them to the metadata store, and perhaps you could use a tool to copy them from the metadata store into Kafka. On Mon, Sep 21, 2020 at 6:46 AM Satish Embadi < sa

Code reviews, UX, and tests

2020-10-15 Thread Gian Merlino
Hey Druids, I am writing to you all to ask for your help 🙂 In particular, your help in ensuring that potential code contributions are reviewed in a timely fashion. Right now we have 72 open PRs, which, due to stalebot, were mostly opened pretty recently. That's a lot of people that want to contribut

Re: [E] Re: Removing Druid support for JDK 8 and adding support for JDK 11

2020-11-13 Thread Gian Merlino
Seconding (thirding?) the idea that keeping JDK 8 for integration with Hadoop is important. Druid's Hadoop integration is built against Hadoop 2.x and that version only supports JDK 8: https://cwiki.apache.org/confluence/display/HADOOP/Hadoop+Java+Versions. We shouldn't drop JDK 8 support until we

Re: Non JSON-query API clients

2020-11-13 Thread Gian Merlino
I'm not aware of plans to build out official clients for those other APIs; when I've written Python programs to integrate with them I've usually called them through HTTP directly. I'm not familiar with OpenAPI, but looking at it briefly, it seems like an interesting concept and a potential way to

Re: Forbidding forced git push

2021-01-15 Thread Gian Merlino
Will this help for the (common) case where PR branches are in people's forks? On Fri, Jan 15, 2021 at 1:00 PM Jihoon Son wrote: > Hi all, > > The forced git push is usually used to make the commit history clean, which > I understand its importance. However, one of its downsides is, because it >

Re: Deprecate support for ZooKeeper 3.4.x

2021-01-19 Thread Gian Merlino
About time, I suppose. I replied to the issue on GitHub. I think the trickiest part is figuring out what migration will look like for users so we can write up some useful release notes. On Tue, Jan 19, 2021 at 5:43 PM Xavier Léauté wrote: > Hi everyone, I wrote up a short issue on deprecating su

Re: Spark-Druid Connectors

2021-02-23 Thread Gian Merlino
Hey Julian, Your pessimism in this matter is understandable but regrettable! It would be great to see this effort become part of mainline Druid. It is a more maintainable approach than a separate repo, because it gets rid of the risk of interface drift, and it makes sure that all the tests are ru

Re: L1 (caffeine) cache hits/misses metrics not emitted

2021-02-24 Thread Gian Merlino
Hey Vadim, According to https://druid.apache.org/docs/latest/operations/metrics.html#cache, today, the number of hits and misses for the hybrid cache are both emitted, but there isn't differentiation between L1 hits and L2 hits. Is that what you mean? If so, I think the main issue is there just i

Re: Spark-Druid Connectors

2021-03-02 Thread Gian Merlino
Thank you! On Thu, Feb 25, 2021 at 12:03 AM Julian Jaffe wrote: > Hey Gian, > > I’d be overjoyed to be proven wrong! For what it’s worth, my pessimism was > not driven by a lack of faith in the Druid community or the Druid > committers but by the fact that these connectors may be an awkward fit

Re: Contribute a new Community extensions : Launch Peon Pods Based on K8s

2021-03-02 Thread Gian Merlino
Hey Yue, Very interesting idea. I am not a Kubernetes expert, but this seems like a neat concept. I guess the idea is only one MM would be needed? (Or maybe a handful, if one can't manage every pod?) If so, great. Hopefully someone that is more of a Kubernetes expert will be able to chime in on th

Re: SpringBoot +MyBatis +Apache Druid

2021-03-10 Thread Gian Merlino
Hey Shamriya, It would help to know some more about what kind of integration you're trying to do, and which kind of driver class isn't being recognized. On Wed, Mar 10, 2021 at 11:36 AM nandalapadu shamriyashaik < nshamr...@gmail.com> wrote: > Hi, > > I am new to Druid and struggling to integrat

Re: Subject: [CVE-2021-26919] Authenticated users can execute arbitrary code from malicious MySQL database systems

2021-04-01 Thread Gian Merlino
I wanted to add a few more details about this advisory, in the hopes that it will be helpful to people that are upgrading. Here's a link to the relevant docs about the new properties: https://druid.apache.org/docs/latest/configuration/index.html#ingestion-security-configuration And the most secur

Re: Adding support to Kafka events keys

2021-04-21 Thread Gian Merlino
Hey Noam, I think this would certainly be useful, and thank you for your interest in contributing! I think the toughest part will be designing a good API (meaning: what would users specify in the Kafka supervisor JSON spec in order to activate and configure this feature?). So a good way to procee

Re: Push-down of operations for SystemSchema tables

2021-05-19 Thread Gian Merlino
Hey Jason, It sounds like we have two different, but related goals: 1) Your goal is to improve the performance of system tables. 2) My goal with the branch Clint linked is to enable using Druid's native query engine for system tables, in order to achieve consistency in how SQL queries are execut

Re: Push-down of operations for SystemSchema tables

2021-05-19 Thread Gian Merlino
Hey Frank, These notes are really interesting. Thanks for writing them down. I agree that the three things you laid out are all important. With regard to SQL clauses from the web console, I did notice one recent change went in that changed the SQL clauses to only query sys.segments for columns th

Re: FlattenSpec for Nested Data With Unknown Array Length

2021-05-20 Thread Gian Merlino
Hey Evan, Druid's data model doesn't currently have a good way of storing arrays of objects like this. And you're right that even though joins exist, to get peak performance you want to avoid them at query time. In similar situations I have stored data models like this as 3 tables (entries, comme
