Re: [VOTE] Release Apache Druid 29.0.0 [RC1]

2024-02-16 Thread Gian Merlino
re people to change their arrayIngestMode. Gian On 2024/02/16 22:24:23 Gian Merlino wrote: > I just learned that arrayIngestMode is not actually new, just > https://github.com/apache/druid/pull/15588 is. However this will still make > it more likely that people accidentally break their tab

Re: [VOTE] Release Apache Druid 29.0.0 [RC1]

2024-02-16 Thread Gian Merlino
ait for 30, given the impact that can happen if people end up with mixed types without planning for it. On Fri, Feb 16, 2024 at 2:16 PM Gian Merlino wrote: > Thanks for managing this release! > > My vote is -0, let me explain why. I am concerned about usability issues > with the new a

Re: [VOTE] Release Apache Druid 29.0.0 [RC1]

2024-02-16 Thread Gian Merlino
Thanks for managing this release! My vote is -0, let me explain why. I am concerned about usability issues with the new arrayIngestMode feature. There are various issues when mixing MVD strings and string arrays in the same column: as soon as arrays show up in a column, various "classic

Re: on removing 'auto' strategy from native search query

2023-11-20 Thread Gian Merlino
We don't have usage data, but my sense is that the search query is not commonly used, and among people that use the search query, it's not common to rely on "druid.query.search.searchStrategy: auto". So I think it would be ok to remove the feature and have "auto" be an alias for "useIndexes",
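
The aliasing idea in this message could look roughly like the sketch below. It is only an illustration and not Druid's code: the function name and the lookup tables are hypothetical, and "cursorOnly" is assumed to be the other strategy name, which the message itself does not state.

```python
# Hypothetical sketch of treating "auto" as an alias for "useIndexes".
# None of these names come from the Druid codebase.
KNOWN_STRATEGIES = {"useIndexes", "cursorOnly"}  # "cursorOnly" is an assumption
ALIASES = {"auto": "useIndexes"}


def resolve_search_strategy(configured: str) -> str:
    """Map a configured strategy name to the strategy that would actually run."""
    resolved = ALIASES.get(configured, configured)
    if resolved not in KNOWN_STRATEGIES:
        raise ValueError(f"Unknown search strategy: {configured}")
    return resolved
```

With this shape, existing configs that still say "auto" keep working while the separate code path behind it goes away.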

Druid Summit 2023 — call for speakers!

2023-09-11 Thread Gian Merlino
Hey Druids, I am excited to write to you about this year's Druid Summit ( https://druidsummit.org/), an event being held virtually on December 5–6, 2023. The call for speakers is open here: https://docs.google.com/forms/d/e/1FAIpQLSfoBZNh_IpSCT59fsYdTSSK92hYa7Rxf_7Fu0yBRCbK8ZwJdg/viewform A

Re: CVEs in contrib extensions

2023-09-05 Thread Gian Merlino
I think it would be OK to have a policy that contrib extension dependencies are not proactively screened for CVEs. If we adopt such a policy, we do need to make it clear to people that they should do their own screening of any contrib extensions they use. However, we can't extend that policy to

Re: New Committer : Soumyava Das

2023-08-23 Thread Gian Merlino
Congratulations!! On Mon, Aug 21, 2023 at 9:13 AM Karan Kumar wrote: > Hello everyone, > > The Project Management Committee (PMC) for Apache Druid has invited > Soumyava to become a committer and we are pleased to announce that > Soumyava has accepted. > > Soumyava has been a consistent

Re: New Committer : Adarsh Sanjeev

2023-08-23 Thread Gian Merlino
Congratulations!! On Mon, Aug 21, 2023 at 8:14 AM Karan Kumar wrote: > Hello everyone, > > The Project Management Committee (PMC) for Apache Druid has invited > Adarsh to become a committer and we are pleased to announce that > Adarsh has accepted. > > Adarsh has been a consistent contributor

Re: [DISCUSS] Druid 28 dropping support for Hadoop 2

2023-07-19 Thread Gian Merlino
already, and the next release (28) is meant to not have it. Does anyone have some spare cycles to do (2)? Gian On 2023/06/28 06:42:08 Gian Merlino wrote: > I'd like to propose dropping support for Hadoop 2 in Druid 28. Not the very > next release (which I assume will be Druid 27) but the one

Re: About maintaining the Helm's Chart of Apache Druid

2023-07-17 Thread Gian Merlino
remove the > code. > > On Wed, Mar 1, 2023 at 7:14 AM Gian Merlino wrote: > > > Not as far as I _know_, I mean. > > > > On 2023/03/01 01:43:43 Gian Merlino wrote: > > > Not as far as I do. I think we're stuck since nobody has volunteered to > > do one of the two

Re: group-by v1

2023-07-17 Thread Gian Merlino
+1 to removing it. The only benefit I am aware of is the same one that you mentioned. But I don't think this needs to block removing the old v1 algo. On Wed, Jul 12, 2023 at 4:07 AM Clint Wylie wrote: > Is anyone opposed to removing group-by v1? I think it would allow us > to simplify quite a

Re: request to join dev group

2023-07-06 Thread Gian Merlino
Hi Tanya, Welcome! You can subscribe by sending an email to dev-subscr...@druid.apache.org. Gian On 2023/07/04 06:41:02 Tanya Mary wrote: > request to join dev group > - To unsubscribe, e-mail:

Re: [DISCUSS] Druid 28 dropping support for Hadoop 2

2023-06-29 Thread Gian Merlino
ran Kumar > wrote: > > > In favour of dropping hadoop 2 support. Another point is the lack of > > security and vulnerability fixes in hadoop2. > > > > > > > > On Wed, Jun 28, 2023 at 12:17 PM Clint Wylie wrote: > > > > > obvious

[DISCUSS] Druid 28 dropping support for Hadoop 2

2023-06-28 Thread Gian Merlino
I'd like to propose dropping support for Hadoop 2 in Druid 28. Not the very next release (which I assume will be Druid 27) but the one after that, likely late 2023 timeframe. In 2021, we had a discussion about moving away from Hadoop 2:

Re: Requirements for relaxing restrictions on github actions usage

2023-06-02 Thread Gian Merlino
+1, allowing CI to run without an explicit button push by committers will help encourage new contributors. The requirements seem OK. I looked through our repo and I don't see any external actions (they are all in "github" or "actions"). We do have ".github/workflows/labeler.yml" that fires on

Roadmap event: call for speakers

2023-05-30 Thread Gian Merlino
Hi Druids, We are looking to put on a virtual event called "Druid.NEXT" in June highlighting things that people in the community are working on. This is a call for speakers for that event! Date is TBD, but likely late June. The event will be on the shorter side, about meetup-length (an hour or

Re: Error message: "Error: Resource limit exceeded

2023-05-15 Thread Gian Merlino
Hi Alaka, There's a bit of text cut off in the error message. The full one is something like: "Time ordering is not supported for a Scan query with %,d segments per time chunk and a row limit of %,d. " + "Try reducing your query limit below maxRowsQueuedForOrdering
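
A rough sketch of the kind of check that produces this error, assuming it compares the query's row limit and its segments per time chunk against two configured thresholds. The function and parameter names are hypothetical, and the second threshold (a cap on segments per time chunk) is an assumption that does not appear in the quoted message.

```python
def scan_time_ordering_supported(
    segments_per_time_chunk: int,
    row_limit: int,
    max_rows_queued_for_ordering: int,
    max_segments_ordered_in_memory: int,  # assumed second threshold
) -> bool:
    """Return True if time ordering is allowed, else raise with the message."""
    # A small enough row limit can be ordered in a bounded queue.
    if row_limit <= max_rows_queued_for_ordering:
        return True
    # Few enough segments per time chunk can be merged directly (assumption).
    if segments_per_time_chunk <= max_segments_ordered_in_memory:
        return True
    raise ValueError(
        "Time ordering is not supported for a Scan query with "
        f"{segments_per_time_chunk:,d} segments per time chunk and a row "
        f"limit of {row_limit:,d}. "
        "Try reducing your query limit below maxRowsQueuedForOrdering"
    )
```

Under these assumptions, lowering the query limit is the simplest way to get back onto the supported path, which matches the advice in the message.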

Re: Question regarding new development

2023-03-28 Thread Gian Merlino
Looks like the conversation is now in https://github.com/apache/druid/issues/13948. On Sat, Mar 18, 2023 at 8:00 AM Sergiu Ungureanu wrote: > Hi Team, > > Yesterday I raised a question in #dev channel in slack > > https://apachedruidworkspace.slack.com/archives/C030CMF6B70/p1679085073683509 > >

CI requiring approval for external contributors

2023-03-28 Thread Gian Merlino
Recently, ASF GitHub repos had their defaults for GitHub Actions changed to "always require approval for external contributors". In Slack, Karan pointed out that Airflow has recently submitted a ticket to have that changed back: https://issues.apache.org/jira/browse/INFRA-24200. IMO, we should do

Re: About maintaining the Helm's Chart of Apache Druid

2023-02-28 Thread Gian Merlino
Not as far as I do. I think we're stuck since nobody has volunteered to do one of the two necessary things: 1) shepherd this code through the IP clearance process, or 2) analyze its provenance enough to determine that IP clearance isn't necessary. If anyone is willing to do one of the above it would be

Re: [Discuss] S3 buckets or IT tests

2023-02-22 Thread Gian Merlino
I think the ticket you're referring to is https://issues.apache.org/jira/browse/INFRA-23952. It would definitely be valuable to run S3 integration tests as part of the automated test suite in GitHub Actions. If Infra is willing to provide a bucket for this purpose then we would certainly be

Re: moving druid-core, extendedset, druid-hll into druid-processing

2023-02-06 Thread Gian Merlino
I support this. I don't feel like the separation between core and processing is buying us very much. On Mon, Jan 23, 2023 at 5:12 PM Clint Wylie wrote: > Hi all, > > I want to discuss moving druid-core, extendedset, and druid-hll into > druid-processing to simplify our code structure and

Re: [DISCUSS] Release 24.0.1

2022-10-18 Thread Gian Merlino
Thank you for volunteering! On Mon, Oct 17, 2022 at 7:00 AM Kashif Faraz wrote: > Hi Abhishek > > If you haven't started with the release process already, I would like to > volunteer to perform this release so that we can expedite it. > Please let me know if that works for you. > > Regards >

Druid Summit on the road

2022-09-06 Thread Gian Merlino
Hey Druids, I am excited to write to you about upcoming events in this year's edition of Druid Summit, which is being conducted as a series of more local in-person events. I hope it gives you a chance to meet people near you in the Druid community. Attendance is free of charge. I personally will

Re: Intermediate segment persistence

2022-09-06 Thread Gian Merlino
Hey Pramod, If it's a minor change I recommend raising a PR. Generally raising an issue first is a good idea for bigger changes, where it is helpful to have some discussion prior to the code showing up. But for smaller changes, we can go directly to the code. You can post the PR here too, or in

Re: [E] [DISCUSS] Hadoop 3, dropping support for Hadoop 2.x for 24.0

2022-08-08 Thread Gian Merlino
It's always good to deprecate things for some time prior to removing them, so we don't need to (nor should we) remove Hadoop 2 support right now. My vote is that in this upcoming release, we should deprecate it. The main problem in my eyes is the one Abhishek brought up: the dependency management

Re: Next Druid release version scheme

2022-07-06 Thread Gian Merlino
API changes, look no > further than Guava.) > > Julian > > > On Jul 6, 2022, at 1:53 AM, Gian Merlino wrote: > > My proposal for the next release is that we merely drop the leading "0." > and don't change anything else about our dev process. We'd start the next >

Re: Next Druid release version scheme

2022-07-06 Thread Gian Merlino
releases? > > Can I do a rolling upgrade of druid to the next version? > > > > The more things that are versioned the better, but (2) and (4) have been > > the things that have been most important to me in the past. > > > > Anyone in the community have any thou

Re: [DISCUSS] Removing code related to `FireHose`

2022-07-06 Thread Gian Merlino
I am in favor of immediately removing FiniteFirehoseFactory and marking EventReceiverFirehoseFactory deprecated. Then, later on we can remove InputRowParser and EventReceiverFirehoseFactory. On Fri, Jun 24, 2022 at 4:41 AM Abhishek Agarwal wrote: > I didn’t include them (RealtimeIndexTask and >

Re: Vulnerability Report [Misconfigured DMARC Record Flag]

2022-06-21 Thread Gian Merlino
Hey Zeus, You should have received a response to this report from the Apache Security Team (secur...@apache.org). In the future, please note that security reports should be sent to secur...@apache.org, not the dev list. On Tue, Jun 21, 2022 at 1:04 PM Cyber Zeus wrote: > Hi team > kindly

New PMC member: Abhishek Agarwal

2022-06-07 Thread Gian Merlino
Hey Druids, The Druid PMC has invited Abhishek Agarwal (asf id abhishek, github id abhishekagarwal87) to become a PMC member, and we are pleased to announce that he has accepted. Abhishek has authored dozens of commits, participated in nearly 200 code reviews, and is release manager for the

Re: EJB interceptor binding API is not available

2022-06-04 Thread Gian Merlino
Hi Maithri, I haven't encountered something like this before so I'm not sure what's causing it. Is it reproducible? If you could provide some steps for someone else to see the same thing you're seeing — maybe it relies on a particular Java version, or particular Druid version, or something — then

Re: Next Druid release version scheme

2022-05-27 Thread Gian Merlino
Yeah, I'd say the next one after 24.0 would be 25.0. The idea is really just to remove the leading zero and thereby communicate the accurate state of the project: it has been stable and production-ready for a long time. Some people see the leading zero and interpret that as a sign of an immature

Re: [DISCUSS] Druid 0.23 release

2022-05-26 Thread Gian Merlino
I'm supportive of changing the versioning to something without the leading zero in the next release where this is practical. If it's the one after 0.23.0, then I would go with 24.0. IMO, going with 1.0 would send a message that this is the first mature release. But that isn't the case: we have
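
As a small worked example of the scheme discussed here (drop the leading "0." and otherwise keep counting), the helper below computes the next release number. The function name is hypothetical, and resetting the trailing components to zero is an assumption made for the illustration.

```python
def next_release(version: str) -> str:
    """Next feature release under the 'drop the leading zero' scheme."""
    parts = [int(p) for p in version.split(".")]
    if parts[0] == 0:
        # Old scheme: 0.<n>.<patch>; the old minor number becomes the major.
        major = parts[1] + 1
    else:
        major = parts[0] + 1
    return f"{major}.0.0"
```

So 0.23.0 would be followed by 24.0.0, and 24.0.0 by 25.0.0, in line with the numbers mentioned in the thread.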

Re: Limitations of automated unused segment kill logic (Issue #10876 and PR #10877)

2022-05-05 Thread Gian Merlino
I just took a look, and it looks like a few other people did too. Sorry it took so long! I do think that "review for a review" is a good way to go! Thanks for volunteering. On Mon, May 2, 2022 at 12:12 PM Lucas Capistrant wrote: > Hi all, > > I'm writing in regards to my enhancement

Re: [GitHub] [druid] cryptoe commented on a diff in pull request #12339: Make AWS WebIdentityToken actually working and usable from inside EKS.

2022-04-04 Thread Gian Merlino
I thought these emails were supposed to go to comm...@druid.apache.org? I do see a bunch on that list from today, so maybe this was a weird gitbox snafu. On Sun, Apr 3, 2022 at 10:53 PM GitBox wrote: > > cryptoe commented on code in PR #12339: > URL:

Re: 0.23

2022-03-24 Thread Gian Merlino
I agree it's a good time to do a release. Most of the release-manager steps involve having commit privileges, but nevertheless, you might find it interesting to read about the process: https://github.com/apache/druid/blob/master/distribution/asf-release-process-guide.md You've actually already

Multi-stage queries

2022-02-25 Thread Gian Merlino
Hey Druids, I recently posted a proposal on GitHub about adding multi-stage distributed queries to Druid: https://github.com/apache/druid/issues/12262 I think it'll be a powerful advancement in what Druid is capable of, and I'm interested in what people think. It's also going to be a lot of work

Re: Apache Druid Slack

2022-01-21 Thread Gian Merlino
It sounds like a good idea to me. It's not ideal that the current Slack workspace is hard for new people to join. On Thu, Jan 20, 2022 at 10:15 AM Vadim Ogievetsky wrote: > I think that the PMC should create a new Slack channel for Apache Druid and > shift the community towards using it away

Re: [E] [DISCUSS] Patch to fix new vulnerabilities in log4j

2021-12-20 Thread Gian Merlino
I think doing a 0.22.2 would be worth it for users' peace of mind, even if Druid isn't vulnerable by default. Just because people are on edge about log4j-related stuff right now. In case other people agree, I created an 0.22.2 branch just now. Is anyone able to release-manage this one? Btw, John

Re: Apache Druid security advisory: critical vulnerability CVE-2021-44228 in Apache Log4j

2021-12-13 Thread Gian Merlino
To clarify about the mitigations: the "-Dlog4j2.formatMsgNoLookups=true" mitigation that has been floating around the Internet is *not effective* for log4j 2.8.2, which was used by Druid 0.22.0 and other recent versions. If you are going to stay on an older version of Druid, do not use this

Re: Need Help Benchmarking Druid

2021-12-11 Thread Gian Merlino
Hey Abdel, Feel free to DM me on ASF Slack. The info to join is here: https://druid.apache.org/community/ On Fri, Dec 3, 2021 at 9:11 AM Abdelouahab Khelifati wrote: > Hello, > > I am Abdel, a researcher of Computer Science and I am working on a > benchmarking paper on time series database

Re: [RESULT][VOTE] Release Apache Druid 0.22.1 [RC2]

2021-12-11 Thread Gian Merlino
Thank you for running this release! On Sat, Dec 11, 2021 at 12:28 AM Jihoon Son wrote: > Thanks to everyone who participated in the vote! The vote has passed > with 3 binding +1s. > > Gian Merlino: +1 (binding) > Clint Wylie: +1 (binding) > Jonatha

Re: [VOTE] Release Apache Druid 0.22.1 [RC2]

2021-12-10 Thread Gian Merlino
+1 on releasing 0.22.1-rc2 I verified: - hashes / gpg - unit tests - compared the src and bin packages against 0.22.0 to make sure there were no unexpected changes - attempted to trigger the jndi lookup functionality; it triggered on 0.22.0 but not 0.22.1-rc2 - verified that task logs look

Re: Log4j vulnerability - hotfix?

2021-12-10 Thread Gian Merlino
lenging than for > projects on the slightly newer versions of log4j2, perhaps it would be > appropriate to put out one or two more patch releases, against 0.21 > and/or 0.20? I know our installation is still on 0.21, which is less > than 2 months old. > > On Fri, Dec 10, 2021 a

Re: [VOTE] Release Apache Druid 0.22.1 [RC1]

2021-12-10 Thread Gian Merlino
My vote is 0 on this release. I verified the usual things, and compared the src and bin packages against 0.22.0 to make sure there were no unexpected changes. That all looks OK to me. But there is an issue with weird errors at the end of logfiles for processes that exit normally. It's especially

Re: Log4j vulnerability - hotfix?

2021-12-10 Thread Gian Merlino
We're working on this right now and will be getting a vote / release for 0.22.1 out asap. Btw, the log4j announcement mentions a mitigation that does work for our current version (2.8.2). It's part (b) here, specifying "%m{nolookups}" in the PatternLayout configuration:

Re: Need help in understanding real-time ingestion task pause behavior during checkpointing

2021-12-02 Thread Gian Merlino
Harini, those are interesting findings. I'm not sure if the two pauses are necessary, but my thought is that it ideally shouldn't matter because the supervisor shouldn't be taking that long to handle its notices. A couple things come to mind about that: 1) Did you see what specifically the

Re: Push-down of operations for SystemSchema tables

2021-11-29 Thread Gian Merlino
on the same pathway with ordered scan > query, so I could rebase on top of that and break into a smaller set of > PRs, nonetheless the conceptual approach and direction is something that I > think will work. > > Thanks! > Jason > > > > > > > On Wed, May 19, 2021

Re: Druid-specific Calcite keywords

2021-11-05 Thread Gian Merlino
t; a location that expects an identifier (e.g. after FROM), BERNOULLI > will be converted into an identifier. Thus you can use BERNOULLI as a > table name. > > Julian > > On Thu, Nov 4, 2021 at 2:18 PM Gian Merlino wrote: > > > > Hey Druids, > > > > I'm looking into ho

Druid-specific Calcite keywords

2021-11-04 Thread Gian Merlino
Hey Druids, I'm looking into how to add keywords to Druid's SQL dialect, and I wanted to ask if anyone has enough familiarity with Calcite to point at some info about how to do that without needing to modify Calcite itself?

Druid Summit 2021

2021-09-28 Thread Gian Merlino
Hey Druids, I am excited to write to you about Druid Summit (https://druidsummit.org/), an event being held virtually on November 9–10, 2021. The entire Apache Druid community is welcome, and registration is free. It would also be great to see a bunch of people from the community giving talks

Re: [Proposal] - Kafka Input Format for headers, key and payload parsing

2021-09-21 Thread Gian Merlino
ill know > how to use this feature. And it'll help us better understand how it's > supposed to work. (Perhaps it could have answered the two questions above) > > >>> Absolutely agree with you, I will do that along with other review > comments from the code. > > Thanks aga

Re: [Proposal] - Kafka Input Format for headers, key and payload parsing

2021-09-16 Thread Gian Merlino
to get all your replies On Tue, Sep 14, 2021 at 10:10 PM Gian Merlino wrote: > Hey Lokesh, > > The concept and API looks solid to me! Thank you for writing this up. I > agree with Ben's comment. This will be really useful functionality. > > I have a few questions about how it

Re: compression strategy concurrency

2021-09-14 Thread Gian Merlino
Hey Rahul, What kind of errors are you seeing? I ran the test a few times with a bumped up number of threads, and I did see a few problems but they were in the Closer. It looks like a single Closer is used for every thread, which is bad because Closers are not thread-safe (they are built around

Re: [Proposal] - Kafka Input Format for headers, key and payload parsing

2021-09-14 Thread Gian Merlino
Hey Lokesh, The concept and API looks solid to me! Thank you for writing this up. I agree with Ben's comment. This will be really useful functionality. I have a few questions about how it would work: 1) How is the timestamp exposed exactly? I see there is a recordTimestampLabelPrefix, but what

Re: Get Druid Service details in runtime (via extension)

2021-08-23 Thread Gian Merlino
ating org.apache.druid.discovery.DiscoveryDruidNode > for the 3rd parameter of > com.custom.MyEmitterModule.getEmitter(MyEmitterModule.java:39) > > According to the error, it looks like I cannot add DiscoveryDruidNode > because it does not have @Inject or a zero-argument constructor. But I'm > ab

Re: Get Druid Service details in runtime (via extension)

2021-08-22 Thread Gian Merlino
Does the "getNodeRole()" method on DiscoveryDruidNode do what you want? On Fri, Aug 20, 2021 at 3:07 PM Jeet Patel wrote: > Hi all, > > Is there a way to know what druid services are running in a DruidNode > (Not > talking about the HTTP APIs)? > I went through druid-server module, class >

Re: Apache Druid Project Structure

2021-08-18 Thread Gian Merlino
rs who are looking to contribute to the > project and make them feel more confident knowing the project layout. > > Thank you, > Jeet > > On 2021/08/17 17:12:33, Gian Merlino wrote: > > Hey Jeet, > > > > I think it is a case of "it seemed like a good idea at

Re: Apache Druid Project Structure

2021-08-17 Thread Gian Merlino
Hey Jeet, I think it is a case of "it seemed like a good idea at the time". Some things about the current layout do work well: one is that there is actually a lot of common query engine code between anything that handles queries. That's historical, broker, peon, and indexer. That common query

Re: Question about merging groupby v2 spill files

2021-08-10 Thread Gian Merlino
Hey Will, The sorting that happens on the data servers is really useful, because it means the Broker can do its part of the query fully streaming instead of buffering things up. At one point we had a similar problem in ingestion (you could have a ton of spill files if you had a lot of sketches)
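
To illustrate why pre-sorted partial results matter, here is a minimal sketch of a streaming merge. It is not Druid's merge code: the row shape `(key, value)` and the function name are made up for the example. Because every input is already sorted by key, the merger only ever holds one row per input plus the current group.

```python
import heapq
from itertools import groupby
from operator import itemgetter


def merge_sorted_partials(*partials):
    """Lazily merge key-sorted (key, value) streams, summing values per key."""
    # heapq.merge keeps only the head of each input in memory at a time.
    merged = heapq.merge(*partials, key=itemgetter(0))
    # Equal keys are adjacent after the merge, so groupby can combine them.
    for key, rows in groupby(merged, key=itemgetter(0)):
        yield key, sum(value for _, value in rows)
```

If the inputs were unsorted, the merger would instead have to buffer every row before it could emit the first combined result.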

Re: Interested in contributing an article to your site

2021-07-30 Thread Gian Merlino
Hi Angela, There are a couple of places on the Druid website where we include content from the community. 1) If Sisu Data uses Druid internally, or produces Druid-based products, it would be appropriate to describe Sisu's usage of Druid on our Powered By page:

Re: ItemsSketch Aggregator in druid-datasketches extension

2021-07-23 Thread Gian Merlino
of using too much heap memory. The only advantage (2) has is that you don't need a Direct version of the ItemsSketch for it to work. On Fri, Jul 23, 2021 at 1:35 PM Gian Merlino wrote: > Hey Michael, > > Very cool! > > To answer your question: it is critical to have a BufferAggregator.

Re: ItemsSketch Aggregator in druid-datasketches extension

2021-07-23 Thread Gian Merlino
Hey Michael, Very cool! To answer your question: it is critical to have a BufferAggregator. Some context; there are 3 kinds of aggregators: - Aggregator: stores intermediate state on heap; is used during ingestion and by the non-vectorized timeseries query engine. Required, or else some queries
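
The difference between an on-heap aggregator and a buffer-based one can be sketched as below. This is a simplified Python analogy rather than Druid's Java interfaces: the class names, the method shapes, and passing the input value directly to `aggregate` are all assumptions made for illustration.

```python
import struct


class HeapLongSum:
    """Keeps intermediate state in an ordinary object field (on heap)."""

    def __init__(self):
        self.total = 0

    def aggregate(self, value):
        self.total += value

    def get(self):
        return self.total


class BufferLongSum:
    """Keeps intermediate state in a caller-provided buffer at a position."""

    SIZE = 8  # bytes used per slot (one signed 64-bit integer)

    def init(self, buf, position):
        struct.pack_into("<q", buf, position, 0)

    def aggregate(self, buf, position, value):
        current = struct.unpack_from("<q", buf, position)[0]
        struct.pack_into("<q", buf, position, current + value)

    def get(self, buf, position):
        return struct.unpack_from("<q", buf, position)[0]
```

The buffer variant holds no state itself, so one instance can serve many slots in a single preallocated buffer, which is the property that lets an engine control memory use.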

Re: druid can't parse string

2021-07-16 Thread Gian Merlino
Including the original poster in case they are not on the dev list themselves (hello!). On Fri, Jul 16, 2021 at 9:44 AM Gian Merlino wrote: > Druid stores strings as UTF-8 and from a storage and query basis, it > should work fine with any language. The > "wikiticker-2015-09-12-s

Re: druid can't parse string

2021-07-16 Thread Gian Merlino
Druid stores strings as UTF-8 and from a storage and query basis, it should work fine with any language. The "wikiticker-2015-09-12-sampled.json.gz" dataset used for the tutorial has strings in a variety of languages (check the "page" field):

Re: A question about a potential bug in Druid Joins

2021-06-24 Thread Gian Merlino
_id = DIM.api_client_id > > So the “api_client_id” field is `long` type in both > “inline_data” and “inline_dimension_api_clients_1” datasources. However, > when doing a join, the makeLongProcessor method will be called, and > throw an “UnsupportedOperationException" because

Re: Enabling dependabot in our github repository

2021-06-08 Thread Gian Merlino
Here's a running list of PRs opened by the dependabot: https://github.com/apache/druid/pulls?q=is%3Apr+author%3Aapp%2Fdependabot On Mon, Jun 7, 2021 at 12:22 PM Gian Merlino wrote: > There's been some extra discussion on this PR: > https://github.com/apache/druid/pull/11079 > >

Re: Enabling dependabot in our github repository

2021-06-07 Thread Gian Merlino
There's been some extra discussion on this PR: https://github.com/apache/druid/pull/11079 I just +1'ed it, but I wanted to come back here to say that IMO, we should avoid getting in the habit of blindly applying these updates without testing. There have been lots of situations in the past where a

Re: FlattenSpec for Nested Data With Unknown Array Length

2021-05-20 Thread Gian Merlino
Hey Evan, Druid's data model doesn't currently have a good way of storing arrays of objects like this. And you're right that even though joins exist, to get peak performance you want to avoid them at query time. In similar situations I have stored data models like this as 3 tables (entries,

Re: Push-down of operations for SystemSchema tables

2021-05-19 Thread Gian Merlino
Hey Frank, These notes are really interesting. Thanks for writing them down. I agree that the three things you laid out are all important. With regard to SQL clauses from the web console, I did notice one recent change went in that changed the SQL clauses to only query sys.segments for columns

Re: Push-down of operations for SystemSchema tables

2021-05-19 Thread Gian Merlino
Hey Jason, It sounds like we have two different, but related goals: 1) Your goal is to improve the performance of system tables. 2) My goal with the branch Clint linked is to enable using Druid's native query engine for system tables, in order to achieve consistency in how SQL queries are

Re: Adding support to Kafka events keys

2021-04-21 Thread Gian Merlino
Hey Noam, I think this would certainly be useful, and thank you for your interest in contributing! I think the toughest part will be designing a good API (meaning: what would users specify in the kafka supervisor json spec in order to activate and configure this feature?). So a good way to

Re: Subject: [CVE-2021-26919] Authenticated users can execute arbitrary code from malicious MySQL database systems

2021-04-01 Thread Gian Merlino
I wanted to add a few more details about this advisory, in the hopes that it will be helpful to people that are upgrading. Here's a link to the relevant docs about the new properties: https://druid.apache.org/docs/latest/configuration/index.html#ingestion-security-configuration And the most

Re: SpringBoot +MyBatis +Apache Druid

2021-03-10 Thread Gian Merlino
Hey Shamriya, It would help to know some more about what kind of integration you're trying to do, and which kind of driver class isn't being recognized. On Wed, Mar 10, 2021 at 11:36 AM nandalapadu shamriyashaik < nshamr...@gmail.com> wrote: > Hi, > > I am new to Druid and struggling to

Re: Contribute a new Community extensions : Launch Peon Pods Based on K8s

2021-03-02 Thread Gian Merlino
Hey Yue, Very interesting idea. I am not a kubernetes expert, but this seems like a neat concept. I guess the idea is only one MM would be needed? (Or maybe a handful, if one can't manage every pod?) If so, great. Hopefully someone that is more of a kubernetes expert will be able to chime in on

Re: Spark-Druid Connectors

2021-03-02 Thread Gian Merlino
Thank you! On Thu, Feb 25, 2021 at 12:03 AM Julian Jaffe wrote: > Hey Gian, > > I’d be overjoyed to be proven wrong! For what it’s worth, my pessimism was > not driven by a lack of faith in the Druid community or the Druid > committers but by the fact that these connectors may be an awkward fit

Re: L1 (caffeine) cache hits/misses metrics not emitted

2021-02-24 Thread Gian Merlino
Hey Vadim, According to https://druid.apache.org/docs/latest/operations/metrics.html#cache, today, the number of hits and misses for the hybrid cache are both emitted, but there isn't differentiation between L1 hits and L2 hits. Is that what you mean? If so, I think the main issue is there just

Re: Spark-Druid Connectors

2021-02-23 Thread Gian Merlino
Hey Julian, Your pessimism in this matter is understandable but regrettable! It would be great to see this effort become part of mainline Druid. It is a more maintainable approach than a separate repo, because it gets rid of the risk of interface drift, and it makes sure that all the tests are

Re: Deprecate support for ZooKeeper 3.4.x

2021-01-19 Thread Gian Merlino
About time, I suppose. I replied to the issue on GitHub. I think the trickiest part is figuring out what migration will look like for users so we can write up some useful release notes. On Tue, Jan 19, 2021 at 5:43 PM Xavier Léauté wrote: > Hi everyone, I wrote up a short issue on deprecating

Re: Forbidding forced git push

2021-01-15 Thread Gian Merlino
Will this help for the (common) case where PR branches are in people's forks? On Fri, Jan 15, 2021 at 1:00 PM Jihoon Son wrote: > Hi all, > > The forced git push is usually used to make the commit history clean, which > I understand its importance. However, one of its downsides is, because it >

Re: Non JSON-query API clients

2020-11-13 Thread Gian Merlino
I'm not aware of plans to build out official clients for those other APIs; when I've written python programs to integrate with them I've usually called them through http directly. I'm not familiar with OpenAPI, but looking at it briefly, it seems like an interesting concept and a potential way to

Re: [E] Re: Removing Druid support for JDK 8 and adding support for JDK 11

2020-11-13 Thread Gian Merlino
Seconding (thirding?) the idea that keeping JDK 8 for integration with Hadoop is important. Druid's Hadoop integration is built against Hadoop 2.x and that version only supports JDK 8: https://cwiki.apache.org/confluence/display/HADOOP/Hadoop+Java+Versions. We shouldn't drop JDK 8 support until we

Code reviews, UX, and tests

2020-10-15 Thread Gian Merlino
Hey Druids, I am writing to you all to ask for your help. In particular, your help in ensuring that potential code contributions are reviewed in a timely fashion. Right now we have 72 open PRs, which due to stalebot are mostly opened pretty recently. That's a lot of people that want to

Re: Help in Configuring data retention

2020-09-21 Thread Gian Merlino
Hey Satish, Are you asking if Druid can write a log of load/drop rule changes to a Kafka topic? If so, no, it cannot. But it does write them to the metadata store, and perhaps you could use a tool to copy them from the metadata store into Kafka. On Mon, Sep 21, 2020 at 6:46 AM Satish Embadi <

New committer: Atul Mohan

2020-09-02 Thread Gian Merlino
Hey Druids, The Druid PMC has invited Atul Mohan (@a2l007 on github) to become a committer and we are pleased to announce that he has accepted. Atul has been actively working on various parts of Druid, including indexing from SQL sources and result-level caching.

Re: [CRON] Broken: apache/druid#28120 (master - c72f96a)

2020-08-19 Thread Gian Merlino
There's a lot of these with messages like: > [ERROR] Failed to execute goal org.owasp:dependency-check-maven:5.3.2:check (default-cli) on project druid: Fatal exception(s) analyzing Druid: One or more exceptions occurred during analysis: > [ERROR] Unable to connect to the dependency-check

Re: SQL Support for Tuple Sketches

2020-08-12 Thread Gian Merlino
Hey Mithal, I'm not aware of anyone currently working on it, so you certainly are welcome to! On Mon, Aug 10, 2020 at 11:56 AM Mithal Kothari wrote: > Hi Druid Dev team, > > I just wanted to follow up with you'll and find out if there is a > plan/possibility to introduce sql support for tuple

Re: Study On Rejected Refactorings

2020-08-12 Thread Gian Merlino
Hey Jevgenija, I recently filled out the survey — hope the response is helpful! On Tue, Aug 11, 2020 at 1:05 PM Jevgenija Pantiuchina < jevgenija.pantiuch...@usi.ch> wrote: > Dear contributors, > > As part of a research team from Università della Svizzera italiana > (Switzerland) and University

Re: Druid not listed in Apache project list by category?

2020-07-31 Thread Gian Merlino
That's a good point. We must be missing some metadata. I'm not sure how this page works — does anyone else know? On Fri, Jul 31, 2020 at 11:49 AM Will Lauer wrote: > I was browsing the list of Apache projects today looking for something, and > while I was there, I noticed that Druid was

Re: Druid + Presto?

2020-07-10 Thread Gian Merlino
the last year or so — does it work the same way in each one? I'm wondering how much work can be shared between these different efforts and perhaps between these efforts and the Druid project itself. On Thu, Jul 9, 2020 at 11:24 PM Gian Merlino wrote: > Hey Samarth, > > Thanks fo

Re: Any benchmarks for druid iingesting, querying (min, max, topn avg etc)

2020-07-10 Thread Gian Merlino
Hey Rajiv, I'm not aware of one for ingestion. For querying, two recent results using the Star Schema Benchmark are this paper comparing Druid, Hive, and Presto: https://www.researchgate.net/publication/333831332_Challenging_SQL-on-Hadoop_Performance_with_Apache_Druid, and this blog post

Re: Druid + Presto?

2020-07-10 Thread Gian Merlino
iao > > > > On Thu, Jul 9, 2020 at 12:40 PM Mainak Ghosh wrote: > > > > > Hello Gian, > > > > > > We are currently testing the (other) Presto Druid connector at our end. > > It > > > has aggregation push down support. Adding Zhenxiao to thi

Re: Druid + Presto?

2020-07-10 Thread Gian Merlino
n push down support. Adding Zhenxiao to this thread since > he > > is the primary developer of the connector. He can provide the kind of > > details you are looking for. > > > > Thanks, > > Mainak > > > > > On Jul 9, 2020, at 12:25 PM, Gian Merlino wrot

Re: Druid + Presto?

2020-07-09 Thread Gian Merlino
By the way, I see that the other Presto has a Druid connector too: https://prestodb.io/docs/current/connector/druid.html. From the docs it looks like it has different lineage and might even work differently. On Thu, Jul 9, 2020 at 12:22 PM Gian Merlino wrote: > I was thinking of exploring id

Re: Druid + Presto?

2020-07-09 Thread Gian Merlino
o be improved over the next few > releases. We are currently evaluating using the presto-druid connector in > our Tableau setup. It would be interesting to see what changes in Druid > would be needed to support that integration. > > Thanks, > Samarth > > On Thu, Jul 9, 2020 at 10

Druid + Presto?

2020-07-09 Thread Gian Merlino
Hey Druids, I was wondering, is anyone on this list using Druid + Presto together? If so, what does your architecture look like and which edition / flavor of Presto and Druid connector are you using? What's your experience been like? I'm asking since I'm starting to think about whether it makes

New committer: Maggie Brewster

2020-07-07 Thread Gian Merlino
Hey Druids, The Druid PMC has invited Maggie Brewster (@mcbrewster on github) to become a committer and we are pleased to announce that she has accepted. Maggie has made dozens of contributions to Druid, especially to the (relatively) new web console.

New committer: Suneet Saldanha

2020-07-07 Thread Gian Merlino
Hey Druids, The Druid PMC has invited Suneet Saldanha (@suneet-s on github) to become a committer and we are pleased to announce that he has accepted. Suneet has contributed to areas including the new join functionality, documentation, and general code quality. He

New committer: Lucas Capistrant

2020-07-07 Thread Gian Merlino
Hey Druids, The Druid PMC has invited Lucas Capistrant (@capistrant on github) to become a committer and we are pleased to announce that he has accepted. Lucas has been active throughout the past year, contributing various enhancements and fixes. Congratulations
