[GitHub] clintropolis commented on a change in pull request #5957: Various changes

2018-07-19 Thread GitBox
clintropolis commented on a change in pull request #5957: Various changes URL: https://github.com/apache/incubator-druid/pull/5957#discussion_r203922348 ## File path: processing/src/main/java/io/druid/segment/ColumnSelector.java ## @@ -31,5 +31,5 @@ List

Re: Build failure on 0.13.SNAPSHOT

2018-07-19 Thread Dongjin Lee
Hi Jihoon, I ran `mvn clean package` following development/build. Dongjin On Fri, Jul 20, 2018 at 12:30 AM Jihoon Son wrote: > Hi Dongjin, > > what maven command did you run? > > Jihoon > > On Wed, Jul

[GitHub] jihoonson commented on issue #5471: Implement force push down for nested group by query

2018-07-19 Thread GitBox
jihoonson commented on issue #5471: Implement force push down for nested group by query URL: https://github.com/apache/incubator-druid/pull/5471#issuecomment-406445172 Hi @samarthjain, sorry for the delay. I'm looking at this PR again, and the size of this PR looks huge. Would you check

[GitHub] jon-wei closed pull request #5998: Add support to filter on datasource for active tasks

2018-07-19 Thread GitBox
jon-wei closed pull request #5998: Add support to filter on datasource for active tasks URL: https://github.com/apache/incubator-druid/pull/5998 This is a PR merged from a forked repository. As GitHub hides the original diff on merge, it is displayed below for the sake of provenance: As

Re: Question on GroupBy query results merging process

2018-07-19 Thread Jisoo Kim
Hi Jihoon, Thanks for the reply. So what I ended up doing for merging a list of serialized result Sequences (each of which is a byte array) was: 1) Create a stream out of the list 2) For each serialized sequence in the list, create a query runner that deserializes the byte array and returns a Sequence

Re: Question on GroupBy query results merging process

2018-07-19 Thread Jihoon Son
Hi Jisoo, sorry, the previous email was sent by accident. The initial version of groupBy v2 wasn't capable of combining intermediates in parallel. Some of our customers ran into an issue similar to yours, and so I was working on improving groupBy v2 performance for a while. Parallel combining on

Re: Question on GroupBy query results merging process

2018-07-19 Thread Jihoon Son
Hi Jisoo, the initial version of groupBy v2 On Thu, Jul 19, 2018 at 2:42 PM Jisoo Kim wrote: > Hi all, > > I am currently working on a project that uses Druid's QueryRunner and other > druid-processing classes. It uses Druid's own classes to calculate query > results. I have been testing large

Question on GroupBy query results merging process

2018-07-19 Thread Jisoo Kim
Hi all, I am currently working on a project that uses Druid's QueryRunner and other druid-processing classes. It uses Druid's own classes to calculate query results. I have been testing large GroupBy queries (using v2), and it seems like parallel combining threads for GroupBy queries are only

Re: Issue with segments not loading/taking a long time

2018-07-19 Thread Jihoon Son
Hi Samarth, IIRC, nothing has been changed around loading local segments when historicals start up. The above log suggests that you have 4816 segments to load. How long does it take to load all of them? How slow is it compared to before? Jihoon On Thu, Jul 19, 2018 at 1:37 PM Samarth Jain wrote:

[GitHub] jihoonson edited a comment on issue #4434: [Proposal] Automatic background segment compaction

2018-07-19 Thread GitBox
jihoonson edited a comment on issue #4434: [Proposal] Automatic background segment compaction URL: https://github.com/apache/incubator-druid/issues/4434#issuecomment-311573364 I have to admit that this proposal is quite complicated and tricky. I'll raise another proposal for this problem

Re: Issue with segments not loading/taking a long time

2018-07-19 Thread Samarth Jain
Thanks for the reply, Clint. It does look related. We also noticed that historicals are taking a long time to download the segments after a restart. At least in 0.10.1, restart of a historical wouldn't be a big deal as the segments it is responsible for serving were still available on the local

Re: Issue with segments not loading/taking a long time

2018-07-19 Thread Clint Wylie
You might be running into something related to these issues https://github.com/apache/incubator-druid/issues/5531 and https://github.com/apache/incubator-druid/issues/5882, the former of which should be fixed in 0.12.2. The effects of these issues can be at least partially mitigated by setting and

Re: Issue with segments not loading/taking a long time

2018-07-19 Thread Samarth Jain
Hi Jihoon, I have a 6 node historical test cluster. 3 nodes are at ~80% and the other two at ~60 and ~50% disk utilization. The interesting thing is that the 6th node ended up getting into zk timeout (because of large GC pause) and is no longer part of the cluster (which is a separate issue I am

Druid 0.12.2-rc1 vote

2018-07-19 Thread Jihoon Son
Hi all, we have no open issues and PRs for 0.12.2 ( https://github.com/apache/incubator-druid/milestone/27). The 0.12.2 branch is already available and all PRs for 0.12.2 have merged into that branch. Let's vote on releasing RC1. Here is my +1. This is a non-ASF release. Best, Jihoon

Re: Druid 0.12.2-rc1 vote

2018-07-19 Thread Jihoon Son
Hi guys, I think we're ready for releasing 0.12.2. I'm closing this vote and creating a new one. Best, Jihoon On Wed, Jul 11, 2018 at 1:43 PM Gian Merlino wrote: > Well, it's never good if a WTH?! message actually gets logged. They are > usually meant to be things that should "never" happen.

Issue with segments not loading/taking a long time

2018-07-19 Thread Samarth Jain
I am working on upgrading our internal cluster to 0.12.1 release and seeing that a few data sources fail to load. Looking at coordinator logs, I am seeing messages like this for the datasource: @40005b50dbc637061cec 2018-07-19T18:43:08,923 INFO [Coordinator-Exec--0]

[GitHub] gianm commented on issue #3236: gitter community channel?

2018-07-19 Thread GitBox
gianm commented on issue #3236: gitter community channel? URL: https://github.com/apache/incubator-druid/issues/3236#issuecomment-406387424 We do already have an IRC channel, and there are some nice web based IRC clients. IMO if you are interested in a chat channel being a more active

Re: synchronization question about datasketches aggregator

2018-07-19 Thread Gian Merlino
Hi Will, Check out also this thread for related discussion: https://lists.apache.org/thread.html/9899aa790a7eb561ab66f47b35c8f66ffe695432719251351339521a@%3Cdev.druid.apache.org%3E On Thu, Jul 19, 2018 at 11:21 AM Will Lauer wrote: > A colleague recently pointed out to me that all the sketch

[GitHub] gianm commented on issue #3956: Thread safe reads for aggregators in IncrementalIndex

2018-07-19 Thread GitBox
gianm commented on issue #3956: Thread safe reads for aggregators in IncrementalIndex URL: https://github.com/apache/incubator-druid/pull/3956#issuecomment-406384627 I was just looking at this issue again after the conversations on the mailing list about sketch synchronization:

[GitHub] gianm closed pull request #6022: Log the full stack trace when an HTTP request fails

2018-07-19 Thread GitBox
gianm closed pull request #6022: Log the full stack trace when an HTTP request fails URL: https://github.com/apache/incubator-druid/pull/6022 This is a PR merged from a forked repository. As GitHub hides the original diff on merge, it is displayed below for the sake of provenance: As

Re: synchronization question about datasketches aggregator

2018-07-19 Thread Roman Leventov
There is a race between aggregators and ingestion updates. Actually, many aggregators are vulnerable now. See this issue: https://github.com/apache/incubator-druid/pull/3956 and a conversation starting from this message: https://github.com/apache/incubator-druid/pull/5148#discussion_r170906998.
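
To illustrate the kind of race described above, here is a minimal sketch in which ingestion threads update an aggregator while a query thread may read it. It is not Druid code and not the change made in the linked pull requests; `LockedSumAggregator`, `aggregate`, `get`, and `ingest` are hypothetical names chosen for this example. The idea is only that a read and an update share one lock, so a reader never observes a half-applied update.

```python
import threading


class LockedSumAggregator:
    """Hypothetical aggregator (not a Druid class): reads and writes share one lock."""

    def __init__(self):
        self._lock = threading.Lock()
        self._sum = 0
        self._count = 0

    def aggregate(self, value):
        # Both fields change under the lock, so they stay consistent with each other.
        with self._lock:
            self._sum += value
            self._count += 1

    def get(self):
        # A reader takes the same lock and therefore sees a consistent (sum, count) pair.
        with self._lock:
            return self._sum, self._count


def ingest(aggregator, values):
    """Simulates an ingestion thread feeding rows into the aggregator."""
    for value in values:
        aggregator.aggregate(value)


agg = LockedSumAggregator()
writers = [threading.Thread(target=ingest, args=(agg, range(1000))) for _ in range(4)]
for thread in writers:
    thread.start()
# A query thread could call agg.get() at any point here without seeing torn state.
for thread in writers:
    thread.join()

total, count = agg.get()
```

A single lock is the simplest option and costs throughput under contention; whether that trade-off suits a real aggregator depends on details the messages above do not give.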

[GitHub] jihoonson commented on a change in pull request #5492: Native parallel batch indexing without shuffle

2018-07-19 Thread GitBox
jihoonson commented on a change in pull request #5492: Native parallel batch indexing without shuffle URL: https://github.com/apache/incubator-druid/pull/5492#discussion_r203823447 ## File path: indexing-service/src/main/java/io/druid/indexing/common/task/ParallelIndexSubTask.java

[GitHub] jihoonson commented on a change in pull request #5492: Native parallel batch indexing without shuffle

2018-07-19 Thread GitBox
jihoonson commented on a change in pull request #5492: Native parallel batch indexing without shuffle URL: https://github.com/apache/incubator-druid/pull/5492#discussion_r203215000 ## File path:

[GitHub] jihoonson commented on a change in pull request #5492: Native parallel batch indexing without shuffle

2018-07-19 Thread GitBox
jihoonson commented on a change in pull request #5492: Native parallel batch indexing without shuffle URL: https://github.com/apache/incubator-druid/pull/5492#discussion_r203822871 ## File path:

Re: list polluted by gitbox messages

2018-07-19 Thread Gian Merlino
We're working with infra to redirect the notifications: https://issues.apache.org/jira/browse/INFRA-16674 In the meantime, I have been using these filters to keep myself sane: https://gist.github.com/gianm/0eb410915c02e3844e11235172894c62 (it's a gist because the filters are partially based on

list polluted by gitbox messages

2018-07-19 Thread Prashant Deva
It seems like every bit of activity on gitbox is being posted to the dev mailing list. It's impossible to see any real messages since all I see are gitbox mails. Prashant
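
As a rough illustration of how a reader could separate the two kinds of traffic locally, the sketch below splits subject lines into human threads and automated notifications. It assumes that notifications can be recognized by a leading `[GitHub]` tag, as the subjects on this page suggest; the sample list and the `split_subjects` helper are hypothetical and are not the filters from the gist mentioned in the reply.

```python
import re

# Sample subject lines copied from this archive page.
SUBJECTS = [
    "[GitHub] gianm commented on issue #3236: gitter community channel?",
    "Re: list polluted by gitbox messages",
    "[GitHub] jon-wei closed pull request #5998: Add support to filter on datasource for active tasks",
    "Druid 0.12.2-rc1 vote",
]

# Assumption: automated notifications always start with the "[GitHub]" tag.
GITBOX_SUBJECT = re.compile(r"^\[GitHub\]\s")


def split_subjects(subjects):
    """Return (human_threads, notifications), preserving the input order."""
    human, automated = [], []
    for subject in subjects:
        if GITBOX_SUBJECT.match(subject):
            automated.append(subject)
        else:
            human.append(subject)
    return human, automated


human, automated = split_subjects(SUBJECTS)
```

Matching on the subject prefix is only one possible rule; filtering on the sender name (GitBox) would be an alternative under the same assumption about how the notifications are generated.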

[GitHub] drcrallen opened a new issue #6024: Missing exception handling as part of `io.druid.java.util.http.client.netty.HttpClientPipelineFactory`

2018-07-19 Thread GitBox
drcrallen opened a new issue #6024: Missing exception handling as part of `io.druid.java.util.http.client.netty.HttpClientPipelineFactory` URL: https://github.com/apache/incubator-druid/issues/6024 The `io.druid.java.util.http.client.netty.HttpClientPipelineFactory` class constructs the

[GitHub] drcrallen commented on a change in pull request #5913: Move Caching Cluster Client to java streams and allow parallel intermediate merges

2018-07-19 Thread GitBox
drcrallen commented on a change in pull request #5913: Move Caching Cluster Client to java streams and allow parallel intermediate merges URL: https://github.com/apache/incubator-druid/pull/5913#discussion_r203792145 ## File path:

[GitHub] drcrallen commented on a change in pull request #5913: Move Caching Cluster Client to java streams and allow parallel intermediate merges

2018-07-19 Thread GitBox
drcrallen commented on a change in pull request #5913: Move Caching Cluster Client to java streams and allow parallel intermediate merges URL: https://github.com/apache/incubator-druid/pull/5913#discussion_r203791629 ## File path:

[GitHub] drcrallen commented on a change in pull request #5913: Move Caching Cluster Client to java streams and allow parallel intermediate merges

2018-07-19 Thread GitBox
drcrallen commented on a change in pull request #5913: Move Caching Cluster Client to java streams and allow parallel intermediate merges URL: https://github.com/apache/incubator-druid/pull/5913#discussion_r203791186 ## File path:

[GitHub] gianm commented on issue #6014: Optionally refuse to consume new data until the prior chunk is being consumed

2018-07-19 Thread GitBox
gianm commented on issue #6014: Optionally refuse to consume new data until the prior chunk is being consumed URL: https://github.com/apache/incubator-druid/pull/6014#issuecomment-406326473 @drcrallen, a question: what happens when one query has a huge set of data to pull in, but the

[GitHub] gianm commented on issue #4949: Add limit to query result buffering queue

2018-07-19 Thread GitBox
gianm commented on issue #4949: Add limit to query result buffering queue URL: https://github.com/apache/incubator-druid/pull/4949#issuecomment-406325484 It looks like #6014 is attempting to solve the same problem. This is an

Re: Build failure on 0.13.SNAPSHOT

2018-07-19 Thread Jihoon Son
Hi Dongjin, what maven command did you run? Jihoon On Wed, Jul 18, 2018 at 10:38 PM Dongjin Lee wrote: > Hello. I am trying to build druid, but it fails. My environment is like the > following: > > - CPU: Intel(R) Core(TM) i7-7560U CPU @ 2.40GHz > - RAM: 7704 MB > - OS: ubuntu 18.04 > - JDK:

[GitHub] asdf2014 commented on issue #5980: Various changes about a few coding specifications

2018-07-19 Thread GitBox
asdf2014 commented on issue #5980: Various changes about a few coding specifications URL: https://github.com/apache/incubator-druid/pull/5980#issuecomment-406300864 @leventov It seems that both travis and teamcity have succeeded. Any other good suggestions?

Re: Subscription Request

2018-07-19 Thread Gian Merlino
Hi Dongjin, To subscribe, just send a mail to dev-subscr...@druid.apache.org. On Wed, Jul 18, 2018 at 9:55 PM Dongjin Lee wrote: > -- > *Dongjin Lee* > > *A hitchhiker in the mathematical world.* > > *github: github.com/dongjinleekr >

[GitHub] sascha-coenen opened a new issue #6023: Full ISO 8601 compatibility in query intervals

2018-07-19 Thread GitBox
sascha-coenen opened a new issue #6023: Full ISO 8601 compatibility in query intervals URL: https://github.com/apache/incubator-druid/issues/6023 The current Druid documentation states in several places that queries have a mandatory "intervals" or "interval" attribute which can contain
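
For context, ISO 8601 allows a time interval to be written as start/end, start/duration, duration/end, or a bare duration. The sketch below only classifies a string into one of those forms; it does not validate the timestamps or durations themselves, and it says nothing about which forms Druid accepts, since the excerpt above is cut off before that point. `classify_interval` is a hypothetical helper written for this example.

```python
def classify_interval(text):
    """Classify an ISO 8601 interval string by its shape only.

    Simplification: any part that starts with "P" is treated as a duration;
    neither timestamps nor durations are validated, and repeating intervals
    (the "R" prefix) are not handled.
    """
    parts = text.split("/")
    if any(part == "" for part in parts):
        return "invalid"
    if len(parts) == 1:
        return "duration" if parts[0].startswith("P") else "invalid"
    if len(parts) != 2:
        return "invalid"

    start_is_duration = parts[0].startswith("P")
    end_is_duration = parts[1].startswith("P")
    if start_is_duration and end_is_duration:
        return "invalid"
    if start_is_duration:
        return "duration/end"
    if end_is_duration:
        return "start/duration"
    return "start/end"


examples = {
    "2018-07-01T00:00:00Z/2018-07-19T00:00:00Z": classify_interval(
        "2018-07-01T00:00:00Z/2018-07-19T00:00:00Z"
    ),
    "2018-07-01T00:00:00Z/P1D": classify_interval("2018-07-01T00:00:00Z/P1D"),
    "P1D/2018-07-19T00:00:00Z": classify_interval("P1D/2018-07-19T00:00:00Z"),
}
```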

[GitHub] nicolasblaye commented on issue #6018: RegisteredLookup java api

2018-07-19 Thread GitBox
nicolasblaye commented on issue #6018: RegisteredLookup java api URL: https://github.com/apache/incubator-druid/issues/6018#issuecomment-406204255 By the way, I did a quick fix for our use case, but it's far from something clean. I created a `RegisteredExtractionFn` (instead of a

[GitHub] nicolasblaye commented on issue #6018: RegisteredLookup java api

2018-07-19 Thread GitBox
nicolasblaye commented on issue #6018: RegisteredLookup java api URL: https://github.com/apache/incubator-druid/issues/6018#issuecomment-406201106 Hi drcrallen, Thank you for your quick response. Indeed, this class is in the druid server jar, but I need it in the druid-api jar (or