[GitHub] [incubator-druid] JackyYangPassion edited a comment on issue #7484: my realtime task cant handoff the segments
JackyYangPassion edited a comment on issue #7484: my realtime task cant handoff the segments URL: https://github.com/apache/incubator-druid/issues/7484#issuecomment-490748567 Check whether a task has many segments to ingest into Druid. A month ago I had the same problem; the cause was a datasource that ingested a year of data every day. Another possible cause is a dead Historical node, which can also make handoff take a long time. To see why, look at the coordinator code: CoordinatorHistoricalManagerRunnable in class DruidCoordinator This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org For additional commands, e-mail: commits-h...@druid.apache.org
[GitHub] [incubator-druid] JackyYangPassion edited a comment on issue #7484: my realtime task cant handoff the segments
JackyYangPassion edited a comment on issue #7484: my realtime task cant handoff the segments URL: https://github.com/apache/incubator-druid/issues/7484#issuecomment-490748567 Check whether a task has many segments to ingest into Druid. A month ago I had the same problem; the cause was a datasource that ingested a year of data every day. Another possible cause is a dead Historical node, which can also make handoff take a long time. To see why, look at the coordinator code: CoordinatorHistoricalManagerRunnable in class DruidCoordinator.java
[GitHub] [incubator-druid] JackyYangPassion commented on issue #7484: my realtime task cant handoff the segments
JackyYangPassion commented on issue #7484: my realtime task cant handoff the segments URL: https://github.com/apache/incubator-druid/issues/7484#issuecomment-490748567 Check whether a task has many segments to ingest into Druid.
[incubator-druid-website] branch 0.14.1-downloads created (now b84129d)
This is an automated email from the ASF dual-hosted git repository. cwylie pushed a change to branch 0.14.1-downloads in repository https://gitbox.apache.org/repos/asf/incubator-druid-website.git. at b84129d add download links for 0.14.1 This branch includes the following new commits: new b84129d add download links for 0.14.1 The 1 revisions listed above as "new" are entirely new to this repository and will be described in separate emails. The revisions listed as "add" were already present in the repository and have only been added to this reference.
[incubator-druid-website] 01/01: add download links for 0.14.1
This is an automated email from the ASF dual-hosted git repository. cwylie pushed a commit to branch 0.14.1-downloads in repository https://gitbox.apache.org/repos/asf/incubator-druid-website.git commit b84129d818d4a4588297beb5517886705edb1bcf Author: Clint Wylie AuthorDate: Wed May 8 21:06:55 2019 -0700 add download links for 0.14.1 --- downloads.html | 21 + 1 file changed, 21 insertions(+) diff --git a/downloads.html b/downloads.html index 912162a..6b6f1ff 100644 --- a/downloads.html +++ b/downloads.html @@ -44,6 +44,27 @@ + + 0.14.1-incubating + 08 May 2019 + +https://www.apache.org/dyn/closer.cgi?path=/incubator/druid/0.14.1-incubating/apache-druid-0.14.1-incubating-src.tar.gz; onclick="trackDownload('click', 'https://www.apache.org/dyn/closer.cgi?path=/incubator/druid/0.14.1-incubating/apache-druid-0.14.1-incubating-src.tar.gz');">source + +(https://www.apache.org/dist/incubator/druid/0.14.1-incubating/apache-druid-0.14.1-incubating-src.tar.gz.sha512; onclick="trackDownload('click', 'https://www.apache.org/dist/incubator/druid/0.14.1-incubating/apache-druid-0.14.1-incubating-src.tar.gz.sha512');">sha512 + +https://www.apache.org/dist/incubator/druid/0.14.1-incubating/apache-druid-0.14.1-incubating-src.tar.gz.asc; onclick="trackDownload('click', 'https://www.apache.org/dist/incubator/druid/0.14.1-incubating/apache-druid-0.14.1-incubating-src.tar.gz.asc');">pgp) + + +https://www.apache.org/dyn/closer.cgi?path=/incubator/druid/0.14.1-incubating/apache-druid-0.14.1-incubating-bin.tar.gz; onclick="trackDownload('click', 'https://www.apache.org/dyn/closer.cgi?path=/incubator/druid/0.14.1-incubating/apache-druid-0.14.1-incubating-bin.tar.gz');">binary + +(https://www.apache.org/dist/incubator/druid/0.14.1-incubating/apache-druid-0.14.1-incubating-bin.tar.gz.sha512; onclick="trackDownload('click', 'https://www.apache.org/dist/incubator/druid/0.14.1-incubating/apache-druid-0.14.1-incubating-bin.tar.gz.sha512');">sha512 + 
+https://www.apache.org/dist/incubator/druid/0.14.1-incubating/apache-druid-0.14.1-incubating-bin.tar.gz.asc; onclick="trackDownload('click', 'https://www.apache.org/dist/incubator/druid/0.14.1-incubating/apache-druid-0.14.1-incubating-bin.tar.gz.asc');">pgp) + + +https://github.com/apache/incubator-druid/releases/tag/druid-0.14.1-incubating;>Release notes + + 0.14.0-incubating 09 Apr 2019
[GitHub] [incubator-druid-website] clintropolis opened a new pull request #5: add download links for 0.14.1
clintropolis opened a new pull request #5: add download links for 0.14.1 URL: https://github.com/apache/incubator-druid-website/pull/5
[GitHub] [incubator-druid] iamxiaojuan commented on issue #7620: sql injection violation
iamxiaojuan commented on issue #7620: sql injection violation URL: https://github.com/apache/incubator-druid/issues/7620#issuecomment-490735437
> Druid SQL does not support inserts (or variables). It is a read only query interface. You ingest data in other ways: http://druid.io/docs/latest/ingestion/index.html

Thank you very much!
[GitHub] [incubator-druid] clintropolis commented on issue #7619: fix issue #7607
clintropolis commented on issue #7619: fix issue #7607 URL: https://github.com/apache/incubator-druid/pull/7619#issuecomment-490734781

Hmm, the failure is perhaps no longer related to missing artifacts:

```
[WARNING] Rule 1: org.apache.maven.plugins.enforcer.BannedDependencies failed with message:
Found Banned Dependency: com.google.code.findbugs:annotations:jar:3.0.0
Use 'mvn dependency:tree' to locate the source of the banned dependencies.
```

at https://travis-ci.org/apache/incubator-druid/jobs/530049926#L3162

Doing a quick search, this library is licensed as `lgpl`, which is perhaps the reason for this issue? Is this a new dependency between 0.13.1 and 0.13.3? I wonder if it can be safely excluded from the datasketches extension pom?

```
<exclusion>
  <groupId>com.google.code.findbugs</groupId>
  <artifactId>annotations</artifactId>
</exclusion>
```
[GitHub] [incubator-druid] vogievetsky commented on issue #7620: sql injection violation
vogievetsky commented on issue #7620: sql injection violation URL: https://github.com/apache/incubator-druid/issues/7620#issuecomment-490734489 Druid SQL does not support inserts (or variables). It is a read only query interface. You ingest data in other ways: http://druid.io/docs/latest/ingestion/index.html
[GitHub] [incubator-druid] iamxiaojuan commented on issue #7620: sql injection violation
iamxiaojuan commented on issue #7620: sql injection violation URL: https://github.com/apache/incubator-druid/issues/7620#issuecomment-490734241
> I am pretty sure nothing about that SQL statement is supported

What do you think the reason is for that? And thank you for your reply.
[GitHub] [incubator-druid] vogievetsky commented on issue #7620: sql injection violation
vogievetsky commented on issue #7620: sql injection violation URL: https://github.com/apache/incubator-druid/issues/7620#issuecomment-490733753 I am pretty sure nothing about that error is supported
[GitHub] [incubator-druid] vogievetsky edited a comment on issue #7620: sql injection violation
vogievetsky edited a comment on issue #7620: sql injection violation URL: https://github.com/apache/incubator-druid/issues/7620#issuecomment-490733753 I am pretty sure nothing about that SQL statement is supported
[GitHub] [incubator-druid] clintropolis commented on a change in pull request #7606: Set direct memory if unable to detect JVM config
clintropolis commented on a change in pull request #7606: Set direct memory if unable to detect JVM config
URL: https://github.com/apache/incubator-druid/pull/7606#discussion_r282326369

## File path: processing/src/main/java/org/apache/druid/query/DruidProcessingConfig.java ##
@@ -52,7 +52,22 @@ public int intermediateComputeSizeBytes()
       return computedBufferSizeBytes.get();
     }

-    long directSizeBytes = JvmUtils.getRuntimeInfo().getDirectMemorySizeBytes();
+    long directSizeBytes;
+    try {
+      directSizeBytes = JvmUtils.getRuntimeInfo().getDirectMemorySizeBytes();

Review comment:
I haven't been following the jdk9+ compat PRs too closely, so apologies if this has been discussed elsewhere, but if we aren't afraid of getting a bit dirty there appear to be at least 2 ways we could source this information, which I tested up to jdk11 (didn't have 12 handy). [This stuff is still in the jdk, at least as of 12, but it's now in `jdk.internal.misc.VM`](https://github.com/AdoptOpenJDK/openjdk-jdk12u/blob/master/src/java.base/share/classes/jdk/internal/misc/VM.java#L130). I imagine the java people would tell us that neither of the ways I could get this information is legit to use, but they probably would've said that about using `sun.misc.VM` in the first place.

The first and less intrusive way is to reflect it out of `java.nio.Bits`, which stores the value it gets from `jdk.internal.misc.VM` in a static field which we could grab like this:

```
Class<?> bitsClass = Class.forName("java.nio.Bits");
Field maxMemField = bitsClass.getDeclaredField("MAX_MEMORY");
maxMemField.setAccessible(true);
long maxMem = (Long) maxMemField.get(null);
```

However, this will complain loudly on stderr with something like:

```
WARNING: An illegal reflective access operation has occurred
WARNING: Illegal reflective access by me.clintropolis.sandbox.mem.Main (file:/Users/clint/workspace/clintropolis/sandbox/target/classes/) to field java.nio.Bits.MAX_MEMORY
WARNING: Please consider reporting this to the maintainers of me.clintropolis.sandbox.mem.Main
WARNING: Use --illegal-access=warn to enable warnings of further illegal reflective access operations
WARNING: All illegal access operations will be denied in a future release
```

The 2nd way involves directly using `jdk.internal.misc.VM` like we were doing for `sun.misc.VM`, something like:

```
Class<?> vmClass = Class.forName("jdk.internal.misc.VM");
Object maxDirectMemoryObj = vmClass.getMethod("maxDirectMemory").invoke(null);
if (maxDirectMemoryObj == null || !(maxDirectMemoryObj instanceof Number)) {
  throw new UOE("Cannot determine maxDirectMemory from [%s]", maxDirectMemoryObj);
} else {
  return ((Number) maxDirectMemoryObj).longValue();
}
```

and adding:

```
--add-exports java.base/jdk.internal.misc=ALL-UNNAMED
```

to the `java` command-line when running (or stuff in jvm.config?). If this is not added, it will explode violently with:

```
Exception in thread "main" java.lang.IllegalAccessException: class me.clintropolis.sandbox.mem.Main cannot access class jdk.internal.misc.VM (in module java.base) because module java.base does not export jdk.internal.misc to unnamed module @2f7c7260
	at java.base/jdk.internal.reflect.Reflection.newIllegalAccessException(Reflection.java:361)
	at java.base/java.lang.reflect.AccessibleObject.checkAccess(AccessibleObject.java:591)
	at java.base/java.lang.reflect.Method.invoke(Method.java:558)
	at me.clintropolis.sandbox.mem.Main.main(Main.java:17)
```

Other approaches I've seen: [Netty has some crazy bit about trying to pull it from the java command-line options](https://github.com/netty/netty/blob/4.1/common/src/main/java/io/netty/util/internal/PlatformDependent.java#L1035) to parse if the user has set `-XX:MaxDirectMemorySize` and try to use that if it can't get the information from `sun.misc.VM`, but I'm unsure if that is reasonable to do here.

Personally, I don't really understand why the java developers don't think it necessary to provide a friendly way to expose this information, but it is what it is. I don't know if we want to pursue either of these approaches, just wanted to bring it up for discussion.

All this said, it looks to me like [the jdk itself defaults max directMemory to `Runtime.getRuntime().maxMemory()`](https://github.com/AdoptOpenJDK/openjdk-jdk12u/blob/master/src/java.base/share/classes/jdk/internal/misc/VM.java#L208) if `-XX:MaxDirectMemorySize` is not set, so it doesn't seem so unreasonable to size off of that...

Regardless, thanks for working on this stuff @xvrl!
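The command-line approach attributed to Netty above, combined with the JDK default of falling back to the max heap size, might look roughly like the sketch below. This is only an illustration under stated assumptions: `DirectMemoryEstimator` and its methods are hypothetical names (not part of Druid, Netty, or this PR), the flag is assumed to take an optional `k`/`m`/`g` suffix, the last occurrence of the flag is assumed to win, and a value of zero is treated as "not set". Only `ManagementFactory.getRuntimeMXBean().getInputArguments()` and `Runtime.getRuntime().maxMemory()` are real JDK calls.

```java
import java.lang.management.ManagementFactory;
import java.util.List;
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not part of Druid or of this PR.
final class DirectMemoryEstimator
{
  // Assumed flag shape: -XX:MaxDirectMemorySize=<digits>[k|m|g]
  private static final Pattern MAX_DIRECT_MEMORY_FLAG =
      Pattern.compile("^-XX:MaxDirectMemorySize=(\\d+)([kKmMgG]?)$");

  private DirectMemoryEstimator() {}

  /** Returns the size in bytes encoded in one JVM argument, or -1 if it is a different flag. */
  static long parseMaxDirectMemoryArg(String arg)
  {
    Matcher matcher = MAX_DIRECT_MEMORY_FLAG.matcher(arg.trim());
    if (!matcher.matches()) {
      return -1L;
    }
    long value = Long.parseLong(matcher.group(1));
    switch (matcher.group(2).toLowerCase(Locale.ROOT)) {
      case "k":
        return value << 10;
      case "m":
        return value << 20;
      case "g":
        return value << 30;
      default:
        return value; // no suffix: plain bytes
    }
  }

  /** Uses the last matching flag if it is positive, otherwise falls back to the max heap size. */
  static long estimate(List<String> jvmArgs, long maxHeapBytes)
  {
    long configured = -1L;
    for (String arg : jvmArgs) {
      long parsed = parseMaxDirectMemoryArg(arg);
      if (parsed >= 0) {
        configured = parsed; // assumption: later flags override earlier ones
      }
    }
    return configured > 0 ? configured : maxHeapBytes;
  }

  /** Convenience overload that inspects the arguments of the running JVM. */
  static long estimate()
  {
    return estimate(
        ManagementFactory.getRuntimeMXBean().getInputArguments(),
        Runtime.getRuntime().maxMemory()
    );
  }
}
```

Unlike the two reflective options, this needs no `--add-exports` flag and triggers no illegal-access warning, but it only sees what was passed on the command line, so it is an estimate rather than the value the JVM actually enforces.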
[GitHub] [incubator-druid] clintropolis commented on issue #7607: thetaSketch(with sketches-core-0.13.1) in groupBy always return value no more than 16384
clintropolis commented on issue #7607: thetaSketch(with sketches-core-0.13.1) in groupBy always return value no more than 16384 URL: https://github.com/apache/incubator-druid/issues/7607#issuecomment-490721420
> This is a regression in Theta sketch code. So I would think you don't want to approve the 0.14.1 release candidate as it is now. We will fix the sketches-core shortly.

0.14.1 is too far gone, the artifacts are already propagated to maven and the apache mirrors, so I'm going to go ahead and do the release anyway. I've modified the release notes to warn about upgrading if relying on theta sketches. This issue does seem severe enough to go ahead and do a 0.14.2 since we can probably drive that through a lot quicker than we can wrap up and validate 0.15.0, so I will create an rc and start a vote as soon as possible.
[GitHub] [incubator-druid] xvrl commented on a change in pull request #7606: Set direct memory if unable to detect JVM config
xvrl commented on a change in pull request #7606: Set direct memory if unable to detect JVM config URL: https://github.com/apache/incubator-druid/pull/7606#discussion_r282309638 ## File path: processing/src/main/java/org/apache/druid/query/DruidProcessingConfig.java ## @@ -52,7 +52,22 @@ public int intermediateComputeSizeBytes() return computedBufferSizeBytes.get(); } -long directSizeBytes = JvmUtils.getRuntimeInfo().getDirectMemorySizeBytes(); +long directSizeBytes; +try { + directSizeBytes = JvmUtils.getRuntimeInfo().getDirectMemorySizeBytes(); Review comment: yes, the call will always fail in Java 9 and above. DEFAULT_PROCESSING_BUFFER_SIZE_BYTES is -1, so this logic should only apply if they haven't configured anything and we need to auto-size the buffer. This only changes the auto-sizing logic to assume the maximum available direct memory is at least 25% of heap size, which should be relatively safe, given that JDKs now default to max direct memory = max heap size. The only reason I picked a fraction of it is to provide a better out of the box experience for someone kicking the tires, and avoid having their memory usage be twice as much as the heap size they configured, but I picked this number fairly arbitrarily, so I'd be happy to revisit if we think a different default is better.
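The fallback described in this comment can be sketched as follows. This is a minimal illustration, not the code in the PR: `ProcessingBufferSizer`, both method names, the per-buffer division and the 1 GiB cap are assumptions made for the example, and the detector is assumed to signal failure with `UnsupportedOperationException`. Only the 25%-of-heap fallback itself comes from the comment.

```java
import java.util.function.LongSupplier;

// Hypothetical sketch; names and the sizing formula are illustrative assumptions, not Druid's code.
final class ProcessingBufferSizer
{
  private static final long MAX_BUFFER_BYTES = 1L << 30; // assumed cap of 1 GiB per buffer

  private ProcessingBufferSizer() {}

  /** Returns the detected direct memory size, or 25% of max heap if detection is unsupported. */
  static long directMemoryOrFallback(LongSupplier detector, long maxHeapBytes)
  {
    try {
      return detector.getAsLong();
    }
    catch (UnsupportedOperationException e) {
      // Detection failed (for example on Java 9+): assume at least a quarter of max heap.
      return maxHeapBytes / 4;
    }
  }

  /** Splits direct memory across the buffers, leaving one extra share as headroom (assumed). */
  static int bufferSizeBytes(long directBytes, int numBuffers)
  {
    long perBuffer = directBytes / (numBuffers + 1L);
    return (int) Math.min(perBuffer, MAX_BUFFER_BYTES);
  }
}
```

With a 4 GiB heap and a failing detector, this assumes 1 GiB of direct memory, so a process that also allocates those buffers stays near 1.25x the configured heap rather than 2x.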
[GitHub] [incubator-druid] jihoonson commented on issue #7619: fix issue #7607
jihoonson commented on issue #7619: fix issue #7607 URL: https://github.com/apache/incubator-druid/pull/7619#issuecomment-49077 Ok. Will restart CI after a few mins.
[GitHub] [incubator-druid] jihoonson commented on issue #7619: fix issue #7607
jihoonson commented on issue #7619: fix issue #7607 URL: https://github.com/apache/incubator-druid/pull/7619#issuecomment-490698350 Should we add a unit test?
[GitHub] [incubator-druid] AlexanderSaydakov commented on issue #7607: thetaSketch(with sketches-core-0.13.1) in groupBy always return value no more than 16384
AlexanderSaydakov commented on issue #7607: thetaSketch(with sketches-core-0.13.1) in groupBy always return value no more than 16384 URL: https://github.com/apache/incubator-druid/issues/7607#issuecomment-490697153 https://github.com/apache/incubator-druid/pull/7619
[GitHub] [incubator-druid] AlexanderSaydakov opened a new pull request #7619: fix issue #7607
AlexanderSaydakov opened a new pull request #7619: fix issue #7607 URL: https://github.com/apache/incubator-druid/pull/7619
[GitHub] [incubator-druid] leerho edited a comment on issue #7607: thetaSketch(with sketches-core-0.13.1) in groupBy always return value no more than 16384
leerho edited a comment on issue #7607: thetaSketch(with sketches-core-0.13.1) in groupBy always return value no more than 16384 URL: https://github.com/apache/incubator-druid/issues/7607#issuecomment-490693034 @pzhdfy @gianm DataSketches sketches-core 0.13.3 is now released to Maven Central with the fix. Thank you @pzhdfy and @gianm for your help in finding this!!
[GitHub] [incubator-druid] leerho commented on issue #7607: thetaSketch(with sketches-core-0.13.1) in groupBy always return value no more than 16384
leerho commented on issue #7607: thetaSketch(with sketches-core-0.13.1) in groupBy always return value no more than 16384 URL: https://github.com/apache/incubator-druid/issues/7607#issuecomment-490693034 @pzhdfy @gianm DataSketches sketches-core 0.13.3 is now released to Maven Central with the fix. Thank you @pzhdfy and @gianm for your help in finding this!!
[GitHub] [incubator-druid] leerho edited a comment on issue #7607: thetaSketch(with sketches-core-0.13.1) in groupBy always return value no more than 16384
leerho edited a comment on issue #7607: thetaSketch(with sketches-core-0.13.1) in groupBy always return value no more than 16384 URL: https://github.com/apache/incubator-druid/issues/7607#issuecomment-490693034 @pzhdfy @gianm DataSketches sketches-core 0.13.3 is now released to Maven Central with the fix. Thank you @pzhdfy and @gianm for your help in finding this!!
[GitHub] [incubator-druid] jihoonson commented on a change in pull request #7331: TDigest backed sketch aggregators
jihoonson commented on a change in pull request #7331: TDigest backed sketch aggregators URL: https://github.com/apache/incubator-druid/pull/7331#discussion_r282295708 ## File path: extensions-contrib/tdigestsketch/src/main/java/org/apache/druid/query/aggregation/tdigestsketch/TDigestBuildSketchBufferAggregator.java ## @@ -0,0 +1,125 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. 
+ */ + +package org.apache.druid.query.aggregation.tdigestsketch; + +import com.google.common.base.Preconditions; +import com.tdunning.math.stats.MergingDigest; +import it.unimi.dsi.fastutil.ints.Int2ObjectMap; +import it.unimi.dsi.fastutil.ints.Int2ObjectOpenHashMap; +import org.apache.druid.java.util.common.IAE; +import org.apache.druid.query.aggregation.BufferAggregator; +import org.apache.druid.segment.ColumnValueSelector; + +import javax.annotation.Nonnull; +import javax.annotation.concurrent.GuardedBy; +import java.nio.ByteBuffer; +import java.util.IdentityHashMap; +import java.util.Map; + +/** + * Aggregator that builds t-digest backed sketches using numeric values read from {@link ByteBuffer} + */ +public class TDigestBuildSketchBufferAggregator implements BufferAggregator +{ + + @Nonnull + private final ColumnValueSelector selector; + @Nonnull + private final int compression; + + @GuardedBy("this") + private Map> sketches = new IdentityHashMap<>(); + + public TDigestBuildSketchBufferAggregator( + final ColumnValueSelector valueSelector, + final Integer compression + ) + { +Preconditions.checkNotNull(valueSelector); +this.selector = valueSelector; +if (compression != null) { + this.compression = compression; +} else { + this.compression = TDigestBuildSketchAggregator.DEFAULT_COMPRESSION; +} + } + + @Override + public synchronized void init(ByteBuffer buffer, int position) Review comment: If a query is issued while a stream ingestion task is running, then the query would be routed to that task. This is when concurrent reads and writes can happen. Since only `OnHeapIncrementalIndex` is used at ingestion time which uses `Aggregator`, we need to consider if there's any concurrency issue between `get()` and `aggregate()`. 
Check out these comments: https://github.com/apache/incubator-druid/pull/5002#issuecomment-341179982, https://github.com/apache/incubator-druid/pull/5148#discussion_r170906998 I'm not sure why `HistogramAggregator` is not synchronized even though it looks like it should be.
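The concern about concurrent `get()` and `aggregate()` calls can be illustrated with a small stand-alone class. This is a hedged sketch, not a Druid `Aggregator`: `SynchronizedMeanAggregator` is a hypothetical name and it tracks a running mean instead of a t-digest, purely so that the example has two fields that must be read together.

```java
// Hypothetical stand-in for an on-heap aggregator; not a Druid class.
final class SynchronizedMeanAggregator
{
  private long count;  // guarded by "this"
  private double sum;  // guarded by "this"

  /** Called by the ingestion thread for each incoming row. */
  synchronized void aggregate(double value)
  {
    count++;
    sum += value;
  }

  /** Called by query threads; reads count and sum from the same point in time. */
  synchronized double get()
  {
    return count == 0 ? Double.NaN : sum / count;
  }
}
```

Without `synchronized` on both methods, a query thread could observe `count` already incremented while `sum` is still stale, which is the kind of torn read the review comment asks the author to rule out.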
[GitHub] [incubator-druid] jihoonson commented on a change in pull request #7331: TDigest backed sketch aggregators
jihoonson commented on a change in pull request #7331: TDigest backed sketch aggregators URL: https://github.com/apache/incubator-druid/pull/7331#discussion_r282295729 ## File path: extensions-contrib/tdigestsketch/src/main/java/org/apache/druid/query/aggregation/tdigestsketch/TDigestBuildSketchBufferAggregator.java ## @@ -0,0 +1,125 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. 
+ */ + +package org.apache.druid.query.aggregation.tdigestsketch; + +import com.google.common.base.Preconditions; +import com.tdunning.math.stats.MergingDigest; +import it.unimi.dsi.fastutil.ints.Int2ObjectMap; +import it.unimi.dsi.fastutil.ints.Int2ObjectOpenHashMap; +import org.apache.druid.java.util.common.IAE; +import org.apache.druid.query.aggregation.BufferAggregator; +import org.apache.druid.segment.ColumnValueSelector; + +import javax.annotation.Nonnull; +import javax.annotation.Nullable; +import javax.annotation.concurrent.GuardedBy; +import java.nio.ByteBuffer; +import java.util.IdentityHashMap; +import java.util.Map; + +/** + * Aggregator that builds t-digest backed sketches using numeric values read from {@link ByteBuffer} + */ +public class TDigestBuildSketchBufferAggregator implements BufferAggregator +{ + + @Nonnull + private final ColumnValueSelector selector; + private final int compression; + + @GuardedBy("this") + private final Map> sketches = new IdentityHashMap<>(); + + public TDigestBuildSketchBufferAggregator( + final ColumnValueSelector valueSelector, + @Nullable final Integer compression + ) + { +Preconditions.checkNotNull(valueSelector); +this.selector = valueSelector; +if (compression != null) { + this.compression = compression; +} else { + this.compression = TDigestBuildSketchAggregatorFactory.DEFAULT_COMPRESSION; +} + } + + @Override + public synchronized void init(ByteBuffer buffer, int position) + { +MergingDigest emptyDigest = new MergingDigest(compression); +putSketch(buffer, position, emptyDigest); + } + + @Override + public synchronized void aggregate(ByteBuffer buffer, int position) + { +MergingDigest sketch = sketches.get(buffer).get(position); +Object x = selector.getObject(); +if (x instanceof Number) { + sketch.add(((Number) x).doubleValue()); +} else { + throw new IAE("Unexpected value of type " + x.getClass().getName() + " encountered"); +} + } + + @Override + public synchronized Object get(final ByteBuffer buffer, final 
int position) + { +return sketches.get(buffer).get(position); Review comment: Thank you for calling this out! This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org For additional commands, e-mail: commits-h...@druid.apache.org
[GitHub] [incubator-druid] surekhasaharan commented on a change in pull request #7595: Optimize overshadowed segments computation
surekhasaharan commented on a change in pull request #7595: Optimize overshadowed segments computation URL: https://github.com/apache/incubator-druid/pull/7595#discussion_r282204314 ## File path: core/src/main/java/org/apache/druid/timeline/DataSegmentWithOvershadowedStatus.java ## @@ -25,16 +25,16 @@ /** * DataSegment object plus the overshadowed status for the segment. An immutable object. * - * SegmentWithOvershadowedStatus's {@link #compareTo} method considers only the {@link SegmentId} + * DataSegmentWithOvershadowedStatus's {@link #compareTo} method considers only the {@link SegmentId} * of the DataSegment object. */ -public class SegmentWithOvershadowedStatus implements Comparable +public class DataSegmentWithOvershadowedStatus implements Comparable Review comment: Hmm, since it's a wrapper around `DataSegment`, I felt this name made more sense. I'd be okay with reverting it to `SegmentWithOvershadowedStatus` if there are plans to rename `DataSegment`, although that seems unclear at this point.
[GitHub] [incubator-druid] surekhasaharan commented on a change in pull request #7595: Optimize overshadowed segments computation
surekhasaharan commented on a change in pull request #7595: Optimize overshadowed segments computation URL: https://github.com/apache/incubator-druid/pull/7595#discussion_r282280885 ## File path: server/src/main/java/org/apache/druid/metadata/SQLMetadataSegmentManager.java ## @@ -744,6 +757,32 @@ public DataSegment map(int index, ResultSet r, StatementContext ctx) throws SQLE // Replace "dataSources" atomically. dataSources = newDataSources; +overshadowedSegments = ImmutableSet.copyOf(determineOvershadowedSegments(segments)); + } + + /** + * This method builds a timeline from given segments and finds the overshadowed segments + * + * @return set of overshadowed segments + */ + private Set determineOvershadowedSegments(Iterable segments) Review comment: Changed to a list.
[GitHub] [incubator-druid] surekhasaharan commented on a change in pull request #7595: Optimize overshadowed segments computation
surekhasaharan commented on a change in pull request #7595: Optimize overshadowed segments computation URL: https://github.com/apache/incubator-druid/pull/7595#discussion_r282291730 ## File path: server/src/main/java/org/apache/druid/server/coordinator/helper/DruidCoordinatorRuleRunner.java ## @@ -84,8 +85,10 @@ public DruidCoordinatorRuntimeParams run(DruidCoordinatorRuntimeParams params) // find available segments which are not overshadowed by other segments in DB // only those would need to be loaded/dropped // anything overshadowed by served segments is dropped automatically by DruidCoordinatorCleanupOvershadowed -final Set overshadowed = ImmutableDruidDataSource -.determineOvershadowedSegments(params.getAvailableSegments()); +// If metadata store hasn't been polled yet, use empty overshadowed list +final Collection overshadowed = Optional + .ofNullable(coordinator.getMetadataSegmentManager().findOvershadowedSegments()) Review comment: If it's acceptable to get the updated overshadowedSegments in the next run of `DruidCoordinatorHelper#run`, then I think it might be OK. If not, then maybe we should compute the overshadowed list here itself, as before, until the mutability of `DataSegment` is settled.
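Editorial sketch of the null-to-empty fallback that the quoted diff comment describes ("If metadata store hasn't been polled yet, use empty overshadowed list"). The quoted diff is cut off after `.ofNullable(...)`, so the class and method names below are hypothetical and this is only an assumption about the general pattern, not the code in the pull request:

```java
import java.util.Collection;
import java.util.Collections;
import java.util.Optional;

// Hypothetical helper; not part of Druid.
final class OvershadowedFallback
{
  // Returns the polled collection, or an empty list when nothing has been polled yet (null).
  static <T> Collection<T> orEmpty(Collection<T> polled)
  {
    return Optional.ofNullable(polled).orElse(Collections.<T>emptyList());
  }
}
```

With a fallback like this, a run that happens before the first poll simply treats no segment as overshadowed, which is the trade-off the review comment is weighing against recomputing the list on every run.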
[GitHub] [incubator-druid] surekhasaharan commented on a change in pull request #7595: Optimize overshadowed segments computation
surekhasaharan commented on a change in pull request #7595: Optimize overshadowed segments computation URL: https://github.com/apache/incubator-druid/pull/7595#discussion_r282248638 ## File path: server/src/main/java/org/apache/druid/metadata/SQLMetadataSegmentManager.java ## @@ -744,6 +757,32 @@ public DataSegment map(int index, ResultSet r, StatementContext ctx) throws SQLE // Replace "dataSources" atomically. dataSources = newDataSources; +overshadowedSegments = ImmutableSet.copyOf(determineOvershadowedSegments(segments)); Review comment: Why would the overshadowed segments be invalid? Do you think it can happen when some dataSources are enabled or disabled outside `doPoll`? Also, if they do become invalid, would a comment be enough, or should something be done in code to prevent invalid overshadowed segments?
[GitHub] [incubator-druid] jon-wei commented on a change in pull request #7331: TDigest backed sketch aggregators
jon-wei commented on a change in pull request #7331: TDigest backed sketch aggregators URL: https://github.com/apache/incubator-druid/pull/7331#discussion_r282288766 ## File path: extensions-contrib/tdigestsketch/src/main/java/org/apache/druid/query/aggregation/tdigestsketch/TDigestBuildSketchBufferAggregator.java ## @@ -0,0 +1,125 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. 
+ */ + +package org.apache.druid.query.aggregation.tdigestsketch; + +import com.google.common.base.Preconditions; +import com.tdunning.math.stats.MergingDigest; +import it.unimi.dsi.fastutil.ints.Int2ObjectMap; +import it.unimi.dsi.fastutil.ints.Int2ObjectOpenHashMap; +import org.apache.druid.java.util.common.IAE; +import org.apache.druid.query.aggregation.BufferAggregator; +import org.apache.druid.segment.ColumnValueSelector; + +import javax.annotation.Nonnull; +import javax.annotation.Nullable; +import javax.annotation.concurrent.GuardedBy; +import java.nio.ByteBuffer; +import java.util.IdentityHashMap; +import java.util.Map; + +/** + * Aggregator that builds t-digest backed sketches using numeric values read from {@link ByteBuffer} + */ +public class TDigestBuildSketchBufferAggregator implements BufferAggregator +{ + + @Nonnull + private final ColumnValueSelector selector; + private final int compression; + + @GuardedBy("this") + private final Map> sketches = new IdentityHashMap<>(); + + public TDigestBuildSketchBufferAggregator( + final ColumnValueSelector valueSelector, + @Nullable final Integer compression + ) + { +Preconditions.checkNotNull(valueSelector); +this.selector = valueSelector; +if (compression != null) { + this.compression = compression; +} else { + this.compression = TDigestBuildSketchAggregatorFactory.DEFAULT_COMPRESSION; +} + } + + @Override + public synchronized void init(ByteBuffer buffer, int position) + { +MergingDigest emptyDigest = new MergingDigest(compression); +putSketch(buffer, position, emptyDigest); + } + + @Override + public synchronized void aggregate(ByteBuffer buffer, int position) + { +MergingDigest sketch = sketches.get(buffer).get(position); +Object x = selector.getObject(); +if (x instanceof Number) { + sketch.add(((Number) x).doubleValue()); +} else { + throw new IAE("Unexpected value of type " + x.getClass().getName() + " encountered"); +} + } + + @Override + public synchronized Object get(final ByteBuffer buffer, final 
int position) + { +return sketches.get(buffer).get(position); Review comment: The get() on the buffer aggregator needs to return a snapshot copy of the sketch to avoid use-after-free issues (see recently updated javadocs on get() and https://github.com/apache/incubator-druid/pull/7464).
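Editorial sketch of the "return a snapshot copy from get()" idea in the comment above. It uses toy stand-in classes (`ToySketch`, `ToyAggregator`, both hypothetical) rather than `MergingDigest` or `BufferAggregator`, and how the real sketch would be copied is left open; the point is only that the caller receives an object detached from later mutation:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a mutable sketch; not the t-digest library class.
final class ToySketch
{
  private final List<Double> values = new ArrayList<>();

  void add(double v)
  {
    values.add(v);
  }

  int size()
  {
    return values.size();
  }

  // Copies the internal state so the result is independent of this instance.
  ToySketch snapshot()
  {
    ToySketch copy = new ToySketch();
    copy.values.addAll(values);
    return copy;
  }
}

// Hypothetical holder playing the role of the aggregator's per-position state.
final class ToyAggregator
{
  private final ToySketch live = new ToySketch();

  void aggregate(double v)
  {
    live.add(v);
  }

  // Returns a snapshot, never the live object that aggregate() keeps mutating.
  ToySketch get()
  {
    return live.snapshot();
  }
}
```

A value obtained from `get()` here keeps its contents even if the aggregator continues to aggregate, or is reused, afterwards.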
[GitHub] [incubator-druid] samarthjain commented on a change in pull request #7331: TDigest backed sketch aggregators
samarthjain commented on a change in pull request #7331: TDigest backed sketch aggregators URL: https://github.com/apache/incubator-druid/pull/7331#discussion_r282275813 ## File path: extensions-contrib/tdigestsketch/src/main/java/org/apache/druid/query/aggregation/tdigestsketch/TDigestBuildSketchBufferAggregator.java ## @@ -0,0 +1,125 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. 
+ */ + +package org.apache.druid.query.aggregation.tdigestsketch; + +import com.google.common.base.Preconditions; +import com.tdunning.math.stats.MergingDigest; +import it.unimi.dsi.fastutil.ints.Int2ObjectMap; +import it.unimi.dsi.fastutil.ints.Int2ObjectOpenHashMap; +import org.apache.druid.java.util.common.IAE; +import org.apache.druid.query.aggregation.BufferAggregator; +import org.apache.druid.segment.ColumnValueSelector; + +import javax.annotation.Nonnull; +import javax.annotation.concurrent.GuardedBy; +import java.nio.ByteBuffer; +import java.util.IdentityHashMap; +import java.util.Map; + +/** + * Aggregator that builds t-digest backed sketches using numeric values read from {@link ByteBuffer} + */ +public class TDigestBuildSketchBufferAggregator implements BufferAggregator +{ + + @Nonnull + private final ColumnValueSelector selector; + @Nonnull + private final int compression; + + @GuardedBy("this") + private Map> sketches = new IdentityHashMap<>(); + + public TDigestBuildSketchBufferAggregator( + final ColumnValueSelector valueSelector, + final Integer compression + ) + { +Preconditions.checkNotNull(valueSelector); +this.selector = valueSelector; +if (compression != null) { + this.compression = compression; +} else { + this.compression = TDigestBuildSketchAggregator.DEFAULT_COMPRESSION; +} + } + + @Override + public synchronized void init(ByteBuffer buffer, int position) Review comment: For clarity, when building an incremental index, are aggregators invoked? And is that BufferedAggregator or Aggregator. From your comments it sounds like we needn't worry about thread safety for BufferedAggregators but what about Aggregators? Looking at HistogramAggregator or HistogramBufferAggregator, I don't see any kind of synchronization. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
[GitHub] [incubator-druid] jihoonson commented on a change in pull request #7331: TDigest backed sketch aggregators
jihoonson commented on a change in pull request #7331: TDigest backed sketch aggregators URL: https://github.com/apache/incubator-druid/pull/7331#discussion_r282269647 ## File path: extensions-contrib/tdigestsketch/src/main/java/org/apache/druid/query/aggregation/tdigestsketch/TDigestBuildSketchBufferAggregator.java ## @@ -0,0 +1,125 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. 
+ */ + +package org.apache.druid.query.aggregation.tdigestsketch; + +import com.google.common.base.Preconditions; +import com.tdunning.math.stats.MergingDigest; +import it.unimi.dsi.fastutil.ints.Int2ObjectMap; +import it.unimi.dsi.fastutil.ints.Int2ObjectOpenHashMap; +import org.apache.druid.java.util.common.IAE; +import org.apache.druid.query.aggregation.BufferAggregator; +import org.apache.druid.segment.ColumnValueSelector; + +import javax.annotation.Nonnull; +import javax.annotation.concurrent.GuardedBy; +import java.nio.ByteBuffer; +import java.util.IdentityHashMap; +import java.util.Map; + +/** + * Aggregator that builds t-digest backed sketches using numeric values read from {@link ByteBuffer} + */ +public class TDigestBuildSketchBufferAggregator implements BufferAggregator +{ + + @Nonnull + private final ColumnValueSelector selector; + @Nonnull + private final int compression; + + @GuardedBy("this") + private Map> sketches = new IdentityHashMap<>(); + + public TDigestBuildSketchBufferAggregator( + final ColumnValueSelector valueSelector, + final Integer compression + ) + { +Preconditions.checkNotNull(valueSelector); +this.selector = valueSelector; +if (compression != null) { + this.compression = compression; +} else { + this.compression = TDigestBuildSketchAggregator.DEFAULT_COMPRESSION; +} + } + + @Override + public synchronized void init(ByteBuffer buffer, int position) Review comment: Yeah, it's lame that the doc is missing about what should be synchronized. I think DataSketches implementations are wrong. It doesn't have to be synchronized because concurrent reads and writes can happen only in incremental index. You would see other BufferAggregator implementations of druid-core or druid-extensions-core don't do it. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
[GitHub] [incubator-druid] jihoonson commented on a change in pull request #7331: TDigest backed sketch aggregators
jihoonson commented on a change in pull request #7331: TDigest backed sketch aggregators URL: https://github.com/apache/incubator-druid/pull/7331#discussion_r282269622 ## File path: extensions-contrib/tdigestsketch/src/main/java/org/apache/druid/query/aggregation/tdigestsketch/TDigestBuildSketchAggregatorFactory.java ## @@ -0,0 +1,269 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. 
+ */ + +package org.apache.druid.query.aggregation.tdigestsketch; + +import com.fasterxml.jackson.annotation.JsonCreator; +import com.fasterxml.jackson.annotation.JsonProperty; +import com.tdunning.math.stats.MergingDigest; +import com.tdunning.math.stats.TDigest; +import org.apache.druid.query.aggregation.Aggregator; +import org.apache.druid.query.aggregation.AggregatorFactory; +import org.apache.druid.query.aggregation.AggregatorFactoryNotMergeableException; +import org.apache.druid.query.aggregation.AggregatorUtil; +import org.apache.druid.query.aggregation.BufferAggregator; +import org.apache.druid.query.cache.CacheKeyBuilder; +import org.apache.druid.segment.ColumnSelectorFactory; +import org.apache.druid.segment.ColumnValueSelector; +import org.apache.druid.segment.column.ColumnCapabilities; +import org.apache.druid.segment.column.ValueType; + +import javax.annotation.Nonnull; +import javax.annotation.Nullable; +import java.util.Collections; +import java.util.Comparator; +import java.util.List; +import java.util.Objects; + +/** + * Aggregation operations over the tdigest-based quantile sketch + * available on https://github.com/tdunning/t-digest;>github and described + * in the paper + * https://github.com/tdunning/t-digest/blob/master/docs/t-digest-paper/histo.pdf;> + * Computing extremely accurate quantiles using t-digests. + * + * + * At the time of writing this implementation, there are two flavors of {@link TDigest} + * available - {@link MergingDigest} and {@link com.tdunning.math.stats.AVLTreeDigest}. + * This implementation uses {@link MergingDigest} since it is more suited for the cases + * when we have to merge intermediate aggregations which Druid needs to do as + * part of query processing. 
+ */ +public class TDigestBuildSketchAggregatorFactory extends AggregatorFactory +{ + + // Default compression + public static final int DEFAULT_COMRESSION = 100; + + @Nonnull + private final String name; + @Nonnull + private final String fieldName; + @Nonnull + final Integer compression; + @Nonnull + private final byte cacheTypeId; + + public static final String TYPE_NAME = "buildTDigestSketch"; + + @JsonCreator + public TDigestBuildSketchAggregatorFactory( + @JsonProperty("name") final String name, + @JsonProperty("fieldName") final String fieldName, + @Nullable @JsonProperty("compression") final Integer compression + ) + { +this(name, fieldName, compression, AggregatorUtil.TDIGEST_BUILD_SKETCH_CACHE_TYPE_ID); + } + + TDigestBuildSketchAggregatorFactory( + final String name, + final String fieldName, + @Nullable final Integer compression, + final byte cacheTypeId + ) + { +Objects.requireNonNull(name, "Must have a valid, non-null aggregator name"); +this.name = name; +Objects.requireNonNull(fieldName, "Parameter fieldName must be specified"); +this.fieldName = fieldName; +this.compression = compression == null ? DEFAULT_COMRESSION : compression; +this.cacheTypeId = cacheTypeId; + } + + + @Override + public byte[] getCacheKey() + { +return new CacheKeyBuilder( +cacheTypeId +).appendString(fieldName).appendInt(compression).build(); + } + + + @Override + public Aggregator factorize(ColumnSelectorFactory metricFactory) + { +ColumnCapabilities cap = metricFactory.getColumnCapabilities(fieldName); +if (cap == null || ValueType.isNumeric(cap.getType())) { + final ColumnValueSelector selector = metricFactory.makeColumnValueSelector(fieldName); + return new TDigestBuildSketchAggregator(selector, compression); +} else { + final ColumnValueSelector selector = metricFactory.makeColumnValueSelector(fieldName); + return new TDigestMergeSketchAggregator(selector, compression); +} + } + + @Override
[GitHub] [incubator-druid] nosahama commented on issue #2523: Support multiple lookups within one namespace
nosahama commented on issue #2523: Support multiple lookups within one namespace URL: https://github.com/apache/incubator-druid/issues/2523#issuecomment-490667638 > As requested I am sharing our use case. We're using a TSV in S3 for a namespace lookup (at least to start with, we will probably switch over to a JDBC source eventually). We have a single key column, which always corresponds to the same actual dimension in Druid. We have a dozen lookup columns (could grow by a handful, but I'd think no more than 20). And we're starting pretty small now with only about 100K rows, but expect that could grow to several million rows before too long. > > We don't need this updated really frequently. Actually we're still working out our ETLs and so forth to deal with revisions and additions to the lookup data. But I wouldn't expect us to have updates more frequently than hourly, and probably more like daily. > > As far as pain points with this arrangement - there is sure plenty of boilerplate in the config. I have an array of a dozen entries in `druid.query.extraction.namespace.lookups` that are identical in all fields except for `namespace` and `valueColumn`. A bit clunky but not so much that I'd complain about it really - I did write a couple of simple scripts that generate the stuff to be placed in config. > > I'm more concerned about the overhead when we do update the lookup source. Druid will have to load and parse this (potentially sized) 20 x 3M TSV once per lookup. I haven't done any benchmarking but I have noticed that it can take on the order of 15 seconds to completely load our current 12 x 100K case. Even if it takes a few minutes that is not a gamebreaker (assuming it does not interfere with query performance or produce inconsistent results while in progress). But it certainly seems like it could be a lot more efficient to load and parse the file once instead of 12 or 20 times. 
> > Overall, the configuration and use feels a bit clunky, I think because from the user point of view, we have just one "lookup namespace" - there is a single source, and a single key column. It would feel more natural to define the data source level properties (uri, format, columns) and key column once, along with a list of allowed targetColumns, then use it in dimension specs and filters by referencing just the one single namespace plus a targetColumn. It might start to look like an ingestion spec at that point, with dataSchema- and ioConfig-like sections. > > But honestly I don't know how much of a priority I'd want it to be. Associating a single namespace with a single key column and multiple value columns might well be overfitting to our specific case, and it's certainly quite usable as it stands. > > (One side note, the ability to include columns in the CSV which are not key or value columns is useful for assembling the data manually - we can include "friendly name" sort of columns that are helpful to people who are filling in or auditing the actual lookup data.) Hi there, I am trying to configure Druid to load a lookup file from S3. How do I do this? Do I use the `file:/` syntax, or is there another syntax for loading lookups from S3?
[GitHub] [incubator-druid] jon-wei commented on a change in pull request #7614: Fix exception when using complex aggs with result level caching
jon-wei commented on a change in pull request #7614: Fix exception when using complex aggs with result level caching URL: https://github.com/apache/incubator-druid/pull/7614#discussion_r282239041 ## File path: processing/src/main/java/org/apache/druid/query/groupby/GroupByQueryQueryToolChest.java ## @@ -566,12 +566,24 @@ public Row apply(Object input) DimensionHandlerUtils.convertObjectToType(results.next(), dimensionSpec.getOutputType()) ); } - Iterator aggsIter = aggs.iterator(); + +// When using the result level cache, the agg values seen here are Review comment: Extracted to a helper method in `CacheStrategy` (`CacheUtil` is in `server` which is not a dependency of `processing`).
[GitHub] [incubator-druid] himanshug merged pull request #7183: add postgresql meta db table schema configuration property (#7137)
himanshug merged pull request #7183: add postgresql meta db table schema configuration property (#7137) URL: https://github.com/apache/incubator-druid/pull/7183
[incubator-druid] branch master updated: add postgresql meta db table schema configuration property (#7137) (#7183)
This is an automated email from the ASF dual-hosted git repository. himanshug pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/incubator-druid.git The following commit(s) were added to refs/heads/master by this push: new 0ef435a add postgresql meta db table schema configuration property (#7137) (#7183) 0ef435a is described below commit 0ef435a16c511181bb97f61b230eeadf50d63535 Author: Jinseon Lee AuthorDate: Thu May 9 04:56:30 2019 +0900 add postgresql meta db table schema configuration property (#7137) (#7183) * add postgresql meta db table schema configuration property (#7137) If the postgresql db schema changes, you must set the configuration values. You do not need to set it if there is no change from the default schema 'public'. druid.metadata.postgres.dbTableSchema=public * create postgresql metadb table schema configuration property (#7137) If the postgresql db schema changes, you must set the configuration values. You do not need to set it if there is no change from the default schema 'public'. druid.metadata.postgres.dbTableSchema=public check PostgreSQLTablesConfig.java * modify postgresql readme file. - metadb table schema (#7137) If the postgresql db schema changes, you must set the configuration values. You do not need to set it if there is no change from the default schema 'public'. 
druid.metadata.postgres.dbTableSchema=public check PostgreSQLTablesConfig.java --- .../development/extensions-core/postgresql.md | 2 ++ .../storage/postgresql/PostgreSQLConnector.java| 11 -- .../PostgreSQLMetadataStorageModule.java | 1 + .../storage/postgresql/PostgreSQLTablesConfig.java | 41 ++ .../postgresql/PostgreSQLConnectorTest.java| 3 +- 5 files changed, 54 insertions(+), 4 deletions(-) diff --git a/docs/content/development/extensions-core/postgresql.md b/docs/content/development/extensions-core/postgresql.md index 07a2a78..26f77fc 100644 --- a/docs/content/development/extensions-core/postgresql.md +++ b/docs/content/development/extensions-core/postgresql.md @@ -83,3 +83,5 @@ In most cases, the configuration options map directly to the [postgres jdbc conn | `druid.metadata.postgres.ssl.sslRootCert` | The full path to the root certificate. | none | no | | `druid.metadata.postgres.ssl.sslHostNameVerifier` | The classname of the hostname verifier. | none | no | | `druid.metadata.postgres.ssl.sslPasswordCallback` | The classname of the SSL password provider. 
| none | no | +| `druid.metadata.postgres.dbTableSchema` | druid meta table schema | `public` | no | + diff --git a/extensions-core/postgresql-metadata-storage/src/main/java/org/apache/druid/metadata/storage/postgresql/PostgreSQLConnector.java b/extensions-core/postgresql-metadata-storage/src/main/java/org/apache/druid/metadata/storage/postgresql/PostgreSQLConnector.java index e234a15..a474a0b 100644 --- a/extensions-core/postgresql-metadata-storage/src/main/java/org/apache/druid/metadata/storage/postgresql/PostgreSQLConnector.java +++ b/extensions-core/postgresql-metadata-storage/src/main/java/org/apache/druid/metadata/storage/postgresql/PostgreSQLConnector.java @@ -48,11 +48,14 @@ public class PostgreSQLConnector extends SQLMetadataConnector private volatile Boolean canUpsert; + private final String dbTableSchema; + @Inject public PostgreSQLConnector( Supplier config, Supplier dbTables, - PostgreSQLConnectorConfig connectorConfig + PostgreSQLConnectorConfig connectorConfig, + PostgreSQLTablesConfig tablesConfig ) { super(config, dbTables); @@ -104,7 +107,8 @@ public class PostgreSQLConnector extends SQLMetadataConnector } this.dbi = new DBI(datasource); - +this.dbTableSchema = tablesConfig.getDbTableSchema(); + log.info("Configured PostgreSQL as metadata storage"); } @@ -146,8 +150,9 @@ public class PostgreSQLConnector extends SQLMetadataConnector public boolean tableExists(final Handle handle, final String tableName) { return !handle.createQuery( -"SELECT tablename FROM pg_catalog.pg_tables WHERE schemaname = 'public' AND tablename ILIKE :tableName" +"SELECT tablename FROM pg_catalog.pg_tables WHERE schemaname = :dbTableSchema AND tablename ILIKE :tableName" ) + .bind("dbTableSchema", dbTableSchema) .bind("tableName", tableName) .map(StringMapper.FIRST) .list() diff --git a/extensions-core/postgresql-metadata-storage/src/main/java/org/apache/druid/metadata/storage/postgresql/PostgreSQLMetadataStorageModule.java 
b/extensions-core/postgresql-metadata-storage/src/main/java/org/apache/druid/metadata/storage/postgresql/PostgreSQLMetadataStorageModule.java index f10de65..9506edd 100644 ---
[GitHub] [incubator-druid] himanshug commented on issue #7618: Virtual column updates for exploiting base column internal structure
himanshug commented on issue #7618: Virtual column updates for exploiting base column internal structure URL: https://github.com/apache/incubator-druid/pull/7618#issuecomment-490617207 looks like there is test failure, will update. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org For additional commands, e-mail: commits-h...@druid.apache.org
[GitHub] [incubator-druid] himanshug commented on a change in pull request #7606: Set direct memory if unable to detect JVM config
himanshug commented on a change in pull request #7606: Set direct memory if unable to detect JVM config URL: https://github.com/apache/incubator-druid/pull/7606#discussion_r282209983 ## File path: server/src/main/java/org/apache/druid/guice/DruidProcessingModule.java ## @@ -157,9 +157,16 @@ private void verifyDirectMemory(DruidProcessingConfig config) } } catch (UnsupportedOperationException e) { + log.debug("Checking for direct memory size is not support on this platform: %s", e); log.info( - "Could not verify that you have enough direct memory, so I hope you do! Error message was: %s", - e.getMessage() + "Unable to determine max direct memory size. If -XX:MaxDirectMemorySize is set, make sure " Review comment: if above change is made then this should be something like... "Unable to determine max direct memory size. If druid.processing.buffer.sizeBytes is explicitly set then -XX:MaxDirectMemorySize to at least 'druid.processing.buffer.sizeBytes * (druid.processing.numMergeBuffers[%,d] + druid.processing.numThreads[%,d] + 1)' or else to at least 25% of maximum heap size." This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org For additional commands, e-mail: commits-h...@druid.apache.org
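The sizing rule quoted in the review comment above can be worked through as a small calculation. This is only a sketch of the arithmetic in that proposed log message; the class name, method name and example values are hypothetical and are not taken from the Druid code base.

```java
final class DirectMemoryEstimate
{
  /**
   * Lower bound suggested by the proposed log message:
   * sizeBytes * (numMergeBuffers + numThreads + 1).
   * Math.multiplyExact throws ArithmeticException on long overflow.
   */
  static long requiredDirectMemoryBytes(long bufferSizeBytes, int numMergeBuffers, int numThreads)
  {
    long totalBuffers = (long) numMergeBuffers + numThreads + 1;
    return Math.multiplyExact(bufferSizeBytes, totalBuffers);
  }

  public static void main(String[] args)
  {
    // Hypothetical settings: 512 MiB buffers, 2 merge buffers, 7 processing threads.
    long sizeBytes = 512L * 1024 * 1024;
    long required = requiredDirectMemoryBytes(sizeBytes, 2, 7);
    System.out.println("-XX:MaxDirectMemorySize should be at least " + required + " bytes");
  }
}
```

With these assumed values the bound is ten buffers of 512 MiB each, i.e. 5 GiB of direct memory.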
[GitHub] [incubator-druid] himanshug commented on a change in pull request #7606: Set direct memory if unable to detect JVM config
himanshug commented on a change in pull request #7606: Set direct memory if unable to detect JVM config URL: https://github.com/apache/incubator-druid/pull/7606#discussion_r282205588 ## File path: processing/src/main/java/org/apache/druid/query/DruidProcessingConfig.java ## @@ -52,7 +52,22 @@ public int intermediateComputeSizeBytes() return computedBufferSizeBytes.get(); } -long directSizeBytes = JvmUtils.getRuntimeInfo().getDirectMemorySizeBytes(); +long directSizeBytes; +try { + directSizeBytes = JvmUtils.getRuntimeInfo().getDirectMemorySizeBytes(); Review comment: so this always fails for jdk 9 onwards ? if yes, that means if user wanted buffers of size DEFAULT_PROCESSING_BUFFER_SIZE_BYTES , they would configure that number but instead get buffers of size 25%-of-max-heap/totalNumBuffers ? in that case we need to change line 49 to have a different way of detecting whether user explicitly provided sizeBytes config . This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org For additional commands, e-mail: commits-h...@druid.apache.org
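The fallback behaviour questioned in the review comment above can be illustrated roughly as follows. This is a sketch under stated assumptions, not the code from the pull request: the `directMemoryProbe` parameter stands in for `JvmUtils.getRuntimeInfo().getDirectMemorySizeBytes()`, and the class and method names are hypothetical. It only shows why a user could end up with buffers of size 25%-of-max-heap/totalNumBuffers when the probe is unsupported.

```java
import java.util.function.LongSupplier;

final class BufferSizeFallback
{
  /**
   * Returns the per-buffer size. If the direct memory probe is unsupported,
   * 25% of the maximum heap is assumed as the direct memory budget instead.
   */
  static long bufferSizeBytes(LongSupplier directMemoryProbe, long maxHeapBytes, int totalNumBuffers)
  {
    long directSizeBytes;
    try {
      directSizeBytes = directMemoryProbe.getAsLong();
    }
    catch (UnsupportedOperationException e) {
      // Assumed fallback, as described in the review comment: 25% of max heap.
      directSizeBytes = maxHeapBytes / 4;
    }
    return directSizeBytes / totalNumBuffers;
  }

  public static void main(String[] args)
  {
    LongSupplier unsupported = () -> {
      throw new UnsupportedOperationException("direct memory size not available");
    };
    long perBuffer = bufferSizeBytes(unsupported, Runtime.getRuntime().maxMemory(), 10);
    System.out.println("Per-buffer size with fallback: " + perBuffer + " bytes");
  }
}
```

The point raised in the review is that this computed value would silently replace a size the user configured explicitly unless the explicit setting is detected separately.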
[GitHub] [incubator-druid] leerho commented on issue #7607: thetaSketch(with sketches-core-0.13.1) in groupBy always return value no more than 16384
leerho commented on issue #7607: thetaSketch(with sketches-core-0.13.1) in groupBy always return value no more than 16384 URL: https://github.com/apache/incubator-druid/issues/7607#issuecomment-490609093 Thank you!! We have been able to reproduce the problem. Now I can dig in to see what went wrong.
[GitHub] [incubator-druid] himanshug opened a new pull request #7618: Virtual column updates for exploiting base column internal structure
himanshug opened a new pull request #7618: Virtual column updates for exploiting base column internal structure URL: https://github.com/apache/incubator-druid/pull/7618 Fixes #7574 This patch adds more methods to the VirtualColumn interface to exploit the base column's internal structure, plus a few updates to the interface's users so that they call those methods. Unit tests are introduced to ensure the expected use of the VirtualColumn interface in Druid code.
[GitHub] [incubator-druid] jihoonson commented on a change in pull request #7611: Add plain text README.txt, use relative link from README.md to build.md
jihoonson commented on a change in pull request #7611: Add plain text README.txt, use relative link from README.md to build.md URL: https://github.com/apache/incubator-druid/pull/7611#discussion_r282174091 ## File path: distribution/src/assembly/source-assembly.xml ## @@ -47,6 +47,7 @@ .gitignore .dockerignore .travis.yml +README.md Review comment: Should we exclude this?
[GitHub] [incubator-druid] jihoonson commented on a change in pull request #7611: Add plain text README.txt, use relative link from README.md to build.md
jihoonson commented on a change in pull request #7611: Add plain text README.txt, use relative link from README.md to build.md URL: https://github.com/apache/incubator-druid/pull/7611#discussion_r282172376 ## File path: README ## @@ -0,0 +1,89 @@ +Licensed to the Apache Software Foundation (ASF) under one +or more contributor license agreements. See the NOTICE file +distributed with this work for additional information +regarding copyright ownership. The ASF licenses this file +to you under the Apache License, Version 2.0 (the +"License"); you may not use this file except in compliance +with the License. You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + +Unless required by applicable law or agreed to in writing, +software distributed under the License is distributed on an +"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +KIND, either express or implied. See the License for the +specific language governing permissions and limitations +under the License. + + +Apache Druid (incubating) is a high performance analytics data store for event-driven data. More information about Druid +can be found on http://www.druid.io. + +The Druid community is in the process of migrating to Apache by way of the Apache Incubator. Eventually, as we proceed +along this path, our site will move from http://druid.io/ to https://druid.apache.org/. + + +Documentation +- Review comment: nit: would it look esthetically better if the bar has the same length with the title? Same question for other sections. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org For additional commands, e-mail: commits-h...@druid.apache.org
[GitHub] [incubator-druid] gianm commented on issue #7607: thetaSketch(with sketches-core-0.13.1) in groupBy always return value no more than 16384
gianm commented on issue #7607: thetaSketch(with sketches-core-0.13.1) in groupBy always return value no more than 16384 URL: https://github.com/apache/incubator-druid/issues/7607#issuecomment-490573859

@leerho, please let me know if the following is helpful, or if I could do anything else to help.

What the Druid query is doing is something like this:

1. Iterating over all rows in a Druid segment, and building up a theta sketch object. This object looks fine.
2. Taking that object and merging it into the 'merge buffer', which starts off initialized to an empty sketch. This is where it goes off the rails.

I scattered a bunch of sketch toStrings around the code and found that in step (2) they look like this:

**The object built up from the segment scan,**

```
### HeapCompactOrderedSketch SUMMARY:
Estimate: 100086.81001356241
Upper Bound, 95% conf : 101530.78009013624
Lower Bound, 95% conf : 98663.31883662633
Theta (double) : 0.16369789383615946
Theta (long): 1509846576500454824
Theta (long) hex: 14f40d7639a635a8
EstMode?: true
Empty? : false
Array Size Entries : 16384
Retained Entries: 16384
Seed Hash : 93cc | 37836
### END SKETCH SUMMARY
```

**The initial state of the sketch in the merge buffer (should be empty),**

```
### HeapCompactOrderedSketch SUMMARY:
Estimate: 0.0
Upper Bound, 95% conf : 0.0
Lower Bound, 95% conf : 0.0
Theta (double) : 1.0
Theta (long): 9223372036854775807
Theta (long) hex: 7fff
EstMode?: false
Empty? : true
Array Size Entries : 0
Retained Entries: 0
Seed Hash : 93cc | 37836
### END SKETCH SUMMARY
```

**The final state of the sketch in the merge buffer (should match the original sketch from the segment scan),**

```
### HeapCompactOrderedSketch SUMMARY:
Estimate: 16384.0
Upper Bound, 95% conf : 16384.0
Lower Bound, 95% conf : 16384.0
Theta (double) : 1.0
Theta (long): 9223372036854775807
Theta (long) hex: 7fff
EstMode?: false
Empty? : false
Array Size Entries : 16384
Retained Entries: 16384
Seed Hash : 93cc | 37836
### END SKETCH SUMMARY
```

It's changed a bit, but doesn't match up. The code that printed this was the `aggregate` method in SketchBufferAggregator, which looks like this after the debugging code I added:

```java
@Override
public void aggregate(ByteBuffer buf, int position)
{
  Object update = selector.getObject();
  if (update == null) {
    return;
  }

  Union union = getOrCreateUnion(buf, position);
  final String initialUnionResult = update instanceof SketchHolder ? union.getResult().toString() : null;
  SketchAggregator.updateUnion(union, update);
  if (update instanceof SketchHolder) {
    log.info(
        "Aggregate called with buffer[%s], position[%s], update = %s, union starts as = %s, union ends as = %s",
        System.identityHashCode(buf),
        position,
        ((SketchHolder) update).getSketch(),
        initialUnionResult,
        union.getResult()
    );
  }
}
```
[GitHub] [incubator-druid] leerho commented on issue #7607: thetaSketch(with sketches-core-0.13.1) in groupBy always return value no more than 16384
leerho commented on issue #7607: thetaSketch(with sketches-core-0.13.1) in groupBy always return value no more than 16384 URL: https://github.com/apache/incubator-druid/issues/7607#issuecomment-490570278 Very puzzling. We need to simplify the problem environment to where I can reproduce the problem outside Druid. I suspect that somehow theta is being reset to 1.0, which would cause this.
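The suspicion above is consistent with the numbers in the sketch summaries quoted earlier in the thread: in estimation mode a theta sketch's estimate is its retained entry count divided by theta. The sketch below only reproduces that arithmetic with the posted figures. The class and method names are hypothetical and it does not use the sketches-core API.

```java
final class ThetaEstimate
{
  /** Estimate of distinct count: retained entries scaled up by 1/theta. */
  static double estimate(int retainedEntries, double theta)
  {
    return retainedEntries / theta;
  }

  public static void main(String[] args)
  {
    int retained = 16384;
    // Theta as reported for the sketch built from the segment scan.
    double scanned = estimate(retained, 0.16369789383615946);
    // Theta as reported for the merged sketch in the merge buffer.
    double merged = estimate(retained, 1.0);
    System.out.println("estimate with original theta: " + scanned);
    System.out.println("estimate with theta = 1.0:    " + merged);
  }
}
```

With theta at 1.0 the estimate collapses to the retained entry count, which matches the 16384 ceiling reported in the issue title.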
[GitHub] [incubator-druid] leventov commented on a change in pull request #7595: Optimize overshadowed segments computation
leventov commented on a change in pull request #7595: Optimize overshadowed segments computation URL: https://github.com/apache/incubator-druid/pull/7595#discussion_r282157494 ## File path: server/src/main/java/org/apache/druid/metadata/SQLMetadataSegmentManager.java ## @@ -744,6 +757,32 @@ public DataSegment map(int index, ResultSet r, StatementContext ctx) throws SQLE // Replace "dataSources" atomically. dataSources = newDataSources; +overshadowedSegments = ImmutableSet.copyOf(determineOvershadowedSegments(segments)); Review comment: There are some changes in this class to `dataSources` apart from in `doPoll()` that may make computed overshadowed segments invalid. (Even if they don't, there should be comments about that.)
[GitHub] [incubator-druid] leventov commented on a change in pull request #7595: Optimize overshadowed segments computation
leventov commented on a change in pull request #7595: Optimize overshadowed segments computation URL: https://github.com/apache/incubator-druid/pull/7595#discussion_r282159218 ## File path: server/src/main/java/org/apache/druid/server/coordinator/helper/DruidCoordinatorRuleRunner.java ## @@ -84,8 +85,10 @@ public DruidCoordinatorRuntimeParams run(DruidCoordinatorRuntimeParams params) // find available segments which are not overshadowed by other segments in DB // only those would need to be loaded/dropped // anything overshadowed by served segments is dropped automatically by DruidCoordinatorCleanupOvershadowed -final Set overshadowed = ImmutableDruidDataSource -.determineOvershadowedSegments(params.getAvailableSegments()); +// If metadata store hasn't been polled yet, use empty overshadowed list +final Collection overshadowed = Optional + .ofNullable(coordinator.getMetadataSegmentManager().findOvershadowedSegments()) Review comment: There are some concurrency issues: you may observe `overshadowedSegments` from a previous run. I think it would be easier to add `isOvershadowed` field to `DataSegment` directly in this PR because that's the plan anyway and that would allow avoiding concurrency issues. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org For additional commands, e-mail: commits-h...@druid.apache.org
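The concurrency concern above arises when two related fields are published separately, so a reader can pair fresh segments with a stale overshadowed set. The reviewer's suggestion is an `isOvershadowed` field on `DataSegment`. The sketch below illustrates a different, generic remedy (a single immutable snapshot behind one volatile reference) purely to show the problem being avoided. All names are hypothetical and segments are reduced to string ids.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

final class SegmentSnapshotHolder
{
  /** Immutable pair of values that must always be observed together. */
  static final class Snapshot
  {
    final Set<String> segmentIds;
    final Set<String> overshadowedIds;

    Snapshot(Set<String> segmentIds, Set<String> overshadowedIds)
    {
      // Defensive copies so later changes by the caller cannot leak in.
      this.segmentIds = Collections.unmodifiableSet(new HashSet<>(segmentIds));
      this.overshadowedIds = Collections.unmodifiableSet(new HashSet<>(overshadowedIds));
    }
  }

  // Null until the first poll has completed.
  private volatile Snapshot current;

  /** Called by the polling thread: both sets become visible in one write. */
  void publish(Set<String> segmentIds, Set<String> overshadowedIds)
  {
    current = new Snapshot(segmentIds, overshadowedIds);
  }

  /** Readers get a consistent pair, or null if nothing was polled yet. */
  Snapshot get()
  {
    return current;
  }

  public static void main(String[] args)
  {
    SegmentSnapshotHolder holder = new SegmentSnapshotHolder();
    Set<String> all = new HashSet<>();
    all.add("seg_v1");
    all.add("seg_v2");
    Set<String> overshadowed = new HashSet<>();
    overshadowed.add("seg_v1");
    holder.publish(all, overshadowed);
    Snapshot s = holder.get();
    System.out.println(s.segmentIds.size() + " segments, " + s.overshadowedIds.size() + " overshadowed");
  }
}
```

A reader that calls `get()` once and works from the returned object never sees segments from one poll combined with overshadowed status from another.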
[GitHub] [incubator-druid] leventov commented on a change in pull request #7595: Optimize overshadowed segments computation
leventov commented on a change in pull request #7595: Optimize overshadowed segments computation URL: https://github.com/apache/incubator-druid/pull/7595#discussion_r282156967 ## File path: core/src/main/java/org/apache/druid/timeline/DataSegmentWithOvershadowedStatus.java ## @@ -25,16 +25,16 @@ /** * DataSegment object plus the overshadowed status for the segment. An immutable object. * - * SegmentWithOvershadowedStatus's {@link #compareTo} method considers only the {@link SegmentId} + * DataSegmentWithOvershadowedStatus's {@link #compareTo} method considers only the {@link SegmentId} * of the DataSegment object. */ -public class SegmentWithOvershadowedStatus implements Comparable +public class DataSegmentWithOvershadowedStatus implements Comparable Review comment: I don't think this is a good rename. I think ["DataSegment" is an unfortunate name](https://github.com/apache/incubator-druid/issues/7396), so why propagate it?
[GitHub] [incubator-druid] leventov commented on a change in pull request #7595: Optimize overshadowed segments computation
leventov commented on a change in pull request #7595: Optimize overshadowed segments computation URL: https://github.com/apache/incubator-druid/pull/7595#discussion_r282158038 ## File path: server/src/main/java/org/apache/druid/metadata/SQLMetadataSegmentManager.java ## @@ -744,6 +757,32 @@ public DataSegment map(int index, ResultSet r, StatementContext ctx) throws SQLE // Replace "dataSources" atomically. dataSources = newDataSources; +overshadowedSegments = ImmutableSet.copyOf(determineOvershadowedSegments(segments)); + } + + /** + * This method builds a timeline from given segments and finds the overshadowed segments + * + * @return set of overshadowed segments + */ + private Set determineOvershadowedSegments(Iterable segments) Review comment: The return type doesn't need to be a Set; you copy the result into another Set anyway.
[GitHub] [incubator-druid] leventov commented on a change in pull request #7595: Optimize overshadowed segments computation
leventov commented on a change in pull request #7595: Optimize overshadowed segments computation URL: https://github.com/apache/incubator-druid/pull/7595#discussion_r282159420 ## File path: server/src/main/java/org/apache/druid/metadata/MetadataSegmentManager.java ## @@ -98,6 +98,15 @@ Collection getAllDataSourceNames(); + /** + * Returns a collection of overshadowed segments + * + * Will return null if we do not have a valid snapshot of segments yet (perhaps the underlying metadata store has + * not yet been polled.) + */ + @Nullable + Collection findOvershadowedSegments(); Review comment: A method which is a simple getter shouldn't be called `find...`, implying high cost. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org For additional commands, e-mail: commits-h...@druid.apache.org
[GitHub] [incubator-druid] drcrallen commented on issue #6948: Add guava compatability up to 27.0.1
drcrallen commented on issue #6948: Add guava compatability up to 27.0.1 URL: https://github.com/apache/incubator-druid/pull/6948#issuecomment-490566931 go away stale bot!
[GitHub] [incubator-druid] leventov commented on issue #6358: Interning in SQLMetadataSegmentManager may obliterate new segment metadata
leventov commented on issue #6358: Interning in SQLMetadataSegmentManager may obliterate new segment metadata URL: https://github.com/apache/incubator-druid/issues/6358#issuecomment-490553573 @surekhasaharan I meant to add it as an error. So every time a developer wants to have `Set` in their code, they must put `@SuppressWarnings` (with justification) to pass CI. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org For additional commands, e-mail: commits-h...@druid.apache.org
[GitHub] [incubator-druid] gianm commented on issue #7607: thetaSketch(with sketches-core-0.13.1) in groupBy always return value no more than 16384
gianm commented on issue #7607: thetaSketch(with sketches-core-0.13.1) in groupBy always return value no more than 16384 URL: https://github.com/apache/incubator-druid/issues/7607#issuecomment-490552474 Thank you, @pzhdfy, for the detailed instructions on how to reproduce this problem.
[GitHub] [incubator-druid] gianm commented on issue #7607: thetaSketch(with sketches-core-0.13.1) in groupBy always return value no more than 16384
gianm commented on issue #7607: thetaSketch(with sketches-core-0.13.1) in groupBy always return value no more than 16384 URL: https://github.com/apache/incubator-druid/issues/7607#issuecomment-490552264

I was able to reproduce this as well. Downgrading to sketches-core-0.13.0 fixed the problem. I also noticed that adding a limit to the groupBy fixed it as well. I'm not sure why - it does change the code paths, however.

In Druid SQL, this query exhibits the issue:

```sql
SELECT 'beep', APPROX_COUNT_DISTINCT_DS_THETA("user_id")
FROM test_theta
GROUP BY 1
```

And this one doesn't:

```sql
SELECT 'beep', APPROX_COUNT_DISTINCT_DS_THETA("user_id")
FROM test_theta
GROUP BY 1
LIMIT 1
```

(The 'beep' is to force the SQL planner to use a groupBy rather than timeseries query type.)
[GitHub] [incubator-druid] gianm commented on a change in pull request #7614: Fix exception when using complex aggs with result level caching
gianm commented on a change in pull request #7614: Fix exception when using complex aggs with result level caching URL: https://github.com/apache/incubator-druid/pull/7614#discussion_r282136139 ## File path: processing/src/main/java/org/apache/druid/query/groupby/GroupByQueryQueryToolChest.java ## @@ -566,12 +566,24 @@ public Row apply(Object input) DimensionHandlerUtils.convertObjectToType(results.next(), dimensionSpec.getOutputType()) ); } - Iterator aggsIter = aggs.iterator(); + +// When using the result level cache, the agg values seen here are Review comment: There's a long comment and identical block in each toolchest -- perhaps extract to a helper method in `CacheUtil`? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org For additional commands, e-mail: commits-h...@druid.apache.org
[GitHub] [incubator-druid] leerho commented on issue #7607: thetaSketch(with sketches-core-0.13.1) in groupBy always return value no more than 16384
leerho commented on issue #7607: thetaSketch(with sketches-core-0.13.1) in groupBy always return value no more than 16384 URL: https://github.com/apache/incubator-druid/issues/7607#issuecomment-490541495 @pzhdfy 1) Please try sketches release 0.13.0. This will narrow down the possible changes that might be causing this. 2) At the point where the sketch is reporting the bad result print out the sketch summary using toString(). This might provide some clues. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org For additional commands, e-mail: commits-h...@druid.apache.org
[GitHub] [incubator-druid] leerho edited a comment on issue #7607: thetaSketch(with sketches-core-0.13.1) in groupBy always return value no more than 16384
leerho edited a comment on issue #7607: thetaSketch(with sketches-core-0.13.1) in groupBy always return value no more than 16384 URL: https://github.com/apache/incubator-druid/issues/7607#issuecomment-490541495 @pzhdfy 1) Please try sketches release 0.13.0. This will narrow down the possible changes that might be causing this. 2) At the points where the sketch is reporting the good result and the bad result print out the sketch summary using toString(). This might provide some clues. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org For additional commands, e-mail: commits-h...@druid.apache.org
[GitHub] [incubator-druid] gianm commented on issue #7607: thetaSketch(with sketches-core-0.13.1) in groupBy always return value no more than 16384
gianm commented on issue #7607: thetaSketch(with sketches-core-0.13.1) in groupBy always return value no more than 16384 URL: https://github.com/apache/incubator-druid/issues/7607#issuecomment-490541004 In groupBy vs topN, as far as aggregators are concerned, one major difference is that groupBy uses `relocate` and topN does not. However, since you have just one `date_id`, I don't think it's likely that `relocate` would be called. So that's probably not it.
[GitHub] [incubator-druid] gianm commented on issue #7617: Segments/Tables getting dropped unwantedly
gianm commented on issue #7617: Segments/Tables getting dropped unwantedly URL: https://github.com/apache/incubator-druid/issues/7617#issuecomment-490530879 @Shailesh-Pandey do you see a full stack trace somewhere, perhaps in the broker logs?
[GitHub] [incubator-druid] Shailesh-Pandey opened a new issue #7617: Segments/Tables getting dropped unwantedly
Shailesh-Pandey opened a new issue #7617: Segments/Tables getting dropped unwantedly URL: https://github.com/apache/incubator-druid/issues/7617 druid version:0.13.0 < dsql> select * from tablename; java.lang.RuntimeException: Error while applying rule DruidTableScanRule, args [rel#16916567:LogicalTableScan.NONE.[](table=[druid, tablename])] > On the Druid overlord console the dropped segments are not visible, but a "\d" in dsql still shows the table names. Data ingestion into one of the dropped tables was happening almost every second (a lot of data), while into another it was only roughly hourly. All indexes are realtime. Any idea what the issue could be, or what the initial approach to troubleshooting should be?
[GitHub] [incubator-druid] doctording opened a new issue #7616: jvm fatal error of index_realtime task on middleManager
doctording opened a new issue #7616: jvm fatal error of index_realtime task on middleManager URL: https://github.com/apache/incubator-druid/issues/7616

Druid 0.12.3 on task log file(after smoosh file created, before copy to deep storage)

```java
io.druid.java.util.common.io.smoosh.FileSmoosher - Created smoosh file ... 0.smoosh] of size [2049254] bytes.
#
# A fatal error has been detected by the Java Runtime Environment:
#
# SIGSEGV (0xb) at pc=0x, pid=23022, tid=0x7f56ece41700
#
# JRE version: Java(TM) SE Runtime Environment (8.0_171-b11) (build 1.8.0_171-b11)
# Java VM: Java HotSpot(TM) 64-Bit Server VM (25.171-b11 mixed mode linux-amd64 compressed oops)
# Problematic frame:
# C 0x
#
# Failed to write core dump. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
#
# An error report file with more information is saved as:
# /usr/local/imply-2.7.8/hs_err_pid23022.log
[thread 140011328812800 also had an error]
[thread 140011325654784 also had an error]
[thread 140011323549440 also had an error][thread 140011317233408 also had an error]
[thread 140011324602112 also had an error]
[thread 140011326707456 also had an error]
[thread 140011321444096 also had an error]
[thread 140011319338752 also had an error]
#
# If you would like to submit a bug report, please visit:
# http://bugreport.java.com/bugreport/crash.jsp
#
```
[GitHub] [incubator-druid] doctording removed a comment on issue #7032: KIS task crashes with JVM segfault
doctording removed a comment on issue #7032: KIS task crashes with JVM segfault URL: https://github.com/apache/incubator-druid/issues/7032#issuecomment-490459144 +1 druid 0.12.3
[GitHub] [incubator-druid] doctording commented on issue #7032: KIS task crashes with JVM segfault
doctording commented on issue #7032: KIS task crashes with JVM segfault URL: https://github.com/apache/incubator-druid/issues/7032#issuecomment-490459144 +1 druid 0.12.3
[GitHub] [incubator-druid] leventov opened a new issue #7615: Add endpoint to kill all segments belonging to a data source
leventov opened a new issue #7615: Add endpoint to kill all segments belonging to a data source URL: https://github.com/apache/incubator-druid/issues/7615 https://github.com/apache/incubator-druid/blob/9b197b436b4489195ad589b44978d0a9e08d2c3f/web-console/src/views/datasource-view.tsx#L306 It's better to add an endpoint that kills all data properly rather than relying on the `1000/3000` workaround. FYI @vogievetsky @surekhasaharan
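For readers unfamiliar with the `1000/3000` workaround mentioned in this issue, the following is a minimal sketch of the idea: a kill request is issued for an interval so wide (year 1000 to year 3000) that it covers every segment of the data source. The class and method names are hypothetical, and the URL shape (`?kill=true&interval=...` on the Coordinator datasources path) is an assumption based on the linked web-console line, not a confirmed description of the API.

```java
final class KillAllWorkaround
{
  // Interval from year 1000 to year 3000; wide enough to span any realistic segment.
  static final String ALL_TIME_INTERVAL = "1000/3000";

  // Builds the URL a client would send a DELETE request to.
  // Assumption: dataSource is already URL-safe, so no encoding is applied here.
  static String killUrl(String coordinatorBase, String dataSource)
  {
    return coordinatorBase
           + "/druid/coordinator/v1/datasources/"
           + dataSource
           + "?kill=true&interval="
           + ALL_TIME_INTERVAL;
  }

  public static void main(String[] args)
  {
    // Hypothetical Coordinator address and data source name.
    System.out.println(killUrl("http://localhost:8081", "wikipedia"));
  }
}
```

The sketch only makes the hard-coded interval visible; a dedicated "kill everything" endpoint, as the issue proposes, would remove the need for the magic interval altogether.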
[GitHub] [incubator-druid] clintropolis closed pull request #7612: add git porcelain check to travis
clintropolis closed pull request #7612: add git porcelain check to travis URL: https://github.com/apache/incubator-druid/pull/7612
[GitHub] [incubator-druid] samarthjain commented on a change in pull request #7331: TDigest backed sketch aggregators
samarthjain commented on a change in pull request #7331: TDigest backed sketch aggregators URL: https://github.com/apache/incubator-druid/pull/7331#discussion_r281958380 ## File path: extensions-contrib/tdigestsketch/src/main/java/org/apache/druid/query/aggregation/tdigestsketch/TDigestBuildSketchBufferAggregator.java ## @@ -0,0 +1,125 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. 
+ */ + +package org.apache.druid.query.aggregation.tdigestsketch; + +import com.google.common.base.Preconditions; +import com.tdunning.math.stats.MergingDigest; +import it.unimi.dsi.fastutil.ints.Int2ObjectMap; +import it.unimi.dsi.fastutil.ints.Int2ObjectOpenHashMap; +import org.apache.druid.java.util.common.IAE; +import org.apache.druid.query.aggregation.BufferAggregator; +import org.apache.druid.segment.ColumnValueSelector; + +import javax.annotation.Nonnull; +import javax.annotation.concurrent.GuardedBy; +import java.nio.ByteBuffer; +import java.util.IdentityHashMap; +import java.util.Map; + +/** + * Aggregator that builds t-digest backed sketches using numeric values read from {@link ByteBuffer} + */ +public class TDigestBuildSketchBufferAggregator implements BufferAggregator +{ + + @Nonnull + private final ColumnValueSelector selector; + @Nonnull + private final int compression; + + @GuardedBy("this") + private Map<ByteBuffer, Int2ObjectMap<MergingDigest>> sketches = new IdentityHashMap<>(); + + public TDigestBuildSketchBufferAggregator( + final ColumnValueSelector valueSelector, + final Integer compression + ) + { +Preconditions.checkNotNull(valueSelector); +this.selector = valueSelector; +if (compression != null) { + this.compression = compression; +} else { + this.compression = TDigestBuildSketchAggregator.DEFAULT_COMPRESSION; +} + } + + @Override + public synchronized void init(ByteBuffer buffer, int position) Review comment: @jihoonson - unfortunately, the documentation on the base classes/interfaces doesn't clearly state which methods may be called from multiple threads, so I ended up following what the DataSketches implementation does. For example: https://github.com/apache/incubator-druid/blob/master/extensions-core/datasketches/src/main/java/org/apache/druid/query/aggregation/datasketches/quantiles/DoublesSketchBuildBufferAggregator.java#L54
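The pattern under review (per-buffer, per-position state held in an identity-keyed map, with every accessor `synchronized`) can be illustrated with a simplified, self-contained sketch. It is not the PR's code: the class name `PerPositionState` is hypothetical, a plain `long[]` counter stands in for the t-digest sketch, and `HashMap<Integer, ...>` stands in for fastutil's `Int2ObjectMap`. It also assumes `init` is always called before `aggregate` or `get` for a given buffer and position.

```java
import java.nio.ByteBuffer;
import java.util.HashMap;
import java.util.IdentityHashMap;
import java.util.Map;

final class PerPositionState
{
  // Outer map is keyed by buffer identity, inner map by position within that buffer.
  // All access goes through synchronized methods, so the monitor of "this" guards it.
  private final Map<ByteBuffer, Map<Integer, long[]>> counters = new IdentityHashMap<>();

  // Creates fresh state for the given buffer and position.
  synchronized void init(ByteBuffer buffer, int position)
  {
    counters.computeIfAbsent(buffer, b -> new HashMap<>()).put(position, new long[1]);
  }

  // Updates the state for the given buffer and position (here: increments a counter).
  synchronized void aggregate(ByteBuffer buffer, int position)
  {
    counters.get(buffer).get(position)[0]++;
  }

  // Reads the current state for the given buffer and position.
  synchronized long get(ByteBuffer buffer, int position)
  {
    return counters.get(buffer).get(position)[0];
  }
}
```

The identity-keyed outer map matters because `ByteBuffer.equals` and `hashCode` depend on the buffer's remaining contents, so two distinct buffers with equal contents would collide in an ordinary `HashMap`, and a buffer's hash would change as it is written to.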
[GitHub] [incubator-druid] samarthjain commented on a change in pull request #7331: TDigest backed sketch aggregators
samarthjain commented on a change in pull request #7331: TDigest backed sketch aggregators URL: https://github.com/apache/incubator-druid/pull/7331#discussion_r281958692 ## File path: extensions-contrib/tdigestsketch/src/test/java/org/apache/druid/query/aggregation/tdigestsketch/TDigestSketchAggregatorTest.java ## @@ -0,0 +1,284 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. 
+ */ + +package org.apache.druid.query.aggregation.tdigestsketch; + +import com.fasterxml.jackson.databind.ObjectMapper; +import org.apache.druid.data.input.Row; +import org.apache.druid.jackson.DefaultObjectMapper; +import org.apache.druid.java.util.common.granularity.Granularities; +import org.apache.druid.java.util.common.guava.Sequence; +import org.apache.druid.query.aggregation.AggregationTestHelper; +import org.apache.druid.query.aggregation.AggregatorFactory; +import org.apache.druid.query.groupby.GroupByQueryConfig; +import org.apache.druid.query.groupby.GroupByQueryRunnerTest; +import org.junit.Assert; +import org.junit.Rule; +import org.junit.Test; +import org.junit.rules.TemporaryFolder; +import org.junit.runner.RunWith; +import org.junit.runners.Parameterized; + +import java.io.File; +import java.util.ArrayList; +import java.util.Collection; +import java.util.List; + +@RunWith(Parameterized.class) +public class TDigestSketchAggregatorTest +{ + + private final AggregationTestHelper helper; + private final AggregationTestHelper timeSeriesHelper; Review comment: Done
[GitHub] [incubator-druid] mihai-cazacu-adswizz edited a comment on issue #7597: [materialized view] The generated specification is too big
mihai-cazacu-adswizz edited a comment on issue #7597: [materialized view] The generated specification is too big URL: https://github.com/apache/incubator-druid/issues/7597#issuecomment-490081669 I have increased the `druid.indexer.runner.maxZnodeBytes` value to `2.5MB` and everything worked fine until the Supervisor reached an interval with many segments (the [created JSON](https://github.com/apache/incubator-druid/blob/master/indexing-service/src/main/java/org/apache/druid/indexing/overlord/RemoteTaskRunner.java#L863) has ~ `1.5MB`). Starting from this point, the `Waiting Tasks - Tasks waiting on locks` section in Overlord was filled with dozens of MV tasks (for the same data source: `index_materialized_view_test_2019-05-07...`). The error: ``` ERROR [LeaderSelector[/druid/druid-prod/overlord/_OVERLORD]] org.apache.druid.curator.discovery.CuratorDruidLeaderSelector - listener becomeLeader() failed. Unable to become leader: {class=org.apache.druid.curator.discovery.CuratorDruidLeaderSelector, exceptionType=class org.apache.druid.java.util.common.ISE, exceptionMessage=Could not reacquire lock on interval[2019-02-21T00:00:00.000Z/2019-02-22T00:00:00.000Z] version[2019-05-07T11:24:03.543Z] for task: index_materialized_view_test_2019-05-07T11:16:05.026Z} org.apache.druid.java.util.common.ISE: Could not reacquire lock on interval[2019-02-21T00:00:00.000Z/2019-02-22T00:00:00.000Z] version[2019-05-07T11:24:03.543Z] for task: index_materialized_view_test_2019-05-07T11:16:05.026Z at org.apache.druid.indexing.overlord.TaskLockbox.syncFromStorage(TaskLockbox.java:171) ~[druid-indexing-service-0.13.0-incubating.jar:0.13.0-incubating] at org.apache.druid.indexing.overlord.TaskMaster$1.becomeLeader(TaskMaster.java:109) ~[druid-indexing-service-0.13.0-incubating.jar:0.13.0-incubating] at org.apache.druid.curator.discovery.CuratorDruidLeaderSelector$1.isLeader(CuratorDruidLeaderSelector.java:98) [druid-server-0.13.0-incubating.jar:0.13.0-incubating] at 
org.apache.curator.framework.recipes.leader.LeaderLatch$9.apply(LeaderLatch.java:665) [curator-recipes-4.0.0.jar:4.0.0] at org.apache.curator.framework.recipes.leader.LeaderLatch$9.apply(LeaderLatch.java:661) [curator-recipes-4.0.0.jar:4.0.0] at org.apache.curator.framework.listen.ListenerContainer$1.run(ListenerContainer.java:93) [curator-framework-4.0.0.jar:4.0.0] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_201] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_201] at java.lang.Thread.run(Thread.java:748) [?:1.8.0_201] ``` Because of this error, the Overlord has stopped responding. Also, all those waiting tasks have the same payload. If I don't suspend the Supervisor, the number of waiting tasks keeps growing.
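The size limit discussed in this comment can be pictured with a small sketch: the serialized task JSON is measured in bytes and compared against the configured `druid.indexer.runner.maxZnodeBytes`. This is an illustration under assumptions, not the `RemoteTaskRunner` code; the class and method names are hypothetical, the payload is assumed to be UTF-8 encoded, and `2.5MB` and `1.5MB` are read as decimal megabytes.

```java
import java.nio.charset.StandardCharsets;

final class ZnodeSizeCheck
{
  // Returns true when the serialized task payload is within the configured byte limit.
  static boolean fitsInZnode(String taskJson, long maxZnodeBytes)
  {
    return taskJson.getBytes(StandardCharsets.UTF_8).length <= maxZnodeBytes;
  }

  public static void main(String[] args)
  {
    // Stand-in for a ~1.5MB task spec, checked against a 2.5MB limit.
    String payload = new String(new char[1_500_000]).replace('\0', 'a');
    System.out.println(fitsInZnode(payload, 2_500_000L));
  }
}
```

Under these assumptions a ~1.5MB payload passes a 2.5MB limit, which is consistent with the report: the failure shown in the stack trace is the lock reacquisition in `TaskLockbox.syncFromStorage`, not a payload-size rejection.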