[GitHub] [incubator-druid] JackyYangPassion edited a comment on issue #7484: my realtime task cant handoff the segments

2019-05-08 Thread GitBox
JackyYangPassion edited a comment on issue #7484: my realtime task cant handoff 
the segments
URL: 
https://github.com/apache/incubator-druid/issues/7484#issuecomment-490748567
 
 
   Check whether any task has a large number of segments to ingest into Druid.
   A month ago I had the same problem; the cause was a datasource that ingested a year of data every day.
   Another possible cause is a dead Historical node, which can also make handoff take a long time.
   
   If you want to know why, see the coordinator code:
   CoordinatorHistoricalManagerRunnable in class DruidCoordinator


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org
For additional commands, e-mail: commits-h...@druid.apache.org



[GitHub] [incubator-druid] JackyYangPassion commented on issue #7484: my realtime task cant handoff the segments

2019-05-08 Thread GitBox
JackyYangPassion commented on issue #7484: my realtime task cant handoff the 
segments
URL: 
https://github.com/apache/incubator-druid/issues/7484#issuecomment-490748567
 
 
   Check whether any task has a large number of segments to ingest into Druid.





[incubator-druid-website] branch 0.14.1-downloads created (now b84129d)

2019-05-08 Thread cwylie
This is an automated email from the ASF dual-hosted git repository.

cwylie pushed a change to branch 0.14.1-downloads
in repository https://gitbox.apache.org/repos/asf/incubator-druid-website.git.


  at b84129d  add download links for 0.14.1

This branch includes the following new commits:

 new b84129d  add download links for 0.14.1

The 1 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.






[incubator-druid-website] 01/01: add download links for 0.14.1

2019-05-08 Thread cwylie
This is an automated email from the ASF dual-hosted git repository.

cwylie pushed a commit to branch 0.14.1-downloads
in repository https://gitbox.apache.org/repos/asf/incubator-druid-website.git

commit b84129d818d4a4588297beb5517886705edb1bcf
Author: Clint Wylie 
AuthorDate: Wed May 8 21:06:55 2019 -0700

add download links for 0.14.1
---
 downloads.html | 21 +
 1 file changed, 21 insertions(+)

diff --git a/downloads.html b/downloads.html
index 912162a..6b6f1ff 100644
--- a/downloads.html
+++ b/downloads.html
@@ -44,6 +44,27 @@
 
   
   
+  
+  0.14.1-incubating
+  08 May 2019
+  
+https://www.apache.org/dyn/closer.cgi?path=/incubator/druid/0.14.1-incubating/apache-druid-0.14.1-incubating-src.tar.gz;
 onclick="trackDownload('click', 
'https://www.apache.org/dyn/closer.cgi?path=/incubator/druid/0.14.1-incubating/apache-druid-0.14.1-incubating-src.tar.gz');">source
+
+(https://www.apache.org/dist/incubator/druid/0.14.1-incubating/apache-druid-0.14.1-incubating-src.tar.gz.sha512;
 onclick="trackDownload('click', 
'https://www.apache.org/dist/incubator/druid/0.14.1-incubating/apache-druid-0.14.1-incubating-src.tar.gz.sha512');">sha512
+
+https://www.apache.org/dist/incubator/druid/0.14.1-incubating/apache-druid-0.14.1-incubating-src.tar.gz.asc;
 onclick="trackDownload('click', 
'https://www.apache.org/dist/incubator/druid/0.14.1-incubating/apache-druid-0.14.1-incubating-src.tar.gz.asc');">pgp)
+  
+  
+https://www.apache.org/dyn/closer.cgi?path=/incubator/druid/0.14.1-incubating/apache-druid-0.14.1-incubating-bin.tar.gz;
 onclick="trackDownload('click', 
'https://www.apache.org/dyn/closer.cgi?path=/incubator/druid/0.14.1-incubating/apache-druid-0.14.1-incubating-bin.tar.gz');">binary
+
+(https://www.apache.org/dist/incubator/druid/0.14.1-incubating/apache-druid-0.14.1-incubating-bin.tar.gz.sha512;
 onclick="trackDownload('click', 
'https://www.apache.org/dist/incubator/druid/0.14.1-incubating/apache-druid-0.14.1-incubating-bin.tar.gz.sha512');">sha512
+
+https://www.apache.org/dist/incubator/druid/0.14.1-incubating/apache-druid-0.14.1-incubating-bin.tar.gz.asc;
 onclick="trackDownload('click', 
'https://www.apache.org/dist/incubator/druid/0.14.1-incubating/apache-druid-0.14.1-incubating-bin.tar.gz.asc');">pgp)
+  
+  
+https://github.com/apache/incubator-druid/releases/tag/druid-0.14.1-incubating;>Release
 notes
+  
+   
 
   0.14.0-incubating
   09 Apr 2019





[GitHub] [incubator-druid-website] clintropolis opened a new pull request #5: add download links for 0.14.1

2019-05-08 Thread GitBox
clintropolis opened a new pull request #5: add download links for 0.14.1
URL: https://github.com/apache/incubator-druid-website/pull/5
 
 
   





[GitHub] [incubator-druid] iamxiaojuan commented on issue #7620: sql injection violation

2019-05-08 Thread GitBox
iamxiaojuan commented on issue #7620: sql injection violation
URL: 
https://github.com/apache/incubator-druid/issues/7620#issuecomment-490735437
 
 
   > Druid SQL does not support inserts (or variables). It is a read only query 
interface. You ingest data in other ways: 
http://druid.io/docs/latest/ingestion/index.html
   
   thank you very much!





[GitHub] [incubator-druid] clintropolis commented on issue #7619: fix issue #7607

2019-05-08 Thread GitBox
clintropolis commented on issue #7619: fix issue #7607
URL: https://github.com/apache/incubator-druid/pull/7619#issuecomment-490734781
 
 
   Hmm, the failure is perhaps no longer related to missing artifacts:
   ```
   [WARNING] Rule 1: org.apache.maven.plugins.enforcer.BannedDependencies failed with message:
   Found Banned Dependency: com.google.code.findbugs:annotations:jar:3.0.0
   Use 'mvn dependency:tree' to locate the source of the banned dependencies.
   ```
   at https://travis-ci.org/apache/incubator-druid/jobs/530049926#L3162
   
   Doing a quick search, this library is licensed as `lgpl` which is perhaps 
the reason for this issue? Is this a new dependency between 0.13.1 and 0.13.3? 
I wonder if it can be safely excluded from the datasketches extension pom?
   ```
   <exclusions>
     <exclusion>
       <groupId>com.google.code.findbugs</groupId>
       <artifactId>annotations</artifactId>
     </exclusion>
   </exclusions>
   ```





[GitHub] [incubator-druid] vogievetsky commented on issue #7620: sql injection violation

2019-05-08 Thread GitBox
vogievetsky commented on issue #7620: sql injection violation
URL: 
https://github.com/apache/incubator-druid/issues/7620#issuecomment-490734489
 
 
   Druid SQL does not support inserts (or variables). It is a read only query 
interface. You ingest data in other ways: 
http://druid.io/docs/latest/ingestion/index.html





[GitHub] [incubator-druid] iamxiaojuan commented on issue #7620: sql injection violation

2019-05-08 Thread GitBox
iamxiaojuan commented on issue #7620: sql injection violation
URL: 
https://github.com/apache/incubator-druid/issues/7620#issuecomment-490734241
 
 
   > I am pretty sure nothing about that SQL statement is supported
   
   What do you think the reason is for that?
   And thank you for your reply.





[GitHub] [incubator-druid] vogievetsky commented on issue #7620: sql injection violation

2019-05-08 Thread GitBox
vogievetsky commented on issue #7620: sql injection violation
URL: 
https://github.com/apache/incubator-druid/issues/7620#issuecomment-490733753
 
 
   I am pretty sure nothing about that error is supported





[GitHub] [incubator-druid] vogievetsky edited a comment on issue #7620: sql injection violation

2019-05-08 Thread GitBox
vogievetsky edited a comment on issue #7620: sql injection violation
URL: 
https://github.com/apache/incubator-druid/issues/7620#issuecomment-490733753
 
 
   I am pretty sure nothing about that SQL statement is supported





[GitHub] [incubator-druid] clintropolis commented on a change in pull request #7606: Set direct memory if unable to detect JVM config

2019-05-08 Thread GitBox
clintropolis commented on a change in pull request #7606: Set direct memory if 
unable to detect JVM config
URL: https://github.com/apache/incubator-druid/pull/7606#discussion_r282326369
 
 

 ##
 File path: 
processing/src/main/java/org/apache/druid/query/DruidProcessingConfig.java
 ##
 @@ -52,7 +52,22 @@ public int intermediateComputeSizeBytes()
   return computedBufferSizeBytes.get();
 }
 
-long directSizeBytes = JvmUtils.getRuntimeInfo().getDirectMemorySizeBytes();
+long directSizeBytes;
+try {
+  directSizeBytes = JvmUtils.getRuntimeInfo().getDirectMemorySizeBytes();
 
 Review comment:
   I haven't been following the jdk9+ compat PRs too closely so apologies if 
this has been discussed elsewhere, but if we aren't afraid of getting a bit 
dirty there appear to be at least 2 ways we could source this information which 
I tested up to jdk11 (didn't have 12 handy).
   
   [This stuff is still in the jdk, at least as of 12, but it's now in 
`jdk.internal.misc.VM`](https://github.com/AdoptOpenJDK/openjdk-jdk12u/blob/master/src/java.base/share/classes/jdk/internal/misc/VM.java#L130).
 I imagine the Java people would tell us that neither of the ways I could get this information is legit to use, but they probably would've said that about using `sun.misc.VM` in the first place. 
   
   The first and less intrusive is to reflect it out of `java.nio.Bits`, which 
stores the value it gets from `jdk.internal.misc.VM` in a static field which we 
could grab like this:
   
   ```
   Class bitsClass = Class.forName("java.nio.Bits");
   Field maxMemField = bitsClass.getDeclaredField("MAX_MEMORY");
   maxMemField.setAccessible(true);
   long maxMem = (Long) maxMemField.get(null);
   ```
   
   However, this will complain loudly on stderr with something like:
   ```
   WARNING: An illegal reflective access operation has occurred
   WARNING: Illegal reflective access by me.clintropolis.sandbox.mem.Main (file:/Users/clint/workspace/clintropolis/sandbox/target/classes/) to field java.nio.Bits.MAX_MEMORY
   WARNING: Please consider reporting this to the maintainers of me.clintropolis.sandbox.mem.Main
   WARNING: Use --illegal-access=warn to enable warnings of further illegal reflective access operations
   WARNING: All illegal access operations will be denied in a future release
   ```
   
   The 2nd way involves directly using `jdk.internal.misc.VM` like we were 
doing for `sun.misc.VM` something like:
   ```
   Class vmClass = Class.forName("jdk.internal.misc.VM");
   Object maxDirectMemoryObj = vmClass.getMethod("maxDirectMemory").invoke(null);
   
   if (maxDirectMemoryObj == null || !(maxDirectMemoryObj instanceof Number)) {
     throw new UOE("Cannot determine maxDirectMemory from [%s]", maxDirectMemoryObj);
   } else {
     return ((Number) maxDirectMemoryObj).longValue();
   }
   ```
   and adding:
   ```
   --add-exports java.base/jdk.internal.misc=ALL-UNNAMED
   ```
   to the `java` command-line when running (or stuff in jvm.config?). If this 
is not added, it will explode violently with:
   ```
   Exception in thread "main" java.lang.IllegalAccessException: class me.clintropolis.sandbox.mem.Main cannot access class jdk.internal.misc.VM (in module java.base) because module java.base does not export jdk.internal.misc to unnamed module @2f7c7260
           at java.base/jdk.internal.reflect.Reflection.newIllegalAccessException(Reflection.java:361)
           at java.base/java.lang.reflect.AccessibleObject.checkAccess(AccessibleObject.java:591)
           at java.base/java.lang.reflect.Method.invoke(Method.java:558)
           at me.clintropolis.sandbox.mem.Main.main(Main.java:17)
   ```
   
   Other approaches I've seen: [Netty has some crazy bit about trying to pull 
it from the java command-line 
options](https://github.com/netty/netty/blob/4.1/common/src/main/java/io/netty/util/internal/PlatformDependent.java#L1035)
 to parse if the user has set `-XX:MaxDirectMemorySize` and try to use that if 
it can't get the information from `sun.misc.VM`, but I'm unsure if that is 
reasonable to do here. 
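
The command-line approach mentioned above can be sketched roughly as follows. This is a minimal illustration and not Netty's or Druid's actual code: the class name `MaxDirectMemoryArg` and the method `parse` are hypothetical, only the `k`/`m`/`g` suffixes are handled, and letting the last occurrence of the flag win is an assumption of this sketch. The one real API it relies on is `RuntimeMXBean.getInputArguments()`, which returns the arguments passed to the JVM.

```java
import java.lang.management.ManagementFactory;
import java.util.List;
import java.util.Locale;
import java.util.OptionalLong;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: looks for -XX:MaxDirectMemorySize among the JVM arguments.
final class MaxDirectMemoryArg
{
  // Matches e.g. -XX:MaxDirectMemorySize=512m (optional k/m/g suffix, any case).
  private static final Pattern FLAG =
      Pattern.compile("^-XX:MaxDirectMemorySize=(\\d+)([kKmMgG]?)$");

  // Returns the configured size in bytes, or empty if the flag is absent.
  static OptionalLong parse(List<String> jvmArgs)
  {
    OptionalLong result = OptionalLong.empty();
    for (String arg : jvmArgs) {
      Matcher m = FLAG.matcher(arg.trim());
      if (!m.matches()) {
        continue;
      }
      long value = Long.parseLong(m.group(1));
      switch (m.group(2).toLowerCase(Locale.ROOT)) {
        case "k": value *= 1024L; break;
        case "m": value *= 1024L * 1024L; break;
        case "g": value *= 1024L * 1024L * 1024L; break;
        default: break; // no suffix: the value is already in bytes
      }
      // Assumption of this sketch: if the flag is repeated, the last one wins.
      result = OptionalLong.of(value);
    }
    return result;
  }

  public static void main(String[] args)
  {
    // Arguments the running JVM was started with (real java.lang.management API).
    List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
    System.out.println(parse(jvmArgs));
  }
}
```

An empty result means the flag was not set explicitly, in which case a caller would still need some other default.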
   
   Personally, I don't really understand why the java developers don't think it 
necessary to provide a friendly way to expose this information, but it is what 
it is. 
   
   I don't know if we want to pursue either of these approaches, just wanted to 
bring it up for discussion. All this said, it looks to me like [the jdk itself 
defaults max directMemory to 
`Runtime.getRuntime().maxMemory()`](https://github.com/AdoptOpenJDK/openjdk-jdk12u/blob/master/src/java.base/share/classes/jdk/internal/misc/VM.java#L208)
 if `-XX:MaxDirectMemorySize` is not set, so it doesn't seem so unreasonable to 
size off of that... 
   
   Regardless, thanks for working on this stuff @xvrl!
   
   



[GitHub] [incubator-druid] clintropolis commented on issue #7607: thetaSketch(with sketches-core-0.13.1) in groupBy always return value no more than 16384

2019-05-08 Thread GitBox
clintropolis commented on issue #7607: thetaSketch(with sketches-core-0.13.1) 
in groupBy always return value no more than 16384
URL: 
https://github.com/apache/incubator-druid/issues/7607#issuecomment-490721420
 
 
   >This is a regression in Theta sketch code. So I would think you don't want 
to approve the 0.14.1 release candidate as it is now. We will fix the 
sketches-core shortly.
   
   0.14.1 is too far gone, the artifacts are already propagated to maven and 
the apache mirrors, so I'm going to go ahead and do the release anyway. I've 
modified the release notes to warn about upgrading if relying on theta sketches.
   
   This issue does seem severe enough to go ahead and do a 0.14.2 since we can 
probably drive that through a lot quicker than we can wrap up and validate 
0.15.0, so I will create an rc and start a vote as soon as possible.





[GitHub] [incubator-druid] xvrl commented on a change in pull request #7606: Set direct memory if unable to detect JVM config

2019-05-08 Thread GitBox
xvrl commented on a change in pull request #7606: Set direct memory if unable 
to detect JVM config
URL: https://github.com/apache/incubator-druid/pull/7606#discussion_r282309638
 
 

 ##
 File path: 
processing/src/main/java/org/apache/druid/query/DruidProcessingConfig.java
 ##
 @@ -52,7 +52,22 @@ public int intermediateComputeSizeBytes()
   return computedBufferSizeBytes.get();
 }
 
-long directSizeBytes = JvmUtils.getRuntimeInfo().getDirectMemorySizeBytes();
+long directSizeBytes;
+try {
+  directSizeBytes = JvmUtils.getRuntimeInfo().getDirectMemorySizeBytes();
 
 Review comment:
   yes, the call will always fail in Java 9 and above.
   
   DEFAULT_PROCESSING_BUFFER_SIZE_BYTES is -1, so this logic should only apply 
if they haven't configured anything and we need to auto-size the buffer.
   
   This only changes the auto-sizing logic to assume the maximum available 
direct memory is at least 25% of heap size, which should be relatively safe, 
given that JDKs now default to max direct memory = max heap size.
   
   The only reason I picked a fraction of it is to provide a better out of the 
box experience for someone kicking the tires, and avoid having their memory 
usage be twice as much as the heap size they configured, but I picked this 
number fairly arbitrarily, so I'd be happy to revisit if we think a different 
default is better.
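
The fallback described in this comment could look roughly like the sketch below. It is an illustration only, not the code from the PR: `DirectMemoryFallback`, `estimate`, and `HEAP_FRACTION` are hypothetical names, the 25% figure is taken from the comment, and it is an assumption here that the failed lookup surfaces as an `UnsupportedOperationException`.

```java
import java.util.function.LongSupplier;

// Hypothetical helper: use the detected direct memory size if available,
// otherwise assume a fraction of the maximum heap size.
final class DirectMemoryFallback
{
  // Fraction of heap assumed to be available as direct memory (from the comment).
  static final double HEAP_FRACTION = 0.25;

  static long estimate(LongSupplier detectedDirectMemory, long maxHeapBytes)
  {
    try {
      return detectedDirectMemory.getAsLong();
    }
    catch (UnsupportedOperationException e) {
      // Assumption: detection fails this way when the JVM config cannot be read.
      return (long) (maxHeapBytes * HEAP_FRACTION);
    }
  }

  public static void main(String[] args)
  {
    long maxHeap = Runtime.getRuntime().maxMemory();
    // Simulate a JVM where the direct memory size cannot be detected.
    long directBytes = estimate(
        () -> { throw new UnsupportedOperationException("cannot detect"); },
        maxHeap
    );
    System.out.println(directBytes);
  }
}
```

With a 4 GiB heap and no detectable direct memory limit, this sizes against 1 GiB rather than the full heap.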





[GitHub] [incubator-druid] jihoonson commented on issue #7619: fix issue #7607

2019-05-08 Thread GitBox
jihoonson commented on issue #7619: fix issue #7607
URL: https://github.com/apache/incubator-druid/pull/7619#issuecomment-49077
 
 
   Ok. Will restart CI after a few mins.





[GitHub] [incubator-druid] jihoonson commented on issue #7619: fix issue #7607

2019-05-08 Thread GitBox
jihoonson commented on issue #7619: fix issue #7607
URL: https://github.com/apache/incubator-druid/pull/7619#issuecomment-490698350
 
 
   Should we add a unit test?





[GitHub] [incubator-druid] AlexanderSaydakov commented on issue #7607: thetaSketch(with sketches-core-0.13.1) in groupBy always return value no more than 16384

2019-05-08 Thread GitBox
AlexanderSaydakov commented on issue #7607: thetaSketch(with 
sketches-core-0.13.1) in groupBy always return value no more than 16384
URL: 
https://github.com/apache/incubator-druid/issues/7607#issuecomment-490697153
 
 
   https://github.com/apache/incubator-druid/pull/7619
   





[GitHub] [incubator-druid] AlexanderSaydakov opened a new pull request #7619: fix issue #7607

2019-05-08 Thread GitBox
AlexanderSaydakov opened a new pull request #7619: fix issue #7607
URL: https://github.com/apache/incubator-druid/pull/7619
 
 
   





[GitHub] [incubator-druid] leerho edited a comment on issue #7607: thetaSketch(with sketches-core-0.13.1) in groupBy always return value no more than 16384

2019-05-08 Thread GitBox
leerho edited a comment on issue #7607: thetaSketch(with sketches-core-0.13.1) 
in groupBy always return value no more than 16384
URL: 
https://github.com/apache/incubator-druid/issues/7607#issuecomment-490693034
 
 
   @pzhdfy @gianm 
   DataSketches  sketches-core 0.13.3 is now released to Maven Central with the 
fix.
   Thank you @pzhdfy and @gianm  for your help in finding this!!






[GitHub] [incubator-druid] jihoonson commented on a change in pull request #7331: TDigest backed sketch aggregators

2019-05-08 Thread GitBox
jihoonson commented on a change in pull request #7331: TDigest backed sketch 
aggregators
URL: https://github.com/apache/incubator-druid/pull/7331#discussion_r282295708
 
 

 ##
 File path: 
extensions-contrib/tdigestsketch/src/main/java/org/apache/druid/query/aggregation/tdigestsketch/TDigestBuildSketchBufferAggregator.java
 ##
 @@ -0,0 +1,125 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.druid.query.aggregation.tdigestsketch;
+
+import com.google.common.base.Preconditions;
+import com.tdunning.math.stats.MergingDigest;
+import it.unimi.dsi.fastutil.ints.Int2ObjectMap;
+import it.unimi.dsi.fastutil.ints.Int2ObjectOpenHashMap;
+import org.apache.druid.java.util.common.IAE;
+import org.apache.druid.query.aggregation.BufferAggregator;
+import org.apache.druid.segment.ColumnValueSelector;
+
+import javax.annotation.Nonnull;
+import javax.annotation.concurrent.GuardedBy;
+import java.nio.ByteBuffer;
+import java.util.IdentityHashMap;
+import java.util.Map;
+
+/**
+ * Aggregator that builds t-digest backed sketches using numeric values read from {@link ByteBuffer}
+ */
+public class TDigestBuildSketchBufferAggregator implements BufferAggregator
+{
+
+  @Nonnull
+  private final ColumnValueSelector selector;
+  @Nonnull
+  private final int compression;
+
+  @GuardedBy("this")
+  private Map<ByteBuffer, Int2ObjectMap<MergingDigest>> sketches = new IdentityHashMap<>();
+
+  public TDigestBuildSketchBufferAggregator(
+  final ColumnValueSelector valueSelector,
+  final Integer compression
+  )
+  {
+Preconditions.checkNotNull(valueSelector);
+this.selector = valueSelector;
+if (compression != null) {
+  this.compression = compression;
+} else {
+  this.compression = TDigestBuildSketchAggregator.DEFAULT_COMPRESSION;
+}
+  }
+
+  @Override
+  public synchronized void init(ByteBuffer buffer, int position)
 
 Review comment:
   If a query is issued while a stream ingestion task is running, then the 
query would be routed to that task. This is when concurrent reads and writes 
can happen. Since only `OnHeapIncrementalIndex` is used at ingestion time which 
uses `Aggregator`, we need to consider if there's any concurrency issue between 
`get()` and `aggregate()`. Check out these comments: 
https://github.com/apache/incubator-druid/pull/5002#issuecomment-341179982, 
https://github.com/apache/incubator-druid/pull/5148#discussion_r170906998
   
   I'm not sure why `HistogramAggregator` is not synchronized even though it looks like it should be. 
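
The concern about concurrent `get()` and `aggregate()` calls can be illustrated with a minimal sketch. This is not Druid's `Aggregator` interface: `SynchronizedAggregatorSketch` is a hypothetical class, a plain list stands in for the sketch object, and returning a copy from `get()` is an assumption about how a reader could be isolated from later writes.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for an on-heap aggregator that is written by an
// ingestion thread and read by a query thread at the same time.
final class SynchronizedAggregatorSketch
{
  // Stand-in for a sketch object such as a t-digest; not thread-safe by itself.
  private final List<Double> values = new ArrayList<>();

  // Ingestion thread: add one value.
  synchronized void aggregate(double value)
  {
    values.add(value);
  }

  // Query thread: return a copy, so the caller never sees the state mid-update.
  synchronized List<Double> get()
  {
    return new ArrayList<>(values);
  }

  // Runs one writer and one reader concurrently and returns the final count.
  static int runDemo(int n)
  {
    SynchronizedAggregatorSketch agg = new SynchronizedAggregatorSketch();
    Thread writer = new Thread(() -> {
      for (int i = 0; i < n; i++) {
        agg.aggregate(i);
      }
    });
    writer.start();
    for (int i = 0; i < 100; i++) {
      agg.get(); // safe while the writer is still running
    }
    try {
      writer.join();
    }
    catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    }
    return agg.get().size();
  }

  public static void main(String[] args)
  {
    System.out.println(runDemo(10_000));
  }
}
```

Without the `synchronized` keywords, the copy in `get()` could run while `aggregate()` is resizing the list, which is the kind of race the review comment asks about.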





[GitHub] [incubator-druid] jihoonson commented on a change in pull request #7331: TDigest backed sketch aggregators

2019-05-08 Thread GitBox
jihoonson commented on a change in pull request #7331: TDigest backed sketch 
aggregators
URL: https://github.com/apache/incubator-druid/pull/7331#discussion_r282295729
 
 

 ##
 File path: 
extensions-contrib/tdigestsketch/src/main/java/org/apache/druid/query/aggregation/tdigestsketch/TDigestBuildSketchBufferAggregator.java
 ##
 @@ -0,0 +1,125 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.druid.query.aggregation.tdigestsketch;
+
+import com.google.common.base.Preconditions;
+import com.tdunning.math.stats.MergingDigest;
+import it.unimi.dsi.fastutil.ints.Int2ObjectMap;
+import it.unimi.dsi.fastutil.ints.Int2ObjectOpenHashMap;
+import org.apache.druid.java.util.common.IAE;
+import org.apache.druid.query.aggregation.BufferAggregator;
+import org.apache.druid.segment.ColumnValueSelector;
+
+import javax.annotation.Nonnull;
+import javax.annotation.Nullable;
+import javax.annotation.concurrent.GuardedBy;
+import java.nio.ByteBuffer;
+import java.util.IdentityHashMap;
+import java.util.Map;
+
+/**
+ * Aggregator that builds t-digest backed sketches using numeric values read 
from {@link ByteBuffer}
+ */
+public class TDigestBuildSketchBufferAggregator implements BufferAggregator
+{
+
+  @Nonnull
+  private final ColumnValueSelector selector;
+  private final int compression;
+
+  @GuardedBy("this")
+  private final Map<ByteBuffer, Int2ObjectMap<MergingDigest>> sketches = new IdentityHashMap<>();
+
+  public TDigestBuildSketchBufferAggregator(
+  final ColumnValueSelector valueSelector,
+  @Nullable final Integer compression
+  )
+  {
+Preconditions.checkNotNull(valueSelector);
+this.selector = valueSelector;
+if (compression != null) {
+  this.compression = compression;
+} else {
+  this.compression = 
TDigestBuildSketchAggregatorFactory.DEFAULT_COMPRESSION;
+}
+  }
+
+  @Override
+  public synchronized void init(ByteBuffer buffer, int position)
+  {
+MergingDigest emptyDigest = new MergingDigest(compression);
+putSketch(buffer, position, emptyDigest);
+  }
+
+  @Override
+  public synchronized void aggregate(ByteBuffer buffer, int position)
+  {
+MergingDigest sketch = sketches.get(buffer).get(position);
+Object x = selector.getObject();
+if (x instanceof Number) {
+  sketch.add(((Number) x).doubleValue());
+} else {
+  throw new IAE("Unexpected value of type " + x.getClass().getName() + " encountered");
+}
+  }
+
+  @Override
+  public synchronized Object get(final ByteBuffer buffer, final int position)
+  {
+return sketches.get(buffer).get(position);
 
 Review comment:
   Thank you for calling this out!





[GitHub] [incubator-druid] surekhasaharan commented on a change in pull request #7595: Optimize overshadowed segments computation

2019-05-08 Thread GitBox
surekhasaharan commented on a change in pull request #7595: Optimize 
overshadowed segments computation
URL: https://github.com/apache/incubator-druid/pull/7595#discussion_r282204314
 
 

 ##
 File path: 
core/src/main/java/org/apache/druid/timeline/DataSegmentWithOvershadowedStatus.java
 ##
 @@ -25,16 +25,16 @@
 /**
  * DataSegment object plus the overshadowed status for the segment. An 
immutable object.
  *
- * SegmentWithOvershadowedStatus's {@link #compareTo} method considers only 
the {@link SegmentId}
+ * DataSegmentWithOvershadowedStatus's {@link #compareTo} method considers 
only the {@link SegmentId}
  * of the DataSegment object.
  */
-public class SegmentWithOvershadowedStatus implements Comparable<SegmentWithOvershadowedStatus>
+public class DataSegmentWithOvershadowedStatus implements Comparable<DataSegmentWithOvershadowedStatus>
 
 Review comment:
   Hmm, since it's a wrapper around `DataSegment`, I felt this name made more 
sense. I'd be okay reverting it to `SegmentWithOvershadowedStatus` if there are 
plans to rename `DataSegment`, although that seems unclear at this point. 





[GitHub] [incubator-druid] surekhasaharan commented on a change in pull request #7595: Optimize overshadowed segments computation

2019-05-08 Thread GitBox
surekhasaharan commented on a change in pull request #7595: Optimize 
overshadowed segments computation
URL: https://github.com/apache/incubator-druid/pull/7595#discussion_r282280885
 
 

 ##
 File path: 
server/src/main/java/org/apache/druid/metadata/SQLMetadataSegmentManager.java
 ##
 @@ -744,6 +757,32 @@ public DataSegment map(int index, ResultSet r, 
StatementContext ctx) throws SQLE
 
 // Replace "dataSources" atomically.
 dataSources = newDataSources;
+overshadowedSegments = 
ImmutableSet.copyOf(determineOvershadowedSegments(segments));
+  }
+
+  /**
+   * This method builds a timeline from given segments and finds the 
overshadowed segments
+   *
+   * @return set of overshadowed segments
+   */
+  private Set<DataSegment> determineOvershadowedSegments(Iterable<DataSegment> segments)
 
 Review comment:
   changed to list





[GitHub] [incubator-druid] surekhasaharan commented on a change in pull request #7595: Optimize overshadowed segments computation

2019-05-08 Thread GitBox
surekhasaharan commented on a change in pull request #7595: Optimize 
overshadowed segments computation
URL: https://github.com/apache/incubator-druid/pull/7595#discussion_r282291730
 
 

 ##
 File path: 
server/src/main/java/org/apache/druid/server/coordinator/helper/DruidCoordinatorRuleRunner.java
 ##
 @@ -84,8 +85,10 @@ public DruidCoordinatorRuntimeParams 
run(DruidCoordinatorRuntimeParams params)
 // find available segments which are not overshadowed by other segments in 
DB
 // only those would need to be loaded/dropped
 // anything overshadowed by served segments is dropped automatically by 
DruidCoordinatorCleanupOvershadowed
-final Set<DataSegment> overshadowed = ImmutableDruidDataSource
-.determineOvershadowedSegments(params.getAvailableSegments());
+// If metadata store hasn't been polled yet, use empty overshadowed list
+final Collection<DataSegment> overshadowed = Optional
+.ofNullable(coordinator.getMetadataSegmentManager().findOvershadowedSegments())
 
 Review comment:
   If it's acceptable to pick up the updated overshadowedSegments in the next 
run of `DruidCoordinatorHelper#run`, then I think this might be OK. If not, 
then maybe we should compute the overshadowed list here, as before, until the 
mutability of `DataSegment` is settled.
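
   To make the trade-off concrete, here is a minimal sketch of the "use an 
empty list until the first poll" fallback from the diff above. It is only an 
illustration under assumptions: `FakeSegmentManager` and the `String` segment 
ids are hypothetical stand-ins, not the real `MetadataSegmentManager` API.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.List;
import java.util.Optional;

public class OvershadowedFallbackSketch
{
  /** Hypothetical stand-in for the metadata manager: returns null until the first poll. */
  public static class FakeSegmentManager
  {
    private volatile Collection<String> overshadowed; // null before the first poll

    public Collection<String> findOvershadowedSegments()
    {
      return overshadowed;
    }

    public void poll(List<String> latest)
    {
      // Publish an immutable view, replacing the previous result in one write.
      overshadowed = Collections.unmodifiableList(latest);
    }
  }

  /** Falls back to an empty collection when nothing has been polled yet. */
  public static Collection<String> overshadowedOrEmpty(FakeSegmentManager manager)
  {
    return Optional.ofNullable(manager.findOvershadowedSegments())
                   .orElse(Collections.emptyList());
  }

  public static void main(String[] args)
  {
    FakeSegmentManager manager = new FakeSegmentManager();
    System.out.println(overshadowedOrEmpty(manager).size()); // before any poll
    manager.poll(Arrays.asList("seg_v1"));
    System.out.println(overshadowedOrEmpty(manager));        // after one poll
  }
}
```

   With this shape, a coordinator run that happens before the first poll sees 
no overshadowed segments and only catches up on a later run, which is the 
staleness being weighed in the comment.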





[GitHub] [incubator-druid] surekhasaharan commented on a change in pull request #7595: Optimize overshadowed segments computation

2019-05-08 Thread GitBox
surekhasaharan commented on a change in pull request #7595: Optimize 
overshadowed segments computation
URL: https://github.com/apache/incubator-druid/pull/7595#discussion_r282248638
 
 

 ##
 File path: 
server/src/main/java/org/apache/druid/metadata/SQLMetadataSegmentManager.java
 ##
 @@ -744,6 +757,32 @@ public DataSegment map(int index, ResultSet r, 
StatementContext ctx) throws SQLE
 
 // Replace "dataSources" atomically.
 dataSources = newDataSources;
+overshadowedSegments = 
ImmutableSet.copyOf(determineOvershadowedSegments(segments));
 
 Review comment:
   Why would the overshadowed segments become invalid? Do you think it can 
happen when some dataSources are enabled or disabled outside `doPoll`? Also, if 
they do become invalid, would a comment be enough, or should something be done 
in code to prevent invalid overshadowed segments? 





[GitHub] [incubator-druid] jon-wei commented on a change in pull request #7331: TDigest backed sketch aggregators

2019-05-08 Thread GitBox
jon-wei commented on a change in pull request #7331: TDigest backed sketch 
aggregators
URL: https://github.com/apache/incubator-druid/pull/7331#discussion_r282288766
 
 

 ##
 File path: 
extensions-contrib/tdigestsketch/src/main/java/org/apache/druid/query/aggregation/tdigestsketch/TDigestBuildSketchBufferAggregator.java
 ##
 @@ -0,0 +1,125 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.druid.query.aggregation.tdigestsketch;
+
+import com.google.common.base.Preconditions;
+import com.tdunning.math.stats.MergingDigest;
+import it.unimi.dsi.fastutil.ints.Int2ObjectMap;
+import it.unimi.dsi.fastutil.ints.Int2ObjectOpenHashMap;
+import org.apache.druid.java.util.common.IAE;
+import org.apache.druid.query.aggregation.BufferAggregator;
+import org.apache.druid.segment.ColumnValueSelector;
+
+import javax.annotation.Nonnull;
+import javax.annotation.Nullable;
+import javax.annotation.concurrent.GuardedBy;
+import java.nio.ByteBuffer;
+import java.util.IdentityHashMap;
+import java.util.Map;
+
+/**
+ * Aggregator that builds t-digest backed sketches using numeric values read 
from {@link ByteBuffer}
+ */
+public class TDigestBuildSketchBufferAggregator implements BufferAggregator
+{
+
+  @Nonnull
+  private final ColumnValueSelector selector;
+  private final int compression;
+
+  @GuardedBy("this")
+  private final Map<ByteBuffer, Int2ObjectMap<MergingDigest>> sketches = new IdentityHashMap<>();
+
+  public TDigestBuildSketchBufferAggregator(
+  final ColumnValueSelector valueSelector,
+  @Nullable final Integer compression
+  )
+  {
+Preconditions.checkNotNull(valueSelector);
+this.selector = valueSelector;
+if (compression != null) {
+  this.compression = compression;
+} else {
+  this.compression = 
TDigestBuildSketchAggregatorFactory.DEFAULT_COMPRESSION;
+}
+  }
+
+  @Override
+  public synchronized void init(ByteBuffer buffer, int position)
+  {
+MergingDigest emptyDigest = new MergingDigest(compression);
+putSketch(buffer, position, emptyDigest);
+  }
+
+  @Override
+  public synchronized void aggregate(ByteBuffer buffer, int position)
+  {
+MergingDigest sketch = sketches.get(buffer).get(position);
+Object x = selector.getObject();
+if (x instanceof Number) {
+  sketch.add(((Number) x).doubleValue());
+} else {
+  throw new IAE("Unexpected value of type " + x.getClass().getName() + " encountered");
+}
+  }
+
+  @Override
+  public synchronized Object get(final ByteBuffer buffer, final int position)
+  {
+return sketches.get(buffer).get(position);
 
 Review comment:
   The get() on the buffer aggregator needs to return a snapshot copy of the 
sketch to avoid use-after-free issues (see recently updated javadocs on get() 
and https://github.com/apache/incubator-druid/pull/7464)
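
   A rough sketch of what "return a snapshot copy" means here, under stated 
assumptions: `TinySketch` is a hypothetical stand-in for a mutable sketch, and 
its copy constructor is invented for illustration. How a real `MergingDigest` 
should be copied is deliberately left open.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

public class SnapshotGetSketch
{
  /** Hypothetical stand-in for a mutable sketch object. */
  public static class TinySketch
  {
    private final List<Double> values = new ArrayList<>();

    public TinySketch()
    {
    }

    /** Copy constructor: the copy shares no mutable state with the original. */
    public TinySketch(TinySketch other)
    {
      values.addAll(other.values);
    }

    public void add(double v)
    {
      values.add(v);
    }

    public int size()
    {
      return values.size();
    }
  }

  // Keyed by buffer identity, then by position, like the map in the diff above.
  private final Map<ByteBuffer, Map<Integer, TinySketch>> sketches = new IdentityHashMap<>();

  public void init(ByteBuffer buffer, int position)
  {
    sketches.computeIfAbsent(buffer, b -> new HashMap<>()).put(position, new TinySketch());
  }

  public void aggregate(ByteBuffer buffer, int position, double value)
  {
    sketches.get(buffer).get(position).add(value);
  }

  /** Returns a snapshot, so later aggregate() or close() calls cannot change it. */
  public TinySketch get(ByteBuffer buffer, int position)
  {
    return new TinySketch(sketches.get(buffer).get(position));
  }

  public void close()
  {
    sketches.clear();
  }

  public static void main(String[] args)
  {
    SnapshotGetSketch agg = new SnapshotGetSketch();
    ByteBuffer buf = ByteBuffer.allocate(8);
    agg.init(buf, 0);
    agg.aggregate(buf, 0, 1.0);
    TinySketch snapshot = agg.get(buf, 0);
    agg.aggregate(buf, 0, 2.0); // does not affect the snapshot taken above
    System.out.println(snapshot.size() + " vs " + agg.get(buf, 0).size());
  }
}
```

   Returning the live object instead would let the caller observe later 
mutations, or hold a reference to state the aggregator has already released.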
   
   





[GitHub] [incubator-druid] samarthjain commented on a change in pull request #7331: TDigest backed sketch aggregators

2019-05-08 Thread GitBox
samarthjain commented on a change in pull request #7331: TDigest backed sketch 
aggregators
URL: https://github.com/apache/incubator-druid/pull/7331#discussion_r282275813
 
 

 ##
 File path: 
extensions-contrib/tdigestsketch/src/main/java/org/apache/druid/query/aggregation/tdigestsketch/TDigestBuildSketchBufferAggregator.java
 ##
 @@ -0,0 +1,125 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.druid.query.aggregation.tdigestsketch;
+
+import com.google.common.base.Preconditions;
+import com.tdunning.math.stats.MergingDigest;
+import it.unimi.dsi.fastutil.ints.Int2ObjectMap;
+import it.unimi.dsi.fastutil.ints.Int2ObjectOpenHashMap;
+import org.apache.druid.java.util.common.IAE;
+import org.apache.druid.query.aggregation.BufferAggregator;
+import org.apache.druid.segment.ColumnValueSelector;
+
+import javax.annotation.Nonnull;
+import javax.annotation.concurrent.GuardedBy;
+import java.nio.ByteBuffer;
+import java.util.IdentityHashMap;
+import java.util.Map;
+
+/**
+ * Aggregator that builds t-digest backed sketches using numeric values read 
from {@link ByteBuffer}
+ */
+public class TDigestBuildSketchBufferAggregator implements BufferAggregator
+{
+
+  @Nonnull
+  private final ColumnValueSelector selector;
+  @Nonnull
+  private final int compression;
+
+  @GuardedBy("this")
+  private Map<ByteBuffer, Int2ObjectMap<MergingDigest>> sketches = new IdentityHashMap<>();
+
+  public TDigestBuildSketchBufferAggregator(
+  final ColumnValueSelector valueSelector,
+  final Integer compression
+  )
+  {
+Preconditions.checkNotNull(valueSelector);
+this.selector = valueSelector;
+if (compression != null) {
+  this.compression = compression;
+} else {
+  this.compression = TDigestBuildSketchAggregator.DEFAULT_COMPRESSION;
+}
+  }
+
+  @Override
+  public synchronized void init(ByteBuffer buffer, int position)
 
 Review comment:
   For clarity, are aggregators invoked when building an incremental index? 
And if so, is that `BufferAggregator` or `Aggregator`? From your comments it 
sounds like we needn't worry about thread safety for `BufferAggregator`s, but 
what about `Aggregator`s? Looking at `HistogramAggregator` or 
`HistogramBufferAggregator`, I don't see any kind of synchronization.





[GitHub] [incubator-druid] jihoonson commented on a change in pull request #7331: TDigest backed sketch aggregators

2019-05-08 Thread GitBox
jihoonson commented on a change in pull request #7331: TDigest backed sketch 
aggregators
URL: https://github.com/apache/incubator-druid/pull/7331#discussion_r282269647
 
 

 ##
 File path: 
extensions-contrib/tdigestsketch/src/main/java/org/apache/druid/query/aggregation/tdigestsketch/TDigestBuildSketchBufferAggregator.java
 ##
 @@ -0,0 +1,125 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.druid.query.aggregation.tdigestsketch;
+
+import com.google.common.base.Preconditions;
+import com.tdunning.math.stats.MergingDigest;
+import it.unimi.dsi.fastutil.ints.Int2ObjectMap;
+import it.unimi.dsi.fastutil.ints.Int2ObjectOpenHashMap;
+import org.apache.druid.java.util.common.IAE;
+import org.apache.druid.query.aggregation.BufferAggregator;
+import org.apache.druid.segment.ColumnValueSelector;
+
+import javax.annotation.Nonnull;
+import javax.annotation.concurrent.GuardedBy;
+import java.nio.ByteBuffer;
+import java.util.IdentityHashMap;
+import java.util.Map;
+
+/**
+ * Aggregator that builds t-digest backed sketches using numeric values read 
from {@link ByteBuffer}
+ */
+public class TDigestBuildSketchBufferAggregator implements BufferAggregator
+{
+
+  @Nonnull
+  private final ColumnValueSelector selector;
+  @Nonnull
+  private final int compression;
+
+  @GuardedBy("this")
+  private Map<ByteBuffer, Int2ObjectMap<MergingDigest>> sketches = new IdentityHashMap<>();
+
+  public TDigestBuildSketchBufferAggregator(
+  final ColumnValueSelector valueSelector,
+  final Integer compression
+  )
+  {
+Preconditions.checkNotNull(valueSelector);
+this.selector = valueSelector;
+if (compression != null) {
+  this.compression = compression;
+} else {
+  this.compression = TDigestBuildSketchAggregator.DEFAULT_COMPRESSION;
+}
+  }
+
+  @Override
+  public synchronized void init(ByteBuffer buffer, int position)
 
 Review comment:
   Yeah, it's lame that the docs don't say what should be synchronized. I 
think the DataSketches implementations are wrong. A `BufferAggregator` doesn't 
have to be synchronized, because concurrent reads and writes can happen only in 
the incremental index. You can see that other `BufferAggregator` 
implementations in druid-core and druid-extensions-core don't do it.
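
   For illustration only, here is a sketch of the case where synchronization 
does matter: an aggregator whose state is read by one thread while another is 
still writing, as can happen with the incremental index. The class below is a 
hypothetical toy that does not implement Druid's `Aggregator` interface. It 
only shows `aggregate()` and `get()` sharing one lock.

```java
public class SynchronizedAggregatorSketch
{
  private long count;
  private double sum;

  /** Writer side: updates both fields under the same lock as get(). */
  public synchronized void aggregate(double value)
  {
    count++;
    sum += value;
  }

  /** Reader side: returns a consistent {count, sum} pair, never a half-applied update. */
  public synchronized double[] get()
  {
    return new double[]{count, sum};
  }

  public static void main(String[] args) throws InterruptedException
  {
    SynchronizedAggregatorSketch agg = new SynchronizedAggregatorSketch();

    // One thread keeps ingesting while the main thread reads, as a query might.
    Thread writer = new Thread(() -> {
      for (int i = 0; i < 100_000; i++) {
        agg.aggregate(1.0);
      }
    });
    writer.start();

    for (int i = 0; i < 1_000; i++) {
      double[] snapshot = agg.get();
      // Every value added is 1.0, so count and sum must always match.
      if (snapshot[0] != snapshot[1]) {
        throw new IllegalStateException("inconsistent read");
      }
    }
    writer.join();
    System.out.println(agg.get()[0]);
  }
}
```

   If reads and writes never overlap, which is the claim made above for 
`BufferAggregator`, the `synchronized` keywords would add cost without 
protecting anything.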





[GitHub] [incubator-druid] jihoonson commented on a change in pull request #7331: TDigest backed sketch aggregators

2019-05-08 Thread GitBox
jihoonson commented on a change in pull request #7331: TDigest backed sketch 
aggregators
URL: https://github.com/apache/incubator-druid/pull/7331#discussion_r282269622
 
 

 ##
 File path: 
extensions-contrib/tdigestsketch/src/main/java/org/apache/druid/query/aggregation/tdigestsketch/TDigestBuildSketchAggregatorFactory.java
 ##
 @@ -0,0 +1,269 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.druid.query.aggregation.tdigestsketch;
+
+import com.fasterxml.jackson.annotation.JsonCreator;
+import com.fasterxml.jackson.annotation.JsonProperty;
+import com.tdunning.math.stats.MergingDigest;
+import com.tdunning.math.stats.TDigest;
+import org.apache.druid.query.aggregation.Aggregator;
+import org.apache.druid.query.aggregation.AggregatorFactory;
+import 
org.apache.druid.query.aggregation.AggregatorFactoryNotMergeableException;
+import org.apache.druid.query.aggregation.AggregatorUtil;
+import org.apache.druid.query.aggregation.BufferAggregator;
+import org.apache.druid.query.cache.CacheKeyBuilder;
+import org.apache.druid.segment.ColumnSelectorFactory;
+import org.apache.druid.segment.ColumnValueSelector;
+import org.apache.druid.segment.column.ColumnCapabilities;
+import org.apache.druid.segment.column.ValueType;
+
+import javax.annotation.Nonnull;
+import javax.annotation.Nullable;
+import java.util.Collections;
+import java.util.Comparator;
+import java.util.List;
+import java.util.Objects;
+
+/**
+ * Aggregation operations over the tdigest-based quantile sketch
+ * available on <a href="https://github.com/tdunning/t-digest">github</a> and described
+ * in the paper
+ * <a href="https://github.com/tdunning/t-digest/blob/master/docs/t-digest-paper/histo.pdf">
+ * Computing extremely accurate quantiles using t-digests</a>.
+ * 
+ * 
+ * At the time of writing this implementation, there are two flavors of {@link 
TDigest}
+ * available - {@link MergingDigest} and {@link 
com.tdunning.math.stats.AVLTreeDigest}.
+ * This implementation uses {@link MergingDigest} since it is more suited for 
the cases
+ * when we have to merge intermediate aggregations which Druid needs to do as
+ * part of query processing.
+ */
+public class TDigestBuildSketchAggregatorFactory extends AggregatorFactory
+{
+
+  // Default compression
+  public static final int DEFAULT_COMRESSION = 100;
+
+  @Nonnull
+  private final String name;
+  @Nonnull
+  private final String fieldName;
+  @Nonnull
+  final Integer compression;
+  @Nonnull
+  private final byte cacheTypeId;
+
+  public static final String TYPE_NAME = "buildTDigestSketch";
+
+  @JsonCreator
+  public TDigestBuildSketchAggregatorFactory(
+  @JsonProperty("name") final String name,
+  @JsonProperty("fieldName") final String fieldName,
+  @Nullable @JsonProperty("compression") final Integer compression
+  )
+  {
+this(name, fieldName, compression, 
AggregatorUtil.TDIGEST_BUILD_SKETCH_CACHE_TYPE_ID);
+  }
+
+  TDigestBuildSketchAggregatorFactory(
+  final String name,
+  final String fieldName,
+  @Nullable final Integer compression,
+  final byte cacheTypeId
+  )
+  {
+Objects.requireNonNull(name, "Must have a valid, non-null aggregator 
name");
+this.name = name;
+Objects.requireNonNull(fieldName, "Parameter fieldName must be specified");
+this.fieldName = fieldName;
+this.compression = compression == null ? DEFAULT_COMRESSION : compression;
+this.cacheTypeId = cacheTypeId;
+  }
+
+
+  @Override
+  public byte[] getCacheKey()
+  {
+return new CacheKeyBuilder(
+cacheTypeId
+).appendString(fieldName).appendInt(compression).build();
+  }
+
+
+  @Override
+  public Aggregator factorize(ColumnSelectorFactory metricFactory)
+  {
+ColumnCapabilities cap = metricFactory.getColumnCapabilities(fieldName);
+if (cap == null || ValueType.isNumeric(cap.getType())) {
+  final ColumnValueSelector selector = 
metricFactory.makeColumnValueSelector(fieldName);
+  return new TDigestBuildSketchAggregator(selector, compression);
+} else {
+  final ColumnValueSelector selector = 
metricFactory.makeColumnValueSelector(fieldName);
+  return new TDigestMergeSketchAggregator(selector, compression);
+}
+  }
+
+  @Override

[GitHub] [incubator-druid] nosahama commented on issue #2523: Support multiple lookups within one namespace

2019-05-08 Thread GitBox
nosahama commented on issue #2523: Support multiple lookups within one namespace
URL: 
https://github.com/apache/incubator-druid/issues/2523#issuecomment-490667638
 
 
   > As requested I am sharing our use case. We're using a TSV in S3 for a 
namespace lookup (at least to start with, we will probably switch over to a 
JDBC source eventually). We have a single key column, which always corresponds 
to the same actual dimension in Druid. We have a dozen lookup columns (could 
grow by a handful, but I'd think no more than 20). And we're starting pretty 
small now with only about 100K rows, but expect that could grow to several 
million rows before too long.
   > 
   > We don't need this updated really frequently. Actually we're still working 
out our ETLs and so forth to deal with revisions and additions to the lookup 
data. But I wouldn't expect us to have updates more frequently than hourly, and 
probably more like daily.
   > 
   > As far as pain points with this arrangement - there is sure plenty of 
boilerplate in the config. I have an array of a dozen entries in 
`druid.query.extraction.namespace.lookups` that are identical in all fields 
except for `namespace` and `valueColumn`. A bit clunky but not so much that I'd 
complain about it really - I did write a couple of simple scripts that generate 
the stuff to be placed in config.
   > 
   > I'm more concerned about the overhead when we do update the lookup source. 
Druid will have to load and parse this (potentially sized) 20 x 3M TSV once per 
lookup. I haven't done any benchmarking but I have noticed that it can take on 
the order of 15 seconds to completely load our current 12 x 100K case. Even if 
it takes a few minutes that is not a gamebreaker (assuming it does not 
interfere with query performance or produce inconsistent results while in 
progress). But it certainly seems like it could be a lot more efficient to load 
and parse the file once instead of 12 or 20 times.
   > 
   > Overall, the configuration and use feels a bit clunky, I think because 
from the user point of view, we have just one "lookup namespace" - there is a 
single source, and a single key column. It would feel more natural to define 
the data source level properties (uri, format, columns) and key column once, 
along with a list of allowed targetColumns, then use it in dimension specs and 
filters by referencing just the one single namespace plus a targetColumn. It 
might start to look like an ingestion spec at that point, with dataSchema- and 
ioConfig-like sections.
   > 
   > But honestly I don't know how much of a priority I'd want it to be. 
Associating a single namespace with a single key column and multiple value 
columns might well be overfitting to our specific case, and it's certainly 
quite usable as it stands.
   > 
   > (One side note, the ability to include columns in the CSV which are not 
key or value columns is useful for assembling the data manually - we can 
include "friendly name" sort of columns that are helpful to people who are 
filling in or auditing the actual lookup data.)
   
   Hi there, I am trying to configure Druid to load a lookup file from S3. How 
do I do this? Do I use the `file:/` syntax, or is there another syntax for 
loading lookups from S3? 





[GitHub] [incubator-druid] jon-wei commented on a change in pull request #7614: Fix exception when using complex aggs with result level caching

2019-05-08 Thread GitBox
jon-wei commented on a change in pull request #7614: Fix exception when using 
complex aggs with result level caching
URL: https://github.com/apache/incubator-druid/pull/7614#discussion_r282239041
 
 

 ##
 File path: 
processing/src/main/java/org/apache/druid/query/groupby/GroupByQueryQueryToolChest.java
 ##
 @@ -566,12 +566,24 @@ public Row apply(Object input)
   DimensionHandlerUtils.convertObjectToType(results.next(), 
dimensionSpec.getOutputType())
   );
 }
-
 Iterator aggsIter = aggs.iterator();
+
+// When using the result level cache, the agg values seen here are
 
 Review comment:
   Extracted to a helper method in `CacheStrategy` (`CacheUtil` is in `server` 
which is not a dependency of `processing`)





[GitHub] [incubator-druid] himanshug merged pull request #7183: add postgresql meta db table schema configuration property (#7137)

2019-05-08 Thread GitBox
himanshug merged pull request #7183: add postgresql meta db table schema 
configuration property (#7137)
URL: https://github.com/apache/incubator-druid/pull/7183
 
 
   





[incubator-druid] branch master updated: add postgresql meta db table schema configuration property (#7137) (#7183)

2019-05-08 Thread himanshug
This is an automated email from the ASF dual-hosted git repository.

himanshug pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/incubator-druid.git


The following commit(s) were added to refs/heads/master by this push:
 new 0ef435a  add postgresql meta db table schema configuration property 
(#7137) (#7183)
0ef435a is described below

commit 0ef435a16c511181bb97f61b230eeadf50d63535
Author: Jinseon Lee 
AuthorDate: Thu May 9 04:56:30 2019 +0900

add postgresql meta db table schema configuration property (#7137) (#7183)

* add postgresql meta db table schema configuration property (#7137)

If the postgresql db schema changes, you must set the configuration
values.
You do not need to set it if there is no change from the default schema
'public'.
druid.metadata.postgres.dbTableSchema=public

* create postgresql metadb table schema configuration property (#7137)
If the postgresql db schema changes, you must set the configuration
values.
You do not need to set it if there is no change from the default schema
'public'.
druid.metadata.postgres.dbTableSchema=public
check PostgreSQLTablesConfig.java

* modify postgresql readme file. - metadb table schema (#7137)
If the postgresql db schema changes, you must set the configuration
values.
You do not need to set it if there is no change from the default schema
'public'.
druid.metadata.postgres.dbTableSchema=public
check PostgreSQLTablesConfig.java
---
 .../development/extensions-core/postgresql.md  |  2 ++
 .../storage/postgresql/PostgreSQLConnector.java| 11 --
 .../PostgreSQLMetadataStorageModule.java   |  1 +
 .../storage/postgresql/PostgreSQLTablesConfig.java | 41 ++
 .../postgresql/PostgreSQLConnectorTest.java|  3 +-
 5 files changed, 54 insertions(+), 4 deletions(-)

diff --git a/docs/content/development/extensions-core/postgresql.md 
b/docs/content/development/extensions-core/postgresql.md
index 07a2a78..26f77fc 100644
--- a/docs/content/development/extensions-core/postgresql.md
+++ b/docs/content/development/extensions-core/postgresql.md
@@ -83,3 +83,5 @@ In most cases, the configuration options map directly to the 
[postgres jdbc conn
 | `druid.metadata.postgres.ssl.sslRootCert` | The full path to the root 
certificate. | none | no |
 | `druid.metadata.postgres.ssl.sslHostNameVerifier` | The classname of the 
hostname verifier. | none | no |
 | `druid.metadata.postgres.ssl.sslPasswordCallback` | The classname of the SSL 
password provider. | none | no |
+| `druid.metadata.postgres.dbTableSchema` | druid meta table schema | `public` 
| no |
+
diff --git 
a/extensions-core/postgresql-metadata-storage/src/main/java/org/apache/druid/metadata/storage/postgresql/PostgreSQLConnector.java
 
b/extensions-core/postgresql-metadata-storage/src/main/java/org/apache/druid/metadata/storage/postgresql/PostgreSQLConnector.java
index e234a15..a474a0b 100644
--- 
a/extensions-core/postgresql-metadata-storage/src/main/java/org/apache/druid/metadata/storage/postgresql/PostgreSQLConnector.java
+++ 
b/extensions-core/postgresql-metadata-storage/src/main/java/org/apache/druid/metadata/storage/postgresql/PostgreSQLConnector.java
@@ -48,11 +48,14 @@ public class PostgreSQLConnector extends 
SQLMetadataConnector
 
   private volatile Boolean canUpsert;
 
+  private final String dbTableSchema;
+  
   @Inject
   public PostgreSQLConnector(
   Supplier config,
   Supplier dbTables,
-  PostgreSQLConnectorConfig connectorConfig
+  PostgreSQLConnectorConfig connectorConfig,
+  PostgreSQLTablesConfig tablesConfig
   )
   {
 super(config, dbTables);
@@ -104,7 +107,8 @@ public class PostgreSQLConnector extends 
SQLMetadataConnector
 }
 
 this.dbi = new DBI(datasource);
-
+this.dbTableSchema = tablesConfig.getDbTableSchema();
+
 log.info("Configured PostgreSQL as metadata storage");
   }
 
@@ -146,8 +150,9 @@ public class PostgreSQLConnector extends 
SQLMetadataConnector
   public boolean tableExists(final Handle handle, final String tableName)
   {
 return !handle.createQuery(
-"SELECT tablename FROM pg_catalog.pg_tables WHERE schemaname = 
'public' AND tablename ILIKE :tableName"
+"SELECT tablename FROM pg_catalog.pg_tables WHERE schemaname = 
:dbTableSchema AND tablename ILIKE :tableName"
 )
+  .bind("dbTableSchema", dbTableSchema)
   .bind("tableName", tableName)
   .map(StringMapper.FIRST)
   .list()
diff --git 
a/extensions-core/postgresql-metadata-storage/src/main/java/org/apache/druid/metadata/storage/postgresql/PostgreSQLMetadataStorageModule.java
 
b/extensions-core/postgresql-metadata-storage/src/main/java/org/apache/druid/metadata/storage/postgresql/PostgreSQLMetadataStorageModule.java
index f10de65..9506edd 100644
--- 

[GitHub] [incubator-druid] himanshug commented on issue #7618: Virtual column updates for exploiting base column internal structure

2019-05-08 Thread GitBox
himanshug commented on issue #7618: Virtual column updates for exploiting base 
column internal structure
URL: https://github.com/apache/incubator-druid/pull/7618#issuecomment-490617207
 
 
   Looks like there is a test failure; will update.





[GitHub] [incubator-druid] himanshug commented on a change in pull request #7606: Set direct memory if unable to detect JVM config

2019-05-08 Thread GitBox
himanshug commented on a change in pull request #7606: Set direct memory if 
unable to detect JVM config
URL: https://github.com/apache/incubator-druid/pull/7606#discussion_r282209983
 
 

 ##
 File path: 
server/src/main/java/org/apache/druid/guice/DruidProcessingModule.java
 ##
 @@ -157,9 +157,16 @@ private void verifyDirectMemory(DruidProcessingConfig 
config)
   }
 }
 catch (UnsupportedOperationException e) {
+  log.debug("Checking for direct memory size is not support on this 
platform: %s", e);
   log.info(
-  "Could not verify that you have enough direct memory, so I hope you 
do! Error message was: %s",
-  e.getMessage()
+  "Unable to determine max direct memory size. If 
-XX:MaxDirectMemorySize is set, make sure "
 
 Review comment:
   If the above change is made, then this should be something like:
   
   "Unable to determine max direct memory size. If 
druid.processing.buffer.sizeBytes is explicitly set, set 
-XX:MaxDirectMemorySize to at least 'druid.processing.buffer.sizeBytes * 
(druid.processing.numMergeBuffers[%,d] + druid.processing.numThreads[%,d] + 1)'; 
otherwise set it to at least 25% of the maximum heap size."
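
As a rough illustration of the sizing rule quoted in the message above, the following sketch computes the minimum value for -XX:MaxDirectMemorySize from a buffer size and the two buffer counts. The class and method names are hypothetical and are not Druid code; only the formula is taken from the comment.

```java
// Hypothetical helper, not part of Druid. It only illustrates the rule
// sizeBytes * (numMergeBuffers + numThreads + 1) from the review comment.
final class DirectMemorySizing
{
  static long requiredDirectMemoryBytes(long bufferSizeBytes, int numMergeBuffers, int numThreads)
  {
    // One buffer per merge buffer, one per processing thread, plus one extra.
    // The 1L forces long arithmetic so large buffer sizes do not overflow int.
    return bufferSizeBytes * (numMergeBuffers + numThreads + 1L);
  }

  public static void main(String[] args)
  {
    // Example values chosen purely for illustration: 512 MiB buffers,
    // 2 merge buffers, 7 processing threads.
    long required = requiredDirectMemoryBytes(512L * 1024 * 1024, 2, 7);
    System.out.println("Set -XX:MaxDirectMemorySize to at least " + required + " bytes");
  }
}
```

With these example values the result is ten buffers of 512 MiB each, that is, 5 GiB of direct memory.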





[GitHub] [incubator-druid] himanshug commented on a change in pull request #7606: Set direct memory if unable to detect JVM config

2019-05-08 Thread GitBox
himanshug commented on a change in pull request #7606: Set direct memory if 
unable to detect JVM config
URL: https://github.com/apache/incubator-druid/pull/7606#discussion_r282205588
 
 

 ##
 File path: 
processing/src/main/java/org/apache/druid/query/DruidProcessingConfig.java
 ##
 @@ -52,7 +52,22 @@ public int intermediateComputeSizeBytes()
   return computedBufferSizeBytes.get();
 }
 
-long directSizeBytes = 
JvmUtils.getRuntimeInfo().getDirectMemorySizeBytes();
+long directSizeBytes;
+try {
+  directSizeBytes = JvmUtils.getRuntimeInfo().getDirectMemorySizeBytes();
 
 Review comment:
   So this always fails for JDK 9 onwards?
   If yes, that means that if a user wanted buffers of size 
DEFAULT_PROCESSING_BUFFER_SIZE_BYTES, they would configure that number but 
instead get buffers of size 25%-of-max-heap/totalNumBuffers?
   In that case we need to change line 49 to have a different way of detecting 
whether the user explicitly provided the sizeBytes config.
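
One possible way to keep "explicitly configured" separate from "left at the default" is to model the setting as an optional value instead of comparing it against a default constant. The sketch below is only an assumption about how that could look. `BufferSizeResolver` and its parameters are invented for illustration and do not correspond to the actual `DruidProcessingConfig` API.

```java
import java.util.OptionalInt;

// Hypothetical sketch, not Druid's DruidProcessingConfig. The names here are
// invented to show how an explicit setting can be told apart from a default.
final class BufferSizeResolver
{
  // An empty OptionalInt means the user did not configure a size
  // (an assumption of this sketch). totalNumBuffers must be positive.
  static int resolve(OptionalInt configuredSizeBytes, long maxHeapBytes, int totalNumBuffers)
  {
    if (configuredSizeBytes.isPresent()) {
      // Always honor an explicit value, even if it equals the built-in default.
      return configuredSizeBytes.getAsInt();
    }
    // Fallback mentioned in the comment: 25% of max heap split across the buffers.
    long perBuffer = (maxHeapBytes / 4) / totalNumBuffers;
    return (int) Math.min(perBuffer, Integer.MAX_VALUE);
  }
}
```

With this shape, a user who explicitly configures the default size still gets exactly that size, because the fallback branch is only reached when no value was supplied.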





[GitHub] [incubator-druid] leerho commented on issue #7607: thetaSketch(with sketches-core-0.13.1) in groupBy always return value no more than 16384

2019-05-08 Thread GitBox
leerho commented on issue #7607: thetaSketch(with sketches-core-0.13.1) in 
groupBy always return value no more than 16384
URL: 
https://github.com/apache/incubator-druid/issues/7607#issuecomment-490609093
 
 
   Thank you!!  We have been able to reproduce the problem.  Now I can dig in 
to see what went wrong.
   
   





[GitHub] [incubator-druid] himanshug opened a new pull request #7618: Virtual column updates for exploiting base column internal structure

2019-05-08 Thread GitBox
himanshug opened a new pull request #7618: Virtual column updates for 
exploiting base column internal structure
URL: https://github.com/apache/incubator-druid/pull/7618
 
 
   Fixes #7574 
   
   This patch adds more methods to the VirtualColumn interface to exploit the 
base column's internal structure, and makes a few updates to the interface's 
users so that they use those methods.
   Unit tests are introduced to ensure the expected use of the VirtualColumn 
interface in Druid code.





[GitHub] [incubator-druid] jihoonson commented on a change in pull request #7611: Add plain text README.txt, use relative link from README.md to build.md

2019-05-08 Thread GitBox
jihoonson commented on a change in pull request #7611: Add plain text 
README.txt, use relative link from README.md to build.md
URL: https://github.com/apache/incubator-druid/pull/7611#discussion_r282174091
 
 

 ##
 File path: distribution/src/assembly/source-assembly.xml
 ##
 @@ -47,6 +47,7 @@
 .gitignore
 .dockerignore
 .travis.yml
+README.md
 
 Review comment:
   Should we exclude this?





[GitHub] [incubator-druid] jihoonson commented on a change in pull request #7611: Add plain text README.txt, use relative link from README.md to build.md

2019-05-08 Thread GitBox
jihoonson commented on a change in pull request #7611: Add plain text 
README.txt, use relative link from README.md to build.md
URL: https://github.com/apache/incubator-druid/pull/7611#discussion_r282172376
 
 

 ##
 File path: README
 ##
 @@ -0,0 +1,89 @@
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+  http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+
+
+Apache Druid (incubating) is a high performance analytics data store for 
event-driven data. More information about Druid
+can be found on http://www.druid.io.
+
+The Druid community is in the process of migrating to Apache by way of the 
Apache Incubator. Eventually, as we proceed
+along this path, our site will move from http://druid.io/ to 
https://druid.apache.org/.
+
+
+Documentation
+-
 
 Review comment:
   nit: would it look aesthetically better if the bar had the same length as 
the title? Same question for other sections.





[GitHub] [incubator-druid] gianm commented on issue #7607: thetaSketch(with sketches-core-0.13.1) in groupBy always return value no more than 16384

2019-05-08 Thread GitBox
gianm commented on issue #7607: thetaSketch(with sketches-core-0.13.1) in 
groupBy always return value no more than 16384
URL: 
https://github.com/apache/incubator-druid/issues/7607#issuecomment-490573859
 
 
   @leerho, please let me know if the following is helpful, or if I could do 
anything else to help.
   
   What the Druid query is doing is something like this:
   
   1. Iterating over all rows in a Druid segment, and building up a theta 
sketch object. This object looks fine.
   2. Taking that object and merging it into the 'merge buffer', which starts 
off initialized to an empty sketch. This is where it goes off the rails.
   
   I scattered a bunch of sketch toStrings around the code and found that in 
step (2) they look like this:
   
   **The object built up from the segment scan,**
   
   ```
   ### HeapCompactOrderedSketch SUMMARY:
  Estimate: 100086.81001356241
  Upper Bound, 95% conf   : 101530.78009013624
  Lower Bound, 95% conf   : 98663.31883662633
  Theta (double)  : 0.16369789383615946
  Theta (long): 1509846576500454824
  Theta (long) hex: 14f40d7639a635a8
  EstMode?: true
  Empty?  : false
  Array Size Entries  : 16384
  Retained Entries: 16384
  Seed Hash   : 93cc | 37836
   ### END SKETCH SUMMARY
   ```
   
   **The initial state of the sketch in the merge buffer (should be empty),**
   
   ```
   ### HeapCompactOrderedSketch SUMMARY:
  Estimate: 0.0
  Upper Bound, 95% conf   : 0.0
  Lower Bound, 95% conf   : 0.0
  Theta (double)  : 1.0
  Theta (long): 9223372036854775807
  Theta (long) hex: 7fffffffffffffff
  EstMode?: false
  Empty?  : true
  Array Size Entries  : 0
  Retained Entries: 0
  Seed Hash   : 93cc | 37836
   ### END SKETCH SUMMARY
   ```
   
   **The final state of the sketch in the merge buffer (should match the 
original sketch from the segment scan),**
   
   ```
   ### HeapCompactOrderedSketch SUMMARY:
  Estimate: 16384.0
  Upper Bound, 95% conf   : 16384.0
  Lower Bound, 95% conf   : 16384.0
  Theta (double)  : 1.0
  Theta (long): 9223372036854775807
  Theta (long) hex: 7fffffffffffffff
  EstMode?: false
  Empty?  : false
  Array Size Entries  : 16384
  Retained Entries: 16384
  Seed Hash   : 93cc | 37836
   ### END SKETCH SUMMARY
   ```
   
   It's changed a bit, but doesn't match up.
   
   The code that printed this was the `aggregate` method in 
SketchBufferAggregator, which looks like this after the debugging code I added:
   
   ```java
 @Override
 public void aggregate(ByteBuffer buf, int position)
 {
   Object update = selector.getObject();
   if (update == null) {
 return;
   }
   
   Union union = getOrCreateUnion(buf, position);
   final String initialUnionResult = update instanceof SketchHolder ? 
union.getResult().toString() : null;
   
   SketchAggregator.updateUnion(union, update);
   
   if (update instanceof SketchHolder) {
 log.info(
 "Aggregate called with buffer[%s], position[%s], update = %s, 
union starts as = %s, union ends as = %s",
 System.identityHashCode(buf),
 position,
 ((SketchHolder) update).getSketch(),
 initialUnionResult,
 union.getResult()
 );
   }
 }
   ```





[GitHub] [incubator-druid] leerho commented on issue #7607: thetaSketch(with sketches-core-0.13.1) in groupBy always return value no more than 16384

2019-05-08 Thread GitBox
leerho commented on issue #7607: thetaSketch(with sketches-core-0.13.1) in 
groupBy always return value no more than 16384
URL: 
https://github.com/apache/incubator-druid/issues/7607#issuecomment-490570278
 
 
   Very puzzling.  We need to simplify the problem environment to the point 
where I can reproduce the problem outside Druid.  I suspect that somehow theta 
is being reset to 1.0, which would cause this.  
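
To see why a theta of 1.0 would produce exactly this symptom, recall that a theta sketch estimates cardinality as the number of retained entries divided by theta. The sketch below is a simplified model of that arithmetic and does not use the sketches-core API. `ThetaEstimate` is a hypothetical name, and the input values are copied from the sketch summaries printed in this thread.

```java
// Simplified model of a theta sketch estimate. This is not the sketches-core
// API; it only reproduces the arithmetic behind the reported numbers.
final class ThetaEstimate
{
  // The cardinality estimate is retainedEntries / theta, where theta in (0, 1]
  // is the fraction of the hash space that the sketch is sampling.
  static double estimate(int retainedEntries, double theta)
  {
    return retainedEntries / theta;
  }

  public static void main(String[] args)
  {
    // Sketch built from the segment scan: 16384 entries, theta about 0.1637.
    System.out.println(estimate(16384, 0.16369789383615946));
    // Sketch in the merge buffer: same 16384 entries, but theta is 1.0.
    System.out.println(estimate(16384, 1.0));
  }
}
```

With theta at 1.0 the estimate collapses to the retained entry count, which is why the result never exceeds 16384.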





[GitHub] [incubator-druid] leventov commented on a change in pull request #7595: Optimize overshadowed segments computation

2019-05-08 Thread GitBox
leventov commented on a change in pull request #7595: Optimize overshadowed 
segments computation
URL: https://github.com/apache/incubator-druid/pull/7595#discussion_r282157494
 
 

 ##
 File path: 
server/src/main/java/org/apache/druid/metadata/SQLMetadataSegmentManager.java
 ##
 @@ -744,6 +757,32 @@ public DataSegment map(int index, ResultSet r, 
StatementContext ctx) throws SQLE
 
 // Replace "dataSources" atomically.
 dataSources = newDataSources;
+overshadowedSegments = 
ImmutableSet.copyOf(determineOvershadowedSegments(segments));
 
 Review comment:
   This class changes `dataSources` in places other than `doPoll()`, and those 
changes may invalidate the computed overshadowed segments. (Even if they 
don't, there should be comments explaining why.)





[GitHub] [incubator-druid] leventov commented on a change in pull request #7595: Optimize overshadowed segments computation

2019-05-08 Thread GitBox
leventov commented on a change in pull request #7595: Optimize overshadowed 
segments computation
URL: https://github.com/apache/incubator-druid/pull/7595#discussion_r282159218
 
 

 ##
 File path: 
server/src/main/java/org/apache/druid/server/coordinator/helper/DruidCoordinatorRuleRunner.java
 ##
 @@ -84,8 +85,10 @@ public DruidCoordinatorRuntimeParams 
run(DruidCoordinatorRuntimeParams params)
 // find available segments which are not overshadowed by other segments in 
DB
 // only those would need to be loaded/dropped
 // anything overshadowed by served segments is dropped automatically by 
DruidCoordinatorCleanupOvershadowed
-final Set overshadowed = ImmutableDruidDataSource
-.determineOvershadowedSegments(params.getAvailableSegments());
+// If metadata store hasn't been polled yet, use empty overshadowed list
+final Collection overshadowed = Optional
+
.ofNullable(coordinator.getMetadataSegmentManager().findOvershadowedSegments())
 
 Review comment:
   There are some concurrency issues: you may observe `overshadowedSegments` 
from a previous run. I think it would be easier to add `isOvershadowed` field 
to `DataSegment` directly in this PR because that's the plan anyway and that 
would allow avoiding concurrency issues.
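
For illustration, the stale-read problem described above can also be avoided by publishing the segments and their overshadowed subset together in one immutable snapshot. This is an alternative sketch, not the `isOvershadowed` approach proposed in the comment. All names below are hypothetical, and segments are modeled as plain string ids.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch: none of these names are Druid classes.
final class SegmentSnapshotHolder
{
  // Both sets are defensively copied and wrapped as unmodifiable, so a reader
  // can never combine segments from one poll with overshadowed ids from another.
  static final class Snapshot
  {
    final Set<String> segmentIds;
    final Set<String> overshadowedSegmentIds;

    Snapshot(Set<String> segmentIds, Set<String> overshadowedSegmentIds)
    {
      this.segmentIds = Collections.unmodifiableSet(new HashSet<>(segmentIds));
      this.overshadowedSegmentIds = Collections.unmodifiableSet(new HashSet<>(overshadowedSegmentIds));
    }
  }

  // A single volatile write publishes both sets at once.
  private volatile Snapshot snapshot;

  void publish(Set<String> segmentIds, Set<String> overshadowedSegmentIds)
  {
    snapshot = new Snapshot(segmentIds, overshadowedSegmentIds);
  }

  // Returns null until the first poll has been published.
  Snapshot current()
  {
    return snapshot;
  }
}
```

A reader calls `current()` once and uses both fields of the returned object, so the two views always come from the same poll.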





[GitHub] [incubator-druid] leventov commented on a change in pull request #7595: Optimize overshadowed segments computation

2019-05-08 Thread GitBox
leventov commented on a change in pull request #7595: Optimize overshadowed 
segments computation
URL: https://github.com/apache/incubator-druid/pull/7595#discussion_r282156967
 
 

 ##
 File path: 
core/src/main/java/org/apache/druid/timeline/DataSegmentWithOvershadowedStatus.java
 ##
 @@ -25,16 +25,16 @@
 /**
  * DataSegment object plus the overshadowed status for the segment. An 
immutable object.
  *
- * SegmentWithOvershadowedStatus's {@link #compareTo} method considers only 
the {@link SegmentId}
+ * DataSegmentWithOvershadowedStatus's {@link #compareTo} method considers 
only the {@link SegmentId}
  * of the DataSegment object.
  */
-public class SegmentWithOvershadowedStatus implements 
Comparable
+public class DataSegmentWithOvershadowedStatus implements 
Comparable
 
 Review comment:
   I don't think this is a good rename. I think ["DataSegment" is an 
unfortunate name](https://github.com/apache/incubator-druid/issues/7396), so 
why propagate it?





[GitHub] [incubator-druid] leventov commented on a change in pull request #7595: Optimize overshadowed segments computation

2019-05-08 Thread GitBox
leventov commented on a change in pull request #7595: Optimize overshadowed 
segments computation
URL: https://github.com/apache/incubator-druid/pull/7595#discussion_r282158038
 
 

 ##
 File path: 
server/src/main/java/org/apache/druid/metadata/SQLMetadataSegmentManager.java
 ##
 @@ -744,6 +757,32 @@ public DataSegment map(int index, ResultSet r, 
StatementContext ctx) throws SQLE
 
 // Replace "dataSources" atomically.
 dataSources = newDataSources;
+overshadowedSegments = 
ImmutableSet.copyOf(determineOvershadowedSegments(segments));
+  }
+
+  /**
+   * This method builds a timeline from given segments and finds the 
overshadowed segments
+   *
+   * @return set of overshadowed segments
+   */
+  private Set determineOvershadowedSegments(Iterable 
segments)
 
 Review comment:
   This doesn't need to be a Set; you copy it into another Set anyway.





[GitHub] [incubator-druid] leventov commented on a change in pull request #7595: Optimize overshadowed segments computation

2019-05-08 Thread GitBox
leventov commented on a change in pull request #7595: Optimize overshadowed 
segments computation
URL: https://github.com/apache/incubator-druid/pull/7595#discussion_r282159420
 
 

 ##
 File path: 
server/src/main/java/org/apache/druid/metadata/MetadataSegmentManager.java
 ##
 @@ -98,6 +98,15 @@
 
   Collection getAllDataSourceNames();
 
+  /**
+   * Returns a collection of overshadowed segments
+   *
+   * Will return null if we do not have a valid snapshot of segments yet 
(perhaps the underlying metadata store has
+   * not yet been polled.)
+   */
+  @Nullable
+  Collection findOvershadowedSegments();
 
 Review comment:
   A method that is a simple getter shouldn't be called `find...`, because 
that name implies a high cost.





[GitHub] [incubator-druid] drcrallen commented on issue #6948: Add guava compatability up to 27.0.1

2019-05-08 Thread GitBox
drcrallen commented on issue #6948: Add guava compatability up to 27.0.1
URL: https://github.com/apache/incubator-druid/pull/6948#issuecomment-490566931
 
 
   go away stale bot!





[GitHub] [incubator-druid] leventov commented on issue #6358: Interning in SQLMetadataSegmentManager may obliterate new segment metadata

2019-05-08 Thread GitBox
leventov commented on issue #6358: Interning in SQLMetadataSegmentManager may 
obliterate new segment metadata
URL: 
https://github.com/apache/incubator-druid/issues/6358#issuecomment-490553573
 
 
   @surekhasaharan I meant to add it as an error. So every time a developer 
wants to have `Set` in their code, they must put 
`@SuppressWarnings` (with justification) to pass CI.





[GitHub] [incubator-druid] gianm commented on issue #7607: thetaSketch(with sketches-core-0.13.1) in groupBy always return value no more than 16384

2019-05-08 Thread GitBox
gianm commented on issue #7607: thetaSketch(with sketches-core-0.13.1) in 
groupBy always return value no more than 16384
URL: 
https://github.com/apache/incubator-druid/issues/7607#issuecomment-490552474
 
 
   Thank you, @pzhdfy, for the detailed instructions on how to reproduce this 
problem.





[GitHub] [incubator-druid] gianm commented on issue #7607: thetaSketch(with sketches-core-0.13.1) in groupBy always return value no more than 16384

2019-05-08 Thread GitBox
gianm commented on issue #7607: thetaSketch(with sketches-core-0.13.1) in 
groupBy always return value no more than 16384
URL: 
https://github.com/apache/incubator-druid/issues/7607#issuecomment-490552264
 
 
   I was able to reproduce this as well. Downgrading to sketches-core-0.13.0 
fixed the problem. I also noticed that adding a limit to the groupBy fixed it 
as well. I'm not sure why - it does change the code paths, however. In Druid 
SQL, this query exhibits the issue:
   
   ```sql
   SELECT 'beep', APPROX_COUNT_DISTINCT_DS_THETA("user_id") FROM test_theta 
GROUP BY 1
   ```
   
   And this one doesn't:
   
   ```sql
   SELECT 'beep', APPROX_COUNT_DISTINCT_DS_THETA("user_id") FROM test_theta 
GROUP BY 1 LIMIT 1
   ```
   
   (The 'beep' is to force the SQL planner to use a groupBy rather than 
timeseries query type.)





[GitHub] [incubator-druid] gianm commented on a change in pull request #7614: Fix exception when using complex aggs with result level caching

2019-05-08 Thread GitBox
gianm commented on a change in pull request #7614: Fix exception when using 
complex aggs with result level caching
URL: https://github.com/apache/incubator-druid/pull/7614#discussion_r282136139
 
 

 ##
 File path: 
processing/src/main/java/org/apache/druid/query/groupby/GroupByQueryQueryToolChest.java
 ##
 @@ -566,12 +566,24 @@ public Row apply(Object input)
   DimensionHandlerUtils.convertObjectToType(results.next(), 
dimensionSpec.getOutputType())
   );
 }
-
 Iterator aggsIter = aggs.iterator();
+
+// When using the result level cache, the agg values seen here are
 
 Review comment:
   There's a long comment and identical block in each toolchest -- perhaps 
extract to a helper method in `CacheUtil`?





[GitHub] [incubator-druid] leerho commented on issue #7607: thetaSketch(with sketches-core-0.13.1) in groupBy always return value no more than 16384

2019-05-08 Thread GitBox
leerho commented on issue #7607: thetaSketch(with sketches-core-0.13.1) in 
groupBy always return value no more than 16384
URL: 
https://github.com/apache/incubator-druid/issues/7607#issuecomment-490541495
 
 
   @pzhdfy 
   1) Please try sketches release 0.13.0.  This will narrow down the possible 
changes that might be causing this.
   2) At the point where the sketch is reporting the bad result print out the 
sketch summary using  toString().  This might provide some clues.
   
   





[GitHub] [incubator-druid] leerho edited a comment on issue #7607: thetaSketch(with sketches-core-0.13.1) in groupBy always return value no more than 16384

2019-05-08 Thread GitBox
leerho edited a comment on issue #7607: thetaSketch(with sketches-core-0.13.1) 
in groupBy always return value no more than 16384
URL: 
https://github.com/apache/incubator-druid/issues/7607#issuecomment-490541495
 
 
   @pzhdfy 
   1) Please try sketches release 0.13.0.  This will narrow down the possible 
changes that might be causing this.
   2) At the points where the sketch is reporting the good result and the bad 
result, print out the sketch summary using toString().  This might provide some 
clues.
   
   





[GitHub] [incubator-druid] gianm commented on issue #7607: thetaSketch(with sketches-core-0.13.1) in groupBy always return value no more than 16384

2019-05-08 Thread GitBox
gianm commented on issue #7607: thetaSketch(with sketches-core-0.13.1) in 
groupBy always return value no more than 16384
URL: 
https://github.com/apache/incubator-druid/issues/7607#issuecomment-490541004
 
 
   In groupBy vs topN, as far as aggregators are concerned, one major 
difference is that groupBy uses `relocate` and topN does not. However, since 
you have just one `date_id`, I don't think it's likely that `relocate` would be 
called. So that's probably not it.
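
   A minimal sketch of why `relocate` matters for aggregators that keep their 
real state on-heap, keyed by (buffer, position). The class `RelocateSketch` and 
its `long[]` running-sum state are hypothetical stand-ins, not Druid's actual 
`BufferAggregator`; only the four-argument shape of `relocate` is modeled on it.

```java
import java.nio.ByteBuffer;
import java.util.HashMap;
import java.util.IdentityHashMap;
import java.util.Map;

/** Hypothetical, simplified stand-in for a buffer-backed aggregator. */
public class RelocateSketch {
    // On-heap state keyed first by buffer identity, then by offset in that buffer.
    private final Map<ByteBuffer, Map<Integer, long[]>> states = new IdentityHashMap<>();

    public synchronized void init(ByteBuffer buf, int position) {
        states.computeIfAbsent(buf, b -> new HashMap<>()).put(position, new long[]{0L});
    }

    public synchronized void aggregate(ByteBuffer buf, int position, long value) {
        states.get(buf).get(position)[0] += value;
    }

    public synchronized long get(ByteBuffer buf, int position) {
        return states.get(buf).get(position)[0];
    }

    // When the engine moves a row's slot (for example when a hash table grows),
    // the on-heap state must follow it, otherwise later lookups miss it.
    public synchronized void relocate(int oldPos, int newPos, ByteBuffer oldBuf, ByteBuffer newBuf) {
        Map<Integer, long[]> oldMap = states.get(oldBuf);
        long[] state = oldMap.remove(oldPos);
        if (oldMap.isEmpty()) {
            states.remove(oldBuf);
        }
        states.computeIfAbsent(newBuf, b -> new HashMap<>()).put(newPos, state);
    }

    public static void main(String[] args) {
        RelocateSketch agg = new RelocateSketch();
        ByteBuffer a = ByteBuffer.allocate(16);
        ByteBuffer b = ByteBuffer.allocate(32);
        agg.init(a, 0);
        agg.aggregate(a, 0, 5);
        agg.aggregate(a, 0, 7);
        agg.relocate(0, 8, a, b);
        System.out.println(agg.get(b, 8)); // prints 12
    }
}
```

   If `relocate` were a no-op here, the state would stay registered under the 
old buffer and position, which is the kind of bug that shows up only in engines 
that move slots.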





[GitHub] [incubator-druid] gianm commented on issue #7617: Segments/Tables getting dropped unwantedly

2019-05-08 Thread GitBox
gianm commented on issue #7617: Segments/Tables getting dropped unwantedly
URL: 
https://github.com/apache/incubator-druid/issues/7617#issuecomment-490530879
 
 
   @Shailesh-Pandey do you see a full stack trace somewhere, perhaps in the 
broker logs?





[GitHub] [incubator-druid] Shailesh-Pandey opened a new issue #7617: Segments/Tables getting dropped unwantedly

2019-05-08 Thread GitBox
Shailesh-Pandey opened a new issue #7617: Segments/Tables getting dropped 
unwantedly
URL: https://github.com/apache/incubator-druid/issues/7617
 
 
   druid version:0.13.0
   <
   dsql> select * from tablename;
   
   java.lang.RuntimeException: Error while applying rule DruidTableScanRule, 
args [rel#16916567:LogicalTableScan.NONE.[](table=[druid, tablename])]
   >
   
   On the Druid overlord console the dropped segments are not visible, but a 
"\d" on dsql shows the table names.
   
   Data was ingested into one of the dropped tables almost every second (a lot 
of data), while another was ingested only roughly hourly. All indexes are 
realtime.
   Any idea what the issue could be, or what the initial approach to 
troubleshooting should be?





[GitHub] [incubator-druid] doctording opened a new issue #7616: jvm fatal error of index_realtime task on middleManager

2019-05-08 Thread GitBox
doctording opened a new issue #7616:  jvm fatal error of index_realtime task on 
middleManager
URL: https://github.com/apache/incubator-druid/issues/7616
 
 
   Druid 0.12.3
   
   In the task log file (after the smoosh file is created, before the copy to deep storage):
   
   ```java
   io.druid.java.util.common.io.smoosh.FileSmoosher - Created smoosh file ... 
0.smoosh] of size [2049254] bytes.
   #
   # A fatal error has been detected by the Java Runtime Environment:
   #
   #  SIGSEGV (0xb) at pc=0x, pid=23022, tid=0x7f56ece41700
   #
   # JRE version: Java(TM) SE Runtime Environment (8.0_171-b11) (build 
1.8.0_171-b11)
   # Java VM: Java HotSpot(TM) 64-Bit Server VM (25.171-b11 mixed mode 
linux-amd64 compressed oops)
   # Problematic frame:
   # C  0x
   #
   # Failed to write core dump. Core dumps have been disabled. To enable core 
dumping, try "ulimit -c unlimited" before starting Java again
   #
   # An error report file with more information is saved as:
   # /usr/local/imply-2.7.8/hs_err_pid23022.log
   [thread 140011328812800 also had an error]
   [thread 140011325654784 also had an error]
   [thread 140011323549440 also had an error][thread 140011317233408 also had 
an error]
   
   [thread 140011324602112 also had an error]
   [thread 140011326707456 also had an error]
   [thread 140011321444096 also had an error]
   [thread 140011319338752 also had an error]
   #
   # If you would like to submit a bug report, please visit:
   #   http://bugreport.java.com/bugreport/crash.jsp
   #
   ```





[GitHub] [incubator-druid] doctording removed a comment on issue #7032: KIS task crashes with JVM segfault

2019-05-08 Thread GitBox
doctording removed a comment on issue #7032: KIS task crashes with JVM segfault
URL: 
https://github.com/apache/incubator-druid/issues/7032#issuecomment-490459144
 
 
   +1 druid 0.12.3





[GitHub] [incubator-druid] doctording commented on issue #7032: KIS task crashes with JVM segfault

2019-05-08 Thread GitBox
doctording commented on issue #7032: KIS task crashes with JVM segfault
URL: 
https://github.com/apache/incubator-druid/issues/7032#issuecomment-490459144
 
 
   +1 druid 0.12.3





[GitHub] [incubator-druid] leventov opened a new issue #7615: Add endpoint to kill all segments belonging to a data source

2019-05-08 Thread GitBox
leventov opened a new issue #7615: Add endpoint to kill all segments belonging 
to a data source
URL: https://github.com/apache/incubator-druid/issues/7615
 
 
   
https://github.com/apache/incubator-druid/blob/9b197b436b4489195ad589b44978d0a9e08d2c3f/web-console/src/views/datasource-view.tsx#L306
   
   It's better to add an endpoint to kill all data properly rather than using 
the `1000/3000` interval workaround.
   
   FYI @vogievetsky @surekhasaharan  





[GitHub] [incubator-druid] clintropolis closed pull request #7612: add git porcelain check to travis

2019-05-08 Thread GitBox
clintropolis closed pull request #7612: add git porcelain check to travis
URL: https://github.com/apache/incubator-druid/pull/7612
 
 
   





[GitHub] [incubator-druid] samarthjain commented on a change in pull request #7331: TDigest backed sketch aggregators

2019-05-08 Thread GitBox
samarthjain commented on a change in pull request #7331: TDigest backed sketch 
aggregators
URL: https://github.com/apache/incubator-druid/pull/7331#discussion_r281958380
 
 

 ##
 File path: 
extensions-contrib/tdigestsketch/src/main/java/org/apache/druid/query/aggregation/tdigestsketch/TDigestBuildSketchBufferAggregator.java
 ##
 @@ -0,0 +1,125 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.druid.query.aggregation.tdigestsketch;
+
+import com.google.common.base.Preconditions;
+import com.tdunning.math.stats.MergingDigest;
+import it.unimi.dsi.fastutil.ints.Int2ObjectMap;
+import it.unimi.dsi.fastutil.ints.Int2ObjectOpenHashMap;
+import org.apache.druid.java.util.common.IAE;
+import org.apache.druid.query.aggregation.BufferAggregator;
+import org.apache.druid.segment.ColumnValueSelector;
+
+import javax.annotation.Nonnull;
+import javax.annotation.concurrent.GuardedBy;
+import java.nio.ByteBuffer;
+import java.util.IdentityHashMap;
+import java.util.Map;
+
+/**
+ * Aggregator that builds t-digest backed sketches using numeric values read from {@link ByteBuffer}
+ */
+public class TDigestBuildSketchBufferAggregator implements BufferAggregator
+{
+
+  @Nonnull
+  private final ColumnValueSelector selector;
+  @Nonnull
+  private final int compression;
+
+  @GuardedBy("this")
+  private Map<ByteBuffer, Int2ObjectMap<MergingDigest>> sketches = new IdentityHashMap<>();
+
+  public TDigestBuildSketchBufferAggregator(
+      final ColumnValueSelector valueSelector,
+      final Integer compression
+  )
+  {
+    Preconditions.checkNotNull(valueSelector);
+    this.selector = valueSelector;
+    if (compression != null) {
+      this.compression = compression;
+    } else {
+      this.compression = TDigestBuildSketchAggregator.DEFAULT_COMPRESSION;
+    }
+  }
+
+  @Override
+  public synchronized void init(ByteBuffer buffer, int position)
 
 Review comment:
   @jihoonson - unfortunately the documentation on the base classes/interfaces 
doesn't clearly mention which methods could be called in a multi-threaded 
fashion. So I ended up following what the DataSketches implementation does. 
   For example:
   
https://github.com/apache/incubator-druid/blob/master/extensions-core/datasketches/src/main/java/org/apache/druid/query/aggregation/datasketches/quantiles/DoublesSketchBuildBufferAggregator.java#L54
   





[GitHub] [incubator-druid] samarthjain commented on a change in pull request #7331: TDigest backed sketch aggregators

2019-05-08 Thread GitBox
samarthjain commented on a change in pull request #7331: TDigest backed sketch 
aggregators
URL: https://github.com/apache/incubator-druid/pull/7331#discussion_r281958692
 
 

 ##
 File path: 
extensions-contrib/tdigestsketch/src/test/java/org/apache/druid/query/aggregation/tdigestsketch/TDigestSketchAggregatorTest.java
 ##
 @@ -0,0 +1,284 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.druid.query.aggregation.tdigestsketch;
+
+import com.fasterxml.jackson.databind.ObjectMapper;
+import org.apache.druid.data.input.Row;
+import org.apache.druid.jackson.DefaultObjectMapper;
+import org.apache.druid.java.util.common.granularity.Granularities;
+import org.apache.druid.java.util.common.guava.Sequence;
+import org.apache.druid.query.aggregation.AggregationTestHelper;
+import org.apache.druid.query.aggregation.AggregatorFactory;
+import org.apache.druid.query.groupby.GroupByQueryConfig;
+import org.apache.druid.query.groupby.GroupByQueryRunnerTest;
+import org.junit.Assert;
+import org.junit.Rule;
+import org.junit.Test;
+import org.junit.rules.TemporaryFolder;
+import org.junit.runner.RunWith;
+import org.junit.runners.Parameterized;
+
+import java.io.File;
+import java.util.ArrayList;
+import java.util.Collection;
+import java.util.List;
+
+@RunWith(Parameterized.class)
+public class TDigestSketchAggregatorTest
+{
+
+  private final AggregationTestHelper helper;
+  private final AggregationTestHelper timeSeriesHelper;
 
 Review comment:
   Done





[GitHub] [incubator-druid] mihai-cazacu-adswizz edited a comment on issue #7597: [materialized view] The generated specification is too big

2019-05-08 Thread GitBox
mihai-cazacu-adswizz edited a comment on issue #7597: [materialized view] The 
generated specification is too big
URL: 
https://github.com/apache/incubator-druid/issues/7597#issuecomment-490081669
 
 
   I have increased the `druid.indexer.runner.maxZnodeBytes` value to `2.5MB` 
and everything worked fine until the Supervisor reached an interval with many 
segments (the [created 
JSON](https://github.com/apache/incubator-druid/blob/master/indexing-service/src/main/java/org/apache/druid/indexing/overlord/RemoteTaskRunner.java#L863)
 is ~`1.5MB`). From this point on, the `Waiting Tasks - Tasks waiting 
on locks` section in the Overlord filled with dozens of MV tasks (for the same 
data source: `index_materialized_view_test_2019-05-07...`). 
   
   The error:
   
   ```
   ERROR [LeaderSelector[/druid/druid-prod/overlord/_OVERLORD]] 
org.apache.druid.curator.discovery.CuratorDruidLeaderSelector - listener 
becomeLeader() failed. Unable to become leader: 
{class=org.apache.druid.curator.discovery.CuratorDruidLeaderSelector, 
exceptionType=class org.apache.druid.java.util.common.ISE, 
exceptionMessage=Could not reacquire lock on 
interval[2019-02-21T00:00:00.000Z/2019-02-22T00:00:00.000Z] 
version[2019-05-07T11:24:03.543Z] for task: 
index_materialized_view_test_2019-05-07T11:16:05.026Z}
   org.apache.druid.java.util.common.ISE: Could not reacquire lock on 
interval[2019-02-21T00:00:00.000Z/2019-02-22T00:00:00.000Z] 
version[2019-05-07T11:24:03.543Z] for task: 
index_materialized_view_test_2019-05-07T11:16:05.026Z
   at 
org.apache.druid.indexing.overlord.TaskLockbox.syncFromStorage(TaskLockbox.java:171)
 ~[druid-indexing-service-0.13.0-incubating.jar:0.13.0-incubating]
   at 
org.apache.druid.indexing.overlord.TaskMaster$1.becomeLeader(TaskMaster.java:109)
 ~[druid-indexing-service-0.13.0-incubating.jar:0.13.0-incubating]
   at 
org.apache.druid.curator.discovery.CuratorDruidLeaderSelector$1.isLeader(CuratorDruidLeaderSelector.java:98)
 [druid-server-0.13.0-incubating.jar:0.13.0-incubating]
   at 
org.apache.curator.framework.recipes.leader.LeaderLatch$9.apply(LeaderLatch.java:665)
 [curator-recipes-4.0.0.jar:4.0.0]
   at 
org.apache.curator.framework.recipes.leader.LeaderLatch$9.apply(LeaderLatch.java:661)
 [curator-recipes-4.0.0.jar:4.0.0]
   at 
org.apache.curator.framework.listen.ListenerContainer$1.run(ListenerContainer.java:93)
 [curator-framework-4.0.0.jar:4.0.0]
   at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) 
[?:1.8.0_201]
   at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) 
[?:1.8.0_201]
   at java.lang.Thread.run(Thread.java:748) [?:1.8.0_201]
   ```
   
   Because of this error, the Overlord has stopped responding.
   
   Also, all those waiting tasks have the same payload. If I don't suspend the 
Supervisor, the number of waiting tasks keeps growing.
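
   A rough sketch of the size check involved, assuming `2.5MB` means 2.5 * 1024 
* 1024 bytes and assuming a default limit of 524288 bytes for 
`druid.indexer.runner.maxZnodeBytes`. The class `ZnodePayloadCheck` and the 
method `fitsInZnode` are hypothetical names for illustration, not Druid code. 
Note that ZooKeeper enforces its own limit through the `jute.maxbuffer` system 
property (roughly 1MB by default), so that likely needs raising as well.

```java
public class ZnodePayloadCheck {
    // Assumed default for druid.indexer.runner.maxZnodeBytes (512 KiB).
    public static final int ASSUMED_DEFAULT_MAX_ZNODE_BYTES = 524_288;
    // 2.5 * 1024 * 1024, the raised value described in the comment (assumed MiB).
    public static final int RAISED_MAX_ZNODE_BYTES = 2_621_440;

    // Hypothetical helper: true when the serialized task payload fits under the limit.
    public static boolean fitsInZnode(byte[] payload, int maxZnodeBytes) {
        return payload.length <= maxZnodeBytes;
    }

    public static void main(String[] args) {
        // Stand-in for a serialized task spec of about 1.5MB.
        byte[] payload = new byte[1_500_000];
        System.out.println("fits default limit: "
            + fitsInZnode(payload, ASSUMED_DEFAULT_MAX_ZNODE_BYTES));
        System.out.println("fits raised limit: "
            + fitsInZnode(payload, RAISED_MAX_ZNODE_BYTES));
    }
}
```

   Under these assumptions a ~1.5MB payload is rejected by the default limit 
and accepted by the raised one, which matches the behaviour described up to the 
point where the lock errors start.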





[GitHub] [incubator-druid] mihai-cazacu-adswizz edited a comment on issue #7597: [materialized view] The generated specification is too big

2019-05-08 Thread GitBox
mihai-cazacu-adswizz edited a comment on issue #7597: [materialized view] The 
generated specification is too big
URL: 
https://github.com/apache/incubator-druid/issues/7597#issuecomment-490081669
 
 
   I have increased the `druid.indexer.runner.maxZnodeBytes` value to `2.5MB` 
and everything worked fine until the Supervisor reached an interval with many 
segments (the [created 
JSON](https://github.com/apache/incubator-druid/blob/master/indexing-service/src/main/java/org/apache/druid/indexing/overlord/RemoteTaskRunner.java#L863)
 is ~`1.5MB`). From this point on, the `Waiting Tasks - Tasks waiting 
on locks` section in the Overlord filled with dozens of MV tasks (for the same 
data source: `index_materialized_view_test_2019-05-07...`). 
   
   The error:
   
   ```
   ERROR [LeaderSelector[/druid/druid-prod/overlord/_OVERLORD]] 
org.apache.druid.curator.discovery.CuratorDruidLeaderSelector - listener 
becomeLeader() failed. Unable to become leader: 
{class=org.apache.druid.curator.discovery.CuratorDruidLeaderSelector, 
exceptionType=class org.apache.druid.java.util.common.ISE, 
exceptionMessage=Could not reacquire lock on 
interval[2019-02-21T00:00:00.000Z/2019-02-22T00:00:00.000Z] 
version[2019-05-07T11:24:03.543Z] for task: 
index_materialized_view_test_2019-05-07T11:16:05.026Z}
   org.apache.druid.java.util.common.ISE: Could not reacquire lock on 
interval[2019-02-21T00:00:00.000Z/2019-02-22T00:00:00.000Z] 
version[2019-05-07T11:24:03.543Z] for task: 
index_materialized_view_test_2019-05-07T11:16:05.026Z
   at 
org.apache.druid.indexing.overlord.TaskLockbox.syncFromStorage(TaskLockbox.java:171)
 ~[druid-indexing-service-0.13.0-incubating.jar:0.13.0-incubating]
   at 
org.apache.druid.indexing.overlord.TaskMaster$1.becomeLeader(TaskMaster.java:109)
 ~[druid-indexing-service-0.13.0-incubating.jar:0.13.0-incubating]
   at 
org.apache.druid.curator.discovery.CuratorDruidLeaderSelector$1.isLeader(CuratorDruidLeaderSelector.java:98)
 [druid-server-0.13.0-incubating.jar:0.13.0-incubating]
   at 
org.apache.curator.framework.recipes.leader.LeaderLatch$9.apply(LeaderLatch.java:665)
 [curator-recipes-4.0.0.jar:4.0.0]
   at 
org.apache.curator.framework.recipes.leader.LeaderLatch$9.apply(LeaderLatch.java:661)
 [curator-recipes-4.0.0.jar:4.0.0]
   at 
org.apache.curator.framework.listen.ListenerContainer$1.run(ListenerContainer.java:93)
 [curator-framework-4.0.0.jar:4.0.0]
   at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) 
[?:1.8.0_201]
   at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) 
[?:1.8.0_201]
   at java.lang.Thread.run(Thread.java:748) [?:1.8.0_201]
   ```
   
   Because of this error, the Overlord has stopped responding.
   
   Also, all those waiting tasks have the same payload.

