[GitHub] drill pull request #1090: DRILL-6080: Sort incorrectly limits batch size to ...

2018-01-12 Thread paul-rogers
GitHub user paul-rogers opened a pull request:

https://github.com/apache/drill/pull/1090

DRILL-6080: Sort incorrectly limits batch size to 65535 records

Sort incorrectly limits batch size to 65535 records rather than 65536.

This PR also includes a few code cleanup items.

Also fixes DRILL-6086: Overflow in offset vector in row set writer
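
For context, a 16-bit row index can address 65536 rows (indices 0..65535), so capping the row count at 65535 loses one row of capacity. A minimal sketch of the off-by-one (illustrative only — the constant and method names below are not the actual sort code):

```
// Illustrative sketch only; MAX_ROW_COUNT and limitRows() are not the actual
// Drill sort code, they just show the kind of off-by-one being fixed.
public class BatchLimitSketch {
  // A 16-bit row index addresses 65536 rows (indices 0..65535).
  static final int MAX_ROW_COUNT = Character.MAX_VALUE + 1; // 65536

  static int limitRows(int incomingRows) {
    // Buggy form: Math.min(incomingRows, Character.MAX_VALUE) caps at 65535.
    // Corrected form caps at the full 65536-row capacity.
    return Math.min(incomingRows, MAX_ROW_COUNT);
  }

  public static void main(String[] args) {
    System.out.println(limitRows(70_000)); // 65536, not 65535
  }
}
```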

@vrozov, please review.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/paul-rogers/drill DRILL-6080

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/drill/pull/1090.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1090


commit c1d3402a619f3355e47e845aae245fd0f96e2189
Author: Paul Rogers 
Date:   2018-01-11T00:04:53Z

DRILL-6080: Sort incorrectly limits batch size to 65535 records

Sort incorrectly limits batch size to 65535 records rather than 65536.

This PR also includes a few code cleanup items.

Fix for overflow in offset vector in row set writer




---


[GitHub] drill pull request #1089: DRILL-6078: support timestamp type to be pushed in...

2018-01-12 Thread chunhui-shi
GitHub user chunhui-shi opened a pull request:

https://github.com/apache/drill/pull/1089

DRILL-6078: support timestamp type to be pushed into MapRDB



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/chunhui-shi/drill work3

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/drill/pull/1089.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1089


commit 1c2a7c94fcdb56e7e430879b26f1b3e5a5144d11
Author: chunhui-shi 
Date:   2018-01-12T23:14:44Z

DRILL-6078: support timestamp type to be pushed into MapRDB




---


[GitHub] drill pull request #1086: DRILL-6076: Reduce the default memory from a total...

2018-01-12 Thread kkhatua
Github user kkhatua closed the pull request at:

https://github.com/apache/drill/pull/1086


---


[GitHub] drill issue #1086: DRILL-6076: Reduce the default memory from a total of 13G...

2018-01-12 Thread kkhatua
Github user kkhatua commented on the issue:

https://github.com/apache/drill/pull/1086
  
Ok. Looks like 
[DRILL-5926](https://issues.apache.org/jira/browse/DRILL-5926) (committed into 
Apache master) explicitly requires bumping the heap memory alone up to 4GB. 
I guess we'll need to figure out something else to manage this.


---


[GitHub] drill issue #1084: DRILL-5868: Support SQL syntax highlighting of queries

2018-01-12 Thread kkhatua
Github user kkhatua commented on the issue:

https://github.com/apache/drill/pull/1084
  
@arina-ielchiieva updated the changes based on your review. Please have a 
look.


---


[GitHub] drill pull request #1084: DRILL-5868: Support SQL syntax highlighting of que...

2018-01-12 Thread kkhatua
Github user kkhatua commented on a diff in the pull request:

https://github.com/apache/drill/pull/1084#discussion_r161350730
  
--- Diff: exec/java-exec/src/main/resources/rest/static/js/ace-code-editor/snippets/sql.js ---
@@ -0,0 +1,46 @@
+/**
+ * Drill SQL Syntax Snippets
+ */
+
+ace.define("ace/snippets/sql",["require","exports","module"], function(require, exports, module) {
+"use strict";
+
+exports.snippetText = "snippet info\n\
+   select * from INFORMATION_SCHEMA.${1:};\n\
+snippet sysmem\n\
+   select * from sys.mem;\n\
+snippet sysopt\n\
+   select * from sys.opt;\n\
+snippet sysbit\n\
+   select * from sys.bit;\n\
+snippet sysconn\n\
+   select * from sys.conn;\n\
+snippet sysprof\n\
+   select * from sys.prof;\n\
+snippet cview\n\
+   create view ${1:[workspace]}.${2:} ( ${3:} )  as \n\
+   ${4:};\n\
+snippet ctas\n\
+   create table ${1:} ( ${2:} )  as \n\
+   ${3:};\n\
+snippet ctemp\n\
+   create temporary table ${1:} ( ${2:} )  as \n\
--- End diff --

Done.


---


[jira] [Created] (DRILL-6086) TestSortImpl.testLargeBatch unit test fails randomly

2018-01-12 Thread Vlad Rozov (JIRA)
Vlad Rozov created DRILL-6086:
-

 Summary: TestSortImpl.testLargeBatch unit test fails randomly
 Key: DRILL-6086
 URL: https://issues.apache.org/jira/browse/DRILL-6086
 Project: Apache Drill
  Issue Type: Bug
Reporter: Vlad Rozov


{noformat}
Failed tests:
  TestSortImpl.testLargeBatch:513->runJumboBatchTest:486->runLargeSortTest:455 
Value of 1:0 expected:<0> but was:<1>
{noformat}

The test fails due to memory corruption caused by a write beyond the space 
allocated for the direct buffer. With bounds checking enabled, the test fails reliably with 
{noformat}
/Library/Java/JavaVirtualMachines/jdk1.8.0_144.jdk/Contents/Home/bin/java 
-agentlib:jdwp=transport=dt_socket,address=127.0.0.1:57731,suspend=y,server=n 
-Dvisualvm.id=131461133353377 -Ddrill.exec.rpc.user.timeout=0 
-Ddrill.exec.rpc.bit.timeout=0 -Dlog.path=${DRILL_LOG_DIR}/drill.log 
-Dlog.query.path=${DRILL_LOG_DIR}/query.log 
-Djava.io.tmpdir=/Users/vrozov/Projects/Apache/drill/exec/java-exec/target 
-Xms512m -Xmx4096m -Ddrill.exec.http.enabled=false 
-Ddrill.exec.sys.store.provider.local.write=false 
-Dorg.apache.drill.exec.server.Drillbit.system_options=org.apache.drill.exec.compile.ClassTransformer.scalar_replacement=on
 -Ddrill.test.query.printing.silent=true 
-Ddrill.catastrophic_to_standard_out=true -XX:MaxPermSize=512M 
-XX:MaxDirectMemorySize=3072M -Djava.net.preferIPv4Stack=true 
-Djava.awt.headless=true -XX:+CMSClassUnloadingEnabled -ea 
-Didea.test.cyclic.buffer.size=1048576 
-javaagent:/Users/vrozov/Library/Caches/IntelliJIdea2017.3/captureAgent/debugger-agent.jar=/private/var/folders/52/11m3mlk902g_wwp856y3sdvcgp/T/capture.props
 -Dfile.encoding=UTF-8 -classpath "/Applications/IntelliJ 
IDEA.app/Contents/lib/idea_rt.jar:/Applications/IntelliJ 
IDEA.app/Contents/plugins/junit/lib/junit-rt.jar:/Applications/IntelliJ 

Re: Odd Error on Login with 1.13-SNAPSHOT

2018-01-12 Thread Vitalii Diravka
Hi John,

This Jira [1] can probably help you.
You should verify your drill-override.conf file. Do you want to use any
security mechanism?

[1] https://issues.apache.org/jira/browse/DRILL-5425

Kind regards
Vitalii

On Thu, Jan 11, 2018 at 6:52 PM, John Omernik  wrote:

> I am probably missing something minor here, but I am working with Ted
> Dunning on some PCAP plugin stuff, so I built his 1.13 SNAPSHOT, and when I
> try to login I see
>
> {
>   "errorMessage" : "No configuration setting found for key
> 'drill.exec.http.auth'"
> }
>
>
>
> I am guessing that something was added that I need to fill out in my
> config? Is there a JIRA or something that can guide me on this?
>
>
> Thanks
>


[jira] [Created] (DRILL-6085) Travis build sometimes fails because the VM suddenly exits.

2018-01-12 Thread Timothy Farkas (JIRA)
Timothy Farkas created DRILL-6085:
-

 Summary: Travis build sometimes fails because the VM suddenly exits.
 Key: DRILL-6085
 URL: https://issues.apache.org/jira/browse/DRILL-6085
 Project: Apache Drill
  Issue Type: Bug
Reporter: Timothy Farkas
Assignee: Timothy Farkas


Travis fails with the following error.

{code}
Failed to execute goal 
org.apache.maven.plugins:maven-surefire-plugin:2.17:test 
(default-test) on project drill-java-exec: Execution 
default-test of goal org.apache.maven.plugins:maven-surefire-plugin:2.17:test 
failed: The forked VM terminated without properly saying goodbye. VM crash or 
System.exit called?
{code}

I believe this error is caused by the VM running out of memory. I will try 
bumping up the memory slightly and doing several runs.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] drill issue #1086: DRILL-6076: Reduce the default memory from a total of 13G...

2018-01-12 Thread kkhatua
Github user kkhatua commented on the issue:

https://github.com/apache/drill/pull/1086
  
I think folks running unit tests can continue with the existing limits of 
4GB heap + 4GB direct. The idea is to get Drill up and running for minimal use 
cases, and you are expected to bump memory up to higher limits if queries hit 
OOM errors. Unit tests that require much more memory would fall into that category.
I've launched a 
[fork](https://github.com/kkhatua/drill/commits/LowerMemUnitTest) of this 
branch with reduced memory settings for the unit tests as well and am letting 
TravisCI test it out:
https://travis-ci.org/kkhatua/drill/builds/328250155
If it passes, we can either reduce those settings in the POM files as well, 
or leave them intact to ensure developers running unit tests can complete them 
in a reasonable amount of time.



---


[jira] [Created] (DRILL-6084) Expose list of Drill functions for consumption by JavaScript libraries

2018-01-12 Thread Kunal Khatua (JIRA)
Kunal Khatua created DRILL-6084:
---

 Summary: Expose list of Drill functions for consumption by 
JavaScript libraries
 Key: DRILL-6084
 URL: https://issues.apache.org/jira/browse/DRILL-6084
 Project: Apache Drill
  Issue Type: Improvement
  Components: Web Server
Reporter: Kunal Khatua
Priority: Minor
 Fix For: Future


DRILL-5868 introduces the SQL highlighter and auto-complete feature in the 
web-UI's query editor.
For the backing JavaScript to highlight keywords correctly as SQL reserved 
words or functions, it needs a list of all the Drill functions to be already 
defined in a source JavaScript file ( {{../mode-sql.js}} ).
While this works well for standard Drill functions, it means that any new 
function added to Drill needs to be explicitly hard-coded into the file, rather 
than be programmatically discovered and loaded for the library to highlight.

It would help if this could be done dynamically, without contributors having to 
explicitly alter the script file to add new function names.
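
A possible direction, shown here as a hypothetical sketch only ({{FunctionListing}} is not an existing Drill class and the wiring to the web server is omitted): emit the function names as JSON at runtime so the editor script can fetch them instead of relying on a hard-coded list.

{code}
// Hypothetical sketch: FunctionListing is not an existing Drill class; it only
// illustrates serializing a runtime-supplied list of function names to JSON.
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class FunctionListing {

  // Builds a JSON array such as ["ABS","CONCAT","SQRT"].
  public static String toJsonArray(List<String> functionNames) {
    return functionNames.stream()
        .map(name -> "\"" + name + "\"")
        .collect(Collectors.joining(",", "[", "]"));
  }

  public static void main(String[] args) {
    System.out.println(toJsonArray(Arrays.asList("ABS", "CONCAT", "SQRT")));
  }
}
{code}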



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] drill issue #1086: DRILL-6076: Reduce the default memory from a total of 13G...

2018-01-12 Thread parthchandra
Github user parthchandra commented on the issue:

https://github.com/apache/drill/pull/1086
  
The settings for running the unit tests are (from the pom file):
  -Xms512m -Xmx4096m 
  -XX:MaxPermSize=512M -XX:MaxDirectMemorySize=4096M
  -XX:+CMSClassUnloadingEnabled -ea

At the very least, the default settings for Drill and the unit tests should 
match.


---


[jira] [Resolved] (DRILL-6047) Update doc to include instructions for libpam4j

2018-01-12 Thread Bridget Bevens (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-6047?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bridget Bevens resolved DRILL-6047.
---
Resolution: Fixed

> Update doc to include instructions for libpam4j
> ---
>
> Key: DRILL-6047
> URL: https://issues.apache.org/jira/browse/DRILL-6047
> Project: Apache Drill
>  Issue Type: Sub-task
>  Components: Documentation
>Affects Versions: 1.12.0
>Reporter: Bridget Bevens
>Assignee: Bridget Bevens
>Priority: Minor
> Fix For: 1.12.0
>
>
> Update Apache Drill docs to include JPAM and libpam4j PAM authenticator 
> instructions.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] drill pull request #1023: DRILL-5922 Fixed Child Allocator Leak. DRILL-5926 ...

2018-01-12 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/drill/pull/1023


---


[GitHub] drill pull request #1049: DRILL-5971: Fix INT64, INT32 logical types in comp...

2018-01-12 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/drill/pull/1049


---


[GitHub] drill pull request #1070: DRILL-6004: Direct buffer bounds checking should b...

2018-01-12 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/drill/pull/1070


---


[GitHub] drill pull request #1067: DRILL-3958: Return a valid error message when stor...

2018-01-12 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/drill/pull/1067


---


[GitHub] drill pull request #1041: DRILL-5961: For long running queries (> 10 min) Dr...

2018-01-12 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/drill/pull/1041


---


[GitHub] drill pull request #1075: DRILL-6030: Managed sort should minimize number of...

2018-01-12 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/drill/pull/1075


---


[GitHub] drill pull request #1026: DRILL-5919: Add non-numeric support for JSON proce...

2018-01-12 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/drill/pull/1026


---


[GitHub] drill pull request #1084: DRILL-5868: Support SQL syntax highlighting of que...

2018-01-12 Thread kkhatua
Github user kkhatua commented on a diff in the pull request:

https://github.com/apache/drill/pull/1084#discussion_r161288148
  
--- Diff: exec/java-exec/src/main/resources/rest/static/js/ace-code-editor/snippets/sql.js ---
@@ -0,0 +1,46 @@
+/**
+ * Drill SQL Syntax Snippets
+ */
+
+ace.define("ace/snippets/sql",["require","exports","module"], function(require, exports, module) {
+"use strict";
+
+exports.snippetText = "snippet info\n\
+   select * from INFORMATION_SCHEMA.${1:};\n\
+snippet sysmem\n\
+   select * from sys.mem;\n\
+snippet sysopt\n\
+   select * from sys.opt;\n\
+snippet sysbit\n\
+   select * from sys.bit;\n\
+snippet sysconn\n\
+   select * from sys.conn;\n\
+snippet sysprof\n\
+   select * from sys.prof;\n\
+snippet cview\n\
--- End diff --

Good catch. 
I was generating the code for the snippets script, so I missed these (they 
went into a recent commit).  I'll fix these. 


---


[GitHub] drill issue #1060: DRILL-5846: Improve parquet performance for Flat Data Typ...

2018-01-12 Thread sachouche
Github user sachouche commented on the issue:

https://github.com/apache/drill/pull/1060
  
@paul-rogers with regard to the design aspects that you brought up:
**Corrections about the proposed Design**
Your analysis assumes the Vector is the one driving the loading operation. 
This is not the case: a) it is the reader which invokes the bulk save API, and 
b) the Vector save method runs a loop requesting the rest of the data. 
The reader can stop the loading phase at any time; currently the reader uses the 
batch size to make this decision. The plan is to enhance this logic by having 
feedback from the Vector save method about memory usage (e.g., save the current 
row set on the condition that memory usage doesn't exceed X bytes). I believe I 
raised this design decision early on, and the agreement was to implement the 
memory batch restrictions in a later Drill release.
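
A rough sketch of that control flow (illustrative interface and method names only, not the actual DRILL-5846 code):

```
// Illustrative names only; this sketches the described control flow. The
// reader owns the stop condition, while the vector's bulk save loops over the
// supplied data and reports memory usage back to the reader.
interface BulkInput {
  boolean hasMore();
  byte[] nextChunk();                        // next run of values from the reader
  boolean keepLoading(long bytesUsedSoFar);  // reader-side stop decision
}

class BulkVectorSketch {
  private long bytesUsed;

  void bulkSave(BulkInput input) {
    while (input.hasMore() && input.keepLoading(bytesUsed)) {
      bytesUsed += input.nextChunk().length;
    }
  }
}
```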

**Agile Development**
Over the years I have come to appreciate the value of agile development. One 
needs to prioritize design activities; it is completely valid to start with a 
design (which is not guaranteed to be perfect) and then refine it over time, as 
long as the underlying implementation doesn't spill into other modules 
(encapsulation). This has the merit of showcasing incremental improvements and 
allows the developer to gain new insight as their understanding improves.


---


[GitHub] drill pull request #1085: DRILL-6049: Misc. hygiene and code cleanup changes

2018-01-12 Thread arina-ielchiieva
Github user arina-ielchiieva commented on a diff in the pull request:

https://github.com/apache/drill/pull/1085#discussion_r161208584
  
--- Diff: common/src/main/java/org/apache/drill/common/types/Types.java ---
@@ -728,4 +733,49 @@ public static boolean isLaterType(MajorType type) {
     return type.getMinorType() == MinorType.LATE;
   }
 
+  public static boolean isEquivalent(MajorType type1, MajorType type2) {
+
+    // Requires full type equality, including fields such as precision and scale.
+    // But, unset fields are equivalent to 0. Can't use the protobuf-provided
+    // isEquals() which treats set and unset fields as different.
+
+    if (type1.getMinorType() != type2.getMinorType() ||
+        type1.getMode() != type2.getMode() ||
+        type1.getScale() != type2.getScale() ||
+        type1.getPrecision() != type2.getPrecision()) {
+      return false;
+    }
+
+    // Subtypes are only for unions and are seldom used.
+
+    if (type1.getMinorType() != MinorType.UNION) {
+      return true;
+    }
+
+    List<MinorType> subtypes1 = type1.getSubTypeList();
+    List<MinorType> subtypes2 = type2.getSubTypeList();
+    if (subtypes1 == subtypes2) { // Only occurs if both are null
+      return true;
+    }
+    if (subtypes1 == null || subtypes2 == null) {
+      return false;
+    }
+    if (subtypes1.size() != subtypes2.size()) {
+      return false;
+    }
+
+    // Now it gets slow because subtype lists are not ordered.
+
+    List<MinorType> copy1 = new ArrayList<>();
+    List<MinorType> copy2 = new ArrayList<>();
+    copy1.addAll(subtypes1);
+    copy2.addAll(subtypes2);
+    Collections.sort(copy1);
+    Collections.sort(copy2);
+    return copy1.equals(copy2);
+  }
+
+  public static String typeKey(MinorType type) {
--- End diff --

Why do we need this method? It looks like it is never used. If it's intended 
for future use, please add Javadoc.


---


[GitHub] drill pull request #1085: DRILL-6049: Misc. hygiene and code cleanup changes

2018-01-12 Thread arina-ielchiieva
Github user arina-ielchiieva commented on a diff in the pull request:

https://github.com/apache/drill/pull/1085#discussion_r161215163
  
--- Diff: exec/java-exec/src/test/java/org/apache/drill/exec/ExecTest.java ---
@@ -38,20 +38,26 @@
 import org.apache.drill.exec.server.options.SystemOptionManager;
 import 
org.apache.drill.exec.store.sys.store.provider.LocalPersistentStoreProvider;
 import org.apache.drill.exec.util.GuavaPatcher;
+import org.apache.drill.test.BaseDirTestWatcher;
--- End diff --

Starting from this class, GitHub does not highlight the Java code. I remember 
@vdiravka had a similar problem; it turned out that he had copied code from a 
text editor with some hidden symbols. Please check and fix.


---


[GitHub] drill pull request #1085: DRILL-6049: Misc. hygiene and code cleanup changes

2018-01-12 Thread arina-ielchiieva
Github user arina-ielchiieva commented on a diff in the pull request:

https://github.com/apache/drill/pull/1085#discussion_r161208998
  
--- Diff: common/src/test/java/org/apache/drill/test/DrillTest.java ---
@@ -54,8 +56,7 @@
   static MemWatcher memWatcher;
   static String className;
 
-  @Rule public final TestRule TIMEOUT = TestTools.getTimeoutRule(10);
-
+  @Rule public final TestRule TIMEOUT = new DisableOnDebug(TestTools.getTimeoutRule(100_000));
--- End diff --

Great enhancement!


---


[GitHub] drill pull request #1085: DRILL-6049: Misc. hygiene and code cleanup changes

2018-01-12 Thread arina-ielchiieva
Github user arina-ielchiieva commented on a diff in the pull request:

https://github.com/apache/drill/pull/1085#discussion_r161211784
  
--- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/ops/OperatorStats.java ---
@@ -278,22 +273,62 @@ public void addDoubleMetrics(OperatorProfile.Builder builder) {
 }
   }
 
-  @Override
+  /**
+   * Set a stat to the specified long value. Creates the stat
+   * if the stat does not yet exist.
+   *
+   * @param metric the metric to update
+   * @param value the value to set
+   */
+
   public void addLongStat(MetricDef metric, long value){
 longMetrics.putOrAdd(metric.metricId(), value, value);
   }
 
-  @Override
+  @VisibleForTesting
+  public long getLongStat(MetricDef metric) {
+return longMetrics.get(metric.metricId());
--- End diff --

If metric is absent, will it return 0 or fail?


---


[GitHub] drill pull request #1085: DRILL-6049: Misc. hygiene and code cleanup changes

2018-01-12 Thread arina-ielchiieva
Github user arina-ielchiieva commented on a diff in the pull request:

https://github.com/apache/drill/pull/1085#discussion_r161207650
  
--- Diff: common/src/main/java/org/apache/drill/common/exceptions/UserException.java ---
@@ -83,23 +83,17 @@ public static Builder memoryError() {
* The cause message will be used unless {@link 
Builder#message(String, Object...)} is called.
* If the wrapped exception is, or wraps, a user exception it will be 
returned by {@link Builder#build(Logger)}
* instead of creating a new exception. Any added context will be added 
to the user exception as well.
-   * 
-   * This exception, previously deprecated, has been repurposed to 
indicate unspecified
-   * errors. In particular, the case in which a lower level bit of code 
throws an
-   * exception other than UserException. The catching code then only knows 
"something went
-   * wrong", but not enough information to categorize the error.
-   * 
-   * System errors also indicate illegal internal states, missing 
functionality, and other
-   * code-related errors -- all of which "should never occur."
*
* @see 
org.apache.drill.exec.proto.UserBitShared.DrillPBError.ErrorType#SYSTEM
*
* @param cause exception we want the user exception to wrap. If cause 
is, or wrap, a user exception it will be
*  returned by the builder instead of creating a new user 
exception
* @return user exception builder
*
+   * @deprecated This method should never need to be used explicitly, 
unless you are passing the exception to the
+   * Rpc layer or UserResultListener.submitFailed()
*/
-
+  @Deprecated
--- End diff --

Deprecated means that we might remove it in the future. But since you mention 
that it is still OK to use in certain cases, I am not sure such an annotation 
is appropriate.
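
As a generic illustration of the pattern under discussion (plain Java, not the Drill method): pairing @Deprecated with a @deprecated Javadoc tag lets the annotation warn by default while the Javadoc spells out the residual legitimate uses.

```
// Generic Java illustration, not Drill code: the @deprecated Javadoc tag is
// where the allowed residual uses can be spelled out.
public class Example {

  /**
   * @deprecated prefer a specific error category; still acceptable when the
   *             exception is handed straight to the RPC layer or to
   *             UserResultListener.submitFailed().
   */
  @Deprecated
  public static RuntimeException systemError(Throwable cause) {
    return new RuntimeException("system error", cause);
  }
}
```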


---


[jira] [Resolved] (DRILL-5690) RepeatedDecimal18Vector does not pass scale, precision to data vector

2018-01-12 Thread Volodymyr Vysotskyi (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-5690?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Volodymyr Vysotskyi resolved DRILL-5690.

Resolution: Fixed

Fixed in 
https://github.com/apache/drill/commit/40de8ca4f47533fa6593d1266403868ae1a2119f

> RepeatedDecimal18Vector does not pass scale, precision to data vector
> -
>
> Key: DRILL-5690
> URL: https://issues.apache.org/jira/browse/DRILL-5690
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.10.0
>Reporter: Paul Rogers
>Assignee: Paul Rogers
> Fix For: 1.13.0
>
>
> Decimal types require not just the type (Decimal9, Decimal18, etc.) but also 
> a precision and scale. The triple of (minor type, precision, scale) appears 
> in the {{MaterializedField}} for the nullable or required vectors.
> A repeated vector has three parts: the {{RepeatedDecimal18Vector}} itself, 
> which is composed of a {{UInt4Vector}} offset vector and a {{Decimal18Vector}} 
> that holds the values.
> When {{RepeatedDecimal18Vector}} creates the {{Decimal18Vector}} to hold the 
> values, it clones the {{MaterializedField}}. But, it *does not* clone the 
> scale and precision, resulting in the loss of critical information.
> {code}
>   public RepeatedDecimal18Vector(MaterializedField field, BufferAllocator allocator) {
>     super(field, allocator);
>     addOrGetVector(VectorDescriptor.create(Types.required(field.getType().getMinorType())));
>   }
> {code}
> This is normally not a problem because most code accesses the data via the 
> repeated vector. But, for code that needs to work with the values, the 
> results are wrong given that the types are wrong. (Values stored with one 
> scale, e.g. 123.45 (scale 2), will be retrieved with scale 0 (123, say).)
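
A sketch of the kind of fix this implies (assumed for illustration, not the committed patch linked above): build the child type from the parent's full {{MajorType}} so scale and precision survive, instead of {{Types.required(minorType)}}, which drops them.

{code}
// Assumed sketch, not the committed fix. MajorType and DataMode come from
// org.apache.drill.common.types.TypeProtos; the point is simply to carry the
// parent's scale and precision into the type used for the inner values vector.
MajorType valueType = MajorType.newBuilder()
    .setMinorType(field.getType().getMinorType())
    .setMode(DataMode.REQUIRED)
    .setScale(field.getType().getScale())
    .setPrecision(field.getType().getPrecision())
    .build();
addOrGetVector(VectorDescriptor.create(valueType));
{code}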



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] drill pull request #1083: DRILL-4185: UNION ALL involving empty directory on...

2018-01-12 Thread arina-ielchiieva
Github user arina-ielchiieva commented on a diff in the pull request:

https://github.com/apache/drill/pull/1083#discussion_r161205848
  
--- Diff: exec/java-exec/src/test/java/org/apache/drill/TestJoinNullable.java ---
@@ -568,6 +570,22 @@ public void nullMixedComparatorEqualJoinHelper(final 
String query) throws Except
 .go();
   }
 
+  /** InnerJoin with empty dir table on nullable cols, MergeJoin */
+  // TODO: the same tests should be added for HashJoin operator, DRILL-6070
+  @Test
--- End diff --

Please add a test for merge join with an ignore annotation. Also please check 
that NLJ works fine as well. If not, please fix.


---


[GitHub] drill pull request #1066: DRILL-3993: Changes to support Calcite 1.15

2018-01-12 Thread vvysotskyi
Github user vvysotskyi commented on a diff in the pull request:

https://github.com/apache/drill/pull/1066#discussion_r161185413
  
--- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/AggPruleBase.java ---
@@ -82,4 +83,19 @@ protected boolean create2PhasePlan(RelOptRuleCall call, DrillAggregateRel aggreg
 }
 return true;
   }
+
+  /**
+   * Returns group-by keys with the remapped arguments for specified 
aggregate.
+   *
+   * @param groupSet ImmutableBitSet of aggregate rel node, whose group-by 
keys should be remapped.
+   * @return {@link ImmutableBitSet} instance with remapped keys.
+   */
+  public static ImmutableBitSet remapGroupSet(ImmutableBitSet groupSet) {
--- End diff --

After the CALCITE-1930 changes in the class 
`AggregateExpandDistinctAggregatesRule`, this rule started to apply more 
correctly, since in the older version there were checks like this:
```
aggCall.getAggregation() instanceof SqlCountAggFunction
```
which were replaced by checks like this:
```
final SqlKind aggCallKind = aggCall.getAggregation().getKind();
switch (aggCallKind)
```
So for the cases where `DrillCalciteSqlAggFunctionWrapper`s were used instead 
of Calcite's `SqlCountAggFunction`, this rule changed its behaviour: instead of 
returning joins of distinct and non-distinct aggregate rel nodes, it now 
returns a distinct aggregate whose input contains the non-distinct aggregates, 
grouped by the column that is distinct in the outer aggregate.

These Drill rules were able to work correctly only for the cases where the 
aggregate rel node does not contain `aggCalls` and has only a single field in 
its `rowType` (in this case `groupSet` is always `{0}`, which is still correct 
for the outer aggregates created in the `*AggPrule`s).

With the new version of `AggregateExpandDistinctAggregatesRule`, these Drill 
rules receive aggregate rel nodes with non-empty `aggCalls` lists, so their 
`rowType` contains more than one field. But before my changes, the same 
`groupSet` was passed to the constructor of the outer aggregate, and the row 
type of the aggregate differed from the row type of its input, so the 
`groupSet` was incorrect.
Aggregate rel nodes always specify the group-by columns in the first positions 
of the list, so the correct `groupSet` for the outer aggregate should be an 
`ImmutableBitSet` of the same size as the `groupSet` of the nested aggregate, 
but with column ordinals starting from 0.

As for iterating through the group set: `ImmutableBitSet.size()`, 
`ImmutableBitSet.length()` and `ImmutableBitSet.cardinality()` do not return 
the desired "size" of the `groupSet`. `AggregateExpandDistinctAggregatesRule` 
also contains similar code that iterates through the `groupSet` for a similar 
purpose.
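
A minimal sketch of that remapping (assuming Calcite's `ImmutableBitSet.Builder`; the actual PR code may differ): each group key of the nested aggregate is simply renumbered to 0..n-1 for the outer aggregate, which is why the code iterates over the group set rather than relying on `size()`/`length()`/`cardinality()`.

```
// Sketch only; assumes org.apache.calcite.util.ImmutableBitSet and its Builder,
// and may differ from the code in this PR.
public static ImmutableBitSet remapGroupSet(ImmutableBitSet groupSet) {
  final ImmutableBitSet.Builder builder = ImmutableBitSet.builder();
  int newKey = 0;
  for (Integer ignored : groupSet) {  // iterates over the set bits (group keys)
    builder.set(newKey++);
  }
  return builder.build();
}
```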


---