[jira] [Resolved] (DRILL-7303) Filter record batch does not handle zero-length batches

2019-11-29 Thread Paul Rogers (Jira)


 [ 
https://issues.apache.org/jira/browse/DRILL-7303?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paul Rogers resolved DRILL-7303.

Resolution: Duplicate

> Filter record batch does not handle zero-length batches
> ---
>
> Key: DRILL-7303
> URL: https://issues.apache.org/jira/browse/DRILL-7303
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.16.0
>Reporter: Paul Rogers
>Assignee: Paul Rogers
>Priority: Major
>
> Testing of the row-set-based JSON reader revealed a limitation of the Filter 
> record batch: if an incoming batch has zero records, the actual record count 
> of the associated SV2 is left at -1. In particular:
> {code:java}
> public class SelectionVector2 implements AutoCloseable {
>   // Indicates actual number of rows in the RecordBatch
>   // container which owns this SV2 instance
>   private int batchActualRecordCount = -1;
> {code}
> Then:
> {code:java}
> public abstract class FilterTemplate2 implements Filterer {
>   @Override
>   public void filterBatch(int recordCount) throws SchemaChangeException {
>     if (recordCount == 0) {
>       outgoingSelectionVector.setRecordCount(0);
>       return;
>     }
> {code}
> Notice there is no call to set the actual record count. The solution is to 
> insert one line of code:
> {code:java}
> if (recordCount == 0) {
>   outgoingSelectionVector.setRecordCount(0);
>   outgoingSelectionVector.setBatchActualRecordCount(0); // <-- Add this
>   return;
> }
> {code}
> Without this, the query fails with an error due to an invalid index of -1.
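
The failure mode above can be illustrated with a simplified stand-in for the SV2 class (this is not Drill's actual {{SelectionVector2}}; the class and method names below mimic the snippet quoted in the issue):

```java
// Simplified stand-in for Drill's SelectionVector2. Illustrates why the
// extra setBatchActualRecordCount(0) call matters for zero-record batches.
class SimpleSv2 {
  private int recordCount;
  private int batchActualRecordCount = -1; // sentinel, as in the snippet above

  void setRecordCount(int n) { recordCount = n; }
  void setBatchActualRecordCount(int n) { batchActualRecordCount = n; }
  int getBatchActualRecordCount() { return batchActualRecordCount; }
}

public class FilterZeroBatchDemo {
  // Mirrors the fixed filterBatch() logic for the zero-record case.
  static SimpleSv2 filterEmptyBatch() {
    SimpleSv2 sv2 = new SimpleSv2();
    sv2.setRecordCount(0);
    sv2.setBatchActualRecordCount(0); // without this, the count stays -1
    return sv2;
  }

  public static void main(String[] args) {
    SimpleSv2 sv2 = filterEmptyBatch();
    // Downstream code indexes by the actual record count; -1 is invalid.
    System.out.println(sv2.getBatchActualRecordCount()); // prints 0
  }
}
```

The point of the sketch: the sentinel value -1 is fine as long as every code path overwrites it, and the zero-record early return was the one path that did not.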



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (DRILL-7311) Partial fixes for empty batch bugs

2019-11-29 Thread Paul Rogers (Jira)


 [ 
https://issues.apache.org/jira/browse/DRILL-7311?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paul Rogers resolved DRILL-7311.

Resolution: Duplicate

> Partial fixes for empty batch bugs
> --
>
> Key: DRILL-7311
> URL: https://issues.apache.org/jira/browse/DRILL-7311
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.16.0
>Reporter: Paul Rogers
>Assignee: Paul Rogers
>Priority: Major
> Fix For: 1.18.0
>
>
> DRILL-7305 explains that multiple operators have serious bugs when presented 
> with empty batches. DRILL-7306 explains that the EVF (AKA "new scan 
> framework") was originally coded to emit an empty "fast schema" batch, but 
> that the feature was disabled because of the many empty-batch operator 
> failures.
> This ticket covers a set of partial fixes for empty-batch issues. This is the 
> result of work done to get the converted JSON reader to work with a "fast 
> schema." The JSON work, in the end, revealed that Drill has too many bugs to 
> enable fast schema, so DRILL-7306 was implemented instead.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (DRILL-7305) Multiple operators do not handle empty batches

2019-11-29 Thread Paul Rogers (Jira)


 [ 
https://issues.apache.org/jira/browse/DRILL-7305?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paul Rogers resolved DRILL-7305.

Resolution: Duplicate

> Multiple operators do not handle empty batches
> --
>
> Key: DRILL-7305
> URL: https://issues.apache.org/jira/browse/DRILL-7305
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.16.0
>Reporter: Paul Rogers
>Priority: Major
>
> While testing the new "EVF" framework, it was found that multiple operators 
> incorrectly handle empty batches. The EVF framework is set up to return a 
> "fast schema" empty batch with only schema as its first batch. It turns out 
> that many operators fail with problems such as:
> * Failure to set the value counts in the output container
> * Failure to initialize position 0 of the offset vector to 0 for 
> variable-width or repeated vectors
> And so on.
> Partial fixes are in the JSON reader PR.
> For now, the easiest work-around is to disable the "fast schema" path in the 
> EVF: DRILL-7306.
> To discover the remaining issues, enable the 
> {{ScanOrchestratorBuilder.enableSchemaBatch}} option and run unit tests. You 
> can use the {{VectorChecker}} and {{VectorAccessorUtilities.verify()}} 
> methods to check state. Insert a call to {{verify()}} in each "next" method: 
> verify the incoming and outgoing batches. The checker only verifies a few 
> vector types, but these are enough to reveal many problems.
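
As a rough sketch of what such a check does (this is not Drill's {{BatchValidator}} or {{VectorChecker}}; the toy types below are invented for illustration), the core invariant is that every vector's value count must match the batch's row count:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Minimal illustration of the row-count vs. value-count invariant that
// BatchValidator-style checks enforce. Not Drill's actual API.
public class MiniBatchChecker {

  // A toy "vector": just a name plus the value count its writer recorded.
  static class ToyVector {
    final String name;
    final int valueCount;

    ToyVector(String name, int valueCount) {
      this.name = name;
      this.valueCount = valueCount;
    }
  }

  // Returns a human-readable error for every vector whose value count
  // disagrees with the batch's row count.
  static List<String> verify(int rowCount, List<ToyVector> vectors) {
    List<String> errors = new ArrayList<>();
    for (ToyVector v : vectors) {
      if (v.valueCount != rowCount) {
        errors.add(v.name + ": Row count = " + rowCount
            + ", but value count = " + v.valueCount);
      }
    }
    return errors;
  }

  public static void main(String[] args) {
    // A filter emitted 2 rows but forgot to trim one vector's value count.
    List<ToyVector> vectors = Arrays.asList(
        new ToyVector("n_nationkey", 25),
        new ToyVector("n_regionkey", 2));
    System.out.println(verify(2, vectors));
    // prints [n_nationkey: Row count = 2, but value count = 25]
  }
}
```

The error strings intentionally mimic the "Row count = N, but value count = M" messages seen in the DRILL-7324 logs below.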



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (DRILL-7324) Many vector-validity errors from unit tests

2019-11-29 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16985238#comment-16985238
 ] 

ASF GitHub Bot commented on DRILL-7324:
---

paul-rogers commented on pull request #1912: DRILL-7324: Final set of "batch 
count" fixes
URL: https://github.com/apache/drill/pull/1912
 
 
   Final set of fixes for batch count/record count issues. Enables
   vector checking for all operators (except statistics, which do not
   produce valid Drill batches).
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Many vector-validity errors from unit tests
> ---
>
> Key: DRILL-7324
> URL: https://issues.apache.org/jira/browse/DRILL-7324
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.16.0
>Reporter: Paul Rogers
>Assignee: Paul Rogers
>Priority: Major
>
> Drill's value vectors contain many counts that must be kept in sync. 
> Drill provides a utility, {{BatchValidator}}, to check (a subset of) these 
> values for consistency.
> The {{IteratorValidatorBatchIterator}} class is used in tests to validate the 
> state of each operator (AKA "record batch") as Drill runs the Volcano 
> iterator. This class can also validate vectors by setting the 
> {{VALIDATE_VECTORS}} constant to {{true}}.
> This was done and the unit tests were run. Many tests failed. Examples:
> {noformat}
> [INFO] Running org.apache.drill.TestUnionDistinct
> 18:44:26.742 [22d42585-74c2-d418-6f59-9b1870d04770:frag:0:0] ERROR 
> o.a.d.e.p.i.validate.BatchValidator - Found one or more vector errors from 
> LimitRecordBatch
> key - NullableBitVector: Row count = 0, but value count = 2
> 18:44:26.745 [22d42585-74c2-d418-6f59-9b1870d04770:frag:0:0] ERROR 
> o.a.d.e.p.i.validate.BatchValidator - Found one or more vector errors from 
> LimitRecordBatch
> key - NullableBitVector: Row count = 0, but value count = 2
> [INFO] Running org.apache.drill.TestUnionDistinct
> 8:44:48.302 [22d4256e-c90b-847c-5104-02d6cdf5223e:frag:0:0] ERROR 
> o.a.d.e.p.i.validate.BatchValidator - Found one or more vector errors from 
> LimitRecordBatch
> key - NullableBitVector: Row count = 0, but value count = 2
> 18:44:48.703 [22d4256e-ccf3-2af6-f56a-140e9c3e55bb:frag:0:0] ERROR 
> o.a.d.e.p.i.validate.BatchValidator - Found one or more vector errors from 
> FilterRecordBatch
> n_nationkey - IntVector: Row count = 2, but value count = 25
> n_regionkey - IntVector: Row count = 2, but value count = 25
> 18:44:48.731 [22d4256e-ccf3-2af6-f56a-140e9c3e55bb:frag:0:0] ERROR 
> o.a.d.e.p.i.validate.BatchValidator - Found one or more vector errors from 
> FilterRecordBatch
> n_nationkey - IntVector: Row count = 4, but value count = 25
> n_regionkey - IntVector: Row count = 4, but value count = 25
> 18:44:49.039 [22d4256f-6b39-d2ab-d145-4f2b0db315a3:frag:0:0] ERROR 
> o.a.d.e.p.i.validate.BatchValidator - Found one or more vector errors from 
> FilterRecordBatch
> n_nationkey - IntVector: Row count = 2, but value count = 25
> 18:44:49.363 [22d4256e-3d91-850f-9ab4-5939219ac0d0:frag:0:0] ERROR 
> o.a.d.e.p.i.validate.BatchValidator - Found one or more vector errors from 
> FilterRecordBatch
> c_custkey - IntVector: Row count = 4, but value count = 1500
> 18:44:49.597 [22d4256d-c113-ae5c-6f31-4dd1ec091365:frag:0:0] ERROR 
> o.a.d.e.p.i.validate.BatchValidator - Found one or more vector errors from 
> FilterRecordBatch
> n_nationkey - IntVector: Row count = 5, but value count = 25
> n_regionkey - IntVector: Row count = 5, but value count = 25
> 18:44:49.610 [22d4256d-c113-ae5c-6f31-4dd1ec091365:frag:0:0] ERROR 
> o.a.d.e.p.i.validate.BatchValidator - Found one or more vector errors from 
> FilterRecordBatch
> r_regionkey - IntVector: Row count = 1, but value count = 5
> 18:44:53.029 [22d4256a-8b70-5f3b-f79b-806e194c5ed2:frag:0:0] ERROR 
> o.a.d.e.p.i.validate.BatchValidator - Found one or more vector errors from 
> LimitRecordBatch
> n_nationkey - IntVector: Row count = 0, but value count = 25
> n_name - VarCharVector: Row count = 0, but value count = 25
> n_regionkey - IntVector: Row count = 0, but value count = 25
> 18:44:53.033 [22d4256a-8b70-5f3b-f79b-806e194c5ed2:frag:0:0] ERROR 
> o.a.d.e.p.i.validate.BatchValidator - Found one or more vector errors from 
> LimitRecordBatch
> n_regionkey - IntVector: Row count = 5, but value count = 25
> 18:44:53.331 [22d4256a-526c-7815-c216-8e45752a4a6c:frag:0:0] ERROR 
> o.a.d.e.p.i.validate.BatchValidator - Found one or more vector errors from 
> LimitRecordBatch
> n_nationkey - IntVector: Row count = 5, but value count = 25
> n_name - 

[jira] [Assigned] (DRILL-7324) Many vector-validity errors from unit tests

2019-11-29 Thread Paul Rogers (Jira)


 [ 
https://issues.apache.org/jira/browse/DRILL-7324?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paul Rogers reassigned DRILL-7324:
--

Assignee: Paul Rogers

> Many vector-validity errors from unit tests
> ---
>
> Key: DRILL-7324
> URL: https://issues.apache.org/jira/browse/DRILL-7324
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.16.0
>Reporter: Paul Rogers
>Assignee: Paul Rogers
>Priority: Major
>
> Drill's value vectors contain many counts that must be kept in sync. 
> Drill provides a utility, {{BatchValidator}}, to check (a subset of) these 
> values for consistency.
> The {{IteratorValidatorBatchIterator}} class is used in tests to validate the 
> state of each operator (AKA "record batch") as Drill runs the Volcano 
> iterator. This class can also validate vectors by setting the 
> {{VALIDATE_VECTORS}} constant to {{true}}.
> This was done and the unit tests were run. Many tests failed. Examples:
> {noformat}
> [INFO] Running org.apache.drill.TestUnionDistinct
> 18:44:26.742 [22d42585-74c2-d418-6f59-9b1870d04770:frag:0:0] ERROR 
> o.a.d.e.p.i.validate.BatchValidator - Found one or more vector errors from 
> LimitRecordBatch
> key - NullableBitVector: Row count = 0, but value count = 2
> 18:44:26.745 [22d42585-74c2-d418-6f59-9b1870d04770:frag:0:0] ERROR 
> o.a.d.e.p.i.validate.BatchValidator - Found one or more vector errors from 
> LimitRecordBatch
> key - NullableBitVector: Row count = 0, but value count = 2
> [INFO] Running org.apache.drill.TestUnionDistinct
> 8:44:48.302 [22d4256e-c90b-847c-5104-02d6cdf5223e:frag:0:0] ERROR 
> o.a.d.e.p.i.validate.BatchValidator - Found one or more vector errors from 
> LimitRecordBatch
> key - NullableBitVector: Row count = 0, but value count = 2
> 18:44:48.703 [22d4256e-ccf3-2af6-f56a-140e9c3e55bb:frag:0:0] ERROR 
> o.a.d.e.p.i.validate.BatchValidator - Found one or more vector errors from 
> FilterRecordBatch
> n_nationkey - IntVector: Row count = 2, but value count = 25
> n_regionkey - IntVector: Row count = 2, but value count = 25
> 18:44:48.731 [22d4256e-ccf3-2af6-f56a-140e9c3e55bb:frag:0:0] ERROR 
> o.a.d.e.p.i.validate.BatchValidator - Found one or more vector errors from 
> FilterRecordBatch
> n_nationkey - IntVector: Row count = 4, but value count = 25
> n_regionkey - IntVector: Row count = 4, but value count = 25
> 18:44:49.039 [22d4256f-6b39-d2ab-d145-4f2b0db315a3:frag:0:0] ERROR 
> o.a.d.e.p.i.validate.BatchValidator - Found one or more vector errors from 
> FilterRecordBatch
> n_nationkey - IntVector: Row count = 2, but value count = 25
> 18:44:49.363 [22d4256e-3d91-850f-9ab4-5939219ac0d0:frag:0:0] ERROR 
> o.a.d.e.p.i.validate.BatchValidator - Found one or more vector errors from 
> FilterRecordBatch
> c_custkey - IntVector: Row count = 4, but value count = 1500
> 18:44:49.597 [22d4256d-c113-ae5c-6f31-4dd1ec091365:frag:0:0] ERROR 
> o.a.d.e.p.i.validate.BatchValidator - Found one or more vector errors from 
> FilterRecordBatch
> n_nationkey - IntVector: Row count = 5, but value count = 25
> n_regionkey - IntVector: Row count = 5, but value count = 25
> 18:44:49.610 [22d4256d-c113-ae5c-6f31-4dd1ec091365:frag:0:0] ERROR 
> o.a.d.e.p.i.validate.BatchValidator - Found one or more vector errors from 
> FilterRecordBatch
> r_regionkey - IntVector: Row count = 1, but value count = 5
> 18:44:53.029 [22d4256a-8b70-5f3b-f79b-806e194c5ed2:frag:0:0] ERROR 
> o.a.d.e.p.i.validate.BatchValidator - Found one or more vector errors from 
> LimitRecordBatch
> n_nationkey - IntVector: Row count = 0, but value count = 25
> n_name - VarCharVector: Row count = 0, but value count = 25
> n_regionkey - IntVector: Row count = 0, but value count = 25
> 18:44:53.033 [22d4256a-8b70-5f3b-f79b-806e194c5ed2:frag:0:0] ERROR 
> o.a.d.e.p.i.validate.BatchValidator - Found one or more vector errors from 
> LimitRecordBatch
> n_regionkey - IntVector: Row count = 5, but value count = 25
> 18:44:53.331 [22d4256a-526c-7815-c216-8e45752a4a6c:frag:0:0] ERROR 
> o.a.d.e.p.i.validate.BatchValidator - Found one or more vector errors from 
> LimitRecordBatch
> n_nationkey - IntVector: Row count = 5, but value count = 25
> n_name - VarCharVector: Row count = 5, but value count = 25
> n_regionkey - IntVector: Row count = 5, but value count = 25
> 18:44:53.337 [22d4256a-526c-7815-c216-8e45752a4a6c:frag:0:0] ERROR 
> o.a.d.e.p.i.validate.BatchValidator - Found one or more vector errors from 
> LimitRecordBatch
> n_regionkey - IntVector: Row count = 0, but value count = 25
> 18:44:53.646 [22d42569-c293-ced0-c3d0-e9153cc4a70a:frag:0:0] ERROR 
> o.a.d.e.p.i.validate.BatchValidator - Found one or more vector errors from 
> LimitRecordBatch
> key - NullableBitVector: Row count = 0, but value count = 2
> Running org.apache.drill.TestTpchSingleMode
> 18:45:01.299 

[jira] [Commented] (DRILL-7359) Add support for DICT type in RowSet Framework

2019-11-29 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16985210#comment-16985210
 ] 

ASF GitHub Bot commented on DRILL-7359:
---

paul-rogers commented on pull request #1870: DRILL-7359: Add support for DICT 
type in RowSet Framework
URL: https://github.com/apache/drill/pull/1870#discussion_r352252495
 
 

 ##
 File path: 
exec/vector/src/main/java/org/apache/drill/exec/vector/accessor/reader/UnionReaderImpl.java
 ##
 @@ -268,4 +274,51 @@ public String getAsString() {
     }
     return requireReader(type).getAsString();
   }
+
+  private UnionReaderImpl getNullReader() {
+    AbstractObjectReader[] nullVariants = new AbstractObjectReader[variants.length];
+    for (int i = 0; i < variants.length; i++) {
+      nullVariants[i] = variants[i].createNullReader();
+    }
+    return new NullUnionReader(schema(), unionAccessor, nullVariants);
+  }
+
+  private static class NullUnionReader extends UnionReaderImpl {
+
+    private NullUnionReader(ColumnMetadata metadata, VectorAccessor va, AbstractObjectReader[] variants) {
 
 Review comment:
   This approach is a bit better than the big null reader we had before. I 
wonder, however, why we need such a reader? Null (AKA "dummy") readers exist, 
but generally for specialized cases (such as a List that has no type). I 
wonder, is this for the case where the value of a key/value pair is null?
   
   If so, then two thoughts. First, the right way to handle that is to use 
the `isNull()` method on the `ColumnReader` class:
   
   ```
   if (myDictReader.value().isNull()) {
     // Handle null value
   } else {
     int x = myDictReader.value().scalar().getInt();
   }
   ```
   
   The idea is that null values are null: they have no structure. The approach 
here seems to say that null values have structure (such as a null union), so we 
can ask what types the null has and so on. But, this is, perhaps, too much of a 
fantasy.
   
   The second thought is to declare that values cannot be null, especially if 
they are of structured types.
   
   Also, note that, for unions, we already have multiple levels of nulls: The 
union itself can be null. When used in a list, the list entry can be null. The 
union can have nullable types, so that the union might have an int, but the int 
is null. This is already a mess, we probably don't want to make it even more 
complex.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Add support for DICT type in RowSet Framework
> -
>
> Key: DRILL-7359
> URL: https://issues.apache.org/jira/browse/DRILL-7359
> Project: Apache Drill
>  Issue Type: New Feature
>Reporter: Bohdan Kazydub
>Assignee: Bohdan Kazydub
>Priority: Major
> Fix For: 1.17.0
>
>
> Add support for new DICT data type (see DRILL-7096) in RowSet Framework



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (DRILL-7393) Revisit Drill tests to ensure that patching is executed before any test run

2019-11-29 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16985207#comment-16985207
 ] 

ASF GitHub Bot commented on DRILL-7393:
---

paul-rogers commented on pull request #1910: DRILL-7393: Revisit Drill tests to 
ensure that patching is executed b…
URL: https://github.com/apache/drill/pull/1910#discussion_r352251662
 
 

 ##
 File path: common/src/test/java/org/apache/drill/test/BaseTest.java
 ##
 @@ -15,24 +15,25 @@
  * See the License for the specific language governing permissions and
  * limitations under the License.
  */
-package org.apache.drill.exec.expr;
+package org.apache.drill.test;
 
-import org.apache.drill.BaseTestQuery;
-import org.apache.drill.common.util.TestTools;
-import org.junit.Test;
+import org.apache.drill.common.util.GuavaPatcher;
+import org.apache.drill.common.util.ProtobufPatcher;
 
-public class TestPrune extends BaseTestQuery {
+public class BaseTest {
 
 Review comment:
   The term `BaseTest` is too generic: soon this will contain all manner of 
common code. Better solutions are:
   
   * `MapRTestBase` and `HBaseTestBase` classes that patch the code as needed 
for just those clients.
   * Patchers as a service, added to tests where needed, as is done with the 
`dirTestWatcher`.
   * The patch-on-load approach suggested below.
   
   The approach here is far too much of a hack to maintain.
   
   Also, this patching means that code running in tests operates in a different 
environment than the same code running in production. That will become the 
cause of very mysterious bugs.
   
   Can we rethink our approach?
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Revisit Drill tests to ensure that patching is executed before any test run
> ---
>
> Key: DRILL-7393
> URL: https://issues.apache.org/jira/browse/DRILL-7393
> Project: Apache Drill
>  Issue Type: Task
>Affects Versions: 1.16.0, 1.17.0
>Reporter: Arina Ielchiieva
>Assignee: Anton Gozhiy
>Priority: Major
>  Labels: ready-to-commit
>
> Apache Drill patches some Protobuf and Guava classes (see GuavaPatcher, 
> ProtobufPatcher); patching must be done before the classes to be patched are 
> loaded. That is why this operation is executed in a static block in the 
> Drillbit class.
> Some tests in the java-exec module use the Drillbit class, and some extend 
> the DrillTest class; both of them patch Guava. But there are some tests that 
> do not call a patcher yet load classes to be patched. For example, 
> {{org.apache.drill.exec.sql.TestSqlBracketlessSyntax}} loads the Guava 
> Preconditions class. If such tests run before tests that require patching, 
> the test run will fail since patching won't be successful. The Patcher code 
> does not fail the application if patching was incomplete; it just logs a 
> warning ({{logger.warn("Unable to patch Guava classes.", e);}}), so it is 
> sometimes hard to identify the root cause of unit test failures.
> We need to revisit all Drill tests to ensure that all of them extend a common 
> test base class which patches the Protobuf and Guava classes in a static 
> block. Also, refactor the Patcher classes to assert and fail if patching 
> fails during unit testing.
> After all tests are revised, we can remove the {{metastore-test}} execution 
> from main.xml in {{maven-surefire-plugin}}, which was added to ensure that 
> all Metastore tests run in a separate JVM where patching is done first, since 
> the Iceberg Metastore heavily depends on the patched Guava Preconditions 
> class.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (DRILL-7393) Revisit Drill tests to ensure that patching is executed before any test run

2019-11-29 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16985205#comment-16985205
 ] 

ASF GitHub Bot commented on DRILL-7393:
---

paul-rogers commented on pull request #1910: DRILL-7393: Revisit Drill tests to 
ensure that patching is executed b…
URL: https://github.com/apache/drill/pull/1910#discussion_r352250931
 
 

 ##
 File path: common/src/main/java/org/apache/drill/common/util/GuavaPatcher.java
 ##
 @@ -24,154 +24,157 @@
 import java.util.stream.Collectors;
 import java.util.stream.IntStream;
 
-import javassist.CannotCompileException;
 import javassist.ClassPool;
 import javassist.CtClass;
 import javassist.CtConstructor;
 import javassist.CtMethod;
 import javassist.CtNewMethod;
-import javassist.NotFoundException;
 import org.slf4j.Logger;
 import org.slf4j.LoggerFactory;
 
 public class GuavaPatcher {
   private static final Logger logger = LoggerFactory.getLogger(GuavaPatcher.class);
 
-  private static boolean patched;
+  private static boolean patchingAttempted;
 
   public static synchronized void patch() {
-    if (!patched) {
-      try {
-        patchStopwatch();
-        patchCloseables();
-        patchPreconditions();
-        patched = true;
-      } catch (Throwable e) {
-        logger.warn("Unable to patch Guava classes.", e);
-      }
+    if (!patchingAttempted) {
+      patchingAttempted = true;
 Review comment:
   This version no longer has a try/catch block. Is the idea that, if patching 
fails, the test should fail? If this particular attempt fails, we've marked 
patching as attempted and we won't try again. This means that future tests that 
need the patching will mysteriously fail.
   
   Should we, instead, simply do a `System.exit(1)` if patching fails, to shut 
down the tests? Or print a warning to stderr that subsequent tests may fail?
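
The "fail fast" alternative raised in the comment can be sketched as follows (hypothetical code, not the actual GuavaPatcher; `applyPatches()` is a placeholder for the real Stopwatch/Closeables/Preconditions steps):

```java
// Sketch of a fail-fast patcher: if patching fails, report loudly and abort
// the JVM so later tests cannot fail mysteriously on unpatched classes.
public class FailFastPatcher {
  private static boolean patchingAttempted;
  static int patchCount; // visible so the behavior can be checked below

  public static synchronized void patch() {
    if (patchingAttempted) {
      return; // only ever attempt once
    }
    patchingAttempted = true;
    try {
      applyPatches();
    } catch (Throwable e) {
      // Make the root cause obvious, then stop the test JVM entirely.
      System.err.println("FATAL: class patching failed; aborting test run");
      e.printStackTrace();
      System.exit(1);
    }
  }

  // Placeholder for the real patch steps (Stopwatch, Closeables, Preconditions).
  private static void applyPatches() {
    patchCount++;
  }

  public static void main(String[] args) {
    patch();
    patch(); // second call is a no-op
    System.out.println(patchCount); // prints 1
  }
}
```

This keeps the once-only guard but converts a silent warning into an immediate, visible abort, which is one way to resolve the trade-off the comment describes.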
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Revisit Drill tests to ensure that patching is executed before any test run
> ---
>
> Key: DRILL-7393
> URL: https://issues.apache.org/jira/browse/DRILL-7393
> Project: Apache Drill
>  Issue Type: Task
>Affects Versions: 1.16.0, 1.17.0
>Reporter: Arina Ielchiieva
>Assignee: Anton Gozhiy
>Priority: Major
>  Labels: ready-to-commit
>
> Apache Drill patches some Protobuf and Guava classes (see GuavaPatcher, 
> ProtobufPatcher); patching must be done before the classes to be patched are 
> loaded. That is why this operation is executed in a static block in the 
> Drillbit class.
> Some tests in the java-exec module use the Drillbit class, and some extend 
> the DrillTest class; both of them patch Guava. But there are some tests that 
> do not call a patcher yet load classes to be patched. For example, 
> {{org.apache.drill.exec.sql.TestSqlBracketlessSyntax}} loads the Guava 
> Preconditions class. If such tests run before tests that require patching, 
> the test run will fail since patching won't be successful. The Patcher code 
> does not fail the application if patching was incomplete; it just logs a 
> warning ({{logger.warn("Unable to patch Guava classes.", e);}}), so it is 
> sometimes hard to identify the root cause of unit test failures.
> We need to revisit all Drill tests to ensure that all of them extend a common 
> test base class which patches the Protobuf and Guava classes in a static 
> block. Also, refactor the Patcher classes to assert and fail if patching 
> fails during unit testing.
> After all tests are revised, we can remove the {{metastore-test}} execution 
> from main.xml in {{maven-surefire-plugin}}, which was added to ensure that 
> all Metastore tests run in a separate JVM where patching is done first, since 
> the Iceberg Metastore heavily depends on the patched Guava Preconditions 
> class.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (DRILL-7393) Revisit Drill tests to ensure that patching is executed before any test run

2019-11-29 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16985206#comment-16985206
 ] 

ASF GitHub Bot commented on DRILL-7393:
---

paul-rogers commented on pull request #1910: DRILL-7393: Revisit Drill tests to 
ensure that patching is executed b…
URL: https://github.com/apache/drill/pull/1910#discussion_r352251049
 
 

 ##
 File path: common/src/main/java/org/apache/drill/common/util/GuavaPatcher.java
 ##
 @@ -24,154 +24,157 @@
 import java.util.stream.Collectors;
 import java.util.stream.IntStream;
 
-import javassist.CannotCompileException;
 import javassist.ClassPool;
 import javassist.CtClass;
 import javassist.CtConstructor;
 import javassist.CtMethod;
 import javassist.CtNewMethod;
-import javassist.NotFoundException;
 import org.slf4j.Logger;
 import org.slf4j.LoggerFactory;
 
 public class GuavaPatcher {
   private static final Logger logger = LoggerFactory.getLogger(GuavaPatcher.class);
 
-  private static boolean patched;
+  private static boolean patchingAttempted;
 
   public static synchronized void patch() {
-    if (!patched) {
-      try {
-        patchStopwatch();
-        patchCloseables();
-        patchPreconditions();
-        patched = true;
-      } catch (Throwable e) {
-        logger.warn("Unable to patch Guava classes.", e);
-      }
+    if (!patchingAttempted) {
+      patchingAttempted = true;
+      patchStopwatch();
+      patchCloseables();
+      patchPreconditions();
     }
   }
 
   /**
    * Makes Guava stopwatch look like the old version for compatibility with hbase-server (for test purposes).
    */
-  private static void patchStopwatch() throws Exception {
-
-    ClassPool cp = ClassPool.getDefault();
-    CtClass cc = cp.get("com.google.common.base.Stopwatch");
-
-    // Expose the constructor for Stopwatch for old libraries who use the pattern new Stopwatch().start().
-    for (CtConstructor c : cc.getConstructors()) {
-      if (!Modifier.isStatic(c.getModifiers())) {
-        c.setModifiers(Modifier.PUBLIC);
+  private static void patchStopwatch() {
+    try {
+      ClassPool cp = ClassPool.getDefault();
+      CtClass cc = cp.get("com.google.common.base.Stopwatch");
+
+      // Expose the constructor for Stopwatch for old libraries who use the pattern new Stopwatch().start().
+      for (CtConstructor c : cc.getConstructors()) {
+        if (!Modifier.isStatic(c.getModifiers())) {
+          c.setModifiers(Modifier.PUBLIC);
+        }
       }
-    }
 
-    // Add back the Stopwatch.elapsedMillis() method for old consumers.
-    CtMethod newMethod = CtNewMethod.make(
-        "public long elapsedMillis() { return elapsed(java.util.concurrent.TimeUnit.MILLISECONDS); }", cc);
-    cc.addMethod(newMethod);
+      // Add back the Stopwatch.elapsedMillis() method for old consumers.
+      CtMethod newMethod = CtNewMethod.make(
+          "public long elapsedMillis() { return elapsed(java.util.concurrent.TimeUnit.MILLISECONDS); }", cc);
+      cc.addMethod(newMethod);
 
-    // Load the modified class instead of the original.
-    cc.toClass();
+      // Load the modified class instead of the original.
+      cc.toClass();
 
-    logger.info("Google's Stopwatch patched for old HBase Guava version.");
+      logger.info("Google's Stopwatch patched for old HBase Guava version.");
+    } catch (Exception e) {
+      logger.warn("Unable to patch Guava classes.", e);
 
 Review comment:
   We are warning on errors, but then proceeding. The test that caused this 
patch will still run, but presumably will fail. Is this the desired behaviour?
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Revisit Drill tests to ensure that patching is executed before any test run
> ---
>
> Key: DRILL-7393
> URL: https://issues.apache.org/jira/browse/DRILL-7393
> Project: Apache Drill
>  Issue Type: Task
>Affects Versions: 1.16.0, 1.17.0
>Reporter: Arina Ielchiieva
>Assignee: Anton Gozhiy
>Priority: Major
>  Labels: ready-to-commit
>
> Apache Drill patches some Protobuf and Guava classes (see GuavaPatcher, 
> ProtobufPatcher); patching must be done before the classes to be patched are 
> loaded. That is why this operation is executed in a static block in the 
> Drillbit class.
> Some tests in the java-exec module use the Drillbit class, and some extend 
> the DrillTest class; both of them patch Guava. But there are some tests that 
> do not call a patcher yet load classes to be patched. For example, 
> 

[jira] [Commented] (DRILL-7393) Revisit Drill tests to ensure that patching is executed before any test run

2019-11-29 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16985204#comment-16985204
 ] 

ASF GitHub Bot commented on DRILL-7393:
---

paul-rogers commented on pull request #1910: DRILL-7393: Revisit Drill tests to 
ensure that patching is executed b…
URL: https://github.com/apache/drill/pull/1910#discussion_r352250720
 
 

 ##
 File path: common/src/main/java/org/apache/drill/common/util/GuavaPatcher.java
 ##
 @@ -24,154 +24,157 @@
 import java.util.stream.Collectors;
 import java.util.stream.IntStream;
 
-import javassist.CannotCompileException;
 import javassist.ClassPool;
 import javassist.CtClass;
 import javassist.CtConstructor;
 import javassist.CtMethod;
 import javassist.CtNewMethod;
-import javassist.NotFoundException;
 import org.slf4j.Logger;
 import org.slf4j.LoggerFactory;
 
 public class GuavaPatcher {
   private static final Logger logger = LoggerFactory.getLogger(GuavaPatcher.class);
 
-  private static boolean patched;
+  private static boolean patchingAttempted;
 
 Review comment:
   Maven runs tests concurrently in the same JVM. Does this flag need 
synchronization?
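
One standard answer to the question above is an atomic once-only guard, which needs no lock on the calling method (illustrative sketch only; `OncePatcher` and its fields are invented names, not Drill code):

```java
import java.util.concurrent.atomic.AtomicBoolean;

// An AtomicBoolean whose compareAndSet guarantees that exactly one caller
// performs the patching, even when many threads race into patch().
public class OncePatcher {
  private static final AtomicBoolean patchingAttempted = new AtomicBoolean(false);
  private static volatile int attempts = 0;

  public static void patch() {
    // compareAndSet(false, true) returns true for exactly one thread.
    if (patchingAttempted.compareAndSet(false, true)) {
      attempts++; // safe: only the single winning thread reaches this line
      // ... perform patching here ...
    }
  }

  static int getAttempts() { return attempts; }

  public static void main(String[] args) throws InterruptedException {
    // Hammer patch() from several threads; only one should win the CAS.
    Thread[] threads = new Thread[8];
    for (int i = 0; i < threads.length; i++) {
      threads[i] = new Thread(OncePatcher::patch);
      threads[i].start();
    }
    for (Thread t : threads) {
      t.join();
    }
    System.out.println(getAttempts()); // prints 1
  }
}
```

A `synchronized` method (as in the PR) also works; the atomic flag simply avoids holding a lock on every later call.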
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Revisit Drill tests to ensure that patching is executed before any test run
> ---
>
> Key: DRILL-7393
> URL: https://issues.apache.org/jira/browse/DRILL-7393
> Project: Apache Drill
>  Issue Type: Task
>Affects Versions: 1.16.0, 1.17.0
>Reporter: Arina Ielchiieva
>Assignee: Anton Gozhiy
>Priority: Major
>  Labels: ready-to-commit
>
> Apache Drill patches some Protobuf and Guava classes (see GuavaPatcher, 
> ProtobufPatcher); patching must happen before the classes to be patched are 
> loaded, which is why the operation is executed in a static block in the 
> Drillbit class.
> Some tests in the java-exec module use the Drillbit class and some extend the 
> DrillTest class; both patch Guava. But some tests call no patcher yet load 
> classes that need patching. For example, 
> {{org.apache.drill.exec.sql.TestSqlBracketlessSyntax}} loads the Guava 
> Preconditions class. If such tests run before tests that require patching, the 
> test run fails because patching can no longer succeed. The patcher code does 
> not fail the application when patching is incomplete, it just logs a warning 
> ({{logger.warn("Unable to patch Guava classes.", e);}}), so it is sometimes 
> hard to identify the root cause of unit test failures.
> We need to revisit all Drill tests to ensure that they extend a common test 
> base class that patches the Protobuf and Guava classes in a static block. Also 
> refactor the Patcher classes to assert, and thus fail, during unit testing if 
> patching does not succeed.
> After all tests are revised, we can remove the {{metastore-test}} execution 
> from main.xml in {{maven-surefire-plugin}}, which was added to ensure that all 
> Metastore tests run in a separate JVM where patching happens first, since the 
> Iceberg Metastore depends heavily on the patched Guava Preconditions class.
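
The patch-before-load requirement described above rests on Java's class-initialization guarantee: a superclass's static initializer runs before any subclass code executes. A self-contained sketch of this ordering, with hypothetical names and a simple stand-in for the real patcher:

```java
import java.util.ArrayList;
import java.util.List;

public class PatchOrderDemo {
    public static final List<String> EVENTS = new ArrayList<>();

    // Stand-in for GuavaPatcher.patch(): records that patching ran.
    static void patch() {
        EVENTS.add("patch");
    }

    // Stand-in for a shared test base class: patching in a static block
    // guarantees it runs before any subclass code.
    static class BaseTest {
        static {
            patch();
        }
    }

    // Stand-in for a concrete test class: its static initializer runs only
    // after the superclass (BaseTest) has been fully initialized.
    static class SomeTest extends BaseTest {
        static {
            EVENTS.add("test-class-loaded");
        }

        void run() {
            EVENTS.add("test-ran");
        }
    }

    public static void main(String[] args) {
        new SomeTest().run();
        // Superclass static initializers run first, so this
        // prints: patch,test-class-loaded,test-ran
        System.out.println(String.join(",", EVENTS));
    }
}
```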



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (DRILL-7393) Revisit Drill tests to ensure that patching is executed before any test run

2019-11-29 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16985208#comment-16985208
 ] 

ASF GitHub Bot commented on DRILL-7393:
---

paul-rogers commented on pull request #1910: DRILL-7393: Revisit Drill tests to 
ensure that patching is executed b…
URL: https://github.com/apache/drill/pull/1910#discussion_r352251533
 
 

 ##
 File path: common/src/test/java/org/apache/drill/test/BaseTest.java
 ##
 @@ -15,24 +15,25 @@
  * See the License for the specific language governing permissions and
  * limitations under the License.
  */
-package org.apache.drill.exec.expr;
+package org.apache.drill.test;
 
-import org.apache.drill.BaseTestQuery;
-import org.apache.drill.common.util.TestTools;
-import org.junit.Test;
+import org.apache.drill.common.util.GuavaPatcher;
+import org.apache.drill.common.util.ProtobufPatcher;
 
-public class TestPrune extends BaseTestQuery {
+public class BaseTest {
 
-  String MULTILEVEL = TestTools.getWorkingPath() + 
"/../java-exec/src/test/resources/multilevel";
-
-  @Test
-  public void pruneCompound() throws Exception {
-test(String.format("select * from dfs.`%s/csv` where x is null and dir1 in 
('Q1', 'Q2')", MULTILEVEL));
-  }
-
-  @Test
-  public void pruneSimple() throws Exception {
-test(String.format("select * from dfs.`%s/csv` where dir1 in ('Q1', 
'Q2')", MULTILEVEL));
+  static {
+/*
+ * HBase and MapR-DB clients use older version of protobuf,
+ * and override some methods that became final in recent versions.
+ * This code removes these final modifiers.
 
 Review comment:
   Shouldn't these clients be addressed directly? We're doing patching for 
tests, but what about production code? Possible solutions:
   
   * Load these clients into their own class loader so their dependencies do 
not conflict.
   * Shade them.
   * Upgrade to newer versions that use more recent Guava libraries.
   * Do the patching within Drill: perhaps have a plugin-specific startup event 
that can trigger patching where needed.
   * Lobby the creators of the libraries to fix the problems.
   
   My concern is that changing all our tests to work around bugs in external 
libraries is an unstable solution: we'll find ourselves doing more of this each 
time some new external conflict arises.
 



[jira] [Commented] (DRILL-7393) Revisit Drill tests to ensure that patching is executed before any test run

2019-11-29 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16985203#comment-16985203
 ] 

ASF GitHub Bot commented on DRILL-7393:
---

paul-rogers commented on issue #1910: DRILL-7393: Revisit Drill tests to ensure 
that patching is executed b…
URL: https://github.com/apache/drill/pull/1910#issuecomment-559887767
 
 
   The meta-question is why we still need this after all these years? Shouldn't 
we try to normalize our usage of Guava? Also, we moved Drill's use into a 
shaded jar so that, presumably, it won't conflict with other usages. So, do we 
still need this patching hack?
 



[jira] [Updated] (DRILL-7393) Revisit Drill tests to ensure that patching is executed before any test run

2019-11-29 Thread Vova Vysotskyi (Jira)


 [ 
https://issues.apache.org/jira/browse/DRILL-7393?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vova Vysotskyi updated DRILL-7393:
--
Labels: ready-to-commit  (was: )



[jira] [Commented] (DRILL-7393) Revisit Drill tests to ensure that patching is executed before any test run

2019-11-29 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16985129#comment-16985129
 ] 

ASF GitHub Bot commented on DRILL-7393:
---

agozhiy commented on issue #1910: DRILL-7393: Revisit Drill tests to ensure 
that patching is executed b…
URL: https://github.com/apache/drill/pull/1910#issuecomment-559850966
 
 
   @vvysotskyi, thank you for review, I made the changes you requested.
 



[jira] [Commented] (DRILL-6963) create/aggregate/work with array

2019-11-29 Thread benj (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-6963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16985067#comment-16985067
 ] 

benj commented on DRILL-6963:
-

For the second point (array_agg), pending an eventual official function, here 
is a simple implementation (without support for _DISTINCT_ or _ORDER BY_):
{code:java}
package org.apache.drill.contrib.function;

import io.netty.buffer.DrillBuf;
import org.apache.drill.exec.expr.DrillAggFunc;
import org.apache.drill.exec.expr.annotations.FunctionTemplate;
import org.apache.drill.exec.expr.annotations.FunctionTemplate.FunctionScope;
import org.apache.drill.exec.expr.annotations.FunctionTemplate.NullHandling;
import org.apache.drill.exec.expr.annotations.Output;
import org.apache.drill.exec.expr.annotations.Param;
import org.apache.drill.exec.expr.annotations.Workspace;
import org.apache.drill.exec.expr.holders.*;

import javax.inject.Inject;

// If the dataset is too large, run: ALTER SESSION SET `planner.enable_hashagg` = false
public class ArrayAgg {

  // STRING NULLABLE //
  @FunctionTemplate(
      name = "array_agg",
      scope = FunctionScope.POINT_AGGREGATE,
      nulls = NullHandling.INTERNAL)
  public static class NullableVarChar_ArrayAgg implements DrillAggFunc {
    @Param NullableVarCharHolder input;
    @Workspace ObjectHolder agg;
    @Output org.apache.drill.exec.vector.complex.writer.BaseWriter.ComplexWriter out;
    @Inject DrillBuf buffer;

    @Override
    public void setup() {
      agg = new ObjectHolder();
    }

    @Override
    public void reset() {
      agg = new ObjectHolder();
    }

    @Override
    public void add() {
      if (agg.obj == null) {
        agg.obj = out.rootAsList();
      }

      if (input.isSet == 0) {
        return;
      }

      org.apache.drill.exec.expr.holders.VarCharHolder rowHolder =
          new org.apache.drill.exec.expr.holders.VarCharHolder();
      byte[] inputBytes = org.apache.drill.exec.expr.fn.impl.StringFunctionHelpers
          .getStringFromVarCharHolder(input)
          .getBytes(com.google.common.base.Charsets.UTF_8);
      // reallocIfNeeded returns the (possibly new) buffer, so reassign it
      buffer = buffer.reallocIfNeeded(inputBytes.length);
      buffer.setBytes(0, inputBytes);
      rowHolder.start = 0;
      rowHolder.end = inputBytes.length;
      rowHolder.buffer = buffer;

      org.apache.drill.exec.vector.complex.writer.BaseWriter.ListWriter listWriter =
          (org.apache.drill.exec.vector.complex.writer.BaseWriter.ListWriter) agg.obj;
      listWriter.varChar().write(rowHolder);
    }

    @Override
    public void output() {
      ((org.apache.drill.exec.vector.complex.writer.BaseWriter.ListWriter) agg.obj).endList();
    }
  }

  // INTEGER NULLABLE //
  @FunctionTemplate(
      name = "array_agg",
      scope = FunctionScope.POINT_AGGREGATE,
      nulls = NullHandling.INTERNAL)
  public static class NullableInt_ArrayAgg implements DrillAggFunc {
    @Param NullableIntHolder input;
    @Workspace ObjectHolder agg;
    @Output org.apache.drill.exec.vector.complex.writer.BaseWriter.ComplexWriter out;
    @Inject DrillBuf buffer;

    @Override
    public void setup() {
      agg = new ObjectHolder();
    }

    @Override
    public void reset() {
      agg = new ObjectHolder();
    }

    @Override
    public void add() {
      if (agg.obj == null) {
        agg.obj = out.rootAsList();
      }

      if (input.isSet == 0) {
        return;
      }

      org.apache.drill.exec.vector.complex.writer.BaseWriter.ListWriter listWriter =
          (org.apache.drill.exec.vector.complex.writer.BaseWriter.ListWriter) agg.obj;
      listWriter.integer().writeInt(input.value);
    }

    @Override
    public void output() {
      ((org.apache.drill.exec.vector.complex.writer.BaseWriter.ListWriter) agg.obj).endList();
    }
  }

  // ... (other types elided)
}
{code}
 

> create/aggregate/work with array
> 
>
> Key: DRILL-6963
> URL: https://issues.apache.org/jira/browse/DRILL-6963
> Project: Apache Drill
>  Issue Type: Wish
>  Components: Functions - Drill
>Reporter: benj
>Priority: Major
>
> * Add the possibility to build array (like : SELECT array[a1,a2,a3...]) - 
> ideally work with all types
>  * Add a default array_agg (like : SELECT col1, array_agg(col2), 
> array_agg(DISTINCT col2) FROM ... GROUP BY col1) ;  - ideally work with all 
> types
>  * Add function/facilities/operator to work with array





[jira] [Commented] (DRILL-5844) Incorrect values of TABLE_TYPE returned from method DatabaseMetaData.getTables of JDBC API

2019-11-29 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-5844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16985065#comment-16985065
 ] 

ASF GitHub Bot commented on DRILL-5844:
---

arina-ielchiieva commented on issue #1904: DRILL-5844: Incorrect values of 
TABLE_TYPE returned from method DatabaseMetaData.getTables of JDBC API
URL: https://github.com/apache/drill/pull/1904#issuecomment-559819512
 
 
   @arjuntheprogrammer did you have a chance to fix test failures?
 



> Incorrect values of TABLE_TYPE returned from method 
> DatabaseMetaData.getTables of JDBC API
> --
>
> Key: DRILL-5844
> URL: https://issues.apache.org/jira/browse/DRILL-5844
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Client - JDBC, Metadata
>Affects Versions: 1.10.0
>Reporter: second88
>Assignee: Arjun Gupta
>Priority: Minor
>  Labels: ready-to-commit
> Fix For: 1.17.0
>
>
> As far as I can see, the values of TABLE_TYPE returned from method 
> DatabaseMetaData.getTables of JDBC API of a Drill Connection include:
> TABLE
> VIEW
> SYSTEM_TABLE
> According to [JDBC 
> API|http://docs.oracle.com/javase/7/docs/api/java/sql/DatabaseMetaData.html#getTables(java.lang.String,%20java.lang.String,%20java.lang.String,%20java.lang.String\[\])],
>  the typical types are "TABLE", "VIEW", "SYSTEM TABLE", "GLOBAL TEMPORARY", 
> "LOCAL TEMPORARY", "ALIAS", "SYNONYM".
> Therefore "SYSTEM_TABLE" should be replaced by "SYSTEM TABLE".
> Besides, I wonder if this bug is related to another bug 
> [DRILL-5843|https://issues.apache.org/jira/browse/DRILL-5843] reported by me.
> It should be noted that the values of TABLE_TYPE returned from the methods 
> DatabaseMetaData.getTables and DatabaseMetaData.getTableTypes should match 
> one-to-one with, but need not be identical to, those in 
> INFORMATION_SCHEMA.TABLES.TABLE_TYPE; for instance, "TABLE" vs. "BASE TABLE".
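
A minimal illustration of the correction proposed above: map Drill's internal type name to the JDBC-typical spelling before returning it from getTables. The helper below is hypothetical, not Drill's actual code:

```java
public class TableTypeNormalizer {
    // JDBC's typical table types use spaces ("SYSTEM TABLE"), while Drill's
    // internal name uses an underscore ("SYSTEM_TABLE"); all other values
    // ("TABLE", "VIEW", ...) already match and pass through unchanged.
    public static String normalize(String drillType) {
        return "SYSTEM_TABLE".equals(drillType) ? "SYSTEM TABLE" : drillType;
    }

    public static void main(String[] args) {
        System.out.println(normalize("SYSTEM_TABLE")); // prints SYSTEM TABLE
        System.out.println(normalize("VIEW"));         // prints VIEW
    }
}
```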





[jira] [Updated] (DRILL-7233) Format Plugin for HDF5

2019-11-29 Thread Arina Ielchiieva (Jira)


 [ 
https://issues.apache.org/jira/browse/DRILL-7233?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arina Ielchiieva updated DRILL-7233:

Fix Version/s: (was: 1.17.0)

> Format Plugin for HDF5
> --
>
> Key: DRILL-7233
> URL: https://issues.apache.org/jira/browse/DRILL-7233
> Project: Apache Drill
>  Issue Type: New Feature
>Affects Versions: 1.17.0
>Reporter: Charles Givre
>Assignee: Charles Givre
>Priority: Major
>  Labels: doc-impacting
> Fix For: 1.18.0
>
>
> h2. Drill HDF5 Format Plugin
> Per wikipedia, Hierarchical Data Format (HDF) is a set of file formats 
> designed to store and organize large amounts of data. Originally developed at 
> the National Center for Supercomputing Applications, it is supported by The 
> HDF Group, a non-profit corporation whose mission is to ensure continued 
> development of HDF5 technologies and the continued accessibility of data 
> stored in HDF.
> This plugin enables Apache Drill to query HDF5 files.
> h3. Configuration
> There are three configuration variables in this plugin:
> type: This should be set to hdf5.
> extensions: This is a list of the file extensions used to identify HDF5 
> files. Typically HDF5 uses .h5 or .hdf5 as file extensions. This defaults to 
> .h5.
> defaultPath:
> h3. Example Configuration
> For most uses, the configuration below will suffice to enable Drill to query 
> HDF5 files.
> {{"hdf5": {
>   "type": "hdf5",
>   "extensions": [
> "h5"
>   ],
>   "defaultPath": null
> }}}
> h3. Usage
> Since HDF5 can be viewed as a file system within a file, a single file can 
> contain many datasets. For instance, if you have a simple HDF5 file, a star 
> query will produce the following result:
> {{apache drill> select * from dfs.test.`dset.h5`;
> +-------+-----------+-----------+--------------------------------------------------------------------------+
> | path  | data_type | file_name | int_data                                                                 |
> +-------+-----------+-----------+--------------------------------------------------------------------------+
> | /dset | DATASET   | dset.h5   | [[1,2,3,4,5,6],[7,8,9,10,11,12],[13,14,15,16,17,18],[19,20,21,22,23,24]] |
> +-------+-----------+-----------+--------------------------------------------------------------------------+}}
> The actual data in this file is mapped to a column called int_data. In order 
> to effectively access the data, you should use Drill's FLATTEN() function on 
> the int_data column, which produces the following result.
> {{apache drill> select flatten(int_data) as int_data from dfs.test.`dset.h5`;
> +---------------------+
> |      int_data       |
> +---------------------+
> | [1,2,3,4,5,6]       |
> | [7,8,9,10,11,12]    |
> | [13,14,15,16,17,18] |
> | [19,20,21,22,23,24] |
> +---------------------+}}
> Once you have the data in this form, you can access it similarly to how you 
> might access nested data in JSON or other files.
> {{apache drill> SELECT int_data[0] as col_0,
> . .semicolon> int_data[1] as col_1,
> . .semicolon> int_data[2] as col_2
> . .semicolon> FROM ( SELECT flatten(int_data) AS int_data
> . . . . . .)> FROM dfs.test.`dset.h5`
> . . . . . .)> );
> +-------+-------+-------+
> | col_0 | col_1 | col_2 |
> +-------+-------+-------+
> | 1     | 2     | 3     |
> | 7     | 8     | 9     |
> | 13    | 14    | 15    |
> | 19    | 20    | 21    |
> +-------+-------+-------+}}
> Alternatively, a better way to query the actual data in an HDF5 file is to 
> use the defaultPath field in your query. If the defaultPath field is defined 
> in the query, or via the plugin configuration, Drill will only return the 
> data, rather than the file metadata.
> ** Note: Once you have determined which data set you are querying, it is 
> advisable to use this method to query HDF5 data. **
> You can set the defaultPath variable in either the plugin configuration, or 
> at query time using the table() function as shown in the example below:
> {{SELECT * 
> FROM table(dfs.test.`dset.h5` (type => 'hdf5', defaultPath => '/dset'))}}
> This query will return the result below:
> {{apache drill> SELECT * FROM table(dfs.test.`dset.h5` (type => 'hdf5', 
> defaultPath => '/dset'));
> +-----------+-----------+-----------+-----------+-----------+-----------+
> | int_col_0 | int_col_1 | int_col_2 | int_col_3 | int_col_4 | int_col_5 |
> +-----------+-----------+-----------+-----------+-----------+-----------+
> | 1         | 2         | 3         | 4         | 5         | 6         |
> | 7         | 8         | 9         | 10        | 11        | 12        |
> | 13        | 14        | 15        | 16        | 17        | 18        |
> | 19        | 20        | 21        | 22        | 23        | 24        |
> 

[jira] [Updated] (DRILL-7091) Query with EXISTS and correlated subquery fails with NPE in HashJoinMemoryCalculatorImpl$BuildSidePartitioningImpl

2019-11-29 Thread Arina Ielchiieva (Jira)


 [ 
https://issues.apache.org/jira/browse/DRILL-7091?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arina Ielchiieva updated DRILL-7091:

Fix Version/s: (was: 1.17.0)
   1.18.0

> Query with EXISTS and correlated subquery fails with NPE in 
> HashJoinMemoryCalculatorImpl$BuildSidePartitioningImpl
> --
>
> Key: DRILL-7091
> URL: https://issues.apache.org/jira/browse/DRILL-7091
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.15.0
>Reporter: Vova Vysotskyi
>Assignee: Bohdan Kazydub
>Priority: Major
> Fix For: 1.18.0
>
>
> Steps to reproduce:
> 1. Create view:
> {code:sql}
> create view dfs.tmp.nation_view as select * from cp.`tpch/nation.parquet`;
> {code}
> 2. Run the following query:
> {code:sql}
> SELECT n_nationkey, n_name
> FROM  dfs.tmp.nation_view a
> WHERE EXISTS (SELECT 1
> FROM cp.`tpch/region.parquet` b
> WHERE b.r_regionkey =  a.n_regionkey)
> {code}
> This query fails with NPE:
> {noformat}
> [Error Id: 9a592635-f792-4403-965c-bd2eece7e8fc on cv1:31010]
>   at 
> org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:633)
>  ~[drill-common-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.work.fragment.FragmentExecutor.sendFinalState(FragmentExecutor.java:364)
>  [drill-java-exec-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.work.fragment.FragmentExecutor.cleanup(FragmentExecutor.java:219)
>  [drill-java-exec-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:330)
>  [drill-java-exec-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT]
>   at 
> org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38)
>  [drill-common-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT]
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>  [na:1.8.0_161]
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>  [na:1.8.0_161]
>   at java.lang.Thread.run(Thread.java:748) [na:1.8.0_161]
> Caused by: java.lang.NullPointerException: null
>   at 
> org.apache.drill.exec.physical.impl.join.HashJoinMemoryCalculatorImpl$BuildSidePartitioningImpl.initialize(HashJoinMemoryCalculatorImpl.java:267)
>  ~[drill-java-exec-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.physical.impl.join.HashJoinBatch.executeBuildPhase(HashJoinBatch.java:959)
>  ~[drill-java-exec-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.physical.impl.join.HashJoinBatch.innerNext(HashJoinBatch.java:525)
>  ~[drill-java-exec-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:186)
>  ~[drill-java-exec-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:126)
>  ~[drill-java-exec-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:116)
>  ~[drill-java-exec-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.record.AbstractUnaryRecordBatch.innerNext(AbstractUnaryRecordBatch.java:63)
>  ~[drill-java-exec-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext(ProjectRecordBatch.java:141)
>  ~[drill-java-exec-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:186)
>  ~[drill-java-exec-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:126)
>  ~[drill-java-exec-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.test.generated.HashAggregatorGen2.doWork(HashAggTemplate.java:642)
>  ~[na:na]
>   at 
> org.apache.drill.exec.physical.impl.aggregate.HashAggBatch.innerNext(HashAggBatch.java:295)
>  ~[drill-java-exec-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:186)
>  ~[drill-java-exec-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:126)
>  ~[drill-java-exec-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:116)
>  ~[drill-java-exec-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.record.AbstractUnaryRecordBatch.innerNext(AbstractUnaryRecordBatch.java:63)
>  

[jira] [Comment Edited] (DRILL-4547) Javadoc fails with Java8

2019-11-29 Thread Arina Ielchiieva (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-4547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16973744#comment-16973744
 ] 

Arina Ielchiieva edited comment on DRILL-4547 at 11/29/19 3:08 PM:
---

[~volodymyr] please check if anything can be done before 1.17 release.


was (Author: arina):
[~volodymyr] please schema if anything can be done before 1.17 release.

> Javadoc fails with Java8
> 
>
> Key: DRILL-4547
> URL: https://issues.apache.org/jira/browse/DRILL-4547
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Tools, Build & Test
>Affects Versions: 1.6.0
>Reporter: Laurent Goujon
>Assignee: Vova Vysotskyi
>Priority: Major
>
> Javadoc cannot be generated when using Java8 (likely because the parser is 
> now more strict).
> Here's an example of issues when trying to generate javadocs in module 
> {{drill-fmpp-maven-plugin}}
> {noformat}
> [ERROR] Failed to execute goal 
> org.apache.maven.plugins:maven-javadoc-plugin:2.9.1:jar (attach-javadocs) on 
> project drill-fmpp-maven-plugin: MavenReportException: Error while creating 
> archive:
> [ERROR] Exit code: 1 - 
> /Users/laurent/devel/drill/tools/fmpp/src/main/java/org/apache/drill/fmpp/mojo/FMPPMojo.java:44:
>  error: unknown tag: goal
> [ERROR] * @goal generate
> [ERROR] ^
> [ERROR] 
> /Users/laurent/devel/drill/tools/fmpp/src/main/java/org/apache/drill/fmpp/mojo/FMPPMojo.java:45:
>  error: unknown tag: phase
> [ERROR] * @phase generate-sources
> [ERROR] ^
> [ERROR] 
> /Users/laurent/devel/drill/tools/fmpp/target/generated-sources/plugin/org/apache/drill/fmpp/mojo/HelpMojo.java:25:
>  error: unknown tag: goal
> [ERROR] * @goal help
> [ERROR] ^
> [ERROR] 
> /Users/laurent/devel/drill/tools/fmpp/target/generated-sources/plugin/org/apache/drill/fmpp/mojo/HelpMojo.java:26:
>  error: unknown tag: requiresProject
> [ERROR] * @requiresProject false
> [ERROR] ^
> [ERROR] 
> /Users/laurent/devel/drill/tools/fmpp/target/generated-sources/plugin/org/apache/drill/fmpp/mojo/HelpMojo.java:27:
>  error: unknown tag: threadSafe
> [ERROR] * @threadSafe
> [ERROR] ^
> [ERROR] 
> [ERROR] Command line was: 
> /Library/Java/JavaVirtualMachines/jdk1.8.0_72.jdk/Contents/Home/bin/javadoc 
> @options @packages
> [ERROR] 
> [ERROR] Refer to the generated Javadoc files in 
> '/Users/laurent/devel/drill/tools/fmpp/target/apidocs' dir.
> [ERROR] -> [Help 1]
> [ERROR] 
> [ERROR] To see the full stack trace of the errors, re-run Maven with the -e 
> switch.
> [ERROR] Re-run Maven using the -X switch to enable full debug logging.
> [ERROR] 
> [ERROR] For more information about the errors and possible solutions, please 
> read the following articles:
> [ERROR] [Help 1] 
> http://cwiki.apache.org/confluence/display/MAVEN/MojoExecutionException
> [ERROR] 
> [ERROR] After correcting the problems, you can resume the build with the 
> command
> [ERROR]   mvn  -rf :drill-fmpp-maven-plugin
> {noformat}
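The errors above all come from legacy javadoc-tag mojo metadata ({{@goal}}, {{@phase}}, {{@threadSafe}}, {{@requiresProject}}), which the stricter Java 8 doclint parser rejects. One common remedy (shown as a hedged sketch, not the actual fix applied in Drill; class and goal names mirror the FMPPMojo example above and are illustrative) is to move that metadata into {{maven-plugin-annotations}}, which are real Java annotations and therefore invisible to javadoc:

```java
// Sketch only: expresses the same mojo metadata as the rejected javadoc tags,
// using org.apache.maven.plugins.annotations instead. Requires the
// maven-plugin-api and maven-plugin-annotations dependencies to compile.
import org.apache.maven.plugin.AbstractMojo;
import org.apache.maven.plugins.annotations.LifecyclePhase;
import org.apache.maven.plugins.annotations.Mojo;

@Mojo(name = "generate",                              // was: @goal generate
      defaultPhase = LifecyclePhase.GENERATE_SOURCES, // was: @phase generate-sources
      threadSafe = true)                              // was: @threadSafe
public class FMPPMojo extends AbstractMojo {
  @Override
  public void execute() {
    // generation logic elided
  }
}
```

With annotations, the generated {{HelpMojo}} no longer needs {{@goal}}/{{@requiresProject}} javadoc tags either, so the doclint errors disappear at the source.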



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (DRILL-7208) Drill commit is not shown if build Drill from the 1.16.0-rc1 release sources

2019-11-29 Thread Anton Gozhiy (Jira)


 [ 
https://issues.apache.org/jira/browse/DRILL-7208?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anton Gozhiy updated DRILL-7208:

Labels: ready-to-commit  (was: )

> Drill commit is not shown if build Drill from the 1.16.0-rc1 release sources
> 
>
> Key: DRILL-7208
> URL: https://issues.apache.org/jira/browse/DRILL-7208
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.15.0, 1.16.0
>Reporter: Anton Gozhiy
>Assignee: Vova Vysotskyi
>Priority: Major
>  Labels: ready-to-commit
> Fix For: 1.17.0
>
>
> *Steps:*
>  # Download the rc1 sources tarball:
>  
> [apache-drill-1.16.0-src.tar.gz|http://home.apache.org/~sorabh/drill/releases/1.16.0/rc1/apache-drill-1.16.0-src.tar.gz]
>  # Unpack
>  # Build:
> {noformat}
> mvn clean install -DskipTests
> {noformat}
>  # Start Drill in embedded mode:
> {noformat}
> Linux:
> distribution/target/apache-drill-1.16.0/apache-drill-1.16.0/bin/drill-embedded
> Windows:
> distribution\target\apache-drill-1.16.0\apache-drill-1.16.0\bin\sqlline.bat 
> -u "jdbc:drill:zk=local"
> {noformat}
>  # Run the query:
> {code:sql}
> select * from sys.version;
> {code}
> *Expected result:*
>  Drill version, commit_id, commit_message, commit_time, build_email, 
> build_time should be correctly displayed.
> *Actual result:*
> {noformat}
> apache drill> select * from sys.version;
> +---------+-----------+----------------+-------------+-------------+------------+
> | version | commit_id | commit_message | commit_time | build_email | build_time |
> +---------+-----------+----------------+-------------+-------------+------------+
> | 1.16.0  | Unknown   |                |             | Unknown     |            |
> +---------+-----------+----------------+-------------+-------------+------------+
> {noformat}





[jira] [Commented] (DRILL-7393) Revisit Drill tests to ensure that patching is executed before any test run

2019-11-29 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16984966#comment-16984966
 ] 

ASF GitHub Bot commented on DRILL-7393:
---

vvysotskyi commented on pull request #1910: DRILL-7393: Revisit Drill tests to 
ensure that patching is executed b…
URL: https://github.com/apache/drill/pull/1910#discussion_r352104234
 
 

 ##
 File path: 
exec/java-exec/src/test/java/org/apache/drill/BaseTestInheritance.java
 ##
 @@ -0,0 +1,63 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill;
+
+import org.apache.drill.categories.UnlikelyTest;
+import org.apache.drill.test.BaseTest;
+import org.junit.Test;
+import org.junit.experimental.categories.Category;
+import org.reflections.Reflections;
+import org.reflections.scanners.SubTypesScanner;
+
+import java.lang.reflect.Method;
+import java.util.Set;
+import java.util.stream.Collectors;
+
+import static org.junit.Assert.assertTrue;
+
+public class BaseTestInheritance extends BaseTest {
+
+  @Test
+  @Category(UnlikelyTest.class)
+  public void verifyInheritance() {
+    // Get all BaseTest inheritors
+    Reflections reflections = new Reflections("org.apache.drill", new SubTypesScanner(false));
+    Set<String> baseTestInheritors = reflections.getSubTypesOf(BaseTest.class).stream()
+        .map(Class::getName)
+        .collect(Collectors.toSet());
+    // Get all tests
+    Set<String> testClasses = reflections.getSubTypesOf(Object.class).stream()
+        .filter(c -> !c.isInterface())
+        .filter(c -> c.getSimpleName().toLowerCase().contains("test"))
+        .filter(c -> {
+          for (Method m : c.getDeclaredMethods()) {
 
 Review comment:
   Please use find any here.
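Reading "find any" as a suggestion to replace the explicit for-loop with a short-circuiting stream operation such as {{anyMatch}}, the filter body could look like this sketch (a local {{@Test}} stand-in is used so the snippet runs without JUnit on the classpath):

```java
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.util.Arrays;

public class AnyMatchSketch {
  // Local stand-in for org.junit.Test so the sketch is self-contained.
  @Retention(RetentionPolicy.RUNTIME)
  @interface Test {}

  static class Sample {
    @Test
    void annotated() {}
    void plain() {}
  }

  // Stream-based form of the reviewed for-loop: true if any declared
  // method carries the @Test annotation, short-circuiting on first hit.
  static boolean hasTestMethod(Class<?> c) {
    return Arrays.stream(c.getDeclaredMethods())
        .anyMatch(m -> m.getAnnotation(Test.class) != null);
  }

  public static void main(String[] args) {
    System.out.println(hasTestMethod(Sample.class));  // true
    System.out.println(hasTestMethod(String.class));  // false
  }
}
```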
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Revisit Drill tests to ensure that patching is executed before any test run
> ---
>
> Key: DRILL-7393
> URL: https://issues.apache.org/jira/browse/DRILL-7393
> Project: Apache Drill
>  Issue Type: Task
>Affects Versions: 1.16.0, 1.17.0
>Reporter: Arina Ielchiieva
>Assignee: Anton Gozhiy
>Priority: Major
>
> Apache Drill patches some Protobuf and Guava classes (see GuavaPatcher, 
> ProtobufPatcher), patching should be done before classes to be patched are 
> loaded. That's why this operation is executed in static block in Drillbit 
> class.
> Some tests in java-exec module use Drillbit class, some extend DrillTest 
> class, both of them patch Guava. But there are some tests that do not call 
> patcher but load classes to be patched. For example, 
> {{org.apache.drill.exec.sql.TestSqlBracketlessSyntax}} loads Guava 
> Preconditions class. If such tests run before tests that require patching, 
> the test run will fail since patching won't be successful. The Patcher code does
> not fail the application if patching was incomplete; it just logs a warning
> ({{logger.warn("Unable to patch Guava classes.", e);}}), so it is sometimes hard
> to identify the root cause of unit test failures.
> We need to revisit all Drill tests to ensure that all of them extend a common
> test base class which patches Protobuf and Guava classes in a static block.
> Also, refactor the Patcher classes to assert (and fail) if patching fails during
> unit testing.
> After all tests are revised, we can remove {{metastore-test}} execution from 
> main.xml in {{maven-surefire-plugin}} which was added to ensure that all 
> Metastore tests run in a separate JVM where patching is done in first place 
> since Iceberg Metastore heavily depends on patched Guava Preconditions class.
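The static-block pattern described above can be sketched in miniature (class names are hypothetical stand-ins; in Drill the static block would call {{GuavaPatcher.patch()}} / {{ProtobufPatcher.patch()}}): because a superclass's static initializer runs before any subclass is initialized, every test extending the base class is guaranteed to see patching done first.

```java
public class PatchOrderSketch {
  static boolean patched = false;

  // Base class: its static initializer runs before any subclass is
  // initialized, mirroring how a shared test base class can patch
  // Guava/Protobuf before test classes load the classes being patched.
  static class BaseTest {
    static {
      patched = true; // stand-in for GuavaPatcher.patch() / ProtobufPatcher.patch()
    }
  }

  static class SomeTest extends BaseTest {}

  public static void main(String[] args) {
    new SomeTest(); // initializing the subclass triggers BaseTest's static block first
    System.out.println(patched); // true
  }
}
```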



--


[jira] [Commented] (DRILL-7393) Revisit Drill tests to ensure that patching is executed before any test run

2019-11-29 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16984967#comment-16984967
 ] 

ASF GitHub Bot commented on DRILL-7393:
---

vvysotskyi commented on pull request #1910: DRILL-7393: Revisit Drill tests to 
ensure that patching is executed b…
URL: https://github.com/apache/drill/pull/1910#discussion_r352104680
 
 

 ##
 File path: 
exec/java-exec/src/test/java/org/apache/drill/BaseTestInheritance.java
 ##
 @@ -0,0 +1,63 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill;
+
+import org.apache.drill.categories.UnlikelyTest;
+import org.apache.drill.test.BaseTest;
+import org.junit.Test;
+import org.junit.experimental.categories.Category;
+import org.reflections.Reflections;
+import org.reflections.scanners.SubTypesScanner;
+
+import java.lang.reflect.Method;
+import java.util.Set;
+import java.util.stream.Collectors;
+
+import static org.junit.Assert.assertTrue;
+
+public class BaseTestInheritance extends BaseTest {
+
+  @Test
+  @Category(UnlikelyTest.class)
+  public void verifyInheritance() {
+    // Get all BaseTest inheritors
+    Reflections reflections = new Reflections("org.apache.drill", new SubTypesScanner(false));
+    Set<String> baseTestInheritors = reflections.getSubTypesOf(BaseTest.class).stream()
+        .map(Class::getName)
+        .collect(Collectors.toSet());
+    // Get all tests
+    Set<String> testClasses = reflections.getSubTypesOf(Object.class).stream()
+        .filter(c -> !c.isInterface())
+        .filter(c -> c.getSimpleName().toLowerCase().contains("test"))
+        .filter(c -> {
+          for (Method m : c.getDeclaredMethods()) {
+            if (m.getAnnotation(Test.class) != null) {
+              return true;
+            }
+          }
+          return false;
+        })
+        .map(Class::getName)
+        .collect(Collectors.toSet());
+
+    testClasses.removeAll(baseTestInheritors);
 
 Review comment:
   We may avoid this by adding one more filter into the stream above.
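The extra filter the reviewer suggests could look like this sketch (names are hypothetical stand-ins for the sets in the reviewed test): excluding known inheritors inside the pipeline makes the later {{removeAll}} call unnecessary.

```java
import java.util.Set;
import java.util.stream.Collectors;

public class FilterInsteadOfRemoveAll {
  // Exclude known inheritors with one more filter in the pipeline,
  // instead of collecting first and then calling removeAll on the result.
  static Set<String> illegitimate(Set<String> candidates, Set<String> inheritors) {
    return candidates.stream()
        .filter(name -> !inheritors.contains(name))
        .collect(Collectors.toSet());
  }

  public static void main(String[] args) {
    // Hypothetical class names standing in for the reviewed test's data.
    Set<String> inheritors = Set.of("org.apache.drill.GoodTest");
    Set<String> candidates = Set.of("org.apache.drill.GoodTest", "org.apache.drill.BadTest");
    System.out.println(illegitimate(candidates, inheritors)); // [org.apache.drill.BadTest]
  }
}
```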
 



> Revisit Drill tests to ensure that patching is executed before any test run
> ---
>
> Key: DRILL-7393
> URL: https://issues.apache.org/jira/browse/DRILL-7393
> Project: Apache Drill
>  Issue Type: Task
>Affects Versions: 1.16.0, 1.17.0
>Reporter: Arina Ielchiieva
>Assignee: Anton Gozhiy
>Priority: Major
>
> Apache Drill patches some Protobuf and Guava classes (see GuavaPatcher, 
> ProtobufPatcher), patching should be done before classes to be patched are 
> loaded. That's why this operation is executed in static block in Drillbit 
> class.
> Some tests in java-exec module use Drillbit class, some extend DrillTest 
> class, both of them patch Guava. But there are some tests that do not call 
> patcher but load classes to be patched. For example, 
> {{org.apache.drill.exec.sql.TestSqlBracketlessSyntax}} loads Guava 
> Preconditions class. If such tests run before tests that require patching, 
> the test run will fail since patching won't be successful. The Patcher code does
> not fail the application if patching was incomplete; it just logs a warning
> ({{logger.warn("Unable to patch Guava classes.", e);}}), so it is sometimes hard
> to identify the root cause of unit test failures.
> We need to revisit all Drill tests to ensure that all of them extend a common
> test base class which patches Protobuf and Guava classes in a static block.
> Also, refactor the Patcher classes to assert (and fail) if patching fails during
> unit testing.
> After all tests are revised, we can remove {{metastore-test}} execution from
> main.xml in {{maven-surefire-plugin}} which was added to ensure that all
> Metastore tests run in a separate JVM where patching is done in the first place,
> since Iceberg Metastore heavily depends on the patched Guava Preconditions class.

[jira] [Commented] (DRILL-7393) Revisit Drill tests to ensure that patching is executed before any test run

2019-11-29 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16984965#comment-16984965
 ] 

ASF GitHub Bot commented on DRILL-7393:
---

vvysotskyi commented on pull request #1910: DRILL-7393: Revisit Drill tests to 
ensure that patching is executed b…
URL: https://github.com/apache/drill/pull/1910#discussion_r352110212
 
 

 ##
 File path: 
exec/java-exec/src/test/java/org/apache/drill/BaseTestInheritance.java
 ##
 @@ -0,0 +1,63 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill;
+
+import org.apache.drill.categories.UnlikelyTest;
+import org.apache.drill.test.BaseTest;
+import org.junit.Test;
+import org.junit.experimental.categories.Category;
+import org.reflections.Reflections;
+import org.reflections.scanners.SubTypesScanner;
+
+import java.lang.reflect.Method;
+import java.util.Set;
+import java.util.stream.Collectors;
+
+import static org.junit.Assert.assertTrue;
+
+public class BaseTestInheritance extends BaseTest {
+
+  @Test
+  @Category(UnlikelyTest.class)
+  public void verifyInheritance() {
+    // Get all BaseTest inheritors
+    Reflections reflections = new Reflections("org.apache.drill", new SubTypesScanner(false));
+    Set<String> baseTestInheritors = reflections.getSubTypesOf(BaseTest.class).stream()
+        .map(Class::getName)
+        .collect(Collectors.toSet());
+    // Get all tests
+    Set<String> testClasses = reflections.getSubTypesOf(Object.class).stream()
+        .filter(c -> !c.isInterface())
+        .filter(c -> c.getSimpleName().toLowerCase().contains("test"))
+        .filter(c -> {
+          for (Method m : c.getDeclaredMethods()) {
+            if (m.getAnnotation(Test.class) != null) {
+              return true;
+            }
+          }
+          return false;
+        })
+        .map(Class::getName)
+        .collect(Collectors.toSet());
+
+    testClasses.removeAll(baseTestInheritors);
+    String illegitimateTests = String.join("\n", testClasses);
+    assertTrue(String.format("The following test classes are not inherited from BaseTest:\n%s", illegitimateTests),
 
 Review comment:
  We may use `assertEquals` and specify an empty collection, so it wouldn't be
required to append the list to the error string.
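The suggestion can be sketched with a plain-Java stand-in for JUnit's {{assertEquals(expected, actual)}} (names hypothetical): comparing against an empty set makes the failure message carry the offending names automatically, so there is no need to {{String.join}} them into the message by hand.

```java
import java.util.Collections;
import java.util.Set;
import java.util.TreeSet;

public class AssertEqualsEmptySketch {
  // Minimal stand-in for org.junit.Assert.assertEquals(expected, actual):
  // on mismatch, the thrown message already embeds both collections.
  static void assertEquals(Object expected, Object actual) {
    if (!expected.equals(actual)) {
      throw new AssertionError("expected:<" + expected + "> but was:<" + actual + ">");
    }
  }

  public static void main(String[] args) {
    Set<String> illegitimateTests = new TreeSet<>(); // offenders would land here

    // No manual String.join into the message: a non-empty set would be
    // printed by the assertion failure itself.
    assertEquals(Collections.emptySet(), illegitimateTests);
    System.out.println("ok"); // reached only if the set is empty
  }
}
```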
 



> Revisit Drill tests to ensure that patching is executed before any test run
> ---
>
> Key: DRILL-7393
> URL: https://issues.apache.org/jira/browse/DRILL-7393
> Project: Apache Drill
>  Issue Type: Task
>Affects Versions: 1.16.0, 1.17.0
>Reporter: Arina Ielchiieva
>Assignee: Anton Gozhiy
>Priority: Major
>
> Apache Drill patches some Protobuf and Guava classes (see GuavaPatcher, 
> ProtobufPatcher), patching should be done before classes to be patched are 
> loaded. That's why this operation is executed in static block in Drillbit 
> class.
> Some tests in java-exec module use Drillbit class, some extend DrillTest 
> class, both of them patch Guava. But there are some tests that do not call 
> patcher but load classes to be patched. For example, 
> {{org.apache.drill.exec.sql.TestSqlBracketlessSyntax}} loads Guava 
> Preconditions class. If such tests run before tests that require patching, 
> the test run will fail since patching won't be successful. The Patcher code does
> not fail the application if patching was incomplete; it just logs a warning
> ({{logger.warn("Unable to patch Guava classes.", e);}}), so it is sometimes hard
> to identify the root cause of unit test failures.
> We need to revisit all Drill tests to ensure that all of them extend a common
> test base class which patches Protobuf and Guava classes in a static block.
> Also, refactor the Patcher classes to assert (and fail) if patching fails during
> unit testing.

[jira] [Commented] (DRILL-7393) Revisit Drill tests to ensure that patching is executed before any test run

2019-11-29 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16984964#comment-16984964
 ] 

ASF GitHub Bot commented on DRILL-7393:
---

vvysotskyi commented on pull request #1910: DRILL-7393: Revisit Drill tests to 
ensure that patching is executed b…
URL: https://github.com/apache/drill/pull/1910#discussion_r352094027
 
 

 ##
 File path: 
common/src/test/java/org/apache/drill/common/util/function/TestCheckedFunction.java
 ##
 @@ -17,14 +17,15 @@
  */
 package org.apache.drill.common.util.function;
 
+import org.apache.drill.test.DrillTest;
 import org.junit.Rule;
 import org.junit.Test;
 import org.junit.rules.ExpectedException;
 
 import java.util.HashMap;
 import java.util.Map;
 
-public class TestCheckedFunction {
+public class TestCheckedFunction extends DrillTest {
 
 Review comment:
   Why not `BaseTest`?
 



> Revisit Drill tests to ensure that patching is executed before any test run
> ---
>
> Key: DRILL-7393
> URL: https://issues.apache.org/jira/browse/DRILL-7393
> Project: Apache Drill
>  Issue Type: Task
>Affects Versions: 1.16.0, 1.17.0
>Reporter: Arina Ielchiieva
>Assignee: Anton Gozhiy
>Priority: Major
>
> Apache Drill patches some Protobuf and Guava classes (see GuavaPatcher, 
> ProtobufPatcher), patching should be done before classes to be patched are 
> loaded. That's why this operation is executed in static block in Drillbit 
> class.
> Some tests in java-exec module use Drillbit class, some extend DrillTest 
> class, both of them patch Guava. But there are some tests that do not call 
> patcher but load classes to be patched. For example, 
> {{org.apache.drill.exec.sql.TestSqlBracketlessSyntax}} loads Guava 
> Preconditions class. If such tests run before tests that require patching, 
> the test run will fail since patching won't be successful. The Patcher code does
> not fail the application if patching was incomplete; it just logs a warning
> ({{logger.warn("Unable to patch Guava classes.", e);}}), so it is sometimes hard
> to identify the root cause of unit test failures.
> We need to revisit all Drill tests to ensure that all of them extend a common
> test base class which patches Protobuf and Guava classes in a static block.
> Also, refactor the Patcher classes to assert (and fail) if patching fails during
> unit testing.
> After all tests are revised, we can remove {{metastore-test}} execution from 
> main.xml in {{maven-surefire-plugin}} which was added to ensure that all 
> Metastore tests run in a separate JVM where patching is done in first place 
> since Iceberg Metastore heavily depends on patched Guava Preconditions class.



--


[jira] [Commented] (DRILL-7208) Drill commit is not shown if build Drill from the 1.16.0-rc1 release sources

2019-11-29 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16984953#comment-16984953
 ] 

ASF GitHub Bot commented on DRILL-7208:
---

vvysotskyi commented on issue #1911: DRILL-7208: Reuse root git.properties file
URL: https://github.com/apache/drill/pull/1911#issuecomment-559772432
 
 
   @agozhiy, good question! Plugins declared for the same phase will be
executed in the order of their appearance in the pom file, but currently this
may cause problems since `git-commit-id-plugin` is applied after
`maven-resources-plugin`, so an outdated file may be placed into the
target/classes folder. Thanks for pointing that out, I have corrected the plugin order.
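The ordering fix described above could look like this pom fragment (a sketch: versions, goals, and phase bindings are illustrative, not the PR's actual configuration). Since Maven runs same-phase plugins in declaration order, `git-commit-id-plugin` must be declared before `maven-resources-plugin`:

```xml
<build>
  <plugins>
    <!-- Declared first: regenerates git.properties before resources are copied -->
    <plugin>
      <groupId>pl.project13.maven</groupId>
      <artifactId>git-commit-id-plugin</artifactId>
      <executions>
        <execution>
          <goals><goal>revision</goal></goals>
          <phase>process-resources</phase>
        </execution>
      </executions>
    </plugin>
    <!-- Declared second: copies the freshly generated file into target/classes -->
    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-resources-plugin</artifactId>
    </plugin>
  </plugins>
</build>
```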
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Drill commit is not shown if build Drill from the 1.16.0-rc1 release sources
> 
>
> Key: DRILL-7208
> URL: https://issues.apache.org/jira/browse/DRILL-7208
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.15.0, 1.16.0
>Reporter: Anton Gozhiy
>Assignee: Vova Vysotskyi
>Priority: Major
> Fix For: 1.17.0
>
>
> *Steps:*
>  # Download the rc1 sources tarball:
>  
> [apache-drill-1.16.0-src.tar.gz|http://home.apache.org/~sorabh/drill/releases/1.16.0/rc1/apache-drill-1.16.0-src.tar.gz]
>  # Unpack
>  # Build:
> {noformat}
> mvn clean install -DskipTests
> {noformat}
>  # Start Drill in embedded mode:
> {noformat}
> Linux:
> distribution/target/apache-drill-1.16.0/apache-drill-1.16.0/bin/drill-embedded
> Windows:
> distribution\target\apache-drill-1.16.0\apache-drill-1.16.0\bin\sqlline.bat 
> -u "jdbc:drill:zk=local"
> {noformat}
>  # Run the query:
> {code:sql}
> select * from sys.version;
> {code}
> *Expected result:*
>  Drill version, commit_id, commit_message, commit_time, build_email, 
> build_time should be correctly displayed.
> *Actual result:*
> {noformat}
> apache drill> select * from sys.version;
> +---------+-----------+----------------+-------------+-------------+------------+
> | version | commit_id | commit_message | commit_time | build_email | build_time |
> +---------+-----------+----------------+-------------+-------------+------------+
> | 1.16.0  | Unknown   |                |             | Unknown     |            |
> +---------+-----------+----------------+-------------+-------------+------------+
> {noformat}



--


[jira] [Commented] (DRILL-7208) Drill commit is not shown if build Drill from the 1.16.0-rc1 release sources

2019-11-29 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16984940#comment-16984940
 ] 

ASF GitHub Bot commented on DRILL-7208:
---

agozhiy commented on issue #1911: DRILL-7208: Reuse root git.properties file
URL: https://github.com/apache/drill/pull/1911#issuecomment-559769539
 
 
   @vvysotskyi, could there be a race condition between
maven-resources-plugin and git-commit-id-plugin?
 



> Drill commit is not shown if build Drill from the 1.16.0-rc1 release sources
> 
>
> Key: DRILL-7208
> URL: https://issues.apache.org/jira/browse/DRILL-7208
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.15.0, 1.16.0
>Reporter: Anton Gozhiy
>Assignee: Vova Vysotskyi
>Priority: Major
> Fix For: 1.17.0
>
>
> *Steps:*
>  # Download the rc1 sources tarball:
>  
> [apache-drill-1.16.0-src.tar.gz|http://home.apache.org/~sorabh/drill/releases/1.16.0/rc1/apache-drill-1.16.0-src.tar.gz]
>  # Unpack
>  # Build:
> {noformat}
> mvn clean install -DskipTests
> {noformat}
>  # Start Drill in embedded mode:
> {noformat}
> Linux:
> distribution/target/apache-drill-1.16.0/apache-drill-1.16.0/bin/drill-embedded
> Windows:
> distribution\target\apache-drill-1.16.0\apache-drill-1.16.0\bin\sqlline.bat 
> -u "jdbc:drill:zk=local"
> {noformat}
>  # Run the query:
> {code:sql}
> select * from sys.version;
> {code}
> *Expected result:*
>  Drill version, commit_id, commit_message, commit_time, build_email, 
> build_time should be correctly displayed.
> *Actual result:*
> {noformat}
> apache drill> select * from sys.version;
> +---------+-----------+----------------+-------------+-------------+------------+
> | version | commit_id | commit_message | commit_time | build_email | build_time |
> +---------+-----------+----------------+-------------+-------------+------------+
> | 1.16.0  | Unknown   |                |             | Unknown     |            |
> +---------+-----------+----------------+-------------+-------------+------------+
> {noformat}



--