[jira] [Resolved] (DRILL-3958) Improve error message when JDBC driver not found

2017-12-19 Thread Kunal Khatua (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3958?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kunal Khatua resolved DRILL-3958.
-
Resolution: Done
  Reviewer: Aman Sinha

> Improve error message when JDBC driver not found
> 
>
> Key: DRILL-3958
> URL: https://issues.apache.org/jira/browse/DRILL-3958
> Project: Apache Drill
> Issue Type: Improvement
> Components: Client - HTTP
> Affects Versions: 1.2.0
> Reporter: Uwe Geercken
> Assignee: Aman Sinha
> Priority: Critical
> Fix For: 1.13.0
>
>
> When setting up a storage definition for JDBC in the Drill web UI, the 
> appropriate driver has to be available in the 3rdparty folder before defining 
> the storage, otherwise an error is displayed.
> The error message refers to a JSON mapping error which is completely 
> inappropriate in this case, because the error is the missing JDBC driver in 
> the 3rdparty folder and not the JSON mapping.
> I request that the error message be changed to something appropriate, 
> stating that the referenced class/driver (for example, 
> com.mysql.jdbc.Driver) could not be found.
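
A minimal sketch of the kind of check the reporter asks for, not Drill's actual fix. The class `DriverCheck` and method `requireDriver` are hypothetical names introduced for illustration; the only real API used is the JDK's `Class.forName`.

```java
/** Hypothetical helper for illustration; not part of Drill. */
final class DriverCheck {
  private DriverCheck() {}

  /**
   * Verifies that the configured JDBC driver class can be loaded and,
   * if not, fails with a message that names the missing class.
   */
  static void requireDriver(String driverClassName) {
    try {
      // Load without initializing: we only care whether the class exists.
      Class.forName(driverClassName, false, DriverCheck.class.getClassLoader());
    } catch (ClassNotFoundException e) {
      throw new IllegalStateException(
          "JDBC driver class not found: " + driverClassName
              + ". Is the driver jar in the 3rdparty folder?", e);
    }
  }
}
```

Calling `DriverCheck.requireDriver("com.mysql.jdbc.Driver")` before the storage definition is deserialized would surface the missing class by name instead of a JSON mapping error.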



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Resolved] (DRILL-3958) Improve error message when JDBC driver not found

2017-12-19 Thread Kunal Khatua (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3958?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kunal Khatua resolved DRILL-3958.
-
Resolution: Done

> Improve error message when JDBC driver not found
> 
>
> Key: DRILL-3958
> URL: https://issues.apache.org/jira/browse/DRILL-3958
> Project: Apache Drill
> Issue Type: Improvement
> Components: Client - HTTP
> Affects Versions: 1.2.0
> Reporter: Uwe Geercken
> Priority: Critical
>
> When setting up a storage definition for JDBC in the Drill web UI, the 
> appropriate driver has to be available in the 3rdparty folder before defining 
> the storage, otherwise an error is displayed.
> The error message refers to a JSON mapping error which is completely 
> inappropriate in this case, because the error is the missing JDBC driver in 
> the 3rdparty folder and not the JSON mapping.
> I request that the error message be changed to something appropriate, 
> stating that the referenced class/driver (for example, 
> com.mysql.jdbc.Driver) could not be found.





[jira] [Created] (DRILL-6048) ListVector is incomplete and broken, RepeatedListVector works

2017-12-19 Thread Paul Rogers (JIRA)
Paul Rogers created DRILL-6048:
--

 Summary: ListVector is incomplete and broken, RepeatedListVector 
works
 Key: DRILL-6048
 URL: https://issues.apache.org/jira/browse/DRILL-6048
 Project: Apache Drill
  Issue Type: Bug
Affects Versions: 1.10.0
Reporter: Paul Rogers


Drill provides two kinds of "list vectors": {{ListVector}} and 
{{RepeatedListVector}}. I attempted to use the {{ListVector}} to implement 
lists in JSON. While some parts work, others are broken and JIRA tickets were 
filed.

Once things worked well enough to run a query, it turned out that the Project 
operator failed. Digging into the cause, it appears that the {{ListVector}} is 
incomplete and not used. Its implementation of {{makeTransferPair()}} was 
clearly never tested. A list has contents, but when this method attempts to 
create the contents of the target vector, it fails to create the list contents.

Elsewhere, we saw that the constructor did correctly create the vector, and 
that the {{promoteToUnion()}} had holes. The sheer number of bugs leads to the 
conclusion that this class is not, in fact, used or usable.

Looking more carefully at the JSON and older writer code, it appears that the 
ListVector was *not* used for JSON, and that JSON has the limitations of a 
repeated vector (it cannot support lists with null elements.)

This implies that the JSON reader itself is broken: it does not fully support 
JSON semantics because it does not use the {{ListVector}} that was intended for 
this purpose.

So, the conclusion is that JSON uses:

* Repeated vectors for single-dimensional arrays (without null support)
* {{RepeatedListVector}} for two-dimensional arrays

This triggers the question: what do we do for three-dimensional arrays?
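
To illustrate the null-element limitation described above, here is a toy model in Java. It is a sketch under assumptions, not Drill's vector classes: `RepeatedIntSketch` and `NullableElementListSketch` are hypothetical names, and the layout (an offsets array plus packed values) is a simplification.

```java
import java.util.Arrays;

/** Toy model, not Drill code: a repeated INT column stored as offsets + values. */
final class RepeatedIntSketch {
  private final int[] offsets; // row i spans values[offsets[i]] .. values[offsets[i + 1] - 1]
  private final int[] values;  // packed primitives; there is no per-element "is null" slot

  RepeatedIntSketch(int[] offsets, int[] values) {
    this.offsets = offsets;
    this.values = values;
  }

  int rowCount() {
    return offsets.length - 1;
  }

  int[] row(int i) {
    return Arrays.copyOfRange(values, offsets[i], offsets[i + 1]);
  }
}

/** Same layout plus a per-element validity flag, which null elements require. */
final class NullableElementListSketch {
  private final int[] offsets;
  private final int[] values;
  private final boolean[] isSet; // isSet[k] == false means element k is null

  NullableElementListSketch(int[] offsets, int[] values, boolean[] isSet) {
    this.offsets = offsets;
    this.values = values;
    this.isSet = isSet;
  }

  Integer[] row(int i) {
    Integer[] out = new Integer[offsets[i + 1] - offsets[i]];
    for (int k = offsets[i]; k < offsets[i + 1]; k++) {
      out[k - offsets[i]] = isSet[k] ? Integer.valueOf(values[k]) : null;
    }
    return out;
  }
}
```

In the first class a JSON array such as `[1, null, 3]` has nowhere to record the null; the second class shows the extra state a list representation would need.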





[GitHub] drill pull request #1057: DRILL-5993 VectorContainer Append and Generic Copi...

2017-12-19 Thread ilooner
Github user ilooner commented on a diff in the pull request:

https://github.com/apache/drill/pull/1057#discussion_r157894540
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/svremover/Copier.java
 ---
@@ -19,13 +19,15 @@
 
 import org.apache.drill.exec.compile.TemplateClassDefinition;
 import org.apache.drill.exec.exception.SchemaChangeException;
-import org.apache.drill.exec.ops.FragmentContext;
 import org.apache.drill.exec.record.RecordBatch;
+import org.apache.drill.exec.record.VectorContainer;
 
 public interface Copier {
-  public static TemplateClassDefinition TEMPLATE_DEFINITION2 = new TemplateClassDefinition(Copier.class, CopierTemplate2.class);
-  public static TemplateClassDefinition TEMPLATE_DEFINITION4 = new TemplateClassDefinition(Copier.class, CopierTemplate4.class);
+  TemplateClassDefinition TEMPLATE_DEFINITION2 = new TemplateClassDefinition(Copier.class, CopierTemplate2.class);
+  TemplateClassDefinition TEMPLATE_DEFINITION4 = new TemplateClassDefinition(Copier.class, CopierTemplate4.class);
--- End diff --

I will create a separate PR to do that since that change is unrelated to 
this PR


---


[GitHub] drill pull request #1057: DRILL-5993 VectorContainer Append and Generic Copi...

2017-12-19 Thread ilooner
Github user ilooner commented on a diff in the pull request:

https://github.com/apache/drill/pull/1057#discussion_r157894423
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/svremover/CopierTemplate2.java
 ---
@@ -53,17 +51,32 @@ public int copyRecords(int index, int recordCount) throws SchemaChangeException
   }
 }
 
-int outgoingPosition = 0;
+return insertRecords(0, index, recordCount);
+  }
+
+  @Override
+  public int appendRecord(int index) throws SchemaChangeException {
+return appendRecords(index, 1);
+  }
+
+  @Override
+  public int appendRecords(int index, int recordCount) throws SchemaChangeException {
+return insertRecords(outgoing.getRecordCount(), index, recordCount);
+  }
+
+  private int insertRecords(int outgoingPosition, int index, int recordCount) throws SchemaChangeException {
+final int endIndex = index + recordCount;
 
-for(int svIndex = index; svIndex < index + recordCount; svIndex++, outgoingPosition++){
+for(int svIndex = index; svIndex < endIndex; svIndex++, outgoingPosition++){
   doEval(sv2.getIndex(svIndex), outgoingPosition);
 }
+
+outgoing.setRecordCount(outgoingPosition);
 return outgoingPosition;
   }
 
-  public abstract void doSetup(@Named("context") FragmentContext context,
-   @Named("incoming") RecordBatch incoming,
-   @Named("outgoing") RecordBatch outgoing)
+  public abstract void doSetup(@Named("incoming") RecordBatch incoming,
+   @Named("outgoing") VectorContainer outgoing)
--- End diff --

The copiers are only used in the SVRemover and TopN operator. I have 
replaced the code generated copiers in both now to use the GenericCopiers.
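
For readers following the diff, the copy/append split can be sketched with plain arrays. This is an illustration under assumptions, not the generated copier: `CopierSketch` is a hypothetical class, the `int[]` fields stand in for the incoming batch, the SV2, and the outgoing container, and the array assignment stands in for `doEval(...)`.

```java
import java.util.Arrays;

/** Toy stand-in for the copier in the diff; names and types are hypothetical. */
final class CopierSketch {
  private final int[] incoming;  // source values
  private final int[] sv2;       // selection vector: indexes into incoming
  private final int[] outgoing;  // destination buffer
  private int outgoingCount;     // plays the role of outgoing.getRecordCount()

  CopierSketch(int[] incoming, int[] sv2, int capacity) {
    this.incoming = incoming;
    this.sv2 = sv2;
    this.outgoing = new int[capacity];
  }

  /** Overwrites the output, starting at position 0. */
  int copyRecords(int index, int recordCount) {
    return insertRecords(0, index, recordCount);
  }

  /** Appends a single selected record after the existing output. */
  int appendRecord(int index) {
    return appendRecords(index, 1);
  }

  /** Appends a run of selected records after the existing output. */
  int appendRecords(int index, int recordCount) {
    return insertRecords(outgoingCount, index, recordCount);
  }

  private int insertRecords(int outgoingPosition, int index, int recordCount) {
    final int endIndex = index + recordCount;
    for (int svIndex = index; svIndex < endIndex; svIndex++, outgoingPosition++) {
      outgoing[outgoingPosition] = incoming[sv2[svIndex]]; // stands in for doEval(...)
    }
    outgoingCount = outgoingPosition; // mirrors outgoing.setRecordCount(...)
    return outgoingPosition;
  }

  int[] output() {
    return Arrays.copyOf(outgoing, outgoingCount);
  }
}
```

The only difference between copying and appending is the starting output position, which is why both delegate to one private method.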


---


[jira] [Created] (DRILL-6047) Update doc to include instructions for libpam4j

2017-12-19 Thread Bridget Bevens (JIRA)
Bridget Bevens created DRILL-6047:
-

 Summary: Update doc to include instructions for libpam4j
 Key: DRILL-6047
 URL: https://issues.apache.org/jira/browse/DRILL-6047
 Project: Apache Drill
  Issue Type: Sub-task
  Components: Documentation
Affects Versions: 1.12.0
Reporter: Bridget Bevens
Assignee: Bridget Bevens
Priority: Minor
 Fix For: 1.12.0


Update Apache Drill docs to include JPAM and libpam4j PAM authenticator 
instructions.





[jira] [Created] (DRILL-6046) Define semantics of vector metadata

2017-12-19 Thread Paul Rogers (JIRA)
Paul Rogers created DRILL-6046:
--

 Summary: Define semantics of vector metadata
 Key: DRILL-6046
 URL: https://issues.apache.org/jira/browse/DRILL-6046
 Project: Apache Drill
  Issue Type: Improvement
Affects Versions: 1.10.0
Reporter: Paul Rogers
Priority: Minor


Vectors provide metadata in the form of the {{MaterializedField}}. This class 
has evolved in an ad-hoc fashion over time, resulting in inconsistent behavior 
across vectors. The inconsistent behavior causes bugs and slow development 
because each vector follows different rules. Consistent behavior would, by 
contrast, lead to faster development and fewer bugs by reducing the number of 
variations that code must handle.

Issues include:

* Map vectors, but not lists, can create contents given a list of children in 
the {{MaterializedField}} passed to the constructor.
* {{MaterializedField}} appears to want to be immutable, but it does allow 
changing of children. Unions also want to change the list of subtypes, but that 
is in the immutable {{MajorType}}, causing a union to rebuild and replace its 
{{MaterializedField}} on addition of a new type. By contrast, maps do not 
replace the field, they just add children.
* Container vectors (maps, unions, lists) hold references to child 
{{MaterializedField}} instances. But, because unions replace their fields, 
parents fall out of sync: they still point to the old version from before the 
update. The metadata becomes inconsistent, so code cannot trust it.
* Lists and maps, but not unions, list their children in the field.
* Nullable types, but not repeated types, include internal vectors in their 
list of children. 
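
The stale-reference problem in the third bullet can be reproduced with a toy model. This is a sketch under assumptions: `FieldSketch`, `UnionSketch`, and `MapSketch` are hypothetical stand-ins, not Drill's {{MaterializedField}} or vector classes.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Immutable metadata object; adding a subtype yields a new instance. */
final class FieldSketch {
  final String name;
  final List<String> subtypes;

  FieldSketch(String name, List<String> subtypes) {
    this.name = name;
    this.subtypes = Collections.unmodifiableList(new ArrayList<>(subtypes));
  }

  FieldSketch withSubtype(String type) {
    List<String> copy = new ArrayList<>(subtypes);
    copy.add(type);
    return new FieldSketch(name, copy);
  }
}

/** A union replaces its metadata object whenever a subtype is added. */
final class UnionSketch {
  private FieldSketch field;

  UnionSketch(String name) {
    this.field = new FieldSketch(name, Collections.<String>emptyList());
  }

  FieldSketch field() {
    return field;
  }

  void addSubtype(String type) {
    field = field.withSubtype(type); // the old instance is now stale
  }
}

/** A parent container that captures the child's metadata at add time. */
final class MapSketch {
  final List<FieldSketch> childFields = new ArrayList<>();

  void addChild(UnionSketch child) {
    childFields.add(child.field());
  }
}
```

After `map.addChild(union)` followed by `union.addSubtype("INT")`, the map still holds the field with no subtypes, while `union.field()` reports one.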





[jira] [Created] (DRILL-6045) Doc new parameter

2017-12-19 Thread Bridget Bevens (JIRA)
Bridget Bevens created DRILL-6045:
-

 Summary: Doc new parameter
 Key: DRILL-6045
 URL: https://issues.apache.org/jira/browse/DRILL-6045
 Project: Apache Drill
  Issue Type: Sub-task
  Components: Documentation
Affects Versions: 1.12.0
Reporter: Bridget Bevens
Assignee: Bridget Bevens


Document the new parameter listed in DRILL-5815





[GitHub] drill issue #1075: DRILL-6030: Managed sort should minimize number of batche...

2017-12-19 Thread vrozov
Github user vrozov commented on the issue:

https://github.com/apache/drill/pull/1075
  
The scenario in which all batches can be merged in memory is covered by the `if 
(canUseMemoryMerge())` check in `SortImpl.java:399`. The affected code path 
applies only to cases where a merge between spilled and in-memory batches is 
necessary. Note that this is a short-term fix to improve managed sort 
performance. In the long run, it is necessary to be able to merge all 
batches in memory (using an SV4) without spilling, and to merge the result with 
the spilled data.


---


[GitHub] drill pull request #1075: DRILL-6030: Managed sort should minimize number of...

2017-12-19 Thread vrozov
Github user vrozov commented on a diff in the pull request:

https://github.com/apache/drill/pull/1075#discussion_r157885846
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/xsort/managed/SortConfig.java
 ---
@@ -84,7 +85,7 @@ public SortConfig(DrillConfig config) {
 if (limit > 0) {
   mergeLimit = Math.max(limit, MIN_MERGE_LIMIT);
 } else {
-  mergeLimit = Integer.MAX_VALUE;
+  mergeLimit = DEFAULT_MERGE_LIMIT;
--- End diff --

IMO, it is better to change the default to avoid upgrade problems. In an 
upgrade scenario,  users may simply overwrite `drill-override.conf` from their 
prior installations and forget to set the merge limit. Is there a reason not to 
change the default merge limit?
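
A standalone sketch of the patched branch, for reference. `MergeLimitSketch` and `resolve` are hypothetical names; the default of 128 is the value quoted in the review discussion, and the value of `MIN_MERGE_LIMIT` is an assumption made only so the example runs.

```java
/** Illustration of the branch in the diff above; not the actual SortConfig class. */
final class MergeLimitSketch {
  static final int MIN_MERGE_LIMIT = 2;       // assumed value, for illustration only
  static final int DEFAULT_MERGE_LIMIT = 128; // value quoted in the review discussion

  private MergeLimitSketch() {}

  /**
   * Mirrors the patched logic: a positive configured limit is clamped to the
   * minimum, and a non-positive (unset) limit falls back to the default
   * instead of Integer.MAX_VALUE.
   */
  static int resolve(int configuredLimit) {
    if (configuredLimit > 0) {
      return Math.max(configuredLimit, MIN_MERGE_LIMIT);
    }
    return DEFAULT_MERGE_LIMIT;
  }
}
```

With this fallback, a configuration copied from an older installation that leaves the limit unset still gets the bounded default.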


---


[GitHub] drill pull request #1057: DRILL-5993 VectorContainer Append and Generic Copi...

2017-12-19 Thread ilooner
Github user ilooner commented on a diff in the pull request:

https://github.com/apache/drill/pull/1057#discussion_r157853697
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/svremover/CopierTemplate4.java
 ---
@@ -54,17 +52,33 @@ public int copyRecords(int index, int recordCount) throws SchemaChangeException
   }
 }
 
-int outgoingPosition = 0;
-for(int svIndex = index; svIndex < index + recordCount; svIndex++, outgoingPosition++){
+return insertRecords(0, index, recordCount);
+  }
+
+  @Override
+  public int appendRecord(int index) throws SchemaChangeException {
+return appendRecords(index, 1);
+  }
--- End diff --

Updated the code and made an implementation of appendRecord which doesn't 
use a for loop


---


[GitHub] drill issue #1073: DRILL-5967: Fixed memory leak in OrderedPartitionSender

2017-12-19 Thread ilooner
Github user ilooner commented on the issue:

https://github.com/apache/drill/pull/1073
  
Removed unnecessary creation of list in OrderedPartitionSenderCreator as 
discussed. @paul-rogers  please take a look.


---


[jira] [Created] (DRILL-6044) Shutdown button does not work from WebUI

2017-12-19 Thread Krystal (JIRA)
Krystal created DRILL-6044:
--

 Summary: Shutdown button does not work from WebUI
 Key: DRILL-6044
 URL: https://issues.apache.org/jira/browse/DRILL-6044
 Project: Apache Drill
  Issue Type: Bug
  Components: Client - HTTP
Affects Versions: 1.13.0
Reporter: Krystal


git.commit.id.abbrev=eb0c403

Nothing happens when clicking the SHUTDOWN button in the WebUI. The 
browser's debugger showed that the request failed due to access control checks 
(see attached screen shot).





[GitHub] drill issue #1075: DRILL-6030: Managed sort should minimize number of batche...

2017-12-19 Thread paul-rogers
Github user paul-rogers commented on the issue:

https://github.com/apache/drill/pull/1075
  
One additional thought. This bug was found when sorting 18 GB of data in 8 
GB of memory. That is, a case in which the sort must spill.

What happens in the case in which the 18 GB of data is sorted in, say, 20 
GB of memory (an in-memory sort)? We don't want the merge limit to force a 
spill in this case; kind of defeats the purpose of an in-memory sort.

So:

1. Does the limit affect in memory sort? If so, we need to revise the 
solution.
2. Does the in-memory sort suffer from a similar performance issue? If so, 
we need to revise the in memory sort.

One possible solution is to:

1. Defer sorting of individual batches until necessary.
2. Sort batches just before spilling.
3. If all batches fit in memory, do a single, combined sort (using an SV4).
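
The three steps above can be sketched as a toy model. Everything here is hypothetical: `DeferredSortSketch` is not Drill code, "memory" is counted in values rather than bytes, batches buffered together are spilled as one sorted run (one possible reading of step 2), and the final merge is simplified to concatenate-and-sort.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Toy model of a deferred sort; all names and the "memory" unit are hypothetical. */
final class DeferredSortSketch {
  private final int memoryLimit;                              // max values held in memory
  private final List<int[]> buffered = new ArrayList<>();     // unsorted in-memory batches
  private final List<int[]> spilledRuns = new ArrayList<>();  // stands in for spill files
  private int bufferedCount;

  DeferredSortSketch(int memoryLimit) {
    this.memoryLimit = memoryLimit;
  }

  /** Step 1: accept a batch without sorting it. */
  void add(int[] batch) {
    if (!buffered.isEmpty() && bufferedCount + batch.length > memoryLimit) {
      spill();
    }
    buffered.add(batch.clone());
    bufferedCount += batch.length;
  }

  /** Step 2: sort only at the moment the buffered data must leave memory. */
  private void spill() {
    spilledRuns.add(sortAll(buffered));
    buffered.clear();
    bufferedCount = 0;
  }

  int spillCount() {
    return spilledRuns.size();
  }

  /** Step 3: with no spills, this is one combined sort of everything in memory. */
  int[] finish() {
    List<int[]> runs = new ArrayList<>(spilledRuns);
    runs.add(sortAll(buffered));
    // Concatenate-and-sort stands in for a real streaming k-way merge of the runs.
    return sortAll(runs);
  }

  private static int[] sortAll(List<int[]> parts) {
    int total = 0;
    for (int[] p : parts) {
      total += p.length;
    }
    int[] out = new int[total];
    int pos = 0;
    for (int[] p : parts) {
      System.arraycopy(p, 0, out, pos, p.length);
      pos += p.length;
    }
    Arrays.sort(out);
    return out;
  }
}
```

When the data fits within `memoryLimit`, `spillCount()` stays at zero and no per-batch sort ever happens, which is the in-memory case the comment is concerned about.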


---


[GitHub] drill pull request #1075: DRILL-6030: Managed sort should minimize number of...

2017-12-19 Thread paul-rogers
Github user paul-rogers commented on a diff in the pull request:

https://github.com/apache/drill/pull/1075#discussion_r157830252
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/xsort/managed/SortConfig.java
 ---
@@ -84,7 +85,7 @@ public SortConfig(DrillConfig config) {
 if (limit > 0) {
   mergeLimit = Math.max(limit, MIN_MERGE_LIMIT);
 } else {
-  mergeLimit = Integer.MAX_VALUE;
+  mergeLimit = DEFAULT_MERGE_LIMIT;
--- End diff --

The merge limit is already a config option. (I'd forgotten about that.) The 
comment on the config option says "Limit on the number of spilled batches that 
can be merged in a single pass." So, let's just set that default (in 
`drill-override.conf`) to your new value of 128 and leave the code here 
unchanged.


---


[GitHub] drill issue #1073: DRILL-5967: Fixed memory leak in OrderedPartitionSender

2017-12-19 Thread vrozov
Github user vrozov commented on the issue:

https://github.com/apache/drill/pull/1073
  
LGTM


---


[GitHub] drill issue #1068: DRILL-6020 Fix NullPointerException when querying JSON un...

2017-12-19 Thread vvysotskyi
Github user vvysotskyi commented on the issue:

https://github.com/apache/drill/pull/1068
  
@MitchelLabonte, thanks for the pull request, +1


---


[GitHub] drill issue #1068: DRILL-6020 Fix NullPointerException when querying JSON un...

2017-12-19 Thread MitchelLabonte
Github user MitchelLabonte commented on the issue:

https://github.com/apache/drill/pull/1068
  
@vvysotskyi No problem, yes all tests pass now. 


---


[GitHub] drill issue #1068: DRILL-6020 Fix NullPointerException when querying JSON un...

2017-12-19 Thread vvysotskyi
Github user vvysotskyi commented on the issue:

https://github.com/apache/drill/pull/1068
  
Sorry, I read your removed comment and assumed that this test was 
failing. 
Did you run all unit tests to see that this change does not break anything?


---


[GitHub] drill issue #1068: DRILL-6020 Fix NullPointerException when querying JSON un...

2017-12-19 Thread MitchelLabonte
Github user MitchelLabonte commented on the issue:

https://github.com/apache/drill/pull/1068
  
@vvysotskyi
This is happening because the type is cached as a json object from the 
previous row. The fix is similar to the getFieldIdIfMatches() method so it 
looks like this is intended behaviour.
As you can see in the unit test, the results are what is expected after the 
fix. I am not sure what else could be done. 


---


[GitHub] drill issue #1068: DRILL-6020 Fix NullPointerException when querying JSON un...

2017-12-19 Thread vvysotskyi
Github user vvysotskyi commented on the issue:

https://github.com/apache/drill/pull/1068
  
@MitchelLabonte, I think this NPE is just a consequence of the bug that 
should be fixed. Please investigate why Drill is trying to use child 
`PathSegment` when a value has VarChar type.


---


[jira] [Created] (DRILL-6043) Nullable vector, but not List vector, adds its internal vectors to child list

2017-12-19 Thread Paul Rogers (JIRA)
Paul Rogers created DRILL-6043:
--

 Summary: Nullable vector, but not List vector, adds its internal 
vectors to child list
 Key: DRILL-6043
 URL: https://issues.apache.org/jira/browse/DRILL-6043
 Project: Apache Drill
  Issue Type: Bug
Affects Versions: 1.10.0
Reporter: Paul Rogers
Priority: Minor


Each Drill vector has associated metadata in the form of a {{MaterializedField}} 
instance. The {{MaterializedField}} contains a list of children. For a Map 
vector, the list of children lists the vectors that make up the map.

Nullable vectors use the list of children to identify the hidden vectors that 
make up the nullable vectors: {{$bits$}} and {{$values$}}.

Repeated vectors (including lists) also have hidden internal vectors: 
offsets and values. However, the metadata for repeated types and lists does not 
include these.

We should decide if we need metadata for the implied internal vectors. (Having 
it does cause problems since a newly-created schema for a nullable vector is 
not equal to the actual schema created by the vector itself.)

If we don't need the internal vector metadata, remove it from the nullable 
vectors.

But, if we do need it, add it to the repeated vectors and to lists.
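
The schema-equality problem mentioned above can be shown with a toy field class. This is a sketch under assumptions: `SchemaFieldSketch` is hypothetical, the type names are illustrative strings, and equality is defined to include children, which is what makes the mismatch visible.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

/** Toy metadata node whose equality includes its children; not Drill code. */
final class SchemaFieldSketch {
  final String name;
  final String type;
  final List<SchemaFieldSketch> children = new ArrayList<>();

  SchemaFieldSketch(String name, String type) {
    this.name = name;
    this.type = type;
  }

  SchemaFieldSketch addChild(String childName, String childType) {
    children.add(new SchemaFieldSketch(childName, childType));
    return this;
  }

  @Override
  public boolean equals(Object o) {
    if (!(o instanceof SchemaFieldSketch)) {
      return false;
    }
    SchemaFieldSketch other = (SchemaFieldSketch) o;
    return name.equals(other.name)
        && type.equals(other.type)
        && children.equals(other.children);
  }

  @Override
  public int hashCode() {
    return Objects.hash(name, type, children);
  }
}
```

A field declared as `new SchemaFieldSketch("a", "NULLABLE INT")` is not equal to the same field after a vector has added `$bits$` and `$values$` children, whereas a repeated field that never lists its internal vectors compares equal either way.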


