[jira] [Resolved] (DRILL-3958) Improve error message when JDBC driver not found
[ https://issues.apache.org/jira/browse/DRILL-3958?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kunal Khatua resolved DRILL-3958.
Resolution: Done
Reviewer: Aman Sinha

Key: DRILL-3958
URL: https://issues.apache.org/jira/browse/DRILL-3958
Project: Apache Drill
Issue Type: Improvement
Components: Client - HTTP
Affects Versions: 1.2.0
Reporter: Uwe Geercken
Assignee: Aman Sinha
Priority: Critical
Fix For: 1.13.0

When setting up a JDBC storage definition in the Drill web UI, the appropriate driver has to be available in the 3rdparty folder before the storage plugin is defined; otherwise an error is displayed. The error message refers to a JSON mapping error, which is completely inappropriate here: the actual problem is the missing JDBC driver in the 3rdparty folder, not the JSON mapping. The message should instead say that the referenced driver class (for example, com.mysql.jdbc.Driver) could not be found.

-- This message was sent by Atlassian JIRA (v6.4.14#64029)
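The requested behavior could be approximated by probing for the driver class before the plugin config is processed, so the user sees the class name rather than a JSON mapping error. A minimal sketch (hypothetical helper, not actual Drill code):

```java
// Hypothetical sketch: probe for the JDBC driver class up front and produce a
// message that names the missing class, instead of surfacing a JSON mapping error.
public class DriverCheck {

  public static String describeDriver(String driverClass) {
    try {
      Class.forName(driverClass);
      return "driver found: " + driverClass;
    } catch (ClassNotFoundException e) {
      return "JDBC driver class not found: " + driverClass
          + ". Copy the driver jar into the 3rdparty folder and re-create the storage plugin.";
    }
  }

  public static void main(String[] args) {
    // Typically prints the not-found message unless the MySQL jar is on the class path.
    System.out.println(describeDriver("com.mysql.jdbc.Driver"));
  }
}
```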
[jira] [Created] (DRILL-6048) ListVector is incomplete and broken, RepeatedListVector works
Paul Rogers created DRILL-6048:

Summary: ListVector is incomplete and broken, RepeatedListVector works
Key: DRILL-6048
URL: https://issues.apache.org/jira/browse/DRILL-6048
Project: Apache Drill
Issue Type: Bug
Affects Versions: 1.10.0
Reporter: Paul Rogers

Drill provides two kinds of "list vectors": {{ListVector}} and {{RepeatedListVector}}. I attempted to use {{ListVector}} to implement lists in JSON. While some parts work, others are broken, and JIRA tickets were filed for them. Once things worked well enough to run a query, it turned out that the Project operator failed. Digging into the cause, it appears that {{ListVector}} is incomplete and unused. Its implementation of {{makeTransferPair()}} was clearly never tested: a list has contents, but when this method attempts to create the contents of the target vector, it fails to create the list contents. Elsewhere, we saw that the constructor did correctly create the vector, but that {{promoteToUnion()}} had holes. The sheer number of bugs leads to the conclusion that this class is not, in fact, used or usable.

Looking more carefully at the JSON and older writer code, it appears that {{ListVector}} was *not* used for JSON, and that JSON has the limitations of a repeated vector (it cannot support lists with null elements). This implies that the JSON reader itself is broken: it does not fully support JSON semantics because it does not use the {{ListVector}} that was intended for this purpose.

So, the conclusion is that JSON uses:
* Repeated vectors for single-dimensional arrays (without null support)
* {{RepeatedListVector}} for two-dimensional arrays

This raises the question: what do we do for three-dimensional arrays?
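To make the semantic gap concrete, here is an illustrative sketch (plain Java collections standing in for vectors; the helper is hypothetical) of which list shapes a repeated vector can and cannot represent:

```java
import java.util.Arrays;
import java.util.List;

public class ListSemantics {

  // A repeated vector can model only non-null elements; a null element (or a
  // null inner list) needs full ListVector semantics.
  static boolean needsListVector(List<?> values) {
    return values == null || values.contains(null);
  }

  public static void main(String[] args) {
    List<Integer> repeatedOk = Arrays.asList(1, 2, 3);      // a repeated vector is enough
    List<Integer> withNull = Arrays.asList(1, null, 3);     // needs ListVector semantics
    List<List<Integer>> twoD =
        Arrays.asList(Arrays.asList(1, 2), null);           // 2-D with a null inner list

    System.out.println(needsListVector(repeatedOk)); // false
    System.out.println(needsListVector(withNull));   // true
    System.out.println(needsListVector(twoD));       // true
  }
}
```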
[GitHub] drill pull request #1057: DRILL-5993 VectorContainer Append and Generic Copi...
Github user ilooner commented on a diff in the pull request:

https://github.com/apache/drill/pull/1057#discussion_r157894540

--- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/svremover/Copier.java ---
@@ -19,13 +19,15 @@
 import org.apache.drill.exec.compile.TemplateClassDefinition;
 import org.apache.drill.exec.exception.SchemaChangeException;
-import org.apache.drill.exec.ops.FragmentContext;
 import org.apache.drill.exec.record.RecordBatch;
+import org.apache.drill.exec.record.VectorContainer;

 public interface Copier {
-  public static TemplateClassDefinition TEMPLATE_DEFINITION2 = new TemplateClassDefinition(Copier.class, CopierTemplate2.class);
-  public static TemplateClassDefinition TEMPLATE_DEFINITION4 = new TemplateClassDefinition(Copier.class, CopierTemplate4.class);
+  TemplateClassDefinition TEMPLATE_DEFINITION2 = new TemplateClassDefinition(Copier.class, CopierTemplate2.class);
+  TemplateClassDefinition TEMPLATE_DEFINITION4 = new TemplateClassDefinition(Copier.class, CopierTemplate4.class);
--- End diff --

I will create a separate PR to do that, since that change is unrelated to this PR.

---
[GitHub] drill pull request #1057: DRILL-5993 VectorContainer Append and Generic Copi...
Github user ilooner commented on a diff in the pull request:

https://github.com/apache/drill/pull/1057#discussion_r157894423

--- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/svremover/CopierTemplate2.java ---
@@ -53,17 +51,32 @@ public int copyRecords(int index, int recordCount) throws SchemaChangeException
       }
     }

-    int outgoingPosition = 0;
+    return insertRecords(0, index, recordCount);
+  }
+
+  @Override
+  public int appendRecord(int index) throws SchemaChangeException {
+    return appendRecords(index, 1);
+  }
+
+  @Override
+  public int appendRecords(int index, int recordCount) throws SchemaChangeException {
+    return insertRecords(outgoing.getRecordCount(), index, recordCount);
+  }
+
+  private int insertRecords(int outgoingPosition, int index, int recordCount) throws SchemaChangeException {
+    final int endIndex = index + recordCount;

-    for(int svIndex = index; svIndex < index + recordCount; svIndex++, outgoingPosition++){
+    for(int svIndex = index; svIndex < endIndex; svIndex++, outgoingPosition++){
       doEval(sv2.getIndex(svIndex), outgoingPosition);
     }
+
+    outgoing.setRecordCount(outgoingPosition);
     return outgoingPosition;
   }

-  public abstract void doSetup(@Named("context") FragmentContext context,
-                               @Named("incoming") RecordBatch incoming,
-                               @Named("outgoing") RecordBatch outgoing)
+  public abstract void doSetup(@Named("incoming") RecordBatch incoming,
+                               @Named("outgoing") VectorContainer outgoing)
--- End diff --

The copiers are only used in the SVRemover and TopN operators. I have replaced the code-generated copiers in both to use the GenericCopiers.

---
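The refactoring in the diff above can be sketched with plain collections (a hypothetical stand-in for the generated copier, with `doEval` replaced by a list append): copy starts writing at position 0, append starts at the current outgoing record count, and both share one insert loop.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of the copyRecords/appendRecords/insertRecords split; the real
// template tracks the record count via outgoing.setRecordCount(...), which the
// list size stands in for here.
public class CopierSketch {
  private final List<Integer> outgoing = new ArrayList<>();

  public int copyRecords(int[] source, int index, int recordCount) {
    outgoing.clear();                       // copy replaces the outgoing batch
    return insertRecords(0, source, index, recordCount);
  }

  public int appendRecords(int[] source, int index, int recordCount) {
    // append continues from the current outgoing record count
    return insertRecords(outgoing.size(), source, index, recordCount);
  }

  private int insertRecords(int outgoingPosition, int[] source, int index, int recordCount) {
    final int endIndex = index + recordCount;
    for (int svIndex = index; svIndex < endIndex; svIndex++, outgoingPosition++) {
      outgoing.add(outgoingPosition, source[svIndex]);  // stands in for doEval(...)
    }
    return outgoingPosition;                // new outgoing record count
  }

  public int getRecordCount() { return outgoing.size(); }
}
```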
[jira] [Created] (DRILL-6047) Update doc to include instructions for libpam4j
Bridget Bevens created DRILL-6047:

Summary: Update doc to include instructions for libpam4j
Key: DRILL-6047
URL: https://issues.apache.org/jira/browse/DRILL-6047
Project: Apache Drill
Issue Type: Sub-task
Components: Documentation
Affects Versions: 1.12.0
Reporter: Bridget Bevens
Assignee: Bridget Bevens
Priority: Minor
Fix For: 1.12.0

Update Apache Drill docs to include JPAM and libpam4j PAM authenticator instructions.
[jira] [Created] (DRILL-6046) Define semantics of vector metadata
Paul Rogers created DRILL-6046:

Summary: Define semantics of vector metadata
Key: DRILL-6046
URL: https://issues.apache.org/jira/browse/DRILL-6046
Project: Apache Drill
Issue Type: Improvement
Affects Versions: 1.10.0
Reporter: Paul Rogers
Priority: Minor

Vectors provide metadata in the form of the {{MaterializedField}}. This class has evolved in an ad-hoc fashion over time, resulting in inconsistent behavior across vectors. The inconsistent behavior causes bugs and slows development because each vector follows different rules. Consistent behavior would, by contrast, lead to faster development and fewer bugs by reducing the number of variations that code must handle.

Issues include:
* Map vectors, but not lists, can create their contents given a list of children in the {{MaterializedField}} passed to the constructor.
* {{MaterializedField}} appears to want to be immutable, but it does allow changing its children. Unions also want to change their list of subtypes, but that list lives in the immutable {{MajorType}}, forcing a union to rebuild and replace its {{MaterializedField}} whenever a new type is added. Maps, by contrast, do not replace the field; they just add children.
* Container vectors (maps, unions, lists) hold references to child {{MaterializedField}}s. But, because unions replace their fields, parents fall out of sync: they point to the old field from before the update. The resulting inconsistent metadata means code cannot trust the metadata.
* Lists and maps, but not unions, list their children in the field.
* Nullable types, but not repeated types, include internal vectors in their list of children.
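The union/subtype issue above can be illustrated with a toy model (all class names hypothetical): because the subtype list lives in an immutable type, adding a subtype must rebuild the whole field, and any parent still holding the old field instance sees stale metadata.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class UnionFieldSketch {
  // Stand-in for the immutable MajorType: adding a subtype yields a new instance.
  static final class Type {
    final List<String> subtypes;
    Type(List<String> subtypes) { this.subtypes = Collections.unmodifiableList(subtypes); }
    Type withSubtype(String s) {
      List<String> copy = new ArrayList<>(subtypes);
      copy.add(s);
      return new Type(copy);
    }
  }

  // Stand-in for MaterializedField: immutable once built.
  static final class Field {
    final Type type;
    Field(Type type) { this.type = type; }
  }

  public static void main(String[] args) {
    Field original = new Field(new Type(new ArrayList<>()));
    Field parentView = original;                                   // parent holds a reference
    Field replaced = new Field(original.type.withSubtype("INT"));  // union rebuilds its field

    // parentView still points at the old field: its subtype list is stale.
    System.out.println(parentView.type.subtypes.size()); // 0
    System.out.println(replaced.type.subtypes.size());   // 1
  }
}
```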
[jira] [Created] (DRILL-6045) Doc new parameter
Bridget Bevens created DRILL-6045:

Summary: Doc new parameter
Key: DRILL-6045
URL: https://issues.apache.org/jira/browse/DRILL-6045
Project: Apache Drill
Issue Type: Sub-task
Components: Documentation
Affects Versions: 1.12.0
Reporter: Bridget Bevens
Assignee: Bridget Bevens

Document the new parameter listed in DRILL-5815.
[GitHub] drill issue #1075: DRILL-6030: Managed sort should minimize number of batche...
Github user vrozov commented on the issue: https://github.com/apache/drill/pull/1075 The scenario where all batches can be merged in memory is covered by the `if (canUseMemoryMerge())` check in `SortImpl.java:399`. The affected code path applies only to cases where a merge between spilled and in-memory batches is necessary. Note that this is a short-term fix to improve managed sort performance; in the long run, it is necessary to be able to merge all batches in memory (using an SV4) without spilling, and to merge them with the spilled data. ---
[GitHub] drill pull request #1075: DRILL-6030: Managed sort should minimize number of...
Github user vrozov commented on a diff in the pull request:

https://github.com/apache/drill/pull/1075#discussion_r157885846

--- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/xsort/managed/SortConfig.java ---
@@ -84,7 +85,7 @@ public SortConfig(DrillConfig config) {
     if (limit > 0) {
       mergeLimit = Math.max(limit, MIN_MERGE_LIMIT);
     } else {
-      mergeLimit = Integer.MAX_VALUE;
+      mergeLimit = DEFAULT_MERGE_LIMIT;
--- End diff --

IMO, it is better to change the default to avoid upgrade problems. In an upgrade scenario, users may simply overwrite `drill-override.conf` from their prior installations and forget to set the merge limit. Is there a reason not to change the default merge limit?

---
[GitHub] drill pull request #1057: DRILL-5993 VectorContainer Append and Generic Copi...
Github user ilooner commented on a diff in the pull request:

https://github.com/apache/drill/pull/1057#discussion_r157853697

--- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/svremover/CopierTemplate4.java ---
@@ -54,17 +52,33 @@ public int copyRecords(int index, int recordCount) throws SchemaChangeException
       }
     }

-    int outgoingPosition = 0;
-    for(int svIndex = index; svIndex < index + recordCount; svIndex++, outgoingPosition++){
+    return insertRecords(0, index, recordCount);
+  }
+
+  @Override
+  public int appendRecord(int index) throws SchemaChangeException {
+    return appendRecords(index, 1);
+  }
--- End diff --

Updated the code and made an implementation of appendRecord that doesn't use a for loop.

---
[GitHub] drill issue #1073: DRILL-5967: Fixed memory leak in OrderedPartitionSender
Github user ilooner commented on the issue: https://github.com/apache/drill/pull/1073 Removed the unnecessary creation of a list in OrderedPartitionSenderCreator, as discussed. @paul-rogers please take a look. ---
[jira] [Created] (DRILL-6044) Shutdown button does not work from WebUI
Krystal created DRILL-6044:

Summary: Shutdown button does not work from WebUI
Key: DRILL-6044
URL: https://issues.apache.org/jira/browse/DRILL-6044
Project: Apache Drill
Issue Type: Bug
Components: Client - HTTP
Affects Versions: 1.13.0
Reporter: Krystal

git.commit.id.abbrev=eb0c403

Nothing happens when clicking the SHUTDOWN button in the WebUI. The browser's debugger showed that the request failed due to access control checks (see the attached screenshot).
[GitHub] drill issue #1075: DRILL-6030: Managed sort should minimize number of batche...
Github user paul-rogers commented on the issue:

https://github.com/apache/drill/pull/1075

One additional thought. This bug was found when sorting 18 GB of data in 8 GB of memory; that is, a case in which the sort must spill. What happens in the case in which the 18 GB of data is sorted in, say, 20 GB of memory (an in-memory sort)? We don't want the merge limit to force a spill in this case; that kind of defeats the purpose of an in-memory sort. So:

1. Does the limit affect the in-memory sort? If so, we need to revise the solution.
2. Does the in-memory sort suffer from a similar performance issue? If so, we need to revise the in-memory sort.

One possible solution is to:

1. Defer sorting of individual batches until necessary.
2. Sort batches just before spilling.
3. If all batches fit in memory, do a single, combined sort (using an SV4).

---
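The three-step proposal above can be sketched as follows (an illustrative toy, not Drill's sort: integers stand in for rows, a row-count limit stands in for the memory budget, and the final merge with spilled runs is simplified to a re-sort):

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class DeferredSort {
  private final List<List<Integer>> buffered = new ArrayList<>();
  private final List<List<Integer>> spilled = new ArrayList<>();
  private final int memoryLimit;  // max rows held in memory (stand-in for a byte budget)
  private int bufferedRows;

  public DeferredSort(int memoryLimit) { this.memoryLimit = memoryLimit; }

  public void addBatch(List<Integer> batch) {
    buffered.add(new ArrayList<>(batch));   // step 1: defer per-batch sorting
    bufferedRows += batch.size();
    if (bufferedRows > memoryLimit) {
      spill();
    }
  }

  private void spill() {
    List<Integer> run = new ArrayList<>();
    for (List<Integer> b : buffered) {
      run.addAll(b);
    }
    Collections.sort(run);                  // step 2: sort only just before spilling
    spilled.add(run);
    buffered.clear();
    bufferedRows = 0;
  }

  public List<Integer> finish() {
    List<Integer> result = new ArrayList<>();
    for (List<Integer> b : buffered) {
      result.addAll(b);
    }
    Collections.sort(result);               // step 3: one combined in-memory sort
    for (List<Integer> run : spilled) {
      result.addAll(run);
    }
    Collections.sort(result);               // merge with spilled runs (simplified to a re-sort)
    return result;
  }
}
```

If nothing ever spills, the only sort performed is the single combined one in `finish()`, which is the in-memory case the comment is concerned about.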
[GitHub] drill pull request #1075: DRILL-6030: Managed sort should minimize number of...
Github user paul-rogers commented on a diff in the pull request:

https://github.com/apache/drill/pull/1075#discussion_r157830252

--- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/xsort/managed/SortConfig.java ---
@@ -84,7 +85,7 @@ public SortConfig(DrillConfig config) {
     if (limit > 0) {
       mergeLimit = Math.max(limit, MIN_MERGE_LIMIT);
     } else {
-      mergeLimit = Integer.MAX_VALUE;
+      mergeLimit = DEFAULT_MERGE_LIMIT;
--- End diff --

The merge limit is already a config option. (I'd forgotten about that.) The comment on the config option says "Limit on the number of spilled batches that can be merged in a single pass." So, let's just set that default (in `drill-override.conf`) to your new value of 128 and leave the code here unchanged.

---
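If the default moves into configuration as suggested, the `drill-override.conf` entry might look like the following. This is a sketch: the option key is assumed to be the external-sort merge-limit key from Drill's config namespace, so verify the exact key against the shipped config files before relying on it.

```
# Assumed key for the external sort's merge limit; value per the review discussion.
# Limit on the number of spilled batches that can be merged in a single pass.
drill.exec.sort.external.merge_limit: 128
```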
[GitHub] drill issue #1073: DRILL-5967: Fixed memory leak in OrderedPartitionSender
Github user vrozov commented on the issue: https://github.com/apache/drill/pull/1073 LGTM ---
[GitHub] drill issue #1068: DRILL-6020 Fix NullPointerException when querying JSON un...
Github user vvysotskyi commented on the issue: https://github.com/apache/drill/pull/1068 @MitchelLabonte, thanks for the pull request, +1 ---
[GitHub] drill issue #1068: DRILL-6020 Fix NullPointerException when querying JSON un...
Github user MitchelLabonte commented on the issue: https://github.com/apache/drill/pull/1068 @vvysotskyi No problem; yes, all tests pass now. ---
[GitHub] drill issue #1068: DRILL-6020 Fix NullPointerException when querying JSON un...
Github user vvysotskyi commented on the issue: https://github.com/apache/drill/pull/1068 Sorry, I read your deleted comment and assumed that this test fails. Did you run all unit tests to confirm that this change does not break anything? ---
[GitHub] drill issue #1068: DRILL-6020 Fix NullPointerException when querying JSON un...
Github user MitchelLabonte commented on the issue: https://github.com/apache/drill/pull/1068 @vvysotskyi This is happening because the type is cached as a JSON object from the previous row. The fix is similar to the getFieldIdIfMatches() method, so it looks like this is intended behaviour. As you can see in the unit test, the results are as expected after the fix. I am not sure what else could be done. ---
[GitHub] drill issue #1068: DRILL-6020 Fix NullPointerException when querying JSON un...
Github user vvysotskyi commented on the issue: https://github.com/apache/drill/pull/1068 @MitchelLabonte, I think this NPE is just a consequence of the bug that should be fixed. Please investigate why Drill is trying to use child `PathSegment` when a value has VarChar type. ---
[jira] [Created] (DRILL-6043) Nullable vector, but not List vector, adds its internal vectors to child list
Paul Rogers created DRILL-6043:

Summary: Nullable vector, but not List vector, adds its internal vectors to child list
Key: DRILL-6043
URL: https://issues.apache.org/jira/browse/DRILL-6043
Project: Apache Drill
Issue Type: Bug
Affects Versions: 1.10.0
Reporter: Paul Rogers
Priority: Minor

Each Drill vector has associated metadata in the form of a {{MaterializedField}} instance. The {{MaterializedField}} contains a list of children. For a Map vector, the list of children names the vectors that make up the map. Nullable vectors use the list of children to identify the hidden vectors that make up the nullable vector: {{$bits$}} and {{$values$}}. Repeated vectors (including lists) also have hidden internal vectors, offsets and values, yet the metadata for repeated types and lists does not include these.

We should decide whether we need metadata for the implied internal vectors. (Having it does cause problems, since a newly-created schema for a nullable vector is not equal to the actual schema created by the vector itself.) If we don't need the internal vector metadata, remove it from the nullable vectors. But, if we do need it, add it to the repeated vectors and to lists.
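The equality problem mentioned in the parenthetical can be shown with a toy model (names hypothetical): a hand-built schema for a nullable field has no internal children, while the vector's own metadata includes {{$bits$}} and {{$values$}}, so the two never compare equal.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

// Stand-in for MaterializedField: equality covers the name and the child list,
// so internal children make a vector-built field unequal to a declared one.
public class FieldSketch {
  final String name;
  final List<FieldSketch> children = new ArrayList<>();

  FieldSketch(String name) { this.name = name; }

  @Override
  public boolean equals(Object o) {
    if (!(o instanceof FieldSketch)) {
      return false;
    }
    FieldSketch other = (FieldSketch) o;
    return name.equals(other.name) && children.equals(other.children);
  }

  @Override
  public int hashCode() { return Objects.hash(name, children); }

  public static void main(String[] args) {
    FieldSketch declared = new FieldSketch("a");  // schema as declared by the planner
    FieldSketch actual = new FieldSketch("a");    // schema as built by the nullable vector
    actual.children.add(new FieldSketch("$bits$"));
    actual.children.add(new FieldSketch("$values$"));

    System.out.println(declared.equals(actual)); // false
  }
}
```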