[jira] [Commented] (DRILL-3958) Improve error message when JDBC driver not found
[ https://issues.apache.org/jira/browse/DRILL-3958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16297982#comment-16297982 ] Kunal Khatua commented on DRILL-3958: - [~amansinha100] can you review? > Improve error message when JDBC driver not found > > > Key: DRILL-3958 > URL: https://issues.apache.org/jira/browse/DRILL-3958 > Project: Apache Drill > Issue Type: Improvement > Components: Client - HTTP >Affects Versions: 1.2.0 >Reporter: Uwe Geercken >Assignee: Aman Sinha >Priority: Critical > Fix For: 1.13.0 > > > When setting up a storage definition for JDBC in the Drill web UI, the > appropriate driver has to be available in the 3rdparty folder before defining > the storage, otherwise an error is displayed. > The error message refers to a JSON mapping error which is completely > inappropriate in this case, because the error is the missing JDBC driver in > the 3rdparty folder and not the JSON mapping. > I request to change the error message to something appropriate that the > class/driver referred to could not be found (like for example: > com.mysql.jdbc.Driver) -- This message was sent by Atlassian JIRA (v6.4.14#64029)
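The requested change amounts to catching the class-loading failure and naming the missing driver class instead of surfacing a JSON mapping error. A minimal sketch of the idea, with a hypothetical helper class and message wording (not Drill's actual code):

```java
// Illustrative sketch only: a hypothetical helper showing how a missing JDBC
// driver could be reported by class name instead of as a JSON mapping error.
class DriverCheck {
    static String checkDriver(String driverClass) {
        try {
            // The driver jar must be on the classpath (Drill's 3rdparty folder).
            Class.forName(driverClass);
            return "ok";
        } catch (ClassNotFoundException e) {
            return "JDBC driver class not found: " + driverClass
                + ". Place the driver jar in the 3rdparty folder and retry.";
        }
    }
}
```

With this approach, a missing com.mysql.jdbc.Driver would be reported by name, along with the 3rdparty-folder remedy the reporter asks for.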
[jira] [Resolved] (DRILL-3958) Improve error message when JDBC driver not found
[ https://issues.apache.org/jira/browse/DRILL-3958?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kunal Khatua resolved DRILL-3958. - Resolution: Done Reviewer: Aman Sinha -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Assigned] (DRILL-3958) Improve error message when JDBC driver not found
[ https://issues.apache.org/jira/browse/DRILL-3958?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kunal Khatua reassigned DRILL-3958: --- Assignee: Aman Sinha Fix Version/s: 1.13.0 -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Reopened] (DRILL-3958) Improve error message when JDBC driver not found
[ https://issues.apache.org/jira/browse/DRILL-3958?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kunal Khatua reopened DRILL-3958: - -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Resolved] (DRILL-3958) Improve error message when JDBC driver not found
[ https://issues.apache.org/jira/browse/DRILL-3958?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kunal Khatua resolved DRILL-3958. - Resolution: Done -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Comment Edited] (DRILL-6035) Specify Drill's JSON behavior
[ https://issues.apache.org/jira/browse/DRILL-6035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16296230#comment-16296230 ] Paul Rogers edited comment on DRILL-6035 at 12/20/17 6:41 AM: -- h4. Lists _*NOTE:* This section describes the Drill {{LIST}} type which turns out to be broken and not supported. The following is based on a prototype using the {{LIST}} type created after fixing some, but not all, {{LIST}} bugs._ JSON supports arrays of the form: {code} {a: ["I'm", "an", "array"] } {code} Drill has two very different ways to represent arrays: 1. As a {{REPEATED}} cardinality for most data types. This gives rise to a {{RepeatedFooVector}} for some type {{Foo}}. 2. As a {{LIST}} type with the {{ListVector}} implementation. Here, Arrow has done a nice job: it unified the {{REPEATED}} cardinality and the {{LIST}} vector type into a single concept. Drill, however, still has two systems. h4. Repeated Cardinality Drill's "go to" way to handle arrays is with the {{REPEATED}} cardinality (AKA "repeated data mode.") Most readers that handle arrays use the {{REPEATED}} form. To help understand the {{LIST}} type, we review {{REPEATED}} support here. When working with a {{REPEATED}} column, the rules for nulls are: * Arrays may not contain nulls. (Drill does not support nulls as array elements.) * A null (or missing) array field is treated the same as an empty array. If JSON were to use the {{REPEATED}} vectors, the following would be invalid: {code} [10, null, 20] {code} The following are all valid with {{REPEATED}} vectors: {code} {id: 1} {id: 2, a: null} {id: 3, a: []} {id: 4, a: [10, 20, 30]} {code} h4. Properties of Lists The key properties of a List relative to a repeated type are: * Each list can be a list of nothing, a list of a single type, or a list of a union of multiple types. * Each list value can be null. * Each list entry can be null (for primitive types.) h4. 
Null Support As explained below, lists support three kinds of nullability: * The type itself can be null (a list of nulls) * The list column value for a row can be null. (This is in contrast to repeated types in which an array can be empty, but the entire array value cannot be null.) * When the list is of a primitive type, entries can be null. (The list is defined as list of nullable items, such as nullable BIGINT.) * When the list is of maps, the map entries *cannot* be null. (Instead, the map columns are nullable and all columns for the "null" map are set to null.) * When the list is of other lists, the list entry *cannot* be null. (Instead, the nested list is empty.) The semantics are a bit confusing when seen from the outside. They make slightly more sense based on the implementation choices made in the code. (Though, generally we want the code to match our requirements, not the other way around.) h4. Lists are Obscure The {{LIST}} type appears to be used only for JSON, and it is unclear how well supported it is in the rest of Drill. For example, it is not clear that functions that work with arrays correctly handle null entries. (This needs to be tested.) JDBC supports array columns, but it is not clear if the Drill JDBC driver has implemented them. ODBC doesn't support arrays at all, so whether it supports arrays with nulls is a moot point. h4. Lists in JSON The {{LIST}} type appears to be used only for JSON where it is a better fit for JSON semantics than Drill's normal {{REPEATED}} cardinality. The list type allows list members to be null. All of the following are legal using lists: {code} {a: null} {a: []} {a: [null, null]} {a: [null, 10, null]} {a: [10, "foo"]} {code} We'll look at each of these in detail. h4. Degenerate Lists Consider the simplest possible list in JSON: a file that contains only an empty list: {noformat} {a: []} {noformat} What is the type of the list? In JSON, lists have no type, they are just lists. 
Drill, however, requires a type when working with a {{REPEATED}} cardinality: the column must be an array of something. Lists, by contrast, can be a list of only nulls using the obscure {{LATE}} data type. That is, the list exists, but has no type. ({{LATE}} seems to suggest that the type will be assigned later.) Next, consider another degenerate array: {noformat} {a: [null, null]} {noformat} Here we have an array of nulls. Again, we don't know what type these are nulls of. Again, a {{LIST}} allows the JSON reader to produce a row with a single column {{`a`}} that is of type {{LIST}} that contains only the "dummy" {{LATE}} type. The list will indicate that we have two entries, both of which are null. It is unclear, however, if the rest of Drill supports this concept. (DRILL-5970 discusses a case in which an empty array, with a List of {{LATE}}, is exported to Parquet, producing results different than one might naively expect.) h4. Single-type Lists
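The contrast between {{REPEATED}} and {{LIST}} null handling described above can be captured in a small illustrative model (plain Java collections, not Drill's vector API): {{REPEATED}} collapses a null array value to an empty array and rejects null entries, while {{LIST}} preserves the distinction and allows null entries.

```java
import java.util.Collections;
import java.util.List;

// Illustrative model (not Drill's vector API) of the two null-handling regimes:
// REPEATED treats a null array as empty and forbids null entries; LIST keeps
// null arrays and null entries as distinct, representable values.
class ArraySemantics {
    static List<Integer> repeated(List<Integer> in) {
        if (in == null) {
            return Collections.emptyList(); // null (or missing) array == empty array
        }
        for (Integer v : in) {
            if (v == null) {
                throw new IllegalArgumentException("REPEATED arrays may not contain nulls");
            }
        }
        return in;
    }

    static List<Integer> list(List<Integer> in) {
        return in; // LIST: may be null, empty, or contain null entries
    }
}
```

Under this model, `{a: [10, null, 20]}` is only representable on the {{LIST}} side, matching the rules above.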
[jira] [Comment Edited] (DRILL-6035) Specify Drill's JSON behavior
[ https://issues.apache.org/jira/browse/DRILL-6035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16293316#comment-16293316 ] Paul Rogers edited comment on DRILL-6035 at 12/20/17 6:39 AM: -- h4. JSON Arrays Drill supports simple arrays in JSON using the following rules: * Arrays must contain homogeneous elements: any of the scalars described above, or a JSON object. * Single-dimensional arrays cannot contain null entries. * Two-dimensional arrays can contain nulls at the outer level but not the inner level. (See a later comment for nested arrays.) For example, the following are scalar arrays: {code} [10, 20] [10.30, 10.45] ["foo", "bar"] [true, false] {code} h4. Schema Change in Arrays The following will trigger errors: {code} {a: [10, "foo"]} // Mixed types {a: [10]} {a: ["foo"]} // Schema change {a: [10, 12.5]} // Conflicting types: integer and float {code} h4. Nulls in Arrays h4. Missing {{LIST}} Support JSON arrays can contain nulls. Drill provides a (partially completed, inoperable) {{LIST}} type as described below that handles nulls. But, this vector is not used in Drill 1.12 or earlier. Instead, Drill uses repeated types which cannot handle nulls. (The {{LIST}} type is described in a separate note below.) Using array types, the following rules apply to nulls: * An array cannot contain nulls. * An empty array at the start of the file has an unknown type. (Do we select Nullable {{INT}}?) * An entire array can be null, which is represented as an empty array. (That is, an empty array and a {{null}} value are considered the same.) h4. Late Type Identification As described earlier, Drill 1.13 will defer picking an array type if it sees null values. For example: {code} {id: 1} {id: 2, a: null} {id: 3, a: []} {id: 4, a: [10, 20, 30]} {code} In the above example, for id=2, Drill sees column `a` but does not pick a type. For id=3, Drill identifies that `a` is an array, but does not know the type. 
Finally, for id=4, Drill identifies the array as {{BIGINT}}. h4. Null-Only Arrays A special case occurs if a JSON file contains only empty arrays or arrays of nulls (such as a file that contains only the first three records above.) In Drill 1.12 and earlier, the result is a list of {{LATE}} elements (See the List section below.) It seems that {{SqlLine}} will correctly show the null values. An interesting case occurs when Drill reads two files: one with an array with only nulls, another with real values. For example: {noformat} File A: {a: [null, null] } File B: {a: [10, 20] } {noformat} The above condition can occur only if JSON uses the broken {{LIST}} type; it cannot occur in Drill 1.12, where the equivalent condition is File A containing: {noformat} {a: []} {noformat} Drill is distributed: one fragment will read File A, another will read File B. At some point, the two arrays will come together. One fragment will have created a list of {{LATE}}, another a list of {{BIGINT}}. Most operators will trigger a schema change error in this case. Interestingly, however, if the query is a simple {{SELECT *}}, then the lists are compatible and {{SqlLine}} will display the correct results. In Drill 1.13, if the first batch contains only nulls and/or empty arrays, Drill guesses that the type is an array of {{VARCHAR}}. Since this is only a guess, a schema change will result if the guess is wrong.
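The late type identification behavior described above can be sketched as a tiny state machine (a hypothetical model, not the JSON reader's actual code): the column type stays {{LATE}} while only nulls and empty arrays are seen, is fixed by the first concrete value, and, per the Drill 1.13 note, falls back to {{VARCHAR}} if a batch ends undecided.

```java
// Hypothetical model of late type identification, not the JSON reader itself:
// the type stays "LATE" while only nulls (or empty arrays) are seen, is fixed
// by the first concrete value, and defaults to VARCHAR at end of batch.
class LateType {
    String type = "LATE";

    void observe(Object value) {
        if (!"LATE".equals(type) || value == null) {
            return; // nulls and already-decided types change nothing
        }
        // Model JSON integers as BIGINT, everything else as VARCHAR.
        type = (value instanceof Integer || value instanceof Long) ? "BIGINT" : "VARCHAR";
    }

    String endOfBatch() {
        if ("LATE".equals(type)) {
            type = "VARCHAR"; // the Drill 1.13 guess described above
        }
        return type;
    }
}
```

This also illustrates the distributed-merge hazard above: two fragments running this state machine on File A and File B would finish with different types, forcing a schema change.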
[jira] [Created] (DRILL-6048) ListVector is incomplete and broken, RepeatedListVector works
Paul Rogers created DRILL-6048: -- Summary: ListVector is incomplete and broken, RepeatedListVector works Key: DRILL-6048 URL: https://issues.apache.org/jira/browse/DRILL-6048 Project: Apache Drill Issue Type: Bug Affects Versions: 1.10.0 Reporter: Paul Rogers Drill provides two kinds of "list vectors": {{ListVector}} and {{RepeatedListVector}}. I attempted to use the {{ListVector}} to implement lists in JSON. While some parts work, others are broken and JIRA tickets were filed. Once things worked well enough to run a query, it turned out that the Project operator failed. Digging into the cause, it appears that the {{ListVector}} is incomplete and not used. Its implementation of {{makeTransferPair()}} was clearly never tested. A list has contents, but when this method attempts to create the contents of the target vector, it fails to create the list contents. Elsewhere, we saw that the constructor did correctly create the vector, and that the {{promoteToUnion()}} had holes. The sheer number of bugs leads to the conclusion that this class is not, in fact, used or usable. Looking more carefully at the JSON and older writer code, it appears that the {{ListVector}} was *not* used for JSON, and that JSON has the limitations of a repeated vector (it cannot support lists with null elements.) This implies that the JSON reader itself is broken, as it does not fully support JSON semantics because it does not use the {{ListVector}} that was intended for this purpose. So, the conclusion is that JSON uses: * Repeated vectors for single-dimensional arrays (without null support) * {{RepeatedListVector}} for two-dimensional arrays This raises the question: what do we do for three-dimensional arrays? -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (DRILL-6046) Define semantics of vector metadata
[ https://issues.apache.org/jira/browse/DRILL-6046?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16297684#comment-16297684 ] Paul Rogers commented on DRILL-6046: Suggested improvements. First, to ensure that the metadata tree remains consistent: * The materialized field passed to the constructor is the one used for the vector. * The materialized field created for the vector is final: it can change but cannot be replaced. To ensure consistent vector creation: * Every vector constructor should build itself as defined by the passed-in materialized field. To avoid clutter: * Every vector includes its internal fields and public child fields in the list of children. * Add a field to mark a materialized field as private. Private fields are not compared when checking if two fields are equal. Private fields are ignored when building a new vector. * Provide a method on materialized field to get the public schema (without internal vectors). > Define semantics of vector metadata > --- > > Key: DRILL-6046 > URL: https://issues.apache.org/jira/browse/DRILL-6046 > Project: Apache Drill > Issue Type: Improvement >Affects Versions: 1.10.0 >Reporter: Paul Rogers >Priority: Minor > > Vectors provide metadata in the form of the {{MaterializedField}}. This class > has evolved in an ad-hoc fashion over time, resulting in inconsistent > behavior across vectors. The inconsistent behavior causes bugs and slow > development because each vector follows different rules. Consistent behavior > would, by contrast, lead to faster development and fewer bugs by reducing the > number of variations that code must handle. > Issues include: > * Map vectors, but not lists, can create contents given a list of children in > the {{MaterializedField}} passed to the constructor. > * {{MaterializedField}} appears to want to be immutable, but it does allow > changing of children. 
Unions also want to change the list of subtypes, but > that is in the immutable {{MajorType}}, causing unions to rebuild and replace > its {{MaterializedField}} on addition of a new type. By contrast, maps do not > replace the field, they just add children. > * Container vectors (maps, unions, lists) hold references to child > {{MaterializedFields}}. But, because unions replace their fields, parents > become out of sync since they point to the old version from before the update, > causing inconsistent metadata, so that code cannot trust the metadata. > * Lists and maps, but not unions, list their children in the field. > * Nullable types, but not repeated types, include internal vectors in their > list of children. > * When creating a map, as discussed above, the map creates children based on > the field. But, the constructor clones the field so that the actual field in > the map is not the one passed in. As a result, a parent vector, which holds a > child map, points to the original map field, not the cloned one, leading to > inconsistency if the child map later adds more fields. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
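The staleness problem described for unions can be reproduced in a few lines of Java (an illustrative model, not the actual {{MaterializedField}} or {{MajorType}} classes): if a child replaces its metadata object instead of mutating it, a parent that cached the old reference keeps reading stale metadata.

```java
// Illustrative model of the metadata staleness issue, not Drill's classes.
class FieldMeta {
    final String type;
    FieldMeta(String type) { this.type = type; }
}

// A union-like child that replaces its metadata object when a subtype is added.
class UnionLikeChild {
    FieldMeta field = new FieldMeta("INT");
    void addSubtype() { field = new FieldMeta("UNION(INT,VARCHAR)"); } // replace, not mutate
}

// A parent that caches a reference to the child's metadata at construction time.
class ParentVector {
    final FieldMeta cachedChildField;
    ParentVector(UnionLikeChild child) { cachedChildField = child.field; }
}
```

After the child replaces its field, the parent's cached reference still reports the old type, which is exactly the "parents become out of sync" behavior the ticket describes; maps avoid this by mutating their field in place.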
[jira] [Commented] (DRILL-5993) Allow Copier to Copy a Record and Append to the End of an Outgoing Batch
[ https://issues.apache.org/jira/browse/DRILL-5993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16297542#comment-16297542 ] ASF GitHub Bot commented on DRILL-5993: --- Github user ilooner commented on a diff in the pull request: https://github.com/apache/drill/pull/1057#discussion_r157894540 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/svremover/Copier.java --- @@ -19,13 +19,15 @@ import org.apache.drill.exec.compile.TemplateClassDefinition; import org.apache.drill.exec.exception.SchemaChangeException; -import org.apache.drill.exec.ops.FragmentContext; import org.apache.drill.exec.record.RecordBatch; +import org.apache.drill.exec.record.VectorContainer; public interface Copier { - public static TemplateClassDefinition TEMPLATE_DEFINITION2 = new TemplateClassDefinition(Copier.class, CopierTemplate2.class); - public static TemplateClassDefinition TEMPLATE_DEFINITION4 = new TemplateClassDefinition(Copier.class, CopierTemplate4.class); + TemplateClassDefinition TEMPLATE_DEFINITION2 = new TemplateClassDefinition(Copier.class, CopierTemplate2.class); + TemplateClassDefinition TEMPLATE_DEFINITION4 = new TemplateClassDefinition(Copier.class, CopierTemplate4.class); --- End diff -- I will create a separate PR to do that since that change is unrelated to this PR. > Allow Copier to Copy a Record and Append to the End of an Outgoing Batch > > > Key: DRILL-5993 > URL: https://issues.apache.org/jira/browse/DRILL-5993 > Project: Apache Drill > Issue Type: New Feature >Reporter: Timothy Farkas >Assignee: Timothy Farkas > > Currently the copier can only copy records from an incoming batch to the > beginning of an outgoing batch. We need to be able to copy a record and > append it to the end of the outgoing batch. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (DRILL-5993) Allow Copier to Copy a Record and Append to the End of an Outgoing Batch
[ https://issues.apache.org/jira/browse/DRILL-5993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16297541#comment-16297541 ] ASF GitHub Bot commented on DRILL-5993: --- Github user ilooner commented on a diff in the pull request: https://github.com/apache/drill/pull/1057#discussion_r157894423 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/svremover/CopierTemplate2.java --- @@ -53,17 +51,32 @@ public int copyRecords(int index, int recordCount) throws SchemaChangeException } } -int outgoingPosition = 0; +return insertRecords(0, index, recordCount); + } + + @Override + public int appendRecord(int index) throws SchemaChangeException { +return appendRecords(index, 1); + } + + @Override + public int appendRecords(int index, int recordCount) throws SchemaChangeException { +return insertRecords(outgoing.getRecordCount(), index, recordCount); + } + + private int insertRecords(int outgoingPosition, int index, int recordCount) throws SchemaChangeException { +final int endIndex = index + recordCount; -for(int svIndex = index; svIndex < index + recordCount; svIndex++, outgoingPosition++){ +for(int svIndex = index; svIndex < endIndex; svIndex++, outgoingPosition++){ doEval(sv2.getIndex(svIndex), outgoingPosition); } + +outgoing.setRecordCount(outgoingPosition); return outgoingPosition; } - public abstract void doSetup(@Named("context") FragmentContext context, - @Named("incoming") RecordBatch incoming, - @Named("outgoing") RecordBatch outgoing) + public abstract void doSetup(@Named("incoming") RecordBatch incoming, + @Named("outgoing") VectorContainer outgoing) --- End diff -- The copiers are only used in the SVRemover and TopN operator. I have replaced the code generated copiers in both now to use the GenericCopiers. 
-- This message was sent by Atlassian JIRA (v6.4.14#64029)
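The copy-versus-append contract under discussion can be modeled simply (plain Java lists, not the generated copier code): copy writes from position 0 of the outgoing batch, while append continues from the batch's current record count, mirroring the {{insertRecords()}} refactoring in the diff above.

```java
import java.util.ArrayList;
import java.util.List;

// Minimal model (not Drill's generated copier) of the contract in the diff:
// copy overwrites the outgoing batch from position 0; append continues from
// the outgoing batch's current record count.
class BatchCopier {
    final List<Integer> outgoing = new ArrayList<>();

    int copyRecords(List<Integer> incoming) {
        outgoing.clear();
        return insertRecords(0, incoming);
    }

    int appendRecords(List<Integer> incoming) {
        return insertRecords(outgoing.size(), incoming);
    }

    private int insertRecords(int outgoingPosition, List<Integer> incoming) {
        for (int value : incoming) {
            outgoing.add(outgoingPosition++, value);
        }
        return outgoingPosition; // new record count of the outgoing batch
    }
}
```

Factoring both paths through one insert routine, as the PR does, keeps the only difference between them the starting position.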
[jira] [Created] (DRILL-6047) Update doc to include instructions for libpam4j
Bridget Bevens created DRILL-6047: - Summary: Update doc to include instructions for libpam4j Key: DRILL-6047 URL: https://issues.apache.org/jira/browse/DRILL-6047 Project: Apache Drill Issue Type: Sub-task Components: Documentation Affects Versions: 1.12.0 Reporter: Bridget Bevens Assignee: Bridget Bevens Priority: Minor Fix For: 1.12.0 Update Apache Drill docs to include JPAM and libpam4j PAM authenticator instructions. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (DRILL-6046) Define semantics of vector metadata
[ https://issues.apache.org/jira/browse/DRILL-6046?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Paul Rogers updated DRILL-6046: --- Description: Vectors provide metadata in the form of the {{MaterializedField}}. This class has evolved in an ad-hoc fashion over time, resulting in inconsistent behavior across vectors. The inconsistent behavior causes bugs and slow development because each vector follows different rules. Consistent behavior would, by contrast, lead to faster development and fewer bugs by reducing the number of variations that code must handle. Issues include: * Map vectors, but not lists, can create contents given a list of children in the {{MaterializedField}} passed to the constructor. * {{MaterializedField}} appears to want to be immutable, but it does allow changing of children. Unions also want to change the list of subtypes, but that is in the immutable {{MajorType}}, causing unions to rebuild and replace its {{MaterializedField}} on addition of a new type. By contrast, maps do not replace the field, they just add children. * Container vectors (maps, unions, lists) hold references to child {{MaterializedFields}}. But, because unions replace their fields, parents become out of sync since they point to the old version from before the update, causing inconsistent metadata, so that code cannot trust the metadata. * Lists and maps, but not unions, list their children in the field. * Nullable types, but not repeated types, include internal vectors in their list of children. * When creating a map, as discussed above, the map creates children based on the field. But, the constructor clones the field so that the actual field in the map is not the one passed in. As a result, a parent vector, which holds a child map, points to the original map field, not the cloned one, leading to inconsistency if the child map later adds more fields. was: Vectors provide metadata in the form of the {{MaterializedField}}. 
This class has evolved in an ad-hoc fashion over time, resulting in inconsistent behavior across vectors. The inconsistent behavior causes bugs and slow development because each vector follows different rules. Consistent behavior would, by contrast, lead to faster development and fewer bugs by reducing the number of variations that code must handle. Issues include: * Map vectors, but not lists, can create contents given a list of children in the {{MaterializedField}} passed to the constructor. * {{MaterializedField}} appears to want to be immutable, but it does allow changing of children. Unions also want to change the list of subtypes, but that is in the immutable {{MajorType}}, causing unions to rebuild and replace its {{MaterializedField}} on addition of a new type. By contrast, maps do not replace the field, they just add children. * Container vectors (maps, unions, lists) hold references to child {{MaterializedFields}}. But, because unions replace their fields, parents become out of sync since they point to the old version from before the update, causing inconsistent metadata, so that code cannot trust the metadata. * Lists and maps, but not unions, list their children in the field. * Nullable types, but not repeated types, include internal vectors in their list of children.
[jira] [Created] (DRILL-6046) Define semantics of vector metadata
Paul Rogers created DRILL-6046: -- Summary: Define semantics of vector metadata Key: DRILL-6046 URL: https://issues.apache.org/jira/browse/DRILL-6046 Project: Apache Drill Issue Type: Improvement Affects Versions: 1.10.0 Reporter: Paul Rogers Priority: Minor Vectors provide metadata in the form of the {{MaterializedField}}. This class has evolved in an ad-hoc fashion over time, resulting in inconsistent behavior across vectors. The inconsistent behavior causes bugs and slow development because each vector follows different rules. Consistent behavior would, by contrast, lead to faster development and fewer bugs by reducing the number of variations that code must handle. Issues include: * Map vectors, but not lists, can create contents given a list of children in the {{MaterializedField}} passed to the constructor. * {{MaterializedField}} appears to want to be immutable, but it does allow changing of children. Unions also want to change the list of subtypes, but that is in the immutable {{MajorType}}, causing unions to rebuild and replace its {{MaterializedField}} on addition of a new type. By contrast, maps do not replace the field, they just add children. * Container vectors (maps, unions, lists) hold references to child {{MaterializedFields}}. But, because unions replace their fields, parents become out of sync since they point to the old version from before the update, causing inconsistent metadata, so that code cannot trust the metadata. * Lists and maps, but not unions, list their children in the field. * Nullable types, but not repeated types, include internal vectors in their list of children. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (DRILL-6045) Doc new parameter
Bridget Bevens created DRILL-6045: - Summary: Doc new parameter Key: DRILL-6045 URL: https://issues.apache.org/jira/browse/DRILL-6045 Project: Apache Drill Issue Type: Sub-task Components: Documentation Affects Versions: 1.12.0 Reporter: Bridget Bevens Assignee: Bridget Bevens Document the new parameter listed in DRILL-5815 -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (DRILL-6030) Managed sort should minimize number of batches in a k-way merge
[ https://issues.apache.org/jira/browse/DRILL-6030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16297509#comment-16297509 ] ASF GitHub Bot commented on DRILL-6030: --- Github user vrozov commented on the issue: https://github.com/apache/drill/pull/1075 The scenario in which all batches can be merged in memory is covered by the `canUseMemoryMerge()` check in `SortImpl.java:399`. The affected code path applies only to cases where a merge between spilled and in-memory batches is necessary. Note that this is a short-term fix to improve managed sort performance; in the long run, it is necessary to be able to merge all batches in memory (using an SV4) without spilling and then merge them with the spilled data. > Managed sort should minimize number of batches in a k-way merge > --- > > Key: DRILL-6030 > URL: https://issues.apache.org/jira/browse/DRILL-6030 > Project: Apache Drill > Issue Type: Improvement >Reporter: Vlad Rozov >Assignee: Vlad Rozov > > The time complexity of the algorithm is O(n*k*log(k)), where k is the number of > batches to merge and n is the number of records in each batch (assuming equal-size > batches). As n*k is the total number of records to merge and can be > quite large, minimizing k should give better results. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
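For context on the O(n*k*log(k)) bound quoted in the issue: a heap-based k-way merge pays O(log k) per record moved, across all n*k records. A minimal, self-contained sketch (plain int arrays stand in for record batches; none of this is Drill code):

```java
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

// Sketch of a k-way merge over k sorted batches using a min-heap.
// Each poll/offer costs O(log k), and each of the n*k records is polled
// exactly once, giving the O(n*k*log(k)) bound discussed in DRILL-6030.
public class KWayMerge {
  // One cursor per sorted batch: the batch plus a position within it.
  static final class Cursor {
    final int[] batch;
    int pos;
    Cursor(int[] batch) { this.batch = batch; }
  }

  public static int[] merge(List<int[]> batches) {
    PriorityQueue<Cursor> heap =
        new PriorityQueue<>(Comparator.comparingInt((Cursor c) -> c.batch[c.pos]));
    int total = 0;
    for (int[] b : batches) {
      total += b.length;
      if (b.length > 0) {
        heap.offer(new Cursor(b));
      }
    }
    int[] out = new int[total];
    int i = 0;
    while (!heap.isEmpty()) {
      Cursor c = heap.poll();           // O(log k)
      out[i++] = c.batch[c.pos++];
      if (c.pos < c.batch.length) {
        heap.offer(c);                  // O(log k)
      }
    }
    return out;
  }
}
```

With a cap on k (a merge limit), a merge over many spilled runs proceeds in multiple passes of bounded width, which trades extra passes over the data for a smaller log factor per record.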
[jira] [Commented] (DRILL-6030) Managed sort should minimize number of batches in a k-way merge
[ https://issues.apache.org/jira/browse/DRILL-6030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16297490#comment-16297490 ] ASF GitHub Bot commented on DRILL-6030: --- Github user vrozov commented on a diff in the pull request: https://github.com/apache/drill/pull/1075#discussion_r157885846 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/xsort/managed/SortConfig.java --- @@ -84,7 +85,7 @@ public SortConfig(DrillConfig config) { if (limit > 0) { mergeLimit = Math.max(limit, MIN_MERGE_LIMIT); } else { - mergeLimit = Integer.MAX_VALUE; + mergeLimit = DEFAULT_MERGE_LIMIT; --- End diff -- IMO, it is better to change the default to avoid upgrade problems. In an upgrade scenario, users may simply overwrite `drill-override.conf` from their prior installations and forget to set the merge limit. Is there a reason not to change the default merge limit? > Managed sort should minimize number of batches in a k-way merge > --- > > Key: DRILL-6030 > URL: https://issues.apache.org/jira/browse/DRILL-6030 > Project: Apache Drill > Issue Type: Improvement >Reporter: Vlad Rozov >Assignee: Vlad Rozov > > The time complexity of the algorithm is O(n*k*log(k)) where k is a number of > batches to merge and n is a number of records in each batch (assuming equal > size batches). As n*k is the total number of record to merge and it can be > quite large, minimizing k should give better results. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (DRILL-5993) Allow Copier to Copy a Record and Append to the End of an Outgoing Batch
[ https://issues.apache.org/jira/browse/DRILL-5993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16297299#comment-16297299 ] ASF GitHub Bot commented on DRILL-5993: --- Github user ilooner commented on a diff in the pull request: https://github.com/apache/drill/pull/1057#discussion_r157853697 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/svremover/CopierTemplate4.java --- @@ -54,17 +52,33 @@ public int copyRecords(int index, int recordCount) throws SchemaChangeException } } -int outgoingPosition = 0; -for(int svIndex = index; svIndex < index + recordCount; svIndex++, outgoingPosition++){ +return insertRecords(0, index, recordCount); + } + + @Override + public int appendRecord(int index) throws SchemaChangeException { +return appendRecords(index, 1); + } --- End diff -- Updated the code and made an implementation of appendRecord which doesn't use a for loop > Allow Copier to Copy a Record and Append to the End of an Outgoing Batch > > > Key: DRILL-5993 > URL: https://issues.apache.org/jira/browse/DRILL-5993 > Project: Apache Drill > Issue Type: New Feature >Reporter: Timothy Farkas >Assignee: Timothy Farkas > > Currently the copier can only copy record from an incoming batch to the > beginning of an outgoing batch. We need to be able to copy a record and > append it to the end of the outgoing batch. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (DRILL-5967) Memory leak by HashPartitionSender
[ https://issues.apache.org/jira/browse/DRILL-5967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16297291#comment-16297291 ] ASF GitHub Bot commented on DRILL-5967: --- Github user ilooner commented on the issue: https://github.com/apache/drill/pull/1073 Removed unnecessary creation of list in OrderedPartitionSenderCreator as discussed. @paul-rogers please take a look. > Memory leak by HashPartitionSender > -- > > Key: DRILL-5967 > URL: https://issues.apache.org/jira/browse/DRILL-5967 > Project: Apache Drill > Issue Type: Bug >Reporter: Timothy Farkas >Assignee: Timothy Farkas > > The error found by [~cch...@maprtech.com] and [~dechanggu] > {code} > 2017-10-25 15:43:28,658 [260eec84-7de3-03ec-300f-7fdbc111fb7c:frag:2:9] ERROR > o.a.d.e.w.fragment.FragmentExecutor - SYSTEM ERROR: IllegalStateException: > Memory was leaked by query. Memory leaked: (9216) > Allocator(op:2:9:0:HashPartitionSender) 100/9216/12831744/100 > (res/actual/peak/limit) > Fragment 2:9 > [Error Id: 7eae6c2a-868c-49f8-aad8-b690243ffe9b on mperf113.qa.lab:31010] > org.apache.drill.common.exceptions.UserException: SYSTEM ERROR: > IllegalStateException: Memory was leaked by query. 
Memory leaked: (9216) > Allocator(op:2:9:0:HashPartitionSender) 100/9216/12831744/100 > (res/actual/peak/limit) > Fragment 2:9 > [Error Id: 7eae6c2a-868c-49f8-aad8-b690243ffe9b on mperf113.qa.lab:31010] > at > org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:586) > ~[drill-common-1.11.0-mapr.jar:1.11.0-mapr] > at > org.apache.drill.exec.work.fragment.FragmentExecutor.sendFinalState(FragmentExecutor.java:301) > [drill-java-exec-1.11.0-mapr.jar:1.11.0-mapr] > at > org.apache.drill.exec.work.fragment.FragmentExecutor.cleanup(FragmentExecutor.java:160) > [drill-java-exec-1.11.0-mapr.jar:1.11.0-mapr] > at > org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:267) > [drill-java-exec-1.11.0-mapr.jar:1.11.0-mapr] > at > org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38) > [drill-common-1.11.0-mapr.jar:1.11.0-mapr] > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > [na:1.8.0_121] > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > [na:1.8.0_121] > at java.lang.Thread.run(Thread.java:745) [na:1.8.0_121] > Caused by: java.lang.IllegalStateException: Memory was leaked by query. > Memory leaked: (9216) > Allocator(op:2:9:0:HashPartitionSender) 100/9216/12831744/100 > (res/actual/peak/limit) > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (DRILL-6044) Shutdown button does not work from WebUI
[ https://issues.apache.org/jira/browse/DRILL-6044?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Krystal updated DRILL-6044: --- Attachment: Screen Shot 2017-12-19 at 10.51.16 AM.png > Shutdown button does not work from WebUI > > > Key: DRILL-6044 > URL: https://issues.apache.org/jira/browse/DRILL-6044 > Project: Apache Drill > Issue Type: Bug > Components: Client - HTTP >Affects Versions: 1.13.0 >Reporter: Krystal > Attachments: Screen Shot 2017-12-19 at 10.51.16 AM.png > > > git.commit.id.abbrev=eb0c403 > Nothing happens when clicking the SHUTDOWN button in the WebUI. The > browser's debugger showed that the request failed due to access control > checks (see the attached screenshot). -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (DRILL-6044) Shutdown button does not work from WebUI
Krystal created DRILL-6044: -- Summary: Shutdown button does not work from WebUI Key: DRILL-6044 URL: https://issues.apache.org/jira/browse/DRILL-6044 Project: Apache Drill Issue Type: Bug Components: Client - HTTP Affects Versions: 1.13.0 Reporter: Krystal git.commit.id.abbrev=eb0c403 Nothing happens when clicking the SHUTDOWN button in the WebUI. The browser's debugger showed that the request failed due to access control checks (see the attached screenshot). -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (DRILL-6030) Managed sort should minimize number of batches in a k-way merge
[ https://issues.apache.org/jira/browse/DRILL-6030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16297218#comment-16297218 ] ASF GitHub Bot commented on DRILL-6030: --- Github user paul-rogers commented on the issue: https://github.com/apache/drill/pull/1075 One additional thought. This bug was found when sorting 18 GB of data in 8 GB of memory. That is, a case in which the sort must spill. What happens in the case in which the 18 GB of data is sorted in, say, 20 GB of memory (an in-memory sort)? We don't want the merge limit to force a spill in this case; kind of defeats the purpose of an in-memory sort. So: 1. Does the limit affect in memory sort? If so, we need to revise the solution. 2. Does the in-memory sort suffer from a similar performance issue? If so, we need to revise the in memory sort. One possible solution is to: 1. Defer sorting of individual batches until necessary. 2. Sort batches just before spilling. 3. If all batches fit in memory, do a single, combined sort (using an SV4). > Managed sort should minimize number of batches in a k-way merge > --- > > Key: DRILL-6030 > URL: https://issues.apache.org/jira/browse/DRILL-6030 > Project: Apache Drill > Issue Type: Improvement >Reporter: Vlad Rozov >Assignee: Vlad Rozov > > The time complexity of the algorithm is O(n*k*log(k)) where k is a number of > batches to merge and n is a number of records in each batch (assuming equal > size batches). As n*k is the total number of record to merge and it can be > quite large, minimizing k should give better results. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
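The three-step proposal above can be sketched with illustrative stand-in classes (not the managed sort's actual API): plain int arrays play the role of record batches, spilled runs are kept in a list instead of on disk, and a single merged array stands in for the SV4 selection vector.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative sketch of the deferred-sort idea: buffer incoming batches
// unsorted, sort each batch only when forced to spill, and if everything
// fits in memory do one combined sort at the end.
public class DeferredSortSketch {
  private final List<int[]> buffered = new ArrayList<>();
  private final List<int[]> spilled = new ArrayList<>();
  private final int memoryLimitBatches;   // toy memory budget, in batches

  public DeferredSortSketch(int memoryLimitBatches) {
    this.memoryLimitBatches = memoryLimitBatches;
  }

  // Step 1: defer sorting; just buffer the incoming batch.
  public void addBatch(int[] batch) {
    buffered.add(batch);
    if (buffered.size() > memoryLimitBatches) {
      spill();
    }
  }

  // Step 2: sort batches only just before spilling them.
  private void spill() {
    for (int[] b : buffered) {
      int[] copy = b.clone();
      Arrays.sort(copy);
      spilled.add(copy);                  // stand-in for a sorted run on disk
    }
    buffered.clear();
  }

  // Step 3: if nothing spilled, do a single combined in-memory sort (the
  // real implementation would build an SV4 rather than copy data); if runs
  // did spill, they would instead be k-way merged with the in-memory data.
  public int[] finishInMemory() {
    int total = buffered.stream().mapToInt(b -> b.length).sum();
    int[] all = new int[total];
    int i = 0;
    for (int[] b : buffered) {
      System.arraycopy(b, 0, all, i, b.length);
      i += b.length;
    }
    Arrays.sort(all);                     // one combined sort, no per-batch sorts
    return all;
  }

  public boolean didSpill() { return !spilled.isEmpty(); }
}
```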
[jira] [Commented] (DRILL-6030) Managed sort should minimize number of batches in a k-way merge
[ https://issues.apache.org/jira/browse/DRILL-6030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16297169#comment-16297169 ] ASF GitHub Bot commented on DRILL-6030: --- Github user paul-rogers commented on a diff in the pull request: https://github.com/apache/drill/pull/1075#discussion_r157830252 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/xsort/managed/SortConfig.java --- @@ -84,7 +85,7 @@ public SortConfig(DrillConfig config) { if (limit > 0) { mergeLimit = Math.max(limit, MIN_MERGE_LIMIT); } else { - mergeLimit = Integer.MAX_VALUE; + mergeLimit = DEFAULT_MERGE_LIMIT; --- End diff -- The merge limit is already a config option. (I'd forgotten about that.) The comment on the config option says "Limit on the number of spilled batches that can be merged in a single pass." So, let's just set that default (in `drill-override-conf`) to your new value of 128 and leave the code here unchanged. > Managed sort should minimize number of batches in a k-way merge > --- > > Key: DRILL-6030 > URL: https://issues.apache.org/jira/browse/DRILL-6030 > Project: Apache Drill > Issue Type: Improvement >Reporter: Vlad Rozov >Assignee: Vlad Rozov > > The time complexity of the algorithm is O(n*k*log(k)) where k is a number of > batches to merge and n is a number of records in each batch (assuming equal > size batches). As n*k is the total number of record to merge and it can be > quite large, minimizing k should give better results. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
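The default suggested above would be set in the site configuration file rather than in code. A hedged sketch, assuming the option key is `drill.exec.sort.external.merge_limit` (verify the exact key against the `drill-module.conf` shipped with the Drill version in use):

```hocon
# drill-override.conf (sketch; key name assumed, verify before use)
# Limit on the number of spilled batches merged in a single pass.
drill.exec.sort.external: {
  merge_limit: 128
}
```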
[jira] [Assigned] (DRILL-6020) NullPointerException with Union setting on when querying JSON untyped path
[ https://issues.apache.org/jira/browse/DRILL-6020?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pritesh Maker reassigned DRILL-6020: Assignee: Mitchel Labonte > NullPointerException with Union setting on when querying JSON untyped path > -- > > Key: DRILL-6020 > URL: https://issues.apache.org/jira/browse/DRILL-6020 > Project: Apache Drill > Issue Type: Bug >Affects Versions: 1.11.0 >Reporter: Mitchel Labonte >Assignee: Mitchel Labonte > Labels: ready-to-commit > Fix For: 1.13.0 > > > h1. Steps to reproduce > alter session set `exec.enable_union_type`=true; > select tb.level1.dta from dfs.`file.json` tb; > *Content of file.json:* > {noformat} > {"level1":{"dta":{"test":"test"}}} > {"level1":{"dta":"test"}} > {noformat} > h1. Stack trace > Error: SYSTEM ERROR: NullPointerException > Fragment 0:0 > [Error Id: fe267584-32f3-413c-a77c-fc5b5c1ba513 on localhost:31010] > (java.lang.NullPointerException) null > > org.apache.drill.exec.vector.complex.FieldIdUtil.getFieldIdIfMatchesUnion():34 > org.apache.drill.exec.vector.complex.FieldIdUtil.getFieldIdIfMatches():135 > org.apache.drill.exec.vector.complex.FieldIdUtil.getFieldIdIfMatches():130 > org.apache.drill.exec.vector.complex.FieldIdUtil.getFieldId():201 > org.apache.drill.exec.record.SimpleVectorWrapper.getFieldIdIfMatches():102 > org.apache.drill.exec.record.VectorContainer.getValueVectorId():298 > org.apache.drill.exec.physical.impl.ScanBatch.getValueVectorId():313 > > org.apache.drill.exec.expr.ExpressionTreeMaterializer$MaterializeVisitor.visitSchemaPath():289 > > org.apache.drill.exec.expr.ExpressionTreeMaterializer$MaterializeVisitor.visitSchemaPath():272 > org.apache.drill.common.expression.SchemaPath.accept():150 > > org.apache.drill.exec.expr.ExpressionTreeMaterializer$AbstractMaterializeVisitor.visitFunctionCall():399 > > org.apache.drill.exec.expr.ExpressionTreeMaterializer$AbstractMaterializeVisitor.visitFunctionCall():331 > org.apache.drill.common.expression.FunctionCall.accept():60 
> org.apache.drill.exec.expr.ExpressionTreeMaterializer.materialize():169 > org.apache.drill.exec.expr.ExpressionTreeMaterializer.materialize():147 > > org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.setupNewSchema():421 > org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext():78 > > org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext():133 > org.apache.drill.exec.record.AbstractRecordBatch.next():162 > org.apache.drill.exec.physical.impl.BaseRootExec.next():105 > > org.apache.drill.exec.physical.impl.ScreenCreator$ScreenRoot.innerNext():81 > org.apache.drill.exec.physical.impl.BaseRootExec.next():95 > org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():234 > org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():227 > java.security.AccessController.doPrivileged():-2 > javax.security.auth.Subject.doAs():422 > org.apache.hadoop.security.UserGroupInformation.doAs():1657 > org.apache.drill.exec.work.fragment.FragmentExecutor.run():227 > org.apache.drill.common.SelfCleaningRunnable.run():38 > java.util.concurrent.ThreadPoolExecutor.runWorker():1142 > java.util.concurrent.ThreadPoolExecutor$Worker.run():617 > java.lang.Thread.run():745 (state=,code=0) -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (DRILL-5967) Memory leak by HashPartitionSender
[ https://issues.apache.org/jira/browse/DRILL-5967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16297050#comment-16297050 ] ASF GitHub Bot commented on DRILL-5967: --- Github user vrozov commented on the issue: https://github.com/apache/drill/pull/1073 LGTM > Memory leak by HashPartitionSender > -- > > Key: DRILL-5967 > URL: https://issues.apache.org/jira/browse/DRILL-5967 > Project: Apache Drill > Issue Type: Bug >Reporter: Timothy Farkas >Assignee: Timothy Farkas > > The error found by [~cch...@maprtech.com] and [~dechanggu] > {code} > 2017-10-25 15:43:28,658 [260eec84-7de3-03ec-300f-7fdbc111fb7c:frag:2:9] ERROR > o.a.d.e.w.fragment.FragmentExecutor - SYSTEM ERROR: IllegalStateException: > Memory was leaked by query. Memory leaked: (9216) > Allocator(op:2:9:0:HashPartitionSender) 100/9216/12831744/100 > (res/actual/peak/limit) > Fragment 2:9 > [Error Id: 7eae6c2a-868c-49f8-aad8-b690243ffe9b on mperf113.qa.lab:31010] > org.apache.drill.common.exceptions.UserException: SYSTEM ERROR: > IllegalStateException: Memory was leaked by query. 
Memory leaked: (9216) > Allocator(op:2:9:0:HashPartitionSender) 100/9216/12831744/100 > (res/actual/peak/limit) > Fragment 2:9 > [Error Id: 7eae6c2a-868c-49f8-aad8-b690243ffe9b on mperf113.qa.lab:31010] > at > org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:586) > ~[drill-common-1.11.0-mapr.jar:1.11.0-mapr] > at > org.apache.drill.exec.work.fragment.FragmentExecutor.sendFinalState(FragmentExecutor.java:301) > [drill-java-exec-1.11.0-mapr.jar:1.11.0-mapr] > at > org.apache.drill.exec.work.fragment.FragmentExecutor.cleanup(FragmentExecutor.java:160) > [drill-java-exec-1.11.0-mapr.jar:1.11.0-mapr] > at > org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:267) > [drill-java-exec-1.11.0-mapr.jar:1.11.0-mapr] > at > org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38) > [drill-common-1.11.0-mapr.jar:1.11.0-mapr] > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > [na:1.8.0_121] > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > [na:1.8.0_121] > at java.lang.Thread.run(Thread.java:745) [na:1.8.0_121] > Caused by: java.lang.IllegalStateException: Memory was leaked by query. > Memory leaked: (9216) > Allocator(op:2:9:0:HashPartitionSender) 100/9216/12831744/100 > (res/actual/peak/limit) > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (DRILL-5919) Add non-numeric support for JSON processing
[ https://issues.apache.org/jira/browse/DRILL-5919?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pritesh Maker updated DRILL-5919: - Fix Version/s: (was: Future) 1.13.0 > Add non-numeric support for JSON processing > --- > > Key: DRILL-5919 > URL: https://issues.apache.org/jira/browse/DRILL-5919 > Project: Apache Drill > Issue Type: Improvement > Components: Storage - JSON >Affects Versions: 1.11.0 >Reporter: Volodymyr Tkach >Assignee: Volodymyr Tkach > Labels: doc-impacting > Fix For: 1.13.0 > > > Add session options to allow Drill to work with non-standard JSON > number literals such as NaN, Infinity, and -Infinity. By default these options > are switched off; the user can toggle them during a session. > *For documentation* > 1. Added two session options {{store.json.reader.non_numeric_numbers}} and > {{store.json.writer.non_numeric_numbers}} that allow reading/writing NaN and > Infinity as numbers. By default these options are set to false. > 2. Extended the signatures of the {{convert_toJSON}} and {{convert_fromJSON}} > functions by adding a second optional parameter that enables reading/writing NaN and > Infinity. > For example: > {noformat} > select convert_fromJSON('{"key": NaN}') from (values(1)); will result in a > JsonParseException, but > select convert_fromJSON('{"key": NaN}', true) from (values(1)); will parse > NaN as a number. > {noformat} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (DRILL-6020) NullPointerException with Union setting on when querying JSON untyped path
[ https://issues.apache.org/jira/browse/DRILL-6020?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arina Ielchiieva updated DRILL-6020: Labels: ready-to-commit (was: ) > NullPointerException with Union setting on when querying JSON untyped path > -- > > Key: DRILL-6020 > URL: https://issues.apache.org/jira/browse/DRILL-6020 > Project: Apache Drill > Issue Type: Bug >Affects Versions: 1.11.0 >Reporter: Mitchel Labonte > Labels: ready-to-commit > Fix For: 1.13.0 > > > h1. Steps to reproduce > alter session set `exec.enable_union_type`=true; > select tb.level1.dta from dfs.`file.json` tb; > *Content of file.json:* > {noformat} > {"level1":{"dta":{"test":"test"}}} > {"level1":{"dta":"test"}} > {noformat} > h1. Stack trace > Error: SYSTEM ERROR: NullPointerException > Fragment 0:0 > [Error Id: fe267584-32f3-413c-a77c-fc5b5c1ba513 on localhost:31010] > (java.lang.NullPointerException) null > > org.apache.drill.exec.vector.complex.FieldIdUtil.getFieldIdIfMatchesUnion():34 > org.apache.drill.exec.vector.complex.FieldIdUtil.getFieldIdIfMatches():135 > org.apache.drill.exec.vector.complex.FieldIdUtil.getFieldIdIfMatches():130 > org.apache.drill.exec.vector.complex.FieldIdUtil.getFieldId():201 > org.apache.drill.exec.record.SimpleVectorWrapper.getFieldIdIfMatches():102 > org.apache.drill.exec.record.VectorContainer.getValueVectorId():298 > org.apache.drill.exec.physical.impl.ScanBatch.getValueVectorId():313 > > org.apache.drill.exec.expr.ExpressionTreeMaterializer$MaterializeVisitor.visitSchemaPath():289 > > org.apache.drill.exec.expr.ExpressionTreeMaterializer$MaterializeVisitor.visitSchemaPath():272 > org.apache.drill.common.expression.SchemaPath.accept():150 > > org.apache.drill.exec.expr.ExpressionTreeMaterializer$AbstractMaterializeVisitor.visitFunctionCall():399 > > org.apache.drill.exec.expr.ExpressionTreeMaterializer$AbstractMaterializeVisitor.visitFunctionCall():331 > org.apache.drill.common.expression.FunctionCall.accept():60 > 
org.apache.drill.exec.expr.ExpressionTreeMaterializer.materialize():169 > org.apache.drill.exec.expr.ExpressionTreeMaterializer.materialize():147 > > org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.setupNewSchema():421 > org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext():78 > > org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext():133 > org.apache.drill.exec.record.AbstractRecordBatch.next():162 > org.apache.drill.exec.physical.impl.BaseRootExec.next():105 > > org.apache.drill.exec.physical.impl.ScreenCreator$ScreenRoot.innerNext():81 > org.apache.drill.exec.physical.impl.BaseRootExec.next():95 > org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():234 > org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():227 > java.security.AccessController.doPrivileged():-2 > javax.security.auth.Subject.doAs():422 > org.apache.hadoop.security.UserGroupInformation.doAs():1657 > org.apache.drill.exec.work.fragment.FragmentExecutor.run():227 > org.apache.drill.common.SelfCleaningRunnable.run():38 > java.util.concurrent.ThreadPoolExecutor.runWorker():1142 > java.util.concurrent.ThreadPoolExecutor$Worker.run():617 > java.lang.Thread.run():745 (state=,code=0) -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (DRILL-6020) NullPointerException with Union setting on when querying JSON untyped path
[ https://issues.apache.org/jira/browse/DRILL-6020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16296859#comment-16296859 ] ASF GitHub Bot commented on DRILL-6020: --- Github user vvysotskyi commented on the issue: https://github.com/apache/drill/pull/1068 @MitchelLabonte, thanks for the pull request, +1 > NullPointerException with Union setting on when querying JSON untyped path > -- > > Key: DRILL-6020 > URL: https://issues.apache.org/jira/browse/DRILL-6020 > Project: Apache Drill > Issue Type: Bug >Affects Versions: 1.11.0 >Reporter: Mitchel Labonte > Fix For: 1.13.0 > > > h1. Steps to reproduce > alter session set `exec.enable_union_type`=true; > select tb.level1.dta from dfs.`file.json` tb; > *Content of file.json:* > {noformat} > {"level1":{"dta":{"test":"test"}}} > {"level1":{"dta":"test"}} > {noformat} > h1. Stack trace > Error: SYSTEM ERROR: NullPointerException > Fragment 0:0 > [Error Id: fe267584-32f3-413c-a77c-fc5b5c1ba513 on localhost:31010] > (java.lang.NullPointerException) null > > org.apache.drill.exec.vector.complex.FieldIdUtil.getFieldIdIfMatchesUnion():34 > org.apache.drill.exec.vector.complex.FieldIdUtil.getFieldIdIfMatches():135 > org.apache.drill.exec.vector.complex.FieldIdUtil.getFieldIdIfMatches():130 > org.apache.drill.exec.vector.complex.FieldIdUtil.getFieldId():201 > org.apache.drill.exec.record.SimpleVectorWrapper.getFieldIdIfMatches():102 > org.apache.drill.exec.record.VectorContainer.getValueVectorId():298 > org.apache.drill.exec.physical.impl.ScanBatch.getValueVectorId():313 > > org.apache.drill.exec.expr.ExpressionTreeMaterializer$MaterializeVisitor.visitSchemaPath():289 > > org.apache.drill.exec.expr.ExpressionTreeMaterializer$MaterializeVisitor.visitSchemaPath():272 > org.apache.drill.common.expression.SchemaPath.accept():150 > > org.apache.drill.exec.expr.ExpressionTreeMaterializer$AbstractMaterializeVisitor.visitFunctionCall():399 > > 
org.apache.drill.exec.expr.ExpressionTreeMaterializer$AbstractMaterializeVisitor.visitFunctionCall():331 > org.apache.drill.common.expression.FunctionCall.accept():60 > org.apache.drill.exec.expr.ExpressionTreeMaterializer.materialize():169 > org.apache.drill.exec.expr.ExpressionTreeMaterializer.materialize():147 > > org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.setupNewSchema():421 > org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext():78 > > org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext():133 > org.apache.drill.exec.record.AbstractRecordBatch.next():162 > org.apache.drill.exec.physical.impl.BaseRootExec.next():105 > > org.apache.drill.exec.physical.impl.ScreenCreator$ScreenRoot.innerNext():81 > org.apache.drill.exec.physical.impl.BaseRootExec.next():95 > org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():234 > org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():227 > java.security.AccessController.doPrivileged():-2 > javax.security.auth.Subject.doAs():422 > org.apache.hadoop.security.UserGroupInformation.doAs():1657 > org.apache.drill.exec.work.fragment.FragmentExecutor.run():227 > org.apache.drill.common.SelfCleaningRunnable.run():38 > java.util.concurrent.ThreadPoolExecutor.runWorker():1142 > java.util.concurrent.ThreadPoolExecutor$Worker.run():617 > java.lang.Thread.run():745 (state=,code=0) -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (DRILL-6020) NullPointerException with Union setting on when querying JSON untyped path
[ https://issues.apache.org/jira/browse/DRILL-6020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16296817#comment-16296817 ] ASF GitHub Bot commented on DRILL-6020: --- Github user MitchelLabonte commented on the issue: https://github.com/apache/drill/pull/1068 @vvysotskyi No problem, yes all tests pass now. > NullPointerException with Union setting on when querying JSON untyped path > -- > > Key: DRILL-6020 > URL: https://issues.apache.org/jira/browse/DRILL-6020 > Project: Apache Drill > Issue Type: Bug >Affects Versions: 1.11.0 >Reporter: Mitchel Labonte > Fix For: 1.13.0 > > > h1. Steps to reproduce > alter session set `exec.enable_union_type`=true; > select tb.level1.dta from dfs.`file.json` tb; > *Content of file.json:* > {noformat} > {"level1":{"dta":{"test":"test"}}} > {"level1":{"dta":"test"}} > {noformat} > h1. Stack trace > Error: SYSTEM ERROR: NullPointerException > Fragment 0:0 > [Error Id: fe267584-32f3-413c-a77c-fc5b5c1ba513 on localhost:31010] > (java.lang.NullPointerException) null > > org.apache.drill.exec.vector.complex.FieldIdUtil.getFieldIdIfMatchesUnion():34 > org.apache.drill.exec.vector.complex.FieldIdUtil.getFieldIdIfMatches():135 > org.apache.drill.exec.vector.complex.FieldIdUtil.getFieldIdIfMatches():130 > org.apache.drill.exec.vector.complex.FieldIdUtil.getFieldId():201 > org.apache.drill.exec.record.SimpleVectorWrapper.getFieldIdIfMatches():102 > org.apache.drill.exec.record.VectorContainer.getValueVectorId():298 > org.apache.drill.exec.physical.impl.ScanBatch.getValueVectorId():313 > > org.apache.drill.exec.expr.ExpressionTreeMaterializer$MaterializeVisitor.visitSchemaPath():289 > > org.apache.drill.exec.expr.ExpressionTreeMaterializer$MaterializeVisitor.visitSchemaPath():272 > org.apache.drill.common.expression.SchemaPath.accept():150 > > org.apache.drill.exec.expr.ExpressionTreeMaterializer$AbstractMaterializeVisitor.visitFunctionCall():399 > > 
org.apache.drill.exec.expr.ExpressionTreeMaterializer$AbstractMaterializeVisitor.visitFunctionCall():331 > org.apache.drill.common.expression.FunctionCall.accept():60 > org.apache.drill.exec.expr.ExpressionTreeMaterializer.materialize():169 > org.apache.drill.exec.expr.ExpressionTreeMaterializer.materialize():147 > > org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.setupNewSchema():421 > org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext():78 > > org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext():133 > org.apache.drill.exec.record.AbstractRecordBatch.next():162 > org.apache.drill.exec.physical.impl.BaseRootExec.next():105 > > org.apache.drill.exec.physical.impl.ScreenCreator$ScreenRoot.innerNext():81 > org.apache.drill.exec.physical.impl.BaseRootExec.next():95 > org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():234 > org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():227 > java.security.AccessController.doPrivileged():-2 > javax.security.auth.Subject.doAs():422 > org.apache.hadoop.security.UserGroupInformation.doAs():1657 > org.apache.drill.exec.work.fragment.FragmentExecutor.run():227 > org.apache.drill.common.SelfCleaningRunnable.run():38 > java.util.concurrent.ThreadPoolExecutor.runWorker():1142 > java.util.concurrent.ThreadPoolExecutor$Worker.run():617 > java.lang.Thread.run():745 (state=,code=0) -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (DRILL-6020) NullPointerException with Union setting on when querying JSON untyped path
[ https://issues.apache.org/jira/browse/DRILL-6020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16296808#comment-16296808 ]

ASF GitHub Bot commented on DRILL-6020:
---------------------------------------

Github user vvysotskyi commented on the issue:

    https://github.com/apache/drill/pull/1068

    Sorry, I read your removed comment and assumed that this test fails. Did you run all unit tests to verify that this change does not break anything?

> NullPointerException with Union setting on when querying JSON untyped path
> --------------------------------------------------------------------------
>
>                 Key: DRILL-6020
>                 URL: https://issues.apache.org/jira/browse/DRILL-6020
>             Project: Apache Drill
>          Issue Type: Bug
>    Affects Versions: 1.11.0
>            Reporter: Mitchel Labonte
>             Fix For: 1.13.0
>
>
> h1. Steps to reproduce
> {noformat}
> alter session set `exec.enable_union_type`=true;
> select tb.level1.dta from dfs.`file.json` tb;
> {noformat}
> *Content of file.json:*
> {noformat}
> {"level1":{"dta":{"test":"test"}}}
> {"level1":{"dta":"test"}}
> {noformat}
> h1. Stack trace
> {noformat}
> Error: SYSTEM ERROR: NullPointerException
> Fragment 0:0
> [Error Id: fe267584-32f3-413c-a77c-fc5b5c1ba513 on localhost:31010]
>   (java.lang.NullPointerException) null
>     org.apache.drill.exec.vector.complex.FieldIdUtil.getFieldIdIfMatchesUnion():34
>     org.apache.drill.exec.vector.complex.FieldIdUtil.getFieldIdIfMatches():135
>     org.apache.drill.exec.vector.complex.FieldIdUtil.getFieldIdIfMatches():130
>     org.apache.drill.exec.vector.complex.FieldIdUtil.getFieldId():201
>     org.apache.drill.exec.record.SimpleVectorWrapper.getFieldIdIfMatches():102
>     org.apache.drill.exec.record.VectorContainer.getValueVectorId():298
>     org.apache.drill.exec.physical.impl.ScanBatch.getValueVectorId():313
>     org.apache.drill.exec.expr.ExpressionTreeMaterializer$MaterializeVisitor.visitSchemaPath():289
>     org.apache.drill.exec.expr.ExpressionTreeMaterializer$MaterializeVisitor.visitSchemaPath():272
>     org.apache.drill.common.expression.SchemaPath.accept():150
>     org.apache.drill.exec.expr.ExpressionTreeMaterializer$AbstractMaterializeVisitor.visitFunctionCall():399
>     org.apache.drill.exec.expr.ExpressionTreeMaterializer$AbstractMaterializeVisitor.visitFunctionCall():331
>     org.apache.drill.common.expression.FunctionCall.accept():60
>     org.apache.drill.exec.expr.ExpressionTreeMaterializer.materialize():169
>     org.apache.drill.exec.expr.ExpressionTreeMaterializer.materialize():147
>     org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.setupNewSchema():421
>     org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext():78
>     org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext():133
>     org.apache.drill.exec.record.AbstractRecordBatch.next():162
>     org.apache.drill.exec.physical.impl.BaseRootExec.next():105
>     org.apache.drill.exec.physical.impl.ScreenCreator$ScreenRoot.innerNext():81
>     org.apache.drill.exec.physical.impl.BaseRootExec.next():95
>     org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():234
>     org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():227
>     java.security.AccessController.doPrivileged():-2
>     javax.security.auth.Subject.doAs():422
>     org.apache.hadoop.security.UserGroupInformation.doAs():1657
>     org.apache.drill.exec.work.fragment.FragmentExecutor.run():227
>     org.apache.drill.common.SelfCleaningRunnable.run():38
>     java.util.concurrent.ThreadPoolExecutor.runWorker():1142
>     java.util.concurrent.ThreadPoolExecutor$Worker.run():617
>     java.lang.Thread.run():745 (state=,code=0)
> {noformat}

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
[jira] [Commented] (DRILL-6020) NullPointerException with Union setting on when querying JSON untyped path
[ https://issues.apache.org/jira/browse/DRILL-6020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16296776#comment-16296776 ]

ASF GitHub Bot commented on DRILL-6020:
---------------------------------------

Github user MitchelLabonte commented on the issue:

    https://github.com/apache/drill/pull/1068

    @vvysotskyi This is happening because the type is cached as a JSON object from the previous row. The fix is similar to the existing check in the getFieldIdIfMatches() method, so it looks like this is intended behaviour. As you can see in the unit test, the results are as expected after the fix. I am not sure what else could be done.
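The failure mode described in this comment (a type resolved from an earlier map-typed row being reused when a later row carries a VarChar for the same path) can be sketched with a toy model. Everything below is purely illustrative: the `UnionPathSketch` class, its `Value` type, and both resolver methods are invented for this sketch and are not Drill's actual `FieldIdUtil` code.

```java
// Toy model of the NPE: a path resolver that assumes the cached map type
// still applies dereferences a children map that is null for scalar rows.
import java.util.HashMap;
import java.util.Map;

public class UnionPathSketch {
    // A union-like value: either a scalar string or a map of children.
    static final class Value {
        final String scalar;               // non-null for VarChar-like values
        final Map<String, Value> children; // non-null for map-like values
        Value(String s) { this.scalar = s; this.children = null; }
        Value(Map<String, Value> c) { this.scalar = null; this.children = c; }
    }

    // Broken lookup: trusts the type seen on the first row, so it
    // dereferences children unconditionally -> NPE on a scalar row.
    static Value resolveChildUnchecked(Value v, String name) {
        return v.children.get(name); // throws NullPointerException for scalars
    }

    // Guarded lookup, the kind of null check the PR discussion suggests:
    // a scalar value simply has no matching child for the path segment.
    static Value resolveChildGuarded(Value v, String name) {
        if (v.children == null) {
            return null; // VarChar-like value: child segment cannot match
        }
        return v.children.get(name);
    }

    public static void main(String[] args) {
        // Row 1: {"dta": {"test": "test"}} -- a map under the path
        Map<String, Value> inner = new HashMap<>();
        inner.put("test", new Value("test"));
        Value row1 = new Value(inner);
        // Row 2: {"dta": "test"} -- a scalar under the same path
        Value row2 = new Value("test");

        System.out.println(resolveChildGuarded(row1, "test").scalar); // test
        System.out.println(resolveChildGuarded(row2, "test"));        // null

        boolean npe = false;
        try {
            resolveChildUnchecked(row2, "test"); // reproduces the crash
        } catch (NullPointerException e) {
            npe = true;
        }
        System.out.println(npe); // true
    }
}
```

The guarded variant returns "no match" instead of crashing, which matches the observed behavior that the same lookup already tolerates scalars elsewhere in the code path.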
[jira] [Commented] (DRILL-6020) NullPointerException with Union setting on when querying JSON untyped path
[ https://issues.apache.org/jira/browse/DRILL-6020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16296610#comment-16296610 ]

ASF GitHub Bot commented on DRILL-6020:
---------------------------------------

Github user vvysotskyi commented on the issue:

    https://github.com/apache/drill/pull/1068

    @MitchelLabonte, I think this NPE is just a consequence of the bug that should be fixed. Please investigate why Drill is trying to use a child `PathSegment` when a value has VarChar type.
[jira] [Commented] (DRILL-6043) Nullable vector, but not List vector, adds its internal vectors to child list
[ https://issues.apache.org/jira/browse/DRILL-6043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16296436#comment-16296436 ]

Paul Rogers commented on DRILL-6043:
------------------------------------

Once the list passes through Project, it does pick up the internal vectors:

{noformat}
`flat`(LIST:OPTIONAL) [`[DEFAULT]`(LATE:OPTIONAL), `$data$`(LIST:OPTIONAL) ...]]]
{noformat}

But the way this information was added is broken. Project added a vector for the data portion of the list, but it did not remove the original "dummy" type created when the list was created. This leaves the list with two children when it should have one. We must have multiple code paths that manipulate list internals, leading to these errors.

> Nullable vector, but not List vector, adds its internal vectors to child list
> ------------------------------------------------------------------------------
>
>                 Key: DRILL-6043
>                 URL: https://issues.apache.org/jira/browse/DRILL-6043
>             Project: Apache Drill
>          Issue Type: Bug
>    Affects Versions: 1.10.0
>            Reporter: Paul Rogers
>            Priority: Minor
>
> Each Drill vector has associated metadata in the form of a {{MaterializedField}} instance. The {{MaterializedField}} contains a list of children. For a Map vector, the list of children lists the vectors that make up the map.
> Nullable vectors use the list of children to identify the hidden vectors that make up the nullable vector: {{$bits$}} and {{$values$}}.
> However, repeated vectors (including lists) also have hidden internal vectors, offsets and values, and the metadata for repeated types and lists does not include these.
> We should decide whether we need metadata for the implied internal vectors. (Having it does cause problems, since a newly-created schema for a nullable vector is not equal to the actual schema created by the vector itself.)
> If we don't need the internal vector metadata, remove it from the nullable vectors. But if we do need it, add it to the repeated vectors and to lists.
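The two-children state described in the comment above can be modeled in a few lines. This is a hypothetical sketch, not Drill's code: `FieldMeta` stands in for Drill's `MaterializedField`, and the add/replace helpers are invented to contrast the broken and consistent updates.

```java
// Sketch of the duplicate-child bug: a list's metadata starts with a
// placeholder child; adding the resolved data child without removing the
// placeholder leaves two children where one is expected.
import java.util.ArrayList;
import java.util.List;

public class ListChildSketch {
    static final class FieldMeta {
        final String name;
        final List<FieldMeta> children = new ArrayList<>();
        FieldMeta(String name) { this.name = name; }
    }

    // A freshly created list carries a placeholder ("dummy") child type,
    // shown as `[DEFAULT]`(LATE:OPTIONAL) in the comment above.
    static FieldMeta newList(String name) {
        FieldMeta list = new FieldMeta(name);
        list.children.add(new FieldMeta("[DEFAULT]"));
        return list;
    }

    // Broken update: appends the resolved data child but keeps the
    // placeholder, so the list ends up with two children.
    static void addDataChildBroken(FieldMeta list, FieldMeta data) {
        list.children.add(data);
    }

    // Consistent update: the placeholder is replaced, not duplicated.
    static void addDataChildFixed(FieldMeta list, FieldMeta data) {
        list.children.removeIf(c -> c.name.equals("[DEFAULT]"));
        list.children.add(data);
    }

    public static void main(String[] args) {
        FieldMeta broken = newList("flat");
        addDataChildBroken(broken, new FieldMeta("$data$"));
        System.out.println(broken.children.size()); // 2 -- the reported state

        FieldMeta fixed = newList("flat");
        addDataChildFixed(fixed, new FieldMeta("$data$"));
        System.out.println(fixed.children.size()); // 1 -- expected state
    }
}
```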
[jira] [Created] (DRILL-6043) Nullable vector, but not List vector, adds its internal vectors to child list
Paul Rogers created DRILL-6043:
-------------------------------

             Summary: Nullable vector, but not List vector, adds its internal vectors to child list
                 Key: DRILL-6043
                 URL: https://issues.apache.org/jira/browse/DRILL-6043
             Project: Apache Drill
          Issue Type: Bug
    Affects Versions: 1.10.0
            Reporter: Paul Rogers
            Priority: Minor
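The schema-equality problem mentioned in the DRILL-6043 description (a freshly declared nullable field comparing unequal to the one the vector itself reports, because only the latter carries the hidden `$bits$` and `$values$` children) can be sketched as follows. `FieldMeta` and both factory methods are hypothetical stand-ins, not Drill's `MaterializedField` API.

```java
// Sketch of the equality mismatch: metadata that includes hidden internal
// children compares unequal to metadata built without them, even though
// both describe the same logical column.
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

public class SchemaEqualitySketch {
    static final class FieldMeta {
        final String name;
        final List<FieldMeta> children = new ArrayList<>();
        FieldMeta(String name) { this.name = name; }
        @Override public boolean equals(Object o) {
            if (!(o instanceof FieldMeta)) return false;
            FieldMeta other = (FieldMeta) o;
            return name.equals(other.name) && children.equals(other.children);
        }
        @Override public int hashCode() { return Objects.hash(name, children); }
    }

    // Metadata as a nullable vector reports it: hidden children included.
    static FieldMeta nullableAsReported(String name) {
        FieldMeta f = new FieldMeta(name);
        f.children.add(new FieldMeta("$bits$"));
        f.children.add(new FieldMeta("$values$"));
        return f;
    }

    // Metadata as a caller typically declares it: no internal children.
    static FieldMeta nullableAsDeclared(String name) {
        return new FieldMeta(name);
    }

    public static void main(String[] args) {
        // Two descriptions of the same column do not compare equal, which is
        // the problem the issue flags with keeping internal-vector metadata.
        System.out.println(nullableAsReported("a").equals(nullableAsDeclared("a"))); // false
    }
}
```

This is the trade-off the issue asks to resolve: either drop the internal children from nullable-vector metadata so both forms agree, or add them consistently to repeated vectors and lists as well.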