[jira] [Commented] (DRILL-7306) Disable "fast schema" batch for new scan framework
[ https://issues.apache.org/jira/browse/DRILL-7306?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16875692#comment-16875692 ] ASF GitHub Bot commented on DRILL-7306: --- paul-rogers commented on issue #1813: DRILL-7306: Disable schema-only batch for new scan framework URL: https://github.com/apache/drill/pull/1813#issuecomment-506999568 When running the full tests, the following failed in `java-exec`: ``` [ERROR] Errors: [ERROR] TestDynamicUDFSupport.testReRegisterTheSameJarWithDifferentContent:600->BaseTestQuery.testRunAndReturn:340 » Rpc ``` When running this unit test in Eclipse, two tests failed: `testDropFunction` and `testSuccessfulRegistrationAfterSeveralRetryAttempts`. Then, `testConcurrentRemoteRegistryUpdateWithDuplicates` hung forever. I believe these tests (and one other) seem to fail about 50% of the time on my builds. For example: ``` [ERROR] Errors: [ERROR] TestPStoreProviders.verifyZkStore:67 » NoSuchElement ``` The workaround seems to be to rebuild all of Drill. That is, the rough pattern seems to be that this test will run once after a clean build, but will fail if run a second time or after a code change. Not sure if this is the exact pattern; something like this happens. The result is that it is hard to tell if my code broke something or if the tests are just flaky. I wonder, is there something we can do to stabilize these tests? All other tests run fine if I rerun them a second time on the same build or after I make a small code change. Anyway, after doing a full rebuild and retest, this commit does pass all unit tests. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Disable "fast schema" batch for new scan framework > -- > > Key: DRILL-7306 > URL: https://issues.apache.org/jira/browse/DRILL-7306 > Project: Apache Drill > Issue Type: Bug >Affects Versions: 1.16.0 >Reporter: Paul Rogers >Assignee: Paul Rogers >Priority: Major > Labels: ready-to-commit > Fix For: 1.17.0 > > > The EVF framework is set up to return a "fast schema" empty batch with only > schema as its first batch because, when the code was written, it seemed > that's how we wanted operators to work. However, DRILL-7305 notes that many > operators cannot handle empty batches. > Since the empty-batch bugs show that Drill does not, in fact, provide a "fast > schema" batch, this ticket asks to disable the feature in the new scan > framework. The feature is disabled with a config option; it can be re-enabled > if ever it is needed. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (DRILL-7308) Incorrect Metadata from text file queries
[ https://issues.apache.org/jira/browse/DRILL-7308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16875631#comment-16875631 ]

Paul Rogers commented on DRILL-7308:

Modified the {{SchemaBuilder}} class to do exactly what I said we don't want to do: it avoids setting the precision if the precision is zero. This allows the (wrong) code in this feature to work. The incorrect code should change.

Also removed the empty schema batch so that simple queries return just one batch of data. The result is that the broken code in the REST call should work for simple one-batch queries.

Nothing I can do, however, will fix the fact that the schema will be repeated for every batch; fixing that will require changes to the REST code itself.

> Incorrect Metadata from text file queries
> -
>
> Key: DRILL-7308
> URL: https://issues.apache.org/jira/browse/DRILL-7308
> Project: Apache Drill
> Issue Type: Bug
> Components: Metadata
> Affects Versions: 1.17.0
> Reporter: Charles Givre
> Priority: Major
> Attachments: Screen Shot 2019-06-24 at 3.16.40 PM.png, domains.csvh
>
>
> I'm noticing some strange behavior with the newest version of Drill. If you
> query a CSV file, you get the following metadata:
> {code:sql}
> SELECT * FROM dfs.test.`domains.csvh` LIMIT 1
> {code}
> {code:json}
> {
>   "queryId": "22eee85f-c02c-5878-9735-091d18788061",
>   "columns": [
>     "domain"
>   ],
>   "rows": [
>     { "domain": "thedataist.com" }
>   ],
>   "metadata": [
>     "VARCHAR(0, 0)",
>     "VARCHAR(0, 0)"
>   ],
>   "queryState": "COMPLETED",
>   "attemptedAutoLimit": 0
> }
> {code}
> There are two issues here:
> 1. VARCHAR now has precision
> 2. There are twice as many columns as there should be.
> Additionally, if you query a regular CSV, without the columns extracted, you
> get the following:
> {code:json}
> "rows": [
>   { "columns": "[\"ACCT_NUM\",\"PRODUCT\",\"MONTH\",\"REVENUE\"]" }
> ],
> "metadata": [
>   "VARCHAR(0, 0)",
>   "VARCHAR(0, 0)"
> ],
> {code}

-- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Comment Edited] (DRILL-7308) Incorrect Metadata from text file queries
[ https://issues.apache.org/jira/browse/DRILL-7308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16872033#comment-16872033 ]

Paul Rogers edited comment on DRILL-7308 at 6/30/19 1:54 AM:

Recall that Drill can return not only multiple batches, but multiple "result sets": runs of batches with different schemas. A more sophisticated REST solution would handle this case. I can't find any ProtoBuf field that says that the schema changed. Instead, we'd have to reuse code from elsewhere which compares the current schema to the previous one. Ideally, in that case, we'd create a new JSON element for the second schema. Something like:

{code:json}
{
  "resultSets": [
    { "rows": ..., "schema": ... },
    { "rows": ..., "schema": ... }
  ]
}
{code}

It is easy to create such a case. Simply create two CSV files, one with two columns, the other with three. Use just a simple {{SELECT * FROM yourTable}} query. You will get two data batches, each with a distinct schema. The current implementation will give just the first schema and all rows, with varying schemas. (Actually, the current implementation will list the two columns, then the three columns, duplicating the first two, but we want to fix that...)

This is yet another reason to use a provisioned schema: with such a schema we can guarantee that the entire query will return a single, consistent schema regardless of the variation across files.

A quick & dirty solution is to clear and rebuild the schema objects on every batch. That way, the value sent to the user will reflect the last schema which, if you are lucky, will be valid for the initial batches as well as later batches.

It is a known open, unresolved issue that Drill does not attempt to merge schema changes, and that unmerged schema changes cannot be handled by ODBC or JDBC clients. We can assume, however, that the users of the REST API won't have messy data and won't run into this issue.
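The schema comparison described in the comment above (compare each batch's schema to the previous one and open a new result set when it changes) can be sketched roughly as follows. This is only an illustration under stated assumptions: `ResultSetGrouper`, `Batch`, `ResultSet` and `group` are hypothetical names, not Drill or REST API classes, and a schema is modeled as a plain list of column names rather than Drill's real schema objects.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Illustrative only: groups consecutive batches that share a schema into one "result set". */
final class ResultSetGrouper {

  /** Hypothetical batch model: a schema is just an ordered list of column names. */
  static final class Batch {
    final List<String> schema;
    final int rowCount;

    Batch(List<String> schema, int rowCount) {
      this.schema = schema;
      this.rowCount = rowCount;
    }
  }

  /** A run of consecutive batches that all share the same schema. */
  static final class ResultSet {
    final List<String> schema;
    final List<Batch> batches = new ArrayList<>();

    ResultSet(List<String> schema) {
      this.schema = schema;
    }
  }

  /** Starts a new result set whenever a batch's schema differs from the previous batch's. */
  static List<ResultSet> group(List<Batch> batches) {
    List<ResultSet> resultSets = new ArrayList<>();
    ResultSet current = null;
    for (Batch batch : batches) {
      if (current == null || !current.schema.equals(batch.schema)) {
        current = new ResultSet(batch.schema);
        resultSets.add(current);
      }
      current.batches.add(batch);
    }
    return resultSets;
  }

  public static void main(String[] args) {
    // Two batches from a two-column file, then one batch from a three-column file.
    List<Batch> batches = Arrays.asList(
        new Batch(Arrays.asList("a", "b"), 100),
        new Batch(Arrays.asList("a", "b"), 50),
        new Batch(Arrays.asList("a", "b", "c"), 75));
    for (ResultSet rs : group(batches)) {
      System.out.println(rs.schema + " -> " + rs.batches.size() + " batch(es)");
    }
  }
}
```

Each element of the returned list would correspond to one entry of the proposed `resultSets` array, so the schema is emitted once per run of batches instead of once per batch.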
[jira] [Comment Edited] (DRILL-7308) Incorrect Metadata from text file queries
[ https://issues.apache.org/jira/browse/DRILL-7308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16875631#comment-16875631 ] Paul Rogers edited comment on DRILL-7308 at 6/30/19 1:55 AM: - Modified the {{SchemaBuilder}} class to do exactly what I said we don't want to do: it avoids setting the precision if the precision is zero. This allows the (wrong) code in the REST feature to work. Still, the incorrect code should change as explained above to avoid breaking the next time someone sets a precision of 0. Also removed the empty schema batch so that simple queries return just one batch of data. The result is that the broken code in the REST call should work for simple one-batch queries. Nothing I can do, however, will fix the fact that the schema will be repeated for every batch; fixing that will require changes to the REST code itself.
[jira] [Commented] (DRILL-7306) Disable "fast schema" batch for new scan framework
[ https://issues.apache.org/jira/browse/DRILL-7306?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16875606#comment-16875606 ] ASF GitHub Bot commented on DRILL-7306: --- paul-rogers commented on issue #1813: DRILL-7306: Disable schema-only batch for new scan framework URL: https://github.com/apache/drill/pull/1813#issuecomment-506989680 Addressed the TestEmptyInputSql failure. The code now recognizes two cases: 1. Empty results: the reader provided a schema, but had no rows. (This is the case that failed.) 2. Null results: the reader provides neither rows nor schema. This is the case that was always being followed, even if we have a schema. Changed the query builder row set code to return an empty row set if the output contains only an empty batch and contains a schema. The code continues to return no row set if the result is null. (Oddly, Drill will return a batch with no rows and no schema if the reader returns no batches at all.) Will address other issues in separate commits. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Disable "fast schema" batch for new scan framework > -- > > Key: DRILL-7306 > URL: https://issues.apache.org/jira/browse/DRILL-7306 > Project: Apache Drill > Issue Type: Bug >Affects Versions: 1.16.0 >Reporter: Paul Rogers >Assignee: Paul Rogers >Priority: Major > Labels: ready-to-commit > Fix For: 1.17.0 > > > The EVF framework is set up to return a "fast schema" empty batch with only > schema as its first batch because, when the code was written, it seemed > that's how we wanted operators to work. However, DRILL-7305 notes that many > operators cannot handle empty batches. 
> Since the empty-batch bugs show that Drill does not, in fact, provide a "fast > schema" batch, this ticket asks to disable the feature in the new scan > framework. The feature is disabled with a config option; it can be re-enabled > if ever it is needed. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
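The two cases named in the comment above (an empty result that still carries a schema, and a null result with neither rows nor schema) can be told apart with a check along these lines. This is a minimal sketch, not the query builder code from the pull request: `ResultKindSketch`, `Kind` and `classify` are hypothetical names, and the schema is reduced to a list of column names.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Illustrative only: distinguishes "empty" results from "null" results. */
final class ResultKindSketch {

  enum Kind {
    /** No rows and no schema: the reader returned nothing at all. */
    NULL_RESULT,
    /** A schema is known, but the reader produced no rows. */
    EMPTY_RESULT,
    /** At least one row was returned. */
    DATA
  }

  /** Classifies a result from its (possibly missing) schema and its row count. */
  static Kind classify(List<String> schemaColumns, int rowCount) {
    if (rowCount > 0) {
      return Kind.DATA;
    }
    boolean hasSchema = schemaColumns != null && !schemaColumns.isEmpty();
    return hasSchema ? Kind.EMPTY_RESULT : Kind.NULL_RESULT;
  }

  public static void main(String[] args) {
    System.out.println(classify(Arrays.asList("a", "b"), 0));
    System.out.println(classify(Collections.<String>emptyList(), 0));
    System.out.println(classify(Arrays.asList("a", "b"), 3));
  }
}
```

Under this reading, a caller would build an empty row set for `EMPTY_RESULT` and return no row set for `NULL_RESULT`, which matches the behavior the comment describes.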
[jira] [Comment Edited] (DRILL-7308) Incorrect Metadata from text file queries
[ https://issues.apache.org/jira/browse/DRILL-7308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16872964#comment-16872964 ]

Paul Rogers edited comment on DRILL-7308 at 6/29/19 6:13 PM:

[~cgivre], the problem here is that the code shown earlier is counting on a Protobuf implementation detail that is not actually a part of the Drill schema specification (to the degree there is such a specification). For VarChar, a precision of 0 means that the user requested {{VARCHAR}}, while a precision of, say, 10 means the user requested {{VARCHAR(10)}}. The scale field is never valid for {{VARCHAR}}.

The output of {{VARCHAR(0,0)}} is not a problem with the code that generated the schema. Instead, it is a problem with the way that the REST code attempts to generate a type name from the schema structures. To be more precise, the REST code assumes that the {{isSet()}} methods are the correct way to check for a 0 value. That assumption is wrong.

The Protobuf issue is that, unlike a regular Java object, if we never actually write to the precision field, then the value is unset. If we write, even if we write 0, the value is set. We certainly don't want to litter our code with things like:

{code:java}
if (precision != 0) {
  schemaBuilder.setPrecision(precision);
}
{code}

So, the code that uses the schema objects should do the following to determine if the value is other than the default: both ask if the value is set, and if so, ask if the value is non-zero. As it turns out, the unset value is 0, so there is actually no need to ask if the value is set in this case.

Taking a step back, the type formatting code should not even be in the REST API. The proper place for it is in {{Types}}. In fact, {{Types}} already has the desired function: {{getExtendedSqlTypeName()}}. However, this function only formats decimals; we need to add a case clause for VARCHAR.

Note that {{getExtendedSqlTypeName()}} exposes the *SQL name* for types. The current REST implementation exposes the internal Drill name. That is, {{getExtendedSqlTypeName()}} will report, say, {{DOUBLE}} while the REST code will report {{Float8}}. This is probably a bug since the documentation explains the SQL types, not the internal types.

That said, I actually have not seen any places in Drill where we set or use the VARCHAR width. So, no point in trying to format it. In this case, you can just use {{getExtendedSqlTypeName()}} directly as-is. Or, if we want to display the width, add the required code to that function.

Please file a separate JIRA for the UDF issue. Please provide an attachment or link to a sample UDF. I'll see if I can track down that CSV-specific issue in case it relates to the EVF.

was (Author: paul.rogers):

[~cgivre], the problem here is that the code shown earlier is counting on a Protobuf implementation detail that is not actually a part of the Drill schema specification (to the degree there is such a specification). For VarChar, a precision of 0 means that the user requested {{VARCHAR}}, while a precision of, say, 10 means the user requested {{VARCHAR(10)}}. The scale is never valid for {{VARCHAR}}; it is an artifact of the incorrect way the above code was written.

The Protobuf issue is that, unlike a regular Java object, if we never actually write to the precision field, then the value is unset. If we write, even if we write 0, the value is set. We certainly don't want to litter our code with things like:

{code:java}
if (precision != 0) {
  schemaBuilder.setPrecision(precision);
}
{code}

So, we should ask if the precision is set and non-zero.

In fact, the type formatting code should not even be in the REST API. The proper place for it is in {{Types}}. In fact, that class already has the desired function: {{getExtendedSqlTypeName()}}. However, this function only formats decimals; we need to add a case clause for VARCHAR.

That said, I actually have not seen any places in Drill where we set or use the VARCHAR width. So, no point in trying to format it. In this case, you can just use {{getExtendedSqlTypeName()}} directly as-is.

Please file a separate JIRA for the UDF issue. Please provide an attachment or link to a sample UDF. I'll see if I can track down that CSV-specific issue in case it relates to the EVF.
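The difference between checking the "is set" flag and checking the value, as discussed in the comment above, can be illustrated with a small stand-alone model. Everything here is an assumption made for illustration: `TypeNameSketch`, `ColumnType`, `formatBySetFlag` and `formatByValue` are hypothetical names rather than Drill's `Types` or Protobuf classes, a `null` precision stands in for an unset Protobuf field, and only precision is modeled (the REST output in the ticket also printed scale).

```java
/** Illustrative only: a stand-in for a Protobuf-style type message, not Drill's actual classes. */
final class TypeNameSketch {

  /** Hypothetical descriptor: a null precision models "field never written" (unset). */
  static final class ColumnType {
    final String sqlName;
    private final Integer precision;

    ColumnType(String sqlName, Integer precision) {
      this.sqlName = sqlName;
      this.precision = precision;
    }

    /** True once any value, including 0, has been written. */
    boolean hasPrecision() {
      return precision != null;
    }

    /** Like a Protobuf getter, an unset field reads as the default value 0. */
    int getPrecision() {
      return precision == null ? 0 : precision;
    }
  }

  /** Fragile: trusts the "is set" flag, so an explicitly written 0 leaks into the name. */
  static String formatBySetFlag(ColumnType type) {
    return type.hasPrecision()
        ? type.sqlName + "(" + type.getPrecision() + ")"
        : type.sqlName;
  }

  /** Robust: looks only at the value; 0 (set or unset) means "no width requested". */
  static String formatByValue(ColumnType type) {
    int precision = type.getPrecision();
    return precision > 0 ? type.sqlName + "(" + precision + ")" : type.sqlName;
  }

  public static void main(String[] args) {
    ColumnType unset = new ColumnType("VARCHAR", null);
    ColumnType explicitZero = new ColumnType("VARCHAR", 0);
    ColumnType ten = new ColumnType("VARCHAR", 10);

    System.out.println(formatBySetFlag(explicitZero)); // shows the spurious width
    System.out.println(formatByValue(unset));
    System.out.println(formatByValue(explicitZero));
    System.out.println(formatByValue(ten));
  }
}
```

With the value-based check, producers of the schema do not need the `if (precision != 0)` guard, because a written 0 and an unset field format identically.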
[jira] [Commented] (DRILL-6225) Add support for boost 1.68 with openSSL 1.1.0/1.1.1 support
[ https://issues.apache.org/jira/browse/DRILL-6225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16875509#comment-16875509 ] ASF GitHub Bot commented on DRILL-6225: --- arina-ielchiieva commented on issue #1817: DRILL-6225: Add support for boost 1.68 with openSSL 1.1.0/1.1.1 support URL: https://github.com/apache/drill/pull/1817#issuecomment-506957237 @debraj92 please squash the commits and address protobuf job failures - https://travis-ci.org/apache/drill/jobs/552040788. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Add support for boost 1.68 with openSSL 1.1.0/1.1.1 support > --- > > Key: DRILL-6225 > URL: https://issues.apache.org/jira/browse/DRILL-6225 > Project: Apache Drill > Issue Type: Task > Components: Client - C++ >Reporter: Rob Wu >Assignee: Debraj Ray >Priority: Minor > > Boost 1.57 is not able to compile with openSSL 1.1 > ([https://svn.boost.org/trac10/ticket/12238]) and adding > add_definitions(-DOPENSSL_API_COMPAT=0x1000L) does not work. > > In order to add support for openSSL 1.1, we would need to upgrade boost to > 1.62+. However, it looks like boost 1.62 bcp will segfault on asio component > when you attempt to shade the boost libraries > ([https://svn.boost.org/trac10/ticket/12357]). So in that case, we should > upgrade to 1.64+. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (DRILL-7174) Expose complex to Json control in the Drill C++ Client
[ https://issues.apache.org/jira/browse/DRILL-7174?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arina Ielchiieva reassigned DRILL-7174: --- Assignee: Arina Ielchiieva > Expose complex to Json control in the Drill C++ Client > -- > > Key: DRILL-7174 > URL: https://issues.apache.org/jira/browse/DRILL-7174 > Project: Apache Drill > Issue Type: Task >Reporter: Rob Wu >Assignee: Arina Ielchiieva >Priority: Minor > Fix For: 1.17.0 > > > Arjun Gupta will be supplying a patch for this > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (DRILL-7174) Expose complex to Json control in the Drill C++ Client
[ https://issues.apache.org/jira/browse/DRILL-7174?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arina Ielchiieva updated DRILL-7174: Fix Version/s: 1.17.0 > Expose complex to Json control in the Drill C++ Client > -- > > Key: DRILL-7174 > URL: https://issues.apache.org/jira/browse/DRILL-7174 > Project: Apache Drill > Issue Type: Task >Reporter: Rob Wu >Priority: Minor > Fix For: 1.17.0 > > > Arjun Gupta will be supplying a patch for this > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (DRILL-7174) Expose complex to Json control in the Drill C++ Client
[ https://issues.apache.org/jira/browse/DRILL-7174?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arina Ielchiieva reassigned DRILL-7174: --- Assignee: (was: Arina Ielchiieva) > Expose complex to Json control in the Drill C++ Client > -- > > Key: DRILL-7174 > URL: https://issues.apache.org/jira/browse/DRILL-7174 > Project: Apache Drill > Issue Type: Task >Reporter: Rob Wu >Priority: Minor > Fix For: 1.17.0 > > > Arjun Gupta will be supplying a patch for this > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (DRILL-7174) Expose complex to Json control in the Drill C++ Client
[ https://issues.apache.org/jira/browse/DRILL-7174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16875510#comment-16875510 ] ASF GitHub Bot commented on DRILL-7174: --- arina-ielchiieva commented on issue #1814: DRILL-7174: Expose complex to Json control in the Drill C++ Client URL: https://github.com/apache/drill/pull/1814#issuecomment-506957298 @vvysotskyi could you please review? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Expose complex to Json control in the Drill C++ Client > -- > > Key: DRILL-7174 > URL: https://issues.apache.org/jira/browse/DRILL-7174 > Project: Apache Drill > Issue Type: Task >Reporter: Rob Wu >Priority: Minor > > Arjun Gupta will be supplying a patch for this > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (DRILL-7174) Expose complex to Json control in the Drill C++ Client
[ https://issues.apache.org/jira/browse/DRILL-7174?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arina Ielchiieva updated DRILL-7174: Reviewer: Volodymyr Vysotskyi > Expose complex to Json control in the Drill C++ Client > -- > > Key: DRILL-7174 > URL: https://issues.apache.org/jira/browse/DRILL-7174 > Project: Apache Drill > Issue Type: Task >Reporter: Rob Wu >Assignee: Arina Ielchiieva >Priority: Minor > Fix For: 1.17.0 > > > Arjun Gupta will be supplying a patch for this > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (DRILL-7310) Move schema-related classes from exec module to be able to use them in metastore module
[ https://issues.apache.org/jira/browse/DRILL-7310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16875557#comment-16875557 ] ASF GitHub Bot commented on DRILL-7310: --- asfgit commented on pull request #1816: DRILL-7310: Move schema-related classes from exec module to be able to use them in metastore module URL: https://github.com/apache/drill/pull/1816 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Move schema-related classes from exec module to be able to use them in > metastore module > --- > > Key: DRILL-7310 > URL: https://issues.apache.org/jira/browse/DRILL-7310 > Project: Apache Drill > Issue Type: Sub-task >Reporter: Volodymyr Vysotskyi >Assignee: Volodymyr Vysotskyi >Priority: Major > Labels: ready-to-commit > Fix For: 1.17.0 > > > Currently, most of the schema-related classes are placed in the {{exec}} > module, but some of them need to be used in the {{metastore}} module. > The {{metastore}} module does not depend on the {{exec}} module. > The solution is to move these classes from {{exec}} into another module which > is used by {{metastore}}, so they will be accessible to {{metastore}}. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (DRILL-6711) Use jitpack repository for Drill Calcite project artifacts instead of repository.mapr.com
[ https://issues.apache.org/jira/browse/DRILL-6711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16875556#comment-16875556 ] ASF GitHub Bot commented on DRILL-6711: --- asfgit commented on pull request #1815: DRILL-6711: Use jitpack repository for Drill Calcite project artifacts instead of repository.mapr.com URL: https://github.com/apache/drill/pull/1815 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Use jitpack repository for Drill Calcite project artifacts instead of > repository.mapr.com > - > > Key: DRILL-6711 > URL: https://issues.apache.org/jira/browse/DRILL-6711 > Project: Apache Drill > Issue Type: Task >Reporter: Arina Ielchiieva >Assignee: Volodymyr Vysotskyi >Priority: Major > Labels: ready-to-commit > Fix For: 1.17.0 > > > Simplify deployment of Drill Calcite project artifacts by using > [https://jitpack.io/]. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (DRILL-7306) Disable "fast schema" batch for new scan framework
[ https://issues.apache.org/jira/browse/DRILL-7306?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16875562#comment-16875562 ]

ASF GitHub Bot commented on DRILL-7306:
---

arina-ielchiieva commented on issue #1813: DRILL-7306: Disable schema-only batch for new scan framework
URL: https://github.com/apache/drill/pull/1813#issuecomment-506972377

@paul-rogers When running tests there are unit and functional test failures. Please run the full unit test suite locally before making the PR; Travis does not do that.

UNIT TESTS
```
[INFO] Running org.apache.drill.exec.TestEmptyInputSql
05:53:50.840 [main] ERROR org.apache.drill.TestReporter - Test Failed (d: 0 B(1.0 MiB), h: 6.3 MiB(863.9 MiB), nh: 32 B(324.6 MiB)): testQueryEmptyCsv(org.apache.drill.exec.TestEmptyInputSql)
java.lang.Exception: Expected and actual numbers of columns do not match.
	at org.apache.drill.test.DrillTestWrapper.compareSchemaOnly(DrillTestWrapper.java:486) ~[test-classes/:1.17.0-SNAPSHOT]
	at org.apache.drill.test.DrillTestWrapper.run(DrillTestWrapper.java:163) ~[test-classes/:1.17.0-SNAPSHOT]
	at org.apache.drill.exec.TestEmptyInputSql.testQueryEmptyCsv(TestEmptyInputSql.java:222) ~[test-classes/:na]
	at java.lang.Thread.run(Thread.java:748) ~[na:1.8.0_161].
```

FUNCTIONAL TESTS
```
Data Verification Failures:
Query: /root/drillAutomation/drill-test-framework/framework/resources/Functional/limit0/union_all/prq_union_all/data/emptyLHS_CSV.q
SELECT cast(columns[0] as int) FROM `emptyFiles/empty_1.csv` UNION ALL SELECT col1 FROM notEmpty_csv_v
Baseline: /root/drillAutomation/drill-test-framework/framework/resources/Functional/limit0/union_all/prq_union_all/data/emptyLHS_CSV.e
Expected number of rows: 10
Actual number of rows from Drill: 10
Number of matching rows: 0
Number of rows missing: 10
Number of rows unexpected: 10
These rows are not expected (first 10):
null
These rows are missing (first 10):
1 (1 occurence(s))
2 (1 occurence(s))
3 (1 occurence(s))
4 (1 occurence(s))
5 (1 occurence(s))
6 (1 occurence(s))
7 (1 occurence(s))
8 (1 occurence(s))
9 (1 occurence(s))
10 (1 occurence(s))
```

Please fix the failures and rebase on the latest master. Also, when I was cherry-picking DRILL-7306 and DRILL-6951, there were conflicts. Consider creating a merge branch with the commits for these Jiras and resolving the conflicts there to ease the merge process.