[GitHub] [drill] dobesv opened a new pull request #1977: DRILL-7573: Support htpasswd based authentication
dobesv opened a new pull request #1977: DRILL-7573: Support htpasswd based authentication URL: https://github.com/apache/drill/pull/1977 # [DRILL-7573](https://issues.apache.org/jira/browse/DRILL-7573): Support htpasswd based authentication ## Description This allows you to specify `htpasswd` as your authentication implementation. In this case, users will be authenticated using usernames and password taken from a text file in `htpasswd` format. This gives some more flexibility compared to the PAM authenticator. For example, in docker / kubernetes you can mount a folder with an htpasswd file and update that file when you want to add/remove users, without any concern about interfering with the contents of /etc/passwd and /etc/shadow. ## Documentation # Using a password file for authentication Apache Drill allows you to store valid usernames and passwords in a text file in the popular "htpasswd" format. This can be more convenient than using PAM in containerized environments, because you do not have to modify any system files like `passwd`, `shadow`, or files in `pam.d`. Instead you can mount a volume with the `htpasswd` file in it and tell drill to use that file for authentication. To configure this feature: 1. Create an htpasswd file and copy/mount it to/on the drillbit machines/containers: $ htpasswd /path/to/htpasswd $USER 2. Add the following configuration to the `drill.exec` block in the `/conf/drill-override.conf` file: drill.exec: { security.auth.mechanisms : ["PLAIN"], security.user.auth: { enabled: true, packages += "org.apache.drill.exec.rpc.user.security", impl: "htpasswd", htpasswd: { file: "/path/to/htpasswd" } } } 3. Restart the drillbit(s) 4. Now you must use a username/password from the `htpasswd` file when logging into Drill Note: Currently the `crypt` and `bcrypt` algorithms are not supported, you should probably use the MD5 hashing algorithm used by default by the `htpasswd` command. ## Testing I created an `htpasswd` file using `htpasswd`, configured the auth mechanism as shown above, and testing logging in with both valid and invalid passwords with MD5, SHA-1, and plantext password hashes in the `htpasswd` files. No automated tests so far, but I'm open to advice on how/where to add them. Still very new to the code base. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [drill] cgivre commented on issue #1976: [SECURITY] Use HTTPS to resolve dependencies in Maven Build
cgivre commented on issue #1976: [SECURITY] Use HTTPS to resolve dependencies in Maven Build URL: https://github.com/apache/drill/pull/1976#issuecomment-584436772 Thanks for the contribution. This should be an easy contribution however, Drill's procedure for PRs is to first create a JIRA issue. Could you please: 1. Create a JIRA issue (https://issues.apache.org) 2. Include the issue number in the title and commit message Thank you very much for the contribution. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [drill] JLLeitschuh opened a new pull request #1976: [SECURITY] Use HTTPS to resolve dependencies in Maven Build
JLLeitschuh opened a new pull request #1976: [SECURITY] Use HTTPS to resolve dependencies in Maven Build URL: https://github.com/apache/drill/pull/1976 [![mitm_build](https://user-images.githubusercontent.com/1323708/59226671-90645200-8ba1-11e9-8ab3-39292bef99e9.jpeg)](https://medium.com/@jonathan.leitschuh/want-to-take-over-the-java-ecosystem-all-you-need-is-a-mitm-1fc329d898fb?source=friends_link=3c99970c55a899ad9ef41f126efcde0e) - [Want to take over the Java ecosystem? All you need is a MITM!](https://medium.com/@jonathan.leitschuh/want-to-take-over-the-java-ecosystem-all-you-need-is-a-mitm-1fc329d898fb?source=friends_link=3c99970c55a899ad9ef41f126efcde0e) - [Update: Want to take over the Java ecosystem? All you need is a MITM!](https://medium.com/bugbountywriteup/update-want-to-take-over-the-java-ecosystem-all-you-need-is-a-mitm-d069d253fe23?source=friends_link=8c8e52a7d57b98d0b7e541665688b454) --- This is a security fix for a vulnerability in your [Apache Maven](https://maven.apache.org/) `pom.xml` file(s). The build files indicate that this project is resolving dependencies over HTTP instead of HTTPS. This leaves your build vulnerable to allowing a [Man in the Middle](https://en.wikipedia.org/wiki/Man-in-the-middle_attack) (MITM) attackers to execute arbitrary code on your or your computer or CI/CD system. This vulnerability has a CVSS v3.0 Base Score of [8.1/10](https://nvd.nist.gov/vuln-metrics/cvss/v3-calculator?vector=AV:N/AC:H/PR:N/UI:N/S:U/C:H/I:H/A:H). [POC code](https://max.computer/blog/how-to-take-over-the-computer-of-any-java-or-clojure-or-scala-developer/) has existed since 2014 to maliciously compromise a JAR file in-flight. MITM attacks against HTTP are [increasingly common](https://security.stackexchange.com/a/12050), for example [Comcast is known to have done it to their own users](https://thenextweb.com/insights/2017/12/11/comcast-continues-to-inject-its-own-code-into-websites-you-visit/#). This contribution is a part of a submission to the [GitHub Security Lab](https://securitylab.github.com/) Bug Bounty program. ## Detecting this and Future Vulnerabilities This vulnerability was automatically detected by [LGTM.com](https://lgtm.com) using this [CodeQL Query](https://lgtm.com/rules/155648721/). As of September 2019 LGTM.com and Semmle are [officially a part of GitHub](https://github.blog/2019-09-18-github-welcomes-semmle/). You can automatically detect future vulnerabilities like this by enabling the free (for open-source) [LGTM App](https://github.com/marketplace/lgtm). I'm not an employee of GitHub nor of Semmle, I'm simply a user of [LGTM.com](https://lgtm.com) and an open-source security researcher. ## Source Yes, this contribution was automatically generated, however, the code to generate this PR was lovingly hand crafted to bring this security fix to your repository. The source code that generated and submitted this PR can be found here: [JLLeitschuh/bulk-security-pr-generator](https://github.com/JLLeitschuh/bulk-security-pr-generator) ## Opting-Out If you'd like to opt-out of future automated security vulnerability fixes like this, please consider adding a file called `.github/GH-ROBOTS.txt` to your repository with the line: ``` User-agent: JLLeitschuh/bulk-security-pr-generator Disallow: * ``` This bot will respect the [ROBOTS.txt](https://moz.com/learn/seo/robotstxt) format for future contributions. Alternatively, if this project is no longer actively maintained, consider [archiving](https://help.github.com/en/github/creating-cloning-and-archiving-repositories/about-archiving-repositories) the repository. ## CLA Requirements _This section is only relevant if your project requires contributors to sign a Contributor License Agreement (CLA) for external contributions._ It is unlikely that I'll be able to directly sign CLAs. However, all contributed commits are already automatically signed-off. > The meaning of a signoff depends on the project, but it typically certifies that committer has the rights to submit this work under the same license and agrees to a Developer Certificate of Origin > (see [https://developercertificate.org/](https://developercertificate.org/) for more information). > > \- [Git Commit Signoff documentation](https://developercertificate.org/) If signing your organization's CLA is a strict-requirement for merging this contribution, please feel free to close this PR. ## Tracking All PR's generated as part of this fix are tracked here: https://github.com/JLLeitschuh/bulk-security-pr-generator/issues/2 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL
[GitHub] [drill] cgivre commented on issue #1973: DRILL-7575: Fix typo in class name of FormSecurityHandler
cgivre commented on issue #1973: DRILL-7575: Fix typo in class name of FormSecurityHandler URL: https://github.com/apache/drill/pull/1973#issuecomment-584256826 LGTM +1 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [drill] dobesv commented on issue #1973: DRILL-7575: Fix typo in class name of FormSecurityHandler
dobesv commented on issue #1973: DRILL-7575: Fix typo in class name of FormSecurityHandler URL: https://github.com/apache/drill/pull/1973#issuecomment-584238274 OK, done. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [drill] dobesv commented on issue #1972: DRILL-7562: Support HTTP Basic authentication for REST API calls
dobesv commented on issue #1972: DRILL-7562: Support HTTP Basic authentication for REST API calls URL: https://github.com/apache/drill/pull/1972#issuecomment-584238503 I just updated the commit message to include the JIRA issue. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [drill] ihuzenko commented on a change in pull request #1975: DRILL-7576: Fail fast for operator errors
ihuzenko commented on a change in pull request #1975: DRILL-7576: Fail fast for operator errors URL: https://github.com/apache/drill/pull/1975#discussion_r376942618 ## File path: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/WriterRecordBatch.java ## @@ -152,11 +154,18 @@ private void addOutputContainerData() { container.setRecordCount(1); } - protected void setupNewSchema() throws IOException { + protected void setupNewSchema() { try { // update the schema in RecordWriter stats.startSetup(); - recordWriter.updateSchema(incoming); + try { +recordWriter.updateSchema(incoming); + } catch (IOException e) { +// TODO: This is better handled inside updateSchema() Review comment: same question as above This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [drill] ihuzenko commented on a change in pull request #1975: DRILL-7576: Fail fast for operator errors
ihuzenko commented on a change in pull request #1975: DRILL-7576: Fail fast for operator errors URL: https://github.com/apache/drill/pull/1975#discussion_r376999100 ## File path: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/orderedpartitioner/OrderedPartitionRecordBatch.java ## @@ -87,6 +87,8 @@ * value is determined by where each record falls in the partition table. This * column is used by PartitionSenderRootExec to determine which bucket to assign * each record to. + * + * This code is not used. Review comment: if the class is never used why don't just delete it? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [drill] ihuzenko commented on a change in pull request #1975: DRILL-7576: Fail fast for operator errors
ihuzenko commented on a change in pull request #1975: DRILL-7576: Fail fast for operator errors URL: https://github.com/apache/drill/pull/1975#discussion_r377003326 ## File path: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/FrameSupportTemplate.java ## @@ -175,7 +177,7 @@ private int processROWS(int row) throws DrillException { return row; } - private int processRANGE(int row) throws DrillException { + private int processRANGE(int row) throws SchemaChangeException { Review comment: ```suggestion private int processRange(int row) throws SchemaChangeException { ``` This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [drill] ihuzenko commented on a change in pull request #1975: DRILL-7576: Fail fast for operator errors
ihuzenko commented on a change in pull request #1975: DRILL-7576: Fail fast for operator errors URL: https://github.com/apache/drill/pull/1975#discussion_r376908070 ## File path: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/RootExec.java ## @@ -20,34 +20,38 @@ import org.apache.drill.exec.proto.ExecProtos.FragmentHandle; /** - * Functionality - * - * A FragmentRoot is a node which is the last processing node in a query plan. FragmentTerminals include Exchange - * output nodes and storage nodes. They are there driving force behind the completion of a query. - * - * Assumptions - * - * All implementations of {@link RootExec} assume that all their methods are called by the same thread. + * Node which is the last processing node in a query plan. FragmentTerminals + * include Exchange output nodes and storage nodes. They are there driving force + * behind the completion of a query. * + * Assumes that all implementations of {@link RootExec} assume that all their + * methods are called by the same thread. */ public interface RootExec extends AutoCloseable { + /** * Do the next batch of work. - * @return Whether or not additional batches of work are necessary. False means that this fragment is done. + * + * @return Whether or not additional batches of work are necessary. False + * means that this fragment is done. */ boolean next(); /** - * Inform sender that receiving fragment is finished and doesn't need any more data. This can be called multiple - * times (once for each downstream receiver). If all receivers are finished then a subsequent call to {@link #next()} - * will return false. - * @param handle The handle pointing to the downstream receiver that does not need anymore data. + * Inform sender that receiving fragment is finished and doesn't need any more + * data. This can be called multiple times (once for each downstream + * receiver). If all receivers are finished then a subsequent call to + * {@link #next()} will return false. + * + * @param handle + * The handle pointing to the downstream receiver that does not need + * anymore data. */ void receivingFragmentFinished(FragmentHandle handle); /** - * Dump failed batches' state preceded by its parent's state to logs. Invoked when there is a - * failure during fragment execution. + * Dump failed batches' state preceded by its parent's state to logs. Invoked + * when there is a failure during fragment execution. Review comment: Please add ```@param``` section to the JavaDoc describing where the trowable came from. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [drill] ihuzenko commented on a change in pull request #1975: DRILL-7576: Fail fast for operator errors
ihuzenko commented on a change in pull request #1975: DRILL-7576: Fail fast for operator errors URL: https://github.com/apache/drill/pull/1975#discussion_r376903426 ## File path: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/BaseRootExec.java ## @@ -184,4 +179,38 @@ public void close() throws Exception { } } } + + /** + * Given a list of operators and a stack trace, walks the stack trace and + * the operator list to find the leaf-most operator, which is the one + * that was active when the exception was thrown. Handle the cases in + * which no operator was active, each operator had multiple methods on + * the stack, or the exception was thrown in some class called by + * the operator. + * + * Not all operators leave a mark in the trace. In particular if a the + * call stack is only through base-class methods, then we have no way to + * know the actual class during the call. This is OK because the leaf + * methods are just pass-through operations, they are unlikely to fail. + * + * @param the type of the operator. Parameterized to allow easier + * testing + * @param dag the list of operators from root-most to leaf-most + * @param e the exception thrown somewhere in the operator tree + * @return the leaf-most operator, if any + */ + public static T findLeaf(List dag, Throwable e) { +StackTraceElement[] trace = e.getStackTrace(); Review comment: Minor comment: here we could collect a ```Set``` of class name strings ignoring those names which don't start with ```org.apache.drill``` package name. Like: ```java Set drillStackTrace = Stream.of(trace) .map(StackTraceElement::getClassName) .filter(cn -> cn.startsWith(DRILL_PACKAGE_NAME)) .collect(Collectors.toSet()); ``` This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [drill] ihuzenko commented on a change in pull request #1975: DRILL-7576: Fail fast for operator errors
ihuzenko commented on a change in pull request #1975: DRILL-7576: Fail fast for operator errors URL: https://github.com/apache/drill/pull/1975#discussion_r376919500 ## File path: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/ScreenCreator.java ## @@ -83,47 +83,47 @@ public boolean innerNext() { IterOutcome outcome = next(incoming); logger.trace("Screen Outcome {}", outcome); switch (outcome) { - case STOP: -return false; - case NONE: -if (firstBatch) { - // this is the only data message sent to the client and may contain the schema - QueryWritableBatch batch; - QueryData header = QueryData.newBuilder() -.setQueryId(context.getHandle().getQueryId()) -.setRowCount(0) -.setDef(RecordBatchDef.getDefaultInstance()) -.build(); - batch = new QueryWritableBatch(header); +case STOP: + return false; +case NONE: + if (firstBatch) { +// this is the only data message sent to the client and may contain the schema +QueryWritableBatch batch; +QueryData header = QueryData.newBuilder() + .setQueryId(context.getHandle().getQueryId()) + .setRowCount(0) + .setDef(RecordBatchDef.getDefaultInstance()) + .build(); +batch = new QueryWritableBatch(header); + +stats.startWait(); +try { + userConnection.sendData(batch); +} finally { + stats.stopWait(); +} +firstBatch = false; // we don't really need to set this. But who knows! + } + return false; +case OK_NEW_SCHEMA: + materializer = new VectorRecordMaterializer(context, oContext, incoming); + //$FALL-THROUGH$ +case OK: + injector.injectPause(context.getExecutionControls(), "sending-data", logger); + final QueryWritableBatch batch = materializer.convertNext(); + updateStats(batch); stats.startWait(); try { userConnection.sendData(batch); } finally { stats.stopWait(); } - firstBatch = false; // we don't really need to set this. But who knows! -} - -return false; - case OK_NEW_SCHEMA: -materializer = new VectorRecordMaterializer(context, oContext, incoming); -//$FALL-THROUGH$ - case OK: -injector.injectPause(context.getExecutionControls(), "sending-data", logger); -final QueryWritableBatch batch = materializer.convertNext(); -updateStats(batch); -stats.startWait(); -try { - userConnection.sendData(batch); -} finally { - stats.stopWait(); -} -firstBatch = false; - -return true; - default: -throw new UnsupportedOperationException(); + firstBatch = false; + + return true; +default: + throw new UnsupportedOperationException(); Review comment: Here would be nice to add message containing ```outcome``` which caused the exception. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [drill] ihuzenko commented on a change in pull request #1975: DRILL-7576: Fail fast for operator errors
ihuzenko commented on a change in pull request #1975: DRILL-7576: Fail fast for operator errors URL: https://github.com/apache/drill/pull/1975#discussion_r376951117 ## File path: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/filter/RuntimeFilterBatchCreator.java ## @@ -29,7 +29,9 @@ public class RuntimeFilterBatchCreator implements BatchCreator{ Review comment: ```suggestion public class RuntimeFilterBatchCreator implements BatchCreator { ``` This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [drill] ihuzenko commented on a change in pull request #1975: DRILL-7576: Fail fast for operator errors
ihuzenko commented on a change in pull request #1975: DRILL-7576: Fail fast for operator errors URL: https://github.com/apache/drill/pull/1975#discussion_r376947666 ## File path: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/aggregate/SpilledRecordBatch.java ## @@ -163,11 +162,12 @@ public IterOutcome next() { container = vas.get(); } } catch (IOException e) { - lastOutcome = IterOutcome.STOP; - throw UserException.dataReadError(e).addContext("Failed reading from a spill file").build(logger); + throw UserException.dataReadError(e) + .addContext("Failed reading from a spill file") + .build(logger); } catch (Exception e) { - lastOutcome = IterOutcome.STOP; - throw e; + // TODO: Catch the error closer to the cause and create a better error message. Review comment: can we avoid adding new todos? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [drill] ihuzenko commented on a change in pull request #1975: DRILL-7576: Fail fast for operator errors
ihuzenko commented on a change in pull request #1975: DRILL-7576: Fail fast for operator errors URL: https://github.com/apache/drill/pull/1975#discussion_r376951049 ## File path: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/filter/RuntimeFilterBatchCreator.java ## @@ -29,7 +29,9 @@ public class RuntimeFilterBatchCreator implements BatchCreator{ @Override - public CloseableRecordBatch getBatch(ExecutorFragmentContext context, RuntimeFilterPOP config, List children) throws ExecutionSetupException { + public CloseableRecordBatch getBatch(ExecutorFragmentContext context, + RuntimeFilterPOP config, List children) + throws ExecutionSetupException { Review comment: ```java public CloseableRecordBatch getBatch(ExecutorFragmentContext context, RuntimeFilterPOP config, List children) throws ExecutionSetupException { ``` This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [drill] ihuzenko commented on a change in pull request #1975: DRILL-7576: Fail fast for operator errors
ihuzenko commented on a change in pull request #1975: DRILL-7576: Fail fast for operator errors URL: https://github.com/apache/drill/pull/1975#discussion_r376945843 ## File path: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/aggregate/HashAggBatch.java ## @@ -340,16 +340,12 @@ public IterOutcome innerNext() { } case UPDATE_AGGREGATOR: - context.getExecutorState().fail(UserException.unsupportedError() + throw UserException.unsupportedError() .message(SchemaChangeException.schemaChanged( "Hash aggregate does not support schema change", incomingSchema, incoming.getSchema()).getMessage()) - .build(logger)); - close(); - killIncoming(false); - firstBatch = false; - return IterOutcome.STOP; + .build(logger); Review comment: could you please explain why ```close()``` and ```killIncoming(false)``` invocations were removed for the case? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [drill] ihuzenko commented on a change in pull request #1975: DRILL-7576: Fail fast for operator errors
ihuzenko commented on a change in pull request #1975: DRILL-7576: Fail fast for operator errors URL: https://github.com/apache/drill/pull/1975#discussion_r376923043 ## File path: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/StatisticsWriterRecordBatch.java ## @@ -91,41 +93,47 @@ public IterOutcome innerNext() { } // process the complete upstream in one next() call IterOutcome upstream; -try { - do { -upstream = next(incoming); - -switch(upstream) { - case STOP: -return upstream; - - case NOT_YET: - case NONE: -break; - - case OK_NEW_SCHEMA: -setupNewSchema(); -// $FALL-THROUGH$ - case OK: +do { + upstream = next(incoming); + + switch(upstream) { +case STOP: + return upstream; + +case NOT_YET: +case NONE: + break; + +case OK_NEW_SCHEMA: + setupNewSchema(); + // $FALL-THROUGH$ +case OK: + try { counter += statsRecordWriterImpl.writeStatistics(incoming.getRecordCount()); -logger.debug("Total records written so far: {}", counter); - -for(final VectorWrapper v : incoming) { - v.getValueVector().clear(); -} -break; - - default: -throw new UnsupportedOperationException(); -} - } while(upstream != IterOutcome.NONE); - // Flush blocking writers now + } catch (IOException e) { +// TODO: Better handled inside the write() method. Review comment: such todos usually live forever in the code, could you please improve the ```StatisticsRecordWriterImpl.writeStatistics``` in this PR? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [drill] ihuzenko commented on a change in pull request #1975: DRILL-7576: Fail fast for operator errors
ihuzenko commented on a change in pull request #1975: DRILL-7576: Fail fast for operator errors URL: https://github.com/apache/drill/pull/1975#discussion_r377003222 ## File path: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/FrameSupportTemplate.java ## @@ -156,7 +158,7 @@ private int processPartition(final int currentRow) throws DrillException { } } - private int processROWS(int row) throws DrillException { + private int processROWS(int row) throws SchemaChangeException { Review comment: ```suggestion private int processRows(int row) throws SchemaChangeException { ``` This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [drill] ihuzenko commented on a change in pull request #1975: DRILL-7576: Fail fast for operator errors
ihuzenko commented on a change in pull request #1975: DRILL-7576: Fail fast for operator errors URL: https://github.com/apache/drill/pull/1975#discussion_r376926556 ## File path: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/TopN/PriorityQueueTemplate.java ## @@ -142,11 +143,7 @@ public void add(RecordBatchData batch) throws SchemaChangeException{ public void generate() { Stopwatch watch = Stopwatch.createStarted(); final DrillBuf drillBuf = allocator.buffer(4 * queueSize); -try { - finalSv4 = new SelectionVector4(drillBuf, queueSize, 4000); -} catch (SchemaChangeException e) { - throw AbstractRecordBatch.schemaChangeException(e, "Priority Queue", logger); -} +finalSv4 = new SelectionVector4(drillBuf, queueSize, 4000); Review comment: please convert magic number to constant. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [drill] ihuzenko commented on a change in pull request #1975: DRILL-7576: Fail fast for operator errors
ihuzenko commented on a change in pull request #1975: DRILL-7576: Fail fast for operator errors URL: https://github.com/apache/drill/pull/1975#discussion_r376942403 ## File path: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/WriterRecordBatch.java ## @@ -77,54 +80,53 @@ public BatchSchema getSchema() { @Override public IterOutcome innerNext() { if (processed) { -// cleanup(); // if the upstream record batch is already processed and next() is called by // downstream then return NONE to indicate completion return IterOutcome.NONE; } // process the complete upstream in one next() call IterOutcome upstream; -try { - do { -upstream = next(incoming); - -switch(upstream) { - case STOP: -return upstream; - - case NOT_YET: +do { + upstream = next(incoming); + + switch(upstream) { +case STOP: + return upstream; + +case NOT_YET: + break; +case NONE: + if (schema != null) { +// Schema is for the output batch schema which is setup in setupNewSchema(). Since the output +// schema is fixed ((Fragment(VARCHAR), Number of records written (BIGINT)) we should set it +// up even with 0 records for it to be reported back to the client. break; - case NONE: -if (schema != null) { - // Schema is for the output batch schema which is setup in setupNewSchema(). Since the output - // schema is fixed ((Fragment(VARCHAR), Number of records written (BIGINT)) we should set it - // up even with 0 records for it to be reported back to the client. - break; -} - - case OK_NEW_SCHEMA: -setupNewSchema(); -// $FALL-THROUGH$ - case OK: -counter += eventBasedRecordWriter.write(incoming.getRecordCount()); -logger.debug("Total records written so far: {}", counter); + } -for(final VectorWrapper v : incoming) { - v.getValueVector().clear(); -} -break; - - default: -throw new UnsupportedOperationException(); -} - } while(upstream != IterOutcome.NONE); -} catch(IOException ex) { - logger.error("Failure during query", ex); - kill(false); - context.getExecutorState().fail(ex); - return IterOutcome.STOP; -} +case OK_NEW_SCHEMA: + setupNewSchema(); + // $FALL-THROUGH$ +case OK: + try { +counter += eventBasedRecordWriter.write(incoming.getRecordCount()); + } catch (IOException e) { +// TODO: Better handled inside the write() method. Review comment: can we avoid leaving new todos in the code? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [drill] ihuzenko commented on a change in pull request #1975: DRILL-7576: Fail fast for operator errors
ihuzenko commented on a change in pull request #1975: DRILL-7576: Fail fast for operator errors URL: https://github.com/apache/drill/pull/1975#discussion_r376890243 ## File path: common/src/main/java/org/apache/drill/common/exceptions/DrillRuntimeException.java ## @@ -17,8 +17,11 @@ */ package org.apache.drill.common.exceptions; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + public class DrillRuntimeException extends RuntimeException { - private static final org.slf4j.Logger logger = org.slf4j.LoggerFactory.getLogger(DrillRuntimeException.class); + private static final Logger logger = LoggerFactory.getLogger(DrillRuntimeException.class); Review comment: Seems like the logger is unused and may be safely removed. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [drill] ihuzenko commented on a change in pull request #1975: DRILL-7576: Fail fast for operator errors
ihuzenko commented on a change in pull request #1975: DRILL-7576: Fail fast for operator errors URL: https://github.com/apache/drill/pull/1975#discussion_r376924072 ## File path: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/StatisticsWriterRecordBatch.java ## @@ -154,11 +162,18 @@ private void addOutputContainerData() { container.setRecordCount(1); } - protected void setupNewSchema() throws IOException { + protected void setupNewSchema() { try { // update the schema in RecordWriter stats.startSetup(); - recordWriter.updateSchema(incoming); + try { +recordWriter.updateSchema(incoming); + } catch (IOException e) { +// TODO: This is better handled inside updateSchema() Review comment: I would like to ask for fixing it, thank you in advance. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [drill] ihuzenko commented on a change in pull request #1975: DRILL-7576: Fail fast for operator errors
ihuzenko commented on a change in pull request #1975: DRILL-7576: Fail fast for operator errors URL: https://github.com/apache/drill/pull/1975#discussion_r376954034 ## File path: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/filter/RuntimeFilterRecordBatch.java ## @@ -196,20 +194,24 @@ private void setupHashHelper() { ValueVectorReadExpression toHashFieldExp = new ValueVectorReadExpression(typedFieldId); hashFieldExps.add(toHashFieldExp); } -hash64 = hashHelper.getHash64(hashFieldExps.toArray(new LogicalExpression[hashFieldExps.size()]), typedFieldIds.toArray(new TypedFieldId[typedFieldIds.size()])); +hash64 = hashHelper.getHash64(hashFieldExps.toArray( +new LogicalExpression[hashFieldExps.size()]), +typedFieldIds.toArray(new TypedFieldId[typedFieldIds.size()])); } catch (Exception e) { throw UserException.internalError(e).build(logger); } } } /** - * If RuntimeFilter is available then applies the filter condition on the incoming batch records and creates an SV2 - * to store indexes which passes the filter condition. In case when RuntimeFilter is not available it just pass + * If RuntimeFilter is available then applies the filter condition on the + * incoming batch records and creates an SV2 to store indexes which passes the + * filter condition. In case when RuntimeFilter is not available it just pass * through all the records from incoming batch to downstream. + * * @throws SchemaChangeException Review comment: ```suggestion ``` This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [drill] ihuzenko commented on a change in pull request #1975: DRILL-7576: Fail fast for operator errors
ihuzenko commented on a change in pull request #1975: DRILL-7576: Fail fast for operator errors URL: https://github.com/apache/drill/pull/1975#discussion_r376912149 ## File path: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/ScanBatch.java ## @@ -212,13 +213,24 @@ private boolean shouldContinueAfterNoRecords() throws Exception { } return true; } else { // Regular scan - currentReader.close(); - currentReader = null; + closeCurrentReader(); return true; // In regular case, we always continue the iteration, if no more reader, we will break out at the head of loop } } - private IterOutcome internalNext() throws Exception { + private void closeCurrentReader() { +try { + if (currentReader != null) { Review comment: I would suggest putting whole ```try-catch-finally``` inside the if, or using our ```Autocloseables``` like: ```java AutoCloseables.closeSilently(currentReader); currentReader = null; ``` This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services